From: GaryScott on
I think I understand this sufficiently, but I wanted to hear others'
understanding of the performance issues associated with network access
to a normal disk area. I am specifically interested in CVF or IVF
behavior (or in IVF specifically, if there are significant differences
from CVF). I am unconcerned with portability to other platforms at
this time.

I have held off implementing any sort of rapid search function in an
application for fear of excessive network traffic each time I issue the
next record read in a loop searching for specific content in a record.
The particular file is direct access with a record length of something
like 10240 bytes (I don't remember exactly, but it is a multiple of
512 bytes), so at least 10240 bytes are read each time.
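
To make the pattern concrete, here is a rough sketch of the kind of
loop I mean (the file name, the 10240-byte record length, and the match
test are placeholders, not the real application):

program search_direct
   ! Sketch only: one READ per record, so each iteration pulls a
   ! full record across the network share.
   implicit none
   integer, parameter :: reclen = 10240   ! bytes per record (placeholder)
   character(len=reclen) :: record
   integer :: irec, ios
   logical :: found

   ! Assumes /assume:byterecl so that RECL counts bytes; by default
   ! CVF/IVF count RECL in 4-byte units for unformatted files.
   open (unit=10, file='shared.dat', access='direct', &
         form='unformatted', recl=reclen, status='old', &
         action='read', iostat=ios)
   if (ios /= 0) stop 'cannot open file'

   found = .false.
   irec = 0
   do
      irec = irec + 1
      read (10, rec=irec, iostat=ios) record   ! one network read per record
      if (ios /= 0) exit                       ! past end of file (or error)
      if (index(record, 'SEARCH-TARGET') > 0) then
         found = .true.
         exit
      end if
   end do

   if (found) then
      print *, 'match in record', irec
   else
      print *, 'no match'
   end if
   close (10)
end program search_direct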

So, does CVF/IVF do anything special to lessen network traffic, such as
downloading a large chunk of the file on first access and performing
many subsequent reads from a local copy? That seems unlikely, because
the file is multiple-access shared. Is there anything I can do to
remove this concern (I understand I could use a database server... but
not for this application)? Would a different record size be preferable
(e.g. one more efficient for TCP/IP)?
From: glen herrmannsfeldt on
GaryScott <garylscott(a)sbcglobal.net> wrote:

> I think I understand this sufficiently, but I wanted to hear others'
> understanding of the performance issues associated with network access
> to a normal disk area. I am specifically interested in CVF or IVF
> behavior (or in IVF specifically, if there are significant differences
> from CVF). I am unconcerned with portability to other platforms at
> this time.

It isn't supposed to depend on the compiler. Well, I know NFS somewhat
better than some of the other network disk protocols...

> I have held off implementing any sort of rapid search function in an
> application for fear of excessive network traffic each time I issue the
> next record read in a loop searching for specific content in a record.
> The particular file is direct access with a record length of something
> like 10240 bytes (I don't remember exactly, but it is a multiple of
> 512 bytes), so at least 10240 bytes are read each time.

This is a complicated question. Both networks and disks are getting
faster, so it may not matter as much as it used to.
NFS works fine with direct access. The usual transfer size
is 8K, so that might work a little better than 10K.
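
For what it's worth, an 8K record could be requested something like
this (just a sketch; the type and file name are invented, and
INQUIRE(IOLENGTH=) asks the compiler for RECL in its own units, which
avoids worrying about whether CVF/IVF count RECL in bytes or 4-byte
longwords):

program recl_example
   implicit none
   type :: rec8k
      character(len=8192) :: payload    ! pad the record to exactly 8192 bytes
   end type rec8k
   type(rec8k) :: r
   integer :: lrec, ios

   inquire (iolength=lrec) r            ! RECL for one record, in this compiler's units
   open (unit=11, file='shared.dat', access='direct', &
         form='unformatted', recl=lrec, status='old', &
         action='read', iostat=ios)
   if (ios /= 0) stop 'cannot open file'

   read (11, rec=1, iostat=ios) r       ! one 8 KB transfer per record
   if (ios == 0) print *, r%payload(1:40)
   close (11)
end program recl_example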

> So, does CVF/IVF do anything special to lessen network traffic, such as
> downloading a large chunk of the file on first access and performing
> many subsequent reads from a local copy? That seems unlikely, because
> the file is multiple-access shared.

The basic idea behind most network file protocols is that they
should be transparent: things should work just the same as
they would for a local disk. NFS is well known, at least in
the Sun implementations, to do read-ahead (the assumption being
sequential access). Also, Unix and many Unix-like systems will
do local disk buffering.

About 10 years ago I was working on a project reading and writing
large files, mostly NFS mounted. Once, while testing a program, I
realized it was running faster than 100-megabit Ethernet could deliver,
even though the file was about 600 megabytes. Then I remembered that
it was running on a machine with 4 GB of memory, which could cache the
whole 600 MB file.

With NFS and Unix-like systems, I believe that there is a good
chance of a local cache for reading. It is NFS tradition that
writes not be acknowledged until the data has been written to
the actual disk. That is less true now, partly because disks
have internal cache, because the OS might cache writes, and
because the client might cache them. In all three cases,
precautions are supposed to be taken, but there are overrides.

> Is there anything I can do to
> remove this concern (I understand I could use a database server... but
> not for this application)? Would a different record size be preferable
> (e.g. one more efficient for TCP/IP)?

How fast does it need to be? I would go at least to gigabit
Ethernet, very affordable these days for small networks.
NFS has some tunable parameters, such as the read and write size.
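
One thing you can do at the application level, independent of NFS
tuning, is to fetch several logical records per request. As far as I
know, CVF/IVF unformatted direct-access files are plain fixed-size
slots with no per-record headers, so the same file can be opened with
a RECL covering a block of records and scanned locally. A rough
sketch, with placeholder names and sizes:

program blocked_search
   implicit none
   integer, parameter :: lrecl = 10240             ! logical record, bytes (placeholder)
   integer, parameter :: nblk  = 8                 ! logical records per transfer
   character(len=lrecl) :: recs(nblk)
   integer :: iblk, i, ios, lrec
   logical :: found

   inquire (iolength=lrec) recs                    ! RECL for one block, compiler units
   open (unit=12, file='shared.dat', access='direct', &
         form='unformatted', recl=lrec, status='old', &
         action='read', iostat=ios)
   if (ios /= 0) stop 'cannot open file'

   found = .false.
   iblk = 0
   outer: do
      iblk = iblk + 1
      read (12, rec=iblk, iostat=ios) recs         ! one request fetches nblk records
      if (ios /= 0) exit outer                     ! end of file (a short final block also errors,
                                                   ! so tail records need a record-at-a-time pass)
      do i = 1, nblk
         if (index(recs(i), 'SEARCH-TARGET') > 0) then
            found = .true.
            print *, 'match in logical record', (iblk-1)*nblk + i
            exit outer
         end if
      end do
   end do outer
   if (.not. found) print *, 'no match'
   close (12)
end program blocked_search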

-- glen
From: GaryScott on
On Mar 5, 5:03 pm, glen herrmannsfeldt <g...(a)ugcs.caltech.edu> wrote:
> GaryScott <garylsc...(a)sbcglobal.net> wrote:
>
snip

> You mean multiple writers?  Unix pretty much allows programs to
> do whatever they want, including having more than one program write
> at the same time.  Windows does more locking, even not allowing
> you to read some files that you should be able to read.
>
> NFS includes a lock mechanism, but it never worked very well,
> and as far as I know, still doesn't.  

I handle locking in the application itself. The application must
signal other applications that it wants exclusive write access on a
record-by-record basis. Only one user has write access at a time, but
only to the level of an individual direct access record (and its
associated blobs). However, as far as the OS is concerned, it knows
nothing of this locking mechanism and considers the file fully shared
and unlocked.
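
Purely as an illustration of the idea, not the actual mechanism (which
I'm not showing here), a per-record claim might look roughly like this
sketch; a bare read-check-write over a shared file is not race-free on
its own, so the real signalling has to arbitrate that separately:

module record_lock_sketch
   implicit none
   integer, parameter :: payload_len = 10236       ! 10240 bytes minus the 4-byte owner field

   type :: shared_rec
      integer :: owner                             ! 0 = unlocked, otherwise a user/station id
      character(len=payload_len) :: payload
   end type shared_rec

contains

   logical function try_claim(unit, irec, my_id) result(got_it)
      ! Illustrative only: read the record, and if no one owns it,
      ! rewrite it with this writer's id in the owner field.
      integer, intent(in) :: unit, irec, my_id
      type(shared_rec) :: r
      integer :: ios
      read (unit, rec=irec, iostat=ios) r
      got_it = (ios == 0 .and. r%owner == 0)
      if (got_it) then
         r%owner = my_id
         write (unit, rec=irec, iostat=ios) r      ! claim the record for this writer
         got_it = (ios == 0)
      end if
   end function try_claim

end module record_lock_sketch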

>
snip
>

> -- glen

From: GaryScott on
On Mar 5, 5:51 pm, GaryScott <garylsc...(a)sbcglobal.net> wrote:
> (snip)

P.S. I fibbed; I actually have implemented this search function (I
forgot - oh no, the brain cells are going). What I did was slow it
down by adding a delay between read operations. :) It appears to show
performance similar to a normal Oracle database search function - lol,
the secret of Oracle unmasked...
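
The throttle is nothing fancy; roughly something like this sketch
(file name, record length, and the 20 ms delay are placeholders, and
SLEEPQQ is the portability routine from IVF's IFPORT module; the CVF
module name may differ):

program throttled_search
   use ifport                                      ! provides SLEEPQQ under IVF
   implicit none
   integer, parameter :: reclen = 10240            ! placeholder record length, bytes
   character(len=reclen) :: record
   integer :: irec, ios

   ! Assumes /assume:byterecl so that RECL counts bytes.
   open (unit=13, file='shared.dat', access='direct', &
         form='unformatted', recl=reclen, status='old', action='read')

   do irec = 1, 100000
      read (13, rec=irec, iostat=ios) record
      if (ios /= 0) exit
      ! ... test the record here ...
      call sleepqq (20)                            ! pause 20 ms between reads
   end do
   close (13)
end program throttled_search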


From: glen herrmannsfeldt on
GaryScott <garylscott(a)sbcglobal.net> wrote:
(snip)

> I handle locking in the application itself. The application must
> signal other applications that it wants exclusive write access on a
> record-by-record basis. Only one user has write access at a time, but
> only to the level of an individual direct access record (and its
> associated blobs). However, as far as the OS is concerned, it knows
> nothing of this locking mechanism and considers the file fully shared
> and unlocked.

It seems to me that is the Unix way.

-- glen