From: Gary L. Scott on
On 3/5/2010 6:18 PM, glen herrmannsfeldt wrote:
> GaryScott<garylscott(a)sbcglobal.net> wrote:
> (snip)
>
>> I handle locking in the application itself. The application must
>> signal other applications that it wants exclusive write access on a
>> record-by-record basis. Only one user has write access at a time, but
>> only to the level of an individual direct access record (and its
>> associated blobs). However, as far as the OS is concerned, it knows
>> nothing of this locking mechanism and considers the file fully shared
>> and unlocked.
>
> It seems to me that is the unix way.

It's a Windows application.
>
> -- glen

From: Louis Krupp on
GaryScott wrote:
> On Mar 5, 4:13 pm, glen herrmannsfeldt <g...(a)ugcs.caltech.edu> wrote:
>> GaryScott <garylsc...(a)sbcglobal.net> wrote:
> <snip>
>>> I have prevented myself from implementing any sort of rapid search
>>> function in an application because of fear of excessive network
>>> traffic each time I issue the next record read in a loop searching for
>>> specific content in the record. The particular file is direct access
>>> with something like 10240 bytes (don't remember exactly but a multiple
>>> of 512 bytes). So at least 10240 bytes is read each time.
>> This is a complicated question. Both networks and disks are getting
>> faster, so that maybe it doesn't matter so much as it used to.
>> NFS works fine with direct access. The usual transfer size
>> is 8K, so that might work a little better than 10K.
>
> 8K would force me to break the record into two records as designed,
> but I could do that with a redesign. I was reserving additional lines
> for related "blobs" (e.g. attachments) but could define a new record
> type easily enough.

I would try reading dummy data with 10240-byte records (or whatever size
is convenient) and see if performance is OK. If it is, I would think
twice before redesigning the code around the current transfer size.
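
The dummy-data test suggested above could be sketched roughly as follows. This is only an illustration, written in Python rather than Fortran for brevity. The file name, the record count, and the helper names (write_dummy_file, read_record, time_scan) are assumptions made for the sketch, not anything from the application under discussion. To get a meaningful number, the path would have to point at the network share rather than a local temporary directory.

```python
import os
import tempfile
import time

RECORD_SIZE = 10240  # bytes per record, the size mentioned in the thread


def write_dummy_file(path, n_records, record_size=RECORD_SIZE):
    """Create a file of n_records fixed-length records filled with dummy data."""
    with open(path, "wb") as f:
        for i in range(n_records):
            # Start each record with its 1-based number so reads can be checked.
            header = b"%08d" % (i + 1)
            f.write(header.ljust(record_size, b"."))


def read_record(f, rec_no, record_size=RECORD_SIZE):
    """Read record rec_no (1-based, as in Fortran direct access)."""
    f.seek((rec_no - 1) * record_size)
    return f.read(record_size)


def time_scan(path, n_records, record_size=RECORD_SIZE):
    """Read every record once, one read call per record; return elapsed seconds."""
    start = time.perf_counter()
    # buffering=0 disables Python's own buffering, so each read is one OS call.
    with open(path, "rb", buffering=0) as f:
        for rec_no in range(1, n_records + 1):
            read_record(f, rec_no, record_size)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Hypothetical location: replace with a path on the network share.
    path = os.path.join(tempfile.gettempdir(), "dummy_records.dat")
    write_dummy_file(path, 200)
    elapsed = time_scan(path, 200)
    print("200 records of %d bytes in %.3f s" % (RECORD_SIZE, elapsed))
    os.remove(path)
```

Running the same scan with a few different record sizes would show whether the record length matters at all on a given network before any redesign is attempted.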

Louis
From: JB on
On 2010-03-05, GaryScott <garylscott(a)sbcglobal.net> wrote:
> So, does CVF/IVF do anything special to lessen network traffic such as
> downloading a large chunk of the file on first access and performing
> many subsequent reads from a local copy?

Most runtime libraries do some amount of I/O buffering. Typically the
buffer is not particularly large, as it's mostly meant for masking
syscall overhead rather than network latency. Gfortran, for instance,
uses an 8 kB buffer per unit. However, my reading of the ifort manual
seems to suggest it doesn't do any buffering for direct access
files. Then again, since your records are 10 kB, enabling buffering in
the I/O library would probably not make much difference.

> that seems unlikely because
> the file is multiple-access shared. Is there any thing I can do to
> remove this concern (I understand I could use a database server...but
> not for this application).

This IMHO is the issue to worry about. As you're on Windows you're
presumably using the CIFS protocol. CIFS by default is uncached,
meaning that every read or write goes directly to the server. However,
there is a feature called opportunistic locks (oplocks), where the CIFS
client can lock a file for exclusive access, allowing aggressive
client-side caching. But if another client accesses the file, the server
must recall any such locks, meaning that the client holding an oplock
must flush its dirty data and revert to the default uncached
behavior. Note that this locking is not visible to applications in any
way.

The solution to this problem would be to have all I/O go through a
server component rather than directly to the file. And of course, many
search problems can be solved by using indexing rather than raw
scanning through all the data. But at that point it's probably better
to use some database anyway instead of reinventing the wheel.
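
To make the indexing idea concrete, here is a minimal sketch in Python. It assumes, purely for illustration, that the first 16 bytes of each fixed-length record hold a search key. The real record layout is not described in this thread, so KEY_SIZE, build_index, and read_record are hypothetical names. The file is scanned once to build the index, and later searches fetch only the matching records.

```python
import os
import tempfile
from collections import defaultdict

RECORD_SIZE = 10240  # assumed record length in bytes
KEY_SIZE = 16        # hypothetical: the first 16 bytes of a record hold its key


def build_index(path, record_size=RECORD_SIZE, key_size=KEY_SIZE):
    """Scan the file once and map each key to the record numbers holding it."""
    index = defaultdict(list)
    with open(path, "rb") as f:
        rec_no = 0
        while True:
            record = f.read(record_size)
            if len(record) < record_size:
                break  # end of file, or a trailing partial record
            rec_no += 1
            # Strip blank or NUL padding so lookups can use the bare key.
            key = record[:key_size].rstrip(b" \0")
            index[key].append(rec_no)
    return index


def read_record(path, rec_no, record_size=RECORD_SIZE):
    """Fetch a single record by its 1-based record number."""
    with open(path, "rb") as f:
        f.seek((rec_no - 1) * record_size)
        return f.read(record_size)


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "demo.dat")
    with open(path, "wb") as f:
        for key in (b"alpha", b"beta", b"alpha"):
            f.write(key.ljust(RECORD_SIZE, b" "))
    index = build_index(path)
    # Only the records listed for the key need to be read over the network.
    for rec_no in index[b"alpha"]:
        print(rec_no, read_record(path, rec_no)[:KEY_SIZE])
```

With a shared file the index goes stale as soon as another user rewrites a record, so something still has to keep it current. That is the kind of bookkeeping a server component or a database would take over.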

> Would a different record size be preferred
> (e.g. one more efficient for TCP/IP)?

Maybe. Unfortunately the other subthread went a bit off the rails
wrt. transfer sizes, so let's see:

- CIFS supports a maximum transfer size of 64 kB; experiments suggest
the Windows CIFS client uses at most around 60 kB for reads and 32
kB for writes.

- Newer versions of Windows (Vista+, w2k3r2+ IIRC) use the newer SMB2
protocol, which supports much larger maximum transfer sizes.

- NFS is irrelevant to the current discussion, but just to set the
record straight: NFSv2 over UDP allowed a maximum 8 kB. A decade or
so ago most shops switched to NFSv3 over TCP, and at the time most
implementations defaulted to a maximum 32 kB transfer size which was
the "standard" for a long time. Currently, many implementations
support and default to up to 1 MB max transfer size. Wrt. the
caching discussion, NFS differs from CIFS by using a weaker
consistency model called "close-to-open" consistency; see the NFS
RFC for details.

Wrt. your application, I'd say use records as big as makes sense from
an application perspective, the bigger the better. If the records have
to be split up at some point, so be it. Well, within reason of course;
multi-GB records might make you run out of memory on the client. But
records of, say, several MB are certainly reasonable for today's
networks and servers.
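
If the record length itself cannot change, a search loop can still issue larger requests by reading several records per call. The following Python sketch assumes 10240-byte records and six records per read (61440 bytes, just under the roughly 60 kB read size mentioned above). The function name scan_in_batches and the predicate argument are made up for the example, and whether the OS really turns each call into a single network request is not guaranteed.

```python
import os
import tempfile


def scan_in_batches(path, predicate, record_size=10240, records_per_read=6):
    """Return the 1-based numbers of records for which predicate(record) is true."""
    matches = []
    rec_no = 0
    # Default (buffered) open: read(n) returns n bytes unless end of file is hit,
    # so record boundaries stay aligned from one batch to the next.
    with open(path, "rb") as f:
        while True:
            chunk = f.read(record_size * records_per_read)
            if not chunk:
                break
            # Walk the complete records in this batch; a trailing partial
            # record at end of file is ignored.
            for off in range(0, len(chunk) - record_size + 1, record_size):
                rec_no += 1
                if predicate(chunk[off:off + record_size]):
                    matches.append(rec_no)
    return matches


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "demo.dat")
    with open(path, "wb") as f:
        for i in range(20):
            tag = b"needle" if i % 7 == 0 else b"hay"
            f.write(tag.ljust(10240, b" "))
    print(scan_in_batches(path, lambda rec: rec.startswith(b"needle")))
```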

--
JB
From: glen herrmannsfeldt on
JB <foo(a)bar.invalid> wrote:
(snip)

> - NFS is irrelevant to the current discussion, but just to set the
> record straight: NFSv2 over UDP allowed a maximum 8 kB. A decade or
> so ago most shops switched to NFSv3 over TCP, and at the time most
> implementations defaulted to a maximum 32 kB transfer size which was
> the "standard" for a long time. Currently, many implementations
> support and default to up to 1 MB max transfer size. Wrt. the
> caching discussion, NFS differs from CIFS by using a weaker
> consistency model called "close-to-open" consistency; see the NFS
> RFC for details.

MS has supplied SFU (Services for UNIX) for W2K and XP, which
includes both an NFS client and an NFS server. It comes as a separate,
free download from microsoft.com.

For Vista and, as far as I can tell, Windows 7, there is SUA,
Subsystem for UNIX-based Applications. SUA, for versions of Vista
and W7, includes an NFS client, but not a server.

An NFS server, according to some web sites, is included with
(or maybe optional for) Windows Server 2008.

Otherwise, for Unix utilities but not NFS, there is a port
of the GNU utilities to Win32, UNXUTILS.ZIP, which I have
used on many different systems.

-- glen