From: Danmath on
On 11 Aug, 04:42, David Schwartz <dav...(a)webmaster.com> wrote:

> That won't solve your problem, since another application can open the
> file for write a split second after you open it.

There is only one application reading from the input directory as each
application has its own input directory. This is a batch process
reading always from the same directory. What you say could happen in a
case of open/write/close/open/write/close, which I'm not taking into
account, so I'm not particularly worried about this case. Although I
did mention in my original post that I was also interested in knowing
how to keep other applications away once I opened a file:

I quote myself:

"...although I would be interested in knowing how to keep other
applications (not programmed by me) from opening the file once I
opened it"

Spilling out a little bit more of information.... the current process
implements this by reading the last modification time, it waits for a
few seconds, then it checks the last modification date again. Of
course there is a validation done when the file is opened for
processing that checks fopen() doesn't return error, in which case it
carries on with the next file, after logging. Now, if I could get this
fopen() call to return an error if the file is being written to, or
replace this call with whatever file-opening mechanism would do
this, then the modification time check could be removed from the program.
It wouldn't fix the OWCOWC problem, but the current version doesn't
either. I just don't like this modification time checking. If I could
open the file knowing it will return error if some other application
is no writing to it already, that would be good enough, being able to
keep it from being opened in write mode by other applications would be
great.

Talking about modification times, when are they set? Every time data
gets written, even without closing the file? What if the program
writing the file only writes once a minute.... I don't think
the current algorithm is reliable if writes can be that sporadic, and
waiting for larger amounts of time is a waste of time.

By the way, currently the process uses fread() and fwrite() to access
files. If you were to replace fopen() with another function like
open(), how could you create a valid FILE structure associated with
the new file descriptor, in a safe way, so that the rest of the program
doesn't have to be changed to read() and write()?

I agree with you that the only totally foolproof solution requires
cooperation if exclusive file opening is not possible. It's just that
this would require modifications to various processes and file
transmission tools in use, and I'm not saying that it wouldn't be
worth it, it's just that I know what happens in this company when I
make suggestions that could improve something that isn't causing
deaths... "yeah...mmm.. we will look into it...mmm...".

From: Gordon Burditt on
>There is only one application reading from the input directory as each
>application has its own input directory. This is a batch process

I strongly recommend that the application filter out the files ".",
"..", directories, special files, sockets, FIFOs, and anything
matching "*.core" as candidates for input files. While you're at
it, you could filter out "*.tmp" as well.
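
A minimal sketch of such a filter in C, assuming POSIX lstat() is
available. The function names name_is_candidate() and is_candidate()
are made up for illustration and are not part of the existing program:

```c
#define _POSIX_C_SOURCE 200809L
#include <string.h>
#include <sys/stat.h>

/* Returns 1 if `name` ends with `suffix`, else 0. */
static int has_suffix(const char *name, const char *suffix)
{
    size_t n = strlen(name), s = strlen(suffix);
    return n >= s && strcmp(name + n - s, suffix) == 0;
}

/* Name-only filter: rejects ".", "..", "*.core" and "*.tmp". */
int name_is_candidate(const char *name)
{
    if (strcmp(name, ".") == 0 || strcmp(name, "..") == 0)
        return 0;
    return !has_suffix(name, ".core") && !has_suffix(name, ".tmp");
}

/* Full filter: `path` is the full path of the directory entry `name`.
   lstat() does not follow symbolic links, so only plain regular files
   pass; directories, FIFOs, sockets, devices and symlinks are rejected. */
int is_candidate(const char *path, const char *name)
{
    struct stat st;

    if (!name_is_candidate(name))
        return 0;
    if (lstat(path, &st) != 0)
        return 0;               /* vanished or unreadable: skip it */
    return S_ISREG(st.st_mode) ? 1 : 0;
}
```

The batch process would call is_candidate() for each entry returned by
readdir() before applying any timestamp or open checks.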

>reading always from the same directory. What you say could happen in a
>case of open/write/close/open/write/close, which I'm not taking into
>account, so I'm not particularly worried about this case. Although I
>did mention in my original post that I was also interested in knowing
>how to keep other applications away once I opened a file:
>
>I quote myself:
>
>"...although I would be interested in knowing how to keep other
>applications (not programmed by me) from opening the file once I
>opened it"
>
>Spilling out a little bit more of information.... the current process
>implements this by reading the last modification time, it waits for a
>few seconds, then it checks the last modification date again. Of

Assuming the files are being transferred in with FTP, this isn't enough.
It is easy for a network hiccup, like one dropped packet, to cause
the modification time to get a few seconds old. It could get a minute
old before the TCP connection starts giving errors on either end.

Question: with your method of file transfer, imagine a file is
halfway transferred, then the network cable is cut. Does the partial
file get left there indefinitely, or does the (FTP, perhaps) daemon
eventually detect that the transfer has failed and *DELETE* the
partial incoming file? If not, can it be made to do so? How quickly
do failed partial incoming files vanish? That's a ballpark figure
for any timestamp age thresholds.

>course there is a validation done when the file is opened for
>processing that checks fopen() doesn't return error, in which case it
>carries on with the next file, after logging. Now, if I could get this
>fopen() call to return an error if the file is being written to, or
>replace this call with whatever file-opening mechanism would do
>this, then the modification time check could be removed from the program.

"the file is being written to" is not the condition you are looking
for, especially if the network is slower than the disk. It will
"flicker" on and off while the file is being transferred.

>It wouldn't fix the OWCOWC problem, but the current version doesn't
>either. I just don't like this modification time checking. If I could
>open the file knowing it will return error if some other application
>is no writing to it already,

You want it to return error if some other application is *NOT* writing
to it already?

>that would be good enough, being able to
>keep it from being opened in write mode by other applications would be
>great.

>Talking about modification times, when are they set? Every time data
>gets written, even without closing the file?

Yes. That's write() on the receiving end.

>What if the program
>writing the file only writes once a minute....

Then the modification time can easily get to be a minute old (watch
the modification time on a log file sometime). With a network file
transfer, this can easily happen before the sending process gets
an error, if it ever gets an error. The network congestion could
pass and the file transfer could eventually be completed, well after
your process read what it thought was a complete file (but wasn't).

If the sending process inserts long pauses between sending data
(perhaps it's reading data from something really slow, like a card
reader manually loaded by a human operator who has to order the
correct card deck from overseas, after getting budget approval to
continue, and the operator has to be replaced often because they
keep dying of boredom), the modification date could get really,
really, really old, like days, months, years, or decades while the
file is still open for write.

>I don't think
>the current algorithm is reliable if writes can be that sporadic, and
>waiting for larger amounts of time is a waste of time.

If it's unreliable, then waiting for longer amounts of time is not
wasteful.
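
For reference, a rough sketch of a timestamp age check in C, assuming
POSIX stat(). The name mtime_older_than() and the threshold parameter
are hypothetical. As discussed above, this is only a heuristic: a
stalled transfer can leave an incomplete file with an arbitrarily old
modification time.

```c
#include <sys/stat.h>
#include <time.h>

/* Returns 1 if the modification time of `path` is at least
   `min_age_seconds` in the past, 0 if it is more recent,
   and -1 if stat() fails. */
int mtime_older_than(const char *path, double min_age_seconds)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    return difftime(time(NULL), st.st_mtime) >= min_age_seconds ? 1 : 0;
}
```

A sensible threshold would be tied to how long the transfer tool takes
to give up on and delete a failed partial file, not to a few seconds.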

>By the way, currently the process uses fread() and fwrite() to access
>files. If you were to replace fopen() with another function like
>open(), how could you create a valid FILE structure associated with
>the new file descriptor, in a safe way, so that the rest of the program
>doesn't have to be changed to read() and write()?

Look up fdopen(). The point of this would most likely be to call open()
with various exotic flags, then proceed with the file copy using stdio
functions if the open succeeded. You can also go the other way with
fileno() to get the underlying file descriptor number if, say, you want
to put locks on it after fopen()ing it.
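
A minimal sketch of the open()-then-fdopen() direction, assuming a
POSIX system. The helper name open_stream() is made up for
illustration; any extra open flags or locking calls would go where the
comment indicates:

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Opens `path` read-only with open() and wraps the descriptor in a
   FILE * so existing fread()/fgets() code keeps working.
   Returns NULL on failure. */
FILE *open_stream(const char *path)
{
    FILE *fp;
    int fd = open(path, O_RDONLY);

    if (fd == -1)
        return NULL;

    /* Extra flags on open(), or lock calls on fd, would go here. */

    fp = fdopen(fd, "r");
    if (fp == NULL)
        close(fd);      /* fdopen() failed, so the descriptor is still ours */
    return fp;
}
```

Once fdopen() succeeds, fclose() on the returned stream also closes the
underlying descriptor, so the caller must not close(fd) separately.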

>I agree with you that the only totally foolproof solution requires
>cooperation if exclusive file opening is not possible. It's just that
>this would require modifications to various processes and file
>transmission tools in use, and I'm not saying that it wouldn't be
>worth it, it's just that I know what happens in this company when I
>make suggestions that could improve something that isn't causing
>deaths... "yeah...mmm.. we will look into it...mmm...".

*What* file transfer tools are in use? I wonder if this could be
handled by just modifying the FTP daemon *on the receiving end*
(the receiving machine, only) to put exclusive locks on files while
they are being transferred, *regardless* of what's on the other end
of the connection. It could be as simple as adding O_EXLOCK to the
open flags for opening the received file. Remember that only the
receiving machine has to support that.

I still think the transfer-and-rename approach has a lot to be said
for it. That could also include initially transferring the file
into a subdirectory, then renaming it to the top-level directory.
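
On the writing side, the idea can be sketched roughly like this in C.
The function name write_then_rename() and both path arguments are
hypothetical. It assumes the temporary and final names are on the same
filesystem, where POSIX rename() replaces the name atomically, so the
reader never sees a half-written file under the final name:

```c
#include <stdio.h>

/* Writes `len` bytes to `tmp_path`, then renames it to `final_path`.
   Returns 0 on success, -1 on failure (the temporary file is removed). */
int write_then_rename(const char *tmp_path, const char *final_path,
                      const void *data, size_t len)
{
    size_t written;
    int close_failed;
    FILE *fp = fopen(tmp_path, "wb");

    if (fp == NULL)
        return -1;

    written = fwrite(data, 1, len, fp);
    close_failed = (fclose(fp) != 0);   /* fclose() flushes buffered data */

    if (written != len || close_failed) {
        remove(tmp_path);
        return -1;
    }
    if (rename(tmp_path, final_path) != 0) {
        remove(tmp_path);
        return -1;
    }
    return 0;
}
```

The reading process then only needs to ignore the temporary name or
subdirectory; anything visible under a final name is complete.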

Another approach, used by things like print spoolers and UUCP, is
to transfer one or more data files, then transfer a "job" file which
names the data files to use and what to do with them. The "job"
file doesn't get created until the associated data files have been
transferred successfully. The "job" file also tends to be very
short (fits in one packet, contains a few lines mostly consisting
of filename(s) ).
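
A rough sketch of the reading side in C, assuming a made-up job file
format with one data filename per line. The name read_job() and the
JOB_NAME_LEN limit are hypothetical, not taken from any real spooler:

```c
#include <stdio.h>
#include <string.h>

#define JOB_NAME_LEN 256

/* Reads up to `max_names` data filenames from the job file, one per
   line, skipping blank lines. Returns the number of names stored, or
   -1 if the job file cannot be opened. Lines longer than
   JOB_NAME_LEN - 1 characters are not handled and would be split. */
int read_job(const char *job_path, char names[][JOB_NAME_LEN], int max_names)
{
    char line[JOB_NAME_LEN];
    int count = 0;
    FILE *fp = fopen(job_path, "r");

    if (fp == NULL)
        return -1;

    while (count < max_names && fgets(line, sizeof line, fp) != NULL) {
        line[strcspn(line, "\r\n")] = '\0';   /* strip the line ending */
        if (line[0] == '\0')
            continue;                          /* skip blank lines */
        strcpy(names[count], line);            /* fits: same buffer size */
        count++;
    }
    fclose(fp);
    return count;
}
```

The batch process would scan only for job files and open just the data
files they name, so partially transferred data files are never touched.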


From: Rainer Weikusat on
Danmath <danmath06(a)gmail.com> writes:
> On 11 Aug, 04:42, David Schwartz <dav...(a)webmaster.com> wrote:
>> That won't solve your problem, since another application can open the
>> file for write a split second after you open it.
>
> There is only one application reading from the input directory as each
> application has its own input directory. This is a batch process
> reading always from the same directory. What you say could happen in a
> case of open/write/close/open/write/close, which I'm not taking into
> account, so I'm not particularly worried about this case. Although I
> did mention in my original post that I was also interested in knowing
> how to keep other applications away once I opened a file:
>
> I quote myself:
>
> "...although I would be interested in knowing how to keep other
> applications (not programmed by me) from opening the file once I
> opened it"
>
> Spilling out a little bit more of information.... the current process
> implements this by reading the last modification time, it waits for a
> few seconds, then it checks the last modification date again. Of
> course there is a validation done when the file is opened for
> processing that checks fopen() doesn't return error, in which case it
> carries on with the next file, after logging. Now, if I could get this
> fopen() call to return an error if the file is being written to, or
> replace this call with whatever file-opening mechanism would do
> this, then the modification time check could be removed from the
> program.

I assume that you are targeting Linux since you are using it for
posting. Have you considered taking really drastic measures such as
'consulting the reference documentation'? This can be done by trying
to acquire a write lease on this file and the detailed description of how
to do that is in the fcntl manpage. Excerpt:

F_WRLCK

Take out a write lease. This will cause the caller to be
notified when the file is opened for reading or writing or is
truncated. A write lease may be placed on a file only if
there are no other open file descriptors for the file.

#define _GNU_SOURCE   /* F_SETLEASE is a Linux extension */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

static int die(char const *msg)
{
    perror(msg);
    exit(1);
    return -1;
}

int main(void)
{
    pid_t pid;
    int fd;

    /* the parent plays the writer and keeps the file open */
    fd = open("file", O_RDWR | O_CREAT, 0666);
    fd != -1 || die("open/ create");

    pid = fork();
    pid != -1 || die("fork");

    if (pid == 0) {
        /* the child plays the reader: drop the inherited descriptor */
        close(fd);

        fd = open("file", O_RDONLY | O_NONBLOCK, 0);
        fd != -1 || die("open/ read");

        /* the read lease is refused while the file is open for writing */
        while (fcntl(fd, F_SETLEASE, F_RDLCK) == -1) {
            perror("fcntl");
            sleep(1);
        }

        write(1, "I kid you not!", sizeof("I kid you not!") - 1);
        _exit(0);
    }

    /* the writer exits after 5 seconds, which closes its descriptor */
    sleep(5);
    return 0;
}
From: Rainer Weikusat on
Rainer Weikusat <rweikusat(a)mssgmbh.com> writes:

[...]

> by trying to acquire a write lease on this file

[...]

> while (fcntl(fd, F_SETLEASE, F_RDLCK) == -1) {

This should of course have been 'a read lease', as done in the sample
code.
From: Danmath on
On 12 Aug, 11:32, Rainer Weikusat <rweiku...(a)mssgmbh.com> wrote:

> I assume that you are targeting Linux since you are using it for
> posting.

No. I post from personal computers. Running Windows in this case.

This application runs on a server, currently AIX. But I don't want to
program specifically for a single OS.