From: writeson on 9 Jul 2008 11:52

Hi all,

I'm writing some code that monitors a directory for the appearance of files from a workflow. When those files appear I write a command file to a device that tells the device how to process the file. The appearance of the command file triggers the device to grab the original file. My problem is I don't want to write the command file to the device until the original file from the workflow has been copied completely. Since these files are large, my program has a good chance of scanning the directory while they are mid-copy, so I need to determine which files are finished being copied and which are still mid-copy.

I haven't seen anything on Google talking about this, and I don't see an obvious way of doing this using the os.stat() method on the filepath. Anyone have any ideas about how I might accomplish this?

Thanks in advance!
Doug
From: Larry Bates on 9 Jul 2008 12:09

The best way to do this is to have the program that copies the files write each one under a temporary name and rename it when the copy is complete. That way you know a file is done by scanning only for files that match a specific mask.

If that is not possible, you might be able to use pyinotify (http://pyinotify.sourceforge.net/) to watch for IN_CLOSE_WRITE events on the directory and then process the files.

-Larry
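A minimal sketch of the rename approach, assuming the copying program can be changed. The names copy_then_rename, finished_files, and the ".part" suffix are hypothetical and chosen only for illustration; the "*.pdf" mask is borrowed from later in the thread.

```python
import glob
import os
import shutil

TMP_SUFFIX = ".part"  # hypothetical marker for copies still in progress


def copy_then_rename(src, dest_dir):
    """Copy src into dest_dir under a temporary name, then rename it."""
    final_path = os.path.join(dest_dir, os.path.basename(src))
    tmp_path = final_path + TMP_SUFFIX
    shutil.copyfile(src, tmp_path)
    # Both names are in the same directory, so the rename stays on one
    # filesystem; the final name only appears once the copy is complete.
    os.replace(tmp_path, final_path)
    return final_path


def finished_files(dest_dir, mask="*.pdf"):
    """Return files matching the mask; names ending in TMP_SUFFIX do not match."""
    return sorted(glob.glob(os.path.join(dest_dir, mask)))
```

The watcher never has to inspect file contents or timestamps here: a file that matches the mask is, by construction, a finished file.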
From: Manuel Vazquez Acosta on 9 Jul 2008 12:09

This seems to be a synchronization problem. A scenario description could clear things up so we can help:

Program W (the workflow) copies file F to directory B.
Program D (the dog) polls directory B to find out whether there's any new file F.

In this scenario, program D does not know whether F has been fully copied, but W does.

Solution: Create a custom lock mechanism. Program W writes a file B/F.lock to indicate that file F is not complete, and removes it when F is fully copied. If program W crashes mid-copy, both F and F.lock are kept, so program D does not bother to process F. Recovery from a crash in W would be another issue to tackle.

Best regards,
Manuel.
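The lock-file scheme could look roughly like the sketch below. It again assumes program W can be modified; copy_with_lock, ready_files, and the ".lock" suffix are illustrative names, not anything taken from an existing tool.

```python
import os
import shutil

LOCK_SUFFIX = ".lock"  # hypothetical suffix for the lock file next to F


def copy_with_lock(src, dest_dir):
    """Program W: create F.lock, copy F, then remove F.lock."""
    final_path = os.path.join(dest_dir, os.path.basename(src))
    lock_path = final_path + LOCK_SUFFIX
    # Create the lock before the first byte of F is written.
    with open(lock_path, "w"):
        pass
    shutil.copyfile(src, final_path)
    # Only reached when the copy succeeded; a crash leaves F.lock behind.
    os.remove(lock_path)
    return final_path


def ready_files(dest_dir):
    """Program D: return files in dest_dir that have no matching lock file."""
    names = set(os.listdir(dest_dir))
    ready = []
    for name in sorted(names):
        if name.endswith(LOCK_SUFFIX):
            continue  # the lock itself is never a candidate
        if name + LOCK_SUFFIX in names:
            continue  # still being copied, or left over from a crash
        ready.append(os.path.join(dest_dir, name))
    return ready
```

As noted above, a stale F.lock after a crash keeps F out of the ready list indefinitely, so some separate cleanup step would still be needed.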
From: norseman on 9 Jul 2008 13:07

Also available:

pgm-W copies/creates-fills whatever as B/dummy
when done, pgm-W renames B/dummy to B/F
pgm-D only scouts for B/F and does its thing when found

Steve
norseman(a)hughes.net
From: writeson on 9 Jul 2008 14:48

Guys,

Thanks for your replies, they are helpful. I should have mentioned in my initial question that I don't have as much control over the program that writes (pgm-W) as I'd like. Otherwise, the write-to-a-different-filename-then-rename solution would work great.

Is there no way to tell from os.stat() when the file has finished being copied? I ran some test programs: one continuously copies big files from one directory to another, and the other continuously does a glob.glob("*.pdf") on those files and looks at the st_atime and st_mtime fields of the return value of os.stat(filename). From that experiment it looks like st_atime and st_mtime equal each other until the file has finished being copied. Nothing in the documentation about st_atime or st_mtime leads me to think this is guaranteed; it's just what I observed with the two test programs I've described.

Any thoughts?

Thanks!
Doug
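For anyone wanting to repeat the observation, the polling side of that experiment might be sketched as follows. The function name atime_mtime_report is made up for illustration. Treat the result as a diagnostic only: whether st_atime is updated at all depends on the filesystem and its mount options (for example noatime), so equal timestamps are not a documented signal that a copy is still in progress.

```python
import glob
import os


def atime_mtime_report(directory, pattern="*.pdf"):
    """Return (name, st_atime, st_mtime, equal) for each file matching pattern."""
    report = []
    for path in sorted(glob.glob(os.path.join(directory, pattern))):
        st = os.stat(path)
        report.append(
            (
                os.path.basename(path),
                st.st_atime,
                st.st_mtime,
                st.st_atime == st.st_mtime,
            )
        )
    return report
```

Calling this in a loop while another process copies files into the directory would show whether the equality pattern described above holds on a given system.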