From: Charles on 4 May 2010 09:02
"Gregory Ewing" <greg.ewing(a)canterbury.ac.nz> wrote in message
> Charles wrote:
>> In the OP's case, references to the directory have been removed
>> from the file system, but his process still has the current working
>> directory reference to it, so it has not actually been deleted.
>> When he opens "../abc.txt", the OS searches the current directory
>> for ".." and finds the inode for /home/baz/tmp,
> This doesn't seem to be quite correct. An experiment I just did
> reveals that the link count on the parent directory goes down by
> one when the current directory is deleted, suggesting that the ..
> link has actually been removed... yet it still works!
> I think what must be happening is that the kernel is maintaining
> an in-memory reference to the parent directory, and treating ".."
> as a special case when looking up a name.
> (This probably shouldn't be too surprising, because ".." is special
> in another way as well -- at the root of a mounted file system, it
> leads to the parent of the mount point, even though the actual ".."
> link on disk just points back to the same directory. Probably it
> simplifies the name lookup logic to always treat it specially.)
I am by no means an expert in this area, but what I think
happens (and I may well be wrong) is that the directory
is deleted on the file system. The link from the parent
is removed, and the parent's link count is decremented,
as you observed, but the directory itself is still intact with
its original contents, including the "." and ".." names and
associated inode numbers. Unix does not normally zero
out files on deletion - the file's blocks usually retain their
contents, and I would not expect directories to be
a special case.
The blocks from the directory will be re-cycled
when the memory reference (the process's current
working directory) disappears, but until then the directory
and its contents are still accessible via the process's current
directory. This is all supposition, based on distant
memories from the mid-1980s, so I could very well be wrong.
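
For anyone who wants to repeat the experiment discussed above, here is a minimal sketch. It assumes a Linux-like system that allows removing the process's current directory; the helper name deleted_cwd_demo and the temporary file names are made up for illustration. It records the parent's link count before and after the rmdir, checks whether os.getcwd() still works, and tries to read "../abc.txt" from inside the removed directory. Some filesystems do not count subdirectories in st_nlink, so the link count may not change everywhere.

```python
import os
import shutil
import tempfile


def deleted_cwd_demo():
    """Remove the current directory, then see what still works from inside it."""
    saved = os.getcwd()
    parent = os.path.realpath(tempfile.mkdtemp())
    child = os.path.join(parent, "tmp")
    os.mkdir(child)
    with open(os.path.join(parent, "abc.txt"), "w") as f:
        f.write("hello")

    links_before = os.stat(parent).st_nlink
    os.chdir(child)
    try:
        os.rmdir(child)  # unlink the directory we are standing in
        links_after = os.stat(parent).st_nlink
        try:
            os.getcwd()
            getcwd_ok = True
        except OSError:
            getcwd_ok = False  # the cwd no longer has a name to report
        # ".." is resolved relative to the (now nameless) current directory
        with open(os.path.join(os.pardir, "abc.txt")) as f:
            content = f.read()
    finally:
        os.chdir(saved)
        shutil.rmtree(parent, ignore_errors=True)
    return links_before, links_after, getcwd_ok, content


if __name__ == "__main__":
    print(deleted_cwd_demo())
```

The sketch only observes behaviour; it cannot tell whether ".." is answered from the on-disk entry or from an in-memory reference held by the kernel, which is the point under debate.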
From: Baz Walter on 4 May 2010 09:03
On 04/05/10 03:19, Grant Edwards wrote:
> On 2010-05-03, Baz Walter<bazwal(a)ftml.net> wrote:
>> On 03/05/10 19:12, Grant Edwards wrote:
>>> Even though the user provided a legal and openable path?
>> that sounds like an operational definition to me: what's the
>> difference between "legal" and "openable"?
> Legal as in meets the syntactic requirements for a path (not sure if
> there really are any requirements other than it being a
> null-terminated string). Openable meaning that it denotes a file
> that exists and for which the caller has read permissions on the file
> and execute permissions on the directories within the path.
openable is not the same as accessible. a file can still be openable, even
though a user may not have permission to access it.
a better definition of "legal path" might be whether any useful
information can be gained from a stat() call on it.
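
a rough sketch of that distinction, assuming a POSIX system: stat() needs search permission on the directories in the path but not read permission on the file itself, so a path can be statable without being readable. the function name describe_path is made up for this example.

```python
import os


def describe_path(path):
    """Return (statable, readable) for path.

    statable: os.stat() succeeds, so the path names something that exists
              and every directory leading to it can be searched.
    readable: os.access() reports that the caller may read it.
    """
    try:
        os.stat(path)
        statable = True
    except OSError:
        statable = False
    readable = os.access(path, os.R_OK)
    return statable, readable
```

os.access() answers for the real uid/gid and is only a hint; the reliable test for "openable" is still to call open() and handle the error.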
From: Ben Finney on 4 May 2010 09:23
Baz Walter <bazwal(a)ftml.net> writes:
> On 04/05/10 02:12, Ben Finney wrote:
> > Baz Walter<bazwal(a)ftml.net> writes:
> >> yes, of course. i forgot about hard links
> > Rather, you forgot that *every* entry that references a file is a
> > hard link.
> i'm not a frequent poster on this list, but i'm aware of its
> reputation for pointless pedantry ;-)
Only pointless if you view this as a conversation entirely for the
benefit of you and me. I, on the other hand, am trying to make this
useful to whoever may read it now and in the future.
> note that i said hard links (plural) - i think a more generous reader
> would assume i was referring to additional hard links.
The point, though, was that this is normal operation, rather than
exceptional. Files have zero or more hard links, and “this file has
exactly one hard link” is merely a common case among that spectrum.
I'm glad you already knew this, and hope you can appreciate that it's
better explicit than implicit.
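
A small sketch of that spectrum, assuming a filesystem that supports hard links (os.link() fails on some, such as FAT). The helper name link_counts and the file names are invented for the example. It creates a file, adds a second name for it, removes the first name, and records st_nlink at each step.

```python
import os
import shutil
import tempfile


def link_counts():
    """Record a file's link count as names are added and removed."""
    d = tempfile.mkdtemp()
    try:
        a = os.path.join(d, "a.txt")
        b = os.path.join(d, "b.txt")
        with open(a, "w") as f:
            f.write("data")
        counts = [os.stat(a).st_nlink]      # one name so far
        os.link(a, b)                       # second name for the same inode
        counts.append(os.stat(a).st_nlink)
        same = os.path.samefile(a, b)       # both names reach one file
        os.unlink(a)                        # remove the "original" name
        counts.append(os.stat(b).st_nlink)
        with open(b) as f:
            data = f.read()                 # contents are untouched
        return counts, same, data
    finally:
        shutil.rmtree(d, ignore_errors=True)
```

Neither name is more "real" than the other: after the os.unlink() call the file is reachable only through what was nominally the additional link.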
From: Nobody on 4 May 2010 09:29
On Tue, 04 May 2010 23:02:29 +1000, Charles wrote:
> I am by no means an expert in this area, but what I think happens (and I
> may well be wrong) is that the directory is deleted on the file system.
> The link from the parent is removed, and the parent's link count is
> decremented, as you observed, but the directory itself is still intact
> with its original contents, including the "." and ".." names and
> associated inode numbers. Unix does not normally zero out files on
> deletion - the file's blocks usually retain their contents, and I would
> not expect directories to be a special case.
You are correct.
System calls don't "delete" inodes (files, directories, etc), they
"unlink" them. Deletion occurs when the inode's link count reaches zero
and no process holds a reference to the inode (a reference could be a
descriptor, or the process' cwd, chroot directory, or an mmap()d file, etc).
IOW, reference-counted garbage collection.
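
The same rule can be seen with an ordinary file. The following is a minimal sketch, assuming a local POSIX filesystem; the function name unlinked_but_open is made up. The only name is removed while a descriptor is still open, and the data remains readable through that descriptor until it is closed.

```python
import os
import tempfile


def unlinked_but_open():
    """Unlink a file's only name while still holding a descriptor to it."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"still here")
        os.unlink(path)                  # remove the only directory entry
        exists = os.path.exists(path)    # the name is gone
        nlink = os.fstat(fd).st_nlink    # typically 0 on a local filesystem
        os.lseek(fd, 0, os.SEEK_SET)
        data = os.read(fd, 100)          # the inode and its blocks survive
    finally:
        os.close(fd)                     # last reference dropped; space can be reclaimed
    return exists, nlink, data
```

Network filesystems may behave differently (NFS, for example, renames the file instead of unlinking it while it is open), so the link count is reported here rather than assumed.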
From: Baz Walter on 4 May 2010 09:36
On 04/05/10 09:23, Gregory Ewing wrote:
> Grant Edwards wrote:
>> In your example, it's simply not possible to determine the file's
>> absolute path within the filesystem given the relative path you
> Actually, I think it *is* theoretically possible to find an
> absolute path for the file in this case.
> I suspect that what realpath() is doing for a relative path is
> something like:
> 1. Use getcwd() to find an absolute path for the current
> directory.
> 2. Chop off a trailing pathname component for each ".."
> on the front of the original path.
> 3. Tack the filename on the end of what's left.
> Step 1 fails because the current directory no longer has
> an absolute pathname -- specifically, it has no name in
> what used to be its parent directory.
> What realpath() is failing to realise is that it doesn't
> actually need to know the full path of the current directory,
> only of its parent directory, which is still reachable via
> ".." (if it weren't, the file wouldn't be reachable either,
> and we wouldn't be having this discussion).
> A smarter version of realpath() wouldn't try to find the
> path of the *current* directory, but would follow the
> ".." links until it got to a directory that it did need to
> know an absolute path for, and start with that.
> Unfortunately, there is no C stdlib routine that does the
> equivalent of getcwd() for an arbitrary directory, so
> this would require realpath() to duplicate much of
> getcwd()'s functionality, which is probably why it's
> done the way it is.
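
the three steps guessed at above can be sketched like this. it is only an illustration of that guess, not the actual realpath() implementation (it ignores symlinks entirely), and the name naive_realpath is made up.

```python
import os


def naive_realpath(relpath):
    """Resolve a relative path the way the three steps above describe."""
    parts = relpath.split(os.sep)
    # step 1: absolute path of the current directory; this is the call
    # that raises OSError once the current directory has been removed
    base = os.getcwd()
    # step 2: drop one trailing component for each leading ".."
    while parts and parts[0] == os.pardir:
        base = os.path.dirname(base)
        parts.pop(0)
    # step 3: tack the rest of the original path on the end
    return os.path.join(base, *parts)
```

because step 1 comes first, the function fails in a deleted directory even for "../abc.txt", where the name of the current directory is never actually needed.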
actually, this part of the problem can be achieved using pure python.
given the basename of a file, all you have to do is use os.stat and
os.listdir to recursively climb up the tree and build a dirpath for it.
start by doing os.stat(basename) to make sure you have a legal file in
the current directory; then use os.stat('..') to get the parent
directory inode, and stat each of the items in os.listdir('../..') to
find a name matching that inode etc. (note that the possibility of
hard-linked directories doesn't really spoil this - for relative paths,
we don't care exactly which absolute path is found).
this will work so long as the file is in a part of the filesystem that
can be traversed from the current directory to the root. what i'm not
sure about is whether it's possible to cross filesystem boundaries using
this kind of technique.
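
a minimal sketch of the climb described above, assuming a POSIX system where every ancestor directory can be listed. the names name_in_parent and abs_dirpath are made up. one deliberate change from the description: entries are matched on (st_dev, st_ino) rather than on the inode number alone, since inode numbers are only unique within one filesystem. that is my assumption about how to cope with mount points, and it has not been checked against every kind of mount.

```python
import os


def _ident(st):
    # device + inode identifies a directory across filesystems
    return (st.st_dev, st.st_ino)


def name_in_parent(dirpath):
    """Find the name under which dirpath is listed in its parent directory."""
    target = _ident(os.stat(dirpath))
    parent = os.path.join(dirpath, os.pardir)
    for name in os.listdir(parent):
        try:
            st = os.lstat(os.path.join(parent, name))  # do not follow symlinks
        except OSError:
            continue  # entry vanished or cannot be examined
        if _ident(st) == target:
            return name
    raise FileNotFoundError("no entry for %r in its parent" % dirpath)


def abs_dirpath(start=os.curdir):
    """Build an absolute path for a directory by climbing ".." links."""
    parts = []
    current = start
    while True:
        here = os.stat(current)
        up = os.path.join(current, os.pardir)
        if _ident(os.stat(up)) == _ident(here):
            break  # ".." points back at the same directory: this is the root
        parts.append(name_in_parent(current))
        current = up
    return os.sep + os.sep.join(reversed(parts))
```

in the deleted-directory case, abs_dirpath('.') raises because the current directory has no entry in its parent, but os.path.join(abs_dirpath('..'), 'abc.txt') should still produce a usable path for '../abc.txt'.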