From: Neil Brown on
On Wed, 28 Jul 2010 18:28:02 +0100
David Howells <dhowells(a)redhat.com> wrote:

> Neil Brown <neilb(a)suse.de> wrote:
>
> > ctime and mtime have real cache-coherence semantics which require them being
> > updated by the kernel (whether the cache is on an NFS client, in a backup
> > archive, or in a .o translation of a .c file).
>
> So does creation time, at least for CIFS caching. Creation time has potential
> for spotting when the object at a pathname has changed for something else,
> given the lack of inode number and inode generation from windows servers.
> Creation time gives us one more datum to use.

This justifies for me why a CIFS client would want to extract the
creation-time from the CIFS protocol, but not why you want to expose it via a
generic interface.
The kernel/filesystem doesn't need to maintain creation-time to meet this
need, only the CIFS server needs to maintain it - the kernel/filesystem just
needs to provide somewhere to store it - xattrs.

Given that we have an extensible attribute framework, it seems wrong to be
adding new attributes to *stat. If a given filesystem wants to store certain
attributes more efficiently, then it is welcome to intercept xattr calls and
store (say) "cifs.birthtime" directly at a known offset in the inode.
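
As a rough userspace sketch of the xattr approach (not taken from any posted
patch): the attribute name "user.cifs.birthtime" and the 16-byte encoding
below are assumptions made purely for illustration. The "user." prefix is
there because Linux restricts unprivileged writes to that namespace.

```python
import os
import struct
import tempfile

# Hypothetical attribute name. Linux only lets unprivileged processes
# write attributes in the "user." namespace, hence the prefix.
BIRTHTIME_XATTR = "user.cifs.birthtime"


def encode_birthtime(seconds, nanoseconds=0):
    """Pack a timestamp as two little-endian signed 64-bit integers.

    The on-disk format is an assumption made for this sketch.
    """
    return struct.pack("<qq", seconds, nanoseconds)


def decode_birthtime(raw):
    """Inverse of encode_birthtime: return (seconds, nanoseconds)."""
    return struct.unpack("<qq", raw)


def set_birthtime(path, seconds, nanoseconds=0):
    """Store the birth time as an xattr; return False if that is not possible."""
    if not hasattr(os, "setxattr"):  # os.setxattr is Linux-only
        return False
    try:
        os.setxattr(path, BIRTHTIME_XATTR, encode_birthtime(seconds, nanoseconds))
    except OSError:
        # For example, the filesystem does not support user xattrs.
        return False
    return True


def get_birthtime(path):
    """Return (seconds, nanoseconds), or None if the attribute is absent."""
    if not hasattr(os, "getxattr"):
        return None
    try:
        return decode_birthtime(os.getxattr(path, BIRTHTIME_XATTR))
    except OSError:
        return None


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as f:
        if set_birthtime(f.name, 1280338082):
            print(get_birthtime(f.name))
        else:
            print("user xattrs not available here")
```

The kernel never interprets the value here; it only stores it, which is the
point being made above.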

The flip-side of extracting these various attributes is setting them. One
presumably doesn't want to set st_data_version, and possibly not st_gen, but
there seems to be a need to set st_btime, and FS_SYSTEM_FL and FS_TEMPORARY_FL
might need to be settable too. Your xstat doesn't give any way to do that;
xattrs already do - you just need to define names for the attributes.

So I'm against adding new attributes that simply involve the fs storing some
information for the application to use.

I'm still pondering those extra flags:
FS_SPECIAL_FL
FS_AUTOMOUNT_FL
FS_AUTOMOUNT_ANY_FL
FS_REMOTE_FL
FS_ENCRYPTED_FL
FS_OFFLINE_FL

They sound like they might be useful. They are not file metadata (like
btime) but rather implementation details (like st_blocks), so it is probably
sensible to include them as you have done.

However, I would really like to see clear and complete documentation for them.
When exactly should a filesystem set these flags, and what exactly can an
application assume if they are (or are not) set?

If a filesystem is mounted on a network block device, or a loop-back of a
file on NFS, is FS_REMOTE_FL set?
Is ROT13 enough for FS_ENCRYPTED_FL to be set?
If the NFS server is "not responding, still trying", should FS_OFFLINE_FL get
set on all files?
And I cannot even guess at the difference between the two FS_AUTOMOUNT flags.
I'm sure it is something useful, but doco would be good. Should one of them
be set on mountpoints that NFSv4 detects from the server?

They sound useful, but they are only really useful if they have precise
meanings.


>
> > The only role the kernel might have would be setting the 'creation time' when
> > the file was created, but it seems even that isn't always what is wanted,
> > because people don't so much want the time of creation of the
> > container-on-disk, but the time of creation of the data-content.
>
> That should be a timestamp in the content itself, not a filesystem metadata
> timestamp.
>
> > I would want to see a pretty convincing use-case that cannot be solved with
> > xattrs before 'creation time' was added to a generic kernel interface.
>
> Then there's no point even considering this. You could emulate the entirety
> of stat() with getxattr(). I've previously posted a patch to implement the
> retrieval of creation time, inode gen and data version as xattrs and been told
> that it's the wrong way to do it and I should extend stat instead.

:-( stuck between a rock and a hard-place ??

It would probably help to keep that sort of decision process (complete with
who to blame) documented in the change-log entry, but one never thinks of
doing that at the time.

I don't have any veto power here, and I don't want any. I think creation
time and inode gen make more sense as xattrs. I'm less certain about
data-version as the kernel has to know about it as a first-class concept.

If I have any power to influence the results, I would much rather spend it on
requiring clear precise useful definitions for each new attribute than on
determining which attributes should be first-class and which should be xattrs.


>
> > So just use xattrs and don't involve the kernel in any detailed knowledge of
> > this value.
>
> Why not? BSD has it in its stat struct. Windows has it in its Win32
> equivalents. Samba for one will look for it there, and use it if it is.
>
> Using an xattr means an extra pathwalk and extra locking per access for any
> program that wants it. It's a reasonable bet such a program will also be
> stat'ing the file it wants the creation time for.
>
> If we are going to extend stat anyway, then why not make out a short list of
> extra things we could usefully return and consider adding them? Something
> like creation time is reasonably easy to come by for little extra overhead.
> Ext4, for example, retains a copy of it in RAM in its inode struct.

Providing everybody imposes exactly the same semantics for "creation time"...
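
To illustrate the "little extra overhead" argument on systems whose stat
struct already carries a birth time: Python exposes st_birthtime on some
platforms (the BSDs and macOS, for instance) and omits it on others, so the
sketch below treats it as optional. The helper name stat_with_birthtime is
made up for this example, and what "birth time" means is whatever the
platform decides, which is exactly the semantics problem.

```python
import os


def stat_with_birthtime(path):
    """Return (size, mtime, birthtime) using a single stat call.

    birthtime is None when the platform's stat result does not expose it.
    """
    st = os.stat(path)
    return st.st_size, st.st_mtime, getattr(st, "st_birthtime", None)


if __name__ == "__main__":
    size, mtime, btime = stat_with_birthtime(os.curdir)
    print("size:", size, "mtime:", mtime, "birthtime:", btime)
```

No second lookup of the path is needed when the field is present; with an
xattr the caller would have to follow the stat with a separate getxattr on
the same path.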

>
> > Maybe xstat should take a list of xattrs to be retrieved as well?? or maybe
> > not.
>
> The idea of xstat() having a variable-length buffer and variable arguments has
> been well derided. It ain't going to happen, much though I'd like it to. I'd
> quite like to offer the opportunity to return the security label, for example.

"well derided" like high-mem and SMP support? or "real-time" support and
priority inheritance?
I guess the deriders are wrong, and will eventually realise that they are
wrong. The difficult bit is we cannot know how long it will take them, or
how much you have to care.

NeilBrown
(unambiguous documentation!! the rest is just details)

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Trond Myklebust on
On Fri, 2010-07-30 at 13:55 -0400, Phil Pishioneri wrote:
> On 7/22/10 2:59 PM, Trond Myklebust wrote:
> > The fact remains that most of us would be hard pressed to name an
> > application
>
> Microsoft Office?
>
> > that requires you to share the same dataset to both
> > Windows/CIFS and posix NFS clients.
>
> NFS client: Mac OS X (NFSv3, since v4 on it is still alpha *cough*).
>
> > tends to discourage mixing the two environments.
>
> Or is "discourage" not a strong enough term to describe that we shouldn't
> be doing this?
>
> -Phil

Your Mac has a perfectly functional CIFS client, as do your Linux boxes.
They both interoperate just fine with Samba, and would presumably
continue to do so if someone were to decide to reuse the ctime field on
your Samba box as storage for a create time.

Trond

--
From: J. Bruce Fields on
On Thu, Jul 29, 2010 at 09:04:01AM +1000, Neil Brown wrote:
> On Wed, 28 Jul 2010 18:28:02 +0100
> David Howells <dhowells(a)redhat.com> wrote:
>
> > Neil Brown <neilb(a)suse.de> wrote:
> >
> > > ctime and mtime have real cache-coherence semantics which require them being
> > > updated by the kernel (whether the cache is on an NFS client, in a backup
> > > archive, or in a .o translation of a .c file).
> >
> > So does creation time, at least for CIFS caching. Creation time has potential
> > for spotting when the object at a pathname has changed for something else,
> > given the lack of inode number and inode generation from windows servers.
> > Creation time gives us one more datum to use.
>
> This justifies for me why a CIFS client would want to extract the
> creation-time from the CIFS protocol, but not why you want to expose it via a
> generic interface.
> The kernel/filesystem doesn't need to maintain creation-time to meet this
> need, only the CIFS server needs to maintain it

For what it's worth, the NFSv4 server would also export creation time if
we had it.

--b.
--
From: utz lehmann on
On Sat, 2010-07-31 at 10:08 +0200, Jan Engelhardt wrote:
> >When abusing an existing time stamp use atime not ctime please.
> >ctime has its uses. atime was just a mistake and is nearly useless.
>
> MUAs make use of atime.

I know mutt uses atime to detect new messages. But there are better and
more reliable ways to do this.

>
> >And with noatime we already have creation time semantics for atime.
>
> noatime was a late afterthought, and because it can interfere with
> some programs, relatime came along too.

There are people who prefer noatime over relatime.

Using an existing time stamp for creation time is a bad idea IMHO. But
if it must be done, use the least important one, which is atime. ctime, for
example, is used by backup programs.

Anyway, if we want to support creation time it should be an additional
time stamp.


utz


--
From: utz lehmann on
On Sat, 2010-07-31 at 17:53 +0100, David Howells wrote:
> utz lehmann <lkml123(a)s2y4n2c.de> wrote:
>
> > When abusing an existing time stamp use atime not ctime please.
> > ctime has its uses. atime was just a mistake and is nearly useless.
>
> CacheFiles currently uses atime to determine least-recently-usedness.

How does this work correctly with noatime or relatime (which is the default)?

We have used FS-Cache with a few tens of thousands of files cached. Doesn't
that mean that the cleanup has to stat them all?

Why doesn't cachefilesd manage the cache index in a separate database
like other caches?
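
To make the cost concrete, here is a minimal sketch of atime-based culling.
It is not how cachefilesd is implemented; the function name and the flat
directory layout are assumptions. It only shows that picking the least
recently used files this way needs one stat per cached file.

```python
import os
import tempfile


def least_recently_used(cache_dir, count):
    """Return up to `count` regular files in cache_dir, oldest atime first."""
    entries = []
    with os.scandir(cache_dir) as it:
        for entry in it:
            if entry.is_file(follow_symlinks=False):
                # One stat per cached file: this is the cost in question.
                st = entry.stat(follow_symlinks=False)
                entries.append((st.st_atime_ns, entry.path))
    entries.sort()
    return [path for _, path in entries[:count]]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        for name, atime in (("a", 300), ("b", 100), ("c", 200)):
            path = os.path.join(d, name)
            open(path, "w").close()
            os.utime(path, (atime, atime))  # set atime and mtime explicitly
        print(least_recently_used(d, 2))
```

With noatime the atime values never advance on reads, so the ordering
reflects creation or last write rather than last use. A separate index would
avoid both the scan and the dependence on the mount options.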


--