From: Christoph Hellwig on
Adding Uli to the Cc list to make sure this system call is useful
for glibc / can be exported by it. Otherwise it's rather pointless
to add it.

> (6) BSD stat compatibility: Including more fields from the BSD stat such as
> creation time (st_btime) and inode generation number (st_gen) [Jeremy
> Allison, Bernd Schubert]

How is this different from (1) and (4)?

> (7) Extra coherency data may be useful in making backups [Andreas Dilger].

What do you mean with that?

> (8) Allow the filesystem to indicate what it can/cannot provide: A filesystem
> can now say it doesn't support a standard stat feature if that isn't
> available.

What for?

> (9) Make the fields a consistent size on all arches, and make them large.

Why making them large for the sake of it? We'll need massive changes
all through libc and applications to ever make use of this. So please
coordinate the types used with Uli.

> The following structures are defined for the use of these new system calls:
>
> struct xstat_parameters {
> unsigned long long request_mask;
> };

Just pass this as a single flag by value. And just make it an unsigned
long to make the calling convention a lot simpler.


> struct xstat_dev {
> unsigned int major, minor;
> };
>
> struct xstat_time {
> unsigned long long tv_sec, tv_nsec;
> };

No point in adding special types here that aren't genericly useful.
Also this is the first and only system call using split major/minor
values for the dev_t. All this just creates more churn than it helps.

>
> struct xstat {
> unsigned long long st_result_mask;

Just st_mask?

> unsigned long long st_data_version;

st version?

> unsigned long long st_inode_flags;



> The defined bits in request_mask and st_result_mask are:
>
> XSTAT_REQUEST_MODE Want/got st_mode
> XSTAT_REQUEST_NLINK Want/got st_nlink
> XSTAT_REQUEST_UID Want/got st_uid
> XSTAT_REQUEST_GID Want/got st_gid
> XSTAT_REQUEST_RDEV Want/got st_rdev
> XSTAT_REQUEST_ATIME Want/got st_atime
> XSTAT_REQUEST_MTIME Want/got st_mtime
> XSTAT_REQUEST_CTIME Want/got st_ctime
> XSTAT_REQUEST_INO Want/got st_ino
> XSTAT_REQUEST_SIZE Want/got st_size
> XSTAT_REQUEST_BLOCKS Want/got st_blocks
> XSTAT_REQUEST__BASIC_STATS The stuff in the normal stat struct
> XSTAT_REQUEST_BTIME Want/got st_btime
> XSTAT_REQUEST_GEN Want/got st_gen
> XSTAT_REQUEST_DATA_VERSION Want/got st_data_version
> XSTAT_REQUEST_INODE_FLAGS Want/got st_inode_flags
> XSTAT_REQUEST__EXTENDED_STATS The stuff in the xstat struct
> XSTAT_REQUEST__ALL_STATS The defined set of requestables

What's the point of the REQUEST in the name? Also no double
underscores inside the identifier. Instead adding a _MASK postfix
for masks would make it a lot more clear.

> The defined bits in st_inode_flags are the usual FS_xxx_FL flags in the LSW,
> plus some extra flags in the MSW:
>
> FS_SPECIAL_FL Special kernel file, such as found in procfs
> FS_AUTOMOUNT_FL Specific automount point
> FS_AUTOMOUNT_ANY_FL Free-form automount directory
> FS_REMOTE_FL File is remote
> FS_ENCRYPTED_FL File is encrypted
> FS_SYSTEM_FL File is marked system (DOS/NTFS/CIFS)
> FS_TEMPORARY_FL File is temporary (NTFS/CIFS)
> FS_OFFLINE_FL File is offline (CIFS)

Please don't overload the FL_ namespace even more. It's already a
complete mess given that it overloads the extN on-disk namespace.
You're much better off just adding a clean new namespace.

> The system calls are:
>
> ssize_t ret = xstat(int dfd,
> const char *filename,
> unsigned flags,
> const struct xstat_parameters *params,
> struct xstat *buffer,
> size_t buflen);

If you already have a buflen parameter there is absolute no need for
the extra results field. Just define new fields at the end and include
them if the bufsize is big enough and it's in the mask of requested
fields.

> When the system call is executed, the request_mask bitmask is read from the
> parameter block to work out what the user is requesting. If params is NULL,
> then request_mask will be assumed to be XSTAT_REQUEST__BASIC_STATS.

Why add a special case like that? Especially if we make the request
flags a pass by value scalar initalizing it is trivial.

> The request_mask should be set by the caller to specify extra results that the
> caller may desire. These come in a number of classes:
>
> (0) dev, blksize.
>
> These are local data and are always available.
>
> (1) mode, nlinks, uid, gid, [amc]time, ino, size, blocks.
>
> These will be returned whether the caller asks for them or not. The
> corresponding bits in result_mask will be set to indicate their presence.
>
> If the caller didn't ask for them, then they may be approximated. For
> example, NFS won't waste any time updating them from the server, unless as
> a byproduct of updating something requested.

Please don't introduce tons of special cases. Instead use a simple rule
like:

- a filesystem must return all attributes requests, or return an
error if it can't.
- a filesystem may return additional attributes, the caller can detect
this by looking at st_mask.

plus possibly a list of attributes the filesystem must be able to
provide if requests. I don't see a reason to make that mask different
from the attributes required by Posix.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Jan Engelhardt on

On Thursday 2010-07-15 04:17, David Howells wrote:
> (6) BSD stat compatibility: Including more fields from the BSD stat such as
> creation time (st_btime) and inode generation number (st_gen) [Jeremy
> Allison, Bernd Schubert].
>where st_btime is the file creation time, st_gen is the inode generation
>(i_generation), st_data_version is the data version number (i_version),
>st_inode_flags is the flags from FS_IOC_GETFLAGS plus some extras,
>request_mask and st_result_mask are bitmasks of data desired/provided and
>st_extra_results[] is where as-yet undefined fields are appended.

Linux already has a creation time field, it's called otime (there is no "b" in
"creation"), and you will find scattered fragments of that all over the kernel
(foremost, fs/jfs/, now btrfs, and I also notice sysvipc having something with
that name).

> struct xstat_time {
> unsigned long long tv_sec, tv_nsec;
> };

If it helps getting rid of the ugly suseconds_t in userspace ;-)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Jan Engelhardt on
On Sunday 2010-07-18 10:48, Christoph Hellwig wrote:

>Adding Uli to the Cc list to make sure this system call is useful
>for glibc / can be exported by it. Otherwise it's rather pointless
>to add it.
>
>> (6) BSD stat compatibility: Including more fields from the BSD stat such as
>> creation time (st_btime) and inode generation number (st_gen) [Jeremy
>> Allison, Bernd Schubert]
>
>How is this different from (1) and (4)?

(4), the inode generation number, need not be related to time.

>> (8) Allow the filesystem to indicate what it can/cannot provide: A filesystem
>> can now say it doesn't support a standard stat feature if that isn't
>> available.
>
>What for?

Given xstat.otime=0, how would you determine whether the file is really
tagged with a date of 1970, or whether it's just the fs which didnot
store this kind of information.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Jan Engelhardt on
On Thursday 2010-07-22 14:17, Volker Lendecke wrote:
>On Thu, Jul 22, 2010 at 01:14:47PM +0100, David Howells wrote:
>> Jan Engelhardt <jengelh(a)medozas.de> wrote:
>>
>> > Linux already has a creation time field, it's called otime (there is no "b"
>> > in "creation"), and you will find scattered fragments of that all over the
>> > kernel (foremost, fs/jfs/, now btrfs, and I also notice sysvipc having
>> > something with that name).
>>
>> It is? It's called crtime in Ext4. st_btime, however, would be compatible
>> with BSD's stat, and Samba would just use it by way of autoconf magic if it
>> appeared.

Of course you can find remnants of btime in Linux's BSD-style task
accounting, but Linux always looked more like SysV than BSD, speaking
for otime. And if you are using autoconf, the cost of using otime over
btime seems the same.

>Samba has the following check:
>
># recent FreeBSD, NetBSD have creation timestamps called birthtime:
>AC_CHECK_MEMBERS([struct stat.st_birthtimespec.tv_nsec])
>AC_CHECK_MEMBERS([struct stat.st_birthtime], AC_CHECK_MEMBERS([struct stat.st_birthtimensec]))
>
>and the supporting code around that. "birth" might also be
>where the "b" comes from :-)

Well, in all reference to the Matrix movie, files aren't born. Except
for Directory Default ACLs and possibly security labels, they usually
don't inherit either :) And on a CS level, it's more like copy than
inherit, because if the parent changes, the file does not (with the
potential exception of security relabeling, bla).
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Jan Engelhardt on

On Thursday 2010-07-22 19:16, Trond Myklebust wrote:
>> >But the fact is, th Unix ctime semantics are insane and largely
>> >useless. There's a damn good reason almost nobody uses ctime under
>> >unix.
>>
>> I beg to differ. ctime is not completely useless. It reflects changes on
>> the inode for when you don't you change the content. It's like an mtime
>> for the metadata. It comes useful when you go around in your filesystem
>> trying to figure out who of your co-admins screwed up the permissions on
>> /etc/passwd... and if the mtime is the same as that of the last backup,
>> I can at least have a reasonable assurance that it was /only/ the
>> metadata that was tampered with. (SHA1 check, yeah yeah, costly on large
>> files.)
>
>Errr... Only if you eliminate utimes() from your syscall table.
>Otherwise it is trivial to reset the mtime after changing the file
>contents.

Well yes; I had implicitly implied that evil people with malicious intent
are absent.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/