From: Andreas Dilger on
On 2010-06-29, at 19:17, David Howells wrote:
> int ext4_getattr(struct vfsmount *mnt, struct dentry *dentry,
> struct kstat *stat)
> {
> + if (S_ISDIR(inode->i_mode)) {
> + stat->result_flags |= XSTAT_QUERY_DATA_VERSION;
> + stat->data_version = inode->i_version;
> + }

Note that when ext4 is mounted with the "i_version" option, the i_version field is also updated on regular files, for use by NFSv4. See, for example, ext4_mark_iloc_dirty().

I had a hard time finding this, even though I knew it was there somewhere, because it isn't modifying "i_version" directly, but rather calling a helper function inode_inc_iversion().

It probably makes sense to always return i_version, unless it is 0.

Cheers, Andreas
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: David Howells on
Arnd Bergmann <arnd(a)arndb.de> wrote:

> It also makes things like strace more complicated.

That's the most compelling argument.

> No, I think that would be worse than the current version. But if you remove
> the structure version in favor of the flags, you only need six arguments
> anyway.

I want to keep the structure version, just in case we need to expand fields in
the stat struct in future. Otherwise we may need to create yet another stat
syscall.

> You can also go further and fold the structure length into flags, because
> the length is just a function of the data you are passing.

The potential problem with passing the flags as a syscall argument is that
we're then limited to a single 32-bit integer. It might be enough, but if I
do as at least one person has suggested and assign each field in the struct
its own bit, that uses up half right there, plus I'd like to add at least one
operational flag (to force synchronisation with the server).

> Having a system call with flags, size and version is like wearing a belt,
> braces and suspenders. An unsigned long flags argument should be enough to
> hold up your pants[1].

I would like the size argument for two reasons: firstly, to prevent buffer
overruns and, secondly, because I can see some scope for variable-size fields
(such as for volume IDs or security labels), though the latter might be better
handled through getxattr() (which would mean extra overhead).
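
To make the first reason concrete, here is a minimal userspace sketch of a size-bounded copy. It is only an illustration under stated assumptions: struct xstat_sketch and xstat_copy_out() are hypothetical names, and memcpy() stands in for whatever copy-to-user step the real syscall would use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fixed-length result structure, for illustration only. */
struct xstat_sketch {
	uint64_t ino;
	uint32_t mode;
	uint32_t nlink;
	uint64_t size;
};

/*
 * Copy at most bufsize bytes of *src into buf and return the number of
 * bytes written. The caller-supplied size is what stops a producer with
 * a larger structure from overrunning an older, smaller buffer.
 */
static size_t xstat_copy_out(void *buf, size_t bufsize,
			     const struct xstat_sketch *src)
{
	size_t n;

	assert(buf != NULL && src != NULL);
	n = bufsize < sizeof(*src) ? bufsize : sizeof(*src);
	memcpy(buf, src, n);
	return n;
}
```

A caller built against a shorter structure passes its own sizeof and gets a truncated but in-bounds copy; the return value tells it how much was filled in.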

David
From: David Howells on
Andreas Dilger <adilger(a)dilger.ca> wrote:

> Note that when ext4 is mounted with the "i_version" option that the
> i_version field is also updated on regular files, for use by NFSv4. See,
> for example, ext4_mark_iloc_dirty().
>
> I had a hard time finding this, even though I knew it was there somewhere,
> because it isn't modifying "i_version" directly, but rather calling a helper
> function inode_inc_iversion().

Ah, okay. Thanks!

> It probably makes sense to always return i_version, unless it is 0.

I didn't want to return it if it wasn't supported on the nominated file as
that may give a false sense of coherency.
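
A rough sketch of that policy, combined with the regular-file case Andreas pointed out, might look like the following. Everything here is an assumption made for illustration: the sketch_* types, the is_dir and iversion_enabled members (standing in for S_ISDIR() on i_mode and for the "i_version" mount option), and the numeric value of XSTAT_QUERY_DATA_VERSION are not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define XSTAT_QUERY_DATA_VERSION 0x1u	/* hypothetical value */

/* Hypothetical stand-ins for struct inode and struct kstat. */
struct sketch_inode {
	bool is_dir;		/* stands in for S_ISDIR(inode->i_mode) */
	bool iversion_enabled;	/* stands in for the "i_version" mount option */
	uint64_t i_version;
};

struct sketch_kstat {
	unsigned int result_flags;
	uint64_t data_version;
};

/*
 * Report data_version only when the version counter is actually being
 * maintained for this inode; otherwise leave the result flag clear so
 * the caller does not assume coherency that is not there.
 */
static void sketch_fill_data_version(const struct sketch_inode *inode,
				     struct sketch_kstat *stat)
{
	assert(inode != NULL && stat != NULL);
	if (inode->is_dir || inode->iversion_enabled) {
		stat->result_flags |= XSTAT_QUERY_DATA_VERSION;
		stat->data_version = inode->i_version;
	}
}
```

The caller then tests result_flags rather than comparing data_version against 0, which keeps "unsupported" distinct from "supported and currently zero".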

David
From: Arnd Bergmann on
On Wednesday 30 June 2010, David Howells wrote:
> Arnd Bergmann <arnd(a)arndb.de> wrote:
> > No, I think that would be worse than the current version. But if you remove
> > the structure version in favor of the flags, you only need six arguments
> > anyway.
>
> I want to keep the structure version, just in case we need to expand fields in
> the stat struct in future. Otherwise we may need to create yet another stat
> syscall.

How many versions do you expect we need in the next 10 years, not counting
those where you just add a new field to the structure?

Given a 64 bit flag word, you can start using bits for the version from
the top and bits from the bottom for fields:

#define XSTAT_DEV 0x00000001
#define XSTAT_INO 0x00000002
#define XSTAT_MODE 0x00000004
....
#define XSTAT_LAYOUT_VERSION_2 0x8000000000000000
#define XSTAT_LAYOUT_VERSION_1 0x0000000000000000
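
Spelled out a little further, the split could be handled as in the sketch below. The masks and helper functions are assumptions added for illustration (only the XSTAT_* names above come from this mail), and it assumes that only the top bit is reserved for the layout version so far.

```c
#include <assert.h>
#include <stdint.h>

/* Field bits grow upward from bit 0, version bits downward from bit 63. */
#define XSTAT_DEV		UINT64_C(0x0000000000000001)
#define XSTAT_INO		UINT64_C(0x0000000000000002)
#define XSTAT_MODE		UINT64_C(0x0000000000000004)
#define XSTAT_LAYOUT_VERSION_1	UINT64_C(0x0000000000000000)
#define XSTAT_LAYOUT_VERSION_2	UINT64_C(0x8000000000000000)

/* Assumption: a single version bit is in use; widen the mask as needed. */
#define XSTAT_VERSION_MASK	UINT64_C(0x8000000000000000)
#define XSTAT_FIELD_MASK	(~XSTAT_VERSION_MASK)

/* Combine a layout version and a set of field bits into one flag word. */
static inline uint64_t xstat_make_flags(uint64_t version, uint64_t fields)
{
	assert((version & XSTAT_FIELD_MASK) == 0);
	assert((fields & XSTAT_VERSION_MASK) == 0);
	return version | fields;
}

/* Extract the layout version bits from a flag word. */
static inline uint64_t xstat_layout_version(uint64_t flags)
{
	return flags & XSTAT_VERSION_MASK;
}

/* Extract the requested field bits from a flag word. */
static inline uint64_t xstat_requested_fields(uint64_t flags)
{
	return flags & XSTAT_FIELD_MASK;
}
```

Because version 1 is all zeroes, a caller that only ever sets field bits is implicitly asking for the first layout.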

> > You can also go further and fold the structure length into flags, because
> > the length is just a function of the data you are passing.
>
> The potential problem with passing the flags as a syscall argument is that
> we're then limited to a single 32-bit integer. It might be enough, but if I
> do as at least one person has suggested and assign each field in the struct
> its own bit, that uses up half right there, plus I'd like to add at least one
> operational flag (to force synchronisation with the server).

I'd imagine that there would be some reasonable way to group some of the
fields so that 32 bits last long enough. Alternatively, you can also make
it a 64 bit argument everywhere, which has some other small disadvantages.

> > Having a system call with flags, size and version is like wearing a belt,
> > braces and suspenders. An unsigned long flags argument should be enough to
> > hold up your pants[1].
>
> I would like the size argument for two reasons: firstly, to prevent buffer
> overruns and, secondly, because I can see some scope for variable-size fields
> (such as for volume IDs or security labels), though the latter might be better
> handled through getxattr() (which would mean extra overhead).

The idea of a syscall API with multiple fixed-length and variable-length
fields in the same structure scares me. If you want to go this far,
it may be better to base the interface on netlink and allow querying
multiple files at once.

For a classic syscall interface, I'd just stay away from variable-length
data and use either fixed-length fields or spend the extra overhead for
the getxattr values.

When all members of struct xstat are fixed length, you can simply add
new members at the end and add the associated flags at the same time.
Any code built against a given header file can only ask for the fields
that are part of the struct definition it uses. The kernel should
obviously only write the fields that the user asked for, in case the
user was built against an older header file. You can also maintain
forward compatibility if the kernel sets a bitmask in the struct with
the fields it has returned.
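
A small userspace sketch of that scheme follows. The struct layout, the flag values, XSTAT_BTIME, XSTAT_FUTURE and xstat_fill() are all hypothetical and exist only to illustrate the idea; the filled-in values are dummies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define XSTAT_INO	UINT64_C(0x1)
#define XSTAT_MODE	UINT64_C(0x2)
#define XSTAT_BTIME	UINT64_C(0x4)	/* hypothetical field added later */
#define XSTAT_FUTURE	UINT64_C(0x8)	/* hypothetical bit this "kernel" lacks */

/* Hypothetical layout: fixed-length members are only ever appended. */
struct xstat {
	uint64_t result_mask;	/* fields the "kernel" actually filled in */
	uint64_t ino;
	uint32_t mode;
	uint32_t pad;
	uint64_t btime;		/* appended in a later header revision */
};

/* Fields this stand-in "kernel" knows how to supply. */
#define XSTAT_SUPPORTED	(XSTAT_INO | XSTAT_MODE | XSTAT_BTIME)

/*
 * Write only the members that were both requested and supported, and
 * record them in result_mask. Members that were not requested are never
 * touched, so a caller built against an older, shorter struct is safe.
 */
static void xstat_fill(struct xstat *st, uint64_t request)
{
	uint64_t fill = request & XSTAT_SUPPORTED;

	assert(st != NULL);
	st->result_mask = fill;
	if (fill & XSTAT_INO)
		st->ino = 42;			/* dummy value */
	if (fill & XSTAT_MODE)
		st->mode = 0100644;		/* dummy value */
	if (fill & XSTAT_BTIME)
		st->btime = 1277856000;		/* dummy value */
}
```

An old binary never sets XSTAT_BTIME, so btime is never written for it; a new binary that sets a bit the kernel does not know simply finds that bit missing from result_mask.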

Arnd
From: David Howells on
Arnd Bergmann <arnd(a)arndb.de> wrote:

> Given a 64 bit flag word, you can start using bits for the version from
> the top and bits from the bottom for fields:

I suppose. It's cleaner, though, to keep them separate.

> Alternatively, you can also make it a 64 bit argument everywhere, which has
> some other small disadvantages.

No, you can't. 32-bit systems can only pass 32-bit arguments. If you're
suggesting passing a pointer to a 64-bit argument instead, how's that any
different from my suggestion of a separate parameter block?

> The idea of a syscall API with multiple fixed-length and variable-length
> fields in the same structure scares me. If you want to go this far,
> it may be better to base the interface on netlink and allow querying
> multiple files at once.

Urgh. Netlink is way too much overhead and even scarier. That's pretty much
a guarantee that people won't use it. It also has to work if CONFIG_NET=n.

David