From: Miklos Szeredi on
On Mon, 12 Jul 2010, Aneesh Kumar K.V wrote:
> This enables to use linkat to create hardlinks from a
> file descriptor pointing to the file. This can be used
> with open_by_handle syscall that returns a file descriptor.

This needs more thought, filesystems don't usually tolerate
resurrecting a file which has already been unlinked (i_nlink == 0).

Thanks,
Miklos

>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar(a)linux.vnet.ibm.com>
> ---
> fs/namei.c | 34 +++++++++++++++++++++++++---------
> 1 files changed, 25 insertions(+), 9 deletions(-)
>
> diff --git a/fs/namei.c b/fs/namei.c
> index a6a8093..9a7b71a 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -2553,16 +2553,28 @@ SYSCALL_DEFINE5(linkat, int, olddfd, const char __user *, oldname,
> {
> struct dentry *new_dentry;
> struct nameidata nd;
> - struct path old_path;
> - int error;
> + struct path old_path, *old_pathp;
> + struct file *file = NULL;
> + int error, fput_needed;
> char *to;
>
> if ((flags & ~AT_SYMLINK_FOLLOW) != 0)
> return -EINVAL;
>
> - error = user_path_at(olddfd, oldname,
> - flags & AT_SYMLINK_FOLLOW ? LOOKUP_FOLLOW : 0,
> - &old_path);
> + if (oldname == NULL && olddfd != AT_FDCWD) {
> + file = fget_light(olddfd, &fput_needed);
> + if (file) {
> + old_pathp = &file->f_path;
> + error = 0;
> + } else
> + error = -EBADF;
> + } else {
> + error = user_path_at(olddfd, oldname,
> + flags & AT_SYMLINK_FOLLOW ?
> + LOOKUP_FOLLOW : 0,
> + &old_path);
> + old_pathp = &old_path;
> + }
> if (error)
> return error;
>
> @@ -2570,7 +2582,7 @@ SYSCALL_DEFINE5(linkat, int, olddfd, const char __user *, oldname,
> if (error)
> goto out;
> error = -EXDEV;
> - if (old_path.mnt != nd.path.mnt)
> + if (old_pathp->mnt != nd.path.mnt)
> goto out_release;
> new_dentry = lookup_create(&nd, 0);
> error = PTR_ERR(new_dentry);
> @@ -2579,10 +2591,11 @@ SYSCALL_DEFINE5(linkat, int, olddfd, const char __user *, oldname,
> error = mnt_want_write(nd.path.mnt);
> if (error)
> goto out_dput;
> - error = security_path_link(old_path.dentry, &nd.path, new_dentry);
> + error = security_path_link(old_pathp->dentry, &nd.path, new_dentry);
> if (error)
> goto out_drop_write;
> - error = vfs_link(old_path.dentry, nd.path.dentry->d_inode, new_dentry);
> + error = vfs_link(old_pathp->dentry,
> + nd.path.dentry->d_inode, new_dentry);
> out_drop_write:
> mnt_drop_write(nd.path.mnt);
> out_dput:
> @@ -2593,7 +2606,10 @@ out_release:
> path_put(&nd.path);
> putname(to);
> out:
> - path_put(&old_path);
> + if (file)
> + fput_light(file, fput_needed);
> + else
> + path_put(&old_path);
>
> return error;
> }
> --
> 1.7.2.rc1
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> the body of a message to majordomo(a)vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Aneesh Kumar K. V on
On Mon, 12 Jul 2010 10:05:39 +0200, Miklos Szeredi <miklos(a)szeredi.hu> wrote:
> On Mon, 12 Jul 2010, Aneesh Kumar K.V wrote:
> > This enables to use linkat to create hardlinks from a
> > file descriptor pointing to the file. This can be used
> > with open_by_handle syscall that returns a file descriptor.
>
> This needs more thought, filesystems don't usually tolerate
> resurrecting a file which has already been unlinked (i_nlink == 0).
>

We get -ENOENT error when we do that on ext*

open("test", O_RDONLY) = 3
unlink("test") = 0
linkat(3, NULL, AT_FDCWD, "test3", 0) = -1 ENOENT (No such file or directory)

ext4_link does the below

/*
* Return -ENOENT if we've raced with unlink and i_nlink is 0. Doing
* otherwise has the potential to corrupt the orphan inode list.
*/
if (inode->i_nlink == 0)
return -ENOENT;

I can move this check to VFS so that we do it for all file systems.

-aneesh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Miklos Szeredi on
On Mon, 12 Jul 2010, Aneesh Kumar K. V wrote:
> /*
> * Return -ENOENT if we've raced with unlink and i_nlink is 0. Doing
> * otherwise has the potential to corrupt the orphan inode list.
> */
> if (inode->i_nlink == 0)
> return -ENOENT;
>
> I can move this check to VFS so that we do it for all file systems.

That makes sense. Hopefully filesystems which implement hard links
will change i_nlink to zero for the last unlink...

Thanks,
Miklos
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Matt Helsley on
On Mon, Jul 12, 2010 at 10:05:39AM +0200, Miklos Szeredi wrote:
> On Mon, 12 Jul 2010, Aneesh Kumar K.V wrote:
> > This enables to use linkat to create hardlinks from a
> > file descriptor pointing to the file. This can be used
> > with open_by_handle syscall that returns a file descriptor.
>
> This needs more thought, filesystems don't usually tolerate
> resurrecting a file which has already been unlinked (i_nlink == 0).

File resurrection support would be useful for more than the
open-by-handle patches.

Checkpoint/restart would make use of it too. Without it, programs that use
large unlinked files would take corresponding amounts of IO to checkpoint.
By resurrecting the file its contents can be checkpointed without copying
on filesystems or block devices that support CoW-sharing of snapshot data.
Then, during restart, we'd do something like:

1. Take a snapshot of the snapshot and remount it rw.
(Not necessary if userspace doesn't mind restart
destroying the checkpoint.)
2. Re-open the snapshot of the resurrected file.
<everything else we already do to restart open files (e.g. seek)>
N. Re-unlink the file.

Cheers,
-Matt Helsley

>
> Thanks,
> Miklos
>
> >
> > Signed-off-by: Aneesh Kumar K.V <aneesh.kumar(a)linux.vnet.ibm.com>
> > ---
> > fs/namei.c | 34 +++++++++++++++++++++++++---------
> > 1 files changed, 25 insertions(+), 9 deletions(-)
> >
> > diff --git a/fs/namei.c b/fs/namei.c
> > index a6a8093..9a7b71a 100644
> > --- a/fs/namei.c
> > +++ b/fs/namei.c
> > @@ -2553,16 +2553,28 @@ SYSCALL_DEFINE5(linkat, int, olddfd, const char __user *, oldname,
> > {
> > struct dentry *new_dentry;
> > struct nameidata nd;
> > - struct path old_path;
> > - int error;
> > + struct path old_path, *old_pathp;
> > + struct file *file = NULL;
> > + int error, fput_needed;
> > char *to;
> >
> > if ((flags & ~AT_SYMLINK_FOLLOW) != 0)
> > return -EINVAL;
> >
> > - error = user_path_at(olddfd, oldname,
> > - flags & AT_SYMLINK_FOLLOW ? LOOKUP_FOLLOW : 0,
> > - &old_path);
> > + if (oldname == NULL && olddfd != AT_FDCWD) {
> > + file = fget_light(olddfd, &fput_needed);
> > + if (file) {
> > + old_pathp = &file->f_path;
> > + error = 0;
> > + } else
> > + error = -EBADF;
> > + } else {
> > + error = user_path_at(olddfd, oldname,
> > + flags & AT_SYMLINK_FOLLOW ?
> > + LOOKUP_FOLLOW : 0,
> > + &old_path);
> > + old_pathp = &old_path;
> > + }
> > if (error)
> > return error;
> >
> > @@ -2570,7 +2582,7 @@ SYSCALL_DEFINE5(linkat, int, olddfd, const char __user *, oldname,
> > if (error)
> > goto out;
> > error = -EXDEV;
> > - if (old_path.mnt != nd.path.mnt)
> > + if (old_pathp->mnt != nd.path.mnt)
> > goto out_release;
> > new_dentry = lookup_create(&nd, 0);
> > error = PTR_ERR(new_dentry);
> > @@ -2579,10 +2591,11 @@ SYSCALL_DEFINE5(linkat, int, olddfd, const char __user *, oldname,
> > error = mnt_want_write(nd.path.mnt);
> > if (error)
> > goto out_dput;
> > - error = security_path_link(old_path.dentry, &nd.path, new_dentry);
> > + error = security_path_link(old_pathp->dentry, &nd.path, new_dentry);
> > if (error)
> > goto out_drop_write;
> > - error = vfs_link(old_path.dentry, nd.path.dentry->d_inode, new_dentry);
> > + error = vfs_link(old_pathp->dentry,
> > + nd.path.dentry->d_inode, new_dentry);
> > out_drop_write:
> > mnt_drop_write(nd.path.mnt);
> > out_dput:
> > @@ -2593,7 +2606,10 @@ out_release:
> > path_put(&nd.path);
> > putname(to);
> > out:
> > - path_put(&old_path);
> > + if (file)
> > + fput_light(file, fput_needed);
> > + else
> > + path_put(&old_path);
> >
> > return error;
> > }
> > --
> > 1.7.2.rc1
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> > the body of a message to majordomo(a)vger.kernel.org
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
> >
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo(a)vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/