From: Christoph Hellwig on
> +static struct backing_dev_info *inode_to_bdi(struct inode *inode)
> +{
> + struct backing_dev_info *bdi = inode->i_mapping->backing_dev_info;
> +
> + /*
> + * This is a hack but it solves a problem with device inode
> + * for e.g. /dev/zero getting dirty (via touch or so) and confusing
> + * writeback code. In such cases we return the "parent" filesystem's
> + * bdi.
> + */
> + if (bdi_cap_writeback_dirty(bdi))
> + return bdi;
> + return inode->i_sb->s_bdi;

When do we ever have a writeback-capable bdi that sits inside another
filesystem? I think just always using inode->i_sb->s_bdi is the right
thing here.

And btw, having a BDI_CAP_NO_WRITEBACK instead of a BDI_CAP_WRITEBACK
is rather dumb, we'd better fix it up while we're at it.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Christoph Hellwig on
On Tue, Jul 27, 2010 at 07:24:38PM +0200, Jan Kara wrote:
> Always returning inode->i_sb->s_bdi wouldn't be a right thing IMHO.
> That would file inode for /dev/sda to BDI list of tmpfs mounted on /dev/
> which isn't what you want...

It shouldn't. Block device nodes are on the bdev filesystems, and
we twist the file->mapping pointer so that all the low-level read/write
code always deals with the bdev fs inode, and not the device node
filesystem.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Jan Kara on
On Tue 27-07-10 13:09:03, Christoph Hellwig wrote:
> > +static struct backing_dev_info *inode_to_bdi(struct inode *inode)
> > +{
> > + struct backing_dev_info *bdi = inode->i_mapping->backing_dev_info;
> > +
> > + /*
> > + * This is a hack but it solves a problem with device inode
> > + * for e.g. /dev/zero getting dirty (via touch or so) and confusing
> > + * writeback code. In such cases we return the "parent" filesystem's
> > + * bdi.
> > + */
> > + if (bdi_cap_writeback_dirty(bdi))
> > + return bdi;
> > + return inode->i_sb->s_bdi;
>
> When do we ever have a writeback-capable bdi that sits inside another
> filesystem? I think just always using inode->i_sb->s_bdi is the right
> thing here.
Always returning inode->i_sb->s_bdi wouldn't be a right thing IMHO.
That would file inode for /dev/sda to BDI list of tmpfs mounted on /dev/
which isn't what you want...

> And btw, having a BDI_CAP_NO_WRITEBACK instead of a BDI_CAP_WRITEBACK
> is rather dumb, we'd better fix it up while we're at it.
Yes, that seems reasonable. I'm just not sure how many places one has
to change to fix this and how to find them... Maybe grepping for BDI_CAP_
will be enough but I'm not sure.
Honza
--
Jan Kara <jack(a)suse.cz>
SUSE Labs, CR
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Jan Kara on
On Tue 27-07-10 13:27:31, Christoph Hellwig wrote:
> On Tue, Jul 27, 2010 at 07:24:38PM +0200, Jan Kara wrote:
> > Always returning inode->i_sb->s_bdi wouldn't be a right thing IMHO.
> > That would file inode for /dev/sda to BDI list of tmpfs mounted on /dev/
> > which isn't what you want...
>
> It shouldn't. Block device nodes are on the bdev filesystems, and
Ok, so inode->i_sb->s_bdi will actually point to noop_backing_dev_info
as set by set_anon_super(). Or am I completely out?

> we twist the file->mapping pointer so that all the low-level read/write
> code always deals with the bdev fs inode, and not the device node
> filesystem.
Yes, I know this but I fail to see how this influences where </dev/sda's
inode>->i_sb->s_bdi ends up...

Honza
--
Jan Kara <jack(a)suse.cz>
SUSE Labs, CR
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Christoph Hellwig on
On Tue, Jul 27, 2010 at 08:01:31PM +0200, Jan Kara wrote:
> > It shouldn't. Block device nodes are on the bdev filesystems, and
> Ok, so inode->i_sb->s_bdi will actually point to noop_backing_dev_info
> as set by set_anon_super(). Or am I completely out?

I think you're right. This seems rather bad if it's indeed true. I'll
quickly verify it using Dave's new tracing once I've built a block
tree kernel.

> > we twist the file->mapping pointer so that all the low-level read/write
> > code always deals with the bdev fs inode, and not the device node
> > filesystem.
>
> Yes, I know this but I fail to see how this influences where </dev/sda's
> inode>->i_sb->s_bdi ends up...

Ok, let's say we write to /dev/sda. For this we call

blkdev_aio_write -> __generic_file_aio_write

There we get the inode we operate on as

struct file *file = iocb->ki_filp;
struct address_space * mapping = file->f_mapping;
struct inode *inode = mapping->host;

Now file->f_mapping gets set up in blkdev_open as:

filp->f_mapping = bdev->bd_inode->i_mapping;

so it does not point to the inode of the device node, but an inode on
the bdev filesystem.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/