From: Dave Chinner on
On Tue, Jul 20, 2010 at 05:44:24PM -0700, Andrew Morton wrote:
> On Wed, 21 Jul 2010 08:45:25 +1000 Dave Chinner <david(a)fromorbit.com> wrote:
> > On Tue, Jul 20, 2010 at 03:36:56AM -0700, Andrew Morton wrote:
> > > On Tue, 20 Jul 2010 16:41:45 +1000 Stephen Rothwell <sfr(a)canb.auug.org.au> wrote:
> > > > Has anyone seen this or something similar?
> > >
> > > I get it all the time. See the thread "Subject: Re: linux-next: Tree for
> > > July 7".
> >
> > Yet nobody else seems to be able to reproduce it. Given that PowerPC
> > is good at triggering race conditions, maybe there is one that
> > only you are unlucky enough to trigger.
> >
> > Rather than just commenting out the BUG_ON() and ignoring the
> > problem, can you print out the inode state (and enough information
> > to identify the filesystem the inode belongs to) before triggering
> > the BUG_ON() so we can get some idea of how this is triggering?
>
> Already did. ext3. I_DIRTY_SYNC, I_DIRTY_DATASYNC and I_DIRTY_PAGES
> are set (i_state=0x67).
>
> A bit of poking around indicates that these inodes always have zero
> attached pages,

They should, because by the time that bug fires they should have had
all their pages stripped away.

> and they were dirtied within dquot_free_space().

AFAICT dquot_free_space() is called deep in the guts of
ext3_truncate() via dquot_free_block(), which is called directly
before end_writeback(). That should overwrite any state changes made
inside ext3_truncate. I wonder if iput_final() is racing with
something else here?

> This isn't necessarily a problem in the quota code (setting aside the
> question: why the heck does dquot_free_space() set I_DIRTY_PAGES??).
> If the vfs is asked to kill off a dirty inode, it should at least clean
> the thing first.
>
> I dunno. That fs/inode.c patch series from Viro looks fishy. I guess
> I get to bisect it tomorrow.

I suspect that is the only way to get to the bottom of this, short
of a reliable reproducer being discovered. I'm still trying to
reproduce it - I've even turned quota on - but I'm not having any
more luck than over the weekend, though...

Cheers,

Dave.
--
Dave Chinner
david(a)fromorbit.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
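
[Editorial sketch, not part of the original message.] The i_state=0x67 value quoted above is easier to reason about once it is split into named bits. The helper below is a minimal userspace sketch; `describe_i_state()` and the `i_state_flags` table are hypothetical names, and the numeric flag values are an assumption taken from include/linux/fs.h in kernels of roughly this era, so they should be checked against the tree actually being debugged.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* ASSUMPTION: i_state bit values as in include/linux/fs.h of
 * 2.6.3x-era kernels. Verify against your own tree. */
#define I_DIRTY_SYNC      0x01
#define I_DIRTY_DATASYNC  0x02
#define I_DIRTY_PAGES     0x04
#define I_NEW             0x08
#define I_WILL_FREE       0x10
#define I_FREEING         0x20
#define I_CLEAR           0x40
#define I_SYNC            0x80

struct flag_name {
	unsigned long bit;
	const char *name;
};

static const struct flag_name i_state_flags[] = {
	{ I_DIRTY_SYNC,     "I_DIRTY_SYNC" },
	{ I_DIRTY_DATASYNC, "I_DIRTY_DATASYNC" },
	{ I_DIRTY_PAGES,    "I_DIRTY_PAGES" },
	{ I_NEW,            "I_NEW" },
	{ I_WILL_FREE,      "I_WILL_FREE" },
	{ I_FREEING,        "I_FREEING" },
	{ I_CLEAR,          "I_CLEAR" },
	{ I_SYNC,           "I_SYNC" },
};

/* Hypothetical helper: writes the names of all set bits into buf,
 * separated by '|', lowest bit first. Returns buf. */
static char *describe_i_state(unsigned long state, char *buf, size_t len)
{
	size_t i, used = 0;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';

	for (i = 0; i < sizeof(i_state_flags) / sizeof(i_state_flags[0]); i++) {
		int n;

		if (!(state & i_state_flags[i].bit))
			continue;
		n = snprintf(buf + used, len - used, "%s%s",
			     used ? "|" : "", i_state_flags[i].name);
		if (n < 0 || (size_t)n >= len - used)
			break;	/* output truncated, stop here */
		used += (size_t)n;
	}
	return buf;
}
```

Under those assumed values, 0x67 decodes to the three dirty flags Andrew lists plus I_FREEING and I_CLEAR, which would mean the inode was dirtied after it had already been marked as cleared.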
From: Andrew Morton on
On Wed, 21 Jul 2010 15:20:07 +1000 Dave Chinner <david(a)fromorbit.com> wrote:

> > and they were dirtied within dquot_free_space().
>
> AFAICT dquot_free_space() is called deep in the guts of
> ext3_truncate() via dquot_free_block(), which is called directly
> before end_writeback(). That should overwrite any state changes made
> inside ext3_truncate. I wonder if iput_final() is racing with
> something else here?
>

This isn't a race. I type `make' and the warnings spew out at hundreds
per second - every unlink, I'd say.

Did you try my .config?
From: Stephen Rothwell on
On Wed, 21 Jul 2010 00:29:07 -0700 Andrew Morton <akpm(a)linux-foundation.org> wrote:
>
> On Wed, 21 Jul 2010 15:20:07 +1000 Dave Chinner <david(a)fromorbit.com> wrote:
>
> > > and they were dirtied within dquot_free_space().
> >
> > AFAICT dquot_free_space() is called deep in the guts of
> > ext3_truncate() via dquot_free_block(), which is called directly
> > before end_writeback(). That should overwrite any state changes made
> > inside ext3_truncate. I wonder if iput_final() is racing with
> > something else here?
> >
>
> This isn't a race. I type `make' and the warnings spew out at hundreds
> per second - every unlink, I'd say.

Bisected to:

commit 8bfe4a06746e5f03c02afe3ceb97b5364c099f63
Author: Al Viro <viro(a)zeniv.linux.org.uk>
Date: Sun Jun 6 07:08:19 2010 -0400

convert ext3 to ->evict_inode()

Signed-off-by: Al Viro <viro(a)zeniv.linux.org.uk>

--
Cheers,
Stephen Rothwell sfr(a)canb.auug.org.au
http://www.canb.auug.org.au/~sfr/
From: Jan Kara on
On Wed 21-07-10 17:48:09, Stephen Rothwell wrote:
> On Wed, 21 Jul 2010 00:29:07 -0700 Andrew Morton <akpm(a)linux-foundation.org> wrote:
> >
> > On Wed, 21 Jul 2010 15:20:07 +1000 Dave Chinner <david(a)fromorbit.com> wrote:
> >
> > > > and they were dirtied within dquot_free_space().
> > >
> > > AFAICT dquot_free_space() is called deep in the guts of
> > > ext3_truncate() via dquot_free_block(), which is called directly
> > > before end_writeback(). That should overwrite any state changes made
> > > inside ext3_truncate. I wonder if iput_final() is racing with
> > > something else here?
> > >
> >
> > This isn't a race. I type `make' and the warnings spew out at hundreds
> > per second - every unlink, I'd say.
>
> Bisected to:
>
> commit 8bfe4a06746e5f03c02afe3ceb97b5364c099f63
> Author: Al Viro <viro(a)zeniv.linux.org.uk>
> Date: Sun Jun 6 07:08:19 2010 -0400
>
> convert ext3 to ->evict_inode()
>
> Signed-off-by: Al Viro <viro(a)zeniv.linux.org.uk>
Thanks for bisecting this. The patch series indeed seems to uncover
some discrepancies.
Ext3 has always dirtied the inode in its ->delete_inode method (via quota
code). But previously clear_inode() just overwrote the state with I_CLEAR
and thus we never saw the BUG_ON. After Al's patches, i_state is set in
end_writeback() which happens earlier. In particular it happens before
ext3_free_inode() which dirties the inode through quota code while freeing
xattrs - they are accounted in i_blocks, so i_blocks are updated during
freeing and inode is dirtied.
Actually, ext3_mark_inode_dirty() called during each mark_inode_dirty()
call writes the inode state to the journal so the dirty flag in the inode
state is in fact stale and overwriting it with I_CLEAR never mattered. In
this sense, the BUG_ON triggered is a false positive. But I believe this is
a separate story.
I'm not sure how to really fix this. It seems a bit premature to me to
mark inode as I_CLEAR before the filesystem is actually done with it. So
maybe the line
	inode->i_state = I_FREEING | I_CLEAR;
should be moved to the evict() function?

Honza

--
Jan Kara <jack(a)suse.cz>
SUSE Labs, CR
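
[Editorial sketch, not part of the original message.] The ordering change Jan describes can be illustrated with a toy userspace model. Nothing below is kernel code: `struct toy_inode` and the `toy_*` functions are hypothetical stand-ins, the flag values are assumed from include/linux/fs.h of roughly this era, and the quota dirtying is modelled simply as OR-ing in all three dirty bits, as mark_inode_dirty() would.

```c
#include <assert.h>

/* ASSUMPTION: flag values as in 2.6.3x-era include/linux/fs.h. */
#define I_DIRTY_SYNC      0x01
#define I_DIRTY_DATASYNC  0x02
#define I_DIRTY_PAGES     0x04
#define I_FREEING         0x20
#define I_CLEAR           0x40
#define I_DIRTY (I_DIRTY_SYNC | I_DIRTY_DATASYNC | I_DIRTY_PAGES)

/* Toy stand-in for struct inode: only the state word matters here. */
struct toy_inode {
	unsigned long i_state;
};

/* Models the quota code dirtying the inode via mark_inode_dirty(). */
static void toy_quota_dirty(struct toy_inode *inode)
{
	inode->i_state |= I_DIRTY;
}

/* Old ordering: ->delete_inode dirties the inode first, then
 * clear_inode() overwrites the whole state with I_CLEAR, so the
 * stale dirty bits are never seen. Returns the final i_state. */
static unsigned long toy_old_order(void)
{
	struct toy_inode inode = { .i_state = I_FREEING };

	toy_quota_dirty(&inode);	/* ext3 ->delete_inode, via quota */
	inode.i_state = I_CLEAR;	/* clear_inode() overwrites state */
	return inode.i_state;
}

/* New ordering: end_writeback() sets the final state early, and the
 * later ext3_free_inode() dirties the inode again through quota while
 * freeing xattrs. Returns the final i_state. */
static unsigned long toy_new_order(void)
{
	struct toy_inode inode = { .i_state = I_FREEING };

	inode.i_state = I_FREEING | I_CLEAR;	/* end_writeback() */
	toy_quota_dirty(&inode);		/* ext3_free_inode(), via quota */
	return inode.i_state;
}
```

In this model the old ordering ends with only I_CLEAR set, while the new ordering ends with the dirty bits on top of I_FREEING | I_CLEAR, the same pattern as the i_state=0x67 reported earlier in the thread. Moving the state assignment later, as suggested above, would correspond to swapping the two statements in `toy_new_order()`.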
From: Jan Kara on
On Tue 20-07-10 17:44:24, Andrew Morton wrote:
> On Wed, 21 Jul 2010 08:45:25 +1000
....
> This isn't necessarily a problem in the quota code (setting aside the
> question: why the heck does dquot_free_space() set I_DIRTY_PAGES??).
Because sometime in the dark past (2.4 days I believe), I used
mark_inode_dirty in quota functions (not sure whether there even were
different inode dirty flags back then) and it stayed this way up to now.
mark_inode_dirty_sync() is of course more appropriate for quota code these
days. Cleanup is on its way...

Honza
--
Jan Kara <jack(a)suse.cz>
SUSE Labs, CR