From: Jan Kara on
> Jan Kara wrote:
>
> > Umm, but these two traces confuse me:
> >1) They are different traces than those you wrote about initially,
> >aren't they? Because here we would not call sync_dirty_buffer() from
> >journal_dirty_data().
> > BTW: Does this buffer trace lead to that Oops in submit_bh()? I guess not
> >as the buffer is not dirty...
>
> They do wind up at the same oops, from the same "testcase" (i.e. beat the
> tar out of the filesystem with multiple fsx's and fsstress...)
>
> The buffer is not dirty at that tracepoint because it has just done
> if (locked && test_clear_buffer_dirty(bh)) {
> prior to the tracepoint...
Oh, I see. OK.
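
A minimal userspace sketch of why the buffer already looks clean at the tracepoint, assuming test_clear_buffer_dirty() behaves as a test-and-clear on the dirty bit. The toy_* names and the state-word layout are hypothetical stand-ins, not kernel code, and the toy version is not atomic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's dirty/mapped buffer state bits. */
enum {
    TOY_BH_DIRTY  = 1u << 0,
    TOY_BH_MAPPED = 1u << 1,
};

/* Toy replacement for struct buffer_head; only the state word is modelled. */
struct toy_bh {
    unsigned int state;
};

static bool toy_buffer_dirty(const struct toy_bh *bh)
{
    assert(bh != NULL);
    return (bh->state & TOY_BH_DIRTY) != 0;
}

/* Return the previous dirty bit and clear it (not atomic in this toy). */
static bool toy_test_clear_buffer_dirty(struct toy_bh *bh)
{
    bool was_dirty = toy_buffer_dirty(bh);

    bh->state &= ~(unsigned int)TOY_BH_DIRTY;
    return was_dirty;
}

/*
 * Mirrors the quoted condition: the branch is taken only if the buffer was
 * locked and dirty, and the dirty bit is already cleared by the time
 * anything inside the branch (such as a tracepoint) inspects the buffer.
 */
static bool toy_should_write(struct toy_bh *bh, bool locked)
{
    return locked && toy_test_clear_buffer_dirty(bh);
}
```

Calling toy_should_write() on a locked, dirty buffer returns true and leaves the buffer clean, which matches a trace that shows a clean buffer on the way to submission.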

>
> See the whole traces at
>
> http://people.redhat.com/esandeen/traces/eric_ext3_oops1.txt
> http://people.redhat.com/esandeen/traces/eric_ext3_oops2.txt
Hmm, those traces look really useful. I just have to digest them ;).

> As an aside, when we do journal_unmap_buffer... should it stay on
> t_sync_datalist?
Yes, it should and it seems it really was removed from it at some
point. Only later journal_dirty_data() came and filed it back to the
BJ_SyncData list. And the buffer remained unmapped till the commit time
and then *bang*... It may even be a race in ext3 itself that it called
journal_dirty_data() on an unmapped buffer but I have to read some more
code.
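
The sequence described above can be sketched as a small state model. This is an illustration under stated assumptions only: the toy_* types and functions are hypothetical, the list names merely echo BJ_None and BJ_SyncData, and the real JBD code does far more than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the journal list identifiers named in the thread. */
enum toy_jlist { TOY_BJ_NONE, TOY_BJ_SYNCDATA };

struct toy_bh {
    bool mapped;
    enum toy_jlist jlist;
};

/* Unmapping clears the mapping and takes the buffer off its journal list. */
static void toy_unmap_buffer(struct toy_bh *bh)
{
    assert(bh != NULL);
    bh->mapped = false;
    bh->jlist = TOY_BJ_NONE;
}

/* Files the buffer as sync data without looking at the mapped bit. */
static void toy_dirty_data(struct toy_bh *bh)
{
    assert(bh != NULL);
    bh->jlist = TOY_BJ_SYNCDATA;
}

/* True when commit would find an unmapped buffer on the sync-data list. */
static bool toy_commit_hits_unmapped(const struct toy_bh *bh)
{
    assert(bh != NULL);
    return bh->jlist == TOY_BJ_SYNCDATA && !bh->mapped;
}
```

After toy_unmap_buffer() followed by toy_dirty_data(), toy_commit_hits_unmapped() reports the bad state: the buffer is back on the list but still unmapped.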

Bye
Honza
--
Jan Kara <jack(a)suse.cz>
SuSE CR Labs
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Badari Pulavarty on
On Wed, 2006-10-11 at 16:22 +0200, Jan Kara wrote:
> > Jan Kara wrote:
> >
> > > Umm, but these two traces confuse me:
> > >1) They are different traces than those you wrote about initially,
> > >aren't they? Because here we would not call sync_dirty_buffer() from
> > >journal_dirty_data().
> > > BTW: Does this buffer trace lead to that Oops in submit_bh()? I guess not
> > >as the buffer is not dirty...
> >
> > They do wind up at the same oops, from the same "testcase" (i.e. beat the
> > tar out of the filesystem with multiple fsx's and fsstress...)
> >
> > The buffer is not dirty at that tracepoint because it has just done
> > if (locked && test_clear_buffer_dirty(bh)) {
> > prior to the tracepoint...
> Oh, I see. OK.
>
> >
> > See the whole traces at
> >
> > http://people.redhat.com/esandeen/traces/eric_ext3_oops1.txt
> > http://people.redhat.com/esandeen/traces/eric_ext3_oops2.txt
> Hmm, those traces look really useful. I just have to digest them ;).
>
> > As an aside, when we do journal_unmap_buffer... should it stay on
> > t_sync_datalist?
> Yes, it should and it seems it really was removed from it at some
> point. Only later journal_dirty_data() came and filed it back to the
> BJ_SyncData list. And the buffer remained unmapped till the commit time
> and then *bang*... It may even be a race in ext3 itself that it called
> journal_dirty_data() on an unmapped buffer but I have to read some more
> code.
>

Yes. Calling journal_dirty_data() on an unmapped buffer can definitely
happen. (The only thing I am not sure about is why it doesn't happen with a
simple testcase like dirtying only a part of a page in a 1k filesystem.
I am not sure why we need journal_unmap_buffer() in the sequence.)


Here is what I think is happening..

journal_unmap_buffer() - cleaned the buffer, since it's outside EOF, but
it's a part of the same page. So it remained on the page->buffers
list. (At this time it's not part of any transaction.)

Then, ordered_commit_write() called journal_dirty_data() and we added
all these buffers to the BJ_SyncData list. (At this time the buffer is clean,
not dirty.)

Now msync() called __set_page_dirty_buffers() and dirtied *all* the
buffers attached to this page.

journal_submit_data_buffers() got around to this buffer and tried to
submit the buffer...
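
The four steps above can be replayed on a toy page. This is only a sketch: it assumes four buffers per page (for example 1k blocks in a 4k page), that the commit_write path files every buffer of the page, and that the failure is an unmapped buffer reaching submission. All toy_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_BUFFERS_PER_PAGE 4  /* assumed: 1k blocks in a 4k page */

enum toy_jlist { TOY_BJ_NONE, TOY_BJ_SYNCDATA };

struct toy_bh {
    bool mapped;
    bool dirty;
    enum toy_jlist jlist;
};

struct toy_page {
    struct toy_bh bh[TOY_BUFFERS_PER_PAGE];
};

/* Start with every buffer mapped, clean and not filed on any list. */
static void toy_page_init(struct toy_page *page)
{
    assert(page != NULL);
    for (size_t i = 0; i < TOY_BUFFERS_PER_PAGE; i++) {
        page->bh[i].mapped = true;
        page->bh[i].dirty = false;
        page->bh[i].jlist = TOY_BJ_NONE;
    }
}

/* Step 1: a buffer past EOF is cleaned and unmapped but stays on the page. */
static void toy_unmap_buffer(struct toy_bh *bh)
{
    bh->mapped = false;
    bh->dirty = false;
    bh->jlist = TOY_BJ_NONE;
}

/* Step 2: the commit_write path files every buffer of the page as sync data. */
static void toy_commit_write(struct toy_page *page)
{
    for (size_t i = 0; i < TOY_BUFFERS_PER_PAGE; i++)
        page->bh[i].jlist = TOY_BJ_SYNCDATA;
}

/* Step 3: msync-style dirtying marks all attached buffers, mapped or not. */
static void toy_set_page_dirty_buffers(struct toy_page *page)
{
    for (size_t i = 0; i < TOY_BUFFERS_PER_PAGE; i++)
        page->bh[i].dirty = true;
}

/* Step 4: count buffers that commit would submit although they are unmapped. */
static size_t toy_count_bad_submits(const struct toy_page *page)
{
    size_t bad = 0;

    for (size_t i = 0; i < TOY_BUFFERS_PER_PAGE; i++) {
        const struct toy_bh *bh = &page->bh[i];

        if (bh->jlist == TOY_BJ_SYNCDATA && bh->dirty && !bh->mapped)
            bad++;
    }
    return bad;
}
```

In this model nothing goes wrong until step 3: the unmapped buffer sits on the list harmlessly while clean, and only becomes a candidate for submission once the whole page is dirtied.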

Andrew is right - the only option is for us to check the file size in the
write-out path and skip the buffers beyond EOF.
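
A rough sketch of what such an EOF check could compute, not a proposed patch. The parameter names (page_index, page_size, block_in_page, block_size, file_size) are illustrative and do not correspond to kernel fields, and treating a buffer as skippable when its first byte is at or past the file size is an assumption of this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * File offset of the first byte covered by a buffer, given the index of
 * its page and its position within that page.
 */
static uint64_t toy_buffer_offset(uint64_t page_index, uint32_t page_size,
                                  uint32_t block_in_page, uint32_t block_size)
{
    assert(block_size > 0 && page_size % block_size == 0);
    assert(block_in_page < page_size / block_size);
    return page_index * page_size + (uint64_t)block_in_page * block_size;
}

/* A buffer lies wholly beyond EOF when its first byte is at or past file_size. */
static bool toy_buffer_beyond_eof(uint64_t buffer_offset, uint64_t file_size)
{
    return buffer_offset >= file_size;
}
```

For a 1500-byte file with 1k blocks in a 4k page, the buffers at offsets 0 and 1024 would still be written, while those at 2048 and 3072 would be skipped.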

Thanks,
Badari

From: Eric Sandeen on
Badari Pulavarty wrote:

> Here is what I think is happening..
>
> journal_unmap_buffer() - cleaned the buffer, since it's outside EOF, but
> it's a part of the same page. So it remained on the page->buffers
> list. (At this time it's not part of any transaction.)
>
> Then, ordered_commit_write() called journal_dirty_data() and we added
> all these buffers to the BJ_SyncData list. (At this time the buffer is clean,
> not dirty.)
>
> Now msync() called __set_page_dirty_buffers() and dirtied *all* the
> buffers attached to this page.
>
> journal_submit_data_buffers() got around to this buffer and tried to
> submit the buffer...

This seems about right, but one thing bothers me in the traces; it seems like
there is some locking that is missing. In
http://people.redhat.com/esandeen/traces/eric_ext3_oops1.txt
for example, it looks like journal_dirty_data gets started, but then the
buffer_head is acted on by journal_unmap_buffer, which decides this buffer is
part of the running transaction, past EOF, and clears mapped, dirty, etc. Then
journal_dirty_data picks up again, decides that the buffer is not on the right
list (now BJ_None) and puts it back on BJ_SyncData. Then it gets picked up by
journal_submit_data_buffers and submitted, and oops.

Talking with Stephen, it seemed like the page lock should synchronize these
threads, but I've found that we can get to journal_dirty_data acting on the
buffer heads w/o having the page locked...
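
The interleaving seen in the trace can be replayed deterministically in a single thread. This is a toy model under stated assumptions: the toy_* names are hypothetical, the dirty-data path is reduced to its final "refile if not on the right list" step, and a failed trylock stands in for the unmap path sleeping on the page lock.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_jlist { TOY_BJ_NONE, TOY_BJ_SYNCDATA };

struct toy_page {
    bool locked;
};

struct toy_bh {
    struct toy_page *page;
    bool mapped;
    bool dirty;
    enum toy_jlist jlist;
};

/* Try to take the page lock; returns false if it is already held. */
static bool toy_trylock_page(struct toy_page *page)
{
    if (page->locked)
        return false;
    page->locked = true;
    return true;
}

static void toy_unlock_page(struct toy_page *page)
{
    assert(page->locked);
    page->locked = false;
}

/*
 * Unmap path: needs the page lock, then clears the state and unfiles the
 * buffer. Returns false if the lock was unavailable (real code would wait).
 */
static bool toy_unmap_buffer(struct toy_bh *bh)
{
    if (!toy_trylock_page(bh->page))
        return false;
    bh->mapped = false;
    bh->dirty = false;
    bh->jlist = TOY_BJ_NONE;
    toy_unlock_page(bh->page);
    return true;
}

/* Tail of the dirty-data path: refile the buffer if it is on the wrong list. */
static void toy_dirty_data_finish(struct toy_bh *bh)
{
    if (bh->jlist != TOY_BJ_SYNCDATA)
        bh->jlist = TOY_BJ_SYNCDATA;
}

/*
 * Replays the trace: the dirty-data path starts, the unmap path runs in the
 * middle, and the dirty-data path resumes. hold_page_lock selects whether
 * the dirty-data caller owns the page lock for the whole operation.
 * Returns true if the buffer ends up filed as sync data while unmapped.
 */
static bool toy_race(struct toy_bh *bh, bool hold_page_lock)
{
    bool got_lock = hold_page_lock && toy_trylock_page(bh->page);

    (void)toy_unmap_buffer(bh);   /* the "other thread" cuts in here */
    toy_dirty_data_finish(bh);
    if (got_lock)
        toy_unlock_page(bh->page);
    return bh->jlist == TOY_BJ_SYNCDATA && !bh->mapped;
}
```

With hold_page_lock set to false the model ends in the bad state; with it set to true the unmap attempt is shut out until the dirty-data path has finished, which is the synchronization the page lock was expected to provide.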

I'm still digging, and, er, grasping at straws here... Am I off base?

-Eric


> Andrew is right - the only option is for us to check the file size in the
> write-out path and skip the buffers beyond EOF.
>
> Thanks,
> Badari
>

From: John Wendel on
Eric Sandeen wrote:
> Badari Pulavarty wrote:
>
>> Here is what I think is happening..
>>
>> journal_unmap_buffer() - cleaned the buffer, since it's outside EOF, but
>> it's a part of the same page. So it remained on the page->buffers
>> list. (At this time it's not part of any transaction.)
>>
>> Then, ordered_commit_write() called journal_dirty_data() and we added
>> all these buffers to the BJ_SyncData list. (At this time the buffer is clean,
>> not dirty.)
>>
>> Now msync() called __set_page_dirty_buffers() and dirtied *all* the
>> buffers attached to this page.
>>
>> journal_submit_data_buffers() got around to this buffer and tried to
>> submit the buffer...
>
> This seems about right, but one thing bothers me in the traces; it
> seems like there is some locking that is missing. In
> http://people.redhat.com/esandeen/traces/eric_ext3_oops1.txt
> for example, it looks like journal_dirty_data gets started, but then
> the buffer_head is acted on by journal_unmap_buffer, which decides
> this buffer is part of the running transaction, past EOF, and clears
> mapped, dirty, etc. Then journal_dirty_data picks up again, decides
> that the buffer is not on the right list (now BJ_None) and puts it
> back on BJ_SyncData. Then it gets picked up by
> journal_submit_data_buffers and submitted, and oops.
>
> Talking with Stephen, it seemed like the page lock should synchronize
> these threads, but I've found that we can get to journal_dirty_data
> acting on the buffer heads w/o having the page locked...
>
> I'm still digging, and, er, grasping at straws here... Am I off base?
>
> -Eric
>
>
>> Andrew is right - the only option is for us to check the file size in the
>> write-out path and skip the buffers beyond EOF.
>>
>> Thanks,
>> Badari
>>
Here's another data point for your consideration. I've been seeing this
error since I started running 2.6.18. I assumed it was hardware, so I've
tried 3 different disks (a PATA and 2 SATA drives) with VIA and Promise
controllers; the error has occurred on all of them. I see the error
infrequently, always when downloading lots of small files from Usenet
and building, copying and deleting large (200 - 300 MB) files. I haven't
ever had an oops/panic, just this error. When I run fsck, I always see a
single message that "deleted inode nnn has zero dtime". I hope this will
be useful.

Oct 11 20:37:32 Godzilla kernel: EXT3-fs error (device hda5): ext3_free_blocks_sb: bit already cleared for block 4740550
Oct 11 20:37:32 Godzilla kernel: Aborting journal on device hda5.
Oct 11 20:37:32 Godzilla kernel: EXT3-fs error (device hda5) in ext3_free_blocks_sb: Journal has aborted
Oct 11 20:37:32 Godzilla kernel: EXT3-fs error (device hda5) in ext3_free_blocks_sb: Journal has aborted
Oct 11 20:37:32 Godzilla kernel: EXT3-fs error (device hda5) in ext3_reserve_inode_write: Journal has aborted
Oct 11 20:37:32 Godzilla kernel: EXT3-fs error (device hda5) in ext3_truncate: Journal has aborted
Oct 11 20:37:32 Godzilla kernel: EXT3-fs error (device hda5) in ext3_reserve_inode_write: Journal has aborted
Oct 11 20:37:32 Godzilla kernel: EXT3-fs error (device hda5) in ext3_orphan_del: Journal has aborted
Oct 11 20:37:32 Godzilla kernel: EXT3-fs error (device hda5) in ext3_reserve_inode_write: Journal has aborted
Oct 11 20:37:32 Godzilla kernel: EXT3-fs error (device hda5) in ext3_delete_inode: Journal has aborted
Oct 11 20:37:32 Godzilla kernel: __journal_remove_journal_head: freeing b_committed_data
Oct 11 20:37:32 Godzilla kernel: __journal_remove_journal_head: freeing b_committed_data
Oct 11 20:37:32 Godzilla kernel: ext3_abort called.
Oct 11 20:37:32 Godzilla kernel: EXT3-fs error (device hda5): ext3_journal_start_sb: Detected aborted journal
Oct 11 20:37:32 Godzilla kernel: Remounting filesystem read-only

From: Jan-Benedict Glaw on
On Wed, 2006-10-11 21:34:13 -0700, John Wendel <jwendel10(a)comcast.net> wrote:
> Oct 11 20:37:32 Godzilla kernel: EXT3-fs error (device hda5): ext3_free_blocks_sb: bit already cleared for block 4740550
> Oct 11 20:37:32 Godzilla kernel: Aborting journal on device hda5.
> Oct 11 20:37:32 Godzilla kernel: EXT3-fs error (device hda5) in ext3_free_blocks_sb: Journal has aborted
> Oct 11 20:37:32 Godzilla kernel: EXT3-fs error (device hda5) in ext3_free_blocks_sb: Journal has aborted
> Oct 11 20:37:32 Godzilla kernel: EXT3-fs error (device hda5) in ext3_reserve_inode_write: Journal has aborted
> Oct 11 20:37:32 Godzilla kernel: EXT3-fs error (device hda5) in ext3_truncate: Journal has aborted
> Oct 11 20:37:32 Godzilla kernel: EXT3-fs error (device hda5) in ext3_reserve_inode_write: Journal has aborted
> Oct 11 20:37:32 Godzilla kernel: EXT3-fs error (device hda5) in ext3_orphan_del: Journal has aborted
> Oct 11 20:37:32 Godzilla kernel: EXT3-fs error (device hda5) in ext3_reserve_inode_write: Journal has aborted
> Oct 11 20:37:32 Godzilla kernel: EXT3-fs error (device hda5) in ext3_delete_inode: Journal has aborted
> Oct 11 20:37:32 Godzilla kernel: __journal_remove_journal_head: freeing b_committed_data
> Oct 11 20:37:32 Godzilla kernel: __journal_remove_journal_head: freeing b_committed_data
> Oct 11 20:37:32 Godzilla kernel: ext3_abort called.
> Oct 11 20:37:32 Godzilla kernel: EXT3-fs error (device hda5): ext3_journal_start_sb: Detected aborted journal
> Oct 11 20:37:32 Godzilla kernel: Remounting filesystem read-only

This looks very much like the issue I see.

MfG, JBG

--
Jan-Benedict Glaw jbglaw(a)lug-owl.de +49-172-7608481
Signature of: http://catb.org/~esr/faqs/smart-questions.html
the second :