[MTD] Fix JFFS2 sync silent failure [Kernel]

Prev: Phylib polling when doing mdio_read will cause system response and transfer speed drop
Next: [Bug #15610] fsck leads to swapper - BUG: unable to handle kernel NULL pointer dereference & panic

From: Jens Axboe on 19 Apr 2010 03:40

On Sat, Apr 17 2010, J�rn Engel wrote:
> Moin David,
>
> if I read the code correctly, JFFS2 will happily perform a NOP on
> sys_sync() again. And this time it appears to be Jens' fault.
>
> JFFS2 does not appear to set s_bdi anywhere. And as of 32a88aa1,
> __sync_filesystem() will return 0 if s_bdi is not set. As a result,
> sync_fs() is never called for jffs2 and whatever remains in the wbuf
> will not make it to the device.
>
> The patch also adds a BUG_ON to catch this problem in any remaining or
> future offenders. I am not sure about network filesystems, but at
> least bdev- and mtd-based ones should be caught.
>
> Opinions?

I think that BUG_ON() would be a lot better as a printk() and fail mount
instead. There's really little point in killing the kernel for something
you could easily warn about and recover nicely.

--
Jens Axboe

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: Jens Axboe on 19 Apr 2010 06:30

On Mon, Apr 19 2010, J�rn Engel wrote:
> On Mon, 19 April 2010 09:38:44 +0200, Jens Axboe wrote:
> > On Sat, Apr 17 2010, J�rn Engel wrote:
> > > Moin David,
> > >
> > > if I read the code correctly, JFFS2 will happily perform a NOP on
> > > sys_sync() again. And this time it appears to be Jens' fault.
> > >
> > > JFFS2 does not appear to set s_bdi anywhere. And as of 32a88aa1,
> > > __sync_filesystem() will return 0 if s_bdi is not set. As a result,
> > > sync_fs() is never called for jffs2 and whatever remains in the wbuf
> > > will not make it to the device.
> > >
> > > The patch also adds a BUG_ON to catch this problem in any remaining or
> > > future offenders. I am not sure about network filesystems, but at
> > > least bdev- and mtd-based ones should be caught.
> > >
> > > Opinions?
> >
> > I think that BUG_ON() would be a lot better as a printk() and fail mount
> > instead. There's really little point in killing the kernel for something
> > you could easily warn about and recover nicely.
>
> *shrug*
> The BUG_ON directly above is not qualitatively different. In both cases
> the only solution is to find and fix the bug in some other file,
> recompile and try again. But ultimately I don't care, as long as we
> catch the bug before people lose their data.
>
> Feel free to respin this patch. You caused the problem in the first
> place. ;)
>
> For the record, while I consider the two-liner that causes this mess a
> real turd, the rest of your patch was brilliant. A shame I didn't catch
> this earlier.

Thanks, we definitely should have put a debug statement to catch this in
from day 1, good debugging should be an important part of any new
infrastructure.

Care to send your jffs2 patch separately to David? Then I'll commit a
modified variant for complaining about missing ->s_bdi on mount.

--
Jens Axboe

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: Jörn Engel on 19 Apr 2010 07:50

On Mon, 19 April 2010 12:20:56 +0200, Jens Axboe wrote:
>
> Care to send your jffs2 patch separately to David? Then I'll commit a
> modified variant for complaining about missing ->s_bdi on mount.

Sure.

David, this patch is untested. It looks trivially correct and fixes a
nasty bug, but I don't test jffs2 and only noticed the problem in
passing.

Jörn

--
Maintenance in other professions and of other articles is concerned with
the return of the item to its original state; in Software, maintenance
is concerned with moving an item away from its original state.
-- Les Belady

diff --git a/drivers/mtd/mtdsuper.c b/drivers/mtd/mtdsuper.c
index af8b42e..7c00319 100644
--- a/drivers/mtd/mtdsuper.c
+++ b/drivers/mtd/mtdsuper.c
@@ -44,6 +45,7 @@ static int get_sb_mtd_set(struct super_block *sb, void *_mtd)

sb->s_mtd = mtd;
sb->s_dev = MKDEV(MTD_BLOCK_MAJOR, mtd->index);
+ sb->s_bdi = mtd->backing_dev_info;
return 0;
}

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: Jens Axboe on 22 Apr 2010 08:10

On Thu, Apr 22 2010, J�rn Engel wrote:
> On Thu, 22 April 2010 12:39:53 +0200, Jens Axboe wrote:
> >
> > Here's a series for fixing these. At this point they are totally
> > untested except that I did compile them. Note that your analysis
> > appeared correct for all cases but ocfs2, which does use get_sb_bdev()
> > and hence gets ->s_bdi assigned.
> >
> > You can see them here, I'll post the series soon:
> >
> > http://git.kernel.dk/?p=linux-2.6-block.git;a=shortlog;h=refs/heads/for-linus
> >
> > The first patch is a helper addition, the rest are per-fs fixups.
>
> Looks good at a cursory glance. What's still missing is some sort of
> assertion. You are a smart person and missed this problem, twice even.
> Even if you hadn't, a not so smart person can add a new filesystem and
> miss s_bdi, like I did. We want some automatism to catch this.

I totally agree, we want some way to catch this problem in the future.
Really the check needs to be something ala:

if (!sb->s_bdi && mnt_has_storage_backing() && rw_mount)
yell_and_fail;

but I'm not sure how best to accomplish that. We can check for ->s_bdev
and mtd like you did, but that does not catch network file systems for
instance.

--
Jens Axboe

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: Jörn Engel on 22 Apr 2010 08:20

On Thu, 22 April 2010 14:08:37 +0200, Jens Axboe wrote:
>
> I totally agree, we want some way to catch this problem in the future.
> Really the check needs to be something ala:
>
> if (!sb->s_bdi && mnt_has_storage_backing() && rw_mount)
> yell_and_fail;
>
> but I'm not sure how best to accomplish that. We can check for ->s_bdev
> and mtd like you did, but that does not catch network file systems for
> instance.

One way would be to either add a flag to all safe filesystems or create
a noop_bdi for them. It adds a line of code and some bytes[*] per
instance to most filesystems, but that's the only catchall I can think
of.

I guess if noone comes up with a better plan I'll look into that.

[*] Maybe we can steal a bit somewhere to make it less expensive.

Jörn

--
You can't tell where a program is going to spend its time. Bottlenecks
occur in surprising places, so don't try to second guess and put in a
speed hack until you've proven that's where the bottleneck is.
-- Rob Pike
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

| Next | Last
Pages: 1 2 3
Prev: Phylib polling when doing mdio_read will cause system response and transfer speed drop
Next: [Bug #15610] fsck leads to swapper - BUG: unable to handle kernel NULL pointer dereference & panic