From: Artem Bityutskiy on
On Thu, 2010-01-21 at 22:03 -0500, Jeff Angielski wrote:
> I am trying use an UBIFS root filesystem on my PowerPC MPC8544 but I am
> seeing some intermitent problems with "UBIFS assert failed in
> ubifs_dirty_inode" errors.
>
> On the first boot after I program the NAND with a fresh UBI image,
> everything seems to work ok.
>
> After that, on subsequent powercycles or reboots, I sometimes see a boot
> with the following error:
>
> [ 5.984232] UBIFS assert failed in ubifs_dirty_inode at 377 (pid 1011)

Interesting.

The stack trace for this assertion is:

[ 42.724193] [df121e60] [c00070f8] show_stack+0x3c/0x17c (unreliable)
[ 42.730566] [df121ea0] [c017e754] ubifs_dirty_inode+0xe0/0xe4
[ 42.736325] [df121eb0] [c00c6fbc] __mark_inode_dirty+0x3c/0x16c
[ 42.742260] [df121ec0] [c01f9034] random_write+0x8c/0xa4
[ 42.747584] [df121ef0] [c00a4d2c] vfs_write+0xb4/0x184
[ 42.752730] [df121f10] [c00a53e8] sys_write+0x4c/0x90
[ 42.757795] [df121f40] [c000fd48] ret_from_syscall+0x0/0x3c

Which leads us to

~/git/linux-2.6/drivers/char/random.c, where we can find this code:

inode->i_mtime = current_fs_time(inode->i_sb);
mark_inode_dirty(inode);
return (ssize_t)count;

which is the reason for the assertion and for the further budgeting
screw-up.

The thing is that UBIFS has to allocate budget every-time a clean inode
is made dirty. But the 'random' driver bypasses UBIFS budget allocation,
and instead, changes mtime directly and marks the inode as dirty
directly.

The driver should instead call the ->setattr() method of the inode,
which should do the right thing. IOW, something like this is needed:

diff --git a/drivers/char/random.c b/drivers/char/random.c
index 8258982..f911781 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -1108,6 +1108,7 @@ static ssize_t random_write(struct file *file,
const char __user *buffer,
{
size_t ret;
struct inode *inode = file->f_path.dentry->d_inode;
+ struct iattr;

ret = write_pool(&blocking_pool, buffer, count);
if (ret)
@@ -1116,8 +1117,11 @@ static ssize_t random_write(struct file *file,
const char __user *buffer,
if (ret)
return ret;

- inode->i_mtime = current_fs_time(inode->i_sb);
- mark_inode_dirty(inode);
+ iattr->i_mtime = current_fs_time(inode->i_sb);
+ iattr->ia_valid = ATTR_ATIME;
+ ret = inode_setattr(inode, &attr);
+ if (ret)
+ return ret;
return (ssize_t)count;
}

Note - I did not even compile-test this. Would you try that please :-)

CCing Matt and lkml.

> If that message occurs, in less than 1 or 2 minutes, the system is
> unresponsive and I start to see infinite messages of:
>
> [ 55.072791] UBIFS assert failed in ubifs_release_budget at 566 (pid 974)
> [ 55.079502] Call Trace:
> [ 55.081960] [df187ce0] [c00070f8] show_stack+0x3c/0x17c (unreliable)
> [ 55.088334] [df187d20] [c0196a44] ubifs_release_budget+0x3e0/0x5d0
> [ 55.094533] [df187d30] [c01752d4] release_existing_page_budget+0x30/0x40
> [ 55.101241] [df187d60] [c01765ac] do_writepage+0xd0/0x1e4
> [ 55.106652] [df187da0] [c00786e0] __writepage+0x24/0x80
> [ 55.111884] [df187db0] [c0078b14] write_cache_pages+0x184/0x310
> [ 55.117817] [df187e50] [c00c5cb0] writeback_single_inode+0xac/0x288
> [ 55.124092] [df187e80] [c00c6608] writeback_inodes_wb+0x2d4/0x458
> [ 55.130193] [df187ed0] [c00c68b8] wb_writeback+0x12c/0x200
> [ 55.135686] [df187f40] [c00c6cb0] wb_do_writeback+0x224/0x244
> [ 55.141439] [df187f80] [c00c6d38] bdi_writeback_task+0x68/0xa8
> [ 55.147282] [df187fa0] [c0086430] bdi_start_fn+0x80/0x104
> [ 55.152691] [df187fc0] [c0050cc4] kthread+0x78/0x7c
> [ 55.157581] [df187ff0] [c000fac8] kernel_thread+0x4c/0x68
> [ 55.163362] UBIFS assert failed in ubifs_budget_space at 463 (pid 1139)

[sip]

--
Best Regards,
Artem Bityutskiy (Артём Битюцкий)

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Matt Mackall on
On Tue, 2010-01-26 at 06:40 +0200, Artem Bityutskiy wrote:
> On Thu, 2010-01-21 at 22:03 -0500, Jeff Angielski wrote:
> > I am trying use an UBIFS root filesystem on my PowerPC MPC8544 but I am
> > seeing some intermitent problems with "UBIFS assert failed in
> > ubifs_dirty_inode" errors.
> >
> > On the first boot after I program the NAND with a fresh UBI image,
> > everything seems to work ok.
> >
> > After that, on subsequent powercycles or reboots, I sometimes see a boot
> > with the following error:
> >
> > [ 5.984232] UBIFS assert failed in ubifs_dirty_inode at 377 (pid 1011)
>
> Interesting.
>
> The stack trace for this assertion is:
>
> [ 42.724193] [df121e60] [c00070f8] show_stack+0x3c/0x17c (unreliable)
> [ 42.730566] [df121ea0] [c017e754] ubifs_dirty_inode+0xe0/0xe4
> [ 42.736325] [df121eb0] [c00c6fbc] __mark_inode_dirty+0x3c/0x16c
> [ 42.742260] [df121ec0] [c01f9034] random_write+0x8c/0xa4
> [ 42.747584] [df121ef0] [c00a4d2c] vfs_write+0xb4/0x184
> [ 42.752730] [df121f10] [c00a53e8] sys_write+0x4c/0x90
> [ 42.757795] [df121f40] [c000fd48] ret_from_syscall+0x0/0x3c
>
> Which leads us to
>
> ~/git/linux-2.6/drivers/char/random.c, where we can find this code:
>
> inode->i_mtime = current_fs_time(inode->i_sb);
> mark_inode_dirty(inode);
> return (ssize_t)count;
>
> which is the reason for the assertion and for the further budgeting
> screw-up.
>
> The thing is that UBIFS has to allocate budget every-time a clean inode
> is made dirty. But the 'random' driver bypasses UBIFS budget allocation,
> and instead, changes mtime directly and marks the inode as dirty
> directly.
>
> The driver should instead call the ->setattr() method of the inode,
> which should do the right thing. IOW, something like this is needed:
>
> diff --git a/drivers/char/random.c b/drivers/char/random.c
> index 8258982..f911781 100644
> --- a/drivers/char/random.c
> +++ b/drivers/char/random.c
> @@ -1108,6 +1108,7 @@ static ssize_t random_write(struct file *file,
> const char __user *buffer,
> {
> size_t ret;
> struct inode *inode = file->f_path.dentry->d_inode;
> + struct iattr;
>
> ret = write_pool(&blocking_pool, buffer, count);
> if (ret)
> @@ -1116,8 +1117,11 @@ static ssize_t random_write(struct file *file,
> const char __user *buffer,
> if (ret)
> return ret;
>
> - inode->i_mtime = current_fs_time(inode->i_sb);
> - mark_inode_dirty(inode);
> + iattr->i_mtime = current_fs_time(inode->i_sb);
> + iattr->ia_valid = ATTR_ATIME;
> + ret = inode_setattr(inode, &attr);
> + if (ret)
> + return ret;
> return (ssize_t)count;
> }
>
> Note - I did not even compile-test this. Would you try that please :-)

Hmm. I'd just as soon drop it entirely. Here's a patch. Herbert, you
want to send this through your crypto tree?


random: drop weird m_time/a_time manipulation

No other driver does anything remotely like this that I know of except
for the tty drivers, and I can't see any reason for random/urandom to do
it. In fact, it's a (trivial, harmless) timing information leak. And
obviously, it generates power- and flash-cycle wasting I/O, especially
if combined with something like hwrngd. Also, it breaks ubifs's
expectations.

Signed-off-by: Matt Mackall <mpm(a)selenic.com>

diff -r 29db0c391ce8 drivers/char/random.c
--- a/drivers/char/random.c Sun Jan 17 11:01:16 2010 -0800
+++ b/drivers/char/random.c Mon Jan 25 23:32:00 2010 -0600
@@ -1051,12 +1051,6 @@
/* like a named pipe */
}

- /*
- * If we gave the user some bytes, update the access time.
- */
- if (count)
- file_accessed(file);
-
return (count ? count : retval);
}

@@ -1116,8 +1110,6 @@
if (ret)
return ret;

- inode->i_mtime = current_fs_time(inode->i_sb);
- mark_inode_dirty(inode);
return (ssize_t)count;
}


--
http://selenic.com : development and support for Mercurial and Linux


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Artem Bityutskiy on
On Tue, 2010-01-26 at 12:03 +0200, Artem Bityutskiy wrote:
> On Mon, 2010-01-25 at 23:48 -0600, Matt Mackall wrote:
> > Hmm. I'd just as soon drop it entirely. Here's a patch. Herbert, you
> > want to send this through your crypto tree?
> >
> >
> > random: drop weird m_time/a_time manipulation
> >
> > No other driver does anything remotely like this that I know of except
> > for the tty drivers, and I can't see any reason for random/urandom to do
> > it. In fact, it's a (trivial, harmless) timing information leak. And
> > obviously, it generates power- and flash-cycle wasting I/O, especially
> > if combined with something like hwrngd. Also, it breaks ubifs's
> > expectations.
> >
> > Signed-off-by: Matt Mackall <mpm(a)selenic.com>

Just in case anyone wonders where this came from, here is the beginning
of the thread:

http://lists.infradead.org/pipermail/linux-mtd/2010-January/028727.html

--
Best Regards,
Artem Bityutskiy (Артём Битюцкий)

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Artem Bityutskiy on
On Mon, 2010-01-25 at 23:48 -0600, Matt Mackall wrote:
> Hmm. I'd just as soon drop it entirely. Here's a patch. Herbert, you
> want to send this through your crypto tree?
>
>
> random: drop weird m_time/a_time manipulation
>
> No other driver does anything remotely like this that I know of except
> for the tty drivers, and I can't see any reason for random/urandom to do
> it. In fact, it's a (trivial, harmless) timing information leak. And
> obviously, it generates power- and flash-cycle wasting I/O, especially
> if combined with something like hwrngd. Also, it breaks ubifs's
> expectations.
>
> Signed-off-by: Matt Mackall <mpm(a)selenic.com>
>
> diff -r 29db0c391ce8 drivers/char/random.c
> --- a/drivers/char/random.c Sun Jan 17 11:01:16 2010 -0800
> +++ b/drivers/char/random.c Mon Jan 25 23:32:00 2010 -0600
> @@ -1051,12 +1051,6 @@
> /* like a named pipe */
> }
>
> - /*
> - * If we gave the user some bytes, update the access time.
> - */
> - if (count)
> - file_accessed(file);
> -
> return (count ? count : retval);
> }
>
> @@ -1116,8 +1110,6 @@
> if (ret)
> return ret;
>
> - inode->i_mtime = current_fs_time(inode->i_sb);
> - mark_inode_dirty(inode);
> return (ssize_t)count;
> }

It may brake other FSes expectations, theoretically, as well.

Anyway, I'm perfectly fine if this is removed.

Jeff, could you please try Matt's patch and report back if you still
have issues or not. If no, you can use this as a temporary work-around
until a proper fix hits upstream or ubifs-2.6.git.

Thanks!

--
Best Regards,
Artem Bityutskiy (Артём Битюцкий)

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Jeff Angielski on
Artem Bityutskiy wrote:
> On Mon, 2010-01-25 at 23:48 -0600, Matt Mackall wrote:
>> Hmm. I'd just as soon drop it entirely. Here's a patch. Herbert, you
>> want to send this through your crypto tree?
>>
>>
>> random: drop weird m_time/a_time manipulation
>>
>> No other driver does anything remotely like this that I know of except
>> for the tty drivers, and I can't see any reason for random/urandom to do
>> it. In fact, it's a (trivial, harmless) timing information leak. And
>> obviously, it generates power- and flash-cycle wasting I/O, especially
>> if combined with something like hwrngd. Also, it breaks ubifs's
>> expectations.
>>
>> Signed-off-by: Matt Mackall <mpm(a)selenic.com>
>>
>> diff -r 29db0c391ce8 drivers/char/random.c
>> --- a/drivers/char/random.c Sun Jan 17 11:01:16 2010 -0800
>> +++ b/drivers/char/random.c Mon Jan 25 23:32:00 2010 -0600
>> @@ -1051,12 +1051,6 @@
>> /* like a named pipe */
>> }
>>
>> - /*
>> - * If we gave the user some bytes, update the access time.
>> - */
>> - if (count)
>> - file_accessed(file);
>> -
>> return (count ? count : retval);
>> }
>>
>> @@ -1116,8 +1110,6 @@
>> if (ret)
>> return ret;
>>
>> - inode->i_mtime = current_fs_time(inode->i_sb);
>> - mark_inode_dirty(inode);
>> return (ssize_t)count;
>> }
>
> It may brake other FSes expectations, theoretically, as well.
>
> Anyway, I'm perfectly fine if this is removed.
>
> Jeff, could you please try Matt's patch and report back if you still
> have issues or not. If no, you can use this as a temporary work-around
> until a proper fix hits upstream or ubifs-2.6.git.

Matt's patch did not compile as written. I tried to implement what I
think he was trying to do and created this patch (it seems to match the
guts of what inode_setattr() was looking for):


diff --git a/drivers/char/random.c b/drivers/char/random.c
index 8258982..70f16c7 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -1108,6 +1108,7 @@ static ssize_t random_write(struct file *file,
const char __user *buffer,
{
size_t ret;
struct inode *inode = file->f_path.dentry->d_inode;
+ struct iattr attr;

ret = write_pool(&blocking_pool, buffer, count);
if (ret)
@@ -1116,8 +1117,12 @@ static ssize_t random_write(struct file *file,
const char __user *buffer,
if (ret)
return ret;

- inode->i_mtime = current_fs_time(inode->i_sb);
- mark_inode_dirty(inode);
+ attr.ia_mtime = current_fs_time(inode->i_sb);
+ attr.ia_valid = ATTR_MTIME;
+ ret = inode_setattr(inode, &attr);
+ if (ret)
+ return ret;
+
return (ssize_t)count;
}

However, this patch does not fix the problem. I still see the same
errors. Matt, is this what you were trying to do?

I've also included the console dump just in case.

I did try Artem's patch that removes the offending code and that works
fine. No problems on any reboots or reading/writing the UBIFS.


--
Jeff Angielski
The PTR Group
www.theptrgroup.com