From: Dave Chinner on
On Mon, Jul 12, 2010 at 11:52:39PM +0800, Wu Fengguang wrote:
> On Mon, Jul 12, 2010 at 10:08:42AM +0800, Dave Chinner wrote:
> > On Sun, Jul 11, 2010 at 10:07:02AM +0800, Wu Fengguang wrote:
> > > - /*
> > > - * akpm: if the caller was the kupdate function we put
> > > - * this inode at the head of b_dirty so it gets first
> > > - * consideration. Otherwise, move it to the tail, for
> > > - * the reasons described there. I'm not really sure
> > > - * how much sense this makes. Presumably I had a good
> > > - * reasons for doing it this way, and I'd rather not
> > > - * muck with it at present.
> > > - */
> > > - if (wbc->for_kupdate) {
> > > + inode->i_state |= I_DIRTY_PAGES;
> > > + if (wbc->nr_to_write <= 0) {
> > > /*
> > > - * For the kupdate function we move the inode
> > > - * to b_more_io so it will get more writeout as
> > > - * soon as the queue becomes uncongested.
> > > + * slice used up: queue for next turn
> > > */
> > > - inode->i_state |= I_DIRTY_PAGES;
> > > - if (wbc->nr_to_write <= 0) {
> > > - /*
> > > - * slice used up: queue for next turn
> > > - */
> > > - requeue_io(inode);
> > > - } else {
> > > - /*
> > > - * somehow blocked: retry later
> > > - */
> > > - redirty_tail(inode);
> > > - }
> > > + requeue_io(inode);
> > > } else {
> > > /*
> > > - * Otherwise fully redirty the inode so that
> > > - * other inodes on this superblock will get some
> > > - * writeout. Otherwise heavy writing to one
> > > - * file would indefinitely suspend writeout of
> > > - * all the other files.
> > > + * somehow blocked: retry later
> > > */
> > > - inode->i_state |= I_DIRTY_PAGES;
> > > redirty_tail(inode);
> > > }
> >
> > This means that congestion will always trigger redirty_tail(). Is
> > that really what we want for that case?
>
> This patch actually converts some redirty_tail() cases to use
> requeue_io(), so are reducing the use of redirty_tail(). Also
> recent kernels are blocked _inside_ get_request() on congestion
> instead of returning to writeback_single_inode() on congestion.
> So the "somehow blocked" comment for redirty_tail() no longer includes
> the congestion case.

Shouldn't some of this be in the comment explain why the tail is
redirtied rather than requeued?

> > Also, I'd prefer that the
> > comments remain somewhat more descriptive of the circumstances that
> > we are operating under. Comments like "retry later to avoid blocking
> > writeback of other inodes" is far, far better than "retry later"
> > because it has "why" component that explains the reason for the
> > logic. You may remember why, but I sure won't in a few months time....
>
> Ah yes the comment is too simple. However the redirty_tail() is not to
> avoid blocking writeback of other inodes, but to avoid eating 100% CPU
> on busy retrying a dirty inode/page that cannot perform writeback for
> a while. (In theory redirty_tail() can still busy retry though, when
> there is only one single dirty inode.) So how about
>
> /*
> * somehow blocked: avoid busy retrying
> */

IMO, no better than "somehow blocked: retry later" because it
desont' include any of the explanation for the code you just gave
me. The comment needs to tell us _why_ we are calling
redirty_tail, not what redirty_tail does. Perhaps something like:

/*
* Writeback blocked by something other than congestion.
* Redirty the inode to avoid spinning on the CPU retrying
* writeback of the dirty page/inode that cannot be
* performed immediately. This allows writeback of other
* inodes until the blocking condition clears.
*/

Cheers,

Dave.
--
Dave Chinner
david(a)fromorbit.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Andrew Morton on
On Mon, 12 Jul 2010 23:52:39 +0800
Wu Fengguang <fengguang.wu(a)intel.com> wrote:

> > Also, I'd prefer that the
> > comments remain somewhat more descriptive of the circumstances that
> > we are operating under. Comments like "retry later to avoid blocking
> > writeback of other inodes" is far, far better than "retry later"
> > because it has "why" component that explains the reason for the
> > logic. You may remember why, but I sure won't in a few months time....

me2 (of course). This code is waaaay too complex to be scrimping on comments.

> Ah yes the comment is too simple. However the redirty_tail() is not to
> avoid blocking writeback of other inodes, but to avoid eating 100% CPU
> on busy retrying a dirty inode/page that cannot perform writeback for
> a while. (In theory redirty_tail() can still busy retry though, when
> there is only one single dirty inode.) So how about
>
> /*
> * somehow blocked: avoid busy retrying
> */

That's much too short. Expand on the "somehow" - provide an example,
describe the common/expected cause. Fully explain what the "busy"
retry _is_ and how it can come about.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Wu Fengguang on
On Tue, Jul 13, 2010 at 06:22:54AM +0800, Andrew Morton wrote:
> On Mon, 12 Jul 2010 23:52:39 +0800
> Wu Fengguang <fengguang.wu(a)intel.com> wrote:
>
> > > Also, I'd prefer that the
> > > comments remain somewhat more descriptive of the circumstances that
> > > we are operating under. Comments like "retry later to avoid blocking
> > > writeback of other inodes" is far, far better than "retry later"
> > > because it has "why" component that explains the reason for the
> > > logic. You may remember why, but I sure won't in a few months time....
>
> me2 (of course). This code is waaaay too complex to be scrimping on comments.
>
> > Ah yes the comment is too simple. However the redirty_tail() is not to
> > avoid blocking writeback of other inodes, but to avoid eating 100% CPU
> > on busy retrying a dirty inode/page that cannot perform writeback for
> > a while. (In theory redirty_tail() can still busy retry though, when
> > there is only one single dirty inode.) So how about
> >
> > /*
> > * somehow blocked: avoid busy retrying
> > */
>
> That's much too short. Expand on the "somehow" - provide an example,
> describe the common/expected cause. Fully explain what the "busy"
> retry _is_ and how it can come about.

It was a long story.. This redirty_tail() was introduced when more_io
is introduced. The initial patch for more_io does not have the
redirty_tail(), and when it's merged, several 100% iowait bug reports
arises:

reiserfs:
http://lkml.org/lkml/2007/10/23/93

jfs:
commit 29a424f28390752a4ca2349633aaacc6be494db5
JFS: clear PAGECACHE_TAG_DIRTY for no-write pages

ext2:
http://www.spinics.net/linux/lists/linux-ext4/msg04762.html

They are all old bugs hidden in various filesystems that become
"obvious" with the more_io patch. At the time, the ext2 bug is thought
to be "trivial", so you didn't merge that fix. Instead the following
patch with redirty_tail() is merged:

http://www.spinics.net/linux/lists/linux-ext4/msg04507.html

This will in general prevent 100% on ext2 and other possibly unknown FS bugs.

I'll take David's comments and note the above in changelog.

Thanks,
Fengguang
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/