From: Nick Piggin on
On Wed, Jul 21, 2010 at 12:31:46PM +0300, Artem Bityutskiy wrote:
> @@ -973,22 +981,37 @@ void __mark_inode_dirty(struct inode *inode, int flags)
> * reposition it (that would break b_dirty time-ordering).
> */
> if (!was_dirty) {
> - struct bdi_writeback *wb = &inode_to_bdi(inode)->wb;
> - struct backing_dev_info *bdi = wb->bdi;
> -
> - if (bdi_cap_writeback_dirty(bdi) &&
> - !test_bit(BDI_registered, &bdi->state)) {
> - WARN_ON(1);
> - printk(KERN_ERR "bdi-%s not registered\n",
> - bdi->name);
> + bdi = inode_to_bdi(inode);
> +
> + if (bdi_cap_writeback_dirty(bdi)) {
> + WARN(!test_bit(BDI_registered, &bdi->state),
> + "bdi-%s not registered\n", bdi->name);
> +
> + /*
> + * If this is the first dirty inode for this
> + * bdi, we have to wake-up the corresponding
> + * bdi thread to make sure background
> + * write-back happens later.
> + */
> + if (!wb_has_dirty_io(&bdi->wb))
> + wakeup_bdi = true;
> }
>
> inode->dirtied_when = jiffies;
> - list_move(&inode->i_list, &wb->b_dirty);
> + list_move(&inode->i_list, &bdi->wb.b_dirty);
> }
> }
> out:
> spin_unlock(&inode_lock);
> +
> + if (wakeup_bdi) {
> + spin_lock(&bdi->wb_lock);
> + if (!bdi->wb.task)
> + wake_up_process(default_backing_dev_info.wb.task);
> + else
> + wake_up_process(bdi->wb.task);
> + spin_unlock(&bdi->wb_lock);
> + }
> }

We really want to wake up the bdi right away when first dirtying
the inode? I haven't looked at where the state of the bdi code is
now, but isn't it better to have a a delay there?

And rather than spreading details of how bdi tasks are managed
would you consider putting this into its own function?

Other than that, I like your patches. Out of interest, is 5 seconds
very detremental to power usage? What is a reasonable goal for
wakeups? (eg. 95%+ of possible efficiency)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Artem Bityutskiy on
On Thu, 2010-07-22 at 10:41 +1000, Dave Chinner wrote:
> if (wakeup_bdi) {
> trace_writeback_wakeup(bdi)
> spin_lock(&bdi->wb_lock);
> if (!bdi->wb.task) {{
> trace_writeback_wakeup_nothread(bdi);
> wake_up_process(default_backing_dev_info.wb.task);
> } else
> wake_up_process(bdi->wb.task);
> spin_unlock(&bdi->wb_lock);
> }

OK, I'll take a look at this and will try to add this to v3, thanks.

--
Best Regards,
Artem Bityutskiy (Артём Битюцкий)

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Tero.Kristo on


>-----Original Message-----
>From: Artem Bityutskiy [mailto:dedekind1(a)gmail.com]
>Sent: 22 July, 2010 09:48
>To: Nick Piggin; Kristo Tero (Nokia-MS/Tampere)
>Cc: Jens Axboe; linux-fsdevel(a)vger.kernel.org; linux-
>kernel(a)vger.kernel.org
>Subject: Re: [PATCHv2 11/11] writeback: prevent unnecessary bdi threads
>wakeups
>
>Hi Nick,
>
>On Thu, 2010-07-22 at 13:19 +1000, Nick Piggin wrote:
>> > out:
>> > spin_unlock(&inode_lock);
>> > +
>> > + if (wakeup_bdi) {
>> > + spin_lock(&bdi->wb_lock);
>> > + if (!bdi->wb.task)
>> > + wake_up_process(default_backing_dev_info.wb.task);
>> > + else
>> > + wake_up_process(bdi->wb.task);
>> > + spin_unlock(&bdi->wb_lock);
>> > + }
>> > }
>>
>> We really want to wake up the bdi right away when first dirtying
>> the inode? I haven't looked at where the state of the bdi code is
>> now, but isn't it better to have a a delay there?
>
>Yes, I guess we want to wake up the bdi thread after 5 secs (assuming
>default settings). I could implement a "wake_up_process_delayed"
>function which would use a timer, but I think it is not necessary to
>introduce these complications. We can just wake-up the bdi thread, it'll
>find out there is nothing to do, and will go sleep for 5 secs. I think
>this is good enough.
>
>IOW, delayed wake-up is not worth the trouble.
>
>> And rather than spreading details of how bdi tasks are managed
>> would you consider putting this into its own function?
>
>Sure, will do.
>
>> Other than that, I like your patches.
>
>Thanks :-)
>
>> Out of interest, is 5 seconds
>> very detremental to power usage? What is a reasonable goal for
>> wakeups? (eg. 95%+ of possible efficiency)
>
>I cannot tell for sure. In Nokia N900 phone we use OMAP3 and we have
>dynamic OFF-mode, so we switch off the CPU and peripherals completely
>when there is nothing to do, and SDRAM stays in low-power auto-refresh
>mode. Every useless wake-up makes us do a lot of job re-constructing the
>CPU state. I cannot tell the numbers, but I'm CCing Tero, who is working
>on OMAP3 PM and makes a lot of battery current measurements, he can
>provide some numbers.

Well, it is hard to give any good guidelines here, as it really depends on the device architecture, possible power saving modes etc., but I can give some sort of guestimate. Basically I think kernel should not generate any extra wakeups at all if it is not doing "anything too useful". In ideal world, everything should be interrupt driven as much as possible, and we would only have timers for things that are clearly visible for user, or can cause some sort of failure if neglected. Like if we ignore watchdogs, the device will reset itself.

5 seconds by itself is not that bad, the reason we want to get rid of these is that every single wakeup source cumulates. If we have 2 wakeups occurring at 5 second intervals and they are not synced, we effectively can wakeup every 2.5 seconds and so on. I guess a good target is to have 1 device level wakeup every 30 seconds or so, but due to cumulation, I tend to complain about anything that happens more often than once a minute.

-Tero

From: Artem Bityutskiy on
On Thu, 2010-07-22 at 18:05 +1000, Nick Piggin wrote:
> I can see what you mean, but I think the designs in core code should
> be made as efficient as possible _unless_ there is some complication
> in doing otherwise (not the other way around).
>
> This is producing 2 unrequired context switches, so I really would
> like to see it done properly. Setting up a timer is really pretty
> simple (or if you would care to implement a delayed process wakeup
> API, I think that would be useful -- I'm surprised there isn't one
> already).

OK, NP, I'll work on this.

The only problem I see is that it will involve more maintainers and
trees (I guess Ingo?), and make it even more difficult for me to reach
upstream :-) But let's try and see!

--
Best Regards,
Artem Bityutskiy (Артём Битюцкий)

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Nick Piggin on
On Thu, Jul 22, 2010 at 09:48:24AM +0300, Artem Bityutskiy wrote:
> Hi Nick,
>
> On Thu, 2010-07-22 at 13:19 +1000, Nick Piggin wrote:
> > > out:
> > > spin_unlock(&inode_lock);
> > > +
> > > + if (wakeup_bdi) {
> > > + spin_lock(&bdi->wb_lock);
> > > + if (!bdi->wb.task)
> > > + wake_up_process(default_backing_dev_info.wb.task);
> > > + else
> > > + wake_up_process(bdi->wb.task);
> > > + spin_unlock(&bdi->wb_lock);
> > > + }
> > > }
> >
> > We really want to wake up the bdi right away when first dirtying
> > the inode? I haven't looked at where the state of the bdi code is
> > now, but isn't it better to have a a delay there?
>
> Yes, I guess we want to wake up the bdi thread after 5 secs (assuming
> default settings). I could implement a "wake_up_process_delayed"
> function which would use a timer, but I think it is not necessary to
> introduce these complications. We can just wake-up the bdi thread, it'll
> find out there is nothing to do, and will go sleep for 5 secs. I think
> this is good enough.
>
> IOW, delayed wake-up is not worth the trouble.

I can see what you mean, but I think the designs in core code should
be made as efficient as possible _unless_ there is some complication
in doing otherwise (not the other way around).

This is producing 2 unrequired context switches, so I really would
like to see it done properly. Setting up a timer is really pretty
simple (or if you would care to implement a delayed process wakeup
API, I think that would be useful -- I'm surprised there isn't one
already).


> > And rather than spreading details of how bdi tasks are managed
> > would you consider putting this into its own function?
>
> Sure, will do.
>
> > Other than that, I like your patches.
>
> Thanks :-)
>
> > Out of interest, is 5 seconds
> > very detremental to power usage? What is a reasonable goal for
> > wakeups? (eg. 95%+ of possible efficiency)
>
> I cannot tell for sure. In Nokia N900 phone we use OMAP3 and we have
> dynamic OFF-mode, so we switch off the CPU and peripherals completely
> when there is nothing to do, and SDRAM stays in low-power auto-refresh
> mode. Every useless wake-up makes us do a lot of job re-constructing the
> CPU state. I cannot tell the numbers, but I'm CCing Tero, who is working
> on OMAP3 PM and makes a lot of battery current measurements, he can
> provide some numbers.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/