From: Mark Brown on
On Fri, Aug 06, 2010 at 04:35:59PM -0700, david(a)lang.hm wrote:
> On Fri, 6 Aug 2010, Paul E. McKenney wrote:

Guys, please try to cut unneeded text from the quotes - it makes it much
easier to find the new text.

>>> Well, not really from the Linux point of view. It's not massively
>>> different to something like keeping an ethernet controller sufficiently
>>> alive to allow it to provide wake on LAN functionality while the system

>> The wake-on-LAN and the lights-out management systems are indeed
>> interesting examples, and actually pretty good ones. The reason I
>> excluded them is that they don't do any application processing -- their
>> only purpose is the care and feeding of the system itself. In contrast,
>> the embedded processors are able to do significant applications processing
>> (e.g., play back music) while any CPUs are completely shut down and most
>> of the memory is powered down as well.

This isn't a particularly meaningful distinction; things like the LoM
systems on servers are generally at least as capable as things like the
DSPs doing tasks like offloaded MP3 decode and often provide useful
services in themselves (like system monitoring). It's really just
semantics to treat them differently to something like a cellular modem -
at a high level they're both just independent processors ticking away
without the application processor.

> one other significant issue is that on the PC, things like wake-on-LAN,
> lights out management cards, etc require nothing from the main system
> other than power. If they do something, they are sending the signal to
> the chipset, which then wakes the system up. they don't interact with the
> main processor/memory/etc at all.

I don't see that it makes much difference what gets kept alive - at the
end of the day the point is that we're making a decision to keep bits of
the system alive over suspend.

> So as I see it, we need to do one of two things.

> 1. change the suspend definition to allow for some things to not be
> suspended

This is essentially what's already happening.

> 2. change the sleep/low-power mode definition to have a more standardized
> way of turning things off, and extend it to allow clocks to be turned off
> as well (today we have things able to be turned off, drive spin-down for
> example, but per comments in this thread it's all one-off methods)

Currently things like clock trees are frequently managed orthogonally to
the system power state to at least some extent anyway - for example,
perfectly normal wake events like button presses will often require
clocks for things like debouncing.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Paul E. McKenney on
On Sat, Aug 07, 2010 at 01:14:32AM +0100, Mark Brown wrote:
> On Fri, Aug 06, 2010 at 04:35:59PM -0700, david(a)lang.hm wrote:
> > On Fri, 6 Aug 2010, Paul E. McKenney wrote:
>
> Guys, please try to cut unneeded text from the quotes - it makes it much
> easier to find the new text.
>
> >>> Well, not really from the Linux point of view. It's not massively
> >>> different to something like keeping an ethernet controller sufficiently
> >>> alive to allow it to provide wake on LAN functionality while the system
>
> >> The wake-on-LAN and the lights-out management systems are indeed
> >> interesting examples, and actually pretty good ones. The reason I
> >> excluded them is that they don't do any application processing -- their
> >> only purpose is the care and feeding of the system itself. In contrast,
> >> the embedded processors are able to do significant applications processing
> >> (e.g., play back music) while any CPUs are completely shut down and most
> >> of the memory is powered down as well.
>
> This isn't a particularly meaningful distinction; things like the LoM
> systems on servers are generally at least as capable as things like the
> DSPs doing tasks like offloaded MP3 decode and often provide useful
> services in themselves (like system monitoring). It's really just
> semantics to treat them differently to something like a cellular modem -
> at a high level they're both just independent processors ticking away
> without the application processor.

I agree that a smartphone's cellular modem can be argued to be very
similar to wake-on-LAN. The smartphone applications that seem to me
to be very different from wake-on-LAN are things like audio playback,
where the system is providing service to the user during the time that
it is suspended.

> > one other significant issue is that on the PC, things like wake-on-LAN,
> > lights out management cards, etc require nothing from the main system
> > other than power. If they do something, they are sending the signal to
> > the chipset, which then wakes the system up. they don't interact with the
> > main processor/memory/etc at all.
>
> I don't see that it makes much difference what gets kept alive - at the
> end of the day the point is that we're making a decision to keep bits of
> the system alive over suspend.

The distinction is whether or not the system is perceived to be actively
doing something useful while it is suspended. Yes, this is subjective,
but the distinction is still important.

> > So as I see it, we need to do one of two things.
>
> > 1. change the suspend definition to allow for some things to not be
> > suspended
>
> This is essentially what's already happening.

The time-of-day clock is certainly a case in point here. ;-)

Thanx, Paul

> > 2. change the sleep/low-power mode definition to have a more standardized
> > way of turning things off, and extend it to allow clocks to be turned off
> > as well (today we have things able to be turned off, drive spin-down for
> > example, but per comments in this thread it's all one-off methods)
>
> Currently things like clock trees are frequently managed orthogonally to
> the system power state to at least some extent anyway - for example,
> perfectly normal wake events like button presses will often require
> clocks for things like debouncing.
From: david on
On Sat, 7 Aug 2010, Mark Brown wrote:

> On Fri, Aug 06, 2010 at 04:35:59PM -0700, david(a)lang.hm wrote:
>> On Fri, 6 Aug 2010, Paul E. McKenney wrote:
>
>> So as I see it, we need to do one of two things.
>
>> 1. change the suspend definition to allow for some things to not be
>> suspended
>
> This is essentially what's already happening.
>
>> 2. change the sleep/low-power mode definition to have a more standardized
>> way of turning things off, and extend it to allow clocks to be turned off
>> as well (today we have things able to be turned off, drive spin-down for
>> example, but per comments in this thread it's all one-off methods)
>
> Currently things like clock trees are frequently managed orthogonally to
> the system power state to at least some extent anyway - for example,
> perfectly normal wake events like button presses will often require
> clocks for things like debouncing.

I recognise that #1 is essentially what Android is already doing.

I'm asking the question, "Is this what Linux should be doing?"

Personally, I think that suspend should be treated much more like a
low-power state and much less like hibernation than it currently is (I
believe that Linus has also voiced this opinion). And I think that the
situation with Android suspending while audio is playing between bursts of
CPU activity is a perfect example.

for the moment, forget the problem of other apps that may be running, and
consider a system that's just running a media player.

the media player needs bursts of CPU to decode the media so that the
output device can access it (via DMA or something like that)

the media player needs bursts of I/O to read the encoded program source
from storage.

What we want to have happen in an ideal world is

when the storage isn't needed (between reads) the storage should shutdown
to as low a power state as possible.

when the CPU isn't needed (between decoding bursts) the CPU and as much of
the system as possible (potentially including some banks of RAM) should
shutdown to as low a power state as possible.


today there are two ways of this happening, via the idle approach (on
everything except Android), or via suspend (on Android)

Given that many platforms cannot go into suspend while still playing
audio, the idle approach cannot be eliminated (and in fact will be the
most common approach used/debugged across the range of platforms). It
seems to me that there may be significant value in seeing if there is a
way to change Android to use this approach as well, instead of having two
different systems competing to do the same job.
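
To put rough numbers on the burst/idle pattern described above, here is a
small Python sketch of the average power of a duty-cycled workload. Every
figure in it (active power, the two low-power levels, the 5% duty cycle) is
made up purely for illustration and is not a measurement of any real
platform; the function name is hypothetical as well.

```python
def average_power_mw(duty_cycle, active_mw, low_power_mw):
    """Average draw when the system is active for a fraction duty_cycle
    of the time and sits in a low-power state for the rest."""
    return duty_cycle * active_mw + (1.0 - duty_cycle) * low_power_mw


# All values below are hypothetical, chosen only to show the shape of
# the trade-off.
ACTIVE_MW = 500.0        # decoding burst
SHALLOW_IDLE_MW = 50.0   # an idle state that is "not idle enough"
DEEP_STATE_MW = 5.0      # a suspend-like state between bursts
DUTY_CYCLE = 0.05        # CPU busy 5% of the time

shallow = average_power_mw(DUTY_CYCLE, ACTIVE_MW, SHALLOW_IDLE_MW)
deep = average_power_mw(DUTY_CYCLE, ACTIVE_MW, DEEP_STATE_MW)
print(f"shallow idle: {shallow:.2f} mW, deep state: {deep:.2f} mW")
```

With a small duty cycle the low-power term dominates the average, which is
why it matters how deep a state can be reached between bursts, whichever
mechanism (idle or suspend) gets the system there.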

David Lang
From: Arve Hjønnevåg on
2010/8/6 Alan Stern <stern(a)rowland.harvard.edu>:
> On Thu, 5 Aug 2010, Arve Hjønnevåg wrote:
>
>> count, tells you how many times the wakelock was activated. If a
>> wakelock prevented suspend for a long time a large count tells you it
>> handled a lot of events while a small count tells you it took a long
>> time to process the events, or the wakelock was not released properly.
>
> As noted, we already have this.
>

Almost. We have it when a device is passed in.

>> expire_count, tells you how many times the timeout expired. For the
>> input event wakelock in the android kernel (which has a timeout) an
>> expire count that matches the count tells you that someone opened an
>> input device but is not reading from it (this has happened several
>> times).
>
> This is a little tricky. Rafael's model currently does not allow
> wakeup events started by pm_wakeup_event() to be cancelled any way
> other than by having their timer expire. This essentially means that
> for some devices, expire_count will always be the same as count and for
> others it will always be 0. To change this would require adding an
> extra timer struct, which could be done (in fact, an earlier version of
> the code included it). It would be nice if we could avoid the need.
>
> Does Android use any kernel-internal wakelocks both with a timer and
> with active cancellation?
>

I don't know if they are all kernel-internal but these drivers appear
to use timeouts and active cancellation on the same wakelock:
wifi driver, mmc core, alarm driver, evdev (suspend blocker version
removes the timeout).

>> wake_count, tells you that this is the first wakelock that was
>> acquired in the resume path. This is currently less useful than I
>> would like on the Nexus One since it is usually "SMD_RPCCALL" which
>> does not tell me a lot.
>
> This could be done easily enough, but if it's not very useful then
> there's no point.
>
It is useful, as there is no other way to tell what triggered a wakeup,
but it would probably be better to just track wakeup interrupts/events
elsewhere.

>> active_since, tells you how long a still-active wakelock has been
>> active. If someone activated a wakelock and never released it, it will
>> be obvious here.
>
> Easily added. But you didn't mention any field saying whether the
> wakelock is currently active. That could be added too (although it
> would be racy -- but for detecting unreleased wakelocks you wouldn't
> care).
>

These are the reported stats, not the fields in the stats structure.
The wakelock code has an active flag. If we want to keep the
pm_stay_awake nesting (which I would argue against), we would need an
active count. It would also require a handle, which is a change Rafael
said would not fly.

>> total_time, total time the wake lock has been active. This one should
>> be obvious.
>
> Also easily added.
>
Only with a handle passed to all the calls.

>> sleep_time, total time the wake lock has been active when the screen was off.
>
> Not applicable to general systems. Is there anything like it that
> _would_ apply in general?
>

Screen off is how it is used on Android; the stat is keyed off
what user space wrote to /sys/power/state. If "on" was written the
sleep time is not updated.

>> max_time, longest time the wakelock was active uninterrupted. This is
>> used less often, but if the battery on a device was draining fast and
>> the problem went away before you looked at the stats, this will show if a
>> wakelock was active for a long time.
>
> Again, easily added. The only drawback is that all these additions
> will bloat the size of struct device. Of course, that's why you used
> separately-allocated structures for your wakelocks. Maybe we can
> change to do the same; it seems likely that the majority of device
> structures won't ever be used for wakeup events.
>

Since many wakelocks are not associated with a struct device we need a
separate object for this anyway.
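
As a rough illustration of how the statistics discussed above relate to one
another, here is a Python sketch of a separately allocated per-wakelock
stats object. It is not the Android kernel structure; the class, its method
names, and the "input-event" lock name are all hypothetical, and it assumes
no nesting (a second acquire while active is ignored).

```python
import dataclasses
from typing import Optional


@dataclasses.dataclass
class WakelockStats:
    """Illustrative per-wakelock counters; not the Android kernel struct."""
    name: str
    count: int = 0            # times the lock was activated
    expire_count: int = 0     # times it was released by timeout expiry
    total_time: float = 0.0   # cumulative seconds held
    max_time: float = 0.0     # longest single uninterrupted hold
    active_since: Optional[float] = None  # activation time, None if idle

    def acquire(self, now: float) -> None:
        if self.active_since is None:   # no nesting: re-acquire is a no-op
            self.active_since = now
            self.count += 1

    def release(self, now: float, expired: bool = False) -> None:
        if self.active_since is None:   # releasing an idle lock does nothing
            return
        held = now - self.active_since
        self.total_time += held
        self.max_time = max(self.max_time, held)
        if expired:
            self.expire_count += 1
        self.active_since = None


stats = WakelockStats("input-event")    # hypothetical lock name
stats.acquire(now=0.0)
stats.release(now=5.0, expired=True)    # nobody read the event; timeout fired
stats.acquire(now=10.0)
stats.release(now=12.0)                 # released explicitly by the driver
print(stats)
```

An expire_count that tracks count closely is the pattern described above for
an input device that was opened but never read, and a non-None active_since
with an old timestamp points at a lock that was never released.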

>> >> and I would prefer that the kernel interfaces would
>> >> encourage drivers to block suspend until user space has consumed the
>> >> event, which works for the android user space, instead of just long
>> >> enough to work with a hypothetical user space power manager.
>
> Rafael doesn't _discourage_ drivers from doing this. However you have
> to keep in mind that many kernel developers are accustomed to working
> on systems (mostly PCs) with a different range of hardware devices from
> embedded systems like your phones. With PCI devices(*), for example,
> there's no clear point where a wakeup event gets handed off to
> userspace.
>
> On the other hand, there's no reason the input layer shouldn't use
> pm_stay_awake and pm_relax. It simply hasn't been implemented yet.
....

The merged user space interface makes this unclear to me. When I first
used suspend on android I had a power manager process that opened all
the input devices and reset a screen off timeout every time there was
an input event. If the input layer uses pm_stay_awake to block suspend
when the queue is not empty, this will deadlock with the current
interface since reading the wake count will block forever if an input
event occurred right after the power manager decides to suspend.
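
The scenario above can be modelled with a toy Python sketch. This is a
single-threaded simulation, not kernel code: the class and method names are
hypothetical, and try_read() returns None where a real read of the wake
count would block. It assumes the input queue holds the system awake while
it is non-empty, as described above.

```python
class WakeupCounter:
    """Toy model of a wake count whose read blocks while events are pending."""

    def __init__(self):
        self.count = 0          # completed wakeup events
        self.in_progress = 0    # events still being processed

    def stay_awake(self):
        self.in_progress += 1

    def relax(self):
        self.in_progress -= 1
        self.count += 1

    def try_read(self):
        # A real read would block here; None stands for "would block".
        return None if self.in_progress else self.count


class InputQueue:
    """Input layer that blocks suspend while its queue is not empty."""

    def __init__(self, counter):
        self.counter = counter
        self.events = []

    def push(self, event):
        if not self.events:
            self.counter.stay_awake()
        self.events.append(event)

    def pop(self):
        event = self.events.pop(0)
        if not self.events:
            self.counter.relax()
        return event


counter = WakeupCounter()
queue = InputQueue(counter)

# The power manager has decided to suspend; a key press arrives just after.
queue.push("key-press")
blocked = counter.try_read() is None
# The power manager is the only reader of the queue, and it is stuck in the
# read above, so nothing would ever call queue.pop(): that is the deadlock.

# If something drains the queue first, the read can complete.
queue.pop()
value_after_drain = counter.try_read()
print(blocked, value_after_drain)
```

In this model the read only completes once the queue has been drained, which
is exactly the step a power manager blocked in the read cannot perform.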

--
Arve Hjønnevåg
From: Ted Ts'o on
On Fri, Aug 06, 2010 at 06:00:34PM -0700, david(a)lang.hm wrote:
>
> today there are two ways of this happening, via the idle approach
> (on everything except Android), or via suspend (on Android)
>

Most other devices use a lot more power at idle; in some cases it's
because the hardware just isn't as power optimized (why bother, when
you have 94,000 mWh of power at your disposal with a 6 cell laptop
battery, as opposed to the 800-1000 mWh that you might have on a cell
phone battery). In other cases, it's because the kernel and the
low-level software stack (never mind the applications) are waking up
the CPU too darned often --- in other words, idle simply isn't idle
enough.

So you may want to consider whether part of the problem is that
general purpose Linux systems need a radical redesign to get power
utilization down to those sorts of levels --- where the CPU might only
be waking up once every half-hour or so, and then only do actual
useful work.

Can you get there by assuming that every single application is
competently written? In an idle approach, you have to. That way lies
Maemo, where installing just one bad application will cut your battery
life time by a factor of 2-3. You could try stopping processes by
using kill -STOP, but at that point you've moved into the Android
strategy of "suspend". And the only question is what is the most
efficient way to allow the system to run when there is true work that
needs to be done, and how to avoid deadlocks by stopping processes
that might be holding user space locks --- and to administer when and
how to suspend the processes.

Sure, you could do something amazingly complicated using cgroups, and
user space processes that have to wake up the CPU every 30 seconds to
see if it's safe to suspend the system --- but why not just use the
system which is being used by 200,000 new phones every day? It's
simple and it works. And unlike Maemo, untrustworthy applications
don't end up chewing up your battery lifetime.
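
For a sense of scale, the two wakeup intervals mentioned above (every 30
seconds for a polling user space process, once every half-hour for the
target) can be compared with a few lines of Python. The function name is
hypothetical; the intervals are the ones quoted in this message, and only
wakeup counts are computed, not power.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def wakeups_per_day(interval_seconds):
    """Number of periodic wakeups in one day for a fixed interval."""
    return SECONDS_PER_DAY // interval_seconds


polling = wakeups_per_day(30)            # check every 30 seconds
half_hourly = wakeups_per_day(30 * 60)   # wake once every half-hour
ratio = polling // half_hourly

print(polling, half_hourly, ratio)       # prints "2880 48 60"
```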

- Ted