From: Rafael J. Wysocki on
On Wednesday, August 04, 2010, Paul E. McKenney wrote:
> On Tue, Aug 03, 2010 at 08:39:22PM -0700, Arve Hjønnevåg wrote:
> > On Tue, Aug 3, 2010 at 5:51 PM, <david(a)lang.hm> wrote:
> > > On Tue, 3 Aug 2010, Paul E. McKenney wrote:
> > >
> > >> On Tue, Aug 03, 2010 at 04:19:25PM -0700, david(a)lang.hm wrote:
> > >>>
> > >>> On Tue, 3 Aug 2010, Arve Hjønnevåg wrote:
> > >>>
> > >>>> 2010/8/2 <david(a)lang.hm>:
> > >>>>>
> > >>>>> so what is the fundamental difference between deciding to go into
> > >>>>> low-power idle modes to wake back up at a given point in the
> > >>>>> future and deciding that you are going to be idle for so long that
> > >>>>> you may as well suspend until there is user input?
> > >>>>>
> > >>>>
> > >>>> Low power idle modes are supposed to be transparent. Suspend stops the
> > >>>> monotonic clock, ignores ready threads and switches over to a separate
> > >>>> set of wakeup events/interrupts. We don't suspend until there is user
> > >>>> input, we suspend until there is a wakeup event (user-input, incoming
> > >>>> network data/phone-calls, alarms etc..).
> > >>>
> > >>> s/user input/wakeup event/ and my question still stands.
> > >>>
> > >>> low power modes are not transparent to the user in all cases (if the
> > >>> screen backlight dims/shuts off a user reading something will
> > >>> notice, if the system switches to a lower clock speed it can impact
> > >>> user response time, etc) The system is making its best guess as to
> > >>> how to best serve the user by sacrificing some capabilities to save
> > >>> power now so that the power can be available later.
> > >>>
> > >>> as I see it, suspending until a wakeup event (button press, incoming
> > >>> call, alarm, etc) is just another datapoint along the same path.
> > >>>
> > >>> If the system could not wake itself up to respond to user input,
> > >>> phone call, alarm, etc and needed the power button pressed to wake
> > >>> up (or shut down to the point where the battery could be removed and
> > >>> reinstalled a long time later), I would see things moving into a
> > >>> different category, but as long as the system has the ability to
> > >>> wake itself up later (and is still consuming power) I see the
> > >>> suspend as being in the same category as the other low-power modes
> > >>> (it's just more expensive to go in and out of)
> > >>>
> > >>>
> > >>> why should the suspend be put into a different category from the
> > >>> other low-power states?
> > >>
> > >> OK, I'll bite...
> > >
> > > thanks, this is not intended to be a trap.
> > >
> > >> From an Android perspective, the differences are as follows:
> > >>
> > >> 1. Deep idle states are entered only if there are no runnable tasks.
> > >> In contrast, opportunistic suspend can happen even when there
> > >> are tasks that are ready, willing, and able to run.
> > >
> > > Ok, this is a complication to what I'm proposing (and seems a little odd,
> > > but I can see how it can work), but not necessarily a major problem. it
> > > depends on exactly how the decision is made to go into low power states
> > > and/or suspend. If this is done by an application that is able to look at
> > > either all activity or ignore one cgroup of processes at different times in
> > > its calculations then this would work.
> > >
> > >> 2. There can be a set of input events that do not bring the system
> > >> out of suspend, but which would bring the system out of a deep
> > >> idle state. For example, I believe that it was stated that one
> > >> of the Android-based smartphones ignores touchscreen input while
> > >> suspended, but pays attention to it while in deep idle states.
> > >
> > > I see this as simply being a matter of what devices are still enabled at the
> > > different power savings levels. At one level the touchscreen is still
> > > powered, while at another level it isn't, and at yet another level you have
> > > to hit the power soft-button. This isn't fundamentally different from
> > > powering off a USB peripheral that the system decides is idle (and then not
> > > seeing input from it until something else wakes the system)
> >
> > The touchscreen on android devices is powered down long before we
> > suspend, so that is not a good example. There is still a significant
> > difference between suspend and idle though. In idle all interrupts
> > work, in suspend only interrupts that the driver has called
> > enable_irq_wake on will work (on platforms that support it).
> >
> > >> 3. The system comes out of a deep idle state when a timer
> > >> expires. In contrast, timers cannot expire while the
> > >> system is suspended. (This one is debatable: some people
> > >> argue that timers are subject to jitter, and the suspend
> > >> case for timers is the same as that for deep idle states,
> > >> but with unbounded timer jitter. Others disagree. The
> > >> resulting discussions have produced much heat, but little
> > >> light. Such is life.)
> > >
> > > if you have the ability to wake for an alarm, you have the ability to wake
> > > for a timer (if from no other method than to set the alarm to when the timer
> > > tick would go off)
> >
> > If you just program the alarm you will wake up, see that the monotonic
> > clock has not advanced and set the alarm another n seconds into the
> > future. Or are you proposing that suspend should be changed to keep the
> > monotonic clock running? If you are, why? We can enter the same
> > hardware states from idle, and modifying suspend to wake up more often
> > would increase the average power consumption in suspend, not improve
> > it for idle. In other words, if suspend wakes up as often as idle, why
> > use suspend?
>
> Hmmm... The bit about the monotonic clock not advancing could help
> explain at least some of the heartburn from the scheduler and real-time
> folks. ;-)

I think that indeed is the case, although they haven't expressed that directly
yet (at least not that I know of :-)).
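
Arve's point about the alarm can be made concrete with a toy model. The
sketch below is only an illustration, not kernel code: ToyClock and
wakeups_until_timer are hypothetical names, and the model simply assumes
that the monotonic clock advances while the system runs and stands still
while it is suspended, as described above.

```python
class ToyClock:
    """Toy model: monotonic time stops during suspend, wall time does not."""

    def __init__(self):
        self.monotonic = 0.0
        self.wall = 0.0

    def run(self, seconds):
        # While the system is awake both clocks advance.
        self.monotonic += seconds
        self.wall += seconds

    def suspend(self, seconds):
        # While suspended only wall time passes.
        self.wall += seconds


def wakeups_until_timer(clock, timer_delay, awake_per_wakeup):
    """Count alarm wakeups needed before a monotonic timer expires."""
    deadline = clock.monotonic + timer_delay
    wakeups = 0
    while clock.monotonic < deadline:
        remaining = deadline - clock.monotonic
        # Program the alarm "remaining" seconds ahead and suspend.
        clock.suspend(remaining)
        wakeups += 1
        # After waking, the monotonic clock has not moved, so the timer
        # is still "remaining" seconds away; only awake time counts.
        clock.run(min(awake_per_wakeup, remaining))
    return wakeups


clock = ToyClock()
print(wakeups_until_timer(clock, timer_delay=10.0, awake_per_wakeup=1.0))
print(clock.monotonic, clock.wall)
```

In this model a 10 second timer with 1 second of awake time per wakeup
needs 10 alarm wakeups and 65 seconds of wall time before it expires,
which is the "set the alarm another n seconds into the future" loop.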

> My guess is that this is not a problem for Android workloads, which
> probably do not contain aggressive real-time components. (With the
> possible exception of interactions with the cellphone network, which
> I believe are handled by a separate core with separate OS.) However,
> pulling this into the Linux kernel would require that interactions with
> aggressive real-time workloads be handled, one way or another.
>
> I can see a couple possible resolutions:
>
> 1. Make OPPORTUNISTIC_SUSPEND depend on !PREEMPT_RT, so that
> opportunistic suspend simply doesn't happen on systems that
> support aggressive real-time workloads.
>
> 2. Allow OPPORTUNISTIC_SUSPEND and PREEMPT_RT, but suppress
> opportunistic suspend when there is a user-created real-time
> process. One way to handle this would be with a variation
> on a tongue-in-cheek suggestion from Peter Zijlstra, namely
> to have every real-time process hold a wakelock. Note that
> such a wakelock would need to be held even if the real-time
> process in question was not runnable, in order to meet
> possible real-time deadlines when the real-time process was
> awakened.

I guess the scheduler itself would need to hold that wakelock.

> 3. Your proposal here. ;-)
>
> Thoughts?

The case when there's a real-time process that's not using its time slices
(because it doesn't have anything to do) seems to be hard. You'd probably
want to suspend in that case, but then meeting the real-time deadlines would
be kind of unrealistic ...

Thanks,
Rafael
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Rafael J. Wysocki on
On Wednesday, August 04, 2010, Paul E. McKenney wrote:
> On Tue, Aug 03, 2010 at 08:57:58PM -0700, Arjan van de Ven wrote:
> > On Tue, 3 Aug 2010 17:10:15 -0700
> > "Paul E. McKenney" <paulmck(a)linux.vnet.ibm.com> wrote:
> > >
> > > OK, I'll bite...
> > >
> > > From an Android perspective, the differences are as follows:
> > >
> > > 1. Deep idle states are entered only if there are no runnable
> > > tasks. In contrast, opportunistic suspend can happen even when there
> > > are tasks that are ready, willing, and able to run.
> >
> > for "system suspend", this is an absolutely valid statement.
> > for "use suspend as idle state", it's not so clearly valid.
> > (but this is sort of a separate problem, basically the "when do we
> > freeze the tasks that we don't like for power reasons" problem,
> > which in first order is independent of what kind of idle power state
> > you pick, and discussed extensively elsewhere in this thread)
>
> From what I can see, the Android folks are using "suspend" in
> the "system suspend" sense.
>
> I agree that the proposals for freezing subsets of the tasks in the
> system are independent of whether idle or suspend is being used.
> Instead, such freezing depends on (for example) whether or not the
> display is active.
>
> That said, freezing subsets of tasks is a nice-to-have rather than a
> hard requirement for Android. Though I suspect that the appearance
> of a reliable way of freezing subsets of tasks just might promote
> this to a hard requirement. ;-)
>
> > > 2. There can be a set of input events that do not bring the
> > > system out of suspend, but which would bring the system out of a deep
> > > idle state. For example, I believe that it was stated that
> > > one of the Android-based smartphones ignores touchscreen input while
> > > suspended, but pays attention to it while in deep idle states.
> >
> > I would argue that this is both a hardware specific issue, but also a
> > policy issue. From the user point of view, screen off with idle and
> > screen off with suspend aren't all that different (if my phone would
> > decide to idle rather than suspend because some app blocks suspend... I
> > wouldn't expect a difference in behavior when I touch the screen).
> > "Screen off -> don't honor touch after a bit" is almost an independent,
> > but very real, policy problem (and a forced one in suspend, I'll grant
> > you that). I could even argue that the policy decision "we don't care
> > about the touch screen input" is a pre-condition for entering suspend
> > (or in android speak, caring for touch screen input/having the touch
> > screen path active would be a suspend blocker)
>
> I agree that the subset of input events that do not bring the system out
> of suspend would be governed both by hardware capabilities and by policy.
>
> > > 3. The system comes out of a deep idle state when a timer
> > > expires. In contrast, timers cannot expire while the
> > > system is suspended. (This one is debatable: some people
> > > argue that timers are subject to jitter, and the suspend
> > > case for timers is the same as that for deep idle states,
> > > but with unbounded timer jitter. Others disagree. The
> > > resulting discussions have produced much heat, but little
> > > light. Such is life.)
> >
> > I'll debate it even harder in that it's platform specific whether
> > timers can get the system out of suspend or not. Clearly on the Android
> > platform in question that's not the case, but for some of the Intel
> > phone silicon for example, timers CAN be wake sources to get you out of
> > suspend just fine. It just depends on which exact hw you talk about.
> > Generally, even if the fast timers aren't wake up sources, there'll be
> > some sort of alarm thing that you can pre-wake.. but yes you are right
> > in saying that's rather lame.
> > Either way, it's not a general property of suspend, but a property of
> > suspend on the specific platform in question.
>
> Good point, I do need to emphasize the fact that whether or not timers
> pull the system out of suspend also depends both on hardware and
> on policy. So I will change my statement to say something like "The
> system comes out of a deep idle state when a timer expires. In contrast,
> timers do not necessarily expire while the system is suspended, depending
> on both hardware support and platform/application policy."

That's correct IMO.

Thanks,
Rafael
From: Paul E. McKenney on
On Wed, Aug 04, 2010 at 10:42:08PM +0200, Pavel Machek wrote:
> Hi!
>
> > > If this doesn't work for the Android folks for whatever reason, another
> > > approach would be to do the freeze in user code, which could track
> > > whether any user-level resources (pthread mutexes, SysV semas, whatever)
> > > were held, and do the freeze on a thread-by-thread basis within each
> > > "victim" application as the threads reach safe points.
> >
> > The main problem I see with the cgroups solution is that it doesn't seem
> > to do anything to handle avoiding loss of wakeup events.
>
> In a different message, Arve said they are actually using low-power idle
> to emulate suspend on Android.

Hello, Pavel,

Could you please point me at this message?

Thanx, Paul

> This came like a bit of a shock to me ("why do they make it so complex
> then"), but... it also means that as soon as you are able to stop
> "unwanted" processing, you can just leave normal cpuidle mechanisms to
> deal with the rest...
>
> (Of course, you'll also have to fix kernel timers not to beat
> unnecessarily often; still that's a better solution than just stopping
> them all and then sprinkling wakelocks all over the kernel to deal
> with obvious bugs it introduces...)
> Pavel
> --
> (english) http://www.livejournal.com/~pavelmachek
> (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
From: Matthew Garrett on
On Wed, Aug 04, 2010 at 10:51:07PM +0200, Rafael J. Wysocki wrote:
> On Wednesday, August 04, 2010, Matthew Garrett wrote:
> > No! And that's precisely the issue. Android's existing behaviour could
> > be entirely implemented in the form of a binary that manually triggers
> > suspend when (a) the screen is off and (b) no userspace applications
> > have indicated that the system shouldn't sleep, except for the wakeup
> > event race. Imagine the following:
> >
> > 1) The policy timeout is about to expire. No applications are holding
> > wakelocks. The system will suspend providing nothing takes a wakelock.
> > 2) A network packet arrives indicating an incoming SIP call
> > 3) The VOIP application takes a wakelock and prevents the phone from
> > suspending while the call is in progress
> >
> > What stops the system going to sleep between (2) and (3)? cgroups don't,
> > because the voip app is an otherwise untrusted application that you've
> > just told the scheduler to ignore.
>
> I _think_ you can use the just-merged /sys/power/wakeup_count mechanism to
> avoid the race (if pm_wakeup_event() is called at 2)).

Yes, I think that solves the problem. The only question then is whether
it's preferable to use cgroups or suspend fully, which is pretty much up
to the implementation. In other words, is there a reason we're still
having this conversation? :) It'd be good to have some feedback from
Google as to whether this satisfies their functional requirements.
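
For reference, a minimal sketch of how a userspace policy daemon might use
/sys/power/wakeup_count to close the window between (2) and (3). It assumes
the documented protocol: read the count, write the same value back (the
write fails if a wakeup event has been reported in the meantime), and only
then write to /sys/power/state. The function name try_suspend, the
may_suspend callback and the overridable path arguments are hypothetical
and exist only for this illustration.

```python
def try_suspend(may_suspend,
                count_path="/sys/power/wakeup_count",
                state_path="/sys/power/state"):
    """Attempt one suspend; return True only if suspend was requested."""
    # Snapshot the number of wakeup events seen so far.
    with open(count_path) as f:
        count = f.read().strip()

    # Policy check (screen off, no application blocking suspend, ...).
    # It runs after the snapshot, so any wakeup event arriving from
    # here on makes the write-back below fail.
    if not may_suspend():
        return False

    try:
        # The kernel rejects this write if the count has changed since
        # the read above, i.e. a wakeup event raced with the decision.
        with open(count_path, "w") as f:
            f.write(count)
    except OSError:
        return False

    # No wakeup event since the snapshot: request suspend to RAM.
    with open(state_path, "w") as f:
        f.write("mem")
    return True
```

A daemon would call this in a loop and go back to evaluating its policy
whenever it returns False. It only helps if the network driver in the
example reports the incoming packet via pm_wakeup_event().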

--
Matthew Garrett | mjg59(a)srcf.ucam.org
From: Rafael J. Wysocki on
On Wednesday, August 04, 2010, Matthew Garrett wrote:
> On Wed, Aug 04, 2010 at 11:30:44AM -0700, david(a)lang.hm wrote:
> > a couple days ago I made the suggestion to put non-privileged tasks in a
> > cgroup so that the idle/suspend decision code could ignore activity
> > caused by this cgroup.
> >
> > in the second version wakeup events would be 'activity' that would be
> > counted and therefore the system would not be idle. As for the race with
> > suspending and new things happening, wouldn't that be handled the same
> > way that it is in a normal linux box?
>
> No! And that's precisely the issue. Android's existing behaviour could
> be entirely implemented in the form of a binary that manually triggers
> suspend when (a) the screen is off and (b) no userspace applications
> have indicated that the system shouldn't sleep, except for the wakeup
> event race. Imagine the following:
>
> 1) The policy timeout is about to expire. No applications are holding
> wakelocks. The system will suspend providing nothing takes a wakelock.
> 2) A network packet arrives indicating an incoming SIP call
> 3) The VOIP application takes a wakelock and prevents the phone from
> suspending while the call is in progress
>
> What stops the system going to sleep between (2) and (3)? cgroups don't,
> because the voip app is an otherwise untrusted application that you've
> just told the scheduler to ignore.

I _think_ you can use the just-merged /sys/power/wakeup_count mechanism to
avoid the race (if pm_wakeup_event() is called at 2)).

Thanks,
Rafael