From: James Bottomley on
On Fri, 2010-06-04 at 11:59 +0200, Ingo Molnar wrote:
> * Brian Swetland <swetland(a)google.com> wrote:
> > On Fri, Jun 4, 2010 at 1:55 AM, Ingo Molnar <mingo(a)elte.hu> wrote:
> > > * Brian Swetland <swetland(a)google.com> wrote:
[...]
> > > In any case, this is not to suggest that the suspend-blocker bits are
> > > 'impossible' to merge. I just say that if you start with your most
> > > difficult feature you should not be surprised to be on the receiving end
> > > of a 1000+ mails flamewar on lkml ;-)
> >
> > Yeah, I do understand that we're not making it easy for ourselves here. I
> > think we hit the point where Rafael and Matthew signed off on things and
> > thought "aha, linux-pm maintainers are happy, now we're getting somewhere"
> > only to realize the light at the end of the tunnel was a bit further out
> > than we anticipated ^^
>
> That's a well-known problem on lkml: the light at the end of the tunnel was
> the other train ;-)
>
> Anyway, i'm not pessimistic at all: _some_ sort of scheme appears to be
> crystalising out today. Everyone seems to agree now that the main usecases are
> indeed useful and need handling one way or another - the rest is really just
> technological discussions how to achieve the mostly-agreed-upon end goal.

It's still not clear to me whether everyone's coming around to using
the current suspend block API because it's orthogonal to all other
mechanisms and is therefore separate from the kernel (and can be
compiled out if you don't want it), or whether re-expressing what the
android drivers want (minimum idle states and suspend block) in pm_qos
terms which others can use is the way to go. I think the latter, but
I'd like to know what other people think (because I'm not wedded to this
preference).

> The worst situation are features where one side says 'we dont need this kind
> of functionality at all' - IMO auto/opportunistic-suspend isnt in that
> situation, fortunately.

Great ... because deprecating the problem has been one of the persistent
memes by some people on this huge thread.

James


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Alan Stern on
On Fri, 4 Jun 2010, Ingo Molnar wrote:

> Note that this does not necessarily have to be implemented as 'execute suspend
> from the idle task' code: scheduling from the idle task, while it can certainly
> be made to work, is a somewhat recursive concept that we might want to avoid
> for robustness reasons.
>
> Instead, the 'deepest idle' (suspend) method could consist of a wakeup of a
> kernel thread (or of any of the existing kernel threads such as the migration
> thread) - which kernel thread then does a race-free suspend: it offlines all
> but one CPU [on platforms that need that] and then initiates the suspend - but
> aborts the attempt if there's any sign of wakeup activity.

Out of morbid curiosity... A typical sign of wakeup activity is a
thread becoming runnable because of expiration of a kernel timer or an
I/O completion interrupt. How would the "race-free suspend" thread
detect this sort of thing? Indeed, isn't the inability to detect these
part of what makes the existing suspend implementation (the freezer in
particular) not race-free?
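
For illustration only, here is one way the "abort if there's any sign of
wakeup activity" check could be modelled. It is a toy sketch in Python, not
something proposed in this thread: it assumes every wakeup source (timer
expiry, I/O completion) bumps a shared counter, and the names WakeupCounter
and try_suspend are hypothetical.

```python
import threading


class WakeupCounter:
    """Counts wakeup events (timer expiry, I/O completion) under a lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._count = 0

    def report_event(self):
        # Called by the (hypothetical) interrupt or timer path.
        with self._lock:
            self._count += 1

    def read(self):
        with self._lock:
            return self._count


def try_suspend(counter, prepare, enter_suspend):
    """Attempt a suspend; abort if a wakeup event arrived during preparation.

    Returns True if enter_suspend() was called, False if the attempt aborted.
    """
    snapshot = counter.read()
    prepare()  # stands in for offlining CPUs, quiescing devices, etc.
    if counter.read() != snapshot:
        return False  # wakeup activity seen since the snapshot: abort
    enter_suspend()
    return True
```

The model only works if every wakeup source really does report through the
counter, which is exactly the detection problem raised above. It also leaves
a window between the final check and enter_suspend() that real code would
have to close, for example by doing the last check with interrupts disabled.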

Alan Stern

From: Florian Mickler on
On Fri, 04 Jun 2010 09:24:06 -0500
James Bottomley <James.Bottomley(a)suse.de> wrote:

> On Fri, 2010-06-04 at 11:59 +0200, Ingo Molnar wrote:
> > Anyway, i'm not pessimistic at all: _some_ sort of scheme appears to be
> > crystalising out today. Everyone seems to agree now that the main usecases are
> > indeed useful and need handling one way or another - the rest is really just
> > technological discussions how to achieve the mostly-agreed-upon end goal.
>
> It's still not clear to me whether everyone's coming around to using
> the current suspend block API because it's orthogonal to all other
> mechanisms and is therefore separate from the kernel (and can be
> compiled out if you don't want it), or whether re-expressing what the
> android drivers want (minimum idle states and suspend block) in pm_qos
> terms which others can use is the way to go. I think the latter, but
> I'd like to know what other people think (because I'm not wedded to this
> preference).

I'd like to know that also.
I have a patch to add a pm_qos_add_request_nonblock function, so it is
possible to register a pm_qos constraint by passing preallocated
memory to it.

Notifying should be possible to do from atomic contexts via
async_schedule()?

The scalability issues of pm_qos can be addressed by using plists for
all pm_qos_class'es. Or by having the different pm_qos_class'es provide
their own implementations for the update and get operations.
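
A rough illustration of the plist idea, as a toy model in Python rather than
kernel code: requests are kept sorted, so the strictest one can be read from
the head of the list without scanning them all. The class and method names
are hypothetical, and "smallest value wins" is an assumption that fits a
latency-style class.

```python
import bisect


class SortedConstraints:
    """Toy stand-in for a priority-sorted list of QoS requests."""

    def __init__(self, default):
        self._values = []        # kept sorted in ascending order
        self._default = default  # returned when there are no requests

    def add(self, value):
        bisect.insort(self._values, value)

    def remove(self, value):
        i = bisect.bisect_left(self._values, value)
        if i == len(self._values) or self._values[i] != value:
            raise ValueError("no such request")
        del self._values[i]

    def update(self, old, new):
        self.remove(old)
        self.add(new)

    def target(self):
        # The strictest (smallest) request sits at the head of the list.
        return self._values[0] if self._values else self._default
```

Reading the target is a single lookup here; insertion and removal still cost
time proportional to the list length in this sketch, so it shows the shape of
the idea rather than its real performance.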

Cheers,
Flo

From: Rafael J. Wysocki on
On Friday 04 June 2010, Peter Zijlstra wrote:
> On Fri, 2010-06-04 at 01:23 +0200, Ingo Molnar wrote:
> > Btw., i'd like to summarize the scheduler based suspend scheme proposed by
> > Thomas Gleixner, Peter Zijlstra and myself. I found no good summary of it in
> > the big thread, and there are also new elements of the proposal:
>
> Just to clarify, my proposition doesn't go much further than treating
> 'suspend' as a genuine idle state (on suitable hardware, which x86 isn't).
>
> > - Create a 'deep idle' mode that suspends. This, if all constraints
> > are met, is triggered by the scheduler automatically: just like the other
> > idle modes are triggered currently. This approach fixes the wakeup
> > races because an incoming wakeup event will set need_resched() and
> > abort the suspend.
> >
>
> Right, so 'suspend' as idle seems (at least on UP/arm) a very sensible
> idea. On SMP current suspend hot-unplugs all but the boot cpu, I'm not
> sure we need to do that, since if the system is genuinely idle, what races
> are there?
>
> And if its not idle...
>
> > ( This mode can even use the existing suspend code to bring stuff down,
> > therefore it also solves the pending timer problem and works even on
> > PC style x86. )
>
> You cannot solve the pending timer issue from idle, unless you allow idle
> to stop clock_monotonic, which would change idle semantics, and that is not
> something I can say is a good idea.
>
> You want all idle states to have the same semantics, otherwise things just
> get way too confusing.
>
> > - Solve crappy app confinement via the scheduler:
> >
> > A first proposal was to use the existing cgroup mechanism,
>
> I still believe containment is a cgroup problem.

I kind of agree here, so I'd like to focus a bit on that.

Here's my idea in very general terms:

(1) Use the cgroup freezer to "suspend" the "untrusted" apps (ie. the ones
that don't use suspend blockers aka wakelocks in the Android world) at the
point Android would normally start opportunistic suspend.

(2) Allow the cpuidle framework to put CPUs into low-power states after the
"trusted" apps (ie. the ones that use suspend blockers in the Android
world) have gone idle.

(3) Teach the cpuidle framework to schedule runtime suspend of I/O devices
before idling the last CPU (*).

(4) Design a mechanism to resume the I/O devices suspended in (3) so that
they are not powered up unnecessarily (that's going to be difficult as far
as I can see).

This way, in principle, we should be able to save (at least almost) as much
energy as the opportunistic suspend currently used by Android, provided that
things will be capable of staying idle for extended periods of time.

(*) That may require per-device PM QoS requirements to be used, in which case
devices may even be suspended earlier if the PM QoS requirements of all
of their users are met.
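
To make the footnote concrete, here is a toy sketch (Python, not kernel code)
of a per-device check: a device may be suspended when its resume latency is
within what every current user tolerates. The Device class, its fields, and
the microsecond figures in the usage are all hypothetical.

```python
class Device:
    """Hypothetical device with a fixed resume latency in microseconds."""

    def __init__(self, name, resume_latency_us):
        self.name = name
        self.resume_latency_us = resume_latency_us
        self._requests = {}  # user name -> max tolerable latency (us)

    def set_request(self, user, max_latency_us):
        self._requests[user] = max_latency_us

    def drop_request(self, user):
        self._requests.pop(user, None)

    def may_suspend(self):
        # A device with no users has no constraints to violate.
        if not self._requests:
            return True
        # Otherwise the strictest user decides.
        return self.resume_latency_us <= min(self._requests.values())


def devices_to_suspend(devices):
    """Return the names of devices whose users' requirements are all met."""
    return [d.name for d in devices if d.may_suspend()]
```

For example, a device with a 5000 us resume latency and one user tolerating
10000 us would be eligible, while a 200 us device whose user tolerates only
100 us would stay powered until that request is dropped.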

I wonder what people think. Is this realistic and if so, would it be difficult
to implement?

Rafael
From: Thomas Gleixner on
On Sat, 5 Jun 2010, Rafael J. Wysocki wrote:
> I kind of agree here, so I'd like to focus a bit on that.
>
> Here's my idea in very general terms:
>
> (1) Use the cgroup freezer to "suspend" the "untrusted" apps (ie. the ones
> that don't use suspend blockers aka wakelocks in the Android world) at the
> point Android would normally start opportunistic suspend.

There is an additional benefit to this approach:

In the current android world a background task (e.g. download
initiated before the screensaver kicked in) prevents the suspend,
but that also means that the crapplications can still suck power
completely unconfined.

With the cgroup freezer you can "suspend" them right away and
just keep the trusted background task(s) alive which allows us to
go into deeper idle states instead of letting the crapplications
run unconfined until the download finished and the suspend
blocker goes away.
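
For reference, a minimal userspace sketch of freezing and thawing a group
through the cgroup (v1) freezer, which is driven by writing FROZEN or THAWED
to the group's freezer.state file. The helper names are hypothetical, and the
directory passed in is an assumption: it depends on where the freezer
hierarchy is mounted and on which group holds the untrusted apps.

```python
import os


def set_freezer_state(cgroup_dir, state):
    """Write FROZEN or THAWED to the group's freezer.state file."""
    if state not in ("FROZEN", "THAWED"):
        raise ValueError("state must be FROZEN or THAWED")
    with open(os.path.join(cgroup_dir, "freezer.state"), "w") as f:
        f.write(state)


def get_freezer_state(cgroup_dir):
    """Return the current contents of freezer.state, stripped of whitespace."""
    with open(os.path.join(cgroup_dir, "freezer.state")) as f:
        return f.read().strip()
```

On a real hierarchy the state read back can briefly be FREEZING while tasks
are still being stopped, so a caller that needs certainty would poll
get_freezer_state() until it reports FROZEN.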

> (2) Allow the cpuidle framework to put CPUs into low-power states after the
> "trusted" apps (ie. the ones that use suspend blockers in the Android
> world) have gone idle.
>
> (3) Teach the cpuidle framework to schedule runtime suspend of I/O devices
> before idling the last CPU (*).
>
> (4) Design a mechanism to resume the I/O devices suspended in (3) so that
> they are not powered up unnecessarily (that's going to be difficult as far
> as I can see).
>
> This way, in principle, we should be able to save (at least almost) as much
> energy as the opportunistic suspend currently used by Android, provided that
> things will be capable of staying idle for extended periods of time.
>
> (*) That may require per-device PM QoS requirements to be used, in which case
> devices may even be suspended earlier if the PM QoS requirements of all
> of their users are met.
>
> I wonder what people think. Is this realistic and if so, would it be difficult
> to implement?

I think it's realistic and not overly complicated to implement.

Thanks,

tglx