suspend blockers & Android integration [Kernel]

From: Arve Hjønnevåg on 4 Jun 2010 03:40

On Fri, Jun 4, 2010 at 12:13 AM, Ingo Molnar <mingo(a)elte.hu> wrote:
>
> * Arve Hjønnevåg <arve(a)android.com> wrote:
>
>> On Thu, Jun 3, 2010 at 4:23 PM, Ingo Molnar <mingo(a)elte.hu> wrote:
>> ...
>> >  - Controlled auto-suspend: drivers (such as input) could on wakeup
>> >    automatically set the 'minimum wakeup latency' value of wakee tasks to a
>> >    lower value. This automatically prevents another auto-suspend in the near
>> >    future: up to the point the wakee task increases its latency (via the
>> >    scheduler syscall) again and allows suspend again.
>> >
>>
>> How do you clear the latency value in a safe way? If another wakeup event
>> happens right after your wakee task is done processing the last event and
>> decides to increase its latency, auto suspend will be allowed even though
>> you have an unprocessed wakeup event. Also how do you know which task will
>> read the event if it is not already waiting for it?
>
> The easiest solution would be to not do any of that initially. (If it's ever a
> concern we could subtract/add without destroying the nesting property)
>
> Why do you need to track input wakeups? It's rather fragile and rather

Because we have keys that should always turn the screen on, but the
problem is not specific to input events. If we enabled a wakeup event
it usually means we need this event to always work, not just when the
system is fully awake or fully suspended.

> unnecessary - the idle drivers know very well how to not go into the
> deepest idle mode already today. We wont hit C8 on laptops when you are using
> the desktop.
>

The whole point is to allow the use of suspend.

>> >    This means there will be no surprise suspends for a task that may take a
>> >    bit longer than usual to finish its work. [ Detail: this would only be done
>> >    for tasks that have a non-default (non-infinity) task->latency value - to
>> >    prevent the input driver from lowering latency values (and preventing
>> >    future suspends) just because some unaware apps are running and using input
>> >    drivers. ]
>>
>> Don't you need two infinity values for this?
>
> Yes - any value above the max idle latency in the system will do.
>
> Thanks,
>
>         Ingo
>
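
The exchange above about needing two "infinity" values can be made concrete with a small model. This is only a sketch in Python, not code from any posted patch: the constant names, the microsecond values and both helper functions are assumptions chosen for illustration. It shows why a task that never set a latency has to be distinguishable from a task that opted in and currently permits suspend.

```python
import math

# Hypothetical values, for illustration only.
MAX_IDLE_LATENCY_US = 500_000                     # exit latency of the deepest idle state
LATENCY_DEFAULT = math.inf                        # task never set a latency ("unaware" app)
LATENCY_ALLOW_SUSPEND = MAX_IDLE_LATENCY_US + 1   # opted in, currently tolerates any idle state
LATENCY_INTERACTIVE_US = 1_000                    # value a driver sets on wakeup


def on_wakeup(task_latency):
    """Return the wakee's latency after a driver-delivered wakeup."""
    if task_latency == LATENCY_DEFAULT:
        # Unaware task: leave it alone so it cannot block future suspends.
        return task_latency
    # Opted-in task: lower the latency, never raise it.
    return min(task_latency, LATENCY_INTERACTIVE_US)


def suspend_allowed(task_latencies):
    """Suspend is allowed only if every task tolerates the deepest idle latency."""
    return all(lat >= MAX_IDLE_LATENCY_US for lat in task_latencies)
```

With one unaware task and one opted-in task, suspend stays allowed until a wakeup lowers the opted-in task's latency. The unaware task's value never changes, which is only possible because its "infinity" differs from the opted-in one.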

--
Arve Hjønnevåg
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
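
The race Arve raises in the message above (a second wakeup event arriving just as the wakee task decides to raise its latency again) can be replayed with a toy model. This is a minimal Python sketch, not kernel code: the class and method names are hypothetical, and `CountedLatency` is only one possible reading of the "subtract/add" remark, not something specified in the thread. It does not address the separate question of which task will read an event.

```python
class FlagLatency:
    """Wakee latency modelled as a single flag: low means suspend is blocked."""

    def __init__(self):
        self.low = False
        self.pending = 0              # wakeup events queued but not yet read

    def wakeup_event(self):           # driver side
        self.pending += 1
        self.low = True

    def task_reads_event(self):
        self.pending -= 1

    def task_raises_latency(self):    # task says "I am done"
        self.low = False

    def suspend_allowed(self):
        return not self.low


class CountedLatency(FlagLatency):
    """Same model, but each event adds one hold and each completion drops one."""

    def __init__(self):
        super().__init__()
        self.holds = 0

    def wakeup_event(self):
        self.pending += 1
        self.holds += 1

    def task_raises_latency(self):
        self.holds -= 1

    def suspend_allowed(self):
        return self.holds == 0


def run_race(model):
    """Replay the problematic interleaving; return (suspend_allowed, pending)."""
    model.wakeup_event()          # event 1 arrives
    model.task_reads_event()      # task processes event 1
    model.wakeup_event()          # event 2 arrives before the task raises its latency
    model.task_raises_latency()   # task still believes it is done
    return model.suspend_allowed(), model.pending
```

In this model `run_race(FlagLatency())` ends with suspend allowed while one event is still pending, which is the failure being described. `run_race(CountedLatency())` keeps suspend blocked until the task has completed once per event.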

From: Ingo Molnar on 4 Jun 2010 04:00

* Brian Swetland <swetland(a)google.com> wrote:

> On Thu, Jun 3, 2010 at 12:30 PM, Ingo Molnar <mingo(a)elte.hu> wrote:
> >
> > Sadly the response from the Android team has been 100% uncompromising: either
> > suspend blockers or nothing.
>
> Well, we're willing to accept something that gives us the same
> functionality (thus rewriting the api several times to meet various
> objections, current discussions around
> constraint-based-implementations / pm-qos, etc). We believe we're
> solving a real problem here and have not seen a counter-proposal that
> accomplishes the same.
>
> Suggestions such as "just yell at developers for writing bad apps" or
> "it's the user's fault if they install a lousy app" or "make your app
> marketplace more restrictive" are not helpful. [...]

Agreed.

> [...] The technical discussions around alternatives are more so (though I
> do feel like we're going in circles in places), [...]

Yep.

> [...] which again is why we're still here talking about this (that and Arve
> is about a billion times more patient and persistent than I am).
>
> We're not interested in massively rearchitecting our userspace to accomplish
> this (and the "rewrite your userspace!" proposals I've seen have had race
> conditions and/or significantly more complexity than the wakelock model).

You'll probably have to prepare for a somewhat different ABI for achieving
this. I doubt it would result in any large-scale, massive rewrites.

> ...
>
> > Also, why did the Android team start its contributions with such a
> > difficult and controversial kernel feature?
>
> We started here because it's possibly the only api level change we have --
> almost everything else is driver or subarch type work or controversial but
> entirely self-contained (like the binder, which I would be shocked to see
> ever hit mainline). [...]

So why arent those bits mainline? It's 1000 times easier to get drivers and
small improvements and non-ABI changes upstream.

After basically two years of growing your fork (and some attempts to get your
drivers into drivers/staging/ - from where they have meanwhile dropped out
again) you re-started with the worst possible thing to merge: a big and
difficult kernel feature affecting many subsystems. Why?

This is one of the fundamental problems here. People simply dont know you,
because you have not worked with us much - and hence they dont trust you
positively out of the box - they are neutral at best.

And believe me, it's hard enough to get difficult features upstream if people
_do_ know you and when they positively _do_ trust you ... Arent you talking to
Andrew Morton about how to do these things properly? This is kernel
contribution 101 really.

> [...] Assertions have been made that because the "android kernel" (not a
> term I like -- linux is linux, we have some assorted patches on top) [...]

I've been tracking android-common and android-msm for a while and i have to
say that it shows a very lackluster attitude towards upstream:

 - The latest branches i can see are v2.6.32 based today. We are in the
   v2.6.35 stabilization cycle and are developing v2.6.36. I.e. your upstream
   base is about a year too old.

 - The last commit is a couple of weeks old AFAICS.

 - The diffstat of android-common/android-2.6.32 is:

      890 files changed, 39962 insertions(+), 6286 deletions(-)

   Those assorted patches have spread over nearly a thousand files. FYI, by
   the looks of it you are facing an exponentially worsening maintenance
   overhead curve here.

Is there perhaps some other tree i should be following? I'm looking at:

  [remote "android-msm"]
        url = git://android.git.kernel.org/kernel/msm.git
        fetch = +refs/heads/*:refs/remotes/android-msm/*
  [remote "android-common"]
        url = git://android.git.kernel.org/kernel/common.git
        fetch = +refs/heads/*:refs/remotes/android-common/*

Btw., the commits i've glanced at looked mostly clean and well structured, so
i see no fundamental reason why this couldn't be done better.

> See: http://www.kroah.com/log/linux/android-kernel-problems.html and various
> other rants about the evil terrible android forks, etc.
>
> So, we figure, let's sort out the hard problem first and then move on with
> our lives.

Well, my suggestion would be to first build up a path towards upstream, build
up trust, reduce your very high cross section to mainline - and do the most
difficult bits last.

Especially 'move on with our lives' suggests that you just want to get rid of
this ABI divergence and continue-as-usual with the pattern of non-cooperation,
hm?

> > There is absolutely _zero_ technical reason why the Android team should
> > present this as an all-or-nothing effort. Why not merge hw drivers
> > first (with suspend blockers commented or stubbed out), to reduce the fork
> > distance?
>
> If that's the case then there is no problem and people could stop yelling at
> us and just submit their drivers. Awesome.
>
> I can't speak for all the nameless silicon vendors Greg represents, that we
> apparently are preventing from doing this (how? I don't know!), etc, but for
> my team maintaining multiple versions of drivers is a headache, we'd rather
> square away the wakelock debate first and figure something out there, as it
> just seems like a more logical approach. Maybe we're crazy.

It's not crazy, it's just IMHO inefficient and very difficult to do it like
that. And you arent the first one to try it like that (people _always_
gravitate towards coming with their most difficult patches first - because
they are very often the most useful patches) - it's a non-trivial learning
curve IMHO.

Ingo

From: Ingo Molnar on 4 Jun 2010 04:20

* Arjan van de Ven <arjan(a)infradead.org> wrote:

> On Thu, 3 Jun 2010 19:26:50 -0700 (PDT)
> Linus Torvalds <torvalds(a)linux-foundation.org> wrote:
>
> > If the system is idle (or almost idle) for long times, I would heartily
> > recommend actively shutting down unused cores. Some CPU's are hopefully
> > smart enough to not even need that kind of software management, but I
> > suspect even the really smart ones might be able to take advantage of the
> > kernel saying: "I'm shutting you down, you don't have to worry about
> > latency AT ALL, because I'm keeping another CPU active to do any real
> > work".
>
> sadly the reality is that "offline" is actually the same as "deepest C
> state". At best.
>
> As far as I can see, this is at least true for all Intel and AMD cpus.
>
> And because there's then no power saving (but a performance cost), it's
> actually a negative for battery life/total energy.
>
> (lots of experiments inside Intel seem to confirm that, it's not just
> theory)

Well, the scheme would only be useful if it's _NOT_ just a deep C4 state, but
something that prevents tasks from being woken to that CPU for a good period
of time. Hot-unplugging that CPU achieves that (the runqueues are pulled), so
i think Linus's idea makes sense in principle.

[ Or have you done deep-idle experiments to that effect as well? ]

I suspect it all depends on the cost: and our current hot-unplug and
hot-replug code is anything but cheap ...

Ingo

From: Ingo Molnar on 4 Jun 2010 04:20

* Linus Torvalds <torvalds(a)linux-foundation.org> wrote:

> On Fri, 4 Jun 2010, Ingo Molnar wrote:
>
> > What you say is absolutely true, hence this would be driven via
> > sched_tick() + TIF notifiers - i.e. only ever treat user-mode tasks as
> > 'idle-able'. This can be done with no overhead to the regular fastpaths.
> >
> > The TIF notifier would be the one scheduling to idle - and would thus do
> > it only to user-mode tasks.
>
> The thing is, unless there is some _really_ deep other reason to do
> something like this, I still think it's total overdesign to push any
> knowledge/choices like this into the scheduler. I'd rather keep things way
> more independent, less tied to each other and to deep kernel subsystems.

Well, the deep reason as i see it is simply the observation that what the
Android auto-suspend code implements via the suspend-blocker patches is an
idle driver and user-space scheduler in disguise. (if you count that as a deep
enough reason)

I dont mind hacks if they are local and if i dont have to maintain them, but
the objection from other folks was that suspend blockers are not that local
and not that maintainable. And if (and that's a big if) we have a global
effect anyway, then we might as well consider implementing it cleanly:

 - A global /sys flag is fundamentally racy and only allows a single
   user-space actor. Not a problem on mobile phones but sure violates
   taste buds.

   Proper per task latency attributes are not racy - we always know the
   maximum/minimum values, without user-space interfering with each other.

 - When done correctly we might win a couple of new features as well around
   the fringes:

     - Useful for power savings on mobile: crappy apps can be idled on an
       intermediate level, even before the system goes totally idle. There's no
       equivalent suspend-blockers feature.

     - Useful for real-time tasks that want to idle lower prio tasks when some
       really important thing is running - even if the real-time task might sleep.
       This is superior to the 'hog the CPU' kind of hacks that have been used
       for this purpose before.

 - The hacks needed to express a race-free suspend/wakeup cycle are unnatural
   and stem from the model being a user-space driven idle manager instead of a
   proper part of task sleep/wakeup.

 - None of this code seems to impact any scheduler hotpath (most of it is just
   a special form of idle driver) - it's all on deeper levels of idle and, at
   most, in off-line return-to-userspace codepaths. So there's no strong
   performance reason _against_ some level of integration. There is indeed
   the coupling effect as you mention, which weighs against.

 - i also think Android's auto-suspend is a strategic feature to Linux: i
   think auto/opportunistic suspend will matter more and more, and my guess
   is that in ten years most of our daily systems will be doing auto-suspend and
   will have proper wakeups from suspend implemented in hardware. Not just
   phones and gadgets but also portable tablets, book readers, TVs - and i
   wouldnt mind a non-portable, table sized tablet either ;-)

   At which point i'd hate to have some hack of a solution ingrained and
   ABI-ized with little chance to move user-space to sanity.
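
The contrast drawn in the first bullet above, between a single global flag and per-task latency attributes, can be illustrated with a short model. This is a hedged sketch in Python and not taken from any posted patch: the idle-state table, its latency numbers, the task names and both functions are assumptions made up for the example.

```python
import math

# Hypothetical idle states: (name, exit latency in microseconds), shallow to deep.
IDLE_STATES = [("C1", 10), ("C3", 200), ("suspend", 500_000)]


def deepest_allowed_state(task_latencies):
    """Pick the deepest idle state whose exit latency every task tolerates."""
    # Each task owns exactly one entry, so tasks cannot overwrite each other.
    limit = min(task_latencies.values(), default=math.inf)
    allowed = [name for name, exit_us in IDLE_STATES if exit_us <= limit]
    return allowed[-1] if allowed else None


def global_flag_demo():
    """Two actors sharing one flag: the first one to finish clears it for both."""
    flag = {"block": False}
    flag["block"] = True     # actor A blocks suspend
    flag["block"] = True     # actor B blocks suspend
    flag["block"] = False    # actor A is done and clears the flag
    return flag["block"]     # B still needs the block, but it is gone
```

For example, `deepest_allowed_state({"ui": 1000, "sync": math.inf})` stops at the intermediate state, and once the `ui` entry is raised to infinity the result becomes `"suspend"`. The shared flag in `global_flag_demo()` ends up cleared even though the second actor never released it.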

But yes, i definitely agree with you that it all comes down to 'do we care':

 - If we care we should integrate it intelligently where it belongs
   conceptually: the idle drivers and the scheduler.

 - If we dont care then we should isolate the hacks as much as possible - and
   then the current suspend blocker patch-set is definitely a good basis to
   start. (with perhaps the /sys hackery cleaned up a bit, as you suggested)

I dont favor either of the solutions too deeply - so i personally have not
NAK-ed suspend blockers - i just saw half a dozen semi-NAKs flying from
other folks, so tried to help come up with a palatable design.

_If_ most of x86 hardware was able to suspend race-free i think deeper
integration would be a slam-dunk - as we could make it work almost everywhere.
Sadly only a tiny subset of x86 qualifies, so the argument isnt obvious. Maybe
we should pick a variant of suspend blockers and re-examine things in a few
years? It being an ABI makes it difficult tho.

What i would personally find unacceptable is to have _neither_ solution - and
the discussion was heading towards that stage really, with both sides digging
the trenches of non-cooperation. IMHO we just cannot afford to let this drop
on the floor as the feature is immensely useful to Android and thus to Linux
at large.

Anyway, i'm glad that it's up to you ;-)

Ingo

From: Ingo Molnar on 4 Jun 2010 04:40

* Arve Hjønnevåg <arve(a)android.com> wrote:

> > [...]
> >
> > Why do you need to track input wakeups? It's rather fragile and rather
> > unnecessary [...]
>
> Because we have keys that should always turn the screen on, but the problem
> is not specific to input events. If we enabled a wakeup event it usually
> means we need this event to always work, not just when the system is fully
> awake or fully suspended.

Hm, i cannot follow that generic claim. Could you please point out the problem
to me via a specific example? Which task does what, what undesirable thing
happens where, etc.

Thanks,

Ingo