From: Theodore Tso on

On May 28, 2010, at 8:49 AM, Igor Stoppa wrote:

> ext Theodore Tso wrote:
>
>> I've seen very hard to debug situations with Maemo where users are essentially asked to uninstall all their applications, and then install them back one at a time, waiting several hours between each install for a charge/discharge cycle, to figure out which application was waking up the system so !@#@! much while the screen was turned off. And, when the periodic wakeups are faster than the refresh time of powertop, no, powertop won't help you find the crapplication. If you think that's acceptable, fine --- we'll see who wins in the marketplace, and who gets blamed for producing a crappy platform --- the incompetent application programmer, or the platform supplier.
>>
> Those apps were from an experimental repository, which is not enabled by default in stock SW.

Well, yes, if the company strategy is to have a walled garden ala the Apple iPhone App store, life is much simpler. But if the requirements mean that apps don't need preapproval, the requirements on the platform get harder. I think the take-home here is we have a requirement that the platform behave well even without someone screening the applications for the "default SW repository".

-- Ted

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Matthew Garrett on
On Fri, May 28, 2010 at 01:21:38PM +0100, Alan Cox wrote:

> So I put my phone down
>
> The UI manager gets told the phone is 'down'
> Ten seconds later it is still down
<- wakeup event that should be delivered to untrusted app arrives here

At this point you may mark the downtrodden group as ignored between the
untrusted app receiving the event and the untrusted app marking itself
as important. To avoid this you need the UI manager to receive every
wakeup event in order to change its scheduling decisions.

> If I push the button we get an IRQ
> We come out of power save
> The app gets poked

(The cgroup has to have some awareness of suspend/resume so that it can
allow the untrusted apps to be scheduled again)

> The app may be unimportant but the IRQ means we have a new timeout of
> some form to run down to idle

The timeout-based nature means that if the application doesn't get
scheduled for some reason (say there's heavy swap pressure - not likely
in the embedded world, but an issue on laptop-type devices) the event
may not be handled before you get back to sleep. I accept that this
isn't likely to be a problem in the real world, but it does make this
mechanism less deterministic than a suspend block based one.
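
A minimal sketch of that difference, as a toy model rather than kernel code. The names `timeout_model` and `block_model` and all of their parameters are hypothetical, invented for illustration. The timeout model assumes the system goes back to sleep a fixed `idle_timeout` seconds after the wakeup unless the app has started running by then. The block model assumes the wakeup source holds a suspend block until the app has finished handling the event.

```python
def timeout_model(sched_delay, handling_time, idle_timeout):
    """Timeout-based policy (hypothetical model).

    Returns (handled, awake_for): whether the app got to handle the event,
    and how many seconds the system stayed awake after the wakeup.
    """
    if sched_delay >= idle_timeout:
        # The app was never scheduled before the timeout ran out,
        # so the system went back to sleep with the event unhandled.
        return False, idle_timeout
    # The app ran in time; assume the system stays up until the timeout
    # expires or the handler finishes, whichever is later.
    return True, max(sched_delay + handling_time, idle_timeout)


def block_model(sched_delay, handling_time):
    """Suspend-block policy (hypothetical model).

    The block is held from the wakeup until the handler completes, so the
    event is always handled and the system may sleep as soon as it is done.
    """
    return True, sched_delay + handling_time
```

Under these assumptions, with a 5 second timeout, a handler delayed by 8 seconds (say, by swap pressure) is missed by the timeout model but still served by the block model, at the cost of staying awake for 8.5 seconds.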

> If you are absolutely utterly paranoid about it you need the button
> driver to mark the task it wakes back as important rather than rely on
> time for response like everyone else. That specific bit is uggglly but
> worst case its just a google private patch to a few drivers. I understand
> why Android wants it. The narrower the gap between 'we are doing nothing,
> sit in lowest CPU on state' and 'we are off' the better the battery life
> and the more hittable the condition.

Not just the button driver. Every driver that generates wakeups. This
gets difficult when it comes to the network layer, for instance, when
the network driver has very little idea how the packet it just received
will be routed.

> Apart from that optional paranoia case my kernel now contains some
> trivial changes of generic value that have nothing to do with suspend
> blocking. Android has suspend blocking by choosing to use the generic
> features in its own specific way and we need almost no code writing ?

The problem is that you still have a race, and fixing that race requires
every event that could generate a wakeup to be proxied out to the policy
manager as well. That's a moderate additional overhead.

--
Matthew Garrett | mjg59(a)srcf.ucam.org
From: Peter Zijlstra on
On Fri, 2010-05-28 at 13:21 +0100, Alan Cox wrote:
> [Total kernel changes
>
> Ability to mark/unmark a scheduler control group as outside of
> some parts of idle consideration. Generically useful and
> localised. Group latency will do most jobs fine (Zygo is correct
> it can't solve his backup case elegantly I think)
>
> Test in the idling logic to distinguish the case and only needed
> for a single Android specific power module. Generically useful
> and localised]

I really don't like this..

Why can't we go with the previously suggested: make bad apps block on
QoS resources or send SIGXCPU, SIGSTOP, SIGTERM and eventually SIGKILL?
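
As a rough sketch of what such an escalation could look like in a userspace policy daemon. The names `next_signal` and `enforce`, the strike counter, and the one-step-per-violation policy are all assumptions for illustration; nothing in this thread specifies them.

```python
import os
import signal

# Escalation order as listed above: warn first, kill last.
ESCALATION = [signal.SIGXCPU, signal.SIGSTOP, signal.SIGTERM, signal.SIGKILL]


def next_signal(strikes):
    """Pick the signal for an app with `strikes` prior violations.

    strikes == 0 yields the mildest signal; once the ladder is exhausted,
    every further violation yields SIGKILL.
    """
    index = min(max(strikes, 0), len(ESCALATION) - 1)
    return ESCALATION[index]


def enforce(pid, strikes):
    """Send the appropriate signal to `pid` and return the signal sent."""
    sig = next_signal(strikes)
    os.kill(pid, sig)
    return sig
```

How a violation is detected, and when strikes are forgiven, is left open here; that is the policy part.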



From: Theodore Tso on

On May 28, 2010, at 8:16 AM, Theodore Tso wrote:
>
> I've seen very hard to debug situations with Maemo where users are essentially asked to uninstall all their applications, and then install them back one at a time, waiting several hours between each install for a charge/discharge cycle, to figure out which application was waking up the system so !@#@! much while the screen was turned off. And, when the periodic wakeups are faster than the refresh time of powertop, no, powertop won't help you find the crapplication.

Sorry, miswording: s/faster/less frequent/

I'm not convinced CPU activity LEDs help either, BTW. It only takes the CPU getting crowbarred out of idle for a tiny amount of time before you start impacting battery life, and if the crapplication is only doing it every 30-60 seconds or so, I doubt you'd see it on the LEDs. That sort of thing might be acceptable if you have a 1-3 pound battery, but maybe much less so if you have a battery which is cell-phone sized.
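
To illustrate why infrequent wakeups are hard to catch in a short observation window, here is a minimal sketch. The function name `wakeups_in_window`, its parameters, and the strictly periodic wakeup pattern are assumptions for illustration; the 5 second window used in the note below is likewise an assumed value, not a statement about any tool's actual refresh time.

```python
import math


def wakeups_in_window(period, window_start, window_len, phase=0.0):
    """Count wakeups at phase, phase + period, phase + 2*period, ...
    that fall inside [window_start, window_start + window_len).

    All times are in seconds.
    """
    window_end = window_start + window_len
    # Index of the first wakeup at or after the window start.
    first_k = max(0, math.ceil((window_start - phase) / period))
    # Index of the first wakeup at or after the window end.
    end_k = max(0, math.ceil((window_end - phase) / period))
    return max(0, end_k - first_k)
```

With a wakeup every 45 seconds, an assumed 5 second window starting at t=10 sees no wakeups at all, and even a 10 second window that happens to straddle the wakeup sees exactly one, which is easy to lose among everything else on the system.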

-- Ted


From: Alan Cox on
On Fri, 28 May 2010 14:30:36 +0200
Peter Zijlstra <peterz(a)infradead.org> wrote:

> On Fri, 2010-05-28 at 13:21 +0100, Alan Cox wrote:
> > [Total kernel changes
> >
> > Ability to mark/unmark a scheduler control group as outside of
> > some parts of idle consideration. Generically useful and
> > localised. Group latency will do most jobs fine (Zygo is correct
> > it can't solve his backup case elegantly I think)
> >
> > Test in the idling logic to distinguish the case and only needed
> > for a single Android specific power module. Generically useful
> > and localised]
>
> I really don't like this..
>
> Why can't we go with the previously suggested: make bad apps block on
> QoS resources or send SIGXCPU, SIGSTOP, SIGTERM and eventually SIGKILL

Ok. Are you happy with the QoS being attached to a scheduler control
group and the use of them to figure out what is what ?