[PATCH 0/8] Suspend block api (version 8) [Kernel]

From: Arve Hjønnevåg on 28 May 2010 02:20

On Thu, May 27, 2010 at 10:15 PM, Peter Zijlstra <peterz(a)infradead.org> wrote:
> On Thu, 2010-05-27 at 15:19 -0400, Alan Stern wrote:
>> On Thu, 27 May 2010, Peter Zijlstra wrote:
>>
>> > I still don't see how blocking applications will cause missed wakeups in
>> > anything but a buggy application at worst, and even those will
>> > eventually get the event when they unblock.
>> >
>> > What seems to be the confusion?
>>
>> During forced suspend, applications are blocked because they are frozen.
>>
>> When an event occurs, the application is notified somehow.  But it
>> can't respond because it is frozen.  Hence the event remains sitting in
>> a kernel queue and the system goes ahead and suspends anyway.  The
>> application doesn't get thawed until the system wakes up at some
>> indefinite time in the future.
>
> If the kernel is awake to put things in queues, we're clearly not
> suspended and userspace is running ?!

Suspend is not an atomic operation. User space is frozen before
freezable kernel threads, and both of these happen before drivers are
suspended.
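
A minimal sketch of that ordering, assuming a simplified model with four
phases. The enum and function names are hypothetical and are not part of
the patch set or of the kernel's suspend code. It only illustrates the
window in which the kernel can still queue an event that frozen user
space cannot consume:

```c
#include <stdbool.h>

/* Illustrative phases of a forced suspend, in the order described above. */
enum suspend_phase {
    PHASE_RUNNING,            /* nothing frozen yet */
    PHASE_USERSPACE_FROZEN,   /* user space tasks are frozen */
    PHASE_KTHREADS_FROZEN,    /* freezable kernel threads are frozen */
    PHASE_DRIVERS_SUSPENDED   /* drivers have been suspended */
};

/* Assumption: drivers can still queue events until they are suspended. */
bool kernel_can_queue_event(enum suspend_phase p)
{
    return p < PHASE_DRIVERS_SUSPENDED;
}

/* User space can only consume an event while it is not frozen. */
bool userspace_can_consume_event(enum suspend_phase p)
{
    return p < PHASE_USERSPACE_FROZEN;
}

/* True in the window where an event is queued but nobody can handle it. */
bool event_stuck_until_resume(enum suspend_phase p)
{
    return kernel_can_queue_event(p) && !userspace_can_consume_event(p);
}
```

In this model event_stuck_until_resume() is true for the two middle
phases, which is the window the suspend blocker discussion is about.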

--
Arve Hjønnevåg
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: Florian Mickler on 28 May 2010 02:40

On Thu, 27 May 2010 21:55:26 -0700
Brian Swetland <swetland(a)google.com> wrote:

> On Thu, May 27, 2010 at 3:55 PM, Alan Cox <alan(a)lxorguk.ukuu.org.uk> wrote:
> >
> > This started because the Android people came to a meeting that was put
> > together of various folks to try and sort out the big blockage in getting
> > Android and Linux kernels back towards merging.
> >
> > I am interested right now in finding a general solution to the Android
> > case and the fact it looks very similar to the VM, hard RT, gamer and
> > other related problems although we seem to have diverged from that logic.
>
> I think that the suspend block model can be viewed as a constraints
> problem (similar to some of the things you've been sketching out in
> these threads), but I think we (Google/Android) view it as more of a
> state constraint (don't enter suspend) than a latency constraint.
>
> We think there's a need for these constraints both from the driver
> side and userspace side, and that these constraints are not tied to
> processes (multiple entities in one process may have different
> constraints at different times or multiple processes may be working
> together to accomplish some goal under a single constraint -- at least
> both cases exist in the Android system as it ships today).
>
> The exact naming of the API is not terribly important to us. The
> first thing we spent a bunch of time discussing last summer when Arve
> first looked into sending wakelocks upstream was changing the name
> because many objected to "wakelock" for various reasons.
>
> Being able to have useful statistics (which drivers/processes/etc
> held which wakelock for how long, how many times, etc) is important to
> us. While we want to do the best we can in the face of poorly written
> apps, we also want to educate users and developers about which apps
> are contributing to their poor battery life -- so users can decide to
> uninstall an app if its usefulness does not justify its impact on
> battery life and application developers can be more aware of what the
> cost of their app is to endusers.
>
> As an example, http://frotz.net/misc/battery-stats-unplugged.txt
> contains a dump from the "battery service" aggregating wakelock usage,
> cpu usage, and sensor device usage of processes (#....: sections) on
> my phone the other day for a ~3 hour period. This data is presented
> visually to the enduser in a "what's using my battery" feature of the
> platform. "realtime" refers to wall clock time here and "uptime"
> refers to not-in-suspend execution time.
>
> Brian

Hi!
Thinking about the issue a little more, this isn't really about trusted
apps and not trusted apps. Or crapplications.

The point is that as soon as an app takes a suspend blocker, it becomes
what is referred to here as a "trusted app". But only because it is then
visibly consuming power in an official way.

Android suspends (as in echo mem > /sys/power/state)
whenever possible. It's as if there were a spring on the laptop lid,
and if the user doesn't hold his grip on it, the thing closes. How does
he hold his grip? The application registers a suspend-blocker for him.

So, why not use something like idle/QOS with this?

I can imagine a theoretical "latency requirement" where 0 means this
application does not interact with the user, and != 0 means this
application interacts with the user.

("latency requirement" doesn't quite get it, but it works for now)

In android land, the default would be that every application has a
latency-requirement of 0. And then everything (userland) that takes a
suspend-blocker would be changed to take a "latency requirement != 0".

Now, if the system interacts with the user
( i.e. there is a global
latency requirement > 0, where "global latency requirement" is
computed by the pm framework maxing over all the userland processes
and the kernel side)
everything has to run. So we also need to schedule things which specify
a latency requirement == 0.
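
A minimal sketch of that computation, assuming each userland process or
kernel-side user registers one value. The struct and function names are
hypothetical and are not an existing pm framework API:

```c
#include <stddef.h>
#include <stdbool.h>

/* Hypothetical per-entity requirement: 0 = does not interact with the user. */
struct latency_req {
    const char *owner;     /* process or driver name, for illustration only */
    unsigned int latency;  /* 0 means "no requirement" */
};

/* Global requirement: the maximum over all userland and kernel-side entries. */
unsigned int global_latency_req(const struct latency_req *reqs, size_t n)
{
    unsigned int max = 0;

    for (size_t i = 0; i < n; i++)
        if (reqs[i].latency > max)
            max = reqs[i].latency;
    return max;
}

/* Suspend is only allowed when nothing claims to interact with the user. */
bool may_suspend(const struct latency_req *reqs, size_t n)
{
    return global_latency_req(reqs, n) == 0;
}
```

With every entry at 0, may_suspend() returns true. As soon as one entry
is raised above 0, the global value becomes non-zero and everything has
to run, including the entries that are still at 0.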

This last point means that it has to be independent of the scheduler, doesn't it?

I don't see how renaming suspend_blocker to set_pidle would not do
something equivalent to this, but the bits are probably a bit scattered
throughout the kernel.
(Which I don't think is introduced by that patch set, but by the fact that
suspend is currently not an idle state.)

I can understand if there needs to be a good solution in the kernel
from day 1.

So, what would constitute a good solution?

The more experienced people should probably jump in here, but let me express
what I've gathered in this discussion (especially from Thomas and Alan Cox):

1. change suspend framework to be "just another idle state"
2. specify that "just another idle state" can only be entered if
"global latency requirement" == 0
3. probably add some cost-estimate-computation to the "just another
idle state"

(the trick here is that this idle state ignores all current measures of "idle",
so the cost for this would only depend on the cost estimate to enter it and
the suspend power usage. This also means it is probably opportune to enter it
whenever possible, unless the machine is already idle the old way (because the
cost to enter is bigger))

4. change the userspace suspend interface
i.e. echo mem > /sys/power/state to override the "global
latency requirement" to be 0.

5. convert the drivers to relax their latency-requirement to be 0
whenever possible. (in android land, this is already done, probably just needs a
s/suspend_block/set_pidle(1)/ )
6. enhance the cpufreq drivers to take the global latency requirement into
account. (i.e. opportunistic suspend would be implemented in the proper place,
I don't know which that is, please chime in)

So, what specifically would have to be done to the suspend blockers patches?
And can it be done incrementally? (I guess the answer is no, we don't want this done
in the kernel, we want it done right?)

Cheers,
Flo

From: Florian Mickler on 28 May 2010 03:10

On Thu, 27 May 2010 22:09:37 -0400
Ben Gamari <bgamari.foss(a)gmail.com> wrote:

> On Wed, 26 May 2010 14:24:30 +0200, Florian Mickler <florian(a)mickler.org> wrote:
> > Because he is using a robust kernel that provides suspend blockers and
> > is preventing the vampire from sucking power?
> >
> Suspend blockers are only a flawed and indirect way to keep the vampire
> from sucking.
>

> > Most users don't even grasp the simple concept of different "programs".
> > They just have a device and click here and there and are happy.
> >
> > Really, what are you getting at? Do you deny that there are programs,
> > that prevent a device from sleeping? (Just think of the bouncing
> > cows app)
>
> He's getting at the fact that there are much better ways to deal with
> this problem. The issue here is that we seem to be expected to swallow
> whatever Google throws at us, regardless of the quality of the
> solution. It seems like the best argument we have for merging is "we
> couldn't think of anything better and we need it yesterday." This might be
> a good enough reason for shipping, but it certainly doesn't satisfy the
> requirements for merging.

I don't disagree on the quality. But I don't think it is because of the
patches, but because of how the kernel is architected in that area
(suspend not being an idle state).

Look, probably suspend needs to be integrated into the idle states and
used from there. I could imagine a cost-specification for idle states:

c3
  cost-to-transition-to-this-state: X
  powersavings-per-time: Y
  expected time we stay in this state: relatively short, there is a
    timer scheduled
  suspend-blockers: ignored

suspend
  cost-to-transition-to-this-state: depends, how many drivers to
    suspend, how many processes to freeze, ...
  powersavings-per-time: Y
  expected time we stay in this state: long, independent of
    scheduled timers
  suspend-blockers: must not be active

Now, a governor could compute whether it is OK to enter suspend or only
wait in idle-c3. And maybe it would never transition from idle-c3 to
suspend, but only from c1, because the cost to enter suspend would mean
it has to go to c1 anyway.
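
A minimal sketch of such a governor decision, assuming the net gain of a
state is (savings per ms * expected residency) minus the transition
cost. That formula, the struct and the function name are assumptions made
for illustration; this is not the existing cpuidle governor interface:

```c
#include <stddef.h>
#include <stdbool.h>

/* Hypothetical description of an idle state, following the attributes above. */
struct idle_state {
    const char *name;
    long transition_cost;   /* cost to enter and leave, arbitrary energy units */
    long savings_per_ms;    /* energy saved per ms compared to running */
    bool ignores_timers;    /* true: residency does not depend on scheduled timers */
    bool honors_blockers;   /* true: must not be entered while a blocker is active */
};

/*
 * Returns the index of the state with the best net gain, or -1 if no
 * state pays off. timer_ms is the time until the next scheduled timer,
 * wakeup_ms the expected time until the next external wakeup event.
 */
int pick_state(const struct idle_state *s, size_t n,
               long timer_ms, long wakeup_ms, bool blocker_active)
{
    int best = -1;
    long best_gain = 0;

    for (size_t i = 0; i < n; i++) {
        long residency, gain;

        if (s[i].honors_blockers && blocker_active)
            continue;  /* e.g. suspend while a suspend blocker is held */
        residency = s[i].ignores_timers ? wakeup_ms : timer_ms;
        gain = s[i].savings_per_ms * residency - s[i].transition_cost;
        if (gain > best_gain) {
            best_gain = gain;
            best = (int)i;
        }
    }
    return best;
}
```

With a cheap "c3" entry and an expensive "suspend" entry, this picks
suspend only when no blocker is active and the expected sleep is long
enough to pay for the transition; otherwise it falls back to c3 or to
no state at all.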

what do ya think?

> > And if you have two kernels, one with which your device is dead after 1
> > hour and one with which your device is dead after 10 hours. Which would
> > you prefer? I mean really... this is ridiculous.
>
> It is absolutely not. If you want to keep power usage down, then
> implement real resource management in the scheduler. Suspend blockers
> are nothing but a clunky and ineffective means of resource allocation.
> As has been pointed out in this thread, there are much better ways of
> dealing with this problem.
>
> - Ben

I think this has to be independent of the scheduler, because as soon
as the user interacts with the phone, everything needs to be scheduled,
even the stuff that doesn't directly interact with the user.
As soon as _nothing_ interacts with the user, the phone does not schedule
_anything_ anymore.

Cheers,
Flo

From: Thomas Gleixner on 28 May 2010 04:30

On Thu, 27 May 2010, Alan Stern wrote:

> On Thu, 27 May 2010, Thomas Gleixner wrote:
>
> > > The two of you are talking at cross purposes. Thomas is referring to
> > > idle-based suspend and Matthew is talking about forced suspend.
> >
> > Yes, and forced suspend to disk is the same as force suspend to disk,
> > which has both nothing to do with sensible resource management.
>
> If I understand correctly, you are saying that all the untrusted
> applications should run with QoS(NONE). Then they could do whatever
> they wanted without causing any interference.
>
> And with idle-based power management (rather than forced suspend),
> there would be no issue with wakeup events getting unduly delayed.
>
> Unless one of those events was meant for an untrusted application. Is
> that the source of the difficulty?

Probably, but that's not solved by suspend blockers either as I
explained several times now. Because those untrusted apps either lack
blocker calls or are not allowed to use them, so the blocker does not
help for those either.

Thanks,

tglx

From: Florian Mickler on 28 May 2010 04:50

On Thu, 27 May 2010 20:05:39 +0200 (CEST)
Thomas Gleixner <tglx(a)linutronix.de> wrote:

> On Thu, 27 May 2010, Matthew Garrett wrote:
>
> > On Thu, May 27, 2010 at 07:24:02PM +0200, Thomas Gleixner wrote:
> >
> > > Oh no. They paper over a shortcoming. If there is a pending event,
> > > the kernel knows that. It just does not make use of this
> > > information. Blockers just paper over this by sprinkling
> > > do_not_suspend() calls all over the place. What a sensible solution.
> >
> > Even if we could use suspend-via-deep-idle-state on PCs, we still need
> > to be able to enter suspend while the system isn't idle. There's two
> > ways to do that:
> >
> > 1) Force the system to be idle. Doing this race-free is difficult.
> >
> > 2) Enter suspend even though the system isn't idle. Since we can't rely
> > on the scheduler, we need drivers to know whether userspace has consumed
> > all wakeup events before allowing the transition to occur. Doing so
> > requires either in-kernel suspend blockers or something that's almost
> > identical.
>
> You're just not getting it. If user space has consumed the event is
> not relevant at all.
>
> What's relevant is whether the app has processed the event and reacted
> accordingly. That's all that matters.
>
> Emptying your input queue is just the wrong indicator.
>
> And as I explained several times now: It does _NOT_ matter when the
> app goes back in to blocked/idle state. You have to spend the CPU
> cycles and power for that anyway.
>
> And for the apps which do not use the user space blockers the queue
> empty indicator is just bullshit, because after emptying the queue the
> kernel can go into suspend w/o any guarantee that the event has been
> processed.
>
> The whole concept sucks, as it does not solve anything. Burning power
> now or in 100ms is the same thing power consumption wise.
>
> Thanks,
>
> tglx

Thomas,
do you really have a problem with the actual concept? Or do you just
not like the way it is done?

IMO, the whole concept is defining 2 modes of operation:

1. user interacts with the device (at least one suspend block active)
2. user doesn't interact with the device (zero suspend block active)

In case 1. the device wants _everything_ scheduled as normal (and to save
the maximum possible power, i.e. runtime pm with every technology available
now).

In case 2. we want nothing scheduled (and to save the maximum possible power,
i.e. suspend).

And now, every application and every kernel driver annotates (on behalf
of the user) if it (possibly) interacts with the user.

(Is this really the problematic bit, that userspace is giving
the kernel hints? Or is it that the hints are called "blocker"?)

We can only enter mode 2, if _nothing_ (claims) to interact with the
user.
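
A minimal sketch of the two modes, assuming a plain counter of active
suspend blocks. The names are hypothetical and this is not the API of
the patch set:

```c
#include <stdbool.h>

/* Hypothetical bookkeeping: number of currently active suspend blocks. */
struct suspend_blockers {
    unsigned int active;
};

enum pm_mode {
    MODE_INTERACTIVE = 1,  /* case 1: at least one suspend block active */
    MODE_SUSPEND = 2       /* case 2: zero suspend blocks active */
};

/* Taken by an application or driver that (possibly) interacts with the user. */
void blocker_acquire(struct suspend_blockers *b)
{
    b->active++;
}

/* Released once that interaction is over; ignores unbalanced releases. */
void blocker_release(struct suspend_blockers *b)
{
    if (b->active > 0)
        b->active--;
}

/* Mode 2 can only be entered if nothing claims to interact with the user. */
enum pm_mode current_mode(const struct suspend_blockers *b)
{
    return b->active > 0 ? MODE_INTERACTIVE : MODE_SUSPEND;
}
```

The counter leaves mode 1 only when the last holder has released its
block, no matter which process or driver took it.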

To integrate this with the current way of doing things, I gather it
needs to be implemented as an idle state that does the suspend() call?

Attributes of the idle states could be something like this:

c3
  cost-to-transition-to-this-state: X
  powersavings-per-time: Y
  expected time we stay in this state: relatively short, there is a
    timer scheduled
  suspend-blockers: ignored

suspend
  cost-to-transition-to-this-state: depends, how many drivers to
    suspend, how many processes to freeze, how much state to save
  powersavings-per-time: Y
  expected time we stay in this state: long, independent of
    scheduled timers
  suspend-blockers: must not be activated

Now all transitions and opportunistic suspend could be handled by the
same algorithms.

Would this work?

Cheers,
Flo