[PATCH 0/8] Suspend block api (version 8) [Kernel]

From: Zygo Blaxell on 28 May 2010 13:30

On Fri, May 28, 2010 at 08:13:08AM -0700, Brian Swetland wrote:
> On Fri, May 28, 2010 at 8:06 AM, Alan Cox <alan(a)lxorguk.ukuu.org.uk> wrote:
> > They fix a general problem in terms of a driver specific item. We end up
> > making changes around the tree but we make everyone happy not just
> > Android. Also we are isolating policy properly. The apps and drivers say
> > "I have these needs", the power manager figures out how to meet them.
>
> That makes sense -- and as I've mentioned elsewhere, we're really not
> super picky about naming -- if it turns out that
> wakelocks/suspendblockers were shorthand for "request a qos constraint
> that ensures that threads are running", we'll be able to get things
> done just as well as we do now.

From my reading of this thread, there's a lot of overlap between
suspendblockers and constraints. Many use cases are served equally
well with one or the other, except for one: a case where an event that
should ultimately wake the system triggers a code execution path (or data
flow path) that wanders through a user-space full of complex interacting
processes where the kernel (and maybe even the processes) can't see it.

Suspend-blockers in user-space handle this by making such code/data paths
visible to the kernel. An all-kernel constraint-based approach has no
way to see the user-space paths, so the system will end up trying to
sleep when it should be waking up.

Wait, what? Surely all the user-space code handling such events is
running under a PM-QoS constraint that says "don't sleep if this process
is runnable," so the system won't go to sleep. Presumably all other
processes which don't handle wakeup events will be running under a
PM-QoS constraint that says "do sleep even if this process is runnable."

That's true, except for one common case: a process is drawing things on
the display on behalf of other processes, and that drawing process can't
have the "don't sleep" constraint because if it did the system would
seem to be continuously busy and never go to sleep. Any process that is
handling a critical event but also needs to talk to the display process
will end up being not-runnable, and the system may go to sleep before the
display process wakes up. So we need another PM-QoS constraint that says
"don't sleep even if this process isn't runnable, because some *other*
runnable process might do something that makes our critical process
runnable again." The critical event handling app would switch to this
PM-QoS constraint until it had received an ack from whatever it talked
to in user-space, then switch back to the "don't sleep if this process
is runnable" state until a new event comes in.

So, three constraint policies should do it (*):

1. Do sleep even if this process is runnable,

2. Don't sleep if this process is runnable, and

3. Don't sleep even if this process isn't runnable, as long as
at least one other runnable process exists somewhere on the
system.

"Runnable" would include tasks that are literally runnable as well as
tasks that aren't runnable but are presumed to be imminently runnable
(e.g. blocked on timers that are going to expire before the wakeup
latency).

"Sleep" means going into any state where the scheduler doesn't run
any tasks. That covers most CPU idle modes, deep power saving states,
ACPI suspend, or whatever.

(*) or you could define a "please stop wasting CPU" message in user-space,
and send that message to anything in user-space which has a PM-QoS
constraint better than "none" whenever something in user-space thinks
the user has gone away. Then the display process can have constraint #2,
and we don't need #3.
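
The three policies above can be sketched as a single "may the system sleep?" check. This is only an illustration of the proposal as worded in this message, not an existing PM-QoS interface: the names Policy, Task and may_sleep are made up for the example, and treating a policy-3 task that is itself runnable as blocking sleep is an assumed reading of "even if this process isn't runnable".

```python
from dataclasses import dataclass
from enum import Enum


class Policy(Enum):
    SLEEP_EVEN_IF_RUNNABLE = 1          # policy 1 in the list above
    NO_SLEEP_IF_RUNNABLE = 2            # policy 2
    NO_SLEEP_WHILE_OTHERS_RUNNABLE = 3  # policy 3


@dataclass
class Task:
    name: str
    policy: Policy
    runnable: bool  # includes "imminently runnable", as defined above


def may_sleep(tasks):
    """Return True if no task's policy forbids sleeping right now."""
    for task in tasks:
        if task.policy is Policy.NO_SLEEP_IF_RUNNABLE and task.runnable:
            return False
        if task.policy is Policy.NO_SLEEP_WHILE_OTHERS_RUNNABLE:
            # Assumed reading: blocks sleep if it is runnable itself, or if
            # any other task anywhere on the system is runnable.
            others = any(o.runnable for o in tasks if o is not task)
            if task.runnable or others:
                return False
    return True


# The display process runs under policy 1; the critical app waits on it.
display = Task("display", Policy.SLEEP_EVEN_IF_RUNNABLE, runnable=True)
app = Task("app", Policy.NO_SLEEP_WHILE_OTHERS_RUNNABLE, runnable=False)
print(may_sleep([display, app]))
```

In this sketch the blocked app keeps the system awake only because the display process is still runnable; once the app has its ack and drops back to policy 2, the same check allows sleep again.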

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: Peter Zijlstra on 28 May 2010 14:20

On Fri, 2010-05-28 at 13:27 -0400, Zygo Blaxell wrote:
> From my reading of this thread, there's a lot of overlap between
> suspendblockers and constraints. Many use cases are served equally
> well with one or the other,

If using suspend-blockers, please explain to me how:

- I avoid the cpu going into some idle state for which the wakeup
latency is larger than my RT app fancies?

- I prevent some tasks from being serviced by the filesystems whilst
others are (ionice on steroids)?

- my sporadic task (with a strict bandwidth budget) does not suffer
bandwidth inversion?

Suspend blockers do a bit of each of those, but none of it in a usable
fashion.

From: Zygo Blaxell on 28 May 2010 16:00

On Fri, May 28, 2010 at 08:16:20PM +0200, Peter Zijlstra wrote:
> On Fri, 2010-05-28 at 13:27 -0400, Zygo Blaxell wrote:
> > From my reading of this thread, there's a lot of overlap between
> > suspendblockers and constraints. Many use cases are served equally
> > well with one or the other,

Oops, I apparently meant "many use cases *of suspendblockers* are served
equally well with one or the other."

> If using suspend-blockers,
> Please explain to me how:
> - I will avoid the cpu going into some idle state for which the wakeup
> latency is larger than my RT app fancies?

...though I'd think you could do that by holding a suspendblocker, thus
preventing the CPU from going into any idle state at all.

There are four likely outcomes, corresponding to inclusion or non-inclusion
of suspend blockers and PM constraints in the kernel. Both could coexist
in the same kernel, since a suspend blocker can be trivially expressed as
"an extreme PM constraint with other non-constraint-related semantics."

It's the "other non-constraint-related semantics" that seem to be the
contentious issue. What can a suspend blocker do that a PM resource
constraint cannot do? If that set contains at least one useful use case,
then we need either suspend blockers, or some other thing that provides
for the use case.

Lots of people want PM constraints, and I haven't seen anyone suggest
there should *not* be PM constraints in the kernel some day. I've seen
a few "working and useful PM constraints aren't going to happen any time
soon" statements, and several "there's lots of stuff you still can't do
with PM constraints or suspend blockers" statements, but those aren't
arguments *against* PM constraints or *for* suspend blockers.

From: Rafael J. Wysocki on 28 May 2010 17:50

On Friday 28 May 2010, Peter Zijlstra wrote:
> On Fri, 2010-05-28 at 15:20 +0200, Peter Zijlstra wrote:
> > On Fri, 2010-05-28 at 14:02 +0100, Alan Cox wrote:
> > > On Fri, 28 May 2010 14:30:36 +0200
> > > Peter Zijlstra <peterz(a)infradead.org> wrote:
> > >
> > > > On Fri, 2010-05-28 at 13:21 +0100, Alan Cox wrote:
> > > > > [Total kernel changes
> > > > >
> > > > > Ability to mark/unmark a scheduler control group as outside of
> > > > > some parts of idle consideration. Generically useful and
> > > > > localised. Group latency will do most jobs fine (Zygo is correct
> > > > > it can't solve his backup case elegantly I think)
> > > > >
> > > > > Test in the idling logic to distinguish the case and only needed
> > > > > for a single Android specific power module. Generically useful
> > > > > and localised]
> > > >
> > > > I really don't like this..
> > > >
> > > > Why can't we go with the previously suggested: make bad apps block on
> > > > QoS resources or send SIGXCPU, SIGSTOP, SIGTERM and eventually SIGKILL
> > >
> > > Ok. Are you happy with the QoS being attached to a scheduler control
> > > group and the use of them to figure out what is what ?
> >
> > Up to a point, but explicitly not running runnable tasks complicates the
> > task model significantly, and interacts with fun stuff like bandwidth
> > inheritance and priority/deadline inheritance like things -- a subject
> > you really don't want to complicate further.
> >
> > We really want to do our utmost best to make applications block on
> > something without altering our task model.
> >
> > If applications keep running despite being told repeatedly to cease, I
> > think the SIGKILL option is a sane one (they got SIGXCPU, SIGSTOP and
> > SIGTERM before that) and got ample opportunity to block on something.
> >
> > Traditional cpu resource management treats the CPU as an ever
> > replenished resource, breaking that assumption (not running runnable
> > tasks) puts us on very shaky ground indeed.
>
> Also, I'm not quite sure why we would need cgroups to pull this off.
>
> It seems most of the problems the suspend-blockers are trying to solve
> are due to the fact of not running runnable tasks. Not running runnable
> tasks can be seen as assigning tasks 0 bandwidth. Which is a situation
> extremely prone to all things inversion. Such a situation would require
> bandwidth inheritance to function at all, so possibly we can see
> suspend-blockers as a misguided implementation of that.

I think this is a matter of what is regarded as a "runnable task". Some
tasks may not even be regarded as runnable in specific power conditions,
although otherwise they would be.

Consider updatedb or another file indexing ... thing on a laptop. I certainly
don't want anything like this to run and drain my battery, even if it was
already started while the machine was on AC power. Now, of course, I can kill
it, but for that I need to notice that it's running, and it has presumably
done some work already that it would be wasteful to lose. It would be quite
nice if that app was not regarded as runnable when the system was on battery
power.

In my view that's quite analogous to the Android situation, when they simply
don't want some tasks to be regarded as runnable in specific situations.
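
As a rough illustration of "not regarded as runnable in specific power conditions", the idea can be modelled as a filter in front of the ordinary runnable set. Nothing here is a kernel interface: the run_on_battery attribute and the effectively_runnable helper are hypothetical names used only for this sketch.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    wants_cpu: bool              # runnable in the ordinary scheduler sense
    run_on_battery: bool = True  # hypothetical per-task attribute


def effectively_runnable(tasks, on_battery):
    """Names of tasks treated as runnable under the current power source."""
    return [
        t.name
        for t in tasks
        if t.wants_cpu and (t.run_on_battery or not on_battery)
    ]


tasks = [
    Task("editor", wants_cpu=True),
    Task("updatedb", wants_cpu=True, run_on_battery=False),
]
print(effectively_runnable(tasks, on_battery=True))
print(effectively_runnable(tasks, on_battery=False))
```

The indexer keeps whatever work it has done and simply stops being considered for the CPU while on battery; it becomes runnable again when AC power returns.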

Rafael
From: Arve Hjønnevåg on 28 May 2010 18:00

On Fri, May 28, 2010 at 9:31 AM, Alan Cox <alan(a)lxorguk.ukuu.org.uk> wrote:
>> I think Arve's concern was the representation of the "I care, but only
>> a little" or "just low enough to ensure threads must run" level which
>> is what suspend blockers would map to (low enough to ensure we
>> shouldn't halt the world but not necessarily implying a hard latency
>> constraint beyond that).
>
> That's why I suggested "manyana" (can't get accents for mañana in a
> define) or perhaps "dreckly"[1]. They are both words that mean "at some
> point" but in a very very vague and 'relax it'll happen eventually' sense.
>
> More importantly it's policy. It's a "please meet this constraint" guide
> to the PM layer - not a "you must do as I say even if it's stupid".

Huh?

>
>> > They fix a general problem in terms of a driver specific item. We end up
>> > making changes around the tree but we make everyone happy not just
>> > Android. Also we are isolating policy properly. The apps and drivers say
>> > "I have these needs", the power manager figures out how to meet them.
>>
>> That makes sense -- and as I've mentioned elsewhere, we're really not
>> super picky about naming -- if it turns out that
>> wakelocks/suspendblockers were shorthand for "request a qos constraint
>> that ensures that threads are running", we'll be able to get things
>> done just as well as we do now.
>
> Cool. I think they are or at least they are close enough that nobody will
> notice the join ;)
>
>> > Where it gets ugly is if you start trying to have drivers giving an app a
>> > guarantee which the app then magically has to know to dispose of.
>>
>> Yeah -- which is something we've avoided in the existing model with
>> overlapping wakelocks during handoff between domains.
>
> I'm not sure avoided is the right description - its there in all its
> identical ugliness in wakelock magic
>
> If you treat QoS guarantees as a wakelock for your purposes (which is
> just fine, drivers and apps give you policy, you use it how you like)
> then you could write the paragraph below substituting the word
> 'guarantee' for 'wakelock' So in that sense the mess is the same because
> in both cases you are trying to suspend active tasks rather than asking
> the task to behave and then taking remedial action with offenders.
>
>> - input service is select()ing on input devices
>> - when select() returns it grabs a wakelock, reads events, passes them
>> on, releases the wakelock
>> - the event subsystem can then safely drop its "should be running
>> threads" constraint as soon as the last event is read because it has
>> no queues for userspace to drain, but the overlapping wakelock
>> prevents the system from immediately snapping back to sleep
>
> The conventional PC model is 'we don't go back into sleep proper fast
> enough for that race to occur'.

This is the same as saying these two threads don't run often enough to
need a mutex around their critical section. Just because you have not
been bitten by the race yet, does not mean it does not exist.

> It's hard to see how you change it. An

If each layer prevents suspend while it knows there are pending events
you don't have a race. Suspend blockers let you do this.

> app->device "thank you for that event, I enjoyed it very much and have
> finished with it" message moves the underlying event management and QoS
> knowledge into the driver proper but doesn't really change the interface.
>
Yes, you can do this, and it is how the Android alarm driver works, but
we found the sequence of select()/poll(), block suspend, read event,
process event, then unblock suspend to be cleaner (especially for
interfaces that can return more than one event at a time). Kernel suspend
blockers let you implement the alarm driver model; adding user-space
suspend blockers lets you implement the second model.
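
The overlapping hand-off in that sequence can be shown with a toy model. This is a sketch under simplifying assumptions, not the Android or kernel API: SuspendState, InputDriver and input_service_step are invented names, and "suspend" is reduced to a check that no blocker is held.

```python
class SuspendState:
    """Toy model: the system may suspend only when no blocker is held."""

    def __init__(self):
        self.held = set()

    def block(self, name):
        self.held.add(name)

    def unblock(self, name):
        self.held.discard(name)

    def can_suspend(self):
        return not self.held


class InputDriver:
    """Kernel side: blocks suspend while it has unread events queued."""

    def __init__(self, state):
        self.state = state
        self.queue = []

    def post(self, event):
        self.queue.append(event)
        self.state.block("input-driver")

    def read_all(self):
        events, self.queue = self.queue, []
        self.state.unblock("input-driver")  # queue drained, driver lets go
        return events


def input_service_step(state, driver, handle):
    """User-space side, run when select()/poll() reports the device readable."""
    state.block("input-service")  # taken before reading, so the two overlap
    try:
        for event in driver.read_all():
            handle(event)
    finally:
        state.unblock("input-service")


state = SuspendState()
driver = InputDriver(state)
driver.post("key-press")
input_service_step(state, driver,
                   lambda e: print(e, "can_suspend:", state.can_suspend()))
print("after handling, can_suspend:", state.can_suspend())
```

Because the service takes its blocker before the driver releases its own, there is no window between "last event read" and "event processed" in which the model would allow suspend.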

>> > If you are prepared to exclude untrusted apps from perfectly reliable
>> > event reporting (ie from finger to application action) that doesn't seem
>> to be a necessity anyway.
>>
>> Currently in the Android userspace only trusted (system) apps can
>> directly obtain wakelocks -- arbitrary apps obtain them via rpc to a
>> trusted system service (which ensures the app has been granted
>> permission to do this and tracks usage for accountability to
>> user/developer).
>
> Clearly that would continue to work out.
>
> Alan
> [1] Dreckly being used in Cornwall, as one friend put it 'Like mañana but
> without that dreadful sense of urgency'

--
Arve Hjønnevåg