From: david on
On Mon, 7 Jun 2010, Florian Mickler wrote:

> On Sun, 6 Jun 2010 04:14:09 -0700 (PDT)
> david(a)lang.hm wrote:
>
>> On Sun, 6 Jun 2010, Florian Mickler wrote:
>>
>>> On Sun, 6 Jun 2010 12:19:08 +0200
>>> Vitaly Wool <vitalywool(a)gmail.com> wrote:
>>>
>>>> 2010/6/6 <david(a)lang.hm>:
>>>>
>>>>> as an example (taken from this thread).
>>>>>
>>>>> system A needs to wake up to get a battery reading, store it and go back to
>>>>> sleep. It does so every 10 seconds. But when it does so it only runs the one
>>>>> process and then goes back to sleep.
>>>>>
>>>>> system B has the same need, but wakes up every 10 minutes. But when it does
>>>>> so it fully wakes up and this allows the mail app to power up the radio,
>>>>> connect to the Internet and start checking for new mail before opportunistic
>>>>> sleep shuts things down (causing the mail check to fail).
>>>>>
>>>>> System A will last considerably longer on a battery than System B.
>>>>
>>>> Exactly, thanks for pointing out the specific example :)
>>>>
>>>> ~Vitaly
>>>
>>> This does not affect suspend_blockers nor does suspend_blockers
>>> interfere with that.
>>>
>>> Suspend_blockers allow the system to suspend ("mem">/sys/power/state
>>> suspend), when the userspace decides that the device is not in use.
>>>
>>> So implementing suspend_blockers support does not impact any
>>> optimizations done to either system A or system B.
>>
>> Actually, it does.
>>
>> system A is what's being proposed by kernel developers, where the
>> untrusted stuff is in a different cgroup and what puts the system to sleep
>> is 'normal' power management. It doesn't sleep as long, but when it wakes
>> up the untrusted stuff is still frozen, so it doesn't stay awake long, or
>> do very much.
>>
>> System B is suspend blockers where you are either awake or asleep, and
>> when you wake up you wake up fully, but opportunistic sleep can interrupt
>> untrusted processes at any time. The system sleeps longer (as fewer things
>> can wake it), but when it wakes up it's fully awake.
>>
>> David Lang
>
> You say, that coming back from suspend takes the system to full power
> (and everything runs) before it begins the descent into
> runtime-low-power?
> But are you referring to the fact that coming back
> from suspend starts in the zero-idle-state (i.e. "consumes extra
> power") or that all processes run when it is not suspended?

I am referring to the fact that with suspend blockers and opportunistic
suspend, all processes start running when it's not suspended (because they
were all running when it was suspended).

If instead the system only wakes up the trusted processes to handle
whatever woke the system up and is then idle again, it spends less power
and time while awake.

> Because the latter would of course (theoretically) profit from the
> framework-controlled-cgroup-freeze/thaw (with and without
> opportunistic suspend) while the former should be a problem that
> both opportunistic suspend as well as suspend-from-idle have. Or not?
>
> So, here is the question I'm asking myself: If System A were to be
> complemented by suspend_blockers, wouldn't it still be better?

not necessarily.

having suspend blockers inside the kernel adds significant complexity; it's
worth it only if the complexity buys you enough. In this case the question
is whether the suspend blockers would extend the sleep time enough to
matter. As per my other e-mail, this is an area with rapidly diminishing
returns as the sleep times get longer.
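
A rough sketch of why the returns diminish. Every number here is made up
for illustration (these are not measurements of any device), and the
function name is hypothetical. It only shows the shape of the curve: the
average draw over one sleep/wake cycle approaches the sleep-state draw, so
each doubling of the sleep time saves less than the one before.

```python
# Hypothetical figures, chosen only to illustrate the shape of the curve.
SLEEP_POWER_MW = 5.0    # assumed draw while suspended
AWAKE_POWER_MW = 300.0  # assumed draw while awake
AWAKE_SECONDS = 1.0     # assumed time spent awake per wakeup


def average_power_mw(sleep_seconds: float) -> float:
    """Average draw in mW over one cycle of sleeping and then waking once."""
    energy = SLEEP_POWER_MW * sleep_seconds + AWAKE_POWER_MW * AWAKE_SECONDS
    return energy / (sleep_seconds + AWAKE_SECONDS)


if __name__ == "__main__":
    # Double the sleep time repeatedly and print the resulting average draw.
    for sleep_seconds in (10, 20, 40, 80, 160, 320):
        print(f"sleep {sleep_seconds:4d}s -> {average_power_mw(sleep_seconds):6.2f} mW")
```

With these assumed figures the step from 10 to 20 seconds saves far more
than the step from 160 to 320 seconds, which is the point above.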

> With System A you could try to do a really sophisticated
> power-management scheme and so on... but as soon as you allow 3rd-Party
> Apps, how do you manage their cross-dependencies? I.e. you can not
> automatically detect when App1 needs App2 to function.
> You need to allow all 3rd-Party apps to run as a group.
>
> So you can perhaps partition your software stack into "untrusted
> applications" and different groups of software with audited
> dependencies.
>
> If one group interacts with another group (as will be the case at least
> with the "untrusted applications" group) you have to have them both
> running at the same time.
>
> This really gets pretty complex. Do you really think something like
> this is better than a simple suspend? (I.e. suspend blockers or
> having just one group)

even if all you do is have two groups (trusted and untrusted), all you
need to do is to watch for the interaction between these two. Put the
third-party apps in the untrusted group.

depending on what security you have available, you may be able to define
more, smaller groups after using the security to make sure that there is
no overlap between them.

> Suppose you implement suspend blockers with a cgroup freeze... how do
> you implement the freeze/thaw control?

I thought the answer had been provided: one of the trusted apps implements
the freeze/thaw, and everything happens in userspace.
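
A minimal sketch of what that trusted app might do, assuming the untrusted
group lives in a cgroup with the freezer controller attached. The directory
is a placeholder (it depends on where the cgroup hierarchy is mounted) and
the function names are made up for illustration. freezer.state with the
values FROZEN and THAWED is the freezer controller's control file.

```python
import os

VALID_STATES = ("FROZEN", "THAWED")


def set_freezer_state(cgroup_dir: str, state: str) -> None:
    """Write the requested state to the group's freezer.state control file."""
    if state not in VALID_STATES:
        raise ValueError(f"unsupported freezer state: {state!r}")
    with open(os.path.join(cgroup_dir, "freezer.state"), "w") as control:
        control.write(state)


def freeze_untrusted(cgroup_dir: str) -> None:
    """Stop every task in the (assumed) untrusted group."""
    set_freezer_state(cgroup_dir, "FROZEN")


def thaw_untrusted(cgroup_dir: str) -> None:
    """Let the tasks in the (assumed) untrusted group run again."""
    set_freezer_state(cgroup_dir, "THAWED")
```

The policy (when to freeze, when to thaw) stays entirely in the trusted
app; the kernel only sees writes to the control file.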

> Cheers,
> Flo
>
> p.s.: do you see a possibility for any kind of "priority inheritance"
> in the cgroup scheme? I don't.

is there a need for it?

David Lang

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Arve Hjønnevåg on
2010/6/8 Alan Stern <stern(a)rowland.harvard.edu>:
> On Mon, 7 Jun 2010, Arve Hjønnevåg wrote:
>
>> The patch that modifies evdev (posted in this patchset) uses an ioctl
>> to enable the suspend blocker. Not all input devices are used for
>> wakeup events and those don't need to block suspend.
>
> But you do have a 1-1 correspondence, right? That is, the input
> devices that are used for wakeup events are exactly the ones that block
> suspend?
>

Yes.

>
>> If you read an event that occurred after you blocked the task
>> freezing, then tasks will never get frozen again (until more events
>> occur). I think my original description was less confusing, but it
>> seems you got completely distracted by my use of block and unblock
>> suspend when referring to the user space api.
>
> I still find your wording a little confusing. Task freezing can be
> prevented (a more accurate term than "blocked") by two kinds of things:
> a suspend blocker or an "active" wakeup source. I'm not sure which
> kind you mean here.

I mean prevented by a user space suspend blocker.

>
>> It has an indirect connection. You report a wakeup event when it
>> occurs, but clear it when user space calls an api before reading the
>> event. So:
>
> Yes, that's right.
>
>> Wakeup event occurs, and the driver:
>> - report wakeup event type A
>> - queue event for delivery to user-space
>
> That's not really two distinct steps. Queuing the event for delivery
> to userspace involves waking up any tasks that are waiting to read the
> device file; that action (calling wake_up_all() or whatever the driver
> does) is how the event gets reported.
>

If you want to ensure that more than one process sees the event, it has
to be two steps, but it does not affect the race I was trying to
describe.

>> User space wakes up:
>> - Calls api to block task freezing for event type A
>
> Again, that's a confusing way of putting it. The API you're referring
> to is simply the function that activates a suspend blocker. It does
> prevent task freezing, but you shouldn't say it prevents freezing for
> event type A. More like the other way around: In addition to
> preventing freezing, the function tells the power manager that event
> type A should no longer be considered active. Thus, in a sense it
> _stops_ event type A from preventing freezing.
>
>> Another wakeup event occurs, and the driver:
>> - report wakeup event type A
>> - queue event for delivery to user-space
>
> Same as above.
>
>> User space continues:
>> - Read events

Sorry, I missed the unblock task freezing step here.

>> - Wait for more events
>>
>> Result: Tasks are not frozen again.
>
> Because the suspend blocker was never deactivated. The same thing
> happens with wakelocks: If a task activates a wakelock and never
> deactivates it, the system won't go into opportunistic suspend again.

Yes, but with the sequence of events above tasks will not be frozen
again even if the wake-lock/suspend-blocker/task-freezing-preventer is
released.

>
> Here's how my scheme is meant to work:
>
>        Wakeup event for input device A occurs.
>
>        A's driver adds an entry to the input device queue and
>        (if the queue was empty) does wake_up_all() on the device
>        file's wait_queue.
>
>        The PM process returns from poll() and sees that device
>        file A is now readable, so it adds A to its list of active
>        sources and unfreezes userspace.
>
>        Some other process sees that device file A is now readable,
>        so it activates a suspend blocker and reads events from A.
>
>        When the PM process receives the request to activate the
>        suspend blocker, it removes A from its list of active
>        sources. But it doesn't freeze userspace yet, because now
>        a suspend blocker is active.

If another event happens at this point don't you put A back on the
list? If so, it never gets removed.

>
>        The other process consumes events from A and does other
>        stuff. Maybe more input data arrives while this is happening
>        and the process reads it. Eventually the process decides to
>        deactivate the suspend blocker, perhaps when no more data
>        is available from the device file, perhaps not.
>
>        When the PM process receives the request to deactivate the
>        suspend blocker, it sees that now there are no active
>        sources and no active suspend blockers. Therefore it
>        freezes userspace and does a big poll() on all possible
>        sources. (If there are still events on the input device
>        queue, the poll() returns immediately.)
>
>        Rinse and repeat.
>
> I don't see any dangerous races there. The scheme can be made a little
> more efficient by having the PM process do another poll() (with 0
> timeout) just before freezing userspace; if the result indicates that a
> source is active then the freezing and unfreezing can be skipped.
>
> The big assumption here is that a user process never consumes wakeup
> events without first activating a suspend blocker. This seems like a
> reasonable assumption, but we can work around it if necessary.
>
>> >> It seems you would need a way to pass the wakeup source id to use from
>> >> user space to the driver and for this to work
>> >
>> > No, nothing needs to be passed from userspace to the kernel. However
>> > the source ID (or a set of source IDs) does need to be passed to the
>> > power manager process, probably when the suspend blocker is created.
>> >
>>
>> Then the source id needs to be passed from the kernel to user-space.
>
> A source ID is a file descriptor. File descriptors are passed from the
> kernel to userspace whenever a file is opened; I can't deny it. And
> they are passed back to the kernel as part of the read() and poll()
> system calls. Is that what you mean?
>
>> No, that is not the unclear part. What is unclear to me is where the
>> source IDs come from. Are they static and hardcoded in the driver and
>> user-space, or are they passed between the driver and user-space
>> client?
>
> They are not static; they are file descriptors. I guess this should
> have been made more clear originally, but this is still pretty new to
> me too.
>
>> I don't understand how you are planning to ensure that the driver and
>> user-space code that consumes the real event use the same source id.
>
> How can it be otherwise? The userspace code consumes the event by
> reading from the device file. �In order to do so, it has to use the
> same file descriptor it received when it opened the device file
> originally.
>
>> The biggest problem I have with it though is that you have created a
>> new race condition between reporting that a wakeup event has occurred
>> and processing of the real event.
>
> There is no race. The driver reports an event has occurred by making
> the data available to be read from the device file, and the event is
> processed by reading it from the device file (or at least, that's the
> first step in processing the event).
>

If the driver making data available to be read triggers a wakeup event
in the power manager process that has to be cleared by the process
reading the events, then you have a race. Since the power manager is
selecting/polling on the same file descriptor, I don't see what you
gain from linking the wakeup events to suspend blockers. If you break
this link I think it can work, but it does require us to modify all code
that reads wakeup events from the kernel to register the file
descriptors they get events from. It would also require adding
poll/select support to the Android alarm driver, and any driver that
currently uses a wakelock with a timeout would need to notify the user
space power manager instead.
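
A toy model of the sequence in question, assuming the power manager adds
a source back to its active list whenever poll() reports the device file
readable, and only removes it when a suspend blocker is activated. The
class and function names are hypothetical and this is not code from the
patchset; whether a real implementation behaves this way depends on
details that have not been settled in this thread.

```python
class ToyPowerManager:
    """Hypothetical model of the user-space power manager's bookkeeping."""

    def __init__(self):
        self.active_sources = set()
        self.active_blockers = 0
        self.userspace_frozen = True

    def source_readable(self, source):
        # poll() reported data on a wakeup source: mark it active, unfreeze.
        self.active_sources.add(source)
        self.userspace_frozen = False

    def activate_blocker(self, source):
        # Activating a blocker clears the source it is linked to.
        self.active_blockers += 1
        self.active_sources.discard(source)

    def deactivate_blocker(self):
        # Freeze only when no blockers and no active sources remain.
        self.active_blockers -= 1
        if self.active_blockers == 0 and not self.active_sources:
            self.userspace_frozen = True


def run_sequence(second_event_after_activation):
    """Return True if userspace ends up frozen again after the sequence."""
    pm = ToyPowerManager()
    queue = ["event 1"]
    pm.source_readable("A")
    pm.activate_blocker("A")          # reader is about to read from A
    if second_event_after_activation:
        queue.append("event 2")       # arrives after activation, before read
        pm.source_readable("A")
    queue.clear()                     # reader consumes everything queued
    pm.deactivate_blocker()
    return pm.userspace_frozen


if __name__ == "__main__":
    print("no second event:", run_sequence(False))
    print("second event after activation:", run_sequence(True))
```

In this model the second case leaves A on the active list with nothing
left to read, so nothing ever removes it.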

>
> There's one other thing worth mentioning. All along I've been talking
> about a power manager process that coordinates all these activities.
> In theory there's no reason that process couldn't be implemented as a
> kernel thread. This would improve efficiency by reducing the number of
> context switches, and it would change IPC calls into plain system
> calls.
>
> If you did implement it that way, it could be done as a standalone
> kernel module, totally noninvasive. It would not need to be part of
> the vanilla kernel and nobody would object to it.
>
> Alan Stern
>



--
Arve Hjønnevåg
From: Linus Torvalds on


On Tue, 8 Jun 2010, david(a)lang.hm wrote:
>
> having suspend blockers inside the kernel adds significant complexity, it's
> worth it only if the complexity buys you enough. In this case the question is
> if the suspend blockers would extend the sleep time enough more to matter. As
> per my other e-mail, this is an area with rapidly diminishing returns as the
> sleep times get longer.

Well, the counter-argument that nobody seems to have brought up is that
suspend blockers exist, are real code, and end up being shipped in a lot
of machines.

That's a _big_ argument in favour of them. Certainly much bigger than
arguing against them based on some complexity-arguments for an alternative
that hasn't seen any testing at all.

IOW, I would seriously hope that this discussion was more about real code
that _exists_ and does what people need. It seems to have degenerated into
something else.

Because in the end, "code talks, bullshit walks". People can complain and
suggest alternatives all they want, but you can't just argue. At some
point you need to show the code that actually solves the problem.

Linus
From: Felipe Contreras on
On Wed, Jun 9, 2010 at 6:46 AM, Linus Torvalds
<torvalds(a)linux-foundation.org> wrote:
> On Tue, 8 Jun 2010, david(a)lang.hm wrote:
>>
>> having suspend blockers inside the kernel adds significant complexity, it's
>> worth it only if the complexity buys you enough. In this case the question is
>> if the suspend blockers would extend the sleep time enough more to matter. As
>> per my other e-mail, this is an area with rapidly diminishing returns as the
>> sleep times get longer.
>
> Well, the counter-argument that nobody seems to have brought up is that
> suspend blockers exist, are real code, and end up being shipped in a lot
> of machines.
>
> That's a _big_ argument in favour of them. Certainly much bigger than
> arguing against them based on some complexity-arguments for an alternative
> that hasn't seen any testing at all.
>
> IOW, I would seriously hope that this discussion was more about real code
> that _exists_ and does what people need. It seems to have degenerated into
> something else.
>
> Because in the end, "code talks, bullshit walks". People can complain and
> suggest alternatives all they want, but you can't just argue. At some
> point you need to show the code that actually solves the problem.

That's assuming there is an actual problem, which according to all the
embedded people except android, there is not.

And if there is indeed such a problem (probably not big), it might be
solved properly by the time suspend blockers are merged, or a few
releases after.

Whatever the solution (or workaround) is, it would be nice if it could
be used by more than just android people, and it would also be nice to
do it without introducing a user-space API that *nobody* likes and might
be quickly deprecated.

--
Felipe Contreras
From: Rafael J. Wysocki on
On Wednesday 09 June 2010, Felipe Contreras wrote:
> On Wed, Jun 9, 2010 at 6:46 AM, Linus Torvalds
> <torvalds(a)linux-foundation.org> wrote:
> > On Tue, 8 Jun 2010, david(a)lang.hm wrote:
> >>
> >> having suspend blockers inside the kernel adds significant complexity, it's
> >> worth it only if the complexity buys you enough. In this case the question is
> >> if the suspend blockers would extend the sleep time enough more to matter. As
> >> per my other e-mail, this is an area with rapidly diminishing returns as the
> >> sleep times get longer.
> >
> > Well, the counter-argument that nobody seems to have brought up is that
> > suspend blockers exist, are real code, and end up being shipped in a lot
> > of machines.
> >
> > That's a _big_ argument in favour of them. Certainly much bigger than
> > arguing against them based on some complexity-arguments for an alternative
> > that hasn't seen any testing at all.
> >
> > IOW, I would seriously hope that this discussion was more about real code
> > that _exists_ and does what people need. It seems to have degenerated into
> > something else.
> >
> > Because in the end, "code talks, bullshit walks". People can complain and
> > suggest alternatives all they want, but you can't just argue. At some
> > point you need to show the code that actually solves the problem.
>
> That's assuming there is an actual problem, which according to all the
> embedded people except android, there is not.

Yes, there is, but they've decided to ignore it.

> And if there is indeed such a problem (probably not big), it might be
> solved properly by the time suspend blockers are merged, or a few
> releases after.

Not quite. Have you followed all of the discussion, actually?

> Whatever the solution (or workaround) is, it would be nice if it could
> be used by more than just android people, and it would also be nice to
> do it without introducing user-space API that *nobody* likes and might
> be quickly deprecated.

I agree with Linus and I don't have that much of a problem with the API that
people seem to have. In fact the much-hated user space API is just a char
device driver with 3 ioctls (that can be extended in future if need be) and
the kernel API is acceptable to me. Yes, there is some overlap between it
and PM QoS, but IMHO that overlap may be reduced over time (eg. by
using PM QoS requirements to implement suspend blockers).

To me, the question boils down to whether or not we're able to persuade the
Android people to use any other approach (eg. by demonstrating that something
else is actually better), because even if we invent a brilliant new approach,
but Android ends up using its old one anyway, the net result will be as though
we haven't done anything useful.

Rafael