From: Rafael J. Wysocki on
On Thursday, June 10, 2010, Neil Brown wrote:
> On Thu, 10 Jun 2010 10:59:43 +0200
> "Rafael J. Wysocki" <rjw(a)sisk.pl> wrote:
>
> > On Thursday, June 10, 2010, Neil Brown wrote:
> > > On Wed, 9 Jun 2010 11:40:27 +0200
> > > "Rafael J. Wysocki" <rjw(a)sisk.pl> wrote:
> > >
> > > > On Wednesday 09 June 2010, Felipe Contreras wrote:
> > > > > On Wed, Jun 9, 2010 at 6:46 AM, Linus Torvalds
> > > > > <torvalds(a)linux-foundation.org> wrote:
> > > > > > On Tue, 8 Jun 2010, david(a)lang.hm wrote:
> > > > > >>
> > > > > >> having suspend blockers inside the kernel adds significant complexity, it's
> > > > > >> worth it only if the complexity buys you enough. In this case the question is
> > > > > >> if the suspend blockers would extend the sleep time enough more to matter. As
> > > > > >> per my other e-mail, this is an area with rapidly diminishing returns as the
> > > > > >> sleep times get longer.
> > > > > >
> > > > > > Well, the counter-argument that nobody seems to have brought up is that
> > > > > > suspend blockers exist, are real code, and end up being shipped in a lot
> > > > > > of machines.
> > > > > >
> > > > > > That's a _big_ argument in favour of them. Certainly much bigger than
> > > > > > arguing against them based on some complexity-arguments for an alternative
> > > > > > that hasn't seen any testing at all.
> > > > > >
> > > > > > IOW, I would seriously hope that this discussion was more about real code
> > > > > > that _exists_ and does what people need. It seems to have degenerated into
> > > > > > something else.
> > > > > >
> > > > > > Because in the end, "code talks, bullshit walks". People can complain and
> > > > > > suggest alternatives all they want, but you can't just argue. At some
> > > > > > point you need to show the code that actually solves the problem.
> > > > >
> > > > > That's assuming there is an actual problem, which according to all the
> > > > > embedded people except android, there is not.
> > > >
> > > > Yes, there is, but they've decided to ignore it.
> > > >
> > > > > And if there is indeed such a problem (probably not big), it might be
> > > > > solved properly by the time suspend blockers are merged, or few
> > > > > releases after.
> > > >
> > > > Not quite. Have you followed all of the discussion, actually?
> > > >
> > > > > Whatever the solution (or workaround) is, it would be nice if it could
> > > > > be used by more than just android people, and it would also be nice to
> > > > > do it without introducing user-space API that *nobody* likes and might
> > > > > be quickly deprecated.
> > > >
> > > > I agree with Linus and I don't have that much of a problem with the API that
> > > > people seem to have. In fact the much-hated user space API is just a char
> > > > device driver with 3 ioctls (that can be extended in future if need be) and
> > > > the kernel API is acceptable to me.
> > >
> > > I think there is a little bit more to it than that. It seems there is a new
> > > ioctl for input/event devices to say "Any events queued here should be
> > > treated as wake-up events". There may be similar additions to other devices,
> > > but I know of no details.
> >
> > That's not in the patchset as in my pull request.
> >
> > It is used on Android, though, and it would have been submitted separately, had
> > the first patchset been merged.
>
> Very true. But as the one cannot be used without the other, they really need
> to be considered as a package.

You can use suspend blockers as in the pull request without the input patch
in principle.

> >
> > > I wonder if we can get a complete statement of changes to the user-space
> > > API...
> > >
> > > > Yes, there is some overlap between it
> > > > and PM QoS, but IMhO that overlap may be reduced over time (eg. by
> > > > using PM QoS requirements to implement suspend blockers).
> > > >
> > > > To me, the question boils down to whether or not we're able to persuade the
> > > > Android people to use any other approach (eg. by demonstrating that something
> > > > else is actually better), because even if we invent a brilliant new approach,
> > > > but Android ends up using its old one anyway, the net result will be as though
> > > > we haven't done anything useful.
> > >
> > > Yes. There is no point unless we can meet somewhere in the middle. I think
> > > that would have to include a full suspend that freezes all processes.
> > > Solutions which reject that - while quite clever - would require too much
> > > change to Android user-space to be acceptable.
> >
> > Moreover, having thought a bit more about the "power manager in user space"
> > concept I'm not sure if it really is that better than the original wakelocks
> > idea. Namely, it only repaces a kernel-based mechanism with a user space
> > task doing basically the same thing, but the communication between that task
> > and the other cooperating user space tasks is arguably more complicated (it
> > also uses the kernel resources, although indirectly).
> >
> > So, for a phone-like system, where you'd generally want to simplify user space,
> > having a "power manager" in the kernel seems to make sense to me.
> >
>
> Following that logic would we end up putting everything in the kernel?

No, I don't think so.

> To my mind the advantage of having something in user-space is flexibility -
> you can refine the interfaces and behaviours without bothering the kernel.
> The reasons for putting things in the kernel are:
> - tight integration with VM or processes (the two main abstractions that the
> kernel has to manage)
> - privileged access to devices
> - arbitration between processes with different privilege levels

I'd add inter-process communication to this list as well.

> Need-for-speed, on the other hand, is not necessarily a justification for
> going in the kernel - experience shows that we can make user-space
> interactions quite fast enough.

Still, if a kernel-based approach ends up being simpler than a user-space-only
alternative, I tend to prefer the former. And please remember that in this
particular case all user-space-only alternatives in fact _have_ _to_ use
facilities provided by the kernel anyway, although they can use the existing
ones.

> The "power manager" itself is quite trivial. It just needs something like:
> while true
> wait for all wake-locks to be dropped
> activate suspend
> and for 'activate suspend' to be interruptible by something taking out a
> wake-lock.

Where the last thing is kind of complicated.

> To my mind, the most interesting part of this is interruptibility. There is
> essentially only one way to do that in Unix/Linux: signals.

Well, I don't follow. :-)

> So it makes sense to have a task - with a pid - performing this loop so that
> it can be sent a signal.
> And we already have mechanisms for sending signals on all sorts of different
> events, both kernel events (fcntl(F_OWNER)), file events (FNOTIFY), and
> normal process operations (kill).
>
> If you are going to have a process (which I think you should)

There's no such process in the original patchset and it seemed to get away with
that.

> it makes sense for it to spend some time in user-space to use ipc-of-your-choice to
> communicate with other processes, and spend some time in kernel-space to be
> able to wait for in-kernel locks to be dropped, and to use signals to get out
> of kernel-space, and system calls to get into kernel space.
>
> Yes, the proposed user-space interface for suspend blocks may not be all that
> horrible. But the fact is that you can get all the required functionality by
> just using currently existing interfaces. Against that background, adding
> anything new should be avoided.

Again, if things can be simplified overall by adding something new to the
kernel, I don't see a problem with that in principle. Especially if that new
stuff doesn't really interfere with anything already existing.

There also is the "real code" argument you seem to totally neglect. Whatever
you suggest, you should actually implement it and show that it works and that
it really is better than the proposed approach. Till then, it's just pure
speculation.

So there's real code that I don't really have a problem with and I haven't seen
any alternative implemented and working yet.

Rafael
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Mark Brown on
On Thu, Jun 10, 2010 at 05:46:46PM +0200, Rafael J. Wysocki wrote:
> On Thursday, June 10, 2010, Mark Brown wrote:
> > On Thu, Jun 10, 2010 at 10:59:43AM +0200, Rafael J. Wysocki wrote:

> > > So, for a phone-like system, where you'd generally want to simplify user space,
> > > having a "power manager" in the kernel seems to make sense to me.

> > I'm not clear where this requirement to simplify user space specifically
> > for phones comes from

> This isn't a requirement, but something that IMO is reasonable.

> > - phones do have pretty substantial software stacks that aren't that far away
> > to PCs.

> That doesn't seem to be relevant here.

Sure, that's my point - you seem to be suggesting that phones have a
different requirement here.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Arve Hjønnevåg on
2010/6/10 Alan Stern <stern(a)rowland.harvard.edu>:
> On Wed, 9 Jun 2010, Arve Hj�nnev�g wrote:
>
>> > I think this is where you misunderstood. �There is no "report wakeup
>> > event" as such. �All that happens is that data becomes available to be
>> > read from the input device file. �However the power manager process
>> > isn't polling the device file at this point (because a suspend blocker
>> > is active), so it doesn't realize that the source has become active
>> > again.
>> >
>>
>> Yes this is not what I though you were suggesting. I thought you were
>> trying to make sure the power manager sees all wakeup events. If you
>> are only listening for wakeup events while no suspend blockers are
>> active, why latch them?
>
> ...
>
>> If you only poll the fd after the last user-space suspend blocker is
>> released, why do you care when the kernel wakelock could have been
>> released? It seem the only thing it saves you is an extra poll call
>> when two wakeup events happen at the same time and one of them is
>> fully processed and unblocks suspend before the other event handler
>> blocks suspend. It seems strange to remove your wakeup event from the
>> list when a specific suspend blocker is acquired when any suspend
>> blocker will prevent that wakeup event from being added to the list in
>> the first place.
>
> ...
>
>> I don't think there is a need to tie the fds to anything else. If you
>> poll the fds on the last suspend unblock call, you should get the same
>> behaviour.
>
> You are quite right; there is no need to associate suspend blockers
> with wakeup sources. �The power manager merely needs to poll all wakeup
> sources whenever no suspend blockers are active. �I put those
> associations in the proposal because of the line of reasoning that led
> up to it, but they aren't necessary.
>
>
>> All input events that can wake the system are handled by one
>> user-space suspend blocker. Input devices come and go so we would need
>> to add and remove the fds dynamically.
>
> Correct; the power manager would need to know whenever a wakeup-capable
> device file was opened or closed.
>
>
>> > The power manager would indeed have to know about wakeup devices that
>> > don't need to _keep_ the system awake. �Here's one way to cope: During
>> > those times when no suspend blockers are active but the PM process
>> > thinks a wakeup source is active, the PM process could poll every few
>> > seconds to update its list of active sources. �At those points it could
>> > remove wakeup sources that have timed out.
>> >
>>
>> For that to work the wakeup events would have to be reported to the
>> power manager in a reliable way in the first place. Passing the file
>> descriptor that the app uses to the power manager does not work for
>> this, since the app could read the event while the power manager was
>> not in the poll call and the power manager would never see it.
>
> If the app activates a suspend blocker before reading the event, this
> doesn't matter. �If the app doesn't activate a suspend blocker then it
> risks being suspended after it has read the event but before it has
> handled the event. �This is equally true with wakelocks.
>

It is not the same. Using a wakelock with a timeout only has a problem
if the app did not get a change to run and block suspend before the
timeout expires. With the timeout values we use there is only a
problem if the system is already unresponsive. If the driver does not
block suspend but instead a power manager calls select or poll on a
file descriptor while the app does a blocking read, the power manager
can easily miss the event and suspend before the app blocks suspend.

>> Also,
>> existing apps don't pass their file descriptors to the power manager,
>> so it has the get the event from somewhere else.
>
> Now you've put your finger on the key. �The main difference between my
> scheme and the original wakelock scheme is that programs have to inform
> the power manager whenever they open or close a wakeup-capable device
> file. �With everything implemented inside the kernel this isn't
> necessary, because obviously a kernel driver already knows when its
> device file is opened or closed.
>
> As I said before, this additional complication in userspace is the
> price paid for keeping stuff out of the kernel. �If you had implemented
> wakelocks this way originally, you would not have needed to patch the
> vanilla kernel and this entire stormy discussion would never have
> occurred. �(But of course you couldn't have used this for the original
> implementation of wakelocks, because back then the hardware couldn't
> achieve the lowest power states from idle.)
>
>> > Obviously this proposal would complicate your userspace. �Not
>> > enormously, since most of the work is confined to the power manager,
>>
>> No, the main problem it that it is not confined to the power manager.
>> The power manager does not have a list of file descriptors to monitor,
>> so we have to modify all code that handles wakeup events. This
>> includes vendor supplied code that we don't have the source for, but,
>> on some platforms at least, this code relies on kernel wakelocks with
>> a timeout and could at first be handled the same way we would have to
>> handle existing apps reading from a socket.
>
> Look, I never said this scheme was _better_ than wakelocks. �I merely
> said that it could be used without significant modifications to the
> kernel.
>
>> The suspend blocker approach is more generally useful since it
>> supports hardware where suspend is needed. Why this argument is being
>> ignored is very puzzling.
>
> Probably because people doesn't envision system suspend being used for
> dynamic power management on that kind of hardware.
>

I'm not sure what you mean by dynamic power management here (frequency
of suspends?), but auto suspend is already in use on x86 desktops and
laptops. Suspend blockers can fix the race with some wakeup events
there.

>> Your solution is not immediately useful since depends on removing all
>> periodic kernel timers and adding support to stopping the monotonic
>> clock for set of processes that are frozen.
>
> The monotonic clock is indeed an issue.
>
> Periodic kernel timers... I don't know about them. �Has anyone tried to
> measure them? �There are excellent reasons for trimming them when the
> system goes into low-power idle, independent of Android. �This sounds
> like the kind of thing Arjan might have worked on.
>
>> Your solution forces the user space interface to be more complicated,
>> and it creates a new user space power manager that has to start before
>> any process that handle wakeup events.
>
> That's right.
>
> Alan Stern
>
>



--
Arve Hj�nnev�g
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Alan Stern on
On Thu, 10 Jun 2010, Arve Hj�nnev�g wrote:

> >> For that to work the wakeup events would have to be reported to the
> >> power manager in a reliable way in the first place. Passing the file
> >> descriptor that the app uses to the power manager does not work for
> >> this, since the app could read the event while the power manager was
> >> not in the poll call and the power manager would never see it.
> >
> > If the app activates a suspend blocker before reading the event, this
> > doesn't matter. �If the app doesn't activate a suspend blocker then it
> > risks being suspended after it has read the event but before it has
> > handled the event. �This is equally true with wakelocks.
> >
>
> It is not the same. Using a wakelock with a timeout only has a problem
> if the app did not get a change to run and block suspend before the
> timeout expires. With the timeout values we use there is only a
> problem if the system is already unresponsive. If the driver does not
> block suspend but instead a power manager calls select or poll on a
> file descriptor while the app does a blocking read, the power manager
> can easily miss the event and suspend before the app blocks suspend.

You've lost me. If the power manager is sitting inside a select/poll,
how can it miss the event (given that the event will make data
available to be read on one of the descriptors being polled)?

Or put it another way: With wakelocks, if the app doesn't use a suspend
blocker then once it reads the event data and the timed wakelock is
deactivated, there is nothing to prevent the system from immediately
going into opportunistic suspend. My scheme can fail in the same way.
Is that what you meant?

> >> The suspend blocker approach is more generally useful since it
> >> supports hardware where suspend is needed. Why this argument is being
> >> ignored is very puzzling.
> >
> > Probably because people doesn't envision system suspend being used for
> > dynamic power management on that kind of hardware.
> >
>
> I'm not sure what you mean by dynamic power management here (frequency
> of suspends?), but auto suspend is already in use on x86 desktops and
> laptops. Suspend blockers can fix the race with some wakeup events
> there.

You should stress this point more strongly when conversing with others.
I doubt it will be enough to change anybody's mind, but it can't hurt.
Indeed, if you propose suspend blockers as a way to fix a lost-wakeup
bug in existing distributions, rather than as something needed to
support Android, people might view it more favorably.


There's one question that I don't remember ever seeing answered. To
which kernel drivers do you intend to add suspend blockers?

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Arve Hjønnevåg on
2010/6/10 Alan Stern <stern(a)rowland.harvard.edu>:
> On Thu, 10 Jun 2010, Arve Hj�nnev�g wrote:
>
>> >> For that to work the wakeup events would have to be reported to the
>> >> power manager in a reliable way in the first place. Passing the file
>> >> descriptor that the app uses to the power manager does not work for
>> >> this, since the app could read the event while the power manager was
>> >> not in the poll call and the power manager would never see it.
>> >
>> > If the app activates a suspend blocker before reading the event, this
>> > doesn't matter. �If the app doesn't activate a suspend blocker then it
>> > risks being suspended after it has read the event but before it has
>> > handled the event. �This is equally true with wakelocks.
>> >
>>
>> It is not the same. Using a wakelock with a timeout only has a problem
>> if the app did not get a change to run and block suspend before the
>> timeout expires. With the timeout values we use there is only a
>> problem if the system is already unresponsive. If the driver does not
>> block suspend but instead a power manager calls select or poll on a
>> file descriptor while the app does a blocking read, the power manager
>> can easily miss the event and suspend before the app blocks suspend.
>
> You've lost me. �If the power manager is sitting inside a select/poll,
> how can it miss the event (given that the event will make data
> available to be read on one of the descriptors being polled)?
>

It cannot sit inside of select/poll all the time.

> Or put it another way: With wakelocks, if the app doesn't use a suspend
> blocker then once it reads the event data and the timed wakelock is
> deactivated, there is nothing to prevent the system from immediately
> going into opportunistic suspend. �My scheme can fail in the same way.
> Is that what you meant?
>

No, if an app reads from a file descriptor and block suspend when the
read call returns, then suspend is blocked while processing the data.
If the driver uses a wakelock with a timeout this will fail if the
thread does not get to the suspend block call before the timeout
expires, but unrelated events that don't prevent the app from running
will not cause any problems. In your scheme the user-space power
manager may miss events on this file descriptor since select/poll will
not see an event if the app read that event right before the power
manager called select/poll.

>> >> The suspend blocker approach is more generally useful since it
>> >> supports hardware where suspend is needed. Why this argument is being
>> >> ignored is very puzzling.
>> >
>> > Probably because people doesn't envision system suspend being used for
>> > dynamic power management on that kind of hardware.
>> >
>>
>> I'm not sure what you mean by dynamic power management here (frequency
>> of suspends?), but auto suspend is already in use on x86 desktops and
>> laptops. Suspend blockers can fix the race with some wakeup events
>> there.
>
> You should stress this point more strongly when conversing with others.
> I doubt it will be enough to change anybody's mind, but it can't hurt.
> Indeed, if you propose suspend blockers as a way to fix a lost-wakeup
> bug in existing distributions, rather than as something needed to
> support Android, people might view it more favorably.
>
>
> There's one question that I don't remember ever seeing answered. �To
> which kernel drivers do you intend to add suspend blockers?
>

All drivers that generate wakeup events need to either use suspend
blockers directly or call into something else that does. For instance,
with the patch to block suspend while input events are queued to
user-space, an input driver that fully handles its events in its
interrupt handler does not need any additional suspend blockers, but
if the driver needs a work function or a timer to run before it
reports the event it needs to block suspend until it has reported the
event.

--
Arve Hj�nnev�g
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/