PM: Avoid losing wakeup events during suspend [Kernel]

Prev: [PATCH] [RFC] kvm: time accelerator/decelerator for kvm guests
Next: wait_for_completion_interruptible does not wait !!

From: mark gross on 21 Jun 2010 02:20

On Sun, Jun 20, 2010 at 10:23:38PM -0400, Alan Stern wrote:
> On Sun, 20 Jun 2010, Rafael J. Wysocki wrote:
>
> > > > Generally, there are two problems in that area. First, if a wakeup event
> > > > occurs exactly at the same time when /sys/power/state is being written to,
> > > > the even may be delivered to user space right before the freezing of it,
> > > > in which case the user space consumer of the event may not be able to process
> > > > it before the system is suspended.
> > >
> > > Indeed, the same problem arises if the event isn't delivered to
> > > userspace until after userspace is frozen.
> >
> > In that case the kernel should abort the suspend so that the event can be
> > delivered to the user space.
>
> Yes.
>
> > > Of course, the underlying issue here is that the kernel has no direct way
> > > to know when userspace has finished processing an event. Userspace would
> > > have to tell it, which generally would mean rewriting some large number of user
> > > programs.
> >
> > I'm not sure of that. If the kernel doesn't initiate suspend, it doesn't
> > really need to know whether or not user space has already consumed the event.
>
> That's true. But it only shifts the onus: When a userspace program has
> finished processing an event, it has to tell the power-manager process.
> Clearly this sort of thing is unavoidable, one way or another.
>
> > > > The following patch illustrates my idea of how these two problems may be
> > > > addressed. It introduces a new global sysfs attribute,
> > > > /sys/power/wakeup_count, associated with a running counter of wakeup events
> > > > and a helper function, pm_wakeup_event(), that may be used by kernel subsystems
> > > > to increment the wakeup events counter.
> > >
> > > In what way is this better than suspend blockers?
> >
> > It doesn't add any new framework and it doesn't require the users of
> > pm_wakeup_event() to "unblock" suspend, so it is simpler. It also doesn't add
> > the user space interface that caused so much opposition to appear.
>
> Okay. A quick comparison shows that in your proposal:
>
> There's no need to register and unregister suspend blockers.
> But instead you create the equivalent of a suspend blocker
> inside every struct device.
>
> Drivers (or subsystems) don't have to activate suspend
> blockers. But instead they have to call pm_wakeup_event().
>
> Drivers don't have to deactivate suspend blockers. You don't
> have anything equivalent, and as a result your scheme is
> subject to the race described below.
>
> There are no userspace suspend blockers and no opportunistic
> suspend. Instead a power-manager process takes care of
> initiating or preventing suspends as needed.
>
> In short, you have eliminated the userspace part of the suspend blocker
> approach just as in some of the proposals posted earlier, and you have
> replaced the in-kernel suspend blockers with new data in struct device
> and a new PM API. On the whole, it doesn't seem very different from
> the in-kernel part of suspend blockers. The most notable difference is
> the name: pm_wake_event() vs. suspend_blocker_activate(), or whatever
> it ended up being called.
>
> This is the race I was talking about:

Your confused about what problem this patch attempts to solve. There is
a pm_qos patch in the works to address the suspend blocker
functionality.
http://lists.linux-foundation.org/pipermail/linux-pm/2010-June/026760.html

--mgross

>
> > > What happens if an event arrives just before you read
> > > /sys/power/wakeup_count, but the userspace consumer doesn't realize
> > > there is a new unprocessed event until after the power manager checks
> > > it?
>
> > I think this is not the kernel's problem. In this approach the kernel makes it
> > possible for the user space to avoid the race. Whether or not the user space
> > will use this opportunity is a different matter.
>
> It is _not_ possible for userspace to avoid this race. Help from the
> kernel is needed.
>
> > > Your plan is missing a critical step: the "handoff" whereby
> > > responsibility for handling an event passes from the kernel to
> > > userspace.
>
> > > With suspend blockers, this handoff occurs when an event queue is
> > > emptied and its associate suspend blocker is deactivated. Or with some
> > > kinds of events for which the Android people have not written an
> > > explicit handoff, it occurs when a timer expires (timed suspend
> > > blockers).
> >
> > Well, quite frankly, I don't see any difference here. In either case there is
> > a possibility for user space to mess up things and the kernel can't really help
> > that.
>
> With suspend blockers, there is also the possibility for userspace to
> handle races correctly. But with your scheme there isn't -- that's the
> difference.
>
> > > This shares with the other alternatives posted recently the need for a
> > > central power-manager process. And like in-kernel suspend blockers, it
> > > requires changes to wakeup-capable drivers (the wakeup-events counter
> > > has to be incremented).
> >
> > It doesn't really require changes to drivers, but to code that knows of wakeup
> > events, like the PCI runtime wakeup code.
>
> Just like in-kernel suspend blockers.
>
> > Moreover, it doesn't require kernel
> > subsystems to know or even care when it is reasonable to allow suspend to
> > happen. The only thing they need to do is to call pm_wakeup_event() whenever
> > they see a wakeup event.
>
> That's just semantics. Obviously a wakeup event should prevent suspend
> from happening, so if subsystems know or care about one then they know
> or care about the other.
>
> > I don't really think it is too much of a requirement
> > (and quite frnakly I can't imagine anything simpler than that).
>
> This is because you have omitted the part about allowing suspends again
> (or if you prefer, about notifying the PM core that a wakeup event has
> been handed off to userspace). As a result of leaving this out, you
> haven't eliminated all the races.
>
> > Yes, it does, but I have an idea about how to implement such a power manager
> > and I'm going to actually try it.
>
> A logical design would be to use dbus for disseminating PM-related
> information. Does your idea work that way?
>
> > I don't think any of the approaches that don't use suspend blockers allows
> > one to avoid the race between the process that writes to /sys/power/state
> > and a wakeup event happening at the same time. They attempt to address another
> > issue, which is how to prevent untrusted user space processes from keeping the
> > system out of idle, but that is a different story.
>
> Well, there was one approach that didn't use suspend blockers and did
> solve the race: the original wakelocks proposal. Of course, that was
> just suspend blockers under a different name. One could make a very
> good case that your scheme is also suspend blockers under a different
> name (and with an important part missing).
>
> Alan Stern
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: tytso on 21 Jun 2010 08:20

So where are we at this point?

Discussion had completely died down for a while, and it's picked up
again, but it's not clear to me that we're any closer to reaching
consensus.

There's been one proposal that we simply merge in a set of no-op
inline functions for suspend blockers, just so we can get let the
drivers go in (assuming that Greg K-H believes this is still a
problem), but with an automatical removal of N months (N to be
decided, say 9 or 12 or 18 months).

My concern is that if we do that, we will have simply kicked the ball
down the road for N months. Another approach is to simply merge in
no-op functions and not leave any kind of deprecation schedule.
That's sort of an implicit admission of the fact that we may not reach
consensus on this issue. Or we could simply ship the patches as-is to
Linus after he gets back from vacation and ask him for a thumbs up or
thumbs down vote, which might settle things once and for all.

How do we go forward from here?

- Ted
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: Florian Mickler on 21 Jun 2010 08:30

On Mon, 21 Jun 2010 13:22:34 +0100
Alan Cox <alan(a)lxorguk.ukuu.org.uk> wrote:

> > How do we go forward from here?
>
> PM QoS seems to be evolving nicely.

Still, it doesn't solve the userspace-part.

Cheers,
Flo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: Alan Cox on 21 Jun 2010 08:30

> There's been one proposal that we simply merge in a set of no-op
> inline functions for suspend blockers, just so we can get let the
> drivers go in (assuming that Greg K-H believes this is still a
> problem), but with an automatical removal of N months (N to be
> decided, say 9 or 12 or 18 months).

If you automatically remove the suspend blocker wrappers in 12 months
then you can keep the drivers in tree.

The drivers are generally not a problem (ok some of them like the binder
are interesting little trips) its just those hooks, and we'd all benefit
somewhat even if the only bit google are patching back in are their hooks.

> down the road for N months. Another approach is to simply merge in
> no-op functions and not leave any kind of deprecation schedule.
> That's sort of an implicit admission of the fact that we may not reach
> consensus on this issue. Or we could simply ship the patches as-is to

Very bad precedent. What happens when every other vendor does the same ?
Keep the upstream code clean.

> Linus after he gets back from vacation and ask him for a thumbs up or
> thumbs down vote, which might settle things once and for all.

You seem desperate to just throw it at Linus - you have been all along
before the discussion, during it and now: but I don't understand why ?
It's as if you don't want progress to be made by other means ?

> How do we go forward from here?

PM QoS seems to be evolving nicely.

As for merging stuff - if its new code then it should get submitted with
the hooks left out anyway, just as all the other vendors are doing with
weird little custom hooks of their own. The hooks can still easily be
patched back in as a tiny easy to maintain vendor patch.

This works - the code still gets updated and seen by people, only the
little hook patch has to be maintained and generally it looks after
itself except for the odd .rej when something changes right up by the
merge point.

Alan
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: Florian Mickler on 21 Jun 2010 08:50

On Sun, 20 Jun 2010 22:55:22 -0700
mark gross <640e9920(a)gmail.com> wrote:

> On Sun, Jun 20, 2010 at 10:33:01PM -0400, Alan Stern wrote:
> > On Sun, 20 Jun 2010, mark gross wrote:
> > > > In what way is this better than suspend blockers?
> > >
> > > 1) I don't think suspend blockers really solve or effectively articulate
> > > the suspend wake event race problem.
> >
> > Why not?
>
> For starters the suspend block patch is about suspend blocking, at least
> at first before competing ideas starting coming out and this race issue
> was offered up as an excuse to not consider the alternative ideas, then
> suddenly suspend blocking became a wake event notification mechanism
> implementation that was baked into the implementation. The arguments
> are still a blur and irritating to recall / look up again.

I really can't follow you here. Userspace and kernelspace actively
blocking suspend when necessary guarantees that the race is won by the
right party. I.e. the race problem is solved.

You can view it as a wake event notification if you want. But that's
just another point of view.

>
> But, the point I'm trying to make is that wake event serialization /
> event handling suddenly became a big FUD-pie tightly bound to a specific
> feature for suspend blocking (or was is auto suspending? Its a magical
> solution to all the PM problems in the kernel. I'm being sarcastic.)
>
> Its much better to call out the issue and provide a clean targeted
> solution to it without binding it to some other solution to a different
> problem.

See my other mail in response to Alan Stern. In the case of a
userspace-suspend-manager, if we just use this wake-event-counting
mechanism, we have to make shure userspace get's scheduled and notifies
the suspend-manager with an 'all clear' before it suspends the machine.

Else it can happen for the userspace-manager to suspend before
userspace could notify the userspace-manager to not suspend.

Cheers,
Flo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

First | Prev | Next | Last
Pages: 1 2 3 4 5 6 7 8 9 10 11 12
Prev: [PATCH] [RFC] kvm: time accelerator/decelerator for kvm guests
Next: wait_for_completion_interruptible does not wait !!