[PATCH 1/8] PM: Opportunistic suspend support. [Kernel]

Prev: [PATCH 4/8] PM: suspend_block: Add debugfs file
Next: [PATCH] pcm551.c: use pci_ids.h defines

From: Neil Brown on 30 May 2010 04:10

On Fri, 28 May 2010 21:04:53 -0700
Arve Hjønnevåg <arve(a)android.com> wrote:

> On Fri, May 28, 2010 at 7:52 PM, mark gross <640e9920(a)gmail.com> wrote:
> > On Thu, May 27, 2010 at 05:23:54PM +1000, Neil Brown wrote:
> >> On Wed, 26 May 2010 14:20:51 +0100
> >> Matthew Garrett <mjg59(a)srcf.ucam.org> wrote:
> >>
> >> > On Wed, May 26, 2010 at 02:57:45PM +0200, Peter Zijlstra wrote:
> >> >
> >> > > I fail to see why. In both cases the woken userspace will contact a
> >> > > central governing task, either the kernel or the userspace suspend
> >> > > manager, and inform it there is work to be done, and please don't
> >> > > suspend now.
> >> >
> >> > Thinking about this, you're right - we don't have to wait, but that does
> >> > result in another problem. Imagine we get two wakeup events
> >> > approximately simultaneously. In the kernel-level universe the kernel
> >> > knows when both have been handled. In the user-level universe, we may
> >> > have one task schedule, bump the count, handle the event, drop the count
> >> > and then we attempt a suspend again because the second event handler
> >> > hasn't had an opportunity to run yet. We'll then attempt a suspend and
> >> > immediately bounce back up. That's kind of wasteful, although it'd be
> >> > somewhat mitigated by checking that right at the top of suspend entry
> >> > and returning -EAGAIN or similar.
> >> >
> >>
> >> (I'm coming a little late to this party, so excuse me if I say something that
> >> has already been covered however...)
> >>
> >> The above triggers a sequence of thoughts which (When they settled down) look
> >> a bit like this.
> >>
> >> At the hardware level, there is a thing that we could call a "suspend
> >> blocker". It is an interrupt (presumably level-triggered) that causes the
> >> processor to come out of suspend, or not to go into it.
> >>
> >> Maybe it makes sense to export a similar thing from the kernel to user-space.
> >> When any event happens that would wake the device (and drivers need to know
> >> about these already), it would present something to user-space to say that
> >> the event happened.
> >>
> >> When user-space processes the event, it clears the event indicator.
> >
> > we did I proposed making the suspend enabling a oneshot type of thing
> > and all sorts of weak arguments came spewing forth. I honestly couldn't
> > tell if I was reading valid input or fanboy BS.
> >
>
> Can you be more specific? If you are talking about only letting
> drivers abort suspend, not block it, then the main argument against
> that is that you are forcing user-space to poll until the driver stops
> aborting suspend (which according to people arguing against us using
> suspend would make the power-manager a "bad" process). Or are you
> talking about blocking the request from user-space until all other
> suspend-blockers have been released and then doing a single suspend
> cycle before returning. This would not be as bad, but it would force
> the user-space power manager to be multi-threaded since it now would
> have way to cancel the request. Either way, what problem are you
> trying to solve by making it a one-shot request?
>

I don't know exactly what Mark has in mind, but I would advocate 1-shot
simply because what we currently have (echo mem > /sys/power/state) is
1-shot and I don't believe you need to do more than fix the bugs in that.

Your question of whether to abort or block suspend in central I think - the
answer to that question will make or break a possible solution.

Simply aborting the suspend cannot work as you rightly say - the suspend
daemon would then spin until other user-space processes get into action.
Simply blocking while there are any unhandled 'wakeup events' - then aborting
if there were any - is how I think it should work. However as it
doesn't work that way now I don't think it is safe to make it work that way
unconditionally. If we did we could find that existing configurations always
block suspend indefinitely with would clearly be a regression.

I think we still need some sort of "suspend_prepare". This would have two
particular effects.
1/ it sets the start time for interpreting the word "were" above. i.e. the
suspend would abort of there were any unhandled wakeup events since the
"suspend_prepare" was issued.
2/ It would allow unhandled wakeup events to abort the suspend. If no
suspend_prepare had been issued, then only "new" wakeup events would
be allowed to abort the suspend (i.e. the old racy version of suspend).

So the suspend daemon does:

wait for there to be no user-space suspend blocks
issue suspend_prepare
check there are still no suspend blocks
if there are, loop (possibly issue suspend_abort if needed)
issue suspend request
loop

processes that handle wakeup events would

poll for event to be available
request suspend-block
consume event
release suspend-block
loop

(where consuming the event would quite possibly cause some other
suspend-block to become active - e.g. it might request that the display
be unlocked which would block suspends for a time).

NeilBrown
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: Rafael J. Wysocki on 30 May 2010 16:00

On Sunday 30 May 2010, Neil Brown wrote:
> On Fri, 28 May 2010 21:04:53 -0700
> Arve Hjønnevåg <arve(a)android.com> wrote:
>
> > On Fri, May 28, 2010 at 7:52 PM, mark gross <640e9920(a)gmail.com> wrote:
> > > On Thu, May 27, 2010 at 05:23:54PM +1000, Neil Brown wrote:
> > >> On Wed, 26 May 2010 14:20:51 +0100
> > >> Matthew Garrett <mjg59(a)srcf.ucam.org> wrote:
> > >>
> > >> > On Wed, May 26, 2010 at 02:57:45PM +0200, Peter Zijlstra wrote:
> > >> >
> > >> > > I fail to see why. In both cases the woken userspace will contact a
> > >> > > central governing task, either the kernel or the userspace suspend
> > >> > > manager, and inform it there is work to be done, and please don't
> > >> > > suspend now.
> > >> >
> > >> > Thinking about this, you're right - we don't have to wait, but that does
> > >> > result in another problem. Imagine we get two wakeup events
> > >> > approximately simultaneously. In the kernel-level universe the kernel
> > >> > knows when both have been handled. In the user-level universe, we may
> > >> > have one task schedule, bump the count, handle the event, drop the count
> > >> > and then we attempt a suspend again because the second event handler
> > >> > hasn't had an opportunity to run yet. We'll then attempt a suspend and
> > >> > immediately bounce back up. That's kind of wasteful, although it'd be
> > >> > somewhat mitigated by checking that right at the top of suspend entry
> > >> > and returning -EAGAIN or similar.
> > >> >
> > >>
> > >> (I'm coming a little late to this party, so excuse me if I say something that
> > >> has already been covered however...)
> > >>
> > >> The above triggers a sequence of thoughts which (When they settled down) look
> > >> a bit like this.
> > >>
> > >> At the hardware level, there is a thing that we could call a "suspend
> > >> blocker". It is an interrupt (presumably level-triggered) that causes the
> > >> processor to come out of suspend, or not to go into it.
> > >>
> > >> Maybe it makes sense to export a similar thing from the kernel to user-space.
> > >> When any event happens that would wake the device (and drivers need to know
> > >> about these already), it would present something to user-space to say that
> > >> the event happened.
> > >>
> > >> When user-space processes the event, it clears the event indicator.
> > >
> > > we did I proposed making the suspend enabling a oneshot type of thing
> > > and all sorts of weak arguments came spewing forth. I honestly couldn't
> > > tell if I was reading valid input or fanboy BS.
> > >
> >
> > Can you be more specific? If you are talking about only letting
> > drivers abort suspend, not block it, then the main argument against
> > that is that you are forcing user-space to poll until the driver stops
> > aborting suspend (which according to people arguing against us using
> > suspend would make the power-manager a "bad" process). Or are you
> > talking about blocking the request from user-space until all other
> > suspend-blockers have been released and then doing a single suspend
> > cycle before returning. This would not be as bad, but it would force
> > the user-space power manager to be multi-threaded since it now would
> > have way to cancel the request. Either way, what problem are you
> > trying to solve by making it a one-shot request?
> >
>
> I don't know exactly what Mark has in mind, but I would advocate 1-shot
> simply because what we currently have (echo mem > /sys/power/state) is
> 1-shot and I don't believe you need to do more than fix the bugs in that.
>
> Your question of whether to abort or block suspend in central I think - the
> answer to that question will make or break a possible solution.
>
> Simply aborting the suspend cannot work as you rightly say - the suspend
> daemon would then spin until other user-space processes get into action.
> Simply blocking while there are any unhandled 'wakeup events' - then aborting
> if there were any - is how I think it should work. However as it
> doesn't work that way now I don't think it is safe to make it work that way
> unconditionally. If we did we could find that existing configurations always
> block suspend indefinitely with would clearly be a regression.
>
> I think we still need some sort of "suspend_prepare". This would have two
> particular effects.
> 1/ it sets the start time for interpreting the word "were" above. i.e. the
> suspend would abort of there were any unhandled wakeup events since the
> "suspend_prepare" was issued.
> 2/ It would allow unhandled wakeup events to abort the suspend. If no
> suspend_prepare had been issued, then only "new" wakeup events would
> be allowed to abort the suspend (i.e. the old racy version of suspend).
>
> So the suspend daemon does:
>
> wait for there to be no user-space suspend blocks
> issue suspend_prepare
> check there are still no suspend blocks
> if there are, loop (possibly issue suspend_abort if needed)
> issue suspend request
> loop
>
> processes that handle wakeup events would
>
> poll for event to be available
> request suspend-block
> consume event
> release suspend-block
> loop
>
> (where consuming the event would quite possibly cause some other
> suspend-block to become active - e.g. it might request that the display
> be unlocked which would block suspends for a time).

Well, please have a look at the Alan Stern's proposal here:
http://lkml.org/lkml/2010/5/29/77

Rafael
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: mark gross on 30 May 2010 16:40

On Sun, May 30, 2010 at 06:08:46PM +1000, Neil Brown wrote:
> On Fri, 28 May 2010 21:04:53 -0700
> Arve Hj�nnev�g <arve(a)android.com> wrote:
>
> > On Fri, May 28, 2010 at 7:52 PM, mark gross <640e9920(a)gmail.com> wrote:
> > > On Thu, May 27, 2010 at 05:23:54PM +1000, Neil Brown wrote:
> > >> On Wed, 26 May 2010 14:20:51 +0100
> > >> Matthew Garrett <mjg59(a)srcf.ucam.org> wrote:
> > >>
> > >> > On Wed, May 26, 2010 at 02:57:45PM +0200, Peter Zijlstra wrote:
> > >> >
> > >> > > I fail to see why. In both cases the woken userspace will contact a
> > >> > > central governing task, either the kernel or the userspace suspend
> > >> > > manager, and inform it there is work to be done, and please don't
> > >> > > suspend now.
> > >> >
> > >> > Thinking about this, you're right - we don't have to wait, but that does
> > >> > result in another problem. Imagine we get two wakeup events
> > >> > approximately simultaneously. In the kernel-level universe the kernel
> > >> > knows when both have been handled. In the user-level universe, we may
> > >> > have one task schedule, bump the count, handle the event, drop the count
> > >> > and then we attempt a suspend again because the second event handler
> > >> > hasn't had an opportunity to run yet. We'll then attempt a suspend and
> > >> > immediately bounce back up. That's kind of wasteful, although it'd be
> > >> > somewhat mitigated by checking that right at the top of suspend entry
> > >> > and returning -EAGAIN or similar.
> > >> >
> > >>
> > >> (I'm coming a little late to this party, so excuse me if I say something that
> > >> has already been covered however...)
> > >>
> > >> The above triggers a sequence of thoughts which (When they settled down) look
> > >> a bit like this.
> > >>
> > >> At the hardware level, there is a thing that we could call a "suspend
> > >> blocker". �It is an interrupt (presumably level-triggered) that causes the
> > >> processor to come out of suspend, or not to go into it.
> > >>
> > >> Maybe it makes sense to export a similar thing from the kernel to user-space.
> > >> When any event happens that would wake the device (and drivers need to know
> > >> about these already), it would present something to user-space to say that
> > >> the event happened.
> > >>
> > >> When user-space processes the event, it clears the event indicator.
> > >
> > > we did I proposed making the suspend enabling a oneshot type of thing
> > > and all sorts of weak arguments came spewing forth. �I honestly couldn't
> > > tell if I was reading valid input or fanboy BS.
> > >
> >
> > Can you be more specific? If you are talking about only letting
> > drivers abort suspend, not block it, then the main argument against
> > that is that you are forcing user-space to poll until the driver stops
> > aborting suspend (which according to people arguing against us using
> > suspend would make the power-manager a "bad" process). Or are you
> > talking about blocking the request from user-space until all other
> > suspend-blockers have been released and then doing a single suspend
> > cycle before returning. This would not be as bad, but it would force
> > the user-space power manager to be multi-threaded since it now would
> > have way to cancel the request. Either way, what problem are you
> > trying to solve by making it a one-shot request?
> >

Sorry about missing Avr's email, I've been fighting with getting my
email forwarding working right.

The problems I want to solve with the one-shot styled interface are:

1) of having to sprinkle suspend blocking sections from isr up to
usermode and get them right.

2) provide a platform / architecture independent framework supporting
other low power modes.

I've just posted a patch that expresses what I was trying to express.
Its hard to take design over email without tossing code back and forth.
It's my turn to toss some code.

>
> Simply aborting the suspend cannot work as you rightly say - the suspend
> daemon would then spin until other user-space processes get into action.

Not if you have sensible event messaging to the user-space processes.
I've posted a patch that attempts to add some messaging.

> Simply blocking while there are any unhandled 'wakeup events' - then aborting
> if there were any - is how I think it should work. However as it
> doesn't work that way now I don't think it is safe to make it work that way
> unconditionally. If we did we could find that existing configurations always
> block suspend indefinitely with would clearly be a regression.
>
> I think we still need some sort of "suspend_prepare". This would have two
> particular effects.
> 1/ it sets the start time for interpreting the word "were" above. i.e. the
> suspend would abort of there were any unhandled wakeup events since the
> "suspend_prepare" was issued.
> 2/ It would allow unhandled wakeup events to abort the suspend. If no
> suspend_prepare had been issued, then only "new" wakeup events would
> be allowed to abort the suspend (i.e. the old racy version of suspend).
>
> So the suspend daemon does:
>
> wait for there to be no user-space suspend blocks
> issue suspend_prepare
> check there are still no suspend blocks
> if there are, loop (possibly issue suspend_abort if needed)
> issue suspend request
> loop
>
> processes that handle wakeup events would
>
> poll for event to be available
> request suspend-block
> consume event
> release suspend-block
> loop

In my mind I'm thinking of something like:
attempt low power entry (suspend)
if blocked, get -EBUSY back
select on block_file until block is released
retry suspend
else // entered low power and exited
read wake_event
do something with event; typically wait a bit and loop.

--mgross

>
> (where consuming the event would quite possibly cause some other
> suspend-block to become active - e.g. it might request that the display
> be unlocked which would block suspends for a time).
>
> NeilBrown
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: Neil Brown on 30 May 2010 19:10

On Sun, 30 May 2010 21:52:14 +0200
"Rafael J. Wysocki" <rjw(a)sisk.pl> wrote:

> On Sunday 30 May 2010, Neil Brown wrote:
> > On Fri, 28 May 2010 21:04:53 -0700
> > Arve Hjønnevåg <arve(a)android.com> wrote:
> >
> > > On Fri, May 28, 2010 at 7:52 PM, mark gross <640e9920(a)gmail.com> wrote:
> > > > On Thu, May 27, 2010 at 05:23:54PM +1000, Neil Brown wrote:
> > > >> On Wed, 26 May 2010 14:20:51 +0100
> > > >> Matthew Garrett <mjg59(a)srcf.ucam.org> wrote:
> > > >>
> > > >> > On Wed, May 26, 2010 at 02:57:45PM +0200, Peter Zijlstra wrote:
> > > >> >
> > > >> > > I fail to see why. In both cases the woken userspace will contact a
> > > >> > > central governing task, either the kernel or the userspace suspend
> > > >> > > manager, and inform it there is work to be done, and please don't
> > > >> > > suspend now.
> > > >> >
> > > >> > Thinking about this, you're right - we don't have to wait, but that does
> > > >> > result in another problem. Imagine we get two wakeup events
> > > >> > approximately simultaneously. In the kernel-level universe the kernel
> > > >> > knows when both have been handled. In the user-level universe, we may
> > > >> > have one task schedule, bump the count, handle the event, drop the count
> > > >> > and then we attempt a suspend again because the second event handler
> > > >> > hasn't had an opportunity to run yet. We'll then attempt a suspend and
> > > >> > immediately bounce back up. That's kind of wasteful, although it'd be
> > > >> > somewhat mitigated by checking that right at the top of suspend entry
> > > >> > and returning -EAGAIN or similar.
> > > >> >
> > > >>
> > > >> (I'm coming a little late to this party, so excuse me if I say something that
> > > >> has already been covered however...)
> > > >>
> > > >> The above triggers a sequence of thoughts which (When they settled down) look
> > > >> a bit like this.
> > > >>
> > > >> At the hardware level, there is a thing that we could call a "suspend
> > > >> blocker". It is an interrupt (presumably level-triggered) that causes the
> > > >> processor to come out of suspend, or not to go into it.
> > > >>
> > > >> Maybe it makes sense to export a similar thing from the kernel to user-space.
> > > >> When any event happens that would wake the device (and drivers need to know
> > > >> about these already), it would present something to user-space to say that
> > > >> the event happened.
> > > >>
> > > >> When user-space processes the event, it clears the event indicator.
> > > >
> > > > we did I proposed making the suspend enabling a oneshot type of thing
> > > > and all sorts of weak arguments came spewing forth. I honestly couldn't
> > > > tell if I was reading valid input or fanboy BS.
> > > >
> > >
> > > Can you be more specific? If you are talking about only letting
> > > drivers abort suspend, not block it, then the main argument against
> > > that is that you are forcing user-space to poll until the driver stops
> > > aborting suspend (which according to people arguing against us using
> > > suspend would make the power-manager a "bad" process). Or are you
> > > talking about blocking the request from user-space until all other
> > > suspend-blockers have been released and then doing a single suspend
> > > cycle before returning. This would not be as bad, but it would force
> > > the user-space power manager to be multi-threaded since it now would
> > > have way to cancel the request. Either way, what problem are you
> > > trying to solve by making it a one-shot request?
> > >
> >
> > I don't know exactly what Mark has in mind, but I would advocate 1-shot
> > simply because what we currently have (echo mem > /sys/power/state) is
> > 1-shot and I don't believe you need to do more than fix the bugs in that.
> >
> > Your question of whether to abort or block suspend in central I think - the
> > answer to that question will make or break a possible solution.
> >
> > Simply aborting the suspend cannot work as you rightly say - the suspend
> > daemon would then spin until other user-space processes get into action.
> > Simply blocking while there are any unhandled 'wakeup events' - then aborting
> > if there were any - is how I think it should work. However as it
> > doesn't work that way now I don't think it is safe to make it work that way
> > unconditionally. If we did we could find that existing configurations always
> > block suspend indefinitely with would clearly be a regression.
> >
> > I think we still need some sort of "suspend_prepare". This would have two
> > particular effects.
> > 1/ it sets the start time for interpreting the word "were" above. i.e. the
> > suspend would abort of there were any unhandled wakeup events since the
> > "suspend_prepare" was issued.
> > 2/ It would allow unhandled wakeup events to abort the suspend. If no
> > suspend_prepare had been issued, then only "new" wakeup events would
> > be allowed to abort the suspend (i.e. the old racy version of suspend).
> >
> > So the suspend daemon does:
> >
> > wait for there to be no user-space suspend blocks
> > issue suspend_prepare
> > check there are still no suspend blocks
> > if there are, loop (possibly issue suspend_abort if needed)
> > issue suspend request
> > loop
> >
> > processes that handle wakeup events would
> >
> > poll for event to be available
> > request suspend-block
> > consume event
> > release suspend-block
> > loop
> >
> > (where consuming the event would quite possibly cause some other
> > suspend-block to become active - e.g. it might request that the display
> > be unlocked which would block suspends for a time).
>
> Well, please have a look at the Alan Stern's proposal here:
> http://lkml.org/lkml/2010/5/29/77
>

Thanks for the reference.
Some of the details are different, but the idea seems almost exactly the same
as mine.
The apparent dependence on signals makes me feel a little uncomfortable
(interfaces that depend on using signals seem to be easy to use wrongly), but
I don't think that is a serious flaw.

Maybe the biggest difference is philosophical. Alan's proposal appears to be
adding a new feature. Mine presents as trying to fix an existing feature so
that it can be used reliably.
It is easier to argue against a feature than against a bug-fix (??).

Thanks,
NeilBrown
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: Neil Brown on 31 May 2010 02:00

On Mon, 31 May 2010 09:03:29 +1000
Neil Brown <neilb(a)suse.de> wrote:

> > Well, please have a look at the Alan Stern's proposal here:
> > http://lkml.org/lkml/2010/5/29/77
> >
>
> Thanks for the reference.
> Some of the details are different, but the idea seems almost exactly the same
> as mine.
> The apparent dependence on signals makes me feel a little uncomfortable
> (interfaces that depend on using signals seem to be easy to use wrongly), but
> I don't think that is a serious flaw.
>
> Maybe the biggest difference is philosophical. Alan's proposal appears to be
> adding a new feature. Mine presents as trying to fix an existing feature so
> that it can be used reliably.
> It is easier to argue against a feature than against a bug-fix (??).

Addendum...

The use of signals by Alan here got me thinking.. and combining it which what
Alan Cox and Thomas Gleixner (and others?) have been saying about detecting
idleness led me down an interesting path that I thought I would share.

An absolutely minimal change to allow a user-space initiated auto-suspend to
be race-free would be achieved by placing an interruptible wait between
pm_notifier_call_chain(PM_SUSPEND_PREPARE);
and
suspend_freeze_processes();

which waits for all kernel threads to have passed through an 'idle' state.

Then any wake event that happened after the PM_SUSPEND_PREPARE would
presumably either abort the suspend later, or would cause an immediate
wake-up.
Any event that happened before the PM_SUSPEND_PREPARE would have been fully
processed by any kernel thread and so would be queued for user-space (or have
been discarded).

If we then require any process that handles wakeup events to use the
equivalent of fcntl(F_SETOWN) to direct a SIGIO (or other signal) at the
process which requests the suspend, then wakeup events won't be lost.
They will either send a signal to the suspend-daemon in time to abort the
suspend, or will be late enough to either abort the suspend in-kernel, or
trigger an immediate wakeup.

Yes: this would require quite a large change of the android code, and the
F_SETOWN usage is fairly ugly, but balanced against it actually working and
it being minimal, that might be OK.

The hardest part would be waiting for all kernel-threads to pass through idle
(or all threads to pass through idle or user-space). We'd need help to make
that work, but at least it would be a fairly localised change.

Maybe it is something to keep in mind when other possibilities have proven
intractable.

NeilBrown
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

First | Prev | Next | Last
Pages: 1 2 3
Prev: [PATCH 4/8] PM: suspend_block: Add debugfs file
Next: [PATCH] pcm551.c: use pci_ids.h defines