[PATCH 0/8] Suspend block api (version 8) [Kernel]

Prev: [ANN] Linux Security Summit 2010 - Announcement and CFP
Next: [PATCH 4/8] PM: suspend_block: Add debugfs file

From: mark gross on 1 Jun 2010 22:50

On Tue, Jun 01, 2010 at 04:01:25PM -0500, James Bottomley wrote:
> On Tue, 2010-06-01 at 14:51 +0100, Matthew Garrett wrote:
> > On Mon, May 31, 2010 at 04:21:09PM -0500, James Bottomley wrote:
> >
> > > You're the one mentioning x86, not me. I already explained that some
> > > MSM hardware (the G1 for example) has lower power consumption in S3
> > > (which I'm using as an ACPI shorthand for suspend to ram) than any
> > > suspend from idle C state. The fact that current x86 hardware has the
> > > same problem may be true, but it's not entirely relevant.
> >
> > As long as you can set a wakeup timer, an S state is just a C state with
> > side effects. The significant one is that entering an S state stops the
> > process scheduler and any in-kernel timers. I don't think Google care at
> > all about whether suspend is entered through an explicit transition or
> > something hooked into cpuidle - the relevant issue is that they want to
> > be able to express a set of constraints that lets them control whether
> > or not the scheduler keeps on scheduling, and which doesn't let them
> > lose wakeup events in the process.
>
> Exactly, so my understanding of where we currently are is:
>
> 1. pm_qos will be updated to be able to express the android suspend
> blockers as interactivity constraints (exact name TBD, but
> probably /dev/cpu_interactivity)
> 2. pm_qos will be updated to be callable from atomic context
> 3. pm_qos will be updated to export statistics initially closely
> matching what suspend blockers provides (simple update of the rw
> interface?)
>
> After this is done, the current android suspend block patch becomes a
> re-expression in kernel space in terms of pm_qos, with the current
> userspace wakelocks being adapted by the android framework into pm_qos
> requirements expressed to /dev/cpu_interactivity (or whatever name is
> chosen). Then opportunistic suspend is either a small add-on kernel
> patch they have in their tree to suspend when the interactivity
> constraint goes to NONE, or it could be done entirely by a userspace
> process. Long term this could migrate to the freezer and suspend from
> idle approach as the various problem timers get fixed.

This is all nice but, all this does is implement the exact same thing as
the wake lock / suspend blocker API as a pm_qos request-class. It
leaves the overlapping constraint issue from ISR to user mode in place
depending on exactly how the oppertunistic suspend is implemented.

I expect it will be via a notifier on the pm_qos request-class update
that would do exactly what the wake lock code does today. just load up
an a "suspend_on_non_interactivity" driver that registers for the call
back, have it enabled by the user mode PM, and you have the equivelent
architecture as what was proposed by the wake lock patches.

it gives the Android guys what they want, without adding a new
subsystem, minimizing the changes and makes most of the architecture
much more politicaly acceptible.

But doesn't it have the same issues with getting the overlapping
constraints right from wake up source to user mode and dealing with the
wake up envents in a sane way? Instead of sprinkling suspend-blockers
about the kernel we'll sprinkle pm_qos_requests about. I like getting
more users of pm_qos, but isn't this the same thing?

>
> I think the big unresolved issue is the stats extension. For android,
> we need just a name written along with the value, so we have something
> to hang the stats off ... current pm_qos userspace users just write a
> value, so the name would be optional. From the kernel, we probably just
> need an additional API that takes a stats name or NULL if none
> (pm_qos_add_request_named()?). Then reading the stats could be done by
> implementing a fops read routine on the misc device.

I don't think the status would be a big deal to add.

However; I am really burned out by this discussion. I am willing to
stub this out ASAP if it puts this behind us if the principles in the
discussion are in more or less agreement.

--mgross

For the record, I still like my low power event idea, which could
coexist with the above.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: Gross, Mark on 1 Jun 2010 22:50

>-----Original Message-----
>From: Arve Hj�nnev�g [mailto:arve(a)android.com]
>Sent: Tuesday, June 01, 2010 6:11 PM
>To: James Bottomley
>Cc: Rafael J. Wysocki; Matthew Garrett; Thomas Gleixner; Peter Zijlstra;
>tytso(a)mit.edu; LKML; Florian Mickler; Linux PM; Linux OMAP Mailing List;
>felipe.balbi(a)nokia.com; Alan Cox; Alan Stern; Gross, Mark; Neil Brown
>Subject: Re: [linux-pm] [PATCH 0/8] Suspend block api (version 8)
>
>On Tue, Jun 1, 2010 at 3:36 PM, James Bottomley <James.Bottomley(a)suse.de>
>wrote:
>> On Wed, 2010-06-02 at 00:24 +0200, Rafael J. Wysocki wrote:
>>> On Tuesday 01 June 2010, James Bottomley wrote:
>>> > On Tue, 2010-06-01 at 14:51 +0100, Matthew Garrett wrote:
>>> > > On Mon, May 31, 2010 at 04:21:09PM -0500, James Bottomley wrote:
>>> > >
>>> > > > You're the one mentioning x86, not me. �I already explained that
>some
>>> > > > MSM hardware (the G1 for example) has lower power consumption in
>S3
>>> > > > (which I'm using as an ACPI shorthand for suspend to ram) than any
>>> > > > suspend from idle C state. �The fact that current x86 hardware has
>the
>>> > > > same problem may be true, but it's not entirely relevant.
>>> > >
>>> > > As long as you can set a wakeup timer, an S state is just a C state
>with
>>> > > side effects. The significant one is that entering an S state stops
>the
>>> > > process scheduler and any in-kernel timers. I don't think Google
>care at
>>> > > all about whether suspend is entered through an explicit transition
>or
>>> > > something hooked into cpuidle - the relevant issue is that they want
>to
>>> > > be able to express a set of constraints that lets them control
>whether
>>> > > or not the scheduler keeps on scheduling, and which doesn't let them
>>> > > lose wakeup events in the process.
>>> >
>>> > Exactly, so my understanding of where we currently are is:
>>>
>>> Thanks for the recap.
>>>
>>> > � � �1. pm_qos will be updated to be able to express the android
>suspend
>>> > � � � � blockers as interactivity constraints (exact name TBD, but
>>> > � � � � probably /dev/cpu_interactivity)
>>>
>>> I think that's not been decided yet precisely enough. �I saw a few ideas
>>> here and there in the thread, but which of them are we going to follow?
>>
>> Well, android only needs two states (block and don't block), so that
>> gets translated as 2 s32 values (say 0 and INT_MAX). �I've seen defines
>> like QOS_INTERACTIVE and QOS_NONE (or QOS_DRECKLY or QOS_MANANA) to
>> describe these, but if all we're arguing over is the define name, that's
>> progress.
>
>I think we need separate state constraints for suspend and idle low
>power modes. On the msm platform only a subset of the interrupts can
>wake up from the low power mode, so we block the use if the low power
>mode from idle while other interrupts are enabled. We do not block
>suspend however if those interrupts are not marked as wakeup
>interrupts. Most constraints that prevent suspend are not hardware
>specific and should not prevent entering low power modes from idle. In
>other words we may need to prevent low power idle modes while allowing
>suspend, and we may need to prevent suspend while allowing low power
>idle modes.
>
>It would also be good to not have an implementation that gets slower
>and slower the more clients you have. With binary constraints this is
>trivial.
[mtg: ] agreed.

>
>>
>> The other piece they need is the suspend block name, which comes with
>> the stats API, and finally we need to decide what the actual constraint
>> is called (which is how the dev node gets its name) ...
>>
>>> > � � �2. pm_qos will be updated to be callable from atomic context
>>> > � � �3. pm_qos will be updated to export statistics initially closely
>>> > � � � � matching what suspend blockers provides (simple update of the
>rw
>>> > � � � � interface?)
>
>4. It would be useful to change pm_qos_add_request to not allocate
>anything so can add constraints from init functions that currently
>cannot fail.
[mtg: ] I'm not sure how to do this but I agree it would be good. I guess we could have a block of pm_qos requests pre-allocated statically and re-use them. In practice there will not be more than a handful of requests ever. Dynamic allocation does seem like a bit of a waste.

>
>>> >
>>> > After this is done, the current android suspend block patch becomes a
>>> > re-expression in kernel space in terms of pm_qos, with the current
>>> > userspace wakelocks being adapted by the android framework into pm_qos
>>> > requirements expressed to /dev/cpu_interactivity (or whatever name is
>>> > chosen). �Then opportunistic suspend is either a small add-on kernel
>>> > patch they have in their tree to suspend when the interactivity
>>> > constraint goes to NONE, or it could be done entirely by a userspace
>>> > process. �Long term this could migrate to the freezer and suspend from
>>> > idle approach as the various problem timers get fixed.
>>> >
>>> > I think the big unresolved issue is the stats extension. �For android,
>>> > we need just a name written along with the value, so we have something
>>> > to hang the stats off ... current pm_qos userspace users just write a
>>> > value, so the name would be optional. �From the kernel, we probably
>just
>>> > need an additional API that takes a stats name or NULL if none
>>> > (pm_qos_add_request_named()?). �Then reading the stats could be done
>by
>>> > implementing a fops read routine on the misc device.
>>>
>>> Is the original idea of having that information in debugfs
>objectionable?
>>
>> Well ... debugfs is usually used to get around the sysfs rules. �In this
>> case, pm_qos has a dev interface ... I don't specifically object to
>> using debugfs, but I don't see any reason to forbid it from being a
>> simple dev read interface either.
>>
>
>We don't currently have a dev interface for stats so this is not an
>immediate requirement. The suspend blocker debugfs interface is just
>as good as the proc interface we have for wakelocks.
>
>--
>Arve Hj�nnev�g
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: mark gross on 1 Jun 2010 23:00

On Tue, Jun 01, 2010 at 09:07:37AM +0200, Florian Mickler wrote:
> On Mon, 31 May 2010 16:26:17 -0700
> mark gross <640e9920(a)gmail.com> wrote:
>
> > On Mon, May 31, 2010 at 11:38:55PM +0200, Rafael J. Wysocki wrote:
> > > On Monday 31 May 2010, Arve Hj�nnev�g wrote:
> > > > 2010/5/29 Alan Stern <stern(a)rowland.harvard.edu>:
> > > > > On Sat, 29 May 2010, Arve Hj�nnev�g wrote:
> > > > >
> > > > >> > In place of in-kernel suspend blockers, there will be a new type of QoS
> > > > >> > constraint -- call it QOS_EVENTUALLY. It's a very weak constraint,
> > > > >> > compatible with all cpuidle modes in which runnable threads are allowed
> > > > >> > to run (which is all of them), but not compatible with suspend.
> > > > >> >
> > > > >> This sound just like another API rename. It will work, but given that
> > > > >> suspend blockers was the name least objectionable last time around,
> > > > >> I'm not sure what this would solve.
> > > > >
> > > > > It's not just a rename. By changing this into a QoS constraint, we
> > > > > make it more generally useful. Instead of standing on its own, it
> > > > > becomes part of the PM-QOS framework.
> > > > >
> > > >
> > > > We cannot use the existing pm-qos framework. It is not safe to call
> > > > from atomic context.
> > >
> > > We've just merged a patch that fixed that if I'm not mistaken. Mark, did your
> > > PM QoS update fix that?
> > >
> >
> > I'm pretty sure it can be called in atomic context, and if its not I'm
> > sure we can fix that. It can be called in atomic context. I don't
> > think it was ever a problem to call it in atomic context. The problem it
> > had was that crappy list of string compares. Thats been fixed.
> >
> > --mgross
> >
>
> Well, the register call uses kzalloc. Apart from that I
> think we're good.

registering shouldn't need to be called in atomic context. Its the
update_request that needs to be callible form atomic context.

--mgross

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: Arve Hjønnevåg on 1 Jun 2010 23:20

2010/6/1 Gross, Mark <mark.gross(a)intel.com>:
....
>>4. It would be useful to change pm_qos_add_request to not allocate
>>anything so can add constraints from init functions that currently
>>cannot fail.
> [mtg: ] I'm not sure how to do this but I agree it would be good. �I guess we could have a block of pm_qos requests pre-allocated statically and re-use them. �In practice there will not be more than a handful of requests ever. �Dynamic allocation does seem like a bit of a waste.

The calling code will have to store a pointer to your structure
anyway, you may as well have them provide the whole structure.

--
Arve Hj�nnev�g
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: Arve Hjønnevåg on 1 Jun 2010 23:30

2010/6/1 Thomas Gleixner <tglx(a)linutronix.de>:
> On Mon, 31 May 2010, Arve Hj�nnev�g wrote:
>
>> 2010/5/31 Rafael J. Wysocki <rjw(a)sisk.pl>:
>> > On Monday 31 May 2010, Arve Hj�nnev�g wrote:
>> >> 2010/5/30 Rafael J. Wysocki <rjw(a)sisk.pl>:
>> > ...
>> >>
>> >> I think it makes more sense to block suspend while wakeup events are
>> >> pending than blocking it everywhere timers are used by code that could
>> >> be called while handling wakeup events or other critical work. Also,
>> >> even if you did block suspend everywhere timers where used you still
>> >> have the race where a wakeup interrupt happens right after you decided
>> >> to suspend. In other words, you still need to block suspend in all the
>> >> same places as with the current opportunistic suspend code, so what is
>> >> the benefit of delaying suspend until idle?
>> >
>> > Assume for a while that you don't use suspend blockers, OK? �I realize you
>> > think that anything else doesn't make sense, but evidently some other people
>> > have that opinion about suspend blockers.
>> >
>>
>> It sounded like you were suggesting that initiating suspend from idle
>> would somehow avoid the race condition with wakeup events. All I'm
>> saying is that you would need to block suspend in all the same places.
>> If you don't care about ignoring wakeup events, then sure you can
>> initiate suspend from idle.
>
> And why should you miss a wakeup there ? If you get an interrupt in
> the transition, then you are not longer idle.
>

Because suspend itself causes you to not be idle you cannot abort
suspend just because you are not idle anymore.

--
Arve Hj�nnev�g
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

First | Prev | Next | Last
Pages: 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77
Prev: [ANN] Linux Security Summit 2010 - Announcement and CFP
Next: [PATCH 4/8] PM: suspend_block: Add debugfs file