From: Gross, Mark on


>-----Original Message-----
>From: James Bottomley [mailto:James.Bottomley(a)suse.de]
>Sent: Thursday, June 03, 2010 6:25 AM
>To: Alan Cox
>Cc: Gross, Mark; Florian Mickler; Arve Hjønnevåg; Neil Brown;
>tytso(a)mit.edu; Peter Zijlstra; LKML; Thomas Gleixner; Linux OMAP Mailing
>List; Linux PM; felipe.balbi(a)nokia.com
>Subject: Re: [linux-pm] [PATCH 0/8] Suspend block api (version 8)
>
>On Thu, 2010-06-03 at 11:03 +0100, Alan Cox wrote:
>> > [mtg: ] This has been a pain point for the PM_QOS implementation. They
>change the constraint back and forth at the transaction level of the i2c
>driver. The pm_qos code really wasn't made to deal with such hot path use,
>as each such change triggers a re-computation of what the aggregate qos
>request is.
>>
>> That should be trivial in the usual case because 99% of the time you can
>> hot path
>>
>> the QoS entry changing is the latest one
>> there have been no other changes
>> If it is valid I can use the cached previous aggregate I cunningly
>> saved in the top QoS entry when I computed the new one
>>
>> (ie most of the time from the kernel side you have a QoS stack)
>
>It's not just the list based computation: that's trivial to fix, as you
>say ... the other problem is the notifier chain, because that's blocking
>and could be long. Could we invoke the notifier through a workqueue?
>It doesn't seem to have veto power, so it's pure notification, does it
>matter if the notice is delayed (as long as it's in order)?
[mtg: ] True. The notifications could be done as a scheduled work item
in most cases. I think there is only one user of the notification so far
anyway. Most pm_qos users just poll the current value of whatever parameter they are interested in.

--mgross

>
>> > We've had a number of attempts at fixing this, but I think the proper
>fix is to bolt a "disable C-states > x" interface into cpu_idle that
>bypasses pm_qos altogether. Or, perhaps add a new pm_qos API that does the
>equivalent operation, overriding whatever constraint is active.
>>
>> We need some of this anyway for deep power saving because there is
>> hardware which can't wake from some states, which in turn means if that
>> device is active we need to be above the state in question.
>
>James
>

From: Florian Mickler on
On Thu, 3 Jun 2010 06:24:49 -0700
mark gross <640e9920(a)gmail.com> wrote:

> On Thu, Jun 03, 2010 at 12:10:03AM -0700, Arve Hjønnevåg wrote:

> OK, I'm not getting it.
> Is this a fancy comp-sci algorithm I should know about?
>
> --mgross

I suppose having studied fancy comp-sci helps here. Here is an example:

Say you have four constraints:
qos1 with a value of 10
qos2 with 5
qos3 with 10
qos4 with 11

Now, you hash that list by the qos-values:

11 ---- 10 ----- 5
 |       |       |
qos4    qos3    qos2
         |
        qos1


To compute the maximum you only have to walk the short "----" list (or just take its head, since it is kept sorted).

To reduce qos4 from 11 to 5 you remove it from its "|" list and
prepend it to the "|" list under the value 5 (4 pointer adjustments
plus searching the "----" list for the right place to insert). Since
qos4 was the only entry for 11, that value drops off the "----" list
entirely.

result:

10 ---- 5
 |      |
qos3   qos4
 |      |
qos1   qos2
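
For concreteness, here is a small self-contained C sketch of the same two-level structure (plain C with illustrative names, not the kernel's list helpers or the actual pm_qos code): each unique value gets one bucket on a descending-sorted primary list, requests with that value hang off the bucket, so reading the aggregate is O(1) and an update only walks the short list of unique values.

#include <stdio.h>
#include <stdlib.h>

struct qos_request;

/* One bucket per unique constraint value; buckets are kept sorted descending. */
struct qos_bucket {
        int value;
        struct qos_bucket *next;        /* the "----" list of unique values    */
        struct qos_request *reqs;       /* the "|" list of requests with value */
};

struct qos_request {
        const char *name;
        struct qos_bucket *bucket;      /* NULL while inactive */
        struct qos_request *next;
};

static struct qos_bucket *buckets;      /* head of the "----" list */

/* The aggregate (maximum) is simply the value of the first bucket. */
static int qos_max(void)
{
        return buckets ? buckets->value : 0;
}

/* Find or create the bucket for value, keeping the "----" list sorted. */
static struct qos_bucket *get_bucket(int value)
{
        struct qos_bucket **pp = &buckets, *b;

        while (*pp && (*pp)->value > value)
                pp = &(*pp)->next;
        if (*pp && (*pp)->value == value)
                return *pp;

        b = calloc(1, sizeof(*b));
        b->value = value;
        b->next = *pp;
        *pp = b;
        return b;
}

/* Take req off its "|" list; drop the bucket if req was its last member. */
static void unlink_request(struct qos_request *req)
{
        struct qos_bucket *b = req->bucket;
        struct qos_request **pp = &b->reqs;

        while (*pp != req)
                pp = &(*pp)->next;
        *pp = req->next;

        if (!b->reqs) {
                struct qos_bucket **bp = &buckets;

                while (*bp != b)
                        bp = &(*bp)->next;
                *bp = b->next;
                free(b);
        }
}

/* Add or move a request; only the short list of unique values is walked. */
static void update_request(struct qos_request *req, int value)
{
        struct qos_bucket *b;

        if (req->bucket)
                unlink_request(req);
        b = get_bucket(value);
        req->next = b->reqs;            /* prepend to the "|" list */
        b->reqs = req;
        req->bucket = b;
}

int main(void)
{
        struct qos_request q1 = { "qos1" }, q2 = { "qos2" };
        struct qos_request q3 = { "qos3" }, q4 = { "qos4" };

        update_request(&q1, 10);
        update_request(&q2, 5);
        update_request(&q3, 10);
        update_request(&q4, 11);
        printf("max = %d\n", qos_max());        /* 11 */

        update_request(&q4, 5);                 /* the move from the example */
        printf("max = %d\n", qos_max());        /* 10 */
        return 0;
}

Running it prints max = 11 and then max = 10, matching the two diagrams above.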

Cheers,
Flo
From: James Bottomley on
On Thu, 2010-06-03 at 00:10 -0700, Arve Hjønnevåg wrote:
> On Wed, Jun 2, 2010 at 10:40 PM, mark gross <640e9920(a)gmail.com> wrote:
> > On Wed, Jun 02, 2010 at 09:54:15PM -0700, Brian Swetland wrote:
> >> On Wed, Jun 2, 2010 at 8:18 PM, mark gross <640e9920(a)gmail.com> wrote:
> >> > On Wed, Jun 02, 2010 at 02:58:30PM -0700, Arve Hjønnevåg wrote:
> >> >>
> >> >> The list is not short. You have all the inactive and active
> >> >> constraints on the same list. If you change it to a two level list
> >> >> though, the list of unique values (which is the list you have to walk)
> >> >> may be short enough for a tree to be overkill.
> >> >
> >> > what have you seen in practice from the wake-lock stats?
> >> >
> >> > I'm having a hard time seeing where you could get more than just a
> >> > handful. However, one could go to a dual list (like the scheduler) and
> >> > move inactive nodes from an active to an inactive list, or we could simply
> >> > remove them from the list upon inactivity, which would work well
> >> > after I change the api to have the client allocate the memory for the
> >> > nodes... BUT, if you're moving things in and out of a list a lot, I'm not
> >> > sure where the break-even point is at which changing the structure helps.
> >> >
> >> > We'll need to try it.
> >> >
> >> > I think we will almost never see more than 10 list elements.
> >> >
> >> > --mgross
> >> >
> >> >
> >>
> >> I see about 80 (based on the batteryinfo dump) on my Nexus One
> >> (QSD8250, Android Froyo):
> >
> > shucks.
> >
> > well I think for a pm_qos class that has boolean dynamic range we can
> > get away with not walking the list on every request update. we can use
> > a counter, and the list will be mostly for stats.
> >
>
> Did you give any thought to my suggestion to only use one entry per
> unique value on the first-level list and then use secondary lists of
> identical values? That way, if you only have two constraint values, the
> list you have to walk when updating a request will never have more
> than two entries regardless of how many total requests you have.
>
> A request update then becomes something like this:
> if on primary list {
>         unlink from primary list
>         if secondary list is not empty
>                 get next secondary entry and add in same spot on primary list
> }
> unlink from secondary list
> find new spot on primary list
> if already there
>         add to secondary list
> else
>         add to primary list

This is just reinventing hash-bucketed lists. To get the benefits, all
we do is implement an N-state constraint as backed by an N-bucket hash
list, which the kernel already has all the internal mechanics for.
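
As a concrete illustration of that degenerate case, here is a hedged sketch where an N-state constraint is just N counters, one per bucket; the names qos_class, qos_update and qos_target are invented for this example and are not an existing kernel API:

#define QOS_NR_STATES   4       /* illustrative: a constraint with 4 discrete states */

/*
 * One counter per state.  By convention in this sketch a higher index is a
 * more restrictive request; the aggregate is the highest non-empty bucket.
 */
struct qos_class {
        unsigned int count[QOS_NR_STATES];
};

/* Move one request between states (use -1 for "inactive"); O(1), no list walk. */
static void qos_update(struct qos_class *c, int old_state, int new_state)
{
        if (old_state >= 0)
                c->count[old_state]--;
        if (new_state >= 0)
                c->count[new_state]++;
}

/* Aggregate target: the most restrictive state with an active request. */
static int qos_target(const struct qos_class *c)
{
        int s;

        for (s = QOS_NR_STATES - 1; s >= 0; s--)
                if (c->count[s])
                        return s;
        return -1;              /* no active requests */
}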

James


From: Thomas Gleixner on
On Thu, 3 Jun 2010, James Bottomley wrote:

> On Thu, 2010-06-03 at 11:03 +0100, Alan Cox wrote:
> > > [mtg: ] This has been a pain point for the PM_QOS implementation. They change the constraint back and forth at the transaction level of the i2c driver. The pm_qos code really wasn't made to deal with such hot path use, as each such change triggers a re-computation of what the aggregate qos request is.
> >
> > That should be trivial in the usual case because 99% of the time you can
> > hot path
> >
> > the QoS entry changing is the latest one
> > there have been no other changes
> > If it is valid I can use the cached previous aggregate I cunningly
> > saved in the top QoS entry when I computed the new one
> >
> > (ie most of the time from the kernel side you have a QoS stack)
>
> It's not just the list based computation: that's trivial to fix, as you
> say ... the other problem is the notifier chain, because that's blocking
> and could be long. Could we invoke the notifier through a workqueue?
> It doesn't seem to have veto power, so it's pure notification, does it
> matter if the notice is delayed (as long as it's in order)?

It depends on the information type, and for a lot of things we might
get away without notifiers.

The only real issue is when you need to get other cores out of their
deep idle state to make a new constraint work. That's what we do with
the DMA latency notifier right now.

Thanks,

tglx
From: Kevin Hilman on
Peter Zijlstra <peterz(a)infradead.org> writes:

> On Thu, 2010-06-03 at 11:03 +0100, Alan Cox wrote:
>> > [mtg: ] This has been a pain point for the PM_QOS implementation.
>> They change the constraint back and forth at the transaction level of
>> the i2c driver. The pm_qos code really wasn't made to deal with such
>> hot path use, as each such change triggers a re-computation of what
>> the aggregate qos request is.
>>
>> That should be trivial in the usual case because 99% of the time you can
>> hot path
>>
>> the QoS entry changing is the latest one
>> there have been no other changes
>> If it is valid I can use the cached previous aggregate I cunningly
>> saved in the top QoS entry when I computed the new one
>>
>> (ie most of the time from the kernel side you have a QoS stack)
>
> Why would the kernel change the QoS state of a task? Why not have two
> interacting QoS variables, one for the task, one for the subsystem in
> question, and the action depends on their relative value?

Yes, having a QoS parameter per subsystem (or even per device) is very
important for SoCs that have independently controlled power domains.
If all devices/subsystems in a particular power domain have QoS
parameters that permit it, the power state of that power domain can be
lowered independently of the system-wide power state and the power
states of other power domains.
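
A minimal sketch of how that per-device aggregation might look, purely illustrative (the pm_dev structure and powerdomain_allowed_latency() are invented here, not an existing kernel interface): each device publishes the wakeup latency it can tolerate, and the domain takes the strictest of those.

#include <limits.h>
#include <stddef.h>

/* Illustrative per-device constraint: the wakeup latency this device tolerates. */
struct pm_dev {
        const char *name;
        int max_latency_us;             /* INT_MAX means "don't care" */
};

/* The domain may only enter low-power states it can leave within this limit. */
static int powerdomain_allowed_latency(const struct pm_dev *devs, size_t n)
{
        int limit = INT_MAX;
        size_t i;

        for (i = 0; i < n; i++)
                if (devs[i].max_latency_us < limit)
                        limit = devs[i].max_latency_us;
        return limit;
}

With that limit in hand, a cpuidle-style governor for the domain would skip any state whose exit latency exceeds it, regardless of what the rest of the system is doing.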

Kevin