From: James Bottomley on
On Thu, 2010-06-03 at 11:03 +0100, Alan Cox wrote:
> > [mtg: ] This has been a pain point for the PM_QOS implementation.
> > They change the constraint back and forth at the transaction level
> > of the i2c driver. The pm_qos code really wasn't made to deal with
> > such hot path use, as each such change triggers a re-computation of
> > what the aggregate qos request is.
>
> That should be trivial in the usual case because 99% of the time you can
> hot path
>
> the QoS entry changing is the latest one
> there have been no other changes
> If it is valid I can use the cached previous aggregate I cunningly
> saved in the top QoS entry when I computed the new one
>
> (ie most of the time from the kernel side you have a QoS stack)
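
A minimal user-space sketch of the "QoS stack" fast path described above,
written in Python rather than kernel C. The class name QosStack, its
methods, and the use of min() as the aggregate (as for a latency
constraint) are assumptions for illustration, not the pm_qos API. Each
entry caches the aggregate of everything below it, so changing or removing
the top entry needs no list walk:

```python
class QosStack:
    """Hypothetical model: a stack of constraints where smaller is stricter."""

    def __init__(self, default):
        self.default = default   # aggregate when no request is active
        self.entries = []        # each entry is [value, aggregate_below]

    def aggregate(self):
        # O(1): combine the top value with the aggregate cached beneath it.
        return min(self.entries[-1]) if self.entries else self.default

    def push(self, value):
        # Cache the current aggregate in the new top entry.
        self.entries.append([value, self.aggregate()])
        return len(self.entries) - 1   # handle = position in the stack

    def update(self, handle, value):
        self.entries[handle][0] = value
        if handle != len(self.entries) - 1:
            # Slow path: an older entry changed, refresh the caches above it.
            self._recompute_from(handle + 1)

    def pop(self):
        # Fast path: dropping the top entry exposes the cached aggregate.
        self.entries.pop()

    def _recompute_from(self, start):
        below = min(self.entries[start - 1]) if start else self.default
        for entry in self.entries[start:]:
            entry[1] = below
            below = min(entry)
```

Only removal of the top entry is modelled here; removing an older entry
would need the same slow-path recomputation as update().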

It's not just the list based computation: that's trivial to fix, as you
say ... the other problem is the notifier chain, because that's blocking
and could be long. Could we invoke the notifier through a workqueue?
It doesn't seem to have veto power, so it's pure notification, does it
matter if the notice is delayed (as long as it's in order)?

> > We've had a number of attempts at fixing this, but I think the
> > proper fix is to bolt a "disable C-states > x" interface into
> > cpu_idle that bypasses pm_qos altogether. Or, perhaps add a new
> > pm_qos API that does the equivalent operation, overriding whatever
> > constraint is active.
>
> We need some of this anyway for deep power saving because there is
> hardware which can't wake from some states, which in turn means if that
> device is active we need to be above the state in question.

James


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: mark gross on
On Thu, Jun 03, 2010 at 10:04:29AM +0200, Peter Zijlstra wrote:
> On Wed, 2010-06-02 at 20:18 -0700, mark gross wrote:
> > However; one could go to a dual list (like the scheduler) and
> > move inactive nodes from an active to inactive list,
>
> /me suggests you take a new look at the scheduler, those lists
> disappeared more than 10 releases ago. We use RB-trees these days.

you're dating me ;)

I haven't taken a hard look at the scheduler for a while.

/me hides.

--mgross
From: David Brownell on

> > > If "suspend" is the thing we are used to via /sys/power/state then
> > > the race will persist forever except for the suspend blocker
> > > workaround,

True, because device wakeups are enabled
by device.driver.suspend() methods, which are
invoked towards the end of the activities
triggered by writing /sys/power/state.

Now, there can be platforms (mostly embedded)
where the drivers adopt a policy that not only
do they keep their devices in as low a power
state as practical at all times, but they also
keep the hardware wakeup mechanisms enabled (they
may be needed to kick the hardware out of those
low power states) ... That is, suspend() might be
superfluous (a NOP) in those platforms' drivers.

Such platforms might also be (non-ACPI) ones
where idle C-states and S3/STR have the same
power consumption ... but that would be a
platform-specific issue, not a generic thing
which all Linux implementations could rely on.

- Dave


From: Florian Mickler on
On Thu, 3 Jun 2010 00:10:03 -0700
Arve Hjønnevåg <arve(a)android.com> wrote:

> On Wed, Jun 2, 2010 at 10:40 PM, mark gross <640e9920(a)gmail.com> wrote:

> > well I think for a pm_qos class that has boolean dynamic range we can
> > get away with not walking the list on every request update. We can use
> > a counter, and the list will be mostly for stats.
> >
>
> Did you give any thought to my suggestion to only use one entry per
> unique value on the first level list and then use secondary lists of
> identical values? That way if you only have two constraint values the
> list you have to walk when updating a request will never have more
> than two entries regardless of how many total requests you have.
>
> A request update then becomes something like this:
> if on primary list {
>     unlink from primary list
>     if secondary list is not empty
>         get next secondary entry and add in same spot on primary list
> }
> unlink from secondary list
> find new spot on primary list
> if already there
>     add to secondary list
> else
>     add to primary list
>

Yes. I think that would be good. If we keep the primary list sorted,
then this becomes a nice priority queue implementation which does
GetMax in constant time and Insert and Delete in logarithmic
complexity to the number of different values.
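
A rough user-space model of the two-level structure, in Python rather than
kernel C. The names TwoLevelRequests, update, remove and get_max are
hypothetical, and a dict of sets stands in for the secondary lists. One
caveat, stated as an observation on this sketch: bisect locates the slot in
logarithmic time, but inserting into a sorted array (or walking a sorted
linked list) is still linear in the number of distinct values; a truly
logarithmic insert would need a tree or heap on the primary level.

```python
import bisect


class TwoLevelRequests:
    """Hypothetical model: one primary entry per distinct value."""

    def __init__(self):
        self.values = []    # primary level: sorted distinct values
        self.holders = {}   # secondary level: value -> set of request ids
        self.current = {}   # request id -> value it currently holds

    def _unlink(self, req):
        value = self.current.pop(req)
        group = self.holders[value]
        group.discard(req)
        if not group:
            # Last request with this value: drop the primary entry too.
            del self.holders[value]
            self.values.pop(bisect.bisect_left(self.values, value))

    def update(self, req, value):
        if req in self.current:
            self._unlink(req)
        if value not in self.holders:
            # New distinct value: it gets its own primary entry.
            bisect.insort(self.values, value)
            self.holders[value] = set()
        self.holders[value].add(req)
        self.current[req] = value

    def remove(self, req):
        self._unlink(req)

    def get_max(self):
        # Constant time: the largest distinct value is at the end.
        return self.values[-1] if self.values else None
```

With only two constraint values in use, the primary level never grows
beyond two entries, no matter how many requests are registered.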

Cheers,
Flo
From: Florian Mickler on
On Thu, 03 Jun 2010 08:24:31 -0500
James Bottomley <James.Bottomley(a)suse.de> wrote:

> On Thu, 2010-06-03 at 11:03 +0100, Alan Cox wrote:
> > > [mtg: ] This has been a pain point for the PM_QOS implementation.
> > > They change the constraint back and forth at the transaction level
> > > of the i2c driver. The pm_qos code really wasn't made to deal with
> > > such hot path use, as each such change triggers a re-computation of
> > > what the aggregate qos request is.
> >
> > That should be trivial in the usual case because 99% of the time you can
> > hot path
> >
> > the QoS entry changing is the latest one
> > there have been no other changes
> > If it is valid I can use the cached previous aggregate I cunningly
> > saved in the top QoS entry when I computed the new one
> >
> > (ie most of the time from the kernel side you have a QoS stack)
>
> It's not just the list based computation: that's trivial to fix, as you
> say ... the other problem is the notifier chain, because that's blocking
> and could be long. Could we invoke the notifier through a workqueue?
> It doesn't seem to have veto power, so it's pure notification, does it
> matter if the notice is delayed (as long as it's in order)?

I think schedule_work() (workqueue.h) can take care of that.
That's how the rfkill subsystem does it.
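
As a user-space analogy only (Python threads, not the kernel workqueue
API), deferred in-order notification might look like the sketch below. The
class DeferredNotifier and its methods are hypothetical. A single worker
draining a FIFO queue keeps notifications in order while the caller
returns immediately. Note that schedule_work() does not queue a work item
again while it is still pending, so a kernel version would more likely
have the handler read the current aggregate than carry each value.

```python
import queue
import threading


class DeferredNotifier:
    """Hypothetical model: run notifier callbacks off the hot path."""

    def __init__(self):
        self.callbacks = []
        self.pending = queue.Queue()   # FIFO, so delivery order is preserved
        worker = threading.Thread(target=self._run, daemon=True)
        worker.start()

    def register(self, callback):
        self.callbacks.append(callback)

    def notify(self, value):
        # Hot path: enqueue and return without running any callback.
        self.pending.put(value)

    def _run(self):
        # Single worker: callbacks may block without stalling notify().
        while True:
            value = self.pending.get()
            for callback in list(self.callbacks):
                callback(value)
            self.pending.task_done()

    def flush(self):
        # Wait until every queued notification has been delivered.
        self.pending.join()
```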

Cheers,
Flo