From: Rafael J. Wysocki on
On Friday, August 06, 2010, Kevin Hilman wrote:
> Arjan van de Ven <arjan(a)linux.intel.com> writes:
>
> > +
> > +/**
> > + * update_pm_runtime_accounting - Update the time accounting of power states
> > + * @dev: Device to update the accounting for
> > + *
> > + * In order to be able to have time accounting of the various power states
> > + * (as used by programs such as PowerTOP to show the effectiveness of runtime
> > + * PM), we need to track the time spent in each state.
> > + * update_pm_runtime_accounting must be called each time before the
> > + * runtime_status field is updated, to account the time in the old state
> > + * correctly.
> > + */
> > +void update_pm_runtime_accounting(struct device *dev)
> > +{
> > +	unsigned long now = jiffies;
> > +	int delta;
> > +
> > +	delta = now - dev->power.accounting_timestamp;
> > +
> > +	if (delta < 0)
> > +		delta = 0;
> > +
> > +	dev->power.accounting_timestamp = now;
> > +
> > +	if (dev->power.disable_depth > 0)
> > +		return;
> > +
> > +	if (dev->power.runtime_status == RPM_SUSPENDED)
> > +		dev->power.suspended_jiffies += delta;
> > +	else
> > +		dev->power.active_jiffies += delta;
> > +}
>
> By using jiffies, I think we might miss events in drivers that are doing
> runtime PM transitions in short bursts. On embedded systems with slow
> HZ, there could potentially be lots of transitions between ticks.
>
> It would be nicer to use clocksource-based time so transitions between
> jiffies could still be factored into the accounting.

Patch please?

Rafael
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Arjan van de Ven on
On 8/5/2010 4:20 PM, Kevin Hilman wrote:
> Arjan van de Ven <arjan(a)linux.intel.com> writes:
>
>
>> +
>> +/**
>> + * update_pm_runtime_accounting - Update the time accounting of power states
>> + * @dev: Device to update the accounting for
>> + *
>> + * In order to be able to have time accounting of the various power states
>> + * (as used by programs such as PowerTOP to show the effectiveness of runtime
>> + * PM), we need to track the time spent in each state.
>> + * update_pm_runtime_accounting must be called each time before the
>> + * runtime_status field is updated, to account the time in the old state
>> + * correctly.
>> + */
>> +void update_pm_runtime_accounting(struct device *dev)
>> +{
>> +	unsigned long now = jiffies;
>> +	int delta;
>> +
>> +	delta = now - dev->power.accounting_timestamp;
>> +
>> +	if (delta < 0)
>> +		delta = 0;
>> +
>> +	dev->power.accounting_timestamp = now;
>> +
>> +	if (dev->power.disable_depth > 0)
>> +		return;
>> +
>> +	if (dev->power.runtime_status == RPM_SUSPENDED)
>> +		dev->power.suspended_jiffies += delta;
>> +	else
>> +		dev->power.active_jiffies += delta;
>> +}
>>
> By using jiffies, I think we might miss events in drivers that are doing
> runtime PM transitions in short bursts. On embedded systems with slow
> HZ, there could potentially be lots of transitions between ticks.
>
> It would be nicer to use clocksource-based time so transitions between
> jiffies could still be factored into the accounting.
>

You're absolutely right that the current mechanism only has "sampling
accuracy" (similar to most /proc info that shows up in top and such).

As for the "slow HZ": there is no longer a valid reason not to set HZ to
1000... so we'll get a 1 msec sampling rate, basically.

The problem with a more accurate clocksource is that it's expensive. And
what's more, the path to such a clocksource might itself be subject to power
management ;-)
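
To make the "sampling accuracy" point concrete, here is a minimal userspace
sketch of the same accounting logic, run once with tick-granular timestamps
and once with microsecond timestamps. Everything in it is an assumption made
for illustration: struct pm_model, account(), set_status() and simulate() are
hypothetical names, not kernel API, the disable_depth check from the patch is
left out, and the workload (1 ms active followed by 9 ms suspended, repeated
10 times) is invented.

```c
/* Hypothetical userspace model; not the kernel's struct dev_pm_info. */
enum rpm_status { RPM_ACTIVE, RPM_SUSPENDED };

struct pm_model {
	enum rpm_status status;
	unsigned long stamp;      /* time of the last accounting update */
	unsigned long active;     /* accumulated time in the active state */
	unsigned long suspended;  /* accumulated time in the suspended state */
};

/* Same idea as the patch, with "now" passed in instead of read from jiffies. */
static void account(struct pm_model *m, unsigned long now)
{
	unsigned long delta = now - m->stamp;

	m->stamp = now;
	if (m->status == RPM_SUSPENDED)
		m->suspended += delta;
	else
		m->active += delta;
}

/* Charge the old state first, then switch to the new one. */
static void set_status(struct pm_model *m, unsigned long now,
		       enum rpm_status s)
{
	account(m, now);
	m->status = s;
}

/*
 * Replay 10 periods of 1 ms active + 9 ms suspended. Event times are in
 * microseconds and are quantized to unit_us before accounting, so
 * unit_us == 10000 models HZ=100 and unit_us == 1 models a fine-grained clock.
 */
static struct pm_model simulate(unsigned long unit_us)
{
	struct pm_model m = { RPM_ACTIVE, 0, 0, 0 };
	unsigned long p;

	for (p = 0; p < 10; p++) {
		unsigned long base = p * 10000;

		set_status(&m, (base + 1000) / unit_us, RPM_SUSPENDED);
		set_status(&m, (base + 10000) / unit_us, RPM_ACTIVE);
	}
	return m;
}
```

With this invented workload, simulate(10000) charges all 10 ticks to the
suspended state and none to active, because every active burst starts and
ends within the same tick. simulate(1) reports 10000 us active and 90000 us
suspended, and simulate(1000), modelling HZ=1000, reports 10 ms active and
90 ms suspended. Bursts shorter than the tick would still be lost in any
tick-based scheme.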

From: Kevin Hilman on
Arjan van de Ven <arjan(a)linux.intel.com> writes:

> On 8/5/2010 4:20 PM, Kevin Hilman wrote:
>> Arjan van de Ven <arjan(a)linux.intel.com> writes:
>>
>>
>>> +
>>> +/**
>>> + * update_pm_runtime_accounting - Update the time accounting of power states
>>> + * @dev: Device to update the accounting for
>>> + *
>>> + * In order to be able to have time accounting of the various power states
>>> + * (as used by programs such as PowerTOP to show the effectiveness of runtime
>>> + * PM), we need to track the time spent in each state.
>>> + * update_pm_runtime_accounting must be called each time before the
>>> + * runtime_status field is updated, to account the time in the old state
>>> + * correctly.
>>> + */
>>> +void update_pm_runtime_accounting(struct device *dev)
>>> +{
>>> +	unsigned long now = jiffies;
>>> +	int delta;
>>> +
>>> +	delta = now - dev->power.accounting_timestamp;
>>> +
>>> +	if (delta < 0)
>>> +		delta = 0;
>>> +
>>> +	dev->power.accounting_timestamp = now;
>>> +
>>> +	if (dev->power.disable_depth > 0)
>>> +		return;
>>> +
>>> +	if (dev->power.runtime_status == RPM_SUSPENDED)
>>> +		dev->power.suspended_jiffies += delta;
>>> +	else
>>> +		dev->power.active_jiffies += delta;
>>> +}
>>>
>> By using jiffies, I think we might miss events in drivers that are doing
>> runtime PM transitions in short bursts. On embedded systems with slow
>> HZ, there could potentially be lots of transitions between ticks.
>>
>> It would be nicer to use clocksource-based time so transitions between
>> jiffies could still be factored into the accounting.
>>
>
> You're absolutely right that the current mechanism only has "sampling
> accuracy" (similar to most /proc info that shows up in top and
> such).
>
> As for the "slow HZ": there is no longer a valid reason not to set HZ to
> 1000...

Probably, especially with tickless idle, but I'm not so sure there is total
agreement on this in the embedded world...

> so we'll get a 1 msec sampling rate, basically.
>
> The problem with a more accurate clocksource is that it's
> expensive. And what's more, the path to such a clocksource might itself
> be subject to power management ;-)

What about using read_persistent_clock() then? The arch/platform
definition of it would then determine the maximum sampling rate.

Kevin


