From: Guenter Roeck on
On Thu, Jul 22, 2010 at 12:22:23PM -0400, Fenghua Yu wrote:
>
> Document for package level thermal hwmon driver.
>
> Signed-off-by: Fenghua Yu <fenghua.yu(a)intel.com>
> Reviewed-by: Len Brown <len.brown(a)intel.com>
> ---
>
> pkgtemp | 36 ++++++++++++++++++++++++++++++++++++
> 1 files changed, 36 insertions(+)
>
> diff --git a/Documentation/hwmon/pkgtemp b/Documentation/hwmon/pkgtemp
> new file mode 100644
> index 0000000..a60d286
> --- /dev/null
> +++ b/Documentation/hwmon/pkgtemp
> @@ -0,0 +1,36 @@
> +Kernel driver pkgtemp
> +======================
> +
> +Supported chips:
> + * Intel family
> + Prefix: 'pkgtemp'
> + CPUID:
> + Datasheet: Intel 64 and IA-32 Architectures Software Developer's Manual
> + Volume 3A: System Programming Guide
> +
> +Author: Fenghua Yu
> +
> +Description
> +-----------
> +
> +This driver permits reading package level temperature sensor embedded inside
> +Intel CPU package. The sensors can be in core, uncore, memroy controller, or

memroy --> memory

> +other componenets in a package. The feature is first implemented in Intel Sandy

componenets --> components

> +Bridge platform.
> +
Just for clarification - you mention a number of sensors, but unless
I am missing something only the package sensor is implemented. Is that correct ?

> +Temperature is measured in degrees Celsius and measurement resolution is
> +1 degree C. Valid temperatures are from 0 to TjMax degrees C, because the actual
> +value of temperature register is in fact a delta from TjMax.
> +
From the code, it seems that negative values can be reported. Is it guaranteed
by the chip that (TjMax - MSR_IA32_TEMPERATURE_TARGET) >= 0 ? Otherwise, the
minimum temperature would be (TjMax - 127).

> +Temperature known as TjMax is the maximum junction temperature of package.
> +Intel defines this temperature as 125C. At this temperature, protection

Your driver bails out at TjMax >= 120, so there is some inconsistency.
Also, it seems that this is not a constant, since you are reading
MSR_IA32_TEMPERATURE_TARGET to get the value.

Since the CPUs supporting the package sensor presumably also all support
reading TjMax, maybe you can reword the above text to reflect this.

> +mechanism will perform actions to forcibly cool down the processor. Alarm
> +may be raised, if the temperature grows enough (more than TjMax) to trigger
> +the Out-Of-Spec bit. Following table summarizes the exported sysfs files:
> +
> +temp1_input - Package temperature (in millidegrees Celsius).
> +temp1_crit - Maximum junction temperature (in millidegrees Celsius).
> +temp1_crit_alarm - Set when Out-of-spec bit is set, never clears.
> + Correct CPU operation is no longer guaranteed.
> +temp1_label - Contains string "Pysical package id X", where X is physical

Pysical --> Physical.
I would suggest to drop the "id" for consistency. Core sensor names don't
include "id" either.

I am not sure if "physical" should be included in the first place.
Also, above description suggests that future CPUs might add more sensors.
If so, the name should probably reflect the location of the current sensor,
ie be something like "Package core X" or "Core package X" or "Package X (core)".

> + package id.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Guenter Roeck on
On Thu, Jul 22, 2010 at 01:52:25PM -0400, Fenghua Yu wrote:
> > > +This driver permits reading package level temperature sensor embedded inside
> > > +Intel CPU package. The sensors can be in core, uncore, memroy controller, or
> >
> > memroy --> memory
> Will change the typo.
>
> > > +other componenets in a package. The feature is first implemented in Intel Sandy
> >
> > componenets --> components
> >
> Ditto.
>
> > > +Bridge platform.
> > > +
> > Just for clarification - you mention a number of sensors, but unless
> > I am missing something only the package sensor is implemented. Is that correct ?
> Do you mean "only the core sensor"? I think the Sandy Bridge only implements
> core sensor now. I would think in the future more sensors could be implemented.
> If only core sensors are implemented, there is no need to have the package
> level thermal hw MSRs because OS can get all package level thermal info by
> enumerating all core sensors in a package.
>
Now you lost me. I understand there are the per-core sensors, read through coretemp,
plus one package level sensor. Do you associate this existing package level sensor
with any of the locations mentioned above (core, uncore, memory, other), or is
that unknown ? If it is unknown, you might want to mention it.

> >
> > > +Temperature is measured in degrees Celsius and measurement resolution is
> > > +1 degree C. Valid temperatures are from 0 to TjMax degrees C, because the actual
> > > +value of temperature register is in fact a delta from TjMax.
> > > +
> > From the code, it seems that negative values can be reported. Is it guaranteed
> > by the chip that (TjMax - MSR_IA32_TEMPERATURE_TARGET) >= 0 ? Otherwise, the
> > minimum temperature would be (TjMax - 127).
> >
> > > +Temperature known as TjMax is the maximum junction temperature of package.
> > > +Intel defines this temperature as 125C. At this temperature, protection
> >
> > Your driver bails out at TjMax >= 120, so there is some inconsistency.
> > Also, it seems that this is not a constant, since you are reading
> > MSR_IA32_TEMPERATURE_TARGET to get the value.
> >
> > Since the CPUs supporting the package sensor presumably also all support
> > reading TjMax, maybe you can reword the above text to reflect this.
> >
> You are right. I forgot to update this part. I'll reword it.
>
> > > +mechanism will perform actions to forcibly cool down the processor. Alarm
> > > +may be raised, if the temperature grows enough (more than TjMax) to trigger
> > > +the Out-Of-Spec bit. Following table summarizes the exported sysfs files:
> > > +
> > > +temp1_input - Package temperature (in millidegrees Celsius).
> > > +temp1_crit - Maximum junction temperature (in millidegrees Celsius).
> > > +temp1_crit_alarm - Set when Out-of-spec bit is set, never clears.
> > > + Correct CPU operation is no longer guaranteed.
> > > +temp1_label - Contains string "Pysical package id X", where X is physical
> >
> > Pysical --> Physical.
> > I would suggest to drop the "id" for consistency. Core sensor names don't
> > include "id" either.
> >
> > I am not sure if "physical" should be included in the first place.
> > Also, above description suggests that future CPUs might add more sensors.
> > If so, the name should probably reflect the location of the current sensor,
> > ie be something like "Package core X" or "Core package X" or "Package X (core)".
> How about just using "package X" to align with coretemp for now? Currently there
> is no interface to tell which sensor it is.
>
Fine with me. Maybe wait for feedback from others.

You use the argument that there may be other package level sensors in the future.
Are there any plans for this, or is this just a theory ?

Next question is how to handle future sensor types. One hwmon instance per sensor,
additional sensors in this driver, or even a new driver ?

We had was a separate discussion if the coretemp driver should be redesigned
to one instance per CPU. The package sensor would fit into that model,
since you would have

coretemp-isa-0000
Core0
Core1
....
CoreN
Package

coretemp-isa-0001
Core0
Core1
....
CoreM
Package

I personally would prefer that approach. It would avoid ambiguity associating Package X
with specific cores, and it would also easily expand to additional non-core future sensors.

Guenter

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Guenter Roeck on
On Wed, 2010-07-28 at 15:00 -0400, Fenghua Yu wrote:
> From: Fenghua Yu <fenghua.yu(a)intel.com>
>
> Document for package level thermal hwmon driver.
>
> Signed-off-by: Fenghua Yu <fenghua.yu(a)intel.com>
> Reviewed-by: Len Brown <len.brown(a)intel.com>
> Reviewed-by: Guenter Roeck <guenter.roeck(a)ericsson.com>

That I provided feedback does not mean that I endorsed a "Reviewed-by"
signature. It is inappropriate to add such a signature without my
approval.

Guenter


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/