From: Jiri Kosina on
On Thu, 22 Apr 2010, Greg KH wrote:

> 2.6.32-stable review patch. If anyone has any objections, please let us know.
>
> ------------------
>
> From: Andreas Herrmann <herrmann.der.user(a)googlemail.com>
>
> commit 9d260ebc09a0ad6b5c73e17676df42c7bc75ff64 upstream.
>
> Use NodeId MSR to get NodeId and number of nodes per processor.
>
> Signed-off-by: Andreas Herrmann <andreas.herrmann3(a)amd.com>
> LKML-Reference: <20091216144355.GB28798(a)alberich.amd.com>
> Signed-off-by: H. Peter Anvin <hpa(a)zytor.com>
> Signed-off-by: Greg Kroah-Hartman <gregkh(a)suse.de>
>
> ---
> arch/x86/include/asm/cpufeature.h | 1
> arch/x86/include/asm/msr-index.h | 1
> arch/x86/kernel/cpu/amd.c | 53 ++++++++++----------------------------
> 3 files changed, 17 insertions(+), 38 deletions(-)
>
> --- a/arch/x86/include/asm/cpufeature.h
> +++ b/arch/x86/include/asm/cpufeature.h
> @@ -153,6 +153,7 @@
> #define X86_FEATURE_SSE5 (6*32+11) /* SSE-5 */
> #define X86_FEATURE_SKINIT (6*32+12) /* SKINIT/STGI instructions */
> #define X86_FEATURE_WDT (6*32+13) /* Watchdog timer */
> +#define X86_FEATURE_NODEID_MSR (6*32+19) /* NodeId MSR */
>
> /*
> * Auxiliary flags: Linux defined - For features scattered in various
> --- a/arch/x86/include/asm/msr-index.h
> +++ b/arch/x86/include/asm/msr-index.h
> @@ -125,6 +125,7 @@
> #define FAM10H_MMIO_CONF_BUSRANGE_SHIFT 2
> #define FAM10H_MMIO_CONF_BASE_MASK 0xfffffff
> #define FAM10H_MMIO_CONF_BASE_SHIFT 20
> +#define MSR_FAM10H_NODE_ID 0xc001100c
>
> /* K8 MSRs */
> #define MSR_K8_TOP_MEM1 0xc001001a
> --- a/arch/x86/kernel/cpu/amd.c
> +++ b/arch/x86/kernel/cpu/amd.c
> @@ -254,59 +254,36 @@ static int __cpuinit nearby_node(int api
>
> /*
> * Fixup core topology information for AMD multi-node processors.
> - * Assumption 1: Number of cores in each internal node is the same.
> - * Assumption 2: Mixed systems with both single-node and dual-node
> - * processors are not supported.
> + * Assumption: Number of cores in each internal node is the same.
> */
> #ifdef CONFIG_X86_HT
> static void __cpuinit amd_fixup_dcm(struct cpuinfo_x86 *c)
> {
> -#ifdef CONFIG_PCI
> - u32 t, cpn;
> - u8 n, n_id;
> + unsigned long long value;
> + u32 nodes, cores_per_node;
> int cpu = smp_processor_id();
>
> + if (!cpu_has(c, X86_FEATURE_NODEID_MSR))
> + return;
> +
> /* fixup topology information only once for a core */
> if (cpu_has(c, X86_FEATURE_AMD_DCM))
> return;
>
> - /* check for multi-node processor on boot cpu */
> - t = read_pci_config(0, 24, 3, 0xe8);
> - if (!(t & (1 << 29)))
> + rdmsrl(MSR_FAM10H_NODE_ID, value);
> +
> + nodes = ((value >> 3) & 7) + 1;
> + if (nodes == 1)
> return;
>
> set_cpu_cap(c, X86_FEATURE_AMD_DCM);
> + cores_per_node = c->x86_max_cores / nodes;
>
> - /* cores per node: each internal node has half the number of cores */
> - cpn = c->x86_max_cores >> 1;
> -
> - /* even-numbered NB_id of this dual-node processor */
> - n = c->phys_proc_id << 1;
> + /* store NodeID, use llc_shared_map to store sibling info */
> + per_cpu(cpu_llc_id, cpu) = value & 7;
>
> - /*
> - * determine internal node id and assign cores fifty-fifty to
> - * each node of the dual-node processor
> - */
> - t = read_pci_config(0, 24 + n, 3, 0xe8);
> - n = (t>>30) & 0x3;
> - if (n == 0) {
> - if (c->cpu_core_id < cpn)
> - n_id = 0;
> - else
> - n_id = 1;
> - } else {
> - if (c->cpu_core_id < cpn)
> - n_id = 1;
> - else
> - n_id = 0;
> - }
> -
> - /* compute entire NodeID, use llc_shared_map to store sibling info */
> - per_cpu(cpu_llc_id, cpu) = (c->phys_proc_id << 1) + n_id;
> -
> - /* fixup core id to be in range from 0 to cpn */
> - c->cpu_core_id = c->cpu_core_id % cpn;
> -#endif
> + /* fixup core id to be in range from 0 to (cores_per_node - 1) */
> + c->cpu_core_id = c->cpu_core_id % cores_per_node;
> }
> #endif
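
For reference, the decoding that the new amd_fixup_dcm() performs can be sketched in plain userspace C. This is only an illustration of the bit layout the patch itself uses (NodeId in bits 2:0, nodes per processor minus one in bits 5:3). The names decode_node_id(), fixup_core_id() and struct node_info are hypothetical and do not exist in the kernel, and the guard against a zero cores_per_node is an addition of this sketch, not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the fields the patch extracts from the MSR. */
struct node_info {
	uint32_t node_id;        /* value & 7, stored in cpu_llc_id by the patch */
	uint32_t nodes;          /* ((value >> 3) & 7) + 1 */
	uint32_t cores_per_node; /* x86_max_cores / nodes */
};

/* Decode a raw NodeId MSR value the same way the patch does. */
static void decode_node_id(uint64_t value, uint32_t max_cores,
			   struct node_info *out)
{
	assert(out != NULL);
	out->node_id = (uint32_t)(value & 7);
	out->nodes = (uint32_t)((value >> 3) & 7) + 1;
	out->cores_per_node = max_cores / out->nodes;
}

/* Map a core id into the range 0 .. cores_per_node - 1. */
static uint32_t fixup_core_id(uint32_t core_id, const struct node_info *info)
{
	/* Single-node processors are left untouched, as in the patch. */
	if (info->nodes == 1)
		return core_id;
	/* Assumption of this sketch only: avoid a modulo by zero. */
	if (info->cores_per_node == 0)
		return core_id;
	return core_id % info->cores_per_node;
}
```

With a raw value of 0x9 and 12 cores per package, this yields node_id 1, 2 nodes and 6 cores per node, so core id 7 becomes core id 1.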

This patch is causing a kernel panic on boot on a Magny Cours CPU here (cpu
family 16, model 9, stepping 1).

Please see the screenshot from the panicked kernel at

http://www.jikos.cz/jikos/junk/screenshot.jpg

(sorry for the bad quality of this screenshot; it was captured only from the
IPMI/HTTP console so far. I will try to grab something better tomorrow)

Also, this screenshot probably doesn't show the complete picture -- see the
"end trace" marker on the very first line. I will try to have something
better tomorrow, but maybe this rings a bell for someone right away.

--
Jiri Kosina
SUSE Labs, Novell Inc.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Andreas Herrmann on
On Mon, Apr 26, 2010 at 07:10:37PM +0200, Jiri Kosina wrote:
> On Thu, 22 Apr 2010, Greg KH wrote:
>
> > 2.6.32-stable review patch. If anyone has any objections, please let us know.

[...]

>
> This patch is causing a kernel panic on boot on a Magny Cours CPU here (cpu
> family 16, model 9, stepping 1).
>
> Please see the screenshot from the panicked kernel at
>
> http://www.jikos.cz/jikos/junk/screenshot.jpg
>
> (sorry for the bad quality of this screenshot; it was captured only from the
> IPMI/HTTP console so far. I will try to grab something better tomorrow)
>
> Also, this screenshot probably doesn't show the complete picture -- see the
> "end trace" marker on the very first line. I will try to have something
> better tomorrow, but maybe this rings a bell for someone right away.

This looks like a BIOS issue to me. Obviously this is a pre-production
test system where an old BIOS is installed. A BIOS update should fix
this issue for you.


Regards,

Andreas

--
Operating | Advanced Micro Devices GmbH
System | Karl-Hammerschmidt-Str. 34, 85609 Dornach b. München, Germany
Research | Geschäftsführer: Andrew Bowd, Thomas M. McCoy, Giuliano Meroni
Center | Sitz: Dornach, Gemeinde Aschheim, Landkreis München
(OSRC) | Registergericht München, HRB Nr. 43632


From: Jiri Kosina on
On Tue, 27 Apr 2010, Andreas Herrmann wrote:

> > > 2.6.32-stable review patch. If anyone has any objections, please let us know.
> >
> > This patch is causing a kernel panic on boot on a Magny Cours CPU here
> > (cpu family 16, model 9, stepping 1).
> >
> > Please see the screenshot from the panicked kernel at
> >
> > http://www.jikos.cz/jikos/junk/screenshot.jpg
> >
> > (sorry for the bad quality of this screenshot; it was captured only from the
> > IPMI/HTTP console so far. I will try to grab something better tomorrow)
> >
> > Also, this screenshot probably doesn't show the complete picture -- see the
> > "end trace" marker on the very first line. I will try to have something
> > better tomorrow, but maybe this rings a bell for someone right away.
>
> This looks like a BIOS issue to me. Obviously this is a pre-production
> test system where an old BIOS is installed. A BIOS update should fix
> this issue for you.

I can confirm that on a different Magny Cours system (16/9/1) with a newer
BIOS, the NodeID is provided correctly via the MSR, and the issue doesn't
trigger.
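
For anyone who wants to compare what two BIOS versions report, one possible way to look at the raw register from userspace is the msr character device. The following is a minimal sketch, assuming the msr driver is loaded and the caller has sufficient privileges. read_msr_file() is a hypothetical helper name. The MSR number is the one defined in the patch, and the device path /dev/cpu/0/msr mentioned afterwards is the usual location for CPU 0.

```c
#define _FILE_OFFSET_BITS 64 /* the MSR number is used as a file offset */

#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

#define MSR_FAM10H_NODE_ID 0xc001100c /* same value as in the patch */

/*
 * Read one 64-bit MSR through an msr device node.
 * Returns 0 on success and -1 if the device cannot be opened or read.
 */
static int read_msr_file(const char *path, uint32_t msr, uint64_t *value)
{
	ssize_t n;
	int fd;

	assert(path && value);

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	/* The msr device expects the register number as the read offset. */
	n = pread(fd, value, sizeof(*value), (off_t)msr);
	close(fd);

	return n == (ssize_t)sizeof(*value) ? 0 : -1;
}
```

Calling read_msr_file("/dev/cpu/0/msr", MSR_FAM10H_NODE_ID, &v) should return the raw value that the kernel decodes with (v & 7) for the NodeId and ((v >> 3) & 7) + 1 for the node count. On CPUs that do not implement this MSR the read is expected to fail, so the return value needs to be checked.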

Thanks,

--
Jiri Kosina
SUSE Labs, Novell Inc.