From: Johannes Hirte on
On Tuesday, 15 December 2009 16:30:26, Borislav Petkov wrote:
> On Tue, Dec 15, 2009 at 08:08:04AM +0100, Johannes Hirte wrote:
> > Northbridge Error, node 0, core: -1
> > amd_decode_nb_mce: NBSL: 0x0005001b, NBSL: 0xa4000000
> > K8 ECC error.
>
> Yep, this is a benign GART TLB error. It is not being reported, but it
> is still being logged, so the amd64_edac module you are using sees it
> and trips. There are two fixes:
>
> 1. If you have a BIOS option with wording like:
>
> "Gart Table Walk Error MC reporting: Disabled/Enabled."
>
> setting it to Disabled should suppress the error.

Yes, there is such an option, and it was enabled. I was sure I had disabled it,
especially as the BIOS help also says that it's only for graphics driver
developers. I've disabled it now and will test the patch later.

> 2. If there is no BIOS option, the patch below should fix it. Can you
> please test it (against v2.6.32)?

This patch (like the BIOS option) will only disable the error reports. The error
itself will still occur, right? So it is still necessary to find out why the
radeon driver triggers this error.


regards,
Johannes
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Borislav Petkov on
On Tue, Dec 15, 2009 at 11:00:46PM +0100, Johannes Hirte wrote:

>
> This patch (like the BIOS option) will only disable the error reports. The error
> itself will still occur, right? So it is still necessary to find out why the
> radeon driver triggers this error.

Because the graphics driver does aperture accesses with no
matching GART translation, and the hw generates mchecks for
that. The whole story on GART table walk errors is in section
"13.10.1 GART Table Walk Error Reporting" in the document here:
http://support.amd.com/us/Processor_TechDocs/32559.pdf

I can't say for sure about your BIOS, but if it is implemented as
described in the above-mentioned section, the BIOS option should disable
logging of the error, which means it is not reported either.

The patch is still needed for machines that do not have that BIOS
option.
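
To make the decoding concrete, here is a minimal, self-contained sketch
of how the NBSL value from the log above (0x0005001b) can be classified
as a TLB error and then filtered. This is not the patch from this thread.
The bit layout (error code in bits 15:0, extended error code in bits
20:16) is an assumption based on the K8-style NB status register and
should be checked against the BKDG for your family. The helper names and
the report_gart_errors knob are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumed K8-style layout of the low NB status word (NBSL).
 * Verify against the BKDG for the CPU family in question.
 */
#define NBSL_ERROR_CODE(nbsl)     ((uint16_t)((nbsl) & 0xffffu))      /* bits 15:0  */
#define NBSL_EXT_ERROR_CODE(nbsl) ((uint8_t)(((nbsl) >> 16) & 0x1fu)) /* bits 20:16 */

/* MCA "TLB error" codes have the form 0000 0000 0001 TTLL. */
static bool nbsl_is_tlb_error(uint32_t nbsl)
{
	return (NBSL_ERROR_CODE(nbsl) & 0xfff0u) == 0x0010u;
}

/*
 * Hypothetical policy helper: drop TLB errors unless the caller has
 * explicitly asked for them. All other errors are always reported.
 */
static bool nb_error_should_be_reported(uint32_t nbsl, bool report_gart_errors)
{
	if (nbsl_is_tlb_error(nbsl) && !report_gart_errors)
		return false;

	return true;
}
```

With the value from the log, NBSL_ERROR_CODE(0x0005001b) is 0x001b, which
matches the TLB error pattern, so nb_error_should_be_reported() returns
false unless report_gart_errors is set. Note that this only filters what
gets reported; the aperture access that raised the error still happens.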

--
Regards/Gruss,
Boris.

Operating | Advanced Micro Devices GmbH
System | Karl-Hammerschmidt-Str. 34, 85609 Dornach b. München, Germany
Research | Geschäftsführer: Andrew Bowd, Thomas M. McCoy, Giuliano Meroni
Center | Sitz: Dornach, Gemeinde Aschheim, Landkreis München
(OSRC) | Registergericht München, HRB Nr. 43632
