From: Ingo Molnar on

* Yinghai Lu <yinghai(a)kernel.org> wrote:

> On Sat, Jan 9, 2010 at 6:30 PM, Ananth N Mavinakayanahalli
> <ananth(a)in.ibm.com> wrote:
> > On Sat, Jan 09, 2010 at 01:13:39PM -0800, Yinghai Lu wrote:
> >> On Sat, Jan 9, 2010 at 2:10 AM, Ananth N Mavinakayanahalli
> >> <ananth(a)in.ibm.com> wrote:
> >> > On an 8-way system with Intel Xeon X7350 CPUs, booting 2.6.32 or newer
> >> > kernels fails at:
> >> >
> >> > ...
> >> > CPU0: Intel(R) Xeon(R) CPU           X7350  @ 2.93GHz stepping 0b
> >> > Booting Node   0, Processors  #1 #2 #3 #4 #5 #6 #7 Ok.
> >> > Brought up 8 CPUs
> >> > Total of 8 processors activated (46906.05 BogoMIPS).
> >> >
> >> > Git bisect showed 2fbd07a5f as the offending commit.
> >> >
> >> > With the patch below, I am able to boot the latest Linus' git tree on
> >> > the machine. If this patch is correct, it needs to get into the stable
> >> > tree too.
> >> >
> >> > Signed-off-by: Ananth N Mavinakayanahalli <ananth(a)in.ibm.com>
> >> > ---
> >> > Index: linux-2.6/arch/x86/kernel/apic/probe_64.c
> >> > ===================================================================
> >> > --- linux-2.6.orig/arch/x86/kernel/apic/probe_64.c      2010-01-09 14:54:29.000000000 +0530
> >> > +++ linux-2.6/arch/x86/kernel/apic/probe_64.c   2010-01-09 14:57:53.000000000 +0530
> >> > @@ -70,7 +70,7 @@
> >> >        if (apic == &apic_flat) {
> >> >                switch (boot_cpu_data.x86_vendor) {
> >> >                case X86_VENDOR_INTEL:
> >> > -                       if (num_processors > 8)
> >> > +                       if (num_processors >= 8)
> >> >                                apic = &apic_physflat;
> >> >                        break;
> >> >                case X86_VENDOR_AMD:
> >>
> >> can you send out whole bootlog with apic=debug?
> >
> > Here it is:
> > ACPI: LAPIC (acpi_id[0x00] lapic_id[0x0c] enabled)
> > ACPI: LAPIC (acpi_id[0x01] lapic_id[0x10] enabled)
> > ACPI: LAPIC (acpi_id[0x02] lapic_id[0x0d] enabled)
> > ACPI: LAPIC (acpi_id[0x03] lapic_id[0x11] enabled)
> > ACPI: LAPIC (acpi_id[0x04] lapic_id[0x0e] enabled)
> > ACPI: LAPIC (acpi_id[0x05] lapic_id[0x12] enabled)
> > ACPI: LAPIC (acpi_id[0x06] lapic_id[0x0f] enabled)
> > ACPI: LAPIC (acpi_id[0x07] lapic_id[0x13] enabled)
> ...
> > Setting APIC routing to flat
> > Getting VERSION: 50014
> > Getting VERSION: 50014
> > Getting ID: c000000
> > Getting ID: f3000000
> > Getting LVT0: 700
> > Getting LVT1: 400
> > enabled ExtINT on CPU#0
> > ESR value before enabling vector: 0x00000040  after: 0x00000000
> > ENABLING IO-APIC IRQs
> > ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
> > CPU0: Intel(R) Xeon(R) CPU           X7350  @ 2.93GHz stepping 0b
> ...
>
> The BSP's physical APIC ID is 0x0c instead of 0.
>
> Not sure whether Suresh tested that case.

In any case this commit needs to be reverted as the assumption that it's safe
to do this optimization is evidently not true.

Ingo
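
The BSP APIC ID of 0x0c noted above can be read off the quoted
"Getting ID: c000000" log line. A minimal sketch, assuming the standard
xAPIC layout in which the physical APIC ID occupies bits 31:24 of the
local APIC ID register (the helper name xapic_id_from_reg is hypothetical,
not a kernel function):

```c
#include <stdint.h>

/*
 * Assumption: xAPIC mode, where the physical APIC ID lives in
 * bits 31:24 of the local APIC ID register.
 * xapic_id_from_reg() is an illustrative name, not a kernel API.
 */
static inline uint8_t xapic_id_from_reg(uint32_t id_reg)
{
	return (uint8_t)(id_reg >> 24);
}
```

Applied to the raw value 0x0c000000 from the log, this yields 0x0c, which
matches the lapic_id[0x0c] entry that the ACPI table reports for
acpi_id[0x00].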
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Suresh Siddha on
On Sat, 2010-01-09 at 10:11 -0800, Linus Torvalds wrote:
>
> On Sat, 9 Jan 2010, Ananth N Mavinakayanahalli wrote:
> >
> > On an 8-way system with Intel Xeon X7350 CPUs, booting 2.6.32 or newer
> > kernels fails at:
> >
> > ...
> > CPU0: Intel(R) Xeon(R) CPU X7350 @ 2.93GHz stepping 0b
> > Booting Node 0, Processors #1 #2 #3 #4 #5 #6 #7 Ok.
> > Brought up 8 CPUs
> > Total of 8 processors activated (46906.05 BogoMIPS).
> >
> > Git bisect showed 2fbd07a5f as the offending commit.

Hmm. Let me check and get back to you on what is wrong. In the legacy
APIC case, irrespective of the APIC ID, if we have 8 or fewer logical
CPUs, we should be able to use logical flat mode.

>
> Ok, that commit definitely is buggy.
>
> > With the patch below, I am able to boot the latest Linus' git tree on
> > the machine. If this patch is correct, it needs to get into the stable
> > tree too.
>
> I don't think the patch is correct, though. The thing is, the AMD check
> seems to be the correct one: you can only use 'apic_flat' if all the APIC
> ID's are < 8.
>
> It doesn't matter _how_ many CPU's you have. If you have two CPU's, but
> one of them has an APIC ID >= 8, then you cannot use the flat APIC model,
> since it depends on a 8-bit bitfield.

The flat APIC model has nothing to do with the actual physical APIC IDs:
the OS programs the logical destination register (LDR) as a bit mask, and
that is what we use.

> So your patch doesn't seem right either, because it still tests
> num_processors, which is bogus.
>
> In fact, I can't for the life of me understand why it treats different
> vendors differently. Why is that code not just a simple
>
> 	/* Flat apic mode requires that all APIC ID's are in the range 0..7 */
> 	if (apic == &apic_flat && max_physical_apicid >= 8)
> 		apic = &apic_physflat;
>
> instead, with no crazy vendor tests.
>
> What am I missing?

If I remember correctly, Yinghai mentioned that AMD platforms have some
issues with using flat mode on some systems where the total number of
logical CPUs is <= 8. Intel platforms have no such issues.

thanks,
suresh
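
The point above about the LDR being programmed as a bit mask can be
sketched as follows. This is a minimal illustration, assuming each CPU is
given the logical ID 1 << cpu (where cpu is the OS's 0-based CPU number)
and that the xAPIC flat model offers 8 logical ID bits. The names
FLAT_MAX_CPUS, flat_logical_id and flat_accepts are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

#define FLAT_MAX_CPUS 8	/* assumption: 8-bit logical ID in flat mode */

/*
 * Logical ID for the cpu-th CPU (0-based OS numbering). Note that the
 * physical APIC ID does not appear anywhere in this computation.
 */
static inline uint8_t flat_logical_id(unsigned int cpu)
{
	assert(cpu < FLAT_MAX_CPUS);
	return (uint8_t)(1u << cpu);
}

/*
 * In logical flat mode a CPU accepts an interrupt when the destination
 * mask has at least one bit in common with its logical ID.
 */
static inline int flat_accepts(uint8_t logical_id, uint8_t dest_mask)
{
	return (logical_id & dest_mask) != 0;
}
```

Under these assumptions the limit is on the number of CPUs (at most 8
distinct bits), not on the values of their physical APIC IDs.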

From: Yinghai Lu on
On Sun, Jan 10, 2010 at 2:26 AM, Ingo Molnar <mingo(a)elte.hu> wrote:
>
> * Yinghai Lu <yinghai(a)kernel.org> wrote:
>
>> On Sat, Jan 9, 2010 at 6:30 PM, Ananth N Mavinakayanahalli
>> <ananth(a)in.ibm.com> wrote:
>> > On Sat, Jan 09, 2010 at 01:13:39PM -0800, Yinghai Lu wrote:
>> >> On Sat, Jan 9, 2010 at 2:10 AM, Ananth N Mavinakayanahalli
>> >> <ananth(a)in.ibm.com> wrote:
>> >> > On an 8-way system with Intel Xeon X7350 CPUs, booting 2.6.32 or newer
>> >> > kernels fails at:
>> >> >
>> >> > ...
>> >> > CPU0: Intel(R) Xeon(R) CPU           X7350  @ 2.93GHz stepping 0b
>> >> > Booting Node   0, Processors  #1 #2 #3 #4 #5 #6 #7 Ok.
>> >> > Brought up 8 CPUs
>> >> > Total of 8 processors activated (46906.05 BogoMIPS).
>> >> >
>> >> > Git bisect showed 2fbd07a5f as the offending commit.
>> >> >
>> >> > With the patch below, I am able to boot the latest Linus' git tree on
>> >> > the machine. If this patch is correct, it needs to get into the stable
>> >> > tree too.
>> >> >
>> >> > Signed-off-by: Ananth N Mavinakayanahalli <ananth(a)in.ibm.com>
>> >> > ---
>> >> > Index: linux-2.6/arch/x86/kernel/apic/probe_64.c
>> >> > ===================================================================
>> >> > --- linux-2.6.orig/arch/x86/kernel/apic/probe_64.c      2010-01-09 14:54:29.000000000 +0530
>> >> > +++ linux-2.6/arch/x86/kernel/apic/probe_64.c   2010-01-09 14:57:53.000000000 +0530
>> >> > @@ -70,7 +70,7 @@
>> >> >        if (apic == &apic_flat) {
>> >> >                switch (boot_cpu_data.x86_vendor) {
>> >> >                case X86_VENDOR_INTEL:
>> >> > -                       if (num_processors > 8)
>> >> > +                       if (num_processors >= 8)
>> >> >                                apic = &apic_physflat;
>> >> >                        break;
>> >> >                case X86_VENDOR_AMD:
>> >>
>> >> can you send out whole bootlog with apic=debug?
>> >
>> > Here it is:
>> > ACPI: LAPIC (acpi_id[0x00] lapic_id[0x0c] enabled)
>> > ACPI: LAPIC (acpi_id[0x01] lapic_id[0x10] enabled)
>> > ACPI: LAPIC (acpi_id[0x02] lapic_id[0x0d] enabled)
>> > ACPI: LAPIC (acpi_id[0x03] lapic_id[0x11] enabled)
>> > ACPI: LAPIC (acpi_id[0x04] lapic_id[0x0e] enabled)
>> > ACPI: LAPIC (acpi_id[0x05] lapic_id[0x12] enabled)
>> > ACPI: LAPIC (acpi_id[0x06] lapic_id[0x0f] enabled)
>> > ACPI: LAPIC (acpi_id[0x07] lapic_id[0x13] enabled)
>> ...
>> > Setting APIC routing to flat
>> > Getting VERSION: 50014
>> > Getting VERSION: 50014
>> > Getting ID: c000000
>> > Getting ID: f3000000
>> > Getting LVT0: 700
>> > Getting LVT1: 400
>> > enabled ExtINT on CPU#0
>> > ESR value before enabling vector: 0x00000040  after: 0x00000000
>> > ENABLING IO-APIC IRQs
>> > ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
>> > CPU0: Intel(R) Xeon(R) CPU           X7350  @ 2.93GHz stepping 0b
>> ...
>>
>> The BSP's physical APIC ID is 0x0c instead of 0.
>>
>> Not sure whether Suresh tested that case.
>
> In any case this commit needs to be reverted as the assumption that it's safe
> to do this optimization is evidently not true.
>

I used the attached debug patch on one of my Intel systems with nr_cpus=8,
and logical flat mode seems to work.
That system's BSP APIC ID is 0x20.

YH
From: Suresh Siddha on
On Mon, 2010-01-11 at 13:43 -0800, Yinghai Lu wrote:
> On Sun, Jan 10, 2010 at 2:26 AM, Ingo Molnar <mingo(a)elte.hu> wrote:
> >
> > In any case this commit needs to be reverted as the assumption that it's safe
> > to do this optimization is evidently not true.
> >
>
> I used the attached debug patch on one of my Intel systems with nr_cpus=8,
> and logical flat mode seems to work.
> That system's BSP APIC ID is 0x20.

Same here. I don't see any issues with my testing on two different
platforms.

This sounds more like an IBM platform issue. So reverting the 2fbd07a5f
commit is not the correct solution.

thanks,
suresh

From: Yinghai Lu on
If the system cannot use logical flat mode, it should have some bit set
in the FADT. The existing check for that is:

static int physflat_acpi_madt_oem_check(char *oem_id, char *oem_table_id)
{
#ifdef CONFIG_ACPI
	/*
	 * Quirk: some x86_64 machines can only use physical APIC mode
	 * regardless of how many processors are present (x86_64 ES7000
	 * is an example).
	 */
	if (acpi_gbl_FADT.header.revision >= FADT2_REVISION_ID &&
		(acpi_gbl_FADT.flags & ACPI_FADT_APIC_PHYSICAL)) {
		printk(KERN_DEBUG "system APIC only can use physical flat");
		return 1;
	}
#endif

	return 0;
}
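
For comparison, the criteria discussed in this thread for falling back
from apic_flat to apic_physflat can be put side by side. This is only a
sketch under stated assumptions: struct apic_probe_input and the three
helper functions are hypothetical and model the decisions in isolation,
they are not the kernel's probe code.

```c
#include <stdbool.h>

/* Hypothetical container for the inputs each criterion looks at. */
struct apic_probe_input {
	unsigned int num_processors;		/* CPUs counted at boot */
	unsigned int max_physical_apicid;	/* highest physical APIC ID */
	bool fadt_apic_physical;		/* stands in for the FADT flag test */
};

/* Intel branch as it reads before the proposed patch: CPU count only. */
static bool physflat_by_count(const struct apic_probe_input *in)
{
	return in->num_processors > 8;
}

/* Alternative raised in the thread: look at the APIC IDs instead. */
static bool physflat_by_max_id(const struct apic_probe_input *in)
{
	return in->max_physical_apicid >= 8;
}

/* Firmware-driven quirk: trust the FADT to request physical mode. */
static bool physflat_by_fadt(const struct apic_probe_input *in)
{
	return in->fadt_apic_physical;
}
```

For the reported machine (8 CPUs, highest lapic_id 0x13 in the quoted
log), the count-based check keeps apic_flat while the ID-based check
selects apic_physflat. What the FADT-based check would return there
depends on the firmware, which the thread does not show.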