From: Ingo Molnar on

* Yinghai Lu <yinghai(a)kernel.org> wrote:

> On 08/03/2010 02:38 PM, Eric W. Biederman wrote:
> >
> > A clean solution would be to scrub the input from the MP table before
> > we attempt to use of it, instead of scrubbing the data as we use in
> > code paths like pin_2_irq. Just touching pin_2_irq is certainly an
> > incomplete solution because you have not resolved if the pins should
> > be edge or level triggered, and what polarity we should be sampling
> > them at.
> >
> > Until I see a plausible scenario where not handling buggy MP tables
> > exactly as we have done in the past I don't see hacks like you
> > are proposing making much sense at all.
>
> ok, how about this one?
>
> it will try to add entries to mp_irqs[] with some checking.

Yep, more sanity testing of BIOS data is welcome.

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Eric W. Biederman on
Yinghai Lu <yinghai(a)kernel.org> writes:

> On 08/03/2010 02:38 PM, Eric W. Biederman wrote:
>>
>> A clean solution would be to scrub the input from the MP table before
>> we attempt to use of it, instead of scrubbing the data as we use in
>> code paths like pin_2_irq. Just touching pin_2_irq is certainly an
>> incomplete solution because you have not resolved if the pins should
>> be edge or level triggered, and what polarity we should be sampling
>> them at.
>>
>> Until I see a plausible scenario where not handling buggy MP tables
>> exactly as we have done in the past I don't see hacks like you
>> are proposing making much sense at all.
>
> ok, how about this one?
>
> it will try to add entries to mp_irqs[] with some checking.

This looks roughly like the right approach.
However I don't see what would force the ioapic entries before
the intsrc entries in the table. Especially when the assumption
is we are dealing with a buggy bios.

> ---
> arch/x86/kernel/mpparse.c | 44 +++++++++++++++++++++++++++++++++++++-------
> 1 file changed, 37 insertions(+), 7 deletions(-)
>
> Index: linux-2.6/arch/x86/kernel/mpparse.c
> ===================================================================
> --- linux-2.6.orig/arch/x86/kernel/mpparse.c
> +++ linux-2.6/arch/x86/kernel/mpparse.c
> @@ -173,17 +173,17 @@ static int __init mp_irq_mpc_intsrc_cmp(
> {
> if (mp_irq->dstapic != m->dstapic)
> return 1;
> - if (mp_irq->type != m->type)
> + if (mp_irq->dstirq != m->dstirq)
> return 2;
> - if (mp_irq->irqtype != m->irqtype)
> + if (mp_irq->srcbus != m->srcbus)
> return 3;
> - if (mp_irq->irqflag != m->irqflag)
> + if (mp_irq->type != m->type)
> return 4;
> - if (mp_irq->srcbus != m->srcbus)
> + if (mp_irq->irqtype != m->irqtype)
> return 5;
> - if (mp_irq->srcbusirq != m->srcbusirq)
> + if (mp_irq->irqflag != m->irqflag)
> return 6;
> - if (mp_irq->dstirq != m->dstirq)
> + if (mp_irq->srcbusirq != m->srcbusirq)
> return 7;
>
> return 0;
> @@ -195,9 +195,39 @@ static void __init MP_intsrc_info(struct
>
> print_MP_intsrc_info(m);
>
> + /*
> + * Assume BUS, and IOAPIC entries come first all before
> + * INTSRC entries
> + */
> +
> + /* check if dstapic is right */
> + for (i = 0; i < nr_ioapics; i++) {
> + if (mp_ioapics[idx].apicid == m->dstapic)
> + break;
> + }
> + if (i == nr_ioapics)
> + return;

If we can make the algorithm two pass picking up the ioapic
entries first this should be fine.

> for (i = 0; i < mp_irq_entries; i++) {
> - if (!mp_irq_mpc_intsrc_cmp(&mp_irqs[i], m))
> + int ret = mp_irq_mpc_intsrc_cmp(&mp_irqs[i], m);
> +
> + /* duplicated entries ? */
> + if (!ret)
> return;
> +
> + /* same apic/pin, but different bus */
> + if (ret == 3) {
> + /* overwrite wrong legacy one */

In practice your test of looking at mp_bus_not_pci is essentially what
we do. I wonder if it could be made to be a test of polarity and edge
mismatch instead.

Also if we are ditching a non-duplicate intsrc we want to print a message
so there is a chance of debugging things if our heuristic for fixing
up broken BIOS's goes wrong, on some common configuration.

> + if (test_bit(mp_irqs[i].srcbus, mp_bus_not_pci) &&
> + !test_bit(m->srcbus, mp_bus_not_pci)) {
> + assign_to_mp_irq(m, &mp_irqs[mp_irq_entries]);

That should be.
> + assign_to_mp_irq(m, &mp_irqs[i]);

> + return;
> + }
> + /* dump this legacy one */
> + if (!test_bit(mp_irqs[i].srcbus, mp_bus_not_pci) &&
> + test_bit(m->srcbus, mp_bus_not_pci))
> + return;
> + }
> }
>
> assign_to_mp_irq(m, &mp_irqs[mp_irq_entries]);

Eric
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Yinghai Lu on
On 08/04/2010 05:12 AM, Eric W. Biederman wrote:


>
> In practice your test of looking at mp_bus_not_pci is essentially what
> we do. I wonder if it could be made to be a test of polarity and edge
> mismatch instead.
>

Dave's system mptable pol and trig is wrong ...

Int: type 0, pol 0, trig 0, bus 0a, IRQ 00, APIC ID 8, APIC INT 00
Int: type 0, pol 0, trig 0, bus 0a, IRQ 01, APIC ID 8, APIC INT 01
Int: type 3, pol 1, trig 1, bus 0a, IRQ 00, APIC ID 8, APIC INT 02
Int: type 0, pol 0, trig 0, bus 0a, IRQ 03, APIC ID 8, APIC INT 03
Int: type 0, pol 0, trig 0, bus 0a, IRQ 04, APIC ID 8, APIC INT 04
Int: type 0, pol 0, trig 0, bus 0a, IRQ 05, APIC ID 8, APIC INT 05
Int: type 0, pol 0, trig 0, bus 0a, IRQ 06, APIC ID 8, APIC INT 06
Int: type 0, pol 0, trig 0, bus 0a, IRQ 07, APIC ID 8, APIC INT 07
Int: type 0, pol 0, trig 0, bus 0a, IRQ 08, APIC ID 8, APIC INT 08
Int: type 0, pol 0, trig 0, bus 0a, IRQ 09, APIC ID 8, APIC INT 09
Int: type 0, pol 0, trig 0, bus 0a, IRQ 0a, APIC ID 8, APIC INT 0a
Int: type 0, pol 0, trig 0, bus 0a, IRQ 0b, APIC ID 8, APIC INT 0b
Int: type 0, pol 0, trig 0, bus 0a, IRQ 0c, APIC ID 8, APIC INT 0c
Int: type 0, pol 0, trig 0, bus 0a, IRQ 0e, APIC ID 8, APIC INT 0e
Int: type 0, pol 0, trig 0, bus 0a, IRQ 0f, APIC ID 8, APIC INT 0f
Int: type 0, pol 0, trig 0, bus 00, IRQ 28, APIC ID 8, APIC INT 09
Int: type 0, pol 0, trig 0, bus 00, IRQ 2c, APIC ID 8, APIC INT 0a
Int: type 0, pol 0, trig 0, bus 00, IRQ 2d, APIC ID 8, APIC INT 0a
Int: type 0, pol 0, trig 0, bus 00, IRQ 34, APIC ID 8, APIC INT 0e
Int: type 0, pol 0, trig 0, bus 00, IRQ 38, APIC ID 8, APIC INT 0b
Int: type 0, pol 0, trig 0, bus 00, IRQ 38, APIC ID 8, APIC INT 0b
Int: type 0, pol 0, trig 0, bus 00, IRQ 38, APIC ID 8, APIC INT 0b

Do you mean check pol/trig in addition to bus in this case?

Yinghai
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Eric W. Biederman on
Yinghai Lu <yinghai(a)kernel.org> writes:

> On 08/04/2010 05:12 AM, Eric W. Biederman wrote:
>
>
>>
>> In practice your test of looking at mp_bus_not_pci is essentially what
>> we do. I wonder if it could be made to be a test of polarity and edge
>> mismatch instead.
>>
>
> Dave's system mptable pol and trig is wrong ...

The are inconsistent. Everything is set to the default for the bus.

> Int: type 0, pol 0, trig 0, bus 0a, IRQ 00, APIC ID 8, APIC INT 00
> Int: type 0, pol 0, trig 0, bus 0a, IRQ 01, APIC ID 8, APIC INT 01
> Int: type 3, pol 1, trig 1, bus 0a, IRQ 00, APIC ID 8, APIC INT 02
> Int: type 0, pol 0, trig 0, bus 0a, IRQ 03, APIC ID 8, APIC INT 03
> Int: type 0, pol 0, trig 0, bus 0a, IRQ 04, APIC ID 8, APIC INT 04
> Int: type 0, pol 0, trig 0, bus 0a, IRQ 05, APIC ID 8, APIC INT 05
> Int: type 0, pol 0, trig 0, bus 0a, IRQ 06, APIC ID 8, APIC INT 06
> Int: type 0, pol 0, trig 0, bus 0a, IRQ 07, APIC ID 8, APIC INT 07
> Int: type 0, pol 0, trig 0, bus 0a, IRQ 08, APIC ID 8, APIC INT 08
> Int: type 0, pol 0, trig 0, bus 0a, IRQ 09, APIC ID 8, APIC INT 09
> Int: type 0, pol 0, trig 0, bus 0a, IRQ 0a, APIC ID 8, APIC INT 0a
> Int: type 0, pol 0, trig 0, bus 0a, IRQ 0b, APIC ID 8, APIC INT 0b
> Int: type 0, pol 0, trig 0, bus 0a, IRQ 0c, APIC ID 8, APIC INT 0c
> Int: type 0, pol 0, trig 0, bus 0a, IRQ 0e, APIC ID 8, APIC INT 0e
> Int: type 0, pol 0, trig 0, bus 0a, IRQ 0f, APIC ID 8, APIC INT 0f
> Int: type 0, pol 0, trig 0, bus 00, IRQ 28, APIC ID 8, APIC INT 09
> Int: type 0, pol 0, trig 0, bus 00, IRQ 2c, APIC ID 8, APIC INT 0a
> Int: type 0, pol 0, trig 0, bus 00, IRQ 2d, APIC ID 8, APIC INT 0a
> Int: type 0, pol 0, trig 0, bus 00, IRQ 34, APIC ID 8, APIC INT 0e
> Int: type 0, pol 0, trig 0, bus 00, IRQ 38, APIC ID 8, APIC INT 0b
> Int: type 0, pol 0, trig 0, bus 00, IRQ 38, APIC ID 8, APIC INT 0b
> Int: type 0, pol 0, trig 0, bus 00, IRQ 38, APIC ID 8, APIC INT 0b
>
> Do you mean check pol/trig in addition to bus in this case?

No. I was thinking it would be nice if we could check the
polarity and the irq trigger mode in this case. Unfortunately
I don't think we have access to everything thing we need to perform
that check this early.

Eric
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Yinghai Lu on
Please check if this one address your concerns.

Thanks

Yinghai

---
arch/x86/kernel/mpparse.c | 88 ++++++++++++++++++++++++++++++++++++++++------
1 file changed, 77 insertions(+), 11 deletions(-)

Index: linux-2.6/arch/x86/kernel/mpparse.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/mpparse.c
+++ linux-2.6/arch/x86/kernel/mpparse.c
@@ -173,17 +173,17 @@ static int __init mp_irq_mpc_intsrc_cmp(
{
if (mp_irq->dstapic != m->dstapic)
return 1;
- if (mp_irq->type != m->type)
+ if (mp_irq->dstirq != m->dstirq)
return 2;
- if (mp_irq->irqtype != m->irqtype)
+ if (mp_irq->srcbus != m->srcbus)
return 3;
- if (mp_irq->irqflag != m->irqflag)
+ if (mp_irq->type != m->type)
return 4;
- if (mp_irq->srcbus != m->srcbus)
+ if (mp_irq->irqtype != m->irqtype)
return 5;
- if (mp_irq->srcbusirq != m->srcbusirq)
+ if (mp_irq->irqflag != m->irqflag)
return 6;
- if (mp_irq->dstirq != m->dstirq)
+ if (mp_irq->srcbusirq != m->srcbusirq)
return 7;

return 0;
@@ -195,9 +195,45 @@ static void __init MP_intsrc_info(struct

print_MP_intsrc_info(m);

+ /*
+ * All BUS, and IOAPIC entries are processed already
+ */
+
+ /* check if dstapic is right */
+ for (i = 0; i < nr_ioapics; i++) {
+ if (mp_ioapics[i].apicid == m->dstapic)
+ break;
+ }
+ if (i == nr_ioapics) {
+ apic_printk(APIC_VERBOSE, "intsrc with wrong apic id is skipped\n");
+ return;
+ }
+
for (i = 0; i < mp_irq_entries; i++) {
- if (!mp_irq_mpc_intsrc_cmp(&mp_irqs[i], m))
+ int ret = mp_irq_mpc_intsrc_cmp(&mp_irqs[i], m);
+
+ /* duplicated entries ? */
+ if (!ret) {
+ apic_printk(APIC_VERBOSE, "Duplicated intsrc is skipped\n");
return;
+ }
+
+ /* same apic/pin, but different bus */
+ if (ret == 3) {
+ /* overwrite wrong legacy one */
+ if (test_bit(mp_irqs[i].srcbus, mp_bus_not_pci) &&
+ !test_bit(m->srcbus, mp_bus_not_pci)) {
+ apic_printk(APIC_VERBOSE, "Wrong legacy entry with same apic/pin is removed\n");
+ assign_to_mp_irq(m, &mp_irqs[i]);
+ return;
+ }
+ /* dump this legacy one */
+ if (!test_bit(mp_irqs[i].srcbus, mp_bus_not_pci) &&
+ test_bit(m->srcbus, mp_bus_not_pci)) {
+ apic_printk(APIC_VERBOSE, "Wrong legacy entry is skipped\n");
+ return;
+ }
+ }
}

assign_to_mp_irq(m, &mp_irqs[mp_irq_entries]);
@@ -280,8 +316,8 @@ static int __init smp_read_mpc(struct mp
char str[16];
char oem[10];

- int count = sizeof(*mpc);
- unsigned char *mpt = ((unsigned char *)mpc) + count;
+ int count;
+ unsigned char *mpt;

if (!smp_check_mpc(mpc, oem, str))
return 0;
@@ -305,8 +341,9 @@ static int __init smp_read_mpc(struct mp
/*
* Now process the configuration blocks.
*/
+ count = sizeof(*mpc);
+ mpt = ((unsigned char *)mpc) + count;
x86_init.mpparse.mpc_record(0);
-
while (count < mpc->length) {
switch (*mpt) {
case MP_PROCESSOR:
@@ -324,7 +361,7 @@ static int __init smp_read_mpc(struct mp
skip_entry(&mpt, &count, sizeof(struct mpc_ioapic));
break;
case MP_INTSRC:
- MP_intsrc_info((struct mpc_intsrc *)mpt);
+ /* check that next pass */
skip_entry(&mpt, &count, sizeof(struct mpc_intsrc));
break;
case MP_LINTSRC:
@@ -337,6 +374,35 @@ static int __init smp_read_mpc(struct mp
count = mpc->length;
break;
}
+ x86_init.mpparse.mpc_record(1);
+ }
+
+ /* second pass for INTSRC */
+ count = sizeof(*mpc);
+ mpt = ((unsigned char *)mpc) + count;
+ x86_init.mpparse.mpc_record(0);
+ while (count < mpc->length) {
+ switch (*mpt) {
+ case MP_PROCESSOR:
+ skip_entry(&mpt, &count, sizeof(struct mpc_cpu));
+ break;
+ case MP_BUS:
+ skip_entry(&mpt, &count, sizeof(struct mpc_bus));
+ break;
+ case MP_IOAPIC:
+ skip_entry(&mpt, &count, sizeof(struct mpc_ioapic));
+ break;
+ case MP_INTSRC:
+ MP_intsrc_info((struct mpc_intsrc *)mpt);
+ skip_entry(&mpt, &count, sizeof(struct mpc_intsrc));
+ break;
+ case MP_LINTSRC:
+ skip_entry(&mpt, &count, sizeof(struct mpc_lintsrc));
+ break;
+ default:
+ count = mpc->length;
+ break;
+ }
x86_init.mpparse.mpc_record(1);
}

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/