From: Yinghai Lu on
On 08/03/2010 12:19 AM, Yinghai Lu wrote:
> On 08/02/2010 08:13 PM, Eric W. Biederman wrote:
>> Yinghai Lu <yinghai(a)kernel.org> writes:
>>
>>> On 08/02/2010 06:32 PM, Yinghai Lu wrote:
>>>> On 08/02/2010 04:17 PM, Dave Airlie wrote:
>>>>>>
>>>>>> the kernel is using mptable, and the system have mcp55, so how come
>>>>>> with irq 35?
>>>>>> assume we should only have ioapic irq 0 - 23 ...
>>>>>>
>>>>>> Can you send out boot log with "debug apic=debug pci=routeirq" with
>>>>>> 2.6.32 and 2.6.35?
>>>>>
>>>>> Okay el6log is from a RHEL6 2.6.32 kernel, but it should give a good
>>>>> baseline, the 2.6.35 oops even earlier with all those options and is
>>>>> in the second attachment.
>>>>
>>>
>>
>> This patch is wrong and there is no reason to even suspect it will
>> affect this problem. At best this patch will trade one set of bugs
>> for another because at least on some platforms we always did something
>> like this. Having an irq 35 is odd and certainly a result of recent
>> changes, but in this case it doesn't look like it has anything to do
>> with the problem.
>>
>> Nacked-by: "Eric W. Biederman" <ebiederm(a)xmission.com>
>>
>>> Please use this one instead; I forgot to run quilt refresh before sending it.
>>>
>>> [PATCH -v2] x86: fix pin_2_irq mapping
>>>
>>> We should not twist gsi to irq mapping if acpi is not used.
>>>
>>> -v2 remove not used irq_to_gsi()
>>>
>>> Signed-off-by: Yinghai Lu <yinghai(a)kernel.org>
>>>
>>> ---
>>> arch/x86/include/asm/io_apic.h | 10 ++++++++++
>>> arch/x86/kernel/acpi/boot.c | 4 ++--
>>> arch/x86/kernel/apic/io_apic.c | 5 +----
>>> 3 files changed, 13 insertions(+), 6 deletions(-)
>>>
>>> Index: linux-2.6/arch/x86/include/asm/io_apic.h
>>> ===================================================================
>>> --- linux-2.6.orig/arch/x86/include/asm/io_apic.h
>>> +++ linux-2.6/arch/x86/include/asm/io_apic.h
>>> @@ -185,6 +185,16 @@ int mp_find_ioapic_pin(int ioapic, u32 g
>>> void __init mp_register_ioapic(int id, u32 address, u32 gsi_base);
>>> extern void __init pre_init_apic_IRQ0(void);
>>>
>>> +#ifdef CONFIG_ACPI
>>> +unsigned int gsi_to_irq(unsigned int gsi);
>>> +u32 irq_to_gsi(int irq);
>>> +#else
>>> +static inline unsigned int gsi_to_irq(unsigned int gsi)
>>> +{
>>> + return gsi;
>>> +}
>>> +#endif
>>> +
>>> #else /* !CONFIG_X86_IO_APIC */
>>>
>>> #define io_apic_assign_pci_irqs 0
>>> Index: linux-2.6/arch/x86/kernel/acpi/boot.c
>>> ===================================================================
>>> --- linux-2.6.orig/arch/x86/kernel/acpi/boot.c
>>> +++ linux-2.6/arch/x86/kernel/acpi/boot.c
>>> @@ -100,7 +100,7 @@ static u32 isa_irq_to_gsi[NR_IRQS_LEGACY
>>> 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
>>> };
>>>
>>> -static unsigned int gsi_to_irq(unsigned int gsi)
>>> +unsigned int gsi_to_irq(unsigned int gsi)
>>> {
>>> unsigned int irq = gsi + NR_IRQS_LEGACY;
>>> unsigned int i;
>>> @@ -123,7 +123,7 @@ static unsigned int gsi_to_irq(unsigned
>>> return irq;
>>> }
>>>
>>> -static u32 irq_to_gsi(int irq)
>>> +u32 irq_to_gsi(int irq)
>>> {
>>> unsigned int gsi;
>>>
>>> Index: linux-2.6/arch/x86/kernel/apic/io_apic.c
>>> ===================================================================
>>> --- linux-2.6.orig/arch/x86/kernel/apic/io_apic.c
>>> +++ linux-2.6/arch/x86/kernel/apic/io_apic.c
>>> @@ -1029,10 +1029,7 @@ static int pin_2_irq(int idx, int apic,
>>> } else {
>>> u32 gsi = mp_gsi_routing[apic].gsi_base + pin;
>>>
>>> - if (gsi >= NR_IRQS_LEGACY)
>>> - irq = gsi;
>>> - else
>>> - irq = gsi_top + gsi;
>>> + irq = gsi_to_irq(gsi);
>>> }
>>>
>>> #ifdef CONFIG_X86_32
>
> What is the point of making irq = gsi_top + gsi when mptable is used instead of acpi?
>

I just tried it: that blind shifting of the gsi makes a kernel with acpi crash in VirtualBox.

[ 5.536000] querying PCI -> IRQ mapping bus:0, slot:11, pin:0.
[ 5.540000] ehci_hcd 0000:00:0b.0: can't find IRQ for PCI INT A; probably buggy MP table
[

and on kvm it got:
[ 4.352280] e1000: Intel(R) PRO/1000 Network Driver - version 7.3.21-k6-NAPI
[ 4.356012] e1000: Copyright (c) 1999-2006 Intel Corporation.
[ 4.360120] querying PCI -> IRQ mapping bus:0, slot:3, pin:0.
[ 4.364006] PCI BIOS passed nonexistent PCI bus 0!
[ 4.368007] e1000 0000:00:03.0: can't find IRQ for PCI INT A; probably buggy MP table
[ 4.372049] e1000 0000:00:03.0: setting latency timer to 64

Yinghai
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Eric W. Biederman on
Yinghai Lu <yinghai(a)kernel.org> writes:

> On 08/03/2010 12:19 AM, Yinghai Lu wrote:
>> On 08/02/2010 08:13 PM, Eric W. Biederman wrote:
>>> Yinghai Lu <yinghai(a)kernel.org> writes:

>>>> Index: linux-2.6/arch/x86/kernel/apic/io_apic.c
>>>> ===================================================================
>>>> --- linux-2.6.orig/arch/x86/kernel/apic/io_apic.c
>>>> +++ linux-2.6/arch/x86/kernel/apic/io_apic.c
>>>> @@ -1029,10 +1029,7 @@ static int pin_2_irq(int idx, int apic,
>>>> } else {
>>>> u32 gsi = mp_gsi_routing[apic].gsi_base + pin;
>>>>
>>>> - if (gsi >= NR_IRQS_LEGACY)
>>>> - irq = gsi;
>>>> - else
>>>> - irq = gsi_top + gsi;
>>>> + irq = gsi_to_irq(gsi);
>>>> }
>>>>
>>>> #ifdef CONFIG_X86_32
>>
>> what is the point for making irq = gsi_top + gsi when mptable is used instead of acpi?
>>
>
> just tried those blind shifting gsi cause kernel with acpi crash in virtual box.

What configuration did you try and had problems with?

> [ 5.536000] querying PCI -> IRQ mapping bus:0, slot:11, pin:0.
> [ 5.540000] ehci_hcd 0000:00:0b.0: can't find IRQ for PCI INT A; probably buggy MP table
> [

I don't have a clue what the mptable looks like in VirtualBox. My guess is that it
is buggy and untested like so many mptables these days.

> and on kvm it got:
> [ 4.352280] e1000: Intel(R) PRO/1000 Network Driver - version 7.3.21-k6-NAPI
> [ 4.356012] e1000: Copyright (c) 1999-2006 Intel Corporation.
> [ 4.360120] querying PCI -> IRQ mapping bus:0, slot:3, pin:0.
> [ 4.364006] PCI BIOS passed nonexistent PCI bus 0!
> [ 4.368007] e1000 0000:00:03.0: can't find IRQ for PCI INT A; probably buggy MP table
> [ 4.372049] e1000 0000:00:03.0: setting latency timer to 64


This example failed because mpparse said bus 0 was ISA, which is a
pretty bizarre thing to do, especially when bus 0 is pretty clearly
PCI. That does sound like a buggy MP table.
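
To illustrate why a bus marked as ISA would make the lookup fail, here is a
minimal sketch. It assumes the PCI irq lookup rejects any bus that the MP
table parser has flagged as not PCI before it even searches the routing
entries. The names model_find_pci_irq, model_pci_route and MODEL_MAX_BUSSES
are made up for this example and are not kernel symbols.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_BUSSES 4	/* hypothetical limit, for the example only */

/* Hypothetical, cut-down routing entry: (bus, slot, pin) -> irq. */
struct model_pci_route {
	int bus;
	int slot;
	int pin;
	int irq;
};

/*
 * Look up the irq for a PCI device. If the parser flagged the bus as
 * "not PCI", give up immediately; this is the assumed failure mode
 * behind the "nonexistent PCI bus" message.
 */
static int model_find_pci_irq(const bool bus_not_pci[MODEL_MAX_BUSSES],
			      const struct model_pci_route *tbl, size_t n,
			      int bus, int slot, int pin)
{
	size_t i;

	assert(bus >= 0 && bus < MODEL_MAX_BUSSES);

	if (bus_not_pci[bus])
		return -1;	/* bus is not considered PCI at all */

	for (i = 0; i < n; i++) {
		if (tbl[i].bus == bus && tbl[i].slot == slot &&
		    tbl[i].pin == pin)
			return tbl[i].irq;
	}

	return -1;		/* no routing entry for this device */
}
```

With bus 0 flagged as not PCI, the lookup returns -1 even when a matching
routing entry exists, so the driver would end up with "can't find IRQ".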

Eric
From: Eric W. Biederman on
Yinghai Lu <yinghai(a)kernel.org> writes:

> On 08/03/2010 01:00 AM, Eric W. Biederman wrote:
>> Yinghai Lu <yinghai(a)kernel.org> writes:
>>
>>>>> Index: linux-2.6/arch/x86/kernel/apic/io_apic.c
>>>>> ===================================================================
>>>>> --- linux-2.6.orig/arch/x86/kernel/apic/io_apic.c
>>>>> +++ linux-2.6/arch/x86/kernel/apic/io_apic.c
>>>>> @@ -1029,10 +1029,7 @@ static int pin_2_irq(int idx, int apic,
>>>>> } else {
>>>>> u32 gsi = mp_gsi_routing[apic].gsi_base + pin;
>>>>>
>>>>> - if (gsi >= NR_IRQS_LEGACY)
>>>>> - irq = gsi;
>>>>> - else
>>>>> - irq = gsi_top + gsi;
>>>>> + irq = gsi_to_irq(gsi);
>>>>> }
>>>>>
>>>>> #ifdef CONFIG_X86_32
>>>
>>> what is the point for making irq = gsi_top + gsi when mptable is used instead of acpi?
>>
>> Because it is only convention that when mptables are used that the
>> first apic pins 0-15 are the ISA irqs. This thread witnessed a
>> pci irq that came in pin < 16 that was not an ISA irq. The truly rare
>> and exotic case would be for the ISA irqs to be outside the first 16
>> ioapic pins but the es7000 did exactly that.
>
> nvidia chipset if acpi is enabled, external pci device will use ioapic from 16 to 23.
>
> if mptable is used, external pci device will not use pin from 16 to 23..., and lot of devices will share same pin.

Exactly. Pins < 16 are not necessarily ISA irqs, and can possibly be
shared level-triggered PCI irqs. Unfortunately there are strange
boards like the es7000 where pins > 16 are ISA irqs.

The other thing that is gained by having pin_2_irq always remap pins <
16 is that we can get away with the numerous hard-coded assumptions in
arch/x86 and elsewhere that irq < 16 is an ISA irq.
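
The mapping under discussion can be sketched as a standalone function. This
is a simplified model, not the kernel code: model_pin_2_irq and its
parameters are names invented for the example, and gsi_top is assumed to be
one past the highest gsi of any ioapic.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_IRQS_LEGACY 16	/* the 16 legacy ISA irqs */

/*
 * src_is_isa: the MP table says the interrupt source sits on a non-PCI bus
 * srcbusirq:  irq number the MP table gives for that source
 * gsi:        gsi_base of the ioapic plus the pin number
 * gsi_top:    assumed to be one past the highest gsi in the system
 */
static unsigned int model_pin_2_irq(bool src_is_isa, unsigned int srcbusirq,
				    unsigned int gsi, unsigned int gsi_top)
{
	assert(gsi < gsi_top);

	if (src_is_isa)
		return srcbusirq;	/* ISA source keeps its ISA irq number */

	if (gsi >= NR_IRQS_LEGACY)
		return gsi;		/* PCI irq above the legacy range */

	return gsi_top + gsi;		/* PCI irq on a low pin: move it past gsi_top */
}
```

With a single 24-pin ioapic (so gsi_top of 24, an assumption here), a PCI
interrupt on pin 11 comes out as irq 24 + 11 = 35, while an ISA interrupt
on pin 4 stays irq 4. Nothing below 16 is then ever a PCI irq.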

Eric
From: Yinghai Lu on
On 08/03/2010 01:56 AM, Eric W. Biederman wrote:
> Yinghai Lu <yinghai(a)kernel.org> writes:
>
>> On 08/03/2010 01:00 AM, Eric W. Biederman wrote:
>>> Yinghai Lu <yinghai(a)kernel.org> writes:
>>>
>>>>>> Index: linux-2.6/arch/x86/kernel/apic/io_apic.c
>>>>>> ===================================================================
>>>>>> --- linux-2.6.orig/arch/x86/kernel/apic/io_apic.c
>>>>>> +++ linux-2.6/arch/x86/kernel/apic/io_apic.c
>>>>>> @@ -1029,10 +1029,7 @@ static int pin_2_irq(int idx, int apic,
>>>>>> } else {
>>>>>> u32 gsi = mp_gsi_routing[apic].gsi_base + pin;
>>>>>>
>>>>>> - if (gsi >= NR_IRQS_LEGACY)
>>>>>> - irq = gsi;
>>>>>> - else
>>>>>> - irq = gsi_top + gsi;
>>>>>> + irq = gsi_to_irq(gsi);
>>>>>> }
>>>>>>
>>>>>> #ifdef CONFIG_X86_32
>>>>
>>>> what is the point for making irq = gsi_top + gsi when mptable is used instead of acpi?
>>>
>>> Because it is only convention that when mptables are used that the
>>> first apic pins 0-15 are the ISA irqs. This thread witnessed a
>>> pci irq that came in pin < 16 that was not an ISA irq. The truly rare
>>> and exotic case would be for the ISA irqs to be outside the first 16
>>> ioapic pins but the es7000 did exactly that.
>>
>> nvidia chipset if acpi is enabled, external pci device will use ioapic from 16 to 23.
>>
>> if mptable is used, external pci device will not use pin from 16 to 23..., and lot of devices will share same pin.
>
> Exactly. Pins < 16 are not necessarily ISA irqs, and can be possibly
> shared level triggered PCI irqs. Unfortunately there are strange
> boards like the es7000 where pins > 16 are ISA irqs.
>
> The other thing that is gained by having pin_2_irq always remap pins <
> 16 is we can get away with the numerous hard codes in the arch/x86 and elsewhere
> that assume irq < 16 is an ISA irq.

How about this one?

---
arch/x86/kernel/apic/io_apic.c | 31 ++++++++++++++++++++++++++++---
1 file changed, 28 insertions(+), 3 deletions(-)

Index: linux-2.6/arch/x86/kernel/apic/io_apic.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/apic/io_apic.c
+++ linux-2.6/arch/x86/kernel/apic/io_apic.c
@@ -1013,6 +1013,28 @@ static inline int irq_trigger(int idx)
 	return MPBIOS_trigger(idx);
 }
 
+static int shared_with_legacy(int apic, int pin)
+{
+	int i;
+
+	for (i = 0; i < mp_irq_entries; i++) {
+		int bus = mp_irqs[i].srcbus;
+
+		if (!test_bit(bus, mp_bus_not_pci))
+			continue;
+
+		if (mp_ioapics[apic].apicid != mp_irqs[i].dstapic)
+			continue;
+
+		if (mp_irqs[i].dstirq != pin)
+			continue;
+
+		return mp_irqs[i].srcbusirq;
+	}
+
+	return -1;
+}
+
 static int pin_2_irq(int idx, int apic, int pin)
 {
 	int irq;
@@ -1029,10 +1051,13 @@ static int pin_2_irq(int idx, int apic,
 	} else {
 		u32 gsi = mp_gsi_routing[apic].gsi_base + pin;
 
-		if (gsi >= NR_IRQS_LEGACY)
+		if (gsi >= NR_IRQS_LEGACY) {
 			irq = gsi;
-		else
-			irq = gsi_top + gsi;
+		} else {
+			irq = shared_with_legacy(apic, pin);
+			if (irq < 0)
+				irq = gsi_top + gsi;
+		}
 	}
 
 #ifdef CONFIG_X86_32
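
For anyone who wants to poke at the lookup outside the kernel, here is a
rough standalone model of the loop in the patch above. It works on a
caller-supplied table instead of the global mp_irqs[] array; the struct
model_mp_irq and the function name are hypothetical and exist only for
this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-in for one MP table interrupt entry. */
struct model_mp_irq {
	bool src_is_isa;	/* source bus is flagged as not PCI */
	int dstapic;		/* id of the destination ioapic */
	int dstirq;		/* pin on that ioapic */
	int srcbusirq;		/* irq number on the source bus */
};

/*
 * Return the ISA irq wired to (apicid, pin), or -1 if no ISA entry
 * targets that pin. Same filtering order as the patch: skip PCI
 * sources, then other ioapics, then other pins.
 */
static int model_shared_with_legacy(const struct model_mp_irq *tbl, size_t n,
				    int apicid, int pin)
{
	size_t i;

	assert(tbl != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (!tbl[i].src_is_isa)
			continue;
		if (tbl[i].dstapic != apicid)
			continue;
		if (tbl[i].dstirq != pin)
			continue;
		return tbl[i].srcbusirq;
	}

	return -1;
}
```

A PCI interrupt on a low pin would keep the ISA irq number only when an
ISA entry points at the same ioapic pin; otherwise the caller falls back
to gsi_top + gsi as before.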
From: Eric W. Biederman on
Yinghai Lu <yinghai(a)kernel.org> writes:

> On 08/03/2010 01:56 AM, Eric W. Biederman wrote:
>> Yinghai Lu <yinghai(a)kernel.org> writes:
>>
>>> On 08/03/2010 01:00 AM, Eric W. Biederman wrote:
>>>> Yinghai Lu <yinghai(a)kernel.org> writes:
>>>>
>>>>>>> Index: linux-2.6/arch/x86/kernel/apic/io_apic.c
>>>>>>> ===================================================================
>>>>>>> --- linux-2.6.orig/arch/x86/kernel/apic/io_apic.c
>>>>>>> +++ linux-2.6/arch/x86/kernel/apic/io_apic.c
>>>>>>> @@ -1029,10 +1029,7 @@ static int pin_2_irq(int idx, int apic,
>>>>>>> } else {
>>>>>>> u32 gsi = mp_gsi_routing[apic].gsi_base + pin;
>>>>>>>
>>>>>>> - if (gsi >= NR_IRQS_LEGACY)
>>>>>>> - irq = gsi;
>>>>>>> - else
>>>>>>> - irq = gsi_top + gsi;
>>>>>>> + irq = gsi_to_irq(gsi);
>>>>>>> }
>>>>>>>
>>>>>>> #ifdef CONFIG_X86_32
>>>>>
>>>>> what is the point for making irq = gsi_top + gsi when mptable is used instead of acpi?
>>>>
>>>> Because it is only convention that when mptables are used that the
>>>> first apic pins 0-15 are the ISA irqs. This thread witnessed a
>>>> pci irq that came in pin < 16 that was not an ISA irq. The truly rare
>>>> and exotic case would be for the ISA irqs to be outside the first 16
>>>> ioapic pins but the es7000 did exactly that.
>>>
>>> nvidia chipset if acpi is enabled, external pci device will use ioapic from 16 to 23.
>>>
>>> if mptable is used, external pci device will not use pin from 16 to 23..., and lot of devices will share same pin.
>>
>> Exactly. Pins < 16 are not necessarily ISA irqs, and can be possibly
>> shared level triggered PCI irqs. Unfortunately there are strange
>> boards like the es7000 where pins > 16 are ISA irqs.
>>
>> The other thing that is gained by having pin_2_irq always remap pins <
>> 16 is we can get away with the numerous hard codes in the arch/x86 and elsewhere
>> that assume irq < 16 is an ISA irq.
>
> how about this one ?

You can't share an edge-triggered ISA irq; it isn't really physically
possible. So I don't see how this extra complexity will change anything.

Eric


> ---
> arch/x86/kernel/apic/io_apic.c | 31 ++++++++++++++++++++++++++++---
> 1 file changed, 28 insertions(+), 3 deletions(-)
>
> Index: linux-2.6/arch/x86/kernel/apic/io_apic.c
> ===================================================================
> --- linux-2.6.orig/arch/x86/kernel/apic/io_apic.c
> +++ linux-2.6/arch/x86/kernel/apic/io_apic.c
> @@ -1013,6 +1013,28 @@ static inline int irq_trigger(int idx)
> return MPBIOS_trigger(idx);
> }
>
> +static int shared_with_legacy(int apic, int pin)
> +{
> + int i;
> +
> + for (i = 0; i < mp_irq_entries; i++) {
> + int bus = mp_irqs[i].srcbus;
> +
> + if (!test_bit(bus, mp_bus_not_pci))
> + continue;
> +
> + if (mp_ioapics[apic].apicid != mp_irqs[i].dstapic)
> + continue;
> +
> + if (mp_irqs[i].dstirq != pin)
> + continue;
> +
> + return mp_irqs[i].srcbusirq;
> + }
> +
> + return -1;
> +}
> +
> static int pin_2_irq(int idx, int apic, int pin)
> {
> int irq;
> @@ -1029,10 +1051,13 @@ static int pin_2_irq(int idx, int apic,
> } else {
> u32 gsi = mp_gsi_routing[apic].gsi_base + pin;
>
> - if (gsi >= NR_IRQS_LEGACY)
> + if (gsi >= NR_IRQS_LEGACY) {
> irq = gsi;
> - else
> - irq = gsi_top + gsi;
> + } else {
> + irq = shared_with_legacy(apic, pin);
> + if (irq < 0)
> + irq = gsi_top + gsi;
> + }
> }
>
> #ifdef CONFIG_X86_32