From: Maciej W. Rozycki on
On Mon, 2 Nov 2009, tip-bot for Suresh Siddha wrote:

> Commit-ID: b3ec0a37a7907813bb4fb85a2d94102c152470b7
> Gitweb: http://git.kernel.org/tip/b3ec0a37a7907813bb4fb85a2d94102c152470b7
> Author: Suresh Siddha <suresh.b.siddha(a)intel.com>
> AuthorDate: Mon, 26 Oct 2009 14:24:35 -0800
> Committer: Ingo Molnar <mingo(a)elte.hu>
> CommitDate: Mon, 2 Nov 2009 15:56:36 +0100
>
> x86: Use EOI register in io-apic on intel platforms
>
> IO-APIC's in intel chipsets support EOI register starting from
> IO-APIC version 2. Use that when ever we need to clear the
> IO-APIC RTE's RemoteIRR bit explicitly.
>
> Signed-off-by: Suresh Siddha <suresh.b.siddha(a)intel.com>
> Acked-by: Gary Hade <garyhade(a)us.ibm.com>
> Cc: Eric W. Biederman <ebiederm(a)xmission.com>
> LKML-Reference: <20091026230001.947855317(a)sbs-t61.sc.intel.com>
> [ Marked use_eio_reg as __read_mostly, fixed small details ]
> Signed-off-by: Ingo Molnar <mingo(a)elte.hu>
> ---
[...]
> +static int ioapic_supports_eoi(void)
> +{
> + struct pci_dev *root;
> +
> + root = pci_get_bus_and_slot(0, PCI_DEVFN(0, 0));
> + if (root && root->vendor == PCI_VENDOR_ID_INTEL &&
> + mp_ioapics[0].apicver >= 0x2) {
> + use_eoi_reg = 1;
> + printk(KERN_INFO "IO-APIC supports EOI register\n");
> + } else
> + printk(KERN_INFO "IO-APIC doesn't support EOI\n");
> +
> + return 0;
> +}

This is wrong -- the 82093AA I/O APIC has its version set to 0x11 and it
does not support the EOI register. Similarly I/O APICs integrated into
the 82379AB south bridge and the 82374EB/SB EISA component.

Overall values below 0x10 are reserved for the 82489DX -- are you sure
you didn't mean 0x12 or 0x20?

Maciej
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Suresh Siddha on
On Tue, 2009-11-03 at 16:53 -0800, Maciej W. Rozycki wrote:
> On Mon, 2 Nov 2009, tip-bot for Suresh Siddha wrote:
>
> > Commit-ID: b3ec0a37a7907813bb4fb85a2d94102c152470b7
> > Gitweb: http://git.kernel.org/tip/b3ec0a37a7907813bb4fb85a2d94102c152470b7
> > Author: Suresh Siddha <suresh.b.siddha(a)intel.com>
> > AuthorDate: Mon, 26 Oct 2009 14:24:35 -0800
> > Committer: Ingo Molnar <mingo(a)elte.hu>
> > CommitDate: Mon, 2 Nov 2009 15:56:36 +0100
> >
> > x86: Use EOI register in io-apic on intel platforms
> >
> > IO-APIC's in intel chipsets support EOI register starting from
> > IO-APIC version 2. Use that when ever we need to clear the
> > IO-APIC RTE's RemoteIRR bit explicitly.
> >
> > Signed-off-by: Suresh Siddha <suresh.b.siddha(a)intel.com>
> > Acked-by: Gary Hade <garyhade(a)us.ibm.com>
> > Cc: Eric W. Biederman <ebiederm(a)xmission.com>
> > LKML-Reference: <20091026230001.947855317(a)sbs-t61.sc.intel.com>
> > [ Marked use_eio_reg as __read_mostly, fixed small details ]
> > Signed-off-by: Ingo Molnar <mingo(a)elte.hu>
> > ---
> [...]
> > +static int ioapic_supports_eoi(void)
> > +{
> > + struct pci_dev *root;
> > +
> > + root = pci_get_bus_and_slot(0, PCI_DEVFN(0, 0));
> > + if (root && root->vendor == PCI_VENDOR_ID_INTEL &&
> > + mp_ioapics[0].apicver >= 0x2) {
> > + use_eoi_reg = 1;
> > + printk(KERN_INFO "IO-APIC supports EOI register\n");
> > + } else
> > + printk(KERN_INFO "IO-APIC doesn't support EOI\n");
> > +
> > + return 0;
> > +}
>
> This is wrong -- the 82093AA I/O APIC has its version set to 0x11 and it
> does not support the EOI register. Similarly I/O APICs integrated into
> the 82379AB south bridge and the 82374EB/SB EISA component.

Maciej, There might be some confusion (mostly on my side). When I looked
at ICH2 spec http://www.intel.com/assets/pdf/datasheet/290687.pdf it
says it has an EOI register and it is version 2.

All I heard internally was we have it from version 2 and it works (our
experiments on ICH4 etc worked).

But I do agree that I overlooked the version 11h of 82093AA (which is
older than ICH2).

> Overall values below 0x10 are reserved for the 82489DX

This is for local apic right?

> -- are you sure
> you didn't mean 0x12 or 0x20?

Let me check tomorrow and see where the confusion is.

thanks for looking at this.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Suresh Siddha on
On Tue, 2009-11-03 at 18:24 -0800, Suresh Siddha wrote:
> Maciej, There might be some confusion (mostly on my side). When I looked
> at ICH2 spec http://www.intel.com/assets/pdf/datasheet/290687.pdf it
> says it has an EOI register and it is version 2.

Ok, Issue is in the specs documentation. ICH2 to ICH5 seems to document
that they use version 0x2 ioapic's, while infact they use 0x20 version
and has a working EOI register. So we should use 0x20 as the version
check and not 0x2.

Maciej, can I have your ack for the appended patch?
---

From: Suresh Siddha <suresh.b.siddha(a)intel.com>
Subject: x86, ioapic: fix io-apic version check for the EOI reg presence

Maciej W. Rozycki reported:
> 82093AA I/O APIC has its version set to 0x11 and it
> does not support the EOI register. Similarly I/O APICs integrated into
> the 82379AB south bridge and the 82374EB/SB EISA component.

IO-APIC versions below 0x20 don't support EOI register.

Some of the Intel ICH Specs (ICH2 to ICH5) documents the io-apic
version as 0x2. This is an error with documentation and these ICH chips
use io-apic's of version 0x20 and indeed has a working EOI register
for the io-apic.

Fix the ioapic_supports_eoi() to check for version 0x20 and beyond.

Reported-by: Maciej W. Rozycki <macro(a)linux-mips.org>
Acked-by: Rajesh Sankaran <rajesh.sankaran(a)intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha(a)intel.com>
---

diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index 47d95c3..68510d4 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -2520,13 +2520,29 @@ static void eoi_ioapic_irq(struct irq_desc *desc)
spin_unlock_irqrestore(&ioapic_lock, flags);
}

+/*
+ * IO-APIC versions below 0x20 don't support EOI register.
+ * For the record, here is the information about various versions:
+ * 0Xh 82489DX
+ * 1Xh I/OAPIC or I/O(x)APIC which are not PCI 2.2 Compliant
+ * 2Xh I/O(x)APIC which is PCI 2.2 Compliant
+ * 30h-FFh Reserved
+ *
+ * Some of the Intel ICH Specs (ICH2 to ICH5) documents the io-apic
+ * version as 0x2. This is an error with documentation and these ICH chips
+ * use io-apic's of version 0x20.
+ */
static int ioapic_supports_eoi(void)
{
struct pci_dev *root;

root = pci_get_bus_and_slot(0, PCI_DEVFN(0, 0));
+ /*
+ * Perhaps we can do this for all vendors. But for now,
+ * no one else seems to have version >= 0x20 ??
+ */
if (root && root->vendor == PCI_VENDOR_ID_INTEL &&
- mp_ioapics[0].apicver >= 0x2) {
+ mp_ioapics[0].apicver >= 0x20) {
use_eoi_reg = 1;
printk(KERN_INFO "IO-APIC supports EOI register\n");
} else


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Maciej W. Rozycki on
On Wed, 4 Nov 2009, Suresh Siddha wrote:

> > Maciej, There might be some confusion (mostly on my side). When I looked
> > at ICH2 spec http://www.intel.com/assets/pdf/datasheet/290687.pdf it
> > says it has an EOI register and it is version 2.

For the record: the original ICH/ICH0 (82801AA/AB) seems to have the
version set to 0x11 as well. Now its datasheet mentions the EOI register,
but even if it is not a mistake, its usefulness is rather limited as the
APIC does not support FSB delivery.

Also for the record: the 82489DX is a combined local and I/O APIC chip --
quite reasonable given it was intended for 486-class processors which
needed both units and the cost of manufacturing a separate chip for each
unit plus the extra PCB space needed in that case certainly outweighed the
loss from having a part of silicon unused on some chips.

> Ok, Issue is in the specs documentation. ICH2 to ICH5 seems to document
> that they use version 0x2 ioapic's, while infact they use 0x20 version
> and has a working EOI register. So we should use 0x20 as the version
> check and not 0x2.

I've checked some docs too and you may want to check whether to qualify
the use of the EOI register further with the DT (Delivery Type) bit in the
Boot Configuration register. It affects ICH2 at least (but some I/O APICs
do this differently; e.g. the PID or 82094AA changes its version between
0x13 and 0x21 depending on whether FSB delivery is used or not).
Otherwise you may have interrupts delivered twice (though it may not be a
big problem after all).

Note that we have a piece of code to report the existence of the DT bit
already, so we've got the knowledge about APIC versions already. ;) I
guess it should be modified a bit though as reportedly the PID does not
have the Boot Configuration register, so the version checked there should
be 0x20 exactly, rather than greater or equal.

Also you may want to see whether the complication in ack_apic_level()
that is meant to deal with an APIC erratum really matters for FSB delivery
-- I guess not, because if you explicitly ACK an interrupt in the I/O
unit, then even if it was incorrectly recorded as edge-triggered in the
local unit, the IRR bit will be correctly reset and the next message
delivered properly. Given you introduce a conditional statement anyway,
you can place more code within it and there will be no performance hit for
the other path and certainly a gain for this one.

Additionally it seems to me that code to migrate an IRQ is placed
incorrectly -- the erratum workaround should be placed above it as it
complements ack_APIC_irq() -- the IRR bit will not have been reset before
the workaround has been executed if the erratum was hit. Will you care
about this problem?

> Maciej, can I have your ack for the appended patch?

Certainly, it looks good to me.

Acked-by: Maciej W. Rozycki <macro(a)linux-mips.org>

> ---
>
> From: Suresh Siddha <suresh.b.siddha(a)intel.com>
> Subject: x86, ioapic: fix io-apic version check for the EOI reg presence
>
> Maciej W. Rozycki reported:
> > 82093AA I/O APIC has its version set to 0x11 and it
> > does not support the EOI register. Similarly I/O APICs integrated into
> > the 82379AB south bridge and the 82374EB/SB EISA component.
>
> IO-APIC versions below 0x20 don't support EOI register.
>
> Some of the Intel ICH Specs (ICH2 to ICH5) documents the io-apic
> version as 0x2. This is an error with documentation and these ICH chips
> use io-apic's of version 0x20 and indeed has a working EOI register
> for the io-apic.
>
> Fix the ioapic_supports_eoi() to check for version 0x20 and beyond.
>
> Reported-by: Maciej W. Rozycki <macro(a)linux-mips.org>
> Acked-by: Rajesh Sankaran <rajesh.sankaran(a)intel.com>
> Signed-off-by: Suresh Siddha <suresh.b.siddha(a)intel.com>
> ---
>
> diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
> index 47d95c3..68510d4 100644
> --- a/arch/x86/kernel/apic/io_apic.c
> +++ b/arch/x86/kernel/apic/io_apic.c
> @@ -2520,13 +2520,29 @@ static void eoi_ioapic_irq(struct irq_desc *desc)
> spin_unlock_irqrestore(&ioapic_lock, flags);
> }
>
> +/*
> + * IO-APIC versions below 0x20 don't support EOI register.
> + * For the record, here is the information about various versions:
> + * 0Xh 82489DX
> + * 1Xh I/OAPIC or I/O(x)APIC which are not PCI 2.2 Compliant
> + * 2Xh I/O(x)APIC which is PCI 2.2 Compliant
> + * 30h-FFh Reserved
> + *
> + * Some of the Intel ICH Specs (ICH2 to ICH5) documents the io-apic
> + * version as 0x2. This is an error with documentation and these ICH chips
> + * use io-apic's of version 0x20.
> + */
> static int ioapic_supports_eoi(void)
> {
> struct pci_dev *root;
>
> root = pci_get_bus_and_slot(0, PCI_DEVFN(0, 0));
> + /*
> + * Perhaps we can do this for all vendors. But for now,
> + * no one else seems to have version >= 0x20 ??
> + */
> if (root && root->vendor == PCI_VENDOR_ID_INTEL &&
> - mp_ioapics[0].apicver >= 0x2) {
> + mp_ioapics[0].apicver >= 0x20) {
> use_eoi_reg = 1;
> printk(KERN_INFO "IO-APIC supports EOI register\n");
> } else
>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Suresh Siddha on
On Thu, 2009-11-05 at 06:46 -0800, Maciej W. Rozycki wrote:
> For the record: the original ICH/ICH0 (82801AA/AB) seems to have the
> version set to 0x11 as well. Now its datasheet mentions the EOI register,
> but even if it is not a mistake,

hmm. what a mess.

I found a kernel log in an archive
(http://www.mail-archive.com/linux-usb-users(a)lists.sourceforge.net/msg13255.html) for ICH0 which shows that it uses an io-apic of version 0x20. So mostly the version in the datasheet is wrong here aswell :-(

And our IO/chipset architect (Rajesh) also confirmed that ICH0 indeed
has an EOI reg.

So, version 0x20 seems to be pretty safe to use EOI register. And
according to Gary Hade's experiments on IBM platforms having an io-apic
version 0x11, magic of mask+edge and unmask+level seems to clear
remoteIRR.

So the current logic looks safe to me.

> Also for the record: the 82489DX is a combined local and I/O APIC chip --

Thanks. This is what I learnt internally too yesterday and hence the
documentation update ;)

> I've checked some docs too and you may want to check whether to qualify
> the use of the EOI register further with the DT (Delivery Type) bit in the
> Boot Configuration register. It affects ICH2 at least

For relatively newer ICH's like ICH5, boot configuration register is
marked as a reserved register (perhaps with the serial bus going away,
so did this register but again with out the io-apic version change).

Anyways, our understanding is that EOI register in ICH2 should work
irrespective of DT bit in the boot config register.

> Also you may want to see whether the complication in ack_apic_level()
> that is meant to deal with an APIC erratum really matters for FSB delivery
> -- I guess not, because if you explicitly ACK an interrupt in the I/O
> unit, then even if it was incorrectly recorded as edge-triggered in the
> local unit, the IRR bit will be correctly reset and the next message
> delivered properly. Given you introduce a conditional statement anyway,
> you can place more code within it and there will be no performance hit for
> the other path and certainly a gain for this one.

I am not sure if I follow. With the recent changes (tip commit
5231a68614b94f60e8f6c56bc6e3d75955b9e75e), we use ipi on a new cpu to
handle a pending level-triggered interrupt on the cpu that is going
offline. It's just not only in the case of io-apic erratum for 0x11, we
see level triggered interrupt as edge interrupt at the cpu.


> Additionally it seems to me that code to migrate an IRQ is placed
> incorrectly -- the erratum workaround should be placed above it as it
> complements ack_APIC_irq() -- the IRR bit will not have been reset before
> the workaround has been executed if the erratum was hit. Will you care
> about this problem?

Yes. I see this issue and agree with your assesment. The result is that
we missed an irq migration attempt and delay it to the next arrival.

I will post a different fix and also update some of the code comments
around this to reflect new changes in the code.

> > Maciej, can I have your ack for the appended patch?
>
> Certainly, it looks good to me.
>
> Acked-by: Maciej W. Rozycki <macro(a)linux-mips.org>

Thanks. Ingo, Can you please queue this patch too? I am planning to do
couple of more cleanups on top of this. I will post them shortly.

thanks,
suresh

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/