From: Lennart Sorensen on
On Wed, Jan 20, 2010 at 11:37:07AM -0500, Alan Stern wrote:
> It's true that the USB controllers appear to be the only devices using
> IRQ 5. But your startup log shows that only two of the three
> controllers have an associated driver. Maybe the third controller is
> generating the unwanted interrupts?

Could be. But I never saw the problem with 2.6.26. That's puzzling me.

Now I think the 3rd controller may be a client mode interface on the
Geode LX, which I certainly don't use if it has such a thing.

I know what EHCI and OHCI mean. I don't know what UDC is. I see mention
of UDC under USB gadget support. I will try enabling that driver and
see what it does.

--
Len Sorensen
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Lennart Sorensen on
On Wed, Jan 20, 2010 at 02:42:01PM -0500, Lennart Sorensen wrote:
> On Wed, Jan 20, 2010 at 11:37:07AM -0500, Alan Stern wrote:
> > It's true that the USB controllers appear to be the only devices using
> > IRQ 5. But your startup log shows that only two of the three
> > controllers have an associated driver. Maybe the third controller is
> > generating the unwanted interrupts?
>
> Could be. But I never saw the problem with 2.6.26. That's puzzling me.
>
> Now I think the 3rd controller may be a client mode interface on the
> Geode LX, which I certainly don't use if it has such a thing.
>
> I know what EHCI and OHCI mean. I don't know what UDC is. I see mention
> of UDC under USB gadget support. I will try enabling that driver and
> see what it does.

So the only difference is that now it happens 3 times instead of 2.
Once each for ehci, ohci and amd5536udc.

Interestingly the number of interrupts listed in /proc/interrupts is
exactly 300000.

           CPU0
  0:      62675    XT-PIC-XT    timer
  2:          0    XT-PIC-XT    cascade
  3:          7    XT-PIC-XT
  4:        772    XT-PIC-XT    serial
  5:     300000    XT-PIC-XT    ehci_hcd:usb1, ohci_hcd:usb2, amd5536udc
  7:          1    XT-PIC-XT
  8:          2    XT-PIC-XT    rtc0
 10:          1    XT-PIC-XT    geode-mfgpt, eth2
 11:         68    XT-PIC-XT    eth1
 14:       2237    XT-PIC-XT    pata_cs5536
NMI:          0    Non-maskable interrupts
LOC:          0    Local timer interrupts
SPU:          0    Spurious interrupts
PMI:          0    Performance monitoring interrupts
PND:          0    Performance pending work
ERR:          1
MIS:          0

That's a lot of interrupts for a system that was just booted.
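
For anyone who wants to spot a line like IRQ 5 without reading the table by eye, here is a minimal sketch that parses a /proc/interrupts dump and reports the busiest entry. The function names (parse_interrupts, busiest) are made up for illustration, and the parser assumes a single-CPU layout like the one above; with more than one CPU column it would need adjusting.

```python
# Hypothetical helper, not part of any kernel tool: parse a single-CPU
# /proc/interrupts dump and find the line with the highest count.

SAMPLE = """\
CPU0
0: 62675 XT-PIC-XT timer
4: 772 XT-PIC-XT serial
5: 300000 XT-PIC-XT ehci_hcd:usb1, ohci_hcd:usb2, amd5536udc
14: 2237 XT-PIC-XT pata_cs5536
ERR: 1
"""


def parse_interrupts(text):
    """Return {label: (count, description)} for a single-CPU dump."""
    rows = {}
    for line in text.splitlines():
        label, sep, rest = line.partition(":")
        if not sep:
            continue  # header line such as "CPU0" has no colon
        fields = rest.split(None, 1)
        if not fields or not fields[0].isdigit():
            continue  # skip anything that does not start with a count
        count = int(fields[0])
        desc = fields[1].strip() if len(fields) > 1 else ""
        rows[label.strip()] = (count, desc)
    return rows


def busiest(rows):
    """Return the label of the entry with the largest interrupt count."""
    return max(rows, key=lambda label: rows[label][0])


if __name__ == "__main__":
    rows = parse_interrupts(SAMPLE)
    label = busiest(rows)
    print(label, rows[label])
```

On a live system the same functions could be fed the contents of /proc/interrupts read twice a few seconds apart, to see whether the count is still climbing or has stopped because the line was disabled.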

Hmm, I just tried booting 2.6.26 again, and now it too is failing.
I think my box just broke. Aghhh!

So I think that means there is no problem, I just have to go get
another box. I hate hardware problems. :)

--
Len Sorensen
From: Lennart Sorensen on
On Wed, Jan 20, 2010 at 03:20:46PM -0500, Alan Stern wrote:
> Okay, this means that something else is affecting the IRQ line.

Yep. I went and disabled irq5 on the LPC bus, and that didn't help.

> > Interestingly the number of interrupts listed in /proc/interrupts is
> > exactly 300000.
>
> That's probably because the kernel disabled the IRQ after 100000
> unhandled interrupts, and re-enabled it each time a new device was
> registered for that IRQ.

Certainly seems to be the case.
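
The arithmetic behind that explanation can be sketched with a toy model. This is not the kernel's actual accounting, only an illustration of the behaviour Alan describes: the line is shut off after 100000 unhandled interrupts and switched back on whenever another driver registers on the same IRQ. The class and function names are invented for the example.

```python
# Toy model of the behaviour described in the thread, NOT kernel code.
# Assumption: a stuck line fires continuously, nothing ever handles it,
# and each new handler registration re-enables the line.

SPURIOUS_LIMIT = 100_000  # threshold mentioned in the thread


class IrqLine:
    def __init__(self):
        self.total = 0       # what /proc/interrupts would show
        self.unhandled = 0   # unhandled interrupts since last re-enable
        self.enabled = False

    def register_handler(self):
        # Registering a driver (re-)enables the line and restarts the count.
        self.enabled = True
        self.unhandled = 0

    def storm(self):
        # The stuck line keeps firing until the limit disables it.
        while self.enabled:
            self.total += 1
            self.unhandled += 1
            if self.unhandled >= SPURIOUS_LIMIT:
                self.enabled = False


def simulate(drivers):
    """Return the total interrupt count after each driver registers in turn."""
    line = IrqLine()
    for _ in drivers:
        line.register_handler()
        line.storm()
    return line.total


if __name__ == "__main__":
    print(simulate(["ehci_hcd", "ohci_hcd", "amd5536udc"]))  # prints 300000
```

With two drivers the model gives 200000 and with three it gives 300000, which matches the "3 times instead of 2" observation once amd5536udc was enabled.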

> Or you could just stop using USB on the old box. :-)

Well I just tried another box, and same thing.

If I boot with 'irqpoll' then everything seems fine.

Any idea how I can figure out why irqpoll makes it happy? Does it give
any reports anywhere about misrouted irqs?

--
Len Sorensen
From: Lennart Sorensen on
On Wed, Jan 20, 2010 at 05:06:53PM -0500, Alan Stern wrote:
> On Wed, 20 Jan 2010, Lennart Sorensen wrote:
>
> > On Wed, Jan 20, 2010 at 03:20:46PM -0500, Alan Stern wrote:
> > > Okay, this means that something else is affecting the IRQ line.
> >
> > Yep. I went and disabled irq5 on the LPC bus, and that didn't help.
> >
> > > > Interestingly the number of interrupts listed in /proc/interrupts is
> > > > exactly 300000.
> > >
> > > That's probably because the kernel disabled the IRQ after 100000
> > > unhandled interrupts, and re-enabled it each time a new device was
> > > registered for that IRQ.
> >
> > Certainly seems to be the case.
> >
> > > Or you could just stop using USB on the old box. :-)
> >
> > Well I just tried another box, and same thing.
> >
> > If I boot with 'irqpoll' then everything seems fine.
> >
> > Any idea how I can figure out why irqpoll makes it happy? Does it give
> > any reports anywhere about misrouted irqs?
>
> Not that I know of.

That's unfortunate. If irqpoll finds an unhandled irq, calls all the
other handlers, and the irq goes away, then a report of which handler
cleared it would be very helpful.

> Does 2.6.26 fail on the new machine?

Yes.

As far as I can tell some machines don't see the problem. I am still
investigating that.

> Are you using a .config different from the one that used to work with
> 2.6.26?

Well of course, but not for the USB settings. It is supposed to be the
same as far as possible. I will go compare them in more detail.

--
Len Sorensen
From: Lennart Sorensen on
On Wed, Jan 20, 2010 at 08:52:03PM -0500, Alan Stern wrote:
> You could try changing the set of drivers loaded in memory. For
> example, after booting with irqpoll you can unload ehci-hcd plus some
> other driver, and then load ehci-hcd again. If the IRQ storm occurs
> then you'll know where to look.
>
> > > Does 2.6.26 fail on the new machine?
> >
> > Yes.
> >
> > As far as I can tell some machines don't see the problem. I am still
> > investigating that.
>
> Same kernel, initramfs, and everything else?

Same 2.6.26 kernel, there is no initramfs involved. Some user space
code has been updated on this one.

> What I meant was: Is your current 2.6.26 still built using the same
> .config as the old 2.6.26 which used to work okay?

I haven't actually rebuilt 2.6.26. It is the same kernel image.

I wonder if some of the CPU modules have a flaw that's making the IRQ
line go crazy. I wasn't normally using USB before, except when a cellular
USB modem was installed.

I will try booting the same release of software on my box as the one
that works and see if it still sees a problem...

So it turns out every box behaves the same, but only if I change the
order things are loaded in. If I load the USB driver before programming
the FPGA on the LPC bus, then I get the IRQ flood. If I program the
FPGA first, then no problem. I don't understand why, since the LPC bus
has been told not to handle IRQ 5 at all, and the FPGA has the pin
tri-stated and it is pulled up as per the LPC SERIRQ spec, yet somehow
it is driving the system mad.

So the whole problem seems to be that by adding udev I made USB load
earlier than it used to, and when I tried building USB into the kernel
instead, it loaded even earlier, and in both cases the kernel complains.
Previously I just loaded things manually in the order that made sense,
but I really want to move to more modern ways of doing things. I guess
I will have to tell udev to ignore the USB controller and load that
driver manually after the FPGA. At least things make some sense now.

--
Len Sorensen