From: Andreas Mohr on
Hi,

> Furthermore, the patch shows that the second-to-last argument to
> usb_fill_bulk_urb() -- the completion function -- is NULL. That is
> strictly illegal and it should have caused an oops as soon as the URB
> was used.

Then there's definitely a WARN_ON or so missing in
static inline void usb_fill_bulk_urb()

And highly likely more checks in those areas that are causing my (and
other people's) ftdi_sio tests and USB audio (MIPS mmap) to fail.
Followup soon.

Andreas Mohr
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Alan Stern on
On Sat, 5 Dec 2009, Andreas Mohr wrote:

> Hi,
>
> > Furthermore, the patch shows that the second-to-last argument to
> > usb_fill_bulk_urb() -- the completion function -- is NULL. That is
> > strictly illegal and it should have caused an oops as soon as the URB
> > was used.
>
> Then there's definitely a WARN_ON or so missing in
> static inline void usb_fill_bulk_urb()

No there isn't. That inline just fills in a bunch of fields.

You could argue that there is a WARN_ON missing in usb_submit_urb().
I don't think one is necessary, but you might disagree. Either way,
both of us missed the fact that right at the start of usb_submit_urb()
is a check for urb->complete being NULL; if it is NULL then the
submission simply fails (and there is no oops).
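
The split described above (the fill helper only stores fields, and the submit path rejects a NULL completion handler) can be sketched in plain userspace C. This is a mock under stated assumptions, not kernel code: struct mock_urb, mock_fill_bulk_urb(), mock_submit_urb(), mock_done() and demo_null_completion() are hypothetical names, the device and pipe arguments of the real helper are left out, and the -EINVAL return value is assumed for the failed submission.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct mock_urb;
typedef void (*mock_complete_t)(struct mock_urb *urb);

/* Hypothetical, cut-down stand-in for struct urb. */
struct mock_urb {
    void *transfer_buffer;
    int transfer_buffer_length;
    mock_complete_t complete;   /* completion handler */
    void *context;
};

/* Like the inline helper under discussion: it only stores fields and
 * does not validate any of them. */
static inline void mock_fill_bulk_urb(struct mock_urb *urb, void *buf,
                                      int len, mock_complete_t complete_fn,
                                      void *context)
{
    urb->transfer_buffer = buf;
    urb->transfer_buffer_length = len;
    urb->complete = complete_fn;
    urb->context = context;
}

/* The one core place for the check: a missing completion handler makes
 * the submission fail with an error code instead of crashing later. */
int mock_submit_urb(struct mock_urb *urb)
{
    if (!urb || !urb->complete)
        return -EINVAL;         /* assumed error code */
    return 0;                   /* a real implementation would queue the URB */
}

/* Trivial completion handler used by the demo below. */
void mock_done(struct mock_urb *urb)
{
    (void)urb;
}

/* Usage: a NULL completion is rejected, a valid one is accepted. */
void demo_null_completion(void)
{
    struct mock_urb urb;
    char buf[64];

    mock_fill_bulk_urb(&urb, buf, (int)sizeof(buf), NULL, NULL);
    assert(mock_submit_urb(&urb) == -EINVAL);

    mock_fill_bulk_urb(&urb, buf, (int)sizeof(buf), mock_done, NULL);
    assert(mock_submit_urb(&urb) == 0);
}
```

In this sketch a driver that passes NULL sees nothing worse than a failed submission, which matches the "no oops" observation above.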

> And highly likely more checks in those areas that are causing my (and
> other people's) ftdi_sio tests and USB audio (MIPS mmap) to fail.
> Followup soon.

Sometimes having too many checks is worse than having too few,
especially if the failure modes are relatively easy to handle.

Alan Stern

From: Andreas Mohr on
Hi,

On Sat, Dec 05, 2009 at 12:16:13PM -0500, Alan Stern wrote:
> On Sat, 5 Dec 2009, Andreas Mohr wrote:
>
> > Hi,
> >
> > > Furthermore, the patch shows that the second-to-last argument to
> > > usb_fill_bulk_urb() -- the completion function -- is NULL. That is
> > > strictly illegal and it should have caused an oops as soon as the URB
> > > was used.
> >
> > Then there's definitely a WARN_ON or so missing in
> > static inline void usb_fill_bulk_urb()
>
> No there isn't. That inline just fills in a bunch of fields.

In hindsight, yes: such an inline helper isn't really the appropriate
place for a check.

> > And highly likely more checks in those areas that are causing my (and
> > other people's) ftdi_sio tests and USB audio (MIPS mmap) to fail.
> > Followup soon.
>
> Sometimes having too many checks is worse than having too few,
> especially if the failure modes are relatively easy to handle.

True, scattering checks across all sorts of caller sites, instead of
putting one in the core place where it matters, clutters things,
especially since WARN_ON checks are unconditional rather than debug-only.

Andreas Mohr
From: Ondrej Zary on
On Friday 04 December 2009, Alan Stern wrote:
> On Fri, 4 Dec 2009, Ondrej Zary wrote:
> > [ 3.712039] usb 2-1: new full speed USB device using uhci_hcd and address 2
> > [ 3.726791] kjournald starting. Commit interval 5 seconds
> > [ 3.726817] EXT3-fs (sda2): mounted filesystem with ordered data mode
> > [ 3.851384] usb 2-1: not running at top speed; connect to a high speed hub
> > [ 3.859379] usb 2-1: New USB device found, idVendor=2001, idProduct=f103
> > [ 3.859387] usb 2-1: New USB device strings: Mfr=0, Product=0, SerialNumber=0
> > [ 3.862635] hub 2-1:1.0: USB hub found
> > [ 3.864385] hub 2-1:1.0: 7 ports detected
>
> That's the problem. Notice the "not running at top speed" message?
> Something went wrong when the hub was detected. It could be a problem
> in your EHCI controller or a problem in the hub.
>
> You can get more information about this by unplugging the hub, running
> usbmon (on the 0u file), and then plugging the hub back in.
>
> > diff between ehci and uhci logs:
> > There seems to be some problem with timing.
> > Also order of ehci_usb vs uhci_usb loading changes.
>
> That order doesn't matter much. But this...

Looks like it does matter. I compiled ehci_hcd into the kernel and left
uhci_hcd as a module, and the hub was always attached to ehci. With uhci_hcd
compiled into the kernel and ehci_hcd as a module, it was always attached to
uhci. So the HW is probably OK.

> > --- dmesg-ehci.txt- 2009-12-04 20:01:39.000000000 +0100
> > +++ dmesg-uhci.txt- 2009-12-04 20:01:31.000000000 +0100
> > @@ -144,10 +144,9 @@
> > Console: colour VGA+ 80x25
> > console [tty0] enabled
> > hpet clockevent registered
> > - Fast TSC calibration failed
> > - TSC: PIT calibration matches HPET. 1 loops
> > - Detected 1608.000 MHz processor.
> > - Calibrating delay loop (skipped), value calculated using timer frequency.. 3216.00 BogoMIPS (lpj=6432000)
> > + Fast TSC calibration using PIT
> > + Detected 1608.123 MHz processor.
> > + Calibrating delay loop (skipped), value calculated using timer frequency.. 3216.24 BogoMIPS (lpj=6432492)
> > Security Framework initialized
> > SELinux: Disabled at boot.
> > Mount-cache hash table entries: 512
> > @@ -180,7 +179,7 @@
> > CPU1: Thermal monitoring enabled (TM2)
> > CPU1: Intel(R) Atom(TM) CPU N270 @ 1.60GHz stepping 02
> > Brought up 2 CPUs
> > - Total of 2 processors activated (9326.27 BogoMIPS).
> > + Total of 2 processors activated (6432.20 BogoMIPS).
>
> Those two differences seem strange to me. You might want to report it
> in a new email thread on LKML. You might also want to see if the same
> thing happens with a 2.6.32 kernel.
>
> > > This may be a bug in ehci-hcd, a bug in your EHCI hardware, or a bug in
> > > the hub. Can you try using a different high-speed hub to see if it
> > > makes any difference?
> >
> > Yes, I'll try it next week (I have only remote access now).
> > I have different 7-port hub available to test (should be with Philips
> > chipset).
>
> It's worth a try. Still, the original problem you saw (the oops in
> ehci-hcd) is in software, not in hardware, so the hub can't be entirely
> responsible.
>
> Alan Stern

--
Ondrej Zary
From: Ondrej Zary on
On Friday 04 December 2009, Alan Stern wrote:
> > With uhci_hcd, rmmod works fine. With ehci_hcd, rmmod hangs the bus - all
> > urbs fail with -ENOENT:
> > f67265e8 1428021080 S Bi:1:009:2 -115 128 <
> > f67265e8 1431508327 C Bi:1:009:2 -108 0
> > f6726718 1458252464 S Co:1:007:0 s 40 09 0001 0000 0000 0
> > f6726718 1463261404 C Co:1:007:0 -2 0
> > f6726978 1463261428 S Co:1:002:0 s 23 08 0070 0001 0000 0
> > f6726718 1463261509 S Co:1:007:0 s 40 00 0000 0000 0000 0
> > f6726978 1464273397 C Co:1:002:0 -2 0
> > f6726718 1468273397 C Co:1:007:0 -2 0
>
> This may be a bug in ehci-hcd, a bug in your EHCI hardware, or a bug in
> the hub. Can you try using a different high-speed hub to see if it
> makes any difference?

Just tried another hub. Now there are two hubs connected to separate ports
on the machine. Nexio is the only device connected to the "new" hub. No matter
where I connect the device or the 2nd hub, it always appears on "Bus 001":

Bus 002 Device 002: ID 041e:4068 Creative Technology, Ltd Webcam Live! Notebook
Bus 002 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 003 Device 002: ID 413c:2003 Dell Computer Corp. Keyboard
Bus 003 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 005 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 004 Device 002: ID 0bda:0158 Realtek Semiconductor Corp. USB 2.0 multicard reader
Bus 004 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 001 Device 010: ID 1870:0001 Nexio Co., Ltd iNexio Touchscreen controller
Bus 001 Device 009: ID 088c:2030 Swecoin AB Ticket Printer TTP 2030
Bus 001 Device 008: ID 0403:6001 Future Technology Devices International, Ltd FT232 USB-Serial (UART) IC
Bus 001 Device 007: ID 065a:0001 Optoelectronics Co., Ltd Barcode scanner
Bus 001 Device 002: ID 2001:f103 D-Link Corp. [hex] DUB-H7 7-port USB 2.0 hub
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 001 Device 005: ID 04cc:1521 Philips Semiconductors USB 2.0 Hub

The problem is still the same. Removing the module causes devices on the other
hub to fail.

Disconnecting the touchscreen first and then removing the module does not
cause any problems (with either of the hubs) - so it must be a software
problem.
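
For reading the usbmon trace quoted at the top of this message, here is a small sketch that splits the leading fields of a usbmon text-format line and names the three status values that occur in it. The names struct usbmon_event, parse_usbmon_line() and status_name() are hypothetical. The numeric values (-2, -108, -115) are hardcoded as the common Linux errno numbers, which is an assumption that does not hold on every architecture. The short meanings in the comments follow the kernel's USB error-code documentation as I understand it.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Leading fields of one usbmon text-format event. */
struct usbmon_event {
    char type;       /* 'S' submission, 'C' callback, 'E' submission error */
    char xfer;       /* 'C' control, 'B' bulk, 'I' interrupt, 'Z' isochronous */
    char dir;        /* 'i' in, 'o' out */
    int bus, dev, ep;
    int has_status;  /* 0 when the status word is "s" (a setup packet follows) */
    int status;
};

/* Parse "tag timestamp type Xd:bus:dev:ep status ..."; returns 0 on success. */
int parse_usbmon_line(const char *line, struct usbmon_event *ev)
{
    char tag[32], word[32];
    unsigned long ts;

    assert(line != NULL && ev != NULL);
    memset(ev, 0, sizeof(*ev));
    if (sscanf(line, "%31s %lu %c %c%c:%d:%d:%d %31s", tag, &ts, &ev->type,
               &ev->xfer, &ev->dir, &ev->bus, &ev->dev, &ev->ep, word) != 9)
        return -1;
    if (strcmp(word, "s") == 0)
        return 0;                        /* control submission, no status yet */
    if (sscanf(word, "%d", &ev->status) != 1)
        return -1;
    ev->has_status = 1;
    return 0;
}

/* Names for the status values seen in the trace (assumed errno numbers). */
const char *status_name(int status)
{
    switch (status) {
    case 0:    return "success";
    case -2:   return "-ENOENT";        /* URB was unlinked (killed) */
    case -108: return "-ESHUTDOWN";     /* device or controller shut down */
    case -115: return "-EINPROGRESS";   /* still pending, normal on 'S' lines */
    default:   return "other";
    }
}
```

Applied to the trace, the bulk-in URB on device 9 completes with -ESHUTDOWN, and the control transfers to devices 7 and 2 complete with -ENOENT.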

--
Ondrej Zary