From: Justin Piszcz on


On Thu, 25 Mar 2010, Justin Piszcz wrote:

The same problem has been reported by another person. He says his entire
system freezes, which is how it appears unless you can SSH into the box:
http://www.openoffice.org/issues/show_bug.cgi?id=76797

Look at his lspci listing.
james(a)dv6105us:~$ lspci
00:00.0 RAM memory: nVidia Corporation C51 Host Bridge (rev a2)
00:00.1 RAM memory: nVidia Corporation C51 Memory Controller 0 (rev a2)
00:00.2 RAM memory: nVidia Corporation C51 Memory Controller 1 (rev a2)

Here is mine:
$ lspci
00:00.0 RAM memory: nVidia Corporation C51 Host Bridge (rev a2)
00:00.1 RAM memory: nVidia Corporation C51 Memory Controller 0 (rev a2)
00:00.2 RAM memory: nVidia Corporation C51 Memory Controller 1 (rev a2)
00:00.3 RAM memory: nVidia Corporation C51 Memory Controller 5 (rev a2)

Looks like the bug may be in the USB subsystem for this chipset.

Justin.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Justin Piszcz on


Hi,

And there it goes again *LOCK*
root 2190 0.5 1.5 37832 31424 tty7 Ds+ 09:00 0:12 /usr/bin/X :0 vt7 -nolisten tcp -auth /var/lib/xdm/authdir/authfiles/A:0-N5V00o

When this happens, the keyboard also stops working (if you try to go to
the console), e.g. caps lock doesn't light up anymore, etc.

Here is the oops with frame pointer enabled:

[ 2280.291281] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2280.291283] Xorg D f38d5edc 0 2190 2177 0x00400004
[ 2280.291287] f38d5ef0 00203082 00000002 f38d5edc c3203324 00000000 f38d5ec8 c13eb91d
[ 2280.291292] c13f4c40 f7115234 f69904dc 00203292 001af807 f3845f40 c3203324 f38d5ecc
[ 2280.291296] c15da000 c15dd324 c15e0480 00000001 f71150b0 f80ae94e f38d5ed0 f38d5ed4
[ 2280.291301] Call Trace:
[ 2280.291309] [<c13eb91d>] ? schedule_timeout+0x11d/0x170
[ 2280.291318] [<f80ae94e>] ? unlink1+0x2e/0xe0 [usbcore]
[ 2280.291322] [<c128a13f>] ? put_device+0xf/0x20
[ 2280.291327] [<f80afdcd>] usb_kill_urb+0x5d/0x90 [usbcore]
[ 2280.291331] [<c105f840>] ? autoremove_wake_function+0x0/0x40
[ 2280.291334] [<f812b7ae>] usbhid_close+0x6e/0x80 [usbhid]
[ 2280.291338] [<c1304936>] hidinput_close+0x16/0x20
[ 2280.291342] [<c12ed5c9>] input_close_device+0x49/0x70
[ 2280.291344] [<c12f08f0>] evdev_release+0x80/0xa0
[ 2280.291348] [<c10ab99e>] __fput+0xce/0x1c0
[ 2280.291351] [<c10abaa5>] fput+0x15/0x20
[ 2280.291353] [<c10a8a97>] filp_close+0x47/0x70
[ 2280.291356] [<c10a8b1a>] sys_close+0x5a/0xa0
[ 2280.291358] [<c1023550>] sysenter_do_call+0x12/0x26
[ 2400.291277] INFO: task Xorg:2190 blocked for more than 120 seconds.
[ 2400.291280] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2400.291283] Xorg D f38d5edc 0 2190 2177 0x00400004
[ 2400.291287] f38d5ef0 00203082 00000002 f38d5edc c3203324 00000000 f38d5ec8 c13eb91d
[ 2400.291292] c13f4c40 f7115234 f69904dc 00203292 001af807 f3845f40 c3203324 f38d5ecc
[ 2400.291296] c15da000 c15dd324 c15e0480 00000001 f71150b0 f80ae94e f38d5ed0 f38d5ed4
[ 2400.291300] Call Trace:
[ 2400.291307] [<c13eb91d>] ? schedule_timeout+0x11d/0x170
[ 2400.291315] [<f80ae94e>] ? unlink1+0x2e/0xe0 [usbcore]
[ 2400.291319] [<c128a13f>] ? put_device+0xf/0x20
[ 2400.291324] [<f80afdcd>] usb_kill_urb+0x5d/0x90 [usbcore]
[ 2400.291328] [<c105f840>] ? autoremove_wake_function+0x0/0x40
[ 2400.291331] [<f812b7ae>] usbhid_close+0x6e/0x80 [usbhid]
[ 2400.291335] [<c1304936>] hidinput_close+0x16/0x20
[ 2400.291338] [<c12ed5c9>] input_close_device+0x49/0x70
[ 2400.291341] [<c12f08f0>] evdev_release+0x80/0xa0
[ 2400.291345] [<c10ab99e>] __fput+0xce/0x1c0
[ 2400.291347] [<c10abaa5>] fput+0x15/0x20
[ 2400.291350] [<c10a8a97>] filp_close+0x47/0x70
[ 2400.291352] [<c10a8b1a>] sys_close+0x5a/0xa0
[ 2400.291355] [<c1023550>] sysenter_do_call+0x12/0x26
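
For anyone reading the trace above, here is a small sketch (not part of the original report) that pulls the function names out of a hung-task call trace so the blocking path is easier to see. The regular expression and the `trace_symbols` helper are assumptions made for illustration. Entries marked `?` are ones the kernel flags as unreliable, so they are collected separately.

```python
import re

# Matches frame lines such as:
#   [ 2400.291324] [<f80afdcd>] usb_kill_urb+0x5d/0x90 [usbcore]
#   [ 2400.291307] [<c13eb91d>] ? schedule_timeout+0x11d/0x170
FRAME = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?P<q>\?\s+)?(?P<sym>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<mod>\w+)\])?"
)


def trace_symbols(text):
    """Return (reliable, unreliable) lists of 'symbol [module]' strings."""
    reliable, unreliable = [], []
    for line in text.splitlines():
        m = FRAME.search(line)
        if not m:
            continue  # not a stack frame line (header, register dump, ...)
        name = m.group("sym")
        if m.group("mod"):
            name += " [" + m.group("mod") + "]"
        # A leading '?' marks a frame the unwinder could not confirm.
        (unreliable if m.group("q") else reliable).append(name)
    return reliable, unreliable


# A few frames copied from the trace in this message.
SAMPLE = """\
[ 2400.291307] [<c13eb91d>] ? schedule_timeout+0x11d/0x170
[ 2400.291324] [<f80afdcd>] usb_kill_urb+0x5d/0x90 [usbcore]
[ 2400.291331] [<f812b7ae>] usbhid_close+0x6e/0x80 [usbhid]
[ 2400.291352] [<c10a8b1a>] sys_close+0x5a/0xa0
"""

if __name__ == "__main__":
    reliable, unreliable = trace_symbols(SAMPLE)
    # Innermost frame first, as in the kernel log.
    print(" <- ".join(reliable))
```

Read this way, the trace says that X called close() on an input device and is waiting inside usb_kill_urb() in usbcore.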

Justin.
From: Alan Stern on
On Thu, 25 Mar 2010, Justin Piszcz wrote:

> Here is the oops with frame pointer enabled:

Please don't call it an "oops". It's not; it's just an informational
message.

It does look like there's a problem with one of the USB host
controllers. Does the dmesg log show anything when the mouse and
keyboard stop working (as opposed to 120 seconds later)?
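
One way to check this, sketched here as an illustration only: filter a saved dmesg dump down to a chosen time window, so that messages from around the moment of the hang are not buried under the later hung-task reports. The `lines_between` helper and the window bounds are hypothetical; pick the bounds from when the input devices were seen to stop responding.

```python
import re

# Kernel log timestamps look like "[ 2400.291277]" at the start of a line.
STAMP = re.compile(r"^\[\s*(\d+\.\d+)\]")


def lines_between(log_text, start, end):
    """Return the log lines whose timestamp t satisfies start <= t <= end."""
    selected = []
    for line in log_text.splitlines():
        m = STAMP.match(line)
        if m and start <= float(m.group(1)) <= end:
            selected.append(line)
    return selected


# Two lines copied from this thread, used only to exercise the filter.
SAMPLE = """\
[ 9.144766] tg3: eth0: Link is up at 1000 Mbps, full duplex.
[ 2400.291277] INFO: task Xorg:2190 blocked for more than 120 seconds.
"""

if __name__ == "__main__":
    # Show only what was logged in the first ten seconds of this sample.
    for line in lines_between(SAMPLE, 0.0, 10.0):
        print(line)
```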

What does /sys/kernel/debug/usb/devices contain? (You'll probably have
to look at it _before_ the problem occurs.)

Suppose you don't run X at all. Do the mouse and keyboard eventually
stop working even then?

Alan Stern

From: Justin Piszcz on


On Thu, 25 Mar 2010, Alan Stern wrote:

> Please don't call it an "oops". It's not; it's just an informational
> message.
Ok.

> It does look like there's a problem with one of the USB host
> controllers. Does the dmesg log show anything when the mouse and
It must be the chipset because it did affect the other person's host too.

> keyboard stop working (as opposed to 120 seconds later)?
Hi, I did not see anything else in the kernel log when it stopped.

Here is the kernel log:
[ 4.853144] Adding 4008176k swap on /dev/sda1. Priority:-1 extents:1 across:4008176k
[ 4.885030] usb 2-5: New USB device found, idVendor=413c, idProduct=2005
[ 4.885092] usb 2-5: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[ 4.885145] usb 2-5: Product: DELL USB Keyboard
[ 4.885197] usb 2-5: Manufacturer: DELL
[ 4.911127] usbcore: registered new interface driver hiddev
[ 4.917473] input: DELL DELL USB Keyboard as /devices/pci0000:00/0000:00:0b.0/usb2/2-5/2-5:1.0/input/input7
[ 4.917592] generic-usb 0003:413C:2005.0001: input,hidraw0: USB HID v1.10 Keyboard [DELL DELL USB Keyboard] on usb-0000:00:0b.0-5/input0
[ 4.917674] usbcore: registered new interface driver usbhid
[ 4.917727] usbhid: USB HID core driver
[ 5.105278] usb 2-6: new low speed USB device using ohci_hcd and address 3
[ 5.251029] usb 2-6: New USB device found, idVendor=046d, idProduct=c018
[ 5.251091] usb 2-6: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[ 5.251144] usb 2-6: Product: USB Optical Mouse
[ 5.251193] usb 2-6: Manufacturer: Logitech
[ 5.261426] input: Logitech USB Optical Mouse as /devices/pci0000:00/0000:00:0b.0/usb2/2-6/2-6:1.0/input/input8
[ 5.261555] generic-usb 0003:046D:C018.0002: input,hidraw1: USB HID v1.11 Mouse [Logitech USB Optical Mouse] on usb-0000:00:0b.0-6/input0
[ 5.911624] tg3 0000:02:00.0: irq 28 for MSI/MSI-X
[ 9.144766] tg3: eth0: Link is up at 1000 Mbps, full duplex.
[ 9.144829] tg3: eth0: Flow control is off for TX and off for RX.

At this point it's still hung.
>
> What does /sys/kernel/debug/usb/devices contain? (You'll probably have
> to look at it _before_ the problem occurs.)
Rebooting and will paste this. Ok, rebooted, here is the output:

# cat /sys/kernel/debug/usb/devices

T: Bus=02 Lev=00 Prnt=00 Port=00 Cnt=00 Dev#= 1 Spd=12 MxCh= 8
B: Alloc= 26/900 us ( 3%), #Int= 2, #Iso= 0
D: Ver= 1.10 Cls=09(hub ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1
P: Vendor=1d6b ProdID=0001 Rev= 2.06
S: Manufacturer=Linux 2.6.33 ohci_hcd
S: Product=OHCI Host Controller
S: SerialNumber=0000:00:0b.0
C:* #Ifs= 1 Cfg#= 1 Atr=e0 MxPwr= 0mA
I:* If#= 0 Alt= 0 #EPs= 1 Cls=09(hub ) Sub=00 Prot=00 Driver=hub
E: Ad=81(I) Atr=03(Int.) MxPS= 2 Ivl=255ms

T: Bus=02 Lev=01 Prnt=01 Port=04 Cnt=01 Dev#= 2 Spd=1.5 MxCh= 0
D: Ver= 1.10 Cls=00(>ifc ) Sub=00 Prot=00 MxPS= 8 #Cfgs= 1
P: Vendor=413c ProdID=2005 Rev= 1.05
S: Manufacturer=DELL
S: Product=DELL USB Keyboard
C:* #Ifs= 1 Cfg#= 1 Atr=a0 MxPwr=100mA
I:* If#= 0 Alt= 0 #EPs= 1 Cls=03(HID ) Sub=01 Prot=01 Driver=usbhid
E: Ad=81(I) Atr=03(Int.) MxPS= 8 Ivl=10ms

T: Bus=02 Lev=01 Prnt=01 Port=05 Cnt=02 Dev#= 3 Spd=1.5 MxCh= 0
D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS= 8 #Cfgs= 1
P: Vendor=046d ProdID=c018 Rev=43.01
S: Manufacturer=Logitech
S: Product=USB Optical Mouse
C:* #Ifs= 1 Cfg#= 1 Atr=a0 MxPwr=100mA
I:* If#= 0 Alt= 0 #EPs= 1 Cls=03(HID ) Sub=01 Prot=02 Driver=usbhid
E: Ad=81(I) Atr=03(Int.) MxPS= 5 Ivl=10ms

T: Bus=01 Lev=00 Prnt=00 Port=00 Cnt=00 Dev#= 1 Spd=480 MxCh= 8
B: Alloc= 0/800 us ( 0%), #Int= 0, #Iso= 0
D: Ver= 2.00 Cls=09(hub ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1
P: Vendor=1d6b ProdID=0002 Rev= 2.06
S: Manufacturer=Linux 2.6.33 ehci_hcd
S: Product=EHCI Host Controller
S: SerialNumber=0000:00:0b.1
C:* #Ifs= 1 Cfg#= 1 Atr=e0 MxPwr= 0mA
I:* If#= 0 Alt= 0 #EPs= 1 Cls=09(hub ) Sub=00 Prot=00 Driver=hub
E: Ad=81(I) Atr=03(Int.) MxPS= 4 Ivl=256ms
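
As a reading aid for the listing above, here is a rough sketch that reduces a `devices` dump to one summary per device: bus, device number, speed, product string and bound driver. The `parse_usb_devices` helper and its dictionary keys are assumptions for illustration, and only the `T:`, `S:` and `I:` lines are interpreted.

```python
import re


def parse_usb_devices(text):
    """Split a usb 'devices' dump into one summary dict per device."""
    devices = []
    # Devices are separated by blank lines in the dump.
    for block in re.split(r"\n\s*\n", text.strip()):
        dev = {}
        for line in block.splitlines():
            line = line.strip()
            if line.startswith("T:"):
                # Topology line, e.g. "Bus=02 ... Dev#= 3 Spd=1.5"
                fields = dict(re.findall(r"(\w+#?)=\s*(\S+)", line))
                dev["bus"] = fields.get("Bus")
                dev["dev"] = fields.get("Dev#")
                dev["speed"] = fields.get("Spd")
            elif line.startswith("S:"):
                key, _, value = line[2:].strip().partition("=")
                if key == "Product":
                    dev["product"] = value
            elif line.startswith("I:"):
                m = re.search(r"Driver=(\S+)", line)
                if m:
                    dev["driver"] = m.group(1)
        if dev:
            devices.append(dev)
    return devices


# Keyboard and mouse entries, abbreviated from the listing in this message.
SAMPLE = """\
T: Bus=02 Lev=01 Prnt=01 Port=04 Cnt=01 Dev#= 2 Spd=1.5 MxCh= 0
S: Product=DELL USB Keyboard
I:* If#= 0 Alt= 0 #EPs= 1 Cls=03(HID ) Sub=01 Prot=01 Driver=usbhid

T: Bus=02 Lev=01 Prnt=01 Port=05 Cnt=02 Dev#= 3 Spd=1.5 MxCh= 0
S: Product=USB Optical Mouse
I:* If#= 0 Alt= 0 #EPs= 1 Cls=03(HID ) Sub=01 Prot=02 Driver=usbhid
"""

if __name__ == "__main__":
    for d in parse_usb_devices(SAMPLE):
        print("bus %s dev %s (%s Mbit/s): %s [%s]" % (
            d["bus"], d["dev"], d["speed"], d["product"], d["driver"]))
```

On the listing above this places both the keyboard and the mouse on bus 02, the bus served by ohci_hcd.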

>
> Suppose you don't run X at all. Do the mouse and keyboard eventually
> stop working even then?
I have not tried that; the reason I use the host locally is for X.


Justin.

From: Alan Stern on
On Thu, 25 Mar 2010, Justin Piszcz wrote:

> # cat /sys/kernel/debug/usb/devices

All right, the problematic mouse is attached to bus 2 (as is the
keyboard). So the next step is to run a kernel with CONFIG_USB_DEBUG
enabled. When the mouse stops working, go to
/sys/kernel/debug/usb/ohci/0000:00:0b.0 and see what the various files
in that directory contain. Also, check if any additional debugging
info shows up in the system log.
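
A minimal sketch of collecting those files in one go, so they can be pasted into a reply. It assumes debugfs is mounted at /sys/kernel/debug, that the script runs as root, and that the kernel was built with CONFIG_USB_DEBUG as described above. The `dump_directory` helper is a hypothetical name, and nothing is assumed about which files the directory holds.

```python
import os
import sys


def dump_directory(path):
    """Return {filename: contents} for every regular file directly in path."""
    contents = {}
    for name in sorted(os.listdir(path)):
        full = os.path.join(path, name)
        if not os.path.isfile(full):
            continue  # skip subdirectories and other non-regular entries
        try:
            with open(full, "r", errors="replace") as f:
                contents[name] = f.read()
        except OSError as exc:
            # Record the failure instead of aborting the whole dump.
            contents[name] = "<unreadable: %s>" % exc
    return contents


if __name__ == "__main__":
    # Default path is the one named in this message; override with argv[1].
    default = "/sys/kernel/debug/usb/ohci/0000:00:0b.0"
    path = sys.argv[1] if len(sys.argv) > 1 else default
    if not os.path.isdir(path):
        print("%s not found (is debugfs mounted, and is this run as root?)" % path)
    else:
        for name, text in dump_directory(path).items():
            print("==== %s ====" % name)
            print(text)
```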

> > Suppose you don't run X at all. Do the mouse and keyboard eventually
> > stop working even then?
> I have not tried that, the reason I use the host locally is for X.

You should try it anyway. If the failure still occurs, then you can
rule out any connection with X or the nVidia graphics.

Alan Stern
