From: Andreas Mohr on
[added another __bzero coherency crash victim, see
http://lkml.org/lkml/2008/6/9/14 ]

On Tue, Feb 02, 2010 at 03:52:19PM +0100, Oliver Neukum wrote:
> On Tuesday, 2 February 2010 15:42:49, Clemens Ladisch wrote:
> > > Or... usb-audio? I should have verified that it is using bulk endpoints
> > > (and thus the patch applies to my case).
> > > usb-audio probably uses isochronous transfers, thus that would be
> > > an obvious reason why the patch didn't work for me.
> >
> > snd-usb-audio indeed uses isochronous transfers, but those buffers are
> > never mapped into user space. The intermediate vmalloc()ed buffer is,
> > however, and there was a bugfix for this recently. Do you have these
> > patches in your tree?
>
> Now that I think about it, several video drivers do map it to user space.

OK, then the urb loop needs to also handle isochronous pipes,
and IMHO we should have a generic helper for this instead of open-coding
it, since it probably needs to be done in a couple of affected HCDs
(and, most importantly, only on affected architectures - which the helper
could handle transparently).

Clemens: no, neither of these patches has been applied (yet!!);
many thanks for the notification!

I'll apply both patches and the isochronous addition; hopefully that
improves things (though it will be painful to work out which of these
changes fixed it, if any of them does). On second thought, I'll apply them
step by step: both patches first, then the isochronous change as a last resort.

Andreas Mohr
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Catalin Marinas on
On Tue, 2010-02-02 at 15:10 +0000, Andreas Mohr wrote:
> [added another __bzero coherency crash victim, see
> http://lkml.org/lkml/2008/6/9/14 ]
>
> On Tue, Feb 02, 2010 at 03:52:19PM +0100, Oliver Neukum wrote:
> > On Tuesday, 2 February 2010 15:42:49, Clemens Ladisch wrote:
> > > > Or... usb-audio? I should have verified that it is using bulk endpoints
> > > > (and thus the patch applies to my case).
> > > > usb-audio probably uses isochronous transfers, thus that would be
> > > > an obvious reason why the patch didn't work for me.
> > >
> > > snd-usb-audio indeed uses isochronous transfers, but those buffers are
> > > never mapped into user space. The intermediate vmalloc()ed buffer is,
> > > however, and there was a bugfix for this recently. Do you have these
> > > patches in your tree?
> >
> > Now that I think about it, several video drivers do map it to user space.
>
> OK, then the urb loop needs to also handle isochronous pipes,
> and IMHO we should have a generic helper for this instead of open-coding
> it, since it probably needs to be done in a couple of affected HCDs
> (and, most importantly, only on affected architectures - which the helper
> could handle transparently).

I'm planning to send a proposal to linux-arch for a flush_dcache_range()
function.

--
Catalin

From: Alan Stern on
On Tue, 2 Feb 2010, Oliver Neukum wrote:

> On Tuesday, 2 February 2010 13:39:35, Catalin Marinas wrote:
> > > For storage that is correct. But what about other sources of pages,
> > > for example iSCSI?
> >
> > In the iSCSI case, does the HCD driver write directly to a page cache
> > page? Or does it just fill in network packets that are copied to page cache
> > pages by the iSCSI code (sorry, I'm not familiar with this part of the
> > kernel). If the latter, the cache flushing in the HCD driver would not
> > help and it needs to be done in the iSCSI code.
>
> As far as I can tell iSCSI does a private copy. But I don't know how
> many methods to transfer code pages over USB exist. I'd say the
> conservative solution is to flush for everything but control transfers.

This doesn't make any sense. Nobody would ever use isochronous
transfers to store data into a code page because isochronous is
unreliable. (Audio isn't a counterexample -- audio data may be mapped
to userspace, but only to data pages, not code pages. And the problem
here is to maintain consistency between the D and I caches.)

In principle interrupt transfers could be used, but it is most
unlikely. They are intended for bounded-latency transfers, not
transfers of potentially large amounts of data.

The only transfer type that makes sense to worry about is bulk.
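
To make the two positions in this exchange concrete, here is a minimal
userspace sketch of the competing flush policies. The enum values are
intended to mirror the PIPE_* constants in include/linux/usb.h; the two
function names are hypothetical and exist only for this illustration, they
are not part of any posted patch.

```c
#include <stdbool.h>

/* Transfer types; values intended to mirror PIPE_* in include/linux/usb.h. */
enum pipe_type {
    PIPE_ISOCHRONOUS = 0,
    PIPE_INTERRUPT   = 1,
    PIPE_CONTROL     = 2,
    PIPE_BULK        = 3,
};

/* Hypothetical name: the conservative policy, flush for everything
 * except control transfers. */
bool flush_all_but_control(enum pipe_type type)
{
    return type != PIPE_CONTROL;
}

/* Hypothetical name: the narrow policy, flush only for bulk transfers. */
bool flush_bulk_only(enum pipe_type type)
{
    return type == PIPE_BULK;
}
```

The two policies differ only for isochronous and interrupt transfers, which
is exactly where the disagreement above lies.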

Alan Stern

From: Catalin Marinas on
On Tue, 2010-02-02 at 17:11 +0000, Alan Stern wrote:
> On Tue, 2 Feb 2010, Oliver Neukum wrote:
>
> > On Tuesday, 2 February 2010 13:39:35, Catalin Marinas wrote:
> > > > For storage that is correct. But what about other sources of pages,
> > > > for example iSCSI?
> > >
> > > In the iSCSI case, does the HCD driver write directly to a page cache
> > > page? Or does it just fill in network packets that are copied to page cache
> > > pages by the iSCSI code (sorry, I'm not familiar with this part of the
> > > kernel). If the latter, the cache flushing in the HCD driver would not
> > > help and it needs to be done in the iSCSI code.
> >
> > As far as I can tell iSCSI does a private copy. But I don't know how
> > many methods to transfer code pages over USB exist. I'd say the
> > conservative solution is to flush for everything but control transfers.
>
> This doesn't make any sense. Nobody would ever use isochronous
> transfers to store data into a code page because isochronous is
> unreliable. (Audio isn't a counterexample -- audio data may be mapped
> to userspace, but only to data pages, not code pages. And the problem
> here is to maintain consistency between the D and I caches.)

My issue is with both I-D coherency and D-cache aliasing caused by
pages mapped in both user and kernel space (with different colours). The
flush_dcache_page() call should target both cases.
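
As a rough illustration of what "different colours" means, the sketch below
computes the cache colour of a virtual address for a virtually indexed
D-cache. The geometry (4 KiB pages, 16 KiB per cache way) and the names
cache_colour() and mappings_alias() are assumptions made up for this
example, not values or functions taken from any particular CPU or kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed geometry, for illustration only. */
#define SKETCH_PAGE_SHIFT  12u                   /* 4 KiB pages */
#define SKETCH_WAY_SIZE    (16u * 1024u)         /* 16 KiB per cache way */
#define SKETCH_NUM_COLOURS (SKETCH_WAY_SIZE >> SKETCH_PAGE_SHIFT)  /* 4 */

/* The colour is the part of the cache index that lies above the page
 * offset, i.e. the index bits that depend on the virtual page number. */
unsigned int cache_colour(uint32_t vaddr)
{
    return (vaddr >> SKETCH_PAGE_SHIFT) & (SKETCH_NUM_COLOURS - 1u);
}

/* Two virtual mappings of the same physical page can hold separate,
 * inconsistent cache lines when their colours differ. */
bool mappings_alias(uint32_t kernel_vaddr, uint32_t user_vaddr)
{
    return cache_colour(kernel_vaddr) != cache_colour(user_vaddr);
}
```

Under these assumptions a kernel mapping at 0x80004000 has colour 0 and a
user mapping at 0x00401000 has colour 1, so data written through one
mapping may not be visible through the other until the D-cache is flushed.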

--
Catalin

From: Andreas Mohr on
On Tue, Feb 02, 2010 at 03:42:49PM +0100, Clemens Ladisch wrote:
> Andreas Mohr wrote:
> > Or... usb-audio? I should have verified that it is using bulk endpoints
> > (and thus the patch applies to my case).
> > usb-audio probably uses isochronous transfers, thus that would be
> > an obvious reason why the patch didn't work for me.
>
> snd-usb-audio indeed uses isochronous transfers, but those buffers are
> never mapped into user space. The intermediate vmalloc()ed buffer is,
> however, and there was a bugfix for this recently. Do you have these
> patches in your tree?
> http://git.kernel.org/?p=linux/kernel/git/tiwai/sound-2.6.git;a=commit;h=3e879d7bac705be4813a0ec9560cbe31db4b269f
> http://git.kernel.org/?p=linux/kernel/git/tiwai/sound-2.6.git;a=commit;h=c32d977b8157bf67cdf47729ce7dd054a26eb534

OK, I've now added both patches to my quilt series (and pushed everything),
rebuilt, reflashed the image and copied the modules, and it still bombs
in exactly the same way.
The same happens with Catalin's latest patch version (the one using
!= PIPE_CONTROL to hit iso transfers etc. as well).
So it seems I still haven't got to the core of the issue despite all these
rather different patch attempts.

I'm afraid that if keeping the sound device open manually via another
process turns out to work around it, then I'll simply give up on this
and live with the current semi-satisfying situation
on my custom 2.6.31.9 build.

Any further ideas or patches that I could try?
(I might investigate the issue myself in a serious way sometime later,
but don't count on it)

Thanks!

Andreas Mohr

netconsole log (some previous crashes were at __bzero; this time it was at
__copy_user twice - maybe the patches really did change something?):

Instruction bus error, epc == 80004dd8, ra == 80000018
Oops[#1]:
Cpu 0
$ 0 : 00000000 1000d000 00000000 00000000
$ 4 : 7f9e6be8 81ee7ec4 00000004 00000000
$ 8 : 00000000 00000000 00000000 81fac000
$12 : 4b688a80 80340000 81d6e868 00000400
$16 : 81ee7f00 00000000 7f9e6be8 00000001
$20 : 81ee7eb8 00000000 7f9e6c9c 7f9eb320
$24 : 00000000 2b565ed0
$28 : 81ee6000 81ee7ea8 7f9f6c98 80000018
Hi : 00000000
Lo : 00000000
epc : 80004dd8 __copy_user+0xd4/0x2bc
Not tainted
ra : 80000018 0x80000018
Status: 1000d003 KERNEL EXL IE
Cause : 00800018
PrId : 00029029 (Broadcom BCM3302)
Modules linked in: snd_usb_audio snd_pcm_oss snd_mixer_oss snd_pcm evdev snd_timer snd_page_alloc snd_usb_lib snd_rawmidi usbhid snd_seq_device snd_hwdep hid snd input_core soundcore ipv6 arc4 ecb cryptomgr aead pcompress crypto_blkcipher crypto_hash crypto_algapi b43 mac80211 cfg80211
Process mpd (pid: 1310, threadinfo=81ee6000, task=81d92838, tls=00000000)
Stack : 00000000 800afd4c 81ee56f8 00000219 00000000 00000000 00000000 00000000
ffffffff 3b043616 81ee7f00 7f9e6be8 00000000 00000000 00000000 800b1370
7f9e6ca0 81ee5680 000182fc 00493a83 81ee7f00 8009f6b0 00000219 03b1daad
00000000 00002710 00000000 00000000 7f9f6ca0 7f9e6ca0 7f9ec528 00000127
00000000 800031f0 00000000 2ae49060 7f9f75a8 2ae49060 7f9e6be8 7f9f5bd8
...
Call Trace:
[<80004dd8>] __copy_user+0xd4/0x2bc


Code: 8ca80000 24a50004 24c6fffc <ac880000> 1706fffb 24840004 10c00040 00864821 240a0020
Disabling lock debugging due to kernel taint
Instruction bus error, epc == 80096fa0, ra == 80000018
Oops[#2]:
Cpu 0
$ 0 : 00000000 1000d000 c0156064 00000064
$ 4 : 00000032 803b5514 00000032 81fa8000
$ 8 : 8037f840 00080000 81040000 00000003
$12 : 00000010 8037f840 00000004 00000000
$16 : 00000032 81fa8d54 2ab55000 00398f45
$20 : 2ab56000 0064d613 00000000 00000000
$24 : 00000000 80018ff0
$28 : 81ee6000 81ee7c58 00000000 80000018
Hi : 00000000
Lo : 00000000
epc : 80096fa0 swap_info_get+0x74/0xfc
Tainted: G D
ra : 80000018 0x80000018
Status: 1000d003 KERNEL EXL IE
Cause : 00800018
PrId : 00029029 (Broadcom BCM3302)
Modules linked in: snd_usb_audio snd_pcm_oss snd_mixer_oss snd_pcm evdev snd_timer snd_page_alloc snd_usb_lib snd_rawmidi usbhid snd_seq_device snd_hwdep hid snd input_core soundcore ipv6 arc4 ecb cryptomgr aead pcompress crypto_blkcipher crypto_hash crypto_algapi b43 mac80211 cfg80211
Process mpd (pid: 1310, threadinfo=81ee6000, task=81d92838, tls=00000000)
Stack : ffffffff 800736a0 00000004 81d6e870 80350fd0 80099498 81d929cc 81036e40
81fa8bfc 2aaff000 00064000 81fa8d54 2ab55000 80088a30 8033dab8 80029d78
803402d4 00000006 2ab55fff 8033d9c0 81eaae60 81f0c2a8 81f0c2a8 2ab56000
00000000 00000001 80c4c0bc 81eaae60 8037f840 81eaae94 81d92838 00000000
00000001 7f9eb320 7f9f6c98 8008db1c 81ee7d00 80c4ce90 00000000 ffffffff
...
Call Trace:
[<80096fa0>] swap_info_get+0x74/0xfc
[<80099498>] free_swap_and_cache+0x1c/0x218
[<80088a30>] unmap_vmas+0x418/0x63c
[<8008db1c>] exit_mmap+0xb8/0x148
[<8002e3c4>] mmput+0xc0/0x1d8
[<800333e8>] exit_mm+0x260/0x298
[<800357cc>] do_exit+0x1cc/0x688
[<80014658>] nmi_exception_handler+0x0/0x34


Code: 00041840 8ca20020 00431021 <94440000> 1480001d 8fbf0014 3c048030 3c05802c 24a5f280
Fixing recursive fault but reboot is needed!
Instruction bus error, epc == 80004dd8, ra == 80000018
Oops[#3]:
Cpu 0
$ 0 : 00000000 1000d000 00000000 00000000
$ 4 : 7fcd81b0 81c0ddd0 00000000 1000d001
$ 8 : 00000000 00000000 00000000 806f8000
$12 : 4b688a84 7f9f7f18 81d6e868 00000000
$16 : 00000004 00000000 81c0ddc0 81c0ddc0
$20 : 7fcd81b0 00000000 81c0ddcc 00000000
$24 : 00000000 2b565ed0
$28 : 81c0c000 81c0dd98 00000001 80000018
Hi : 0000007d
Lo : eb254400
epc : 80004dd8 __copy_user+0xd4/0x2bc
Tainted: G D
ra : 80000018 0x80000018
Status: 1000d003 KERNEL EXL IE
Cause : 00800018
PrId : 00029029 (Broadcom BCM3302)
Modules linked in: snd_usb_audio snd_pcm_oss snd_mixer_oss snd_pcm evdev snd_timer snd_page_alloc snd_usb_lib snd_rawmidi usbhid snd_seq_device snd_hwdep hid snd input_core soundcore ipv6 arc4 ecb cryptomgr aead pcompress crypto_blkcipher crypto_hash crypto_algapi b43 mac80211 cfg80211
Process init (pid: 1, threadinfo=81c0c000, task=81c08480, tls=00000000)
Stack : 00000000 81c0dda8 81c0df00 80350fb0 81c0ddc0 81c0ddc4 81c0ddc8 81c0ddcc
81c0ddd0 81c0ddd4 00000400 00000000 00000000 00000000 00000000 00000000
00000000 00000000 81d98000 ffffff9c 81c0dea8 0044a234 7fcd8598 8009b864
00000001 81c0dea8 00000001 81d98000 ffffff9c 800a2d18 00000003 00000002
00000003 00000003 0000000d 00000000 00000000 00000000 000000c9 00001180
...
Call Trace:
[<80004dd8>] __copy_user+0xd4/0x2bc


Code: 8ca80000 24a50004 24c6fffc <ac880000> 1706fffb 24840004 10c00040 00864821 240a0020
kobject: 'ep_01' (81e52f10): kobject_uevent_env
kobject: 'ep_01' (81e52f10): kobject_uevent_env: filter function caused the event to drop!
Kernel panic - not syncing: Attempted to kill init!
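
As a reading aid for the register dumps above, here is a minimal sketch of
how the Cause and Status values decode, assuming the standard MIPS32 CP0
layout (ExcCode in Cause bits 6..2; IE, EXL and KSU in Status bits 0, 1 and
4..3). The function names are made up for this example.

```c
#include <stdint.h>

/* MIPS32 CP0 Cause: ExcCode occupies bits 6..2. */
unsigned int cause_exccode(uint32_t cause)
{
    return (cause >> 2) & 0x1fu;
}

/* MIPS32 CP0 Status: IE is bit 0 (interrupts enabled). */
unsigned int status_ie(uint32_t status)
{
    return status & 1u;
}

/* MIPS32 CP0 Status: EXL is bit 1 (exception level). */
unsigned int status_exl(uint32_t status)
{
    return (status >> 1) & 1u;
}

/* MIPS32 CP0 Status: KSU is bits 4..3 (0 means kernel mode). */
unsigned int status_ksu(uint32_t status)
{
    return (status >> 3) & 3u;
}
```

With Cause = 00800018 this gives ExcCode 6, which is IBE (bus error on
instruction fetch) and matches the "Instruction bus error" line. With
Status = 1000d003 it gives IE = 1, EXL = 1 and KSU = 0, matching the
"KERNEL EXL IE" annotation in the log.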
