From: Daniel Mack on
Hi,

I was pointed to https://bugzilla.kernel.org/show_bug.cgi?id=15580 by
Pedro Ribeiro, and unfortunately I'm pretty late in the game. I wasn't
aware of that thread until yesterday.

While the report is quite confusing, the reason seams pretty clear to me
as I've been thru quite some time-demanding debugging of a very similar
issue on Mac OS X. But I'm not totally sure whether we really hit the
same issue here, so I'd like to have your opinions first.

The problem is appearantly the way the transfer buffer is allocated in
the drivers. In the snd-usb-caiaq driver, I used kzalloc() to get memory
which works fine on 32bit systems. On x86_64, however, it seems that
kzalloc() hands out memory beyond the 32bit addressable boundary, which
the DMA controller of the 32bit PCI-connected EHCI controller is unable
to write to or read from. Am I correct on this conclusion?

Depending on the condition of the memory management, things might work
or not, and especially right after a reboot, there's a better chance to
get lower memory.

The fix is to use usb_buffer_alloc() for that purpose which ensures
memory that is suitable for DMA. And on x86_64, this also means that the
upper 32 bits of the address returned are all 0's.

If what I've stated is true, there are quite some more drivers affected
by this issue. I collected a list of places where similar fixes are
needed, and I can send patches if I get a thumbs-up.

Pedro is currently testing a patch I sent out yesterday.

Thanks,
Daniel

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Alan Stern on
On Wed, 7 Apr 2010, Daniel Mack wrote:

> Hi,
>
> I was pointed to https://bugzilla.kernel.org/show_bug.cgi?id=15580 by
> Pedro Ribeiro, and unfortunately I'm pretty late in the game. I wasn't
> aware of that thread until yesterday.
>
> While the report is quite confusing, the reason seams pretty clear to me
> as I've been thru quite some time-demanding debugging of a very similar
> issue on Mac OS X. But I'm not totally sure whether we really hit the
> same issue here, so I'd like to have your opinions first.
>
> The problem is appearantly the way the transfer buffer is allocated in
> the drivers. In the snd-usb-caiaq driver, I used kzalloc() to get memory
> which works fine on 32bit systems. On x86_64, however, it seems that
> kzalloc() hands out memory beyond the 32bit addressable boundary, which
> the DMA controller of the 32bit PCI-connected EHCI controller is unable
> to write to or read from. Am I correct on this conclusion?

That seems like the right answer. You are correct that an EHCI
controller capable only of 32-bit memory accesses would not be able to
use a buffer above the 4 GB line.

> Depending on the condition of the memory management, things might work
> or not, and especially right after a reboot, there's a better chance to
> get lower memory.
>
> The fix is to use usb_buffer_alloc() for that purpose which ensures
> memory that is suitable for DMA. And on x86_64, this also means that the
> upper 32 bits of the address returned are all 0's.

That is not a good fix. usb_buffer_alloc() provides coherent memory,
which is not what we want. I believe the correct fix is to specify the
GFP_DMA32 flag in the kzalloc() call.

Of course, some EHCI hardware _is_ capable of using 64-bit addresses.
But not all, and other controller types aren't. In principle we could
create a new allocation routine, which would take a pointer to the USB
bus as an additional argument and use it to decide whether the memory
needs to lie below 4 GB. I'm not sure adding this extra complexity
would be worthwhile.

> If what I've stated is true, there are quite some more drivers affected
> by this issue.

Practically every USB driver, I should think. And maybe a lot of
non-USB drivers too.

> I collected a list of places where similar fixes are
> needed, and I can send patches if I get a thumbs-up.
>
> Pedro is currently testing a patch I sent out yesterday.

Good work -- it certainly would have taken me a long time to figure
this out.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Daniel Mack on
On Wed, Apr 07, 2010 at 10:59:47AM -0400, Alan Stern wrote:
> On Wed, 7 Apr 2010, Daniel Mack wrote:
> > Depending on the condition of the memory management, things might work
> > or not, and especially right after a reboot, there's a better chance to
> > get lower memory.
> >
> > The fix is to use usb_buffer_alloc() for that purpose which ensures
> > memory that is suitable for DMA. And on x86_64, this also means that the
> > upper 32 bits of the address returned are all 0's.
>
> That is not a good fix. usb_buffer_alloc() provides coherent memory,
> which is not what we want. I believe the correct fix is to specify the
> GFP_DMA32 flag in the kzalloc() call.
>
> Of course, some EHCI hardware _is_ capable of using 64-bit addresses.
> But not all, and other controller types aren't. In principle we could
> create a new allocation routine, which would take a pointer to the USB
> bus as an additional argument and use it to decide whether the memory
> needs to lie below 4 GB. I'm not sure adding this extra complexity
> would be worthwhile.

Well, I thought this is exactly what the usb_buffer_alloc() abstraction
functions are there for. We already pass a pointer to a struct
usb_device, so the routine knows which host controller it operates on.
So we can safely dispatch all the magic inside this function, no?

If not, I would rather introduce a new function than adding GFP_ flags
to all existing drivers.

I vote for a clean solution, a fixup of existing implementations and
a clear note about how to allocate buffers for USB drivers. I believe
faulty allocations of this kind can explain quite a lot of problems on
x86_64 machines.

> > If what I've stated is true, there are quite some more drivers affected
> > by this issue.
>
> Practically every USB driver, I should think. And maybe a lot of
> non-USB drivers too.

I found many such problems in drivers/media/{dvb,video},
drivers/usb/serial, sound/usb and even in the USB core. If we agree on
how to fix that nicely, I can take some of them.

Daniel
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Greg KH on
On Wed, Apr 07, 2010 at 05:11:25PM +0200, Daniel Mack wrote:
> On Wed, Apr 07, 2010 at 10:59:47AM -0400, Alan Stern wrote:
> > On Wed, 7 Apr 2010, Daniel Mack wrote:
> > > Depending on the condition of the memory management, things might work
> > > or not, and especially right after a reboot, there's a better chance to
> > > get lower memory.
> > >
> > > The fix is to use usb_buffer_alloc() for that purpose which ensures
> > > memory that is suitable for DMA. And on x86_64, this also means that the
> > > upper 32 bits of the address returned are all 0's.
> >
> > That is not a good fix. usb_buffer_alloc() provides coherent memory,
> > which is not what we want. I believe the correct fix is to specify the
> > GFP_DMA32 flag in the kzalloc() call.
> >
> > Of course, some EHCI hardware _is_ capable of using 64-bit addresses.
> > But not all, and other controller types aren't. In principle we could
> > create a new allocation routine, which would take a pointer to the USB
> > bus as an additional argument and use it to decide whether the memory
> > needs to lie below 4 GB. I'm not sure adding this extra complexity
> > would be worthwhile.
>
> Well, I thought this is exactly what the usb_buffer_alloc() abstraction
> functions are there for. We already pass a pointer to a struct
> usb_device, so the routine knows which host controller it operates on.
> So we can safely dispatch all the magic inside this function, no?

Hm, yeah, I thought that is what it was for too. If not, why can't we
use it like this?

> If not, I would rather introduce a new function than adding GFP_ flags
> to all existing drivers.

I agree.

> I vote for a clean solution, a fixup of existing implementations and
> a clear note about how to allocate buffers for USB drivers. I believe
> faulty allocations of this kind can explain quite a lot of problems on
> x86_64 machines.

Yeah, I really don't want to have to change every driver in different
ways just depending on if someone thinks it is going to need to run on
this wierd hardware.

Alan, any objection to just using usb_buffer_alloc() for every driver?
Or is that too much overhead?

thanks,

greg k-h
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Daniel Mack on
On Wed, Apr 07, 2010 at 08:31:54AM -0700, Greg KH wrote:
> On Wed, Apr 07, 2010 at 05:11:25PM +0200, Daniel Mack wrote:
> > I vote for a clean solution, a fixup of existing implementations and
> > a clear note about how to allocate buffers for USB drivers. I believe
> > faulty allocations of this kind can explain quite a lot of problems on
> > x86_64 machines.
>
> Yeah, I really don't want to have to change every driver in different
> ways just depending on if someone thinks it is going to need to run on
> this wierd hardware.
>
> Alan, any objection to just using usb_buffer_alloc() for every driver?
> Or is that too much overhead?

FWIW, most drivers I've seen in the past hours use a wild mix of
kmalloc(), kzalloc(), kcalloc() and usb_buffer_alloc(). That should
really be unified.

Daniel

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/