From: FUJITA Tomonori on
On Fri, 7 May 2010 10:51:10 -0400 (EDT)
Alan Stern <stern(a)rowland.harvard.edu> wrote:

> On Fri, 7 May 2010, Daniel Mack wrote:
>
> > > At least the audio class and ua101 drivers don't do this and fill the
> > > buffers before they are submitted.
> >
> > Gnaa, you're right. I _thought_ my code does it the way I described, but
> > what I wrote is how I _wanted_ to do it, not how it's currently done. I
> > have a plan to change this in the future.
> >
> > So unfortunately, that doesn't explain it either. Sorry for the noise.
>
> At one point we tried an experiment, printing out the buffer and DMA
> addresses. I don't recall seeing anything obviously wrong, but if an
> IOMMU was in use then that might not mean anything. Is it possible
> that the IOMMU mappings sometimes get messed up for addresses above 4
> GB?

You mean that an IOMMU could allocate an address above 4GB wrongly? If
so, IIRC, all the IOMMU implementations use dev->dma_mask and
dev->coherent_dma_mask properly. And the DMA address space of the
majority of IOMMUs is limited to less than 4GB.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: David Woodhouse on
On Mon, 2010-05-10 at 11:50 +0900, FUJITA Tomonori wrote:
> On Fri, 7 May 2010 10:51:10 -0400 (EDT)
> Alan Stern <stern(a)rowland.harvard.edu> wrote:
>
> > On Fri, 7 May 2010, Daniel Mack wrote:
> >
> > > > At least the audio class and ua101 drivers don't do this and fill the
> > > > buffers before they are submitted.
> > >
> > > Gnaa, you're right. I _thought_ my code does it the way I described, but
> > > what I wrote is how I _wanted_ to do it, not how it's currently done. I
> > > have a plan to change this in the future.
> > >
> > > So unfortunately, that doesn't explain it either. Sorry for the noise.
> >
> > At one point we tried an experiment, printing out the buffer and DMA
> > addresses. I don't recall seeing anything obviously wrong, but if an
> > IOMMU was in use then that might not mean anything. Is it possible
> > that the IOMMU mappings sometimes get messed up for addresses above 4
> > GB?
>
> You mean that an IOMMU could allocate an address above 4GB wrongly? If
> so, IIRC, all the IOMMU implementations use dev->dma_mask and
> dev->coherent_dma_mask properly. And the DMA address space of the
> majority of IOMMUs is limited to less than 4GB.

The Intel IOMMU code will use dev->dma_mask and dev->coherent_dma_mask
properly. It is not limited to 4GiB, but it will tend to give virtual
DMA addresses below 4GiB even when a device is capable of more; it'll
only give out higher addresses when the address space below 4GiB is
exhausted.
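
The "prefer low addresses, go high only when the low space is exhausted"
policy can be illustrated with a toy userspace model. This is only a
sketch, not the intel-iommu allocator: struct toy_iova_domain,
toy_domain_init() and toy_alloc_iova() are hypothetical names, the 4KiB
page size is an assumption, and a simple bump pointer stands in for the
real IOVA bookkeeping.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT   12                                   /* assumed 4KiB pages */
#define TOY_4G_PFN       ((uint64_t)1 << (32 - TOY_PAGE_SHIFT)) /* first pfn at 4GiB */
#define TOY_ALLOC_FAILED UINT64_MAX

/* Hypothetical toy domain: one bump pointer below 4GiB, one above. */
struct toy_iova_domain {
    uint64_t next_low_pfn;   /* next free page frame below 4GiB */
    uint64_t next_high_pfn;  /* next free page frame at or above 4GiB */
};

void toy_domain_init(struct toy_iova_domain *d)
{
    d->next_low_pfn = 1;            /* skip page 0 so 0 is never handed out */
    d->next_high_pfn = TOY_4G_PFN;
}

/* Does the range [pfn, pfn + npages) lie entirely within dma_mask? */
int toy_range_fits(uint64_t pfn, uint64_t npages, uint64_t dma_mask)
{
    uint64_t last_byte = ((pfn + npages) << TOY_PAGE_SHIFT) - 1;
    return last_byte <= dma_mask;
}

/* Returns a DMA address, or TOY_ALLOC_FAILED if nothing fits the mask. */
uint64_t toy_alloc_iova(struct toy_iova_domain *d, uint64_t npages,
                        uint64_t dma_mask)
{
    assert(npages > 0);

    /* First choice: space below 4GiB, even for 64-bit capable devices. */
    if (d->next_low_pfn + npages <= TOY_4G_PFN &&
        toy_range_fits(d->next_low_pfn, npages, dma_mask)) {
        uint64_t addr = d->next_low_pfn << TOY_PAGE_SHIFT;
        d->next_low_pfn += npages;
        return addr;
    }

    /* Low space exhausted: go above 4GiB only if the mask allows it. */
    if (toy_range_fits(d->next_high_pfn, npages, dma_mask)) {
        uint64_t addr = d->next_high_pfn << TOY_PAGE_SHIFT;
        d->next_high_pfn += npages;
        return addr;
    }
    return TOY_ALLOC_FAILED;
}
```

In this model a device with a 64-bit mask keeps receiving addresses
below 4GiB until that range runs out, while a device with a 32-bit mask
simply fails to allocate once it does.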

--
dwmw2

From: Konrad Rzeszutek Wilk on
On Fri, May 07, 2010 at 12:24:08PM +0200, Daniel Mack wrote:
> On Fri, May 07, 2010 at 11:47:37AM +0200, Clemens Ladisch wrote:
> > Daniel Mack wrote:
> > > The problem is again (summarized):
> > >
> > > On 64bit machines, with 4GB or more, the allocated buffers for USB
> > > transfers might be beyond the 32bit boundary. In this case, the IOMMU
> > > should take care and install a DMA bounce buffer to copy over the buffer
> > > before the transfer actually happens. The problem is, however, that this
> > > copy mechanism takes place when the URB with its associated buffer is
> > > submitted, not when the EHCI will actually do the transfer.
> > >
> > > In the particular case of audio drivers, though, the contents of the
> > > buffers are likely to change after the submission. What we do here
> > > is that we map the audio stream buffers which are used by ALSA to
> > > the output URBs, so they're filled asynchronously. Once the buffer is
> > > actually sent out on the bus, it is believed to contain proper audio
> > > data. If it doesn't, that's due to too tight audio timing or other
> > > problems. This breaks once buffers are magically bounced in the
> > > background.
> >
> > At least the audio class and ua101 drivers don't do this and fill the
> > buffers before they are submitted.
>
> Gnaa, you're right. I _thought_ my code does it the way I described, but
> what I wrote is how I _wanted_ to do it, not how it's currently done. I
> have a plan to change this in the future.
>
> So unfortunately, that doesn't explain it either. Sorry for the noise.

Well, you might be on the right track. You see, when you do any DMA API
operation (say pci_map_page), you might end up with _two_ DMA addresses:
one that you get from doing 'virt_to_phys' on your buffer (which might
be above the 4GB mark), and another returned by 'pci_map_page' (which can
be the virt_to_phys of your buffer, or it can be the DMA address of the
SWIOTLB bounce buffer). If you don't submit the _right_ DMA address, or
you don't sync after the DMA transfer (so that the SWIOTLB does its
memcpy back to your allocated buffer), the data could end up sitting in
the SWIOTLB buffer: you check your kzalloc buffer and notice that
nothing is there (because pci_dma_sync.. wasn't called before the check).

But this obviously would not happen if your buffer is allocated with
GFP_DMA32.

I am not familiar with the USB stack so it might be doing this correctly
already...
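
A minimal userspace sketch of the bounce-buffer behaviour described
above. It is not the kernel's SWIOTLB code: struct toy_mapping,
toy_map(), toy_sync_for_cpu() and toy_sync_for_device() are hypothetical
names, and a fixed 64-byte array stands in for the bounce slot. It only
illustrates why the driver's buffer and the buffer the device actually
uses can disagree until a sync copies the data across.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum toy_dir { TOY_TO_DEVICE, TOY_FROM_DEVICE };

/* Hypothetical stand-in for one bounced mapping. */
struct toy_mapping {
    unsigned char *orig;       /* the driver's buffer (e.g. above 4GB) */
    unsigned char bounce[64];  /* low-memory copy the "device" really uses */
    size_t len;
    enum toy_dir dir;
};

/* Map: for TO_DEVICE the data is snapshotted into the bounce slot now. */
void toy_map(struct toy_mapping *m, unsigned char *buf, size_t len,
             enum toy_dir dir)
{
    assert(len <= sizeof(m->bounce));
    m->orig = buf;
    m->len = len;
    m->dir = dir;
    memset(m->bounce, 0, sizeof(m->bounce));
    if (dir == TOY_TO_DEVICE)
        memcpy(m->bounce, buf, len);
}

/* Sync for CPU: copy what the device wrote back into the driver's buffer. */
void toy_sync_for_cpu(struct toy_mapping *m)
{
    if (m->dir == TOY_FROM_DEVICE)
        memcpy(m->orig, m->bounce, m->len);
}

/* Sync for device: push later CPU-side changes into the bounce slot. */
void toy_sync_for_device(struct toy_mapping *m)
{
    if (m->dir == TOY_TO_DEVICE)
        memcpy(m->bounce, m->orig, m->len);
}
```

In this model, data the device wrote stays invisible in the driver's
buffer until toy_sync_for_cpu() runs, and anything the driver writes
after toy_map() never reaches the device unless toy_sync_for_device()
is called.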
From: Konrad Rzeszutek Wilk on
> > > Either the data isn't getting written to the buffer correctly or else
> > > the buffer isn't getting sent to the device correctly. Can anybody
> > > suggest a means of determining which is the case?
> >
> > I can't say anything about this log, which includes only DMA addresses.
> > I'm not familiar with how the USB core does DMA stuff. And the USB
> > stack design, in which the USB core does the DMA stuff (allocating,
> > mapping, etc), makes debugging DMA issues really difficult.
>
> The DMA stuff is simple enough in this case. The urb->transfer_buffer
> address is passed to dma_map_single(), and the DMA address it returns
> is stored in urb->transfer_dma. Those are the two values printed out
> by the debugging patch.

Is that address (urb->transfer_dma) the same as
'virt_to_phys(urb->transfer_buffer)'? If not, then SWIOTLB is being
utilized. And is dma_sync_* called on urb->transfer_dma (to properly
sync the data from the SWIOTLB to the transfer_buffer) before you start
using urb->transfer_buffer?
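
The comparison asked about above can be sketched as a small userspace
helper that takes the two values printed by the debugging patch. The
names toy_classify_mapping() and toy_dma_addr_ok() are hypothetical, and
the physical address is assumed to have been obtained separately (for
example by printing virt_to_phys() of the buffer in the kernel). Note
that a mismatch only shows that some translation happened; a hardware
IOMMU would produce one just as SWIOTLB does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum toy_map_kind {
    TOY_MAP_DIRECT,      /* DMA address equals the physical address */
    TOY_MAP_TRANSLATED,  /* bounced via SWIOTLB or remapped by an IOMMU */
};

/* Compare the buffer's physical address with the DMA handle. */
enum toy_map_kind toy_classify_mapping(uint64_t phys_addr, uint64_t dma_addr)
{
    return phys_addr == dma_addr ? TOY_MAP_DIRECT : TOY_MAP_TRANSLATED;
}

/* Does the whole range [dma_addr, dma_addr + len) fit under dma_mask? */
int toy_dma_addr_ok(uint64_t dma_addr, size_t len, uint64_t dma_mask)
{
    assert(len > 0);
    /* Written this way to avoid overflow in dma_addr + len. */
    return dma_addr <= dma_mask && (uint64_t)(len - 1) <= dma_mask - dma_addr;
}
```

Run over each logged pair, this would show whether any buffer above
4GB was handed to a 32-bit device untranslated.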
From: FUJITA Tomonori on
On Tue, 11 May 2010 10:24:40 -0400
Konrad Rzeszutek Wilk <konrad.wilk(a)oracle.com> wrote:

> > > > Either the data isn't getting written to the buffer correctly or else
> > > > the buffer isn't getting sent to the device correctly. Can anybody
> > > > suggest a means of determining which is the case?
> > >
> > > I can't say anything about this log, which includes only DMA addresses.
> > > I'm not familiar with how the USB core does DMA stuff. And the USB
> > > stack design, in which the USB core does the DMA stuff (allocating,
> > > mapping, etc), makes debugging DMA issues really difficult.
> >
> > The DMA stuff is simple enough in this case. The urb->transfer_buffer
> > address is passed to dma_map_single(), and the DMA address it returns
> > is stored in urb->transfer_dma. Those are the two values printed out
> > by the debugging patch.
>
> Is that address (urb->transfer_dma) the same as 'virt_to_phys(urb->transfer_buffer)'
> (if not, then SWIOTLB is being utilized) and is the dma_sync_* done on the
> urb->transfer_dma (to properly sync the data from the SWIOTLB to the
> transfer_buffer) before you start using the urb->transfer_buffer?

Or calling dma_unmap_single.

Can you describe the exact sequence of DMA operations that the USB core
and the driver perform?