From: Alan Stern on
On Wed, 17 Mar 2010, FUJITA Tomonori wrote:

> dma_sync_single_for_* can do a partial sync but dma_sync_sg_for_*
> doesn't support a partial sync.
>
>
> > So it isn't clear that dma_sync_sg_for_cpu(dev, sg, 1, dir) can be used
> > on a mapping created by dma_map_sg(dev, sg, n, dir),
>
> You should not do that (though it might work).
>
> > and it isn't
> > clear that dma_sync_single_for_cpu() can be used on a mapping created
> > by dma_map_sg().
>
> You should not do that (though it might work).
>
>
> > But if you guys say it will work, I'll go ahead and use
> > dma_sync_single_for_cpu().
> >
>
> Well, it's undocumented. It might work but might not.

It's a real problem. I need it to work correctly.

Here's the situation. The USB controller drivers don't all support
scatter-gather operation. So there's a library routine in the USB core
which calls dma_map_sg() and then creates a separate I/O request for
each scatterlist element. The driver can process these requests one at
a time, and when they are all finished the library routine calls
dma_unmap_sg().

However... For tracing purposes (usbmon -- like tcpdump but for USB),
we may need to copy the data from each I/O request's transfer buffer.
Unfortunately, this copying is done as each request is submitted (for
output) or as it completes (for input), at which times the buffers are
all mapped for DMA. That's the problem.

Would we be better off not using dma_map_sg() at all in this situation?
We could map each scatterlist buffer individually with dma_map_single()
as the request is submitted and then unmap the buffer when the request
completes, just like with non-sg transfers; then the problem wouldn't
arise.

Would there be any significant penalty for doing this? I realize it
would prevent adjacent buffers from getting coalesced, but that's
probably okay. Any other reason not to?
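
To make the proposal concrete, here is a rough user-space model of the
per-request flow for an input transfer. It is only a sketch: mock_map_single(),
mock_unmap_single(), trace_copy(), run_input_requests() and struct seg are
hypothetical stand-ins invented for illustration, not kernel interfaces. In a
real driver the map and unmap steps would be dma_map_single() and
dma_unmap_single(), and the trace copy would be whatever usbmon does. The
model only tracks who owns each buffer, to show that the copy always happens
while the CPU owns it.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical ownership model; NOT the kernel DMA API. */
enum owner { OWNER_CPU, OWNER_DEVICE };

struct seg {                    /* models one scatterlist element */
    unsigned char *buf;
    size_t len;
    enum owner owner;
};

/* Stand-in for dma_map_single(): the buffer now belongs to the device. */
static void mock_map_single(struct seg *s)   { s->owner = OWNER_DEVICE; }

/* Stand-in for dma_unmap_single(): the buffer is back with the CPU. */
static void mock_unmap_single(struct seg *s) { s->owner = OWNER_CPU; }

/*
 * Tracing hook (assumed): copy the transfer buffer, but only while the
 * CPU owns it. Returns the number of bytes copied, or -1 if the buffer
 * is still mapped for DMA.
 */
static long trace_copy(const struct seg *s, unsigned char *dst, size_t cap)
{
    size_t n;

    if (s->owner != OWNER_CPU)
        return -1;
    n = s->len < cap ? s->len : cap;
    memcpy(dst, s->buf, n);
    return (long)n;
}

/*
 * One I/O request per element: map at submission, unmap at completion,
 * then trace. Returns the total bytes traced, or -1 on an ownership error.
 */
static int run_input_requests(struct seg *segs, size_t nsegs,
                              unsigned char *trace, size_t cap)
{
    size_t off = 0;
    size_t i;

    for (i = 0; i < nsegs; i++) {
        long n;

        mock_map_single(&segs[i]);
        /* Simulated device write; real hardware would DMA into buf. */
        memset(segs[i].buf, (int)('A' + i), segs[i].len);
        mock_unmap_single(&segs[i]);

        n = trace_copy(&segs[i], trace + off, cap - off);
        if (n < 0)
            return -1;
        off += (size_t)n;
    }
    return (int)off;
}
```

With dma_map_sg() covering the whole list, the equivalent of trace_copy()
would run while every element is still in the OWNER_DEVICE state, which is
the situation described above.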

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: FUJITA Tomonori on
On Wed, 17 Mar 2010 11:22:01 -0400 (EDT)
Alan Stern <stern(a)rowland.harvard.edu> wrote:

> Here's the situation. The USB controller drivers don't all support
> scatter-gather operation. So there's a library routine in the USB core
> which calls dma_map_sg() and then creates a separate I/O request for
> each scatterlist element. The driver can process these requests one at
> a time, and when they are all finished the library routine calls
> dma_unmap_sg().
>
> However... For tracing purposes (usbmon -- like tcpdump but for USB),
> we may need to copy the data from each I/O request's transfer buffer.
> Unfortunately, this copying is done as each request is submitted (for
> output) or as it completes (for input), at which times the buffers are
> all mapped for DMA. That's the problem.
>
> Would we be better off not using dma_map_sg() at all in this situation?
> We could map each scatterlist buffer individually with dma_map_single()
> as the request is submitted and then unmap the buffer when the request
> completes, just like with non-sg transfers; then the problem wouldn't
> arise.
>
> Would there be any significant penalty for doing this? I realize it
> would prevent adjacent buffers from getting coalesced, but that's
> probably okay. Any other reason not to?

No reason. About merging adjacent buffers, there are few IOMMU
implementations that do. The recent IOMMU implementations such as VT-d
and AMD IOMMU don't.
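
As a rough illustration of what merging would save, the sketch below counts
how many DMA segments are left if neighbouring buffers that happen to be
contiguous are merged. It is a simplified user-space model: struct phys_seg
and count_merged() are hypothetical names, and a real IOMMU that merges does
so in its own I/O address space under extra constraints (segment size and
boundary limits) that are ignored here.

```c
#include <stddef.h>

/* Hypothetical description of one buffer by start address and length. */
struct phys_seg {
    unsigned long addr;
    unsigned long len;
};

/*
 * Count the segments that remain when each buffer that starts exactly
 * where the previous one ends is merged into it. Mapping every buffer
 * individually always yields n segments instead.
 */
static size_t count_merged(const struct phys_seg *s, size_t n)
{
    size_t out = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        if (i == 0 || s[i - 1].addr + s[i - 1].len != s[i].addr)
            out++;      /* not adjacent: starts a new segment */
    }
    return out;
}
```

For three buffers of which only the first two are adjacent, this gives two
segments with merging against three without, so the loss from mapping each
buffer separately is small even where merging is implemented.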

If drivers don't support scatter-gather operation, they had better use
dma_map_page() instead of forging scatter-gather lists and playing
with dma_map_sg().
From: Alan Stern on
On Thu, 18 Mar 2010, FUJITA Tomonori wrote:

> On Wed, 17 Mar 2010 11:22:01 -0400 (EDT)
> Alan Stern <stern(a)rowland.harvard.edu> wrote:
>
> > Here's the situation. The USB controller drivers don't all support
> > scatter-gather operation. So there's a library routine in the USB core
> > which calls dma_map_sg() and then creates a separate I/O request for
> > each scatterlist element. The driver can process these requests one at
> > a time, and when they are all finished the library routine calls
> > dma_unmap_sg().
> >
> > However... For tracing purposes (usbmon -- like tcpdump but for USB),
> > we may need to copy the data from each I/O request's transfer buffer.
> > Unfortunately, this copying is done as each request is submitted (for
> > output) or as it completes (for input), at which times the buffers are
> > all mapped for DMA. That's the problem.
> >
> > Would we be better off not using dma_map_sg() at all in this situation?
> > We could map each scatterlist buffer individually with dma_map_single()
> > as the request is submitted and then unmap the buffer when the request
> > completes, just like with non-sg transfers; then the problem wouldn't
> > arise.
> >
> > Would there be any significant penalty for doing this? I realize it
> > would prevent adjacent buffers from getting coalesced, but that's
> > probably okay. Any other reason not to?
>
> No reason. About merging adjacent buffers, there are few IOMMU
> implementations that do. The recent IOMMU implementations such as VT-d
> and AMD IOMMU don't.
>
> If drivers don't support scatter-gather operation, they had better use
> dma_map_page() instead of forging scatter-gather lists and playing
> with dma_map_sg().

Okay, I'll handle it that way. Thanks for the advice.

Alan Stern
