From: Evan Lavelle on
I'm writing a driver for a PCIe card which has to support DMA.

I can get this to work by using 'pci_alloc_consistent' to get a coherent
mapping for a DMA buffer (when I pass the returned 'dma_addr_t' to my
card, it can use it to successfully DMA into the PC).

The problem is that I actually want to use a streaming mapping, for
direct I/O, with a scatter-gather list, and I can't get this to work.
'pci_map_sg' returns me a bus address for the user's read buffer, but
this doesn't appear to be a valid bus address. When the PCIe card DMAs
to this bus address, the DMA operation appears to complete, but the
user's read buffer is not modified. Any ideas?

This is the code that works:

dmaCPUAddr = pci_alloc_consistent(
    PCI_Dev_Cfg, dmaBufSize, dmaPCIBusAddr);

This allocates a 64KB kernel buffer. The PCIe card can DMA into this
buffer, and I can then copy this buffer back to the user.
'dmaPCIBusAddr' is set to something in the region of 0x32xxxxxx to
0x34xxxxxx (on x86), so this is presumably a valid bus address into
kernel memory.

This is a simplified version of the code which doesn't work:

===================================================
// get bus addresses to DMA into user memory
// 'userAddr', for 'numPages' pages

down_read(&current->mm->mmap_sem);
ret = get_user_pages(
    current, current->mm, userAddr,
    numPages, 1, 1, pageList, NULL);
up_read(&current->mm->mmap_sem);

if(ret != numPages) {...error}

....'kmalloc' and clear scatterlist 'sgList'

for(i = 0; i < numPages; i++) {
    if(pageList[i] == NULL)
        ... error
    sgList[i].page = pageList[i];
    sgList[i].offset = 0;
    sgList[i].length = PAGE_SIZE;
}

sgLen = pci_map_sg(
    pPciDev,     // the PCI device
    sgList,      // the place to store the list
    numPages,    // how many initial pages are in it
    direction);  // data flow direction

if(sgLen == 0) { ...error }

{ // DEBUG
    struct scatterlist *sg = sgList;
    for(i = 0; i < sgLen; i++, sg++)
        printk(
            KERN_INFO "busAddr 0x%08x; len %d\n",
            (u32)sg_dma_address(sg), sg_dma_len(sg));
}
===================================================

For one page of user memory, 'sg_dma_address' returns a bus address in
the region of 0x13xxxxxx (on x86). When the PCIe card tries to DMA to
this address, the data disappears - I can't see it in my userland test
program. Any ideas?
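
A side note on the debug loop above: the '(u32)' cast keeps only the low 32 bits of the bus address, so on a kernel where 'dma_addr_t' is wider than 32 bits the printed value can hide upper bits that are set. The user-space sketch below only models the formatting; 'bus_addr_t', the two helper names, and any sample address passed in are made up for illustration and are not part of the driver.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for a 64-bit dma_addr_t (assumption: PAE or 64-bit kernel). */
typedef uint64_t bus_addr_t;

/* Format the address the way the debug loop does: low 32 bits only. */
static int format_truncated(char *buf, size_t n, bus_addr_t addr)
{
    return snprintf(buf, n, "0x%08" PRIx32, (uint32_t)addr);
}

/* Format the full 64-bit value so that any upper bits stay visible. */
static int format_full(char *buf, size_t n, bus_addr_t addr)
{
    return snprintf(buf, n, "0x%016" PRIx64, (uint64_t)addr);
}
```

In kernel code the usual equivalent is to print with '%llx' and cast the value to 'unsigned long long'.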

Thanks -

Evan

================================================

Extra information:

- modern x86_64 HP multiprocessor server motherboard, running 32-bit
RHEL 5.1

- 2.6.18-53el5xen, i686/athlon/i386

- 'page_address(pageList[0])' sometimes returns NULL, which I find
surprising - isn't this meant to be locked down?

- 'kmap(pageList[0])' always returns a valid address, and I can use this
address to write directly to the user-space buffer from the driver

- 'virt_to_bus', 'virt_to_phys', and '__pa' don't seem to do anything
useful on this platform; they just give a fixed offset from the user
virtual address, which is nowhere near the bus address which works
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Evan Lavelle on
Made some progress here. The problem is that this is a 32-bit PAE kernel,
so 'dma_addr_t' is 64-bit. However, I have a 32-bit PCIe card, so I need
a 32-bit dma_addr_t. How do I do this? In other words, how do I handle
32-bit PCI cards on PAE or 64-bit systems? My code sets the DMA mask to
32 bits but this is *not* sufficient:

pci_set_dma_mask(my_dev, DMA_32BIT_MASK)

Is this a bug, or do I have to do something else? LDD doesn't seem to
have anything to say about this. I had previously assumed that an IOMMU
would translate the (32-bit) dma_addr_t to a 36- or 64-bit value, but I
don't think there's an IOMMU in this system. Do x86 systems have IOMMUs?
This is a server motherboard, so I don't think it even has AGP. However,
even if I had an IOMMU, I would still need a way to generate a 32-bit
dma_addr_t to start with.

Second problem: can I use the scatter-gather code ('pci_map_sg') on PAE
or 64-bit systems? I've found one post that says this isn't possible,
and that the DAC routines have to be used instead (second post in
http://www.alteraforum.com/forum/showthread.php?t=4171). These comments
seem to be incorrect, but I'd appreciate some confirmation of this.

The specific question in my first post was why the coherent mapping
worked, and the streaming mapping didn't. The answer was that, for this
system, the dma_addr_t for a coherent buffer in kernel space is in the
low 4GB, but the dma_addr_t for the streaming buffer in user space has
bit 32 set. I hadn't realised that dma_addr_t was 64-bit, and I was just
writing the low 32 bits to the DMA registers on the PCI card. The DMA op
to the coherent buffer worked, but the DMA op to the streaming buffer
didn't, since the PCIe card can't drive bit 32.
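
The failure mode described above can be sketched as a small check. This is a user-space model, not kernel code: 'dma_range_fits' and 'low32' are hypothetical helper names, 'uint64_t' stands in for a 64-bit 'dma_addr_t', and 'MASK_32BIT' simply repeats the value of 'DMA_32BIT_MASK'.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MASK_32BIT 0x00000000ffffffffULL  /* same value as DMA_32BIT_MASK */

/* True if every byte of [addr, addr + len) is addressable under 'mask'. */
static bool dma_range_fits(uint64_t addr, size_t len, uint64_t mask)
{
    uint64_t last;

    if (len == 0)
        return true;
    /* Test the last byte so ranges that straddle the limit are caught. */
    last = addr + (uint64_t)len - 1;
    if (last < addr)            /* wrapped around 2^64 */
        return false;
    return last <= mask;
}

/* What the card is told if only the low 32 bits are written to it. */
static uint32_t low32(uint64_t addr)
{
    return (uint32_t)addr;
}
```

A driver could run a check along these lines on each mapped address before programming the card, and fail the request (or fall back to a low buffer) instead of silently truncating.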

Thanks -

Evan
--
From: FUJITA Tomonori on
On Wed, 04 Aug 2010 10:26:29 +0100
Evan Lavelle <sa212+lkml(a)cyconix.com> wrote:

> Made some progress here. The problem is that this is 32-bit PAE kernel,
> so 'dma_addr_t' is 64-bit. However, I have a 32-bit PCIe card, so I need
> a 32-bit dma_addr_t. How do I do this? In other words, how do I handle
> 32-bit PCI cards on PAE or 64-bit systems? My code sets the DMA mask to
> 32 bits but this is *not* sufficient:
>
> pci_set_dma_mask(my_dev, DMA_32BIT_MASK)

It doesn't work on an x86_32 kernel unless your driver sits under the
block layer or the network subsystem.

If your driver can't handle 64-bit DMA, you need a bounce buffer. I don't
know what your driver does, but presumably a subsystem passes a buffer to
your driver. If that buffer is not below the 32-bit address limit then,
for example when you read data from hardware, you need to allocate a
temporary buffer (below the 32-bit limit), do the DMA into that buffer,
copy the data to the original buffer, then free the temporary buffer.

The block layer and the network subsystem have their own bounce
mechanisms. The x86_64 kernel has swiotlb, which is a generic bounce
buffer mechanism. So if a driver sets the DMA mask, they do the
bouncing for the driver.
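
The read path described above (allocate a temporary buffer, DMA into it, copy out, free) can be modelled roughly as follows. This is a user-space sketch of the control flow only: 'bounced_read', 'dma_read_fn' and 'fake_device_fill' are hypothetical names, and 'malloc' merely stands in for an allocator that, in a real driver, would have to guarantee an address below 4GB.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical device model: "DMAs" len bytes into dst. */
typedef void (*dma_read_fn)(void *dst, size_t len);

/*
 * Bounced read: let the device write into a temporary buffer, then copy
 * the data to the real target. malloc() only models the control flow; it
 * gives no guarantee about where the buffer lives.
 */
static int bounced_read(void *target, size_t len, dma_read_fn do_dma)
{
    void *bounce;

    if (len == 0)
        return 0;
    bounce = malloc(len);
    if (bounce == NULL)
        return -1;
    do_dma(bounce, len);            /* device writes into the low buffer */
    memcpy(target, bounce, len);    /* CPU copies to the original buffer */
    free(bounce);
    return 0;
}

/* Sample "device" that fills the buffer with 0xA5 (illustration only). */
static void fake_device_fill(void *dst, size_t len)
{
    memset(dst, 0xA5, len);
}
```

The extra copy is the cost of bouncing; the benefit is that the device only ever sees an address it can drive.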
--
From: Evan Lavelle on
FUJITA Tomonori wrote:
>> Made some progress here. The problem is that this is 32-bit PAE kernel,
>> so 'dma_addr_t' is 64-bit. However, I have a 32-bit PCIe card, so I need
>> a 32-bit dma_addr_t. How do I do this? In other words, how do I handle
>> 32-bit PCI cards on PAE or 64-bit systems? My code sets the DMA mask to
>> 32 bits but this is *not* sufficient:
>>
>> pci_set_dma_mask(my_dev, DMA_32BIT_MASK)
>
> It doesn't work on x86_32 kernel if your driver doesn't work with the
> block layer or the network subsystem.

Sorry, not sure that I understand this. Are you saying that I can't set
a DMA mask on x86_32 unless I have a block or network driver?

> If your driver can't handle 64bit DMA, you need bounce buffer.

The problem is not that I can't handle 64-bit DMA in the driver, but
that the PCI card can't do 64-bit DMA. I tell the kernel this by calling
'pci_set_dma_mask' with a 32-bit mask, but it appears to be ignoring my
request and then giving me a 64-bit dma_addr_t for the 32-bit PCI card.

Thanks -

Evan
--
From: FUJITA Tomonori on
On Wed, 04 Aug 2010 12:22:32 +0100
Evan Lavelle <sa212+lkml(a)cyconix.com> wrote:

> FUJITA Tomonori wrote:
> >> Made some progress here. The problem is that this is 32-bit PAE kernel,
> >> so 'dma_addr_t' is 64-bit. However, I have a 32-bit PCIe card, so I need
> >> a 32-bit dma_addr_t. How do I do this? In other words, how do I handle
> >> 32-bit PCI cards on PAE or 64-bit systems? My code sets the DMA mask to
> >> 32 bits but this is *not* sufficient:
> >>
> >> pci_set_dma_mask(my_dev, DMA_32BIT_MASK)
> >
> > It doesn't work on x86_32 kernel if your driver doesn't work with the
> > block layer or the network subsystem.
>
> Sorry, not sure that I understand this. Are you saying that I can't set
> a DMA mask on x86_32 unless I have a block or network driver?

Yeah, the mask is ignored. As I wrote in the previous mail, x86_32
doesn't have a bounce mechanism so dma_map_{single|sg} can't do
anything for a buffer above 32bit even if the mask is 32bit.


> > If your driver can't handle 64bit DMA, you need bounce buffer.
>
> The problem is not that I can't handle 64-bit DMA in the driver, but
> that the PCI card can't do 64-bit DMA. I tell the kernel this by calling
> 'pci_set_dma_mask' with a 32-bit mask, but it appears to be ignoring my
> request and then giving me a 64-bit dma_addr_t for the 32-bit PCI card.

If your card can't do 64-bit DMA, you need a bounce buffer mechanism.

Options are:

- your driver implements its own bounce buffer mechanism (as some
drivers do).

- add swiotlb support to x86_32 (I don't think that it's difficult, but
I might be missing something).
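
For the first option, a driver-side mechanism would need to decide, per scatter-gather segment, whether the device can reach the mapped address. A minimal user-space sketch of that decision follows; 'struct seg' and 'plan_bounce' are hypothetical, and the segment values would come from 'sg_dma_address' and 'sg_dma_len' in a real driver.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of one mapped scatter-gather entry. */
struct seg {
    uint64_t bus_addr;  /* would come from sg_dma_address() */
    uint32_t len;       /* would come from sg_dma_len() */
};

/*
 * Flag each segment whose last byte lies above 'mask' and return the total
 * number of bytes that would have to go through a bounce buffer.
 */
static size_t plan_bounce(const struct seg *segs, size_t n, uint64_t mask,
                          unsigned char *needs_bounce)
{
    size_t i, bytes = 0;

    for (i = 0; i < n; i++) {
        uint64_t last;

        needs_bounce[i] = 0;
        if (segs[i].len == 0)
            continue;
        last = segs[i].bus_addr + segs[i].len - 1;
        /* Out of reach if it wraps or ends above the mask. */
        if (last < segs[i].bus_addr || last > mask) {
            needs_bounce[i] = 1;
            bytes += segs[i].len;
        }
    }
    return bytes;
}
```

The returned byte count gives a rough idea of how large a low-memory bounce area a request would need; segments that are not flagged can be handed to the card directly.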
--