From: Joerg Roedel on
On Wed, Mar 31, 2010 at 05:08:38PM -0700, Tom Lyon wrote:
> uio_pci_generic has previously been discussed on the KVM list, but this patch
> has nothing to do with KVM, so it is also going to LKML.

But since you sent it to the KVM list it should be suitable for KVM too,
no?

> The point of this patch is to beef up the uio_pci_generic driver so that a
> non-privileged user process can run a user level driver for most PCIe
> devices. This can only be safe if there is an IOMMU in the system with
> per-device domains. Privileged users (CAP_SYS_RAWIO) are allowed if there is
> no IOMMU.

If you rely on an IOMMU you can use the IOMMU-API instead of the DMA-API
for DMA mappings. This change would make the driver suitable for KVM use
too. If the interface is designed cleverly enough we can even use it for
IOMMU emulation for pass-through devices.

Joerg

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Joerg Roedel on
On Thu, Apr 01, 2010 at 08:40:34AM -0700, Tom Lyon wrote:
> On Thursday 01 April 2010 05:52:18 am Joerg Roedel wrote:

> > > The point of this patch is to beef up the uio_pci_generic driver so that
> > > a non-privileged user process can run a user level driver for most PCIe
> > > devices. This can only be safe if there is an IOMMU in the system with
> > > per-device domains. Privileged users (CAP_SYS_RAWIO) are allowed if
> > > there is no IOMMU.
> >
> > If you rely on an IOMMU you can use the IOMMU-API instead of the DMA-API
> > for DMA mappings. This change would make the driver suitable for KVM use
> > too. If the interface is designed cleverly enough we can even use it for
> > IOMMU emulation for pass-through devices.

> The use with privileged processes and no IOMMUs is still quite useful, so I'd
> rather stick with the DMA interface.

For the KVM use-case we need to be able to specify the I/O virtual
address for a given process virtual address. This is not possible with
the DMA-API interface. So if we want to have uio-dma without a hardware
IOMMU we need two distinct interfaces for userspace to cover all
use-cases. I don't think it's worth it to have two interfaces.

Joerg

From: Michael S. Tsirkin on
On Wed, Mar 31, 2010 at 05:08:38PM -0700, Tom Lyon wrote:
> uio_pci_generic has previously been discussed on the KVM list, but this patch
> has nothing to do with KVM, so it is also going to LKML.
>
> The point of this patch is to beef up the uio_pci_generic driver so that a
> non-privileged user process can run a user level driver for most PCIe
> devices. This can only be safe if there is an IOMMU in the system with
> per-device domains.

Why? Per-guest domain should be safe enough.

> Privileged users (CAP_SYS_RAWIO) are allowed if there is
> no IOMMU.

qemu does not support it, I doubt this last option is worth having.

> Specifically, I seek to allow low-latency user level network drivers (non
> tcp/ip) which directly access SR-IOV style virtual network adapters, for use
> with packages such as OpenMPI.
>
> Key areas of change:
> - ioctl extensions to allow registration and dma mapping of memory regions,
> with lock accounting
> - support for mmu notifier driven de-mapping
> - support for MSI and MSI-X interrupts (the intel 82599 VFs support only
> MSI-X)
> - allowing interrupt enabling and device register mapping all
> through /dev/uio* so that permissions may be granted just by chmod
> on /dev/uio*

For non-privileged users, we need a way to enforce that the
device is bound to an IOMMU.

Further, locking really needs to be scoped to IOMMU domain existence
and to IOMMU mappings: as long as a page is mapped in the IOMMU,
it must be locked. This patch does not seem to enforce that.

Also note that what we really want is a single iommu domain per guest,
not per device.

For this reason, I think we should address the problem somewhat
differently:
- Create a character device to represent the iommu
- This device will handle memory locking etc
- Allow binding this device to iommu
- Allow other operations only after iommu is bound

Thanks!

--
MST
From: Joerg Roedel on
On Thu, Apr 01, 2010 at 05:25:04PM +0300, Michael S. Tsirkin wrote:
> On Wed, Mar 31, 2010 at 05:08:38PM -0700, Tom Lyon wrote:
> > uio_pci_generic has previously been discussed on the KVM list, but this patch
> > has nothing to do with KVM, so it is also going to LKML.
> >
> > The point of this patch is to beef up the uio_pci_generic driver so that a
> > non-privileged user process can run a user level driver for most PCIe
> > devices. This can only be safe if there is an IOMMU in the system with
> > per-device domains.
>
> Why? Per-guest domain should be safe enough.

Hardware IOMMUs don't have something like a per-guest domain ;-)
Anyway, if we want to emulate an IOMMU in the guest and make this
work for pass-through devices too we need more than one domain per
guest. Essentially we may need one domain per device.

> > Privileged users (CAP_SYS_RAWIO) are allowed if there is
> > no IOMMU.
>
> qemu does not support it, I doubt this last option is worth having.

Agreed.

> For this reason, I think we should address the problem somewhat
> differently:
> - Create a character device to represent the iommu
> - This device will handle memory locking etc
> - Allow binding this device to iommu
> - Allow other operations only after iommu is bound

Yes, something like this is needed. But I think we can implement this in
the generic uio-pci driver. A separate interface which basically passes
the IOMMU-API functions through to userspace doesn't make sense because
it would also be device-centric, just like the uio-pci driver.

Joerg

From: Joerg Roedel on
On Thu, Apr 01, 2010 at 12:18:27PM -0700, Tom Lyon wrote:
> On Thursday 01 April 2010 09:07:47 am Joerg Roedel wrote:
> > For the KVM use-case we need to be able to specify the I/O virtual
> > address for a given process virtual address. This is not possible with
> > the DMA-API interface. So if we want to have uio-dma without a hardware
> > IOMMU we need two distinct interfaces for userspace to cover all
> > use-cases. I don't think it's worth it to have two interfaces.
>
> I started to add that capability but then realized that the IOMMU API also
> doesn't allow it. The map function takes a range of physically contiguous
> pages, not virtual addresses.

The IOMMU-API allows that. You have to convert the user-virtual
addresses into physical addresses first. The current KVM code
already does this and uses the IOMMU-API later. You can have a look at
the gfn_to_pfn() function for a way to implement this.

> My preferred approach would be to add a DMA_ATTR that would request
> allocation of DMA at a specific device/iommu address.

No, that would be feature duplication between both APIs. Not to mention
the implementation hell this additional dma-api feature would cause for
the iommu driver developers.

Joerg
