From: Avi Kivity on 2 Jun 2010 06:00

On 06/02/2010 12:42 PM, Joerg Roedel wrote:
> On Tue, Jun 01, 2010 at 12:55:32PM +0300, Michael S. Tsirkin wrote:
>
>> There seems to be some misunderstanding. The userspace interface
>> proposed forces a separate domain per device and forces userspace to
>> repeat iommu programming for each device. We are better off sharing a
>> domain between devices and programming the iommu once.
>>
>> The natural way to do this is to have an iommu driver for programming
>> iommu.
>>
> IMO a separate iommu-userspace driver is a nightmare for a userspace
> interface. It is just too complicated to use. We can solve the problem
> of multiple devices-per-domain with an ioctl which allows binding one
> uio-device to the address-space on another. That's much simpler.
>

This is non-trivial with hotplug.

--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Joerg Roedel on 2 Jun 2010 06:10

On Wed, Jun 02, 2010 at 12:49:28PM +0300, Avi Kivity wrote:
> On 06/02/2010 12:45 PM, Joerg Roedel wrote:
>> IOMMU mapped memory can not be swapped out because we can't do demand
>> paging on io-page-faults with current devices. We have to pin _all_
>> userspace memory that is mapped into an IOMMU domain.
>
> vhost doesn't pin memory.
>
> What I proposed is to describe the memory map using an object (fd), and
> pass it around to clients that use it: kvm, vhost, vfio. That way you
> maintain the memory map in a central location and broadcast changes to
> clients. Only a vfio client would result in memory being pinned.

Ah ok, so it's only about the database which keeps the mapping
information.

> It can still work, but the interface needs to be extended to include
> dirty bitmap logging.

That's hard to do. I am not sure about VT-d but the AMD IOMMU has no
dirty-bits in the page-table. And without demand-paging we can't really
tell what pages a device has written to. The only choice is to mark all
IOMMU-mapped pages dirty as long as they are mapped.

Joerg
From: Joerg Roedel on 2 Jun 2010 06:20

On Wed, Jun 02, 2010 at 12:53:12PM +0300, Michael S. Tsirkin wrote:
> On Wed, Jun 02, 2010 at 11:42:01AM +0200, Joerg Roedel wrote:
> > IMO a separate iommu-userspace driver is a nightmare for a userspace
> > interface. It is just too complicated to use.
>
> One advantage would be that we can reuse the uio framework
> for the devices themselves. So an existing app can just program
> an iommu for DMA and keep using uio for interrupts and access.

The driver is called UIO and not U-INTR-MMIO ;-) So I think handling
IOMMU mappings belongs there.

> > We can solve the problem
> > of multiple devices-per-domain with an ioctl which allows binding one
> > uio-device to the address-space on another.
>
> This would imply switching an iommu domain for a device while
> it could potentially be doing DMA. No idea whether this can be done
> in a safe manner.

It can. The worst thing that can happen is an io-page-fault.

> Forcing iommu assignment to be done as a first step seems much saner.

If we force it, there is no reason not to do it implicitly. We can do
something like this then:

    dev1 = open();
    /* creates IOMMU domain and assigns dev1 to it */
    ioctl(dev1, IOMMU_MAP, ...);

    dev2 = open();
    ioctl(dev2, IOMMU_MAP, ...);

    /* Now dev1 and dev2 are in separate domains */

    /* destroys all mappings for dev2 and assigns it to the same domain
       as dev1. Domain has a refcount of two now */
    ioctl(dev2, IOMMU_SHARE, dev1);

    close(dev1); /* domain refcount goes down to one */
    close(dev2); /* domain refcount is zero and domain gets destroyed */

Joerg
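The refcount lifecycle described in the message above can be modeled in plain userspace C. This is only a sketch of the proposed semantics, not kernel code: `struct domain`, `struct device`, `dev_map`, `dev_share`, `dev_close` and the `domains_alive` counter are hypothetical names invented here, standing in for the proposed IOMMU_MAP and IOMMU_SHARE ioctls and for close().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical model of an IOMMU domain; not the kernel's data structure. */
struct domain {
    int refcount;     /* number of devices attached to this domain */
    int nr_mappings;  /* stands in for the domain's page-table contents */
};

/* Hypothetical model of an opened uio device. */
struct device {
    struct domain *dom;  /* NULL until the first mapping is created */
};

int domains_alive;  /* live domains, kept only so the model can be checked */

struct domain *domain_new(void)
{
    struct domain *d = calloc(1, sizeof(*d));
    if (!d)
        return NULL;
    d->refcount = 1;
    domains_alive++;
    return d;
}

/* Drop one reference; the domain and its mappings go away at zero. */
void domain_put(struct domain *d)
{
    if (d && --d->refcount == 0) {
        domains_alive--;
        free(d);
    }
}

/* Models IOMMU_MAP: the first mapping implicitly creates the domain. */
int dev_map(struct device *dev)
{
    if (!dev->dom) {
        dev->dom = domain_new();
        if (!dev->dom)
            return -1;
    }
    dev->dom->nr_mappings++;
    return 0;
}

/* Models IOMMU_SHARE: dev gives up its own domain and joins other's. */
int dev_share(struct device *dev, struct device *other)
{
    if (!other->dom)
        return -1;           /* nothing to share yet */
    if (dev->dom == other->dom)
        return 0;            /* already in the same domain */
    domain_put(dev->dom);    /* drops dev's old mappings if it was the last user */
    dev->dom = other->dom;
    dev->dom->refcount++;
    return 0;
}

/* Models close(): detach the device and drop its reference. */
void dev_close(struct device *dev)
{
    domain_put(dev->dom);
    dev->dom = NULL;
}
```

Walking two zero-initialized devices through map, map, share, close, close reproduces the counts given in the message: two domains after both maps, one domain with a refcount of two after the share, and no domains after the second close. The model says nothing about DMA that is in flight during the switch, nor about hotplug.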
From: Michael S. Tsirkin on 2 Jun 2010 06:20

On Wed, Jun 02, 2010 at 12:04:04PM +0200, Joerg Roedel wrote:
> On Wed, Jun 02, 2010 at 12:49:28PM +0300, Avi Kivity wrote:
> > On 06/02/2010 12:45 PM, Joerg Roedel wrote:
> >> IOMMU mapped memory can not be swapped out because we can't do demand
> >> paging on io-page-faults with current devices. We have to pin _all_
> >> userspace memory that is mapped into an IOMMU domain.
> >
> > vhost doesn't pin memory.
> >
> > What I proposed is to describe the memory map using an object (fd), and
> > pass it around to clients that use it: kvm, vhost, vfio. That way you
> > maintain the memory map in a central location and broadcast changes to
> > clients. Only a vfio client would result in memory being pinned.
>
> Ah ok, so it's only about the database which keeps the mapping
> information.
>
> > It can still work, but the interface needs to be extended to include
> > dirty bitmap logging.
>
> That's hard to do. I am not sure about VT-d but the AMD IOMMU has no
> dirty-bits in the page-table. And without demand-paging we can't really
> tell what pages a device has written to. The only choice is to mark all
> IOMMU-mapped pages dirty as long as they are mapped.
>
> Joerg

Or mark them dirty when they are unmapped.

--
MST
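A minimal sketch of the "mark dirty on unmap" idea, assuming a simple per-page bitmap as the dirty log. Everything here is hypothetical and chosen for illustration: `NPAGES`, `struct dirty_log`, `mark_dirty`, `test_dirty` and `model_unmap_range` are not kernel interfaces. Because the hardware gives no per-page write information, the unmap path conservatively flags every page in the range.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define NPAGES 64  /* hypothetical size of the tracked region, in pages */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical dirty log: one bit per page of the tracked region. */
struct dirty_log {
    unsigned long bits[(NPAGES + BITS_PER_WORD - 1) / BITS_PER_WORD];
};

void mark_dirty(struct dirty_log *log, size_t page)
{
    log->bits[page / BITS_PER_WORD] |= 1UL << (page % BITS_PER_WORD);
}

int test_dirty(const struct dirty_log *log, size_t page)
{
    return (int)((log->bits[page / BITS_PER_WORD] >> (page % BITS_PER_WORD)) & 1UL);
}

/*
 * Models removing an IOMMU mapping for pages [first, first + count).
 * The device may have written to any of them, so all are marked dirty.
 * Returns -1 without touching the log if the range is out of bounds.
 */
int model_unmap_range(struct dirty_log *log, size_t first, size_t count)
{
    if (first > NPAGES || count > NPAGES - first)
        return -1;
    for (size_t i = 0; i < count; i++)
        mark_dirty(log, first + i);
    return 0;
}
```

One consequence of this scheme, as far as the sketch goes: a page that stays mapped never shows up in the log, so a consumer of the log would presumably still have to treat every currently mapped page as dirty when it takes its final snapshot.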
From: Michael S. Tsirkin on 2 Jun 2010 06:30

On Wed, Jun 02, 2010 at 11:45:27AM +0200, Joerg Roedel wrote:
> On Tue, Jun 01, 2010 at 03:41:55PM +0300, Avi Kivity wrote:
> > On 06/01/2010 01:46 PM, Michael S. Tsirkin wrote:
>
> >> Main difference is that vhost works fine with unlocked
> >> memory, paging it in on demand. iommu needs to unmap
> >> memory when it is swapped out or relocated.
> >>
> > So you'd just take the memory map and not pin anything. This way you
> > can reuse the memory map.
> >
> > But no, it doesn't handle the dirty bitmap, so no go.
>
> IOMMU mapped memory can not be swapped out because we can't do demand
> paging on io-page-faults with current devices. We have to pin _all_
> userspace memory that is mapped into an IOMMU domain.
>
> Joerg

One of the issues I see with the current patch is that it uses the
mlock rlimit to do this pinning. So this wastes the rlimit for an app
that did mlockall already, and also consumes this resource
transparently, so an app might call mlock on a small buffer and be
surprised that it fails. Using mmu notifiers might help?

--
MST
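The surprise described above can be shown with a small accounting model. This is a sketch under stated assumptions, not the patch's code: `struct memlock_account` and `charge_locked` are hypothetical names, and `limit` merely stands in for the process's RLIMIT_MEMLOCK soft limit. The only real-API fact relied on is that mlock() fails with ENOMEM when a request would exceed that limit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-process accounting of locked memory. */
struct memlock_account {
    size_t limit;   /* stands in for RLIMIT_MEMLOCK, in bytes */
    size_t locked;  /* bytes charged so far; never exceeds limit */
};

/*
 * Charge 'bytes' against the limit. Returns 0 on success, or -1 where
 * a real mlock() would fail with ENOMEM. Pinning done on the app's
 * behalf and the app's own mlock() calls draw on the same budget.
 */
int charge_locked(struct memlock_account *acct, size_t bytes)
{
    if (bytes > acct->limit - acct->locked)
        return -1;
    acct->locked += bytes;
    return 0;
}
```

With a 64 KiB limit, a driver that transparently charges 60 KiB for DMA pinning leaves only 4 KiB, so the application's later attempt to lock an 8 KiB buffer is refused even though the application itself never locked anything.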