From: Michael S. Tsirkin on
On Mon, Sep 14, 2009 at 01:57:06PM +0800, Xin, Xiaohui wrote:
> >The irqfd/ioeventfd patches are part of Avi's kvm.git tree:
> >git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm.git
> >
> >I expect them to be merged by 2.6.32-rc1 - right, Avi?
>
> Michael,
>
> I think I have the kernel patches for kvm_irqfd and kvm_ioeventfd, but I am missing the qemu-side patches for irqfd and ioeventfd.
>
> I hit the following compile errors when building the virtio-pci.c file in qemu-kvm:
>
> /root/work/vmdq/vhost/qemu-kvm/hw/virtio-pci.c:384: error: `KVM_IRQFD` undeclared (first use in this function)
> /root/work/vmdq/vhost/qemu-kvm/hw/virtio-pci.c:400: error: `KVM_IOEVENTFD` undeclared (first use in this function)
>
> Which qemu tree or patch do you use for kvm_irqfd and kvm_ioeventfd?

I'm using the headers from the upstream kernel.
I'll send a patch for that.
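
Roughly, the qemu side boils down to two ioctls on the VM fd once those
definitions are available. A sketch for illustration only (not the actual
virtio-pci.c code; capability checks and error handling omitted):

#include <stdint.h>
#include <sys/eventfd.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Wire an eventfd to a guest interrupt: writes to the returned fd
 * inject interrupt 'gsi' into the guest. */
static int wire_irqfd(int vmfd, uint32_t gsi)
{
	int efd = eventfd(0, 0);
	struct kvm_irqfd irqfd = { .fd = efd, .gsi = gsi };

	ioctl(vmfd, KVM_IRQFD, &irqfd);
	return efd;
}

/* Wire a PIO doorbell to an eventfd: guest writes to 'addr' signal the
 * returned fd in the kernel, without a heavyweight exit to userspace.
 * This is what vhost-net consumes as its kick. */
static int wire_ioeventfd(int vmfd, uint64_t addr, uint32_t len)
{
	int efd = eventfd(0, 0);
	struct kvm_ioeventfd ioev = {
		.addr  = addr,
		.len   = len,
		.fd    = efd,
		.flags = KVM_IOEVENTFD_FLAG_PIO,
	};

	ioctl(vmfd, KVM_IOEVENTFD, &ioev);
	return efd;
}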

> Thanks
> Xiaohui
>
> -----Original Message-----
> From: Michael S. Tsirkin [mailto:mst(a)redhat.com]
> Sent: Sunday, September 13, 2009 1:46 PM
> To: Xin, Xiaohui
> Cc: Ira W. Snyder; netdev(a)vger.kernel.org; virtualization(a)lists.linux-foundation.org; kvm(a)vger.kernel.org; linux-kernel(a)vger.kernel.org; mingo(a)elte.hu; linux-mm(a)kvack.org; akpm(a)linux-foundation.org; hpa(a)zytor.com; gregory.haskins(a)gmail.com; Rusty Russell; s.hetze(a)linux-ag.com; avi(a)redhat.com
> Subject: Re: [PATCHv5 3/3] vhost_net: a kernel-level virtio server
>
> On Fri, Sep 11, 2009 at 11:17:33PM +0800, Xin, Xiaohui wrote:
> > Michael,
> > We are very interested in your patch and want to have a try with it.
> > I have collected your 3 patches on the kernel side and 4 patches on the qemu side.
> > The patches are listed here:
> >
> > PATCHv5-1-3-mm-export-use_mm-unuse_mm-to-modules.patch
> > PATCHv5-2-3-mm-reduce-atomic-use-on-use_mm-fast-path.patch
> > PATCHv5-3-3-vhost_net-a-kernel-level-virtio-server.patch
> >
> > PATCHv3-1-4-qemu-kvm-move-virtio-pci[1].o-to-near-pci.o.patch
> > PATCHv3-2-4-virtio-move-features-to-an-inline-function.patch
> > PATCHv3-3-4-qemu-kvm-vhost-net-implementation.patch
> > PATCHv3-4-4-qemu-kvm-add-compat-eventfd.patch
> >
> > I applied the kernel patches on v2.6.31-rc4 and the qemu patches on the latest kvm qemu.
> > But it seems some additional patches are needed, at least the irqfd and ioeventfd patches
> > on current qemu. I cannot create a kvm guest with "-net nic,model=virtio,vhost=vethX".
> >
> > Could you kindly advise us of the exact list of patches needed to make it work?
> > Thanks a lot. :-)
> >
> > Thanks
> > Xiaohui
>
>
> The irqfd/ioeventfd patches are part of Avi's kvm.git tree:
> git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm.git
>
> I expect them to be merged by 2.6.32-rc1 - right, Avi?
>
> --
> MST
From: Gregory Haskins on
Michael S. Tsirkin wrote:
> On Fri, Sep 11, 2009 at 12:00:21PM -0400, Gregory Haskins wrote:
>> FWIW: VBUS handles this situation via the "memctx" abstraction. IOW,
>> the memory is not assumed to be a userspace address. Rather, it is a
>> memctx-specific address, which can be userspace, or any other type
>> (including hardware, dma-engine, etc). As long as the memctx knows how
>> to translate it, it will work.
>
> How would permissions be handled?

Same as anything else, really. Read on for details.

> it's easy to allow an app to pass in virtual addresses in its own address space.

Agreed, and this is what I do.

The guest always passes its own physical addresses (using things like
__pa() in Linux). The address passed is memctx-specific, but generally
would fall into the category of "virtual addresses" from the host's
perspective.

For a KVM/AlacrityVM guest example, the addresses are GPAs, accessed
internally to the context via a gfn_to_hva conversion (you can see this
occurring in the citation links I sent).

For Ira's example, the addresses would represent physical addresses on
the PCI boards, and would follow whatever rules are relevant for
converting a "GPA" to a host-accessible address (even if indirectly, via
a DMA controller).
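
To illustrate the shape of the abstraction, here is a purely hypothetical
sketch of what a memctx ops table could look like (the names are invented
for illustration; this is not the actual vbus interface):

#include <stddef.h>

struct memctx;

/*
 * Hypothetical sketch: the address carried in the ring is opaque to the
 * host core; only the context knows how to reach the bytes behind it.
 */
struct memctx_ops {
	/* copy 'len' bytes from the context-specific address 'addr'
	 * (a GPA for a KVM guest, a bus address for a PCI board, ...)
	 * into host memory at 'dst' */
	int (*copy_from)(struct memctx *ctx, void *dst,
			 unsigned long addr, size_t len);
	int (*copy_to)(struct memctx *ctx, unsigned long addr,
		       const void *src, size_t len);
	void (*release)(struct memctx *ctx);
};

struct memctx {
	const struct memctx_ops *ops;
};

/*
 * A KVM-backed context would implement copy_from() roughly as
 * gfn_to_hva() plus copy_from_user(); an Ira-style PCI context would
 * instead program a DMA engine or read an ioremap()ed window.
 */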


> But we can't let the guest specify physical addresses.

Agreed. Neither your proposal nor mine operate this way afaict.

HTH

Kind Regards,
-Greg

From: Michael S. Tsirkin on
On Mon, Sep 14, 2009 at 12:08:55PM -0400, Gregory Haskins wrote:
> For Ira's example, the addresses would represent physical addresses on
> the PCI boards, and would follow whatever rules are relevant for
> converting a "GPA" to a host-accessible address (even if indirectly, via
> a DMA controller).

I don't think limiting addresses to PCI physical addresses will work
well. From what I remember, Ira's x86 cannot initiate burst
transactions on PCI, and it's the ppc that initiates all DMA.

>
> > But we can't let the guest specify physical addresses.
>
> Agreed. Neither your proposal nor mine operate this way afaict.

But this seems to be what Ira needs.

> HTH
>
> Kind Regards,
> -Greg
>


From: Michael S. Tsirkin on
On Mon, Sep 14, 2009 at 12:08:55PM -0400, Gregory Haskins wrote:
> Michael S. Tsirkin wrote:
> > On Fri, Sep 11, 2009 at 12:00:21PM -0400, Gregory Haskins wrote:
> >> FWIW: VBUS handles this situation via the "memctx" abstraction. IOW,
> >> the memory is not assumed to be a userspace address. Rather, it is a
> >> memctx-specific address, which can be userspace, or any other type
> >> (including hardware, dma-engine, etc). As long as the memctx knows how
> >> to translate it, it will work.
> >
> > How would permissions be handled?
>
> Same as anything else, really. Read on for details.
>
> > it's easy to allow an app to pass in virtual addresses in its own address space.
>
> Agreed, and this is what I do.
>
> The guest always passes its own physical addresses (using things like
> __pa() in Linux). The address passed is memctx-specific, but generally
> would fall into the category of "virtual addresses" from the host's
> perspective.
>
> For a KVM/AlacrityVM guest example, the addresses are GPAs, accessed
> internally to the context via a gfn_to_hva conversion (you can see this
> occurring in the citation links I sent).
>
> For Ira's example, the addresses would represent physical addresses on
> the PCI boards, and would follow whatever rules are relevant for
> converting a "GPA" to a host-accessible address (even if indirectly, via
> a DMA controller).

So vbus can let an application access either its own virtual memory or
physical memory on a PCI device. My question is, is any application
that's allowed to do the former also granted the right to do the latter?

> > But we can't let the guest specify physical addresses.
>
> Agreed. Neither your proposal nor mine operate this way afaict.
>
> HTH
>
> Kind Regards,
> -Greg
>


From: Gregory Haskins on
Michael S. Tsirkin wrote:
> On Mon, Sep 14, 2009 at 12:08:55PM -0400, Gregory Haskins wrote:
>> For Ira's example, the addresses would represent physical addresses on
>> the PCI boards, and would follow whatever rules are relevant for
>> converting a "GPA" to a host-accessible address (even if indirectly, via
>> a DMA controller).
>
> I don't think limiting addresses to PCI physical addresses will work
> well.

The only "limit" is imposed by the memctx. If a given context needs to
meet certain requirements beyond PCI physical addresses, it would
presumably be designed that way.


> From what I remember, Ira's x86 cannot initiate burst
> transactions on PCI, and it's the ppc that initiates all DMA.

The only requirement is that the "guest" "owns" the memory. IOW: As
with virtio/vhost, the guest can access the pointers in the ring
directly but the host must pass through a translation function.

Your translation is direct: you use a slots/hva scheme. My translation
is abstracted, which means it can support slots/hva (such as in
alacrityvm) or some other scheme as long as the general model of "guest
owned" holds true.

>
>>> But we can't let the guest specify physical addresses.
>> Agreed. Neither your proposal nor mine operate this way afaict.
>
> But this seems to be what Ira needs.

So what he could do, then, is implement the memctx to integrate with the
ppc-side DMA controller. E.g. "translation" in his box means a protocol
from the x86 to the ppc to initiate the DMA cycle. This could be
exposed as a DMA facility in the register file of the ppc boards, for
instance.
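
Purely as an illustration of that idea (all names and the register layout
below are invented; nothing about Ira's actual hardware is implied), such a
backend could look something like:

#include <linux/io.h>
#include <linux/string.h>
#include <linux/types.h>

/* Hypothetical register window exposed by the ppc board for host-initiated
 * DMA requests; layout and names are invented for illustration. */
struct ppc_dma_regs {
	u64 src;	/* address on the ppc board (the "guest" address) */
	u64 dst;	/* bus address of a host-side bounce buffer */
	u32 len;
	u32 start;	/* write 1 to kick the transfer */
};

/* Hypothetical memctx backend: "translation" here is a request to the
 * ppc-side DMA engine rather than a pointer dereference on the host. */
static int ppc_memctx_copy_from(struct memctx *ctx, void *dst,
				unsigned long addr, size_t len)
{
	struct ppc_dma_regs __iomem *regs = ctx_regs(ctx);	/* invented */
	dma_addr_t bounce = ctx_bounce_bus_addr(ctx);		/* invented */

	writeq(addr, &regs->src);
	writeq(bounce, &regs->dst);
	writel(len, &regs->len);
	writel(1, &regs->start);

	ctx_wait_dma_done(ctx);					/* invented */
	memcpy(dst, ctx_bounce_cpu_ptr(ctx), len);		/* invented */
	return 0;
}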

To reiterate, as long as the model is such that the ppc boards are
considered the "owner" (direct access, no translation needed) I believe
it will work. If the pointers are expected to be owned by the host,
then my model doesn't work well either.

Kind Regards,
-Greg