From: Arnd Bergmann on
On Wednesday 10 February 2010, Xin Xiaohui wrote:
> The idea is simple: just pin the guest VM user space and then
> let the host NIC driver have the chance to DMA to it directly.
> The patches are based on the vhost-net backend driver. We add a device
> which provides proto_ops (sendmsg/recvmsg) to vhost-net to
> send/recv directly to/from the NIC driver. A KVM guest that uses the
> vhost-net backend may bind any ethX interface on the host side to
> get copyless data transfer through the guest virtio-net frontend.
>
> We also provide multiple submits and asynchronous notification to
> vhost-net.

This does a lot of things that I had planned for macvtap. It's
great to hear that you have made this much progress.

However, I'd hope that we could combine this with the macvtap driver,
which would give us zero-copy transfer capability both with and
without vhost, as well as (tx at least) when using multiple guests
on a macvlan setup.

For transmit, it should be fairly straightforward to hook up
your zero-copy method and the vhost-net interface into the
macvtap driver.

You have simplified the receive path significantly by assuming
that the entire netdev can receive into a single guest, right?
I'm assuming that the idea is to allow VMDq adapters to simply
show up as separate adapters and have the driver handle this
in a hardware specific way.
My plan for this was to instead move support for VMDq into the
macvlan driver so we can transparently use VMDq on hardware where
available, including zero-copy receives, but fall back to software
operation on non-VMDq hardware.

Arnd
From: Xin, Xiaohui on
> On Wednesday 10 February 2010, Xin Xiaohui wrote:
> > The idea is simple: just pin the guest VM user space and then
> > let the host NIC driver have the chance to DMA to it directly.
> > The patches are based on the vhost-net backend driver. We add a device
> > which provides proto_ops (sendmsg/recvmsg) to vhost-net to
> > send/recv directly to/from the NIC driver. A KVM guest that uses the
> > vhost-net backend may bind any ethX interface on the host side to
> > get copyless data transfer through the guest virtio-net frontend.
> >
> > We also provide multiple submits and asynchronous notification to
> > vhost-net.

>This does a lot of things that I had planned for macvtap. It's
>great to hear that you have made this much progress.
>
>However, I'd hope that we could combine this with the macvtap driver,
>which would give us zero-copy transfer capability both with and
>without vhost, as well as (tx at least) when using multiple guests
>on a macvlan setup.

You mean the zero-copy can work with the macvtap driver without vhost?
Could you give me some detailed info about your macvtap driver and the
relationship between vhost and macvtap, so that I can get a clear picture?

>For transmit, it should be fairly straightforward to hook up
>your zero-copy method and the vhost-net interface into the
>macvtap driver.
>
>You have simplified the receive path significantly by assuming
>that the entire netdev can receive into a single guest, right?

Yes.

>I'm assuming that the idea is to allow VMDq adapters to simply
>show up as separate adapters and have the driver handle this
>in a hardware specific way.

Does the VMDq driver do so now?

>My plan for this was to instead move support for VMDq into the
>macvlan driver so we can transparently use VMDq on hardware where
>available, including zero-copy receives, but fall back to software
>operation on non-VMDq hardware.

Arnd
From: Xin, Xiaohui on
I will be on vacation during 2/13~2/20, so replies to your comments may be
very slow or not come at all. But please don't hesitate to comment more, and
I will address them after the vacation. :-)

Thanks
Xiaohui
-----Original Message-----
From: kvm-owner@vger.kernel.org [mailto:kvm-owner@vger.kernel.org] On Behalf Of Xin Xiaohui
Sent: Wednesday, February 10, 2010 7:49 PM
To: netdev@vger.kernel.org; kvm@vger.kernel.org; linux-kernel@vger.kernel.org; mingo@elte.hu; mst@redhat.com; jdike@c2.user-mode-linux.org
Subject: [PATCH 0/3] Provide a zero-copy method on KVM virtio-net.

The idea is simple: just pin the guest VM user space and then
let the host NIC driver have the chance to DMA to it directly.
The patches are based on the vhost-net backend driver. We add a device
which provides proto_ops (sendmsg/recvmsg) to vhost-net to
send/recv directly to/from the NIC driver. A KVM guest that uses the
vhost-net backend may bind any ethX interface on the host side to
get copyless data transfer through the guest virtio-net frontend.

We also provide multiple submits and asynchronous notification to
vhost-net.
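
To illustrate the idea, here is a rough sketch of what the sendmsg side
of such a proto_ops could look like (a sketch only, not the actual patch
code; the function name and the mp_queue_tx_pages() helper are made up
for illustration):

#include <linux/aio.h>
#include <linux/net.h>
#include <linux/socket.h>
#include <linux/skbuff.h>
#include <linux/errno.h>
#include <linux/mm.h>

/* Hypothetical helper, assumed to live elsewhere in the mp device:
 * attach the pinned pages to an skb as frags and queue it on the bound
 * NIC; the pages are released from the skb destructor once the DMA has
 * completed. */
extern int mp_queue_tx_pages(struct socket *sock, struct page **pages,
                             int npages, size_t len, int offset);

static int mp_sendmsg(struct kiocb *iocb, struct socket *sock,
                      struct msghdr *m, size_t total_len)
{
        /* For simplicity, assume the data sits in a single iovec entry. */
        unsigned long base = (unsigned long)m->msg_iov[0].iov_base;
        struct page *pages[MAX_SKB_FRAGS];
        int npages = (offset_in_page(base) + total_len + PAGE_SIZE - 1)
                        >> PAGE_SHIFT;
        int got;

        if (npages > MAX_SKB_FRAGS)
                return -EMSGSIZE;

        /* Pin the guest pages; write=0, the NIC only reads them on tx. */
        got = get_user_pages_fast(base & PAGE_MASK, npages, 0, pages);
        if (got < npages) {
                while (got > 0)
                        put_page(pages[--got]);
                return -EFAULT;
        }

        return mp_queue_tx_pages(sock, pages, npages, total_len,
                                 offset_in_page(base));
}

The recvmsg side works the same way in reverse: pin empty guest buffers,
post them to the NIC driver, and notify vhost-net asynchronously once
frames have been DMAed into them.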

Our goal is to improve the bandwidth and reduce the CPU usage.
Exact performance data will be provided later. But in a simple
test with netperf, we found that both bandwidth and CPU % went up,
and the bandwidth increase is much larger than the CPU % increase.

What we have not done yet:
GRO support
Performance tuning
From: Arnd Bergmann on
On Thursday 11 February 2010, Xin, Xiaohui wrote:
> >This does a lot of things that I had planned for macvtap. It's
> >great to hear that you have made this much progress.
> >
> >However, I'd hope that we could combine this with the macvtap driver,
> >which would give us zero-copy transfer capability both with and
> >without vhost, as well as (tx at least) when using multiple guests
> >on a macvlan setup.
>
> You mean the zero-copy can work with the macvtap driver without vhost?
> Could you give me some detailed info about your macvtap driver and the
> relationship between vhost and macvtap, so that I can get a clear picture?

macvtap provides a user interface that is largely compatible with
the tun/tap driver, and can be used in place of that from qemu.
Vhost-net currently interfaces with tun/tap, but not yet with macvtap,
which is easy enough to add and already on my list.

The underlying code is macvlan, which is a driver that virtualizes
network adapters in software, giving you multiple net_device instances
for a real NIC, each of them with their own MAC address.

In order to do zero-copy transmit with macvtap, the idea is to
add a nonblocking version of the aio_write() function that works
a lot like your transmit function.
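
A rough sketch of what such a non-blocking aio_write() could look like
(this is not existing macvtap code; macvtap_build_zerocopy_skb() is a
hypothetical helper that pins the user pages, attaches them to the skb
as frags without copying, sets skb->dev to the macvlan net_device, and
installs a destructor that calls aio_complete() on the iocb once the
lower device is done with the pages):

#include <linux/aio.h>
#include <linux/uio.h>
#include <linux/err.h>
#include <linux/skbuff.h>
#include <linux/netdevice.h>

/* Hypothetical helper, see above. */
extern struct sk_buff *macvtap_build_zerocopy_skb(struct kiocb *iocb,
                                                  const struct iovec *iv,
                                                  unsigned long count);

static ssize_t macvtap_aio_write_zerocopy(struct kiocb *iocb,
                                          const struct iovec *iv,
                                          unsigned long count, loff_t pos)
{
        struct sk_buff *skb;

        skb = macvtap_build_zerocopy_skb(iocb, iv, count);
        if (IS_ERR(skb))
                return PTR_ERR(skb);

        /* No data copy: the skb frags point at the pinned user pages. */
        dev_queue_xmit(skb);

        /* The write completes asynchronously: aio_complete() is called
         * from the skb destructor once the device is done with the pages. */
        return -EIOCBQUEUED;
}

Returning -EIOCBQUEUED is what makes the call non-blocking: the caller
gets the completion later through the AIO interface, once the pinned
pages can be released.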

For receive, the hardware does not currently know which guest
is supposed to get any frame coming in from the outside. Adding
zero-copy receive requires interaction with the device driver
and hardware capabilities to separate traffic by inbound MAC
address into separate buffers per VM.
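
Just to give a flavour of the kind of interface that might be needed
(purely hypothetical, not an existing kernel interface), the NIC driver
could be handed a per-guest rx "port" that supplies pinned guest pages
for its receive ring:

#include <linux/netdevice.h>

/*
 * Purely hypothetical sketch: an upper layer (macvlan here, or the mp
 * device in these patches) registers a per-guest rx port with the NIC
 * driver.  The driver fills its rx ring from get_rx_page() so that
 * frames matching that guest's MAC address are DMAed straight into
 * pinned guest memory, and it reports completion through rx_done().
 */
struct guest_rx_port {
        struct net_device *dev;   /* the virtual (macvlan) device */
        void *ctx;                /* owner private data */

        /* hand the NIC driver a pinned guest page for its rx ring */
        struct page *(*get_rx_page)(struct guest_rx_port *port, int *offset);

        /* a frame of 'len' bytes has been DMAed into 'page' */
        void (*rx_done)(struct guest_rx_port *port, struct page *page, int len);
};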

> >I'm assuming that the idea is to allow VMDq adapters to simply
> >show up as separate adapters and have the driver handle this
> >in a hardware specific way.
>
> Does the VMDq driver do so now?

I don't think anyone has published a VMDq capable driver so far.
I was just assuming that you were working on one.

Arnd
From: Michael S. Tsirkin on
On Sat, Mar 06, 2010 at 05:38:35PM +0800, xiaohui.xin@intel.com wrote:
> The idea is simple: just pin the guest VM user space and then
> let the host NIC driver have the chance to DMA to it directly.
> The patches are based on the vhost-net backend driver. We add a device
> which provides proto_ops (sendmsg/recvmsg) to vhost-net to
> send/recv directly to/from the NIC driver. A KVM guest that uses the
> vhost-net backend may bind any ethX interface on the host side to
> get copyless data transfer through the guest virtio-net frontend.
>
> We also provide multiple submits and asynchronous notification to
> vhost-net.
>
> Our goal is to improve the bandwidth and reduce the CPU usage.
> Exact performance data will be provided later. But in a simple
> test with netperf, we found that both bandwidth and CPU % went up,
> and the bandwidth increase is much larger than the CPU % increase.
>
> What we have not done yet:
> packet split support
> GRO support
> Performance tuning

Am I right to say that the NIC driver needs changes for these patches
to work? If so, please publish the NIC driver patches as well.

> What we have done in v1:
> polish the RCU usage
> deal with write logging in asynchronous mode in vhost
> add a notifier block for the mp device
> rename page_ctor to mp_port in netdevice.h to make it look generic
> add mp_dev_change_flags() for the mp device to change the NIC state
> add CONFIG_VHOST_MPASSTHRU to limit the usage when the module is not loaded
> a small fix for a missing dev_put() on failure
> use a dynamic minor number instead of a static one
> add a __KERNEL__ guard to mp_get_sock()
>
> Performance:
> Using netperf with GSO/TSO disabled, a 10G NIC, and packet split mode
> disabled, compared with the vhost raw socket case:
>
> bandwidth goes from 1.1Gbps to 1.7Gbps
> CPU % from 120%-140% to 140%-160%

That's pretty low for a 10Gb NIC. Are you hitting some other bottleneck,
like high interrupt rate? Also, GSO support and performance tuning
for raw are incomplete. Try comparing with e.g. tap with GSO.

--
MST