From: Xin, Xiaohui on
Herbert,
That's why I sent you the patch for the guest virtio-net driver. It reserves 512 bytes in each page, so there is always space to copy into, which avoids the problem of the backend running out of memory.
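
For illustration only, the per-page reserve could be laid out roughly as below. The 512-byte figure is the one mentioned above; the 4096-byte page size, the placement of the reserve at the start of the page, and every name in the sketch are assumptions, not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

#define GUEST_PAGE_SIZE   4096u  /* assumed page size */
#define GUEST_RESERVE_LEN  512u  /* per-page reserve mentioned above */

/* Hypothetical description of one guest page posted to the backend. */
struct guest_page_layout {
    size_t reserve_off;  /* start of the area kept back for copies */
    size_t reserve_len;  /* length of that area */
    size_t dma_off;      /* start of the area the NIC may DMA into */
    size_t dma_len;      /* length of the DMA area */
};

/* Split a page into a copy reserve followed by the DMA area. */
static struct guest_page_layout guest_page_split(void)
{
    struct guest_page_layout l;

    l.reserve_off = 0;
    l.reserve_len = GUEST_RESERVE_LEN;
    l.dma_off = GUEST_RESERVE_LEN;
    l.dma_len = GUEST_PAGE_SIZE - GUEST_RESERVE_LEN;
    return l;
}

/* Return nonzero if a header of hdr_len bytes fits in the reserve. */
static int guest_reserve_fits(const struct guest_page_layout *l, size_t hdr_len)
{
    assert(l != NULL);
    return hdr_len <= l->reserve_len;
}
```

With these assumed sizes, 3584 bytes of each page remain available to the NIC, and any header of up to 512 bytes can be copied without taking another guest buffer.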

Thanks
Xiaohui

>-----Original Message-----
>From: Herbert Xu [mailto:herbert(a)gondor.apana.org.au]
>Sent: Thursday, June 24, 2010 6:09 PM
>To: Dong, Eddie
>Cc: Xin, Xiaohui; Stephen Hemminger; netdev(a)vger.kernel.org; kvm(a)vger.kernel.org;
>linux-kernel(a)vger.kernel.org; mst(a)redhat.com; mingo(a)elte.hu; davem(a)davemloft.net;
>jdike(a)linux.intel.com
>Subject: Re: [RFC PATCH v7 01/19] Add a new structure for skb buffer from external.
>
>On Wed, Jun 23, 2010 at 06:05:41PM +0800, Dong, Eddie wrote:
>>
>> I mean once the frontend side driver posts the buffers to the backend
>> driver, the backend driver will "immediately" use those buffers to
>> compose skb or gro_frags and post them to the assigned host NIC
>> driver as receive buffers. In that case, if the backend driver
>> receives a packet from the NIC that requires a copy, it may be
>> unable to find an additional free guest buffer because all of them are
>> already used by the NIC driver. We have to reserve some guest
>> buffers for the possible copy even if the buffer address is not
>> identified by original skb :(
>
>OK I see what you mean. Can you tell me how does Xiaohui's
>previous patch-set deal with this problem?
>
>Thanks,
>--
>Visit Openswan at http://www.openswan.org/
>Email: Herbert Xu 许志壬 <herbert(a)gondor.apana.org.au>
>Home Page: http://gondor.apana.org.au/~herbert/
>PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Michael S. Tsirkin on
On Fri, Jun 25, 2010 at 09:03:46AM +0800, Dong, Eddie wrote:
> Herbert Xu wrote:
> > On Wed, Jun 23, 2010 at 06:05:41PM +0800, Dong, Eddie wrote:
> >>
> >> I mean once the frontend side driver posts the buffers to the backend
> >> driver, the backend driver will "immediately" use those buffers to
> >> compose skb or gro_frags and post them to the assigned host NIC
> >> driver as receive buffers. In that case, if the backend driver
> >> receives a packet from the NIC that requires a copy, it may be
> >> unable to find an additional free guest buffer because all of them are
> >> already used by the NIC driver. We have to reserve some guest
> >> buffers for the possible copy even if the buffer address is not
> >> identified by original skb :(
> >
> > OK I see what you mean. Can you tell me how does Xiaohui's
> > previous patch-set deal with this problem?
> >
> > Thanks,
>
> In current patch, each SKB for the assigned device (SRIOV VF or NIC or a complete queue pair) uses the buffer from guest, so it eliminates copy completely in software and requires hardware to do so. If we can have an additional place to store the buffer per skb (may cause copy later on), we can do copy later on or re-post the buffer to assigned NIC driver later on. But that may not be very clean either :(
> BTW, some hardware may require a certain level of packet copy, such as for broadcast packets in very old VMDq devices, which is not addressed in Xiaohui's previous patch yet. We may address this by implementing an additional virtqueue between guest and host for the slow path (broadcast packets only here), with additional complexity in the FE/BE driver.
>
> Thx, Eddie

The guest posts a large number of buffers to the host.
Host can use them any way it wants to, and in any order,
for example reserve half the buffers for the copy.

This might waste some memory if buffers are used
only partially, but let's worry about this later.
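
A minimal sketch of that accounting, under stated assumptions: the "half" split is just the example given above, and the structure and function names are hypothetical rather than taken from vhost or the patch-set.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical accounting for buffers the guest has posted. */
struct buf_pool {
    size_t posted;         /* total buffers posted by the guest */
    size_t given_to_nic;   /* handed to the NIC as receive buffers */
    size_t used_for_copy;  /* consumed by the copy path */
};

/* Buffers the host holds back for copies: half, as in the example above. */
static size_t pool_copy_reserve(const struct buf_pool *p)
{
    assert(p != NULL);
    return p->posted / 2;
}

/* Hand one buffer to the NIC; returns 0 once only the reserve is left. */
static int pool_take_for_nic(struct buf_pool *p)
{
    size_t nic_limit = p->posted - pool_copy_reserve(p);

    if (p->given_to_nic >= nic_limit)
        return 0;
    p->given_to_nic++;
    return 1;
}

/* Take one buffer from the reserve for a copy; returns 0 if it is empty. */
static int pool_take_for_copy(struct buf_pool *p)
{
    if (p->used_for_copy >= pool_copy_reserve(p))
        return 0;
    p->used_for_copy++;
    return 1;
}
```

The point of the split is that the NIC path can never drain the pool completely, so a packet that needs copying always finds a guest buffer unless the reserve itself is exhausted.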

--
MST
From: Herbert Xu on
On Fri, Jun 25, 2010 at 09:03:46AM +0800, Dong, Eddie wrote:
>
> In current patch, each SKB for the assigned device (SRIOV VF or NIC or a complete queue pair) uses the buffer from guest, so it eliminates copy completely in software and requires hardware to do so. If we can have an additional place to store the buffer per skb (may cause copy later on), we can do copy later on or re-post the buffer to assigned NIC driver later on. But that may not be very clean either :(

OK, if I understand you correctly then I don't think you have a
problem. With your current patch-set you have exactly the same
situation when the skb->data is reallocated as a kernel buffer.

This is OK because as you correctly argue, it is a rare situation.

With my proposal you will need to get this extra external buffer
in even fewer cases, because you'd only need to do it if the skb
head grows, which only happens if it becomes encapsulated.

So let me explain it in a bit more detail:

Our packet starts out as a purely non-linear skb, i.e., skb->head
contains nothing and all the page frags come from the guest.

During host processing we may pull data into skb->head but the
first frag will remain unless we pull all of it. If we did do
that then you would have a free external buffer anyway.

Now in the common case the header may be modified or pulled, but
it very rarely grows. So you can just copy the header back into
the first frag just before we give it to the guest.

Only in the case where the packet header grows (e.g., encapsulation)
would you need to get an extra external buffer.
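
The copy-back step can be modelled in plain C as follows. This is a userspace sketch under assumptions, not kernel code: `struct model_frag` and `model_put_header_back` are hypothetical names, and the `pulled` field stands in for however the host would track the bytes it moved from the first frag into skb->head.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of the first page frag of a guest-backed skb. */
struct model_frag {
    unsigned char *page;  /* guest page backing the frag */
    size_t offset;        /* where the frag's data currently starts */
    size_t size;          /* bytes of data in the frag */
    size_t pulled;        /* bytes vacated in front of offset by pulls */
};

/*
 * Copy the (possibly modified) header back in front of the frag data.
 * Returns 1 on success, or 0 if the header has grown beyond the space
 * vacated by the pull, in which case an extra external buffer is needed.
 */
static int model_put_header_back(struct model_frag *f,
                                 const unsigned char *hdr, size_t hdr_len)
{
    assert(f != NULL && hdr != NULL);
    assert(f->pulled <= f->offset);

    if (hdr_len > f->pulled)
        return 0;  /* header grew, e.g. through encapsulation */

    f->offset -= hdr_len;
    f->size += hdr_len;
    memcpy(f->page + f->offset, hdr, hdr_len);
    f->pulled = 0;  /* treat the vacated space as consumed */
    return 1;
}
```

When the header is unchanged or shorter, the frag simply grows backwards over the space the pull left behind; only the growing case returns 0 and falls through to the extra-buffer path.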

Cheers,
--
Email: Herbert Xu <herbert(a)gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
From: Xin, Xiaohui on
>-----Original Message-----
>From: Herbert Xu [mailto:herbert(a)gondor.apana.org.au]
>Sent: Sunday, June 27, 2010 2:15 PM
>To: Dong, Eddie
>Cc: Xin, Xiaohui; Stephen Hemminger; netdev(a)vger.kernel.org; kvm(a)vger.kernel.org;
>linux-kernel(a)vger.kernel.org; mst(a)redhat.com; mingo(a)elte.hu; davem(a)davemloft.net;
>jdike(a)linux.intel.com
>Subject: Re: [RFC PATCH v7 01/19] Add a new structure for skb buffer from external.
>
>On Fri, Jun 25, 2010 at 09:03:46AM +0800, Dong, Eddie wrote:
>>
>> In current patch, each SKB for the assigned device (SRIOV VF or NIC or a complete
>> queue pair) uses the buffer from guest, so it eliminates copy completely in software and
>> requires hardware to do so. If we can have an additional place to store the buffer per skb (may
>> cause copy later on), we can do copy later on or re-post the buffer to assigned NIC driver
>> later on. But that may not be very clean either :(
>
>OK, if I understand you correctly then I don't think you have a
>problem. With your current patch-set you have exactly the same
>situation when the skb->data is reallocated as a kernel buffer.
>

When will skb->data be reallocated? Could you point me to the code path?

>This is OK because as you correctly argue, it is a rare situation.
>
>With my proposal you will need to get this extra external buffer
>in even fewer cases, because you'd only need to do it if the skb
>head grows, which only happens if it becomes encapsulated.
>So let me explain it in a bit more detail:
>
>Our packet starts out as a purely non-linear skb, i.e., skb->head
>contains nothing and all the page frags come from the guest.
>
>During host processing we may pull data into skb->head but the
>first frag will remain unless we pull all of it. If we did do
>that then you would have a free external buffer anyway.
>
>Now in the common case the header may be modified or pulled, but
>it very rarely grows. So you can just copy the header back into
>the first frag just before we give it to the guest.
>
Since the data is still there, recomputing the page offset and size is OK, right?

>Only in the case where the packet header grows (e.g., encapsulation)
>would you need to get an extra external buffer.
>
>Cheers,
>--
>Email: Herbert Xu <herbert(a)gondor.apana.org.au>
>Home Page: http://gondor.apana.org.au/~herbert/
>PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
From: Michael S. Tsirkin on
On Mon, Jun 28, 2010 at 05:56:07PM +0800, Xin, Xiaohui wrote:
> >-----Original Message-----
> >From: Herbert Xu [mailto:herbert(a)gondor.apana.org.au]
> >Sent: Sunday, June 27, 2010 2:15 PM
> >To: Dong, Eddie
> >Cc: Xin, Xiaohui; Stephen Hemminger; netdev(a)vger.kernel.org; kvm(a)vger.kernel.org;
> >linux-kernel(a)vger.kernel.org; mst(a)redhat.com; mingo(a)elte.hu; davem(a)davemloft.net;
> >jdike(a)linux.intel.com
> >Subject: Re: [RFC PATCH v7 01/19] Add a new structure for skb buffer from external.
> >
> >On Fri, Jun 25, 2010 at 09:03:46AM +0800, Dong, Eddie wrote:
> >>
> >> In current patch, each SKB for the assigned device (SRIOV VF or NIC or a complete
> >> queue pair) uses the buffer from guest, so it eliminates copy completely in software and
> >> requires hardware to do so. If we can have an additional place to store the buffer per skb (may
> >> cause copy later on), we can do copy later on or re-post the buffer to assigned NIC driver
> >> later on. But that may not be very clean either :(
> >
> >OK, if I understand you correctly then I don't think you have a
> >problem. With your current patch-set you have exactly the same
> >situation when the skb->data is reallocated as a kernel buffer.
> >
>
> When will skb->data be reallocated? Could you point me to the code path?
>
> >This is OK because as you correctly argue, it is a rare situation.
> >
> >With my proposal you will need to get this extra external buffer
> >in even fewer cases, because you'd only need to do it if the skb
> >head grows, which only happens if it becomes encapsulated.
> >So let me explain it in a bit more detail:
> >
> >Our packet starts out as a purely non-linear skb, i.e., skb->head
> >contains nothing and all the page frags come from the guest.
> >
> >During host processing we may pull data into skb->head but the
> >first frag will remain unless we pull all of it. If we did do
> >that then you would have a free external buffer anyway.
> >
> >Now in the common case the header may be modified or pulled, but
> >it very rarely grows. So you can just copy the header back into
> >the first frag just before we give it to the guest.
> >
> Since the data is still there, recomputing the page offset and size is OK, right?

Question: can devices use parts of the same page
in frags of different skbs (or for other purposes)? If they do,
we'll corrupt that memory if we try to stick the header there.

We have another option, reserve some buffers
posted by guest and use them if we need to copy
the header. This seems the most straightforward to me.
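
One possible way to express that choice, as a sketch only: write the header into the first frag when the page is known to have a single user, and otherwise fall back to a reserved guest buffer. The `users` counter, the enum, and the function name are hypothetical; how a real backend would learn whether a device shares a page is exactly the open question above.

```c
#include <assert.h>
#include <stddef.h>

/* Where the header ends up before the packet is given to the guest. */
enum hdr_dest {
    HDR_TO_FIRST_FRAG,    /* page is ours alone, safe to write into it */
    HDR_TO_RESERVED_BUF,  /* page may be shared, use a reserved buffer */
    HDR_NO_SPACE          /* page shared and the reserve is exhausted */
};

/* Hypothetical view of a guest page: how many parties reference it. */
struct model_page {
    unsigned int users;
};

/* Pick a destination for the header, preferring the first frag. */
static enum hdr_dest choose_hdr_dest(const struct model_page *pg,
                                     size_t reserved_free)
{
    assert(pg != NULL);

    if (pg->users == 1)
        return HDR_TO_FIRST_FRAG;
    if (reserved_free > 0)
        return HDR_TO_RESERVED_BUF;
    return HDR_NO_SPACE;
}
```

Always returning `HDR_TO_RESERVED_BUF` when a reserved buffer is free would match the simpler option of using reserved buffers for every header copy.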

> >Only in the case where the packet header grows (e.g., encapsulation)
> >would you need to get an extra external buffer.
> >
> >Cheers,
> >--
> >Email: Herbert Xu <herbert(a)gondor.apana.org.au>
> >Home Page: http://gondor.apana.org.au/~herbert/
> >PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt