From: Sridhar Samudrala on
On Thu, 2010-06-10 at 18:20 +0300, Michael S. Tsirkin wrote:
> virtio net will never try to overflow the TX ring, so the only reason
> add_buf may fail is out of memory. Thus, we can not stop the
> device until some request completes - there's no guarantee anything
> at all is outstanding.
>
> Make the error message clearer as well: error here does not
> indicate queue full.
>
> Signed-off-by: Michael S. Tsirkin <mst(a)redhat.com>
> ---
> drivers/net/virtio_net.c | 15 ++++++++-------
> 1 files changed, 8 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 85615a3..e48a06f 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -563,7 +563,6 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
> struct virtnet_info *vi = netdev_priv(dev);
> int capacity;
>
> -again:
> /* Free up any pending old buffers before queueing new ones. */
> free_old_xmit_skbs(vi);
>
> @@ -572,12 +571,14 @@ again:
>
> /* This can happen with OOM and indirect buffers. */
> if (unlikely(capacity < 0)) {
> - netif_stop_queue(dev);
> - dev_warn(&dev->dev, "Unexpected full queue\n");
> - if (unlikely(!virtqueue_enable_cb(vi->svq))) {
> - virtqueue_disable_cb(vi->svq);
> - netif_start_queue(dev);
> - goto again;
> + if (net_ratelimit()) {
> + if (likely(capacity == -ENOMEM))
> + dev_warn(&dev->dev,
> + "TX queue failure: out of memory\n");
> + else
> + dev_warn(&dev->dev,
> + "Unexpected TX queue failure: %d\n",
> + capacity);
> }
> return NETDEV_TX_BUSY;
> }

It is not clear to me how xmit_skb() can return -ENOMEM.
xmit_skb() calls virtqueue_add_buf_gfp() which can return -ENOSPC.
Even vring_add_indirect() doesn't return -ENOMEM on kmalloc failure.

Thanks
Sridhar

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Stephen Hemminger on
On Thu, 10 Jun 2010 10:17:07 -0700
Sridhar Samudrala <sri(a)us.ibm.com> wrote:

> On Thu, 2010-06-10 at 18:20 +0300, Michael S. Tsirkin wrote:
> > virtio net will never try to overflow the TX ring, so the only reason
> > add_buf may fail is out of memory. Thus, we can not stop the
> > device until some request completes - there's no guarantee anything
> > at all is outstanding.
> >
> > Make the error message clearer as well: error here does not
> > indicate queue full.
> >
> > Signed-off-by: Michael S. Tsirkin <mst(a)redhat.com>
> > ---
> > drivers/net/virtio_net.c | 15 ++++++++-------
> > 1 files changed, 8 insertions(+), 7 deletions(-)
> >
> > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > index 85615a3..e48a06f 100644
> > --- a/drivers/net/virtio_net.c
> > +++ b/drivers/net/virtio_net.c
> > @@ -563,7 +563,6 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
> > struct virtnet_info *vi = netdev_priv(dev);
> > int capacity;
> >
> > -again:
> > /* Free up any pending old buffers before queueing new ones. */
> > free_old_xmit_skbs(vi);
> >
> > @@ -572,12 +571,14 @@ again:
> >
> > /* This can happen with OOM and indirect buffers. */
> > if (unlikely(capacity < 0)) {
> > - netif_stop_queue(dev);
> > - dev_warn(&dev->dev, "Unexpected full queue\n");
> > - if (unlikely(!virtqueue_enable_cb(vi->svq))) {
> > - virtqueue_disable_cb(vi->svq);
> > - netif_start_queue(dev);
> > - goto again;
> > + if (net_ratelimit()) {
> > + if (likely(capacity == -ENOMEM))
> > + dev_warn(&dev->dev,
> > + "TX queue failure: out of memory\n");
> > + else
> > + dev_warn(&dev->dev,
> > + "Unexpected TX queue failure: %d\n",
> > + capacity);
> > }
> > return NETDEV_TX_BUSY;
> > }
>
> It is not clear to me how xmit_skb() can return -ENOMEM.
> xmit_skb() calls virtqueue_add_buf_gfp() which can return -ENOSPC.
> Even vring_add_indirect() doesn't return -ENOMEM on kmalloc failure.

It makes more sense to have the device increment tx_droppped,
and return NETDEV_TX_OK. Skip the message (or make it a pr_debug()).
Network devices do not guarantee packet delivery, and if out of
resources then holding more data in the
queue is going to hurt not help the situation.

--
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Michael S. Tsirkin on
On Thu, Jun 10, 2010 at 10:17:07AM -0700, Sridhar Samudrala wrote:
> On Thu, 2010-06-10 at 18:20 +0300, Michael S. Tsirkin wrote:
> > virtio net will never try to overflow the TX ring, so the only reason
> > add_buf may fail is out of memory. Thus, we can not stop the
> > device until some request completes - there's no guarantee anything
> > at all is outstanding.
> >
> > Make the error message clearer as well: error here does not
> > indicate queue full.
> >
> > Signed-off-by: Michael S. Tsirkin <mst(a)redhat.com>
> > ---
> > drivers/net/virtio_net.c | 15 ++++++++-------
> > 1 files changed, 8 insertions(+), 7 deletions(-)
> >
> > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > index 85615a3..e48a06f 100644
> > --- a/drivers/net/virtio_net.c
> > +++ b/drivers/net/virtio_net.c
> > @@ -563,7 +563,6 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
> > struct virtnet_info *vi = netdev_priv(dev);
> > int capacity;
> >
> > -again:
> > /* Free up any pending old buffers before queueing new ones. */
> > free_old_xmit_skbs(vi);
> >
> > @@ -572,12 +571,14 @@ again:
> >
> > /* This can happen with OOM and indirect buffers. */
> > if (unlikely(capacity < 0)) {
> > - netif_stop_queue(dev);
> > - dev_warn(&dev->dev, "Unexpected full queue\n");
> > - if (unlikely(!virtqueue_enable_cb(vi->svq))) {
> > - virtqueue_disable_cb(vi->svq);
> > - netif_start_queue(dev);
> > - goto again;
> > + if (net_ratelimit()) {
> > + if (likely(capacity == -ENOMEM))
> > + dev_warn(&dev->dev,
> > + "TX queue failure: out of memory\n");
> > + else
> > + dev_warn(&dev->dev,
> > + "Unexpected TX queue failure: %d\n",
> > + capacity);
> > }
> > return NETDEV_TX_BUSY;
> > }
>
> It is not clear to me how xmit_skb() can return -ENOMEM.
> xmit_skb() calls virtqueue_add_buf_gfp() which can return -ENOSPC.
> Even vring_add_indirect() doesn't return -ENOMEM on kmalloc failure.
>
> Thanks
> Sridhar

A separate patch fixes vring_add_indirect to return -ENOMEM.
-ENOSPC really means ring is full so nothing to do
and no need to retry.

--
MST
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Rusty Russell on
On Fri, 11 Jun 2010 04:33:43 am Michael S. Tsirkin wrote:
> On Thu, Jun 10, 2010 at 10:46:53AM -0700, Stephen Hemminger wrote:
> > It makes more sense to have the device increment tx_droppped,
> > and return NETDEV_TX_OK. Skip the message (or make it a pr_debug()).
> > Network devices do not guarantee packet delivery, and if out of
> > resources then holding more data in the
> > queue is going to hurt not help the situation.

Yes, actually oom should be a ratelimited message and TX_OK, the other
case is a "should never happen" logic bug which warrants an error and
I don't care what it returns (whatever's easiest).

Please fix both at once since you're touching it.

Thanks,
Rusty.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Rusty Russell on
On Fri, 11 Jun 2010 04:33:43 am Michael S. Tsirkin wrote:
> > > > @@ -572,12 +571,14 @@ again:
> > > >
> > > > /* This can happen with OOM and indirect buffers. */
> > > > if (unlikely(capacity < 0)) {
> > > > - netif_stop_queue(dev);
> > > > - dev_warn(&dev->dev, "Unexpected full queue\n");
> > > > - if (unlikely(!virtqueue_enable_cb(vi->svq))) {
> > > > - virtqueue_disable_cb(vi->svq);
> > > > - netif_start_queue(dev);
> > > > - goto again;
> > > > + if (net_ratelimit()) {
> > > > + if (likely(capacity == -ENOMEM))
> > > > + dev_warn(&dev->dev,
> > > > + "TX queue failure: out of memory\n");
> > > > + else
> > > > + dev_warn(&dev->dev,
> > > > + "Unexpected TX queue failure: %d\n",
> > > > + capacity);
....
>
> Well, I only keep the existing behaviour around.

Actually, it *does* change behavior, as the comment indicates. So let's
fix the whole thing. AFAICT wth TX_BUSY we'll get called again RSN, and
that's not really useful for OOM.

This is what I have:

Subject: virtio_net: fix oom handling on tx
Date: Thu, 10 Jun 2010 18:20:41 +0300
From: "Michael S. Tsirkin" <mst(a)redhat.com>

virtio net will never try to overflow the TX ring, so the only reason
add_buf may fail is out of memory. Thus, we can not stop the
device until some request completes - there's no guarantee anything
at all is outstanding.

Make the error message clearer as well: error here does not
indicate queue full.

Signed-off-by: Michael S. Tsirkin <mst(a)redhat.com>
Signed-off-by: Rusty Russell <rusty(a)rustcorp.com.au> (...and avoid TX_BUSY)
Cc: stable(a)kernel.org
---
drivers/net/virtio_net.c | 18 ++++++++++--------
1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -562,7 +562,6 @@ static netdev_tx_t start_xmit(struct sk_
struct virtnet_info *vi = netdev_priv(dev);
int capacity;

-again:
/* Free up any pending old buffers before queueing new ones. */
free_old_xmit_skbs(vi);

@@ -571,14 +570,17 @@ again:

/* This can happen with OOM and indirect buffers. */
if (unlikely(capacity < 0)) {
- netif_stop_queue(dev);
- dev_warn(&dev->dev, "Unexpected full queue\n");
- if (unlikely(!virtqueue_enable_cb(vi->svq))) {
- virtqueue_disable_cb(vi->svq);
- netif_start_queue(dev);
- goto again;
+ if (net_ratelimit()) {
+ if (likely(capacity == -ENOMEM))
+ dev_warn(&dev->dev,
+ "TX queue failure: out of memory\n");
+ else
+ dev_warn(&dev->dev,
+ "Unexpected TX queue failure: %d\n",
+ capacity);
}
- return NETDEV_TX_BUSY;
+ kfree_skb(skb);
+ return NETDEV_TX_OK;
}
virtqueue_kick(vi->svq);

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/