r8169: Fix rtl8169_rx_interrupt() [Kernel]

Prev: rcu: make rcu_read_lock_bh_held() allow for disabled BH
Next: drivers/serial/sunsab.c: adjust the constant used to initialize the interrupt_mask0 fields

From: Sergey Senozhatsky on 16 Mar 2010 02:50

Hello,

Sure. Give me a couple of days. By the way, should I test against .34
or .33?

Sergey

On (03/16/10 01:33), Eric Dumazet wrote:
> Le lundi 15 mars 2010 à 22:01 +0100, Eric Dumazet a écrit :
>
> > Yes, this is wrong. In this context (process context, not softirq), we
> > should use netif_rx() or just drop frames if we are in reset phase.
> >
>
> Sergey,
>
> Here is a compiled but untested patch (I dont have the hardware), could
> you please test it ?
>
> Thanks
>
> [PATCH] r8169: Fix rtl8169_rx_interrupt()
>
> In case a reset is performed, rtl8169_rx_interrupt() is called from
> process context instead of softirq context. Special care must be taken
> to call appropriate network core services (netif_rx() instead of
> netif_receive_skb()). VLAN handling also corrected.
>
> Reported-by: Sergey Senozhatsky <sergey.senozhatsky(a)gmail.com>
> Diagnosed-by: Oleg Nesterov <oleg(a)redhat.com>
> Signed-off-by: Eric Dumazet <eric.dumazet(a)gmail.com>
> ---
> diff --git a/drivers/net/r8169.c b/drivers/net/r8169.c
> index 9d3ebf3..d873639 100644
> --- a/drivers/net/r8169.c
> +++ b/drivers/net/r8169.c
> @@ -1038,14 +1038,14 @@ static void rtl8169_vlan_rx_register(struct net_device *dev,
> }
>
> static int rtl8169_rx_vlan_skb(struct rtl8169_private *tp, struct RxDesc *desc,
> - struct sk_buff *skb)
> + struct sk_buff *skb, int polling)
> {
> u32 opts2 = le32_to_cpu(desc->opts2);
> struct vlan_group *vlgrp = tp->vlgrp;
> int ret;
>
> if (vlgrp && (opts2 & RxVlanTag)) {
> - vlan_hwaccel_receive_skb(skb, vlgrp, swab16(opts2 & 0xffff));
> + __vlan_hwaccel_rx(skb, vlgrp, swab16(opts2 & 0xffff), polling);
> ret = 0;
> } else
> ret = -1;
> @@ -1062,7 +1062,7 @@ static inline u32 rtl8169_tx_vlan_tag(struct rtl8169_private *tp,
> }
>
> static int rtl8169_rx_vlan_skb(struct rtl8169_private *tp, struct RxDesc *desc,
> - struct sk_buff *skb)
> + struct sk_buff *skb, int polling)
> {
> return -1;
> }
> @@ -4429,12 +4429,20 @@ out:
> return done;
> }
>
> +/*
> + * Warning : rtl8169_rx_interrupt() might be called :
> + * 1) from NAPI (softirq) context
> + * (polling = 1 : we should call netif_receive_skb())
> + * 2) from process context (rtl8169_reset_task())
> + * (polling = 0 : we must call netif_rx() instead)
> + */
> static int rtl8169_rx_interrupt(struct net_device *dev,
> struct rtl8169_private *tp,
> void __iomem *ioaddr, u32 budget)
> {
> unsigned int cur_rx, rx_left;
> unsigned int delta, count;
> + int polling = (budget != ~(u32)0) ? 1 : 0;
>
> cur_rx = tp->cur_rx;
> rx_left = NUM_RX_DESC + tp->dirty_rx - cur_rx;
> @@ -4496,8 +4504,12 @@ static int rtl8169_rx_interrupt(struct net_device *dev,
> skb_put(skb, pkt_size);
> skb->protocol = eth_type_trans(skb, dev);
>
> - if (rtl8169_rx_vlan_skb(tp, desc, skb) < 0)
> - netif_receive_skb(skb);
> + if (rtl8169_rx_vlan_skb(tp, desc, skb, polling) < 0) {
> + if (likely(polling))
> + netif_receive_skb(skb);
> + else
> + netif_rx(skb);
> + }
>
> dev->stats.rx_bytes += pkt_size;
> dev->stats.rx_packets++;
>
>

From: Eric Dumazet on 16 Mar 2010 03:20

Le mardi 16 mars 2010 à 08:50 +0200, Sergey Senozhatsky a écrit :
> Hello,
>
> Sure. Give me a couple of days. By the way, should I test against .34
> or .33?
>

It's an old bug anyway, so you can pick up whats is the best for you.
My personal pref would be current kernel.

To trigger the original condition (fifo overflow), you might flood your
machine from another one with small packets, for example using pktgen.

Thanks

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: Sergey Senozhatsky on 16 Mar 2010 11:00

Hello,

Eric, I did pktgen testing (3 times).

sudo cat /proc/net/pktgen/eth0
Params: count 1000000000 min_pkt_size: 42 max_pkt_size: 42
frags: 0 delay: 0 clone_skb: 4000000 ifname: eth0
flows: 0 flowlen: 0
queue_map_min: 0 queue_map_max: 0
dst_min: 10.6.22.59 dst_max:
src_min: src_max:
src_mac: 00:1a:92:c9:a0:68 dst_mac: 00:1a:92:c9:a0:68
udp_src_min: 9 udp_src_max: 9 udp_dst_min: 9 udp_dst_max: 9
src_mac_count: 0 dst_mac_count: 0
Flags:
Current:
pkts-sofar: 1000000000 errors: 0
started: 9957912410us stopped: 16682620361us idle: 1231us
seq_num: 1000000001 cur_dst_mac_offset: 0 cur_src_mac_offset: 0
cur_saddr: 0x3b16060a cur_daddr: 0x3b16060a
cur_udp_dst: 9 cur_udp_src: 9
cur_queue_map: 0
flows: 0
Result: OK: 6724707951(c6724706719+d1231) nsec, 1000000000 (42byte,0frags)
148705pps 49Mb/sec (49964880bps) errors: 0

Without any errors from the r8169 side. Can we consider testing successful?
If so, fell free to add Tested-by: Sergey Senozhatsky <sergey.senozhatsky(a)gmail.com>

Sergey

On (03/16/10 01:33), Eric Dumazet wrote:
> > Yes, this is wrong. In this context (process context, not softirq), we
> > should use netif_rx() or just drop frames if we are in reset phase.
> >
>
> Sergey,
>
> Here is a compiled but untested patch (I dont have the hardware), could
> you please test it ?
>
> Thanks
>
> [PATCH] r8169: Fix rtl8169_rx_interrupt()
>
> In case a reset is performed, rtl8169_rx_interrupt() is called from
> process context instead of softirq context. Special care must be taken
> to call appropriate network core services (netif_rx() instead of
> netif_receive_skb()). VLAN handling also corrected.
>
> Reported-by: Sergey Senozhatsky <sergey.senozhatsky(a)gmail.com>
> Diagnosed-by: Oleg Nesterov <oleg(a)redhat.com>
> Signed-off-by: Eric Dumazet <eric.dumazet(a)gmail.com>
> ---
> diff --git a/drivers/net/r8169.c b/drivers/net/r8169.c
> index 9d3ebf3..d873639 100644
> --- a/drivers/net/r8169.c
> +++ b/drivers/net/r8169.c
> @@ -1038,14 +1038,14 @@ static void rtl8169_vlan_rx_register(struct net_device *dev,
> }
>
> static int rtl8169_rx_vlan_skb(struct rtl8169_private *tp, struct RxDesc *desc,
> - struct sk_buff *skb)
> + struct sk_buff *skb, int polling)
> {
> u32 opts2 = le32_to_cpu(desc->opts2);
> struct vlan_group *vlgrp = tp->vlgrp;
> int ret;
>
> if (vlgrp && (opts2 & RxVlanTag)) {
> - vlan_hwaccel_receive_skb(skb, vlgrp, swab16(opts2 & 0xffff));
> + __vlan_hwaccel_rx(skb, vlgrp, swab16(opts2 & 0xffff), polling);
> ret = 0;
> } else
> ret = -1;
> @@ -1062,7 +1062,7 @@ static inline u32 rtl8169_tx_vlan_tag(struct rtl8169_private *tp,
> }
>
> static int rtl8169_rx_vlan_skb(struct rtl8169_private *tp, struct RxDesc *desc,
> - struct sk_buff *skb)
> + struct sk_buff *skb, int polling)
> {
> return -1;
> }
> @@ -4429,12 +4429,20 @@ out:
> return done;
> }
>
> +/*
> + * Warning : rtl8169_rx_interrupt() might be called :
> + * 1) from NAPI (softirq) context
> + * (polling = 1 : we should call netif_receive_skb())
> + * 2) from process context (rtl8169_reset_task())
> + * (polling = 0 : we must call netif_rx() instead)
> + */
> static int rtl8169_rx_interrupt(struct net_device *dev,
> struct rtl8169_private *tp,
> void __iomem *ioaddr, u32 budget)
> {
> unsigned int cur_rx, rx_left;
> unsigned int delta, count;
> + int polling = (budget != ~(u32)0) ? 1 : 0;
>
> cur_rx = tp->cur_rx;
> rx_left = NUM_RX_DESC + tp->dirty_rx - cur_rx;
> @@ -4496,8 +4504,12 @@ static int rtl8169_rx_interrupt(struct net_device *dev,
> skb_put(skb, pkt_size);
> skb->protocol = eth_type_trans(skb, dev);
>
> - if (rtl8169_rx_vlan_skb(tp, desc, skb) < 0)
> - netif_receive_skb(skb);
> + if (rtl8169_rx_vlan_skb(tp, desc, skb, polling) < 0) {
> + if (likely(polling))
> + netif_receive_skb(skb);
> + else
> + netif_rx(skb);
> + }
>
> dev->stats.rx_bytes += pkt_size;
> dev->stats.rx_packets++;
>
>

From: Sergey Senozhatsky on 16 Mar 2010 11:10

Nope!
Here it is:

[17250.998293] ------------[ cut here ]------------
[17250.998305] WARNING: at net/sched/sch_generic.c:255 dev_watchdog+0xc1/0x125()
[17250.998308] Hardware name: F3JC
[17250.998312] NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out
[17250.998315] Modules linked in: pktgen ppp_async crc_ccitt ipv6 ppp_generic slhc snd_hwdep snd_hda_codec_si3054 snd_hda_codec_realtek sdhci_pci sdhci asus_laptop sparse_keymap
mmc_core led_class snd_hda_intel snd_hda_codec snd_pcm snd_timer snd_page_alloc rng_core sg evdev i2c_i801 snd soundcore psmouse r8169 serio_raw mii uhci_hcd ehci_hcd sr_mod
usbcore cdrom sd_mod ata_piix
[17250.998371] Pid: 3985, comm: kpktgend_0 Tainted: G W 2.6.34-rc1-dbg-git6-r8169 #46
[17250.998375] Call Trace:
[17250.998383] [<c102e353>] warn_slowpath_common+0x65/0x7c
[17250.998388] [<c126b654>] ? dev_watchdog+0xc1/0x125
[17250.998393] [<c102e39e>] warn_slowpath_fmt+0x24/0x27
[17250.998398] [<c126b654>] dev_watchdog+0xc1/0x125
[17250.998405] [<c1036bbb>] ? run_timer_softirq+0x120/0x1eb
[17250.998411] [<c1036c11>] run_timer_softirq+0x176/0x1eb
[17250.998416] [<c1036bbb>] ? run_timer_softirq+0x120/0x1eb
[17250.998421] [<c126b593>] ? dev_watchdog+0x0/0x125
[17250.998426] [<c1032df9>] __do_softirq+0x8d/0x117
[17250.998431] [<c1032eae>] do_softirq+0x2b/0x43
[17250.998436] [<c1032fd3>] irq_exit+0x38/0x75
[17250.998442] [<c1014e75>] smp_apic_timer_interrupt+0x66/0x74
[17250.998448] [<c12c812a>] apic_timer_interrupt+0x36/0x3c
[17250.998457] [<c1185d18>] ? do_raw_spin_trylock+0x28/0x37
[17250.998464] [<c12c7101>] _raw_spin_lock+0x2f/0x58
[17250.998472] [<f80855ca>] ? spin_lock+0x8/0xa [pktgen]
[17250.998478] [<f80855ca>] spin_lock+0x8/0xa [pktgen]
[17250.998484] [<f80873b6>] pktgen_thread_worker+0x9b/0x631 [pktgen]
[17250.998491] [<c103f9f1>] ? autoremove_wake_function+0x0/0x2f
[17250.998497] [<c103f9f1>] ? autoremove_wake_function+0x0/0x2f
[17250.998503] [<f808731b>] ? pktgen_thread_worker+0x0/0x631 [pktgen]
[17250.998508] [<c103f6ce>] kthread+0x6a/0x6f
[17250.998514] [<c103f664>] ? kthread+0x0/0x6f
[17250.998520] [<c1002e42>] kernel_thread_helper+0x6/0x1a
[17250.998523] ---[ end trace a22d306b065d4a68 ]---
[17251.011663] r8169 0000:02:00.0: eth0: link up

[17419.011748] NOHZ: local_softirq_pending 08

Sergey

On (03/16/10 01:33), Eric Dumazet wrote:
> > Yes, this is wrong. In this context (process context, not softirq), we
> > should use netif_rx() or just drop frames if we are in reset phase.
> >
>
> Sergey,
>
> Here is a compiled but untested patch (I dont have the hardware), could
> you please test it ?
>
> Thanks
>
> [PATCH] r8169: Fix rtl8169_rx_interrupt()
>
> In case a reset is performed, rtl8169_rx_interrupt() is called from
> process context instead of softirq context. Special care must be taken
> to call appropriate network core services (netif_rx() instead of
> netif_receive_skb()). VLAN handling also corrected.
>
> Reported-by: Sergey Senozhatsky <sergey.senozhatsky(a)gmail.com>
> Diagnosed-by: Oleg Nesterov <oleg(a)redhat.com>
> Signed-off-by: Eric Dumazet <eric.dumazet(a)gmail.com>
> ---
> diff --git a/drivers/net/r8169.c b/drivers/net/r8169.c
> index 9d3ebf3..d873639 100644
> --- a/drivers/net/r8169.c
> +++ b/drivers/net/r8169.c
> @@ -1038,14 +1038,14 @@ static void rtl8169_vlan_rx_register(struct net_device *dev,
> }
>
> static int rtl8169_rx_vlan_skb(struct rtl8169_private *tp, struct RxDesc *desc,
> - struct sk_buff *skb)
> + struct sk_buff *skb, int polling)
> {
> u32 opts2 = le32_to_cpu(desc->opts2);
> struct vlan_group *vlgrp = tp->vlgrp;
> int ret;
>
> if (vlgrp && (opts2 & RxVlanTag)) {
> - vlan_hwaccel_receive_skb(skb, vlgrp, swab16(opts2 & 0xffff));
> + __vlan_hwaccel_rx(skb, vlgrp, swab16(opts2 & 0xffff), polling);
> ret = 0;
> } else
> ret = -1;
> @@ -1062,7 +1062,7 @@ static inline u32 rtl8169_tx_vlan_tag(struct rtl8169_private *tp,
> }
>
> static int rtl8169_rx_vlan_skb(struct rtl8169_private *tp, struct RxDesc *desc,
> - struct sk_buff *skb)
> + struct sk_buff *skb, int polling)
> {
> return -1;
> }
> @@ -4429,12 +4429,20 @@ out:
> return done;
> }
>
> +/*
> + * Warning : rtl8169_rx_interrupt() might be called :
> + * 1) from NAPI (softirq) context
> + * (polling = 1 : we should call netif_receive_skb())
> + * 2) from process context (rtl8169_reset_task())
> + * (polling = 0 : we must call netif_rx() instead)
> + */
> static int rtl8169_rx_interrupt(struct net_device *dev,
> struct rtl8169_private *tp,
> void __iomem *ioaddr, u32 budget)
> {
> unsigned int cur_rx, rx_left;
> unsigned int delta, count;
> + int polling = (budget != ~(u32)0) ? 1 : 0;
>
> cur_rx = tp->cur_rx;
> rx_left = NUM_RX_DESC + tp->dirty_rx - cur_rx;
> @@ -4496,8 +4504,12 @@ static int rtl8169_rx_interrupt(struct net_device *dev,
> skb_put(skb, pkt_size);
> skb->protocol = eth_type_trans(skb, dev);
>
> - if (rtl8169_rx_vlan_skb(tp, desc, skb) < 0)
> - netif_receive_skb(skb);
> + if (rtl8169_rx_vlan_skb(tp, desc, skb, polling) < 0) {
> + if (likely(polling))
> + netif_receive_skb(skb);
> + else
> + netif_rx(skb);
> + }
>
> dev->stats.rx_bytes += pkt_size;
> dev->stats.rx_packets++;
>
>

From: Sergey Senozhatsky on 16 Mar 2010 11:10

On (03/16/10 16:05), Eric Dumazet wrote:
> > Nope!
> > Here it is:
> >
> >
> > [17250.998293] ------------[ cut here ]------------
> > [17250.998305] WARNING: at net/sched/sch_generic.c:255 dev_watchdog+0xc1/0x125()
> > [17250.998308] Hardware name: F3JC
> > [17250.998312] NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out
> > [17250.998315] Modules linked in: pktgen ppp_async crc_ccitt ipv6 ppp_generic slhc snd_hwdep snd_hda_codec_si3054 snd_hda_codec_realtek sdhci_pci sdhci asus_laptop sparse_keymap
> > mmc_core led_class snd_hda_intel snd_hda_codec snd_pcm snd_timer snd_page_alloc rng_core sg evdev i2c_i801 snd soundcore psmouse r8169 serio_raw mii uhci_hcd ehci_hcd sr_mod
> > usbcore cdrom sd_mod ata_piix
> > [17250.998371] Pid: 3985, comm: kpktgend_0 Tainted: G W 2.6.34-rc1-dbg-git6-r8169 #46
> > [17250.998375] Call Trace:
> > [17250.998383] [<c102e353>] warn_slowpath_common+0x65/0x7c
> > [17250.998388] [<c126b654>] ? dev_watchdog+0xc1/0x125
> > [17250.998393] [<c102e39e>] warn_slowpath_fmt+0x24/0x27
> > [17250.998398] [<c126b654>] dev_watchdog+0xc1/0x125
> > [17250.998405] [<c1036bbb>] ? run_timer_softirq+0x120/0x1eb
> > [17250.998411] [<c1036c11>] run_timer_softirq+0x176/0x1eb
> > [17250.998416] [<c1036bbb>] ? run_timer_softirq+0x120/0x1eb
> > [17250.998421] [<c126b593>] ? dev_watchdog+0x0/0x125
> > [17250.998426] [<c1032df9>] __do_softirq+0x8d/0x117
> > [17250.998431] [<c1032eae>] do_softirq+0x2b/0x43
> > [17250.998436] [<c1032fd3>] irq_exit+0x38/0x75
> > [17250.998442] [<c1014e75>] smp_apic_timer_interrupt+0x66/0x74
> > [17250.998448] [<c12c812a>] apic_timer_interrupt+0x36/0x3c
> > [17250.998457] [<c1185d18>] ? do_raw_spin_trylock+0x28/0x37
> > [17250.998464] [<c12c7101>] _raw_spin_lock+0x2f/0x58
> > [17250.998472] [<f80855ca>] ? spin_lock+0x8/0xa [pktgen]
> > [17250.998478] [<f80855ca>] spin_lock+0x8/0xa [pktgen]
> > [17250.998484] [<f80873b6>] pktgen_thread_worker+0x9b/0x631 [pktgen]
> > [17250.998491] [<c103f9f1>] ? autoremove_wake_function+0x0/0x2f
> > [17250.998497] [<c103f9f1>] ? autoremove_wake_function+0x0/0x2f
> > [17250.998503] [<f808731b>] ? pktgen_thread_worker+0x0/0x631 [pktgen]
> > [17250.998508] [<c103f6ce>] kthread+0x6a/0x6f
> > [17250.998514] [<c103f664>] ? kthread+0x0/0x6f
> > [17250.998520] [<c1002e42>] kernel_thread_helper+0x6/0x1a
> > [17250.998523] ---[ end trace a22d306b065d4a68 ]---
> > [17251.011663] r8169 0000:02:00.0: eth0: link up
> >
> > [17419.011748] NOHZ: local_softirq_pending 08
> >
> >
>
> But this stack trace is on pktgen side (the sender), and my patch was
> about the receiver ?
>

[...]
>> NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out
Isn't it r8169 related?

> I wonder if you dont hit another problem :)
:)

Sergey

| Next | Last
Pages: 1 2 3
Prev: rcu: make rcu_read_lock_bh_held() allow for disabled BH
Next: drivers/serial/sunsab.c: adjust the constant used to initialize the interrupt_mask0 fields