From: Michael Leun on 5 Aug 2010 07:50

On Thu, 05 Aug 2010 02:51:29 -0700 ebiederm(a)xmission.com (Eric W. Biederman) wrote:

> >> > Jul 10 20:02:36 doris kernel: unregister_netdevice: waiting for
> >> > lo to become free. Usage count = 3 [repeated]
> >>
> >> How many times?
> >
> > Unfortunately looks like indefinitely. Never watched longer so far
> > (rebooted soon), but I'm seeing this message now repeated every 10
> > secs for ~10 minutes on a idle system.
>
> Ugh. A real bug then. These can be a pain to track down and fix. I
> think the last one of these I tracked down took a couple of weeks. I
> will start digging in when I get back from vacation.

OK, fortunately (hopefully) you have not put too much time into that so far - because everything I told you about the usage of tun and the difference between ssh and openvpn is complete nonsense.

I happen to have a script in that openvpn config which puts an ipv6 address on the vpn device. Putting an ipv6 address on a device seems to be the trigger:

OrigNS > # ip link add type veth
OrigNS > # ip link set dev veth0 up
OrigNS > # unshare -n /bin/bash
NewNS  > # echo $$
<SomePID>
OrigNS > # ip link set dev veth1 netns <SomePID>   # this, of course, is on a different terminal
NewNS  > # ip link set dev veth1 up
NewNS  > # ip -6 addr add dev veth1 fd50:dead:beef::1/64
NewNS  > # exit

Yields

kernel: unregister_netdevice: waiting for veth1 to become free. Usage count = 3

Oh - it's veth1 this time, not lo. Add an "ip link set up dev lo" to the above scenario just after the unshare, and you get the message with lo.

One might ask whether

> # unshare -n /bin/bash
> # ip link set up dev lo
> # ip -6 addr add dev veth1 fd50:dead:beef::1/64
> # exit

also does the trick, so I tried it - and it does NOT.

In the above scenario, not setting veth0 and veth1 up also makes it not happen. Setting only veth1 up is not enough either (it seems the device needs to be "really up", which, as you surely know, is only the case with veth when both sides are up).

I hope this makes it somewhat easier to track that down.

--
MfG,
Michael Leun

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
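
The symptom reported above is one specific kernel log line, so a reproduction attempt can be checked by scanning the log for it. What follows is a minimal Python sketch and not part of the original thread: the name `find_stuck_devices` and the sample lines are illustrative assumptions, and the second sample timestamp is made up to mimic the reported 10-second repeat.

```python
import re

# Matches the kernel message quoted in this thread.
WAITING_RE = re.compile(
    r"unregister_netdevice: waiting for (?P<dev>\S+) to become free\. "
    r"Usage count = (?P<count>\d+)"
)


def find_stuck_devices(log_lines):
    """Return {device: (occurrences, last_usage_count)} for matching lines."""
    stuck = {}
    for line in log_lines:
        match = WAITING_RE.search(line)
        if match is None:
            continue  # unrelated log line
        dev = match.group("dev")
        seen, _ = stuck.get(dev, (0, 0))
        stuck[dev] = (seen + 1, int(match.group("count")))
    return stuck


# Sample input: the first and third lines are quoted from the thread,
# the second is a made-up repeat for illustration only.
sample = [
    "Jul 10 20:02:36 doris kernel: unregister_netdevice: waiting for lo to become free. Usage count = 3",
    "Jul 10 20:02:46 doris kernel: unregister_netdevice: waiting for lo to become free. Usage count = 3",
    "kernel: unregister_netdevice: waiting for veth1 to become free. Usage count = 3",
]

if __name__ == "__main__":
    print(find_stuck_devices(sample))
```

In practice the input could be the lines of a saved kernel log; a device whose occurrence count keeps growing between runs matches the "repeated indefinitely" behaviour described in the report.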

From: David Miller on 5 Aug 2010 16:20

From: ebiederm(a)xmission.com (Eric W. Biederman)
Date: Thu, 05 Aug 2010 12:57:59 -0700

> I wonder what has changed with ipv6 recently.

There was a recent fix to the IGMP snooping code we have in
the bridging layer, if parsing of an ipv6 IGMP packet failed
we'd leak the packet (and thus references to whatever device
it referenced).

commit 6d1d1d398cb7db7a12c5d652d50f85355345234f
Author: Herbert Xu <herbert(a)gondor.apana.org.au>
Date:   Thu Jul 29 01:12:31 2010 +0000

    bridge: Fix skb leak when multicast parsing fails on TX

    On the bridge TX path we're leaking an skb when br_multicast_rcv
    returns an error.

    Reported-by: David Lamparter <equinox(a)diac24.net>
    Signed-off-by: Herbert Xu <herbert(a)gondor.apana.org.au>
    Signed-off-by: David S. Miller <davem(a)davemloft.net>

diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c
index 4cec805..f49bcd9 100644
--- a/net/bridge/br_device.c
+++ b/net/bridge/br_device.c
@@ -48,8 +48,10 @@ netdev_tx_t br_dev_xmit(struct sk_buff *skb, struct net_device *dev)
 
 	rcu_read_lock();
 	if (is_multicast_ether_addr(dest)) {
-		if (br_multicast_rcv(br, NULL, skb))
+		if (br_multicast_rcv(br, NULL, skb)) {
+			kfree_skb(skb);
 			goto out;
+		}
 
 		mdst = br_mdb_get(br, skb);
 		if (mdst || BR_INPUT_SKB_CB_MROUTERS_ONLY(skb))
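
The mechanism described in the message above, where a packet that is never freed keeps a reference to its device alive, can be illustrated with a toy model. This is a hedged Python sketch and not kernel code: `Device`, `Packet`, `xmit`, and `refs_left_after_failed_parse` are hypothetical names, and the single counter only loosely mimics how the kernel tracks device references. It shows the general pattern of an error path that returns without releasing the packet, and makes no claim about the cause of the bug reported in this thread.

```python
class Device:
    """Toy stand-in for a network device: only tracks a reference count."""

    def __init__(self, name):
        self.name = name
        self.refcnt = 0


class Packet:
    """Toy stand-in for a packet buffer that pins the device it refers to."""

    def __init__(self, dev):
        self.dev = dev
        self.freed = False
        dev.refcnt += 1  # the packet holds one reference to the device

    def free(self):
        """Release the packet and, with it, the device reference."""
        if not self.freed:
            self.freed = True
            self.dev.refcnt -= 1


def xmit(packet, parse_ok, free_on_error):
    """Model of a transmit path whose parsing step can fail."""
    if not parse_ok:
        if free_on_error:
            packet.free()  # fixed behaviour: drop the packet on error
        return  # without the free above, the reference is leaked
    packet.free()  # normal path: the packet is consumed


def refs_left_after_failed_parse(free_on_error):
    """Return the device reference count after one failed transmit."""
    dev = Device("veth1")
    xmit(Packet(dev), parse_ok=False, free_on_error=free_on_error)
    return dev.refcnt


if __name__ == "__main__":
    print("without free on error:", refs_left_after_failed_parse(False))
    print("with free on error:", refs_left_after_failed_parse(True))
```

A device whose count never returns to zero in this model corresponds to one that cannot be unregistered, which is the situation the "waiting for ... to become free" message reports.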

From: lkml20100708 on 5 Aug 2010 19:50

On Thu, 05 Aug 2010 13:11:28 -0700 (PDT) David Miller <davem(a)davemloft.net> wrote:

> From: ebiederm(a)xmission.com (Eric W. Biederman)
> Date: Thu, 05 Aug 2010 12:57:59 -0700
>
> > I wonder what has changed with ipv6 recently.
>
> There was a recent fix to the IGMP snooping code we have in
> the bridging layer, if parsing of an ipv6 IGMP packet failed
> we'd leak the packet (and thus references to whatever device
> it referenced).
>
> commit 6d1d1d398cb7db7a12c5d652d50f85355345234f
[...]

But this patch is not in 2.6.35 and therefore cannot account for the difference Eric sees (believes he sees) between his modified 2.6.32 and 2.6.35.

Also, this patch, if I understand it correctly, only changes bridging, and in my scenario bridge.ko (I have it as a module) was not even loaded, so applying this patch should not make any difference for the bug I see. Or am I overlooking something?

So I guess your answer was general information in reply to Eric's question about what changed with ipv6, and not related to the particular bug we are hunting?

--
MfG,
Michael Leun

From: Michael Leun on 5 Aug 2010 20:20

On Thu, 05 Aug 2010 12:57:59 -0700 ebiederm(a)xmission.com (Eric W. Biederman) wrote:

> What puzzles me is that on a slightly patched 2.6.32 (so sysfs works)
> and I am doing very similar things (openvpn tunnels, ipv6 to the
> network as a whole etc), and I am not seeing the infinite
> unregister_netdevice: messages you are talking about.

Hmmm, I think there are 2 possibilities:

- You send me a patch against plain 2.6.32, so I can check my scenarios against that kernel, or
- You try it yourself; it's really just those few lines against a freshly booted system in a clean, easy-to-reproduce state

(only if you think that would yield useful information, of course).

> When a network device is removed most references to it are redirected
> to the loopback device so a normal network device should not see the
> worst of the problems. That is why lo showed up.
>
> In that context I'm a bit surprised you managed trigger a problem on
> veth1.

The difference was that when the message showed up with veth1, lo in that namespace was down while testing. When lo was up, the message showed up on lo.

--
MfG,
Michael Leun