From: Eric Dumazet on
Francis Moreau a �crit :
> Hello Eric,
>
> It seems I still have a related bug, please have a look to the following oops.
>
> This happened on a 2.6.32-rc5 where your patch is included.
>
> [107304.558821] nfsd: last server has exited, flushing export cache
> [107304.558848] ------------[ cut here ]------------
> [107304.558858] WARNING: at net/ipv4/af_inet.c:153
> inet_sock_destruct+0x161/0x17c()
> [107304.558862] Hardware name: P5K-VM
> [107304.558865] Modules linked in: kvm_intel kvm jfs loop nfsd lockd
> nfs_acl auth_rpcgss exportfs sunrpc [last unloaded: microcode]
> [107304.558889] Pid: 8198, comm: nfsd Tainted: G M 2.6.32-rc5 #25
> [107304.558892] Call Trace:
> [107304.558899] [<ffffffff81429f19>] ? inet_sock_destruct+0x161/0x17c
> [107304.558907] [<ffffffff810487e9>] warn_slowpath_common+0x7c/0xa9
> [107304.558914] [<ffffffff8104882a>] warn_slowpath_null+0x14/0x16
> [107304.558920] [<ffffffff81429f19>] inet_sock_destruct+0x161/0x17c
> [107304.558927] [<ffffffff813c8741>] __sk_free+0x23/0xe7
> [107304.558933] [<ffffffff813c8881>] sk_free+0x1f/0x21
> [107304.558939] [<ffffffff813c894b>] sk_common_release+0xc8/0xcd
> [107304.558944] [<ffffffff81420b59>] udp_lib_close+0xe/0x10
> [107304.558951] [<ffffffff814299bf>] inet_release+0x55/0x5c
> [107304.558957] [<ffffffff813c5aa9>] sock_release+0x1f/0x71
> [107304.558962] [<ffffffff813c5b22>] sock_close+0x27/0x2b
> [107304.558968] [<ffffffff810eb60f>] __fput+0xfb/0x1c0
> [107304.558973] [<ffffffff810eb6f1>] fput+0x1d/0x1f
> [107304.558995] [<ffffffffa0013e23>] svc_sock_free+0x40/0x56 [sunrpc]
> [107304.559018] [<ffffffffa001f392>] svc_xprt_free+0x43/0x53 [sunrpc]
> [107304.559038] [<ffffffffa001f34f>] ? svc_xprt_free+0x0/0x53 [sunrpc]
> [107304.559048] [<ffffffff811d9275>] kref_put+0x43/0x4f
> [107304.559069] [<ffffffffa001e67a>] svc_close_xprt+0x55/0x5e [sunrpc]
> [107304.559088] [<ffffffffa001e6d3>] svc_close_all+0x50/0x69 [sunrpc]
> [107304.559107] [<ffffffffa0012a2b>] svc_destroy+0x9e/0x142 [sunrpc]
> [107304.559126] [<ffffffffa0012b88>] svc_exit_thread+0xb9/0xc2 [sunrpc]
> [107304.559138] [<ffffffffa008981b>] ? nfsd+0x0/0x13f [nfsd]
> [107304.559149] [<ffffffffa0089940>] nfsd+0x125/0x13f [nfsd]
> [107304.559157] [<ffffffff810685e3>] kthread+0x82/0x8a
> [107304.559164] [<ffffffff8100c13a>] child_rip+0xa/0x20
> [107304.559172] [<ffffffff8100baad>] ? restore_args+0x0/0x30
> [107304.559179] [<ffffffff81068561>] ? kthread+0x0/0x8a
> [107304.559185] [<ffffffff8100c130>] ? child_rip+0x0/0x20
> [107304.559191] ---[ end trace c107131f4762168c ]---
> [107304.927931] NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state
> recovery directory
> [107304.932765] NFSD: starting 90-second grace period
>

Thanks Francis, I think I found the problem.

I am preparing a patch, test it, and submit it in couple of hours

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Eric Dumazet on
Francis Moreau a �crit :
> Hello Eric,
>
> It seems I still have a related bug, please have a look to the following oops.
>
> This happened on a 2.6.32-rc5 where your patch is included.
>
> [107304.558821] nfsd: last server has exited, flushing export cache
> [107304.558848] ------------[ cut here ]------------
> [107304.558858] WARNING: at net/ipv4/af_inet.c:153
> inet_sock_destruct+0x161/0x17c()
> [107304.558862] Hardware name: P5K-VM
> [107304.558865] Modules linked in: kvm_intel kvm jfs loop nfsd lockd
> nfs_acl auth_rpcgss exportfs sunrpc [last unloaded: microcode]
> [107304.558889] Pid: 8198, comm: nfsd Tainted: G M 2.6.32-rc5 #25
> [107304.558892] Call Trace:
> [107304.558899] [<ffffffff81429f19>] ? inet_sock_destruct+0x161/0x17c
> [107304.558907] [<ffffffff810487e9>] warn_slowpath_common+0x7c/0xa9
> [107304.558914] [<ffffffff8104882a>] warn_slowpath_null+0x14/0x16
> [107304.558920] [<ffffffff81429f19>] inet_sock_destruct+0x161/0x17c
> [107304.558927] [<ffffffff813c8741>] __sk_free+0x23/0xe7
> [107304.558933] [<ffffffff813c8881>] sk_free+0x1f/0x21
> [107304.558939] [<ffffffff813c894b>] sk_common_release+0xc8/0xcd
> [107304.558944] [<ffffffff81420b59>] udp_lib_close+0xe/0x10
> [107304.558951] [<ffffffff814299bf>] inet_release+0x55/0x5c
> [107304.558957] [<ffffffff813c5aa9>] sock_release+0x1f/0x71
> [107304.558962] [<ffffffff813c5b22>] sock_close+0x27/0x2b
> [107304.558968] [<ffffffff810eb60f>] __fput+0xfb/0x1c0
> [107304.558973] [<ffffffff810eb6f1>] fput+0x1d/0x1f
> [107304.558995] [<ffffffffa0013e23>] svc_sock_free+0x40/0x56 [sunrpc]
> [107304.559018] [<ffffffffa001f392>] svc_xprt_free+0x43/0x53 [sunrpc]
> [107304.559038] [<ffffffffa001f34f>] ? svc_xprt_free+0x0/0x53 [sunrpc]
> [107304.559048] [<ffffffff811d9275>] kref_put+0x43/0x4f
> [107304.559069] [<ffffffffa001e67a>] svc_close_xprt+0x55/0x5e [sunrpc]
> [107304.559088] [<ffffffffa001e6d3>] svc_close_all+0x50/0x69 [sunrpc]
> [107304.559107] [<ffffffffa0012a2b>] svc_destroy+0x9e/0x142 [sunrpc]
> [107304.559126] [<ffffffffa0012b88>] svc_exit_thread+0xb9/0xc2 [sunrpc]
> [107304.559138] [<ffffffffa008981b>] ? nfsd+0x0/0x13f [nfsd]
> [107304.559149] [<ffffffffa0089940>] nfsd+0x125/0x13f [nfsd]
> [107304.559157] [<ffffffff810685e3>] kthread+0x82/0x8a
> [107304.559164] [<ffffffff8100c13a>] child_rip+0xa/0x20
> [107304.559172] [<ffffffff8100baad>] ? restore_args+0x0/0x30
> [107304.559179] [<ffffffff81068561>] ? kthread+0x0/0x8a
> [107304.559185] [<ffffffff8100c130>] ? child_rip+0x0/0x20
> [107304.559191] ---[ end trace c107131f4762168c ]---
> [107304.927931] NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state
> recovery directory
> [107304.932765] NFSD: starting 90-second grace period
>

This oops occurring again and again with SUNRPC finally gave me the right pointer.

David, we added two years ago memory accounting to UDP, and this changed
requirements about calling skb_free_datagram() in the right context.

I wish we had an ASSERT_SOCK_LOCKED() debugging facility :(

Francis, would you please test following patch ?

Thanks

[PATCH] net: fix sk_forward_alloc corruption

On UDP sockets, we must call skb_free_datagram() with socket locked,
or risk sk_forward_alloc corruption. This requirement is not respected
in SUNRPC.

Add a convenient helper, skb_free_datagram_locked() and use it in SUNRPC

Reported-by: Francis Moreau <francis.moro(a)gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet(a)gmail.com>
---
include/linux/skbuff.h | 2 ++
net/core/datagram.c | 10 +++++++++-
net/ipv4/udp.c | 4 +---
net/ipv6/udp.c | 4 +---
net/sunrpc/svcsock.c | 10 +++++-----
net/sunrpc/xprtsock.c | 2 +-
6 files changed, 19 insertions(+), 13 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index df7b23a..266878f 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -1757,6 +1757,8 @@ extern int skb_copy_datagram_const_iovec(const struct sk_buff *from,
int to_offset,
int size);
extern void skb_free_datagram(struct sock *sk, struct sk_buff *skb);
+extern void skb_free_datagram_locked(struct sock *sk,
+ struct sk_buff *skb);
extern int skb_kill_datagram(struct sock *sk, struct sk_buff *skb,
unsigned int flags);
extern __wsum skb_checksum(const struct sk_buff *skb, int offset,
diff --git a/net/core/datagram.c b/net/core/datagram.c
index 1c6cf3a..4ade301 100644
--- a/net/core/datagram.c
+++ b/net/core/datagram.c
@@ -224,6 +224,15 @@ void skb_free_datagram(struct sock *sk, struct sk_buff *skb)
consume_skb(skb);
sk_mem_reclaim_partial(sk);
}
+EXPORT_SYMBOL(skb_free_datagram);
+
+void skb_free_datagram_locked(struct sock *sk, struct sk_buff *skb)
+{
+ lock_sock(sk);
+ skb_free_datagram(sk, skb);
+ release_sock(sk);
+}
+EXPORT_SYMBOL(skb_free_datagram_locked);

/**
* skb_kill_datagram - Free a datagram skbuff forcibly
@@ -752,5 +761,4 @@ unsigned int datagram_poll(struct file *file, struct socket *sock,
EXPORT_SYMBOL(datagram_poll);
EXPORT_SYMBOL(skb_copy_and_csum_datagram_iovec);
EXPORT_SYMBOL(skb_copy_datagram_iovec);
-EXPORT_SYMBOL(skb_free_datagram);
EXPORT_SYMBOL(skb_recv_datagram);
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index d0d436d..0fa9f70 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -999,9 +999,7 @@ try_again:
err = ulen;

out_free:
- lock_sock(sk);
- skb_free_datagram(sk, skb);
- release_sock(sk);
+ skb_free_datagram_locked(sk, skb);
out:
return err;

diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 3a60f12..cf538ed 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -288,9 +288,7 @@ try_again:
err = ulen;

out_free:
- lock_sock(sk);
- skb_free_datagram(sk, skb);
- release_sock(sk);
+ skb_free_datagram_locked(sk, skb);
out:
return err;

diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index ccc5e83..1c246a4 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -111,7 +111,7 @@ static void svc_release_skb(struct svc_rqst *rqstp)
rqstp->rq_xprt_ctxt = NULL;

dprintk("svc: service %p, releasing skb %p\n", rqstp, skb);
- skb_free_datagram(svsk->sk_sk, skb);
+ skb_free_datagram_locked(svsk->sk_sk, skb);
}
}

@@ -578,7 +578,7 @@ static int svc_udp_recvfrom(struct svc_rqst *rqstp)
"svc: received unknown control message %d/%d; "
"dropping RPC reply datagram\n",
cmh->cmsg_level, cmh->cmsg_type);
- skb_free_datagram(svsk->sk_sk, skb);
+ skb_free_datagram_locked(svsk->sk_sk, skb);
return 0;
}

@@ -588,18 +588,18 @@ static int svc_udp_recvfrom(struct svc_rqst *rqstp)
if (csum_partial_copy_to_xdr(&rqstp->rq_arg, skb)) {
local_bh_enable();
/* checksum error */
- skb_free_datagram(svsk->sk_sk, skb);
+ skb_free_datagram_locked(svsk->sk_sk, skb);
return 0;
}
local_bh_enable();
- skb_free_datagram(svsk->sk_sk, skb);
+ skb_free_datagram_locked(svsk->sk_sk, skb);
} else {
/* we can use it in-place */
rqstp->rq_arg.head[0].iov_base = skb->data +
sizeof(struct udphdr);
rqstp->rq_arg.head[0].iov_len = len;
if (skb_checksum_complete(skb)) {
- skb_free_datagram(svsk->sk_sk, skb);
+ skb_free_datagram_locked(svsk->sk_sk, skb);
return 0;
}
rqstp->rq_xprt_ctxt = skb;
diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index 37c5475..d61be4a 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -883,7 +883,7 @@ static void xs_udp_data_ready(struct sock *sk, int len)
out_unlock:
spin_unlock(&xprt->transport_lock);
dropit:
- skb_free_datagram(sk, skb);
+ skb_free_datagram_locked(sk, skb);
out:
read_unlock(&sk->sk_callback_lock);
}

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Francis Moreau on
On Fri, Oct 30, 2009 at 12:27 PM, Eric Dumazet <eric.dumazet(a)gmail.com> wrote:
>
> This oops occurring again and again with SUNRPC finally gave me the right pointer.
>
> David, we added two years ago memory accounting to UDP, and this changed
> requirements about calling skb_free_datagram() in the right context.
>
> I wish we had an ASSERT_SOCK_LOCKED() debugging facility :(
>
> Francis, would you please test following patch ?

I'm applying it and testing it during a couple of days and see if
something wrong still happens.

Since I have no specific test case, I'll report if this oops happen
again after a couple of days (probably on next Wednesday).

--
Francis
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Francis Moreau on
On Fri, Oct 30, 2009 at 1:33 PM, Francis Moreau <francis.moro(a)gmail.com> wrote:
> On Fri, Oct 30, 2009 at 12:27 PM, Eric Dumazet <eric.dumazet(a)gmail.com> wrote:
>>
>> This oops occurring again and again with SUNRPC finally gave me the right pointer.
>>
>> David, we added two years ago memory accounting to UDP, and this changed
>> requirements about calling skb_free_datagram() in the right context.
>>
>> I wish we had an ASSERT_SOCK_LOCKED() debugging facility :(
>>
>> Francis, would you please test following patch ?
>
> I'm applying it and testing it during a couple of days and see if
> something wrong still happens.

Hmm, with the patch applied on a 2.6.32-rc5, my machine locks hard
when starting nfsd.

--
Francis
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Eric Dumazet on
Francis Moreau a �crit :
> On Fri, Oct 30, 2009 at 1:33 PM, Francis Moreau <francis.moro(a)gmail.com> wrote:
>> On Fri, Oct 30, 2009 at 12:27 PM, Eric Dumazet <eric.dumazet(a)gmail.com> wrote:
>>> This oops occurring again and again with SUNRPC finally gave me the right pointer.
>>>
>>> David, we added two years ago memory accounting to UDP, and this changed
>>> requirements about calling skb_free_datagram() in the right context.
>>>
>>> I wish we had an ASSERT_SOCK_LOCKED() debugging facility :(
>>>
>>> Francis, would you please test following patch ?
>> I'm applying it and testing it during a couple of days and see if
>> something wrong still happens.
>
> Hmm, with the patch applied on a 2.6.32-rc5, my machine locks hard
> when starting nfsd.
>

Please ignore/revert the last part of the patch (about net/sunrpc/xprtsock.c)

In xs_udp_data_ready() we really want to call skb_free_datagram(),
not skb_free_datagram_locked(), because socket is already locked.

Thanks

diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index 37c5475..d61be4a 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -883,7 +883,7 @@ static void xs_udp_data_ready(struct sock *sk, int len)
out_unlock:
spin_unlock(&xprt->transport_lock);
dropit:
+ skb_free_datagram(sk, skb);
- skb_free_datagram_locked(sk, skb);
out:
read_unlock(&sk->sk_callback_lock);
}

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/