Badness with the kernel version 2.6.35-rc1-git1 running on P6 box [Kernel]

From: Eric Dumazet on 16 Jul 2010 06:00

On Friday, 16 July 2010 at 14:20 +0530, divya wrote:
> Hi,
>
> With the latest kernel version 2.6.35-rc5-git1 (2f7989efd4398) running on a Power (P6) box, I came across the following
> call trace:
>
> Call Trace:
> [c000000006a0e800] [c000000000011c30] .show_stack+0x6c/0x16c (unreliable)
> [c000000006a0e8b0] [c00000000012129c] .__alloc_pages_nodemask+0x6a0/0x75c
> [c000000006a0ea30] [c0000000001527cc] .alloc_pages_current+0xc4/0x104
> [c000000006a0ead0] [c00000000015b1a0] .new_slab+0xe0/0x314
> [c000000006a0eb70] [c00000000015b6fc] .__slab_alloc+0x328/0x644
> [c000000006a0ec50] [c00000000015cc34] .__kmalloc_node_track_caller+0x114/0x194
> [c000000006a0ed00] [c000000000599f6c] .__alloc_skb+0x94/0x180
> [c000000006a0edb0] [c00000000059af5c] .__netdev_alloc_skb+0x3c/0x74
> [c000000006a0ee30] [c0000000004f9480] .ehea_refill_rq_def+0xf8/0x2d0
> [c000000006a0ef30] [c0000000004fab8c] .ehea_up+0x5b8/0x69c
> [c000000006a0f040] [c0000000004facd4] .ehea_open+0x64/0x118
> [c000000006a0f0e0] [c0000000005a6e9c] .__dev_open+0x100/0x168
> [c000000006a0f170] [c0000000005a3ac0] .__dev_change_flags+0x10c/0x1ac
> [c000000006a0f210] [c0000000005a6d44] .dev_change_flags+0x24/0x7c
> [c000000006a0f2a0] [c0000000005b50b4] .do_setlink+0x31c/0x750
> [c000000006a0f3b0] [c0000000005b6724] .rtnl_newlink+0x388/0x618
> [c000000006a0f5f0] [c0000000005b6350] .rtnetlink_rcv_msg+0x268/0x2b4
> [c000000006a0f6a0] [c0000000005cfdc0] .netlink_rcv_skb+0x74/0x108
> [c000000006a0f730] [c0000000005b60c4] .rtnetlink_rcv+0x38/0x5c
> [c000000006a0f7c0] [c0000000005cf8c8] .netlink_unicast+0x318/0x3f4
> [c000000006a0f890] [c0000000005d05b4] .netlink_sendmsg+0x2d0/0x310
> [c000000006a0f970] [c00000000058e1e8] .sock_sendmsg+0xd4/0x110
> [c000000006a0fb50] [c00000000058e514] .SyS_sendmsg+0x1f4/0x288
> [c000000006a0fd70] [c00000000058c2b8] .SyS_socketcall+0x214/0x280
> [c000000006a0fe30] [c0000000000085b4] syscall_exit+0x0/0x40
> Mem-Info:
> Node 0 DMA per-cpu:
> CPU 0: hi: 0, btch: 1 usd: 0
> CPU 1: hi: 0, btch: 1 usd: 0
> CPU 2: hi: 0, btch: 1 usd: 0
> CPU 3: hi: 0, btch: 1 usd: 0
> active_anon:50 inactive_anon:260 isolated_anon:0
> active_file:159 inactive_file:139 isolated_file:0
> unevictable:0 dirty:2 writeback:1 unstable:0
> free:16 slab_reclaimable:66 slab_unreclaimable:502
> mapped:120 shmem:2 pagetables:37 bounce:0
> Node 0 DMA free:1024kB min:1408kB low:1728kB high:2112kB active_anon:3200kB inactive_anon:16640kB active_file:10176kB inactive_file:8896kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:130944kB mlocked:0kB dirty:128kB writeback:64kB mapped:7680kB shmem:128kB slab_reclaimable:4224kB slab_unreclaimable:32128kB kernel_stack:2528kB pagetables:2368kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
> lowmem_reserve[]: 0 0 0
> Node 0 DMA: 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB 0*8192kB 0*16384kB = 0kB
> 496 total pagecache pages
> 178 pages in swap cache
> Swap cache stats: add 780, delete 602, find 467/551
> Free swap = 1027904kB
> Total swap = 1044160kB
> 2048 pages RAM
> 683 pages reserved
> 582 pages shared
> 1075 pages non-shared
> SLUB: Unable to allocate memory on node -1 (gfp=0x20)
> cache: kmalloc-16384, object size: 16384, buffer size: 16384, default order: 2, min order: 0
> node 0: slabs: 28, objs: 292, free: 0
> ip: page allocation failure. order:0, mode:0x8020
> Call Trace:
> [c000000006a0eb40] [c000000000011c30] .show_stack+0x6c/0x16c (unreliable)
> [c000000006a0ebf0] [c00000000012129c] .__alloc_pages_nodemask+0x6a0/0x75c
> [c000000006a0ed70] [c0000000001527cc] .alloc_pages_current+0xc4/0x104
> [c000000006a0ee10] [c00000000011fca4] .__get_free_pages+0x18/0x90
> [c000000006a0ee90] [c0000000004f7058] .ehea_get_stats+0x4c/0x1bc
> [c000000006a0ef30] [c0000000005a0a04] .dev_get_stats+0x38/0x64
> [c000000006a0efc0] [c0000000005b456c] .rtnl_fill_ifinfo+0x35c/0x85c
> [c000000006a0f150] [c0000000005b5920] .rtmsg_ifinfo+0x164/0x204
> [c000000006a0f210] [c0000000005a6d6c] .dev_change_flags+0x4c/0x7c
> [c000000006a0f2a0] [c0000000005b50b4] .do_setlink+0x31c/0x750
> [c000000006a0f3b0] [c0000000005b6724] .rtnl_newlink+0x388/0x618
> [c000000006a0f5f0] [c0000000005b6350] .rtnetlink_rcv_msg+0x268/0x2b4
> [c000000006a0f6a0] [c0000000005cfdc0] .netlink_rcv_skb+0x74/0x108
> [c000000006a0f730] [c0000000005b60c4] .rtnetlink_rcv+0x38/0x5c
> [c000000006a0f7c0] [c0000000005cf8c8] .netlink_unicast+0x318/0x3f4
> [c000000006a0f890] [c0000000005d05b4] .netlink_sendmsg+0x2d0/0x310
> [c000000006a0f970] [c00000000058e1e8] .sock_sendmsg+0xd4/0x110
> [c000000006a0fb50] [c00000000058e514] .SyS_sendmsg+0x1f4/0x288
> [c000000006a0fd70] [c00000000058c2b8] .SyS_socketcall+0x214/0x280
> [c000000006a0fe30] [c0000000000085b4] syscall_exit+0x0/0x40
> Mem-Info:
> Node 0 DMA per-cpu:
> CPU 0: hi: 0, btch: 1 usd: 0
> CPU 1: hi: 0, btch: 1 usd: 0
> CPU 2: hi: 0, btch: 1 usd: 0
> CPU 3: hi: 0, btch: 1 usd: 0
>
> The mainline 2.6.35-rc5 worked fine.

Maybe you were lucky with 2.6.35-rc5

Anyway, ehea should use GFP_KERNEL rather than GFP_ATOMIC in its
ehea_get_stats() method, which is called in process context.

Another patch is needed for ehea_refill_rq_def() as well.

[PATCH] ehea: ehea_get_stats() should use GFP_KERNEL

ehea_get_stats() is called in process context and should use GFP_KERNEL
allocation instead of GFP_ATOMIC.

Clearing stats at beginning of ehea_get_stats() is racy in case of
concurrent stat readers.

get_stats() can also use netdev net_device_stats, instead of a private
copy.

Reported-by: divya <dipraksh(a)linux.vnet.ibm.com>
Signed-off-by: Eric Dumazet <eric.dumazet(a)gmail.com>
---
drivers/net/ehea/ehea.h | 1 -
drivers/net/ehea/ehea_main.c | 6 ++----
2 files changed, 2 insertions(+), 5 deletions(-)

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: Eric Dumazet on 16 Jul 2010 08:30

On Friday, 16 July 2010 at 11:56 +0200, Eric Dumazet wrote:

> [PATCH] ehea: ehea_get_stats() should use GFP_KERNEL
>
> ehea_get_stats() is called in process context and should use GFP_KERNEL
> allocation instead of GFP_ATOMIC.
>
> Clearing stats at beginning of ehea_get_stats() is racy in case of
> concurrent stat readers.
>
> get_stats() can also use netdev net_device_stats, instead of a private
> copy.
>
> Reported-by: divya <dipraksh(a)linux.vnet.ibm.com>
> Signed-off-by: Eric Dumazet <eric.dumazet(a)gmail.com>
> ---
> drivers/net/ehea/ehea.h | 1 -
> drivers/net/ehea/ehea_main.c | 6 ++----
> 2 files changed, 2 insertions(+), 5 deletions(-)
>
>

Hmm, net-next-2.6 already contains the following patch:

commit 3d8009c780ee90fccb5c171caf30aff839f13547
Author: Brian King <brking(a)linux.vnet.ibm.com>
Date: Wed Jun 30 11:59:12 2010 +0000

ehea: Allocate stats buffer with GFP_KERNEL

Since ehea_get_stats calls ehea_h_query_ehea_port, which
can sleep, we can also sleep when allocating a page in
this function. This fixes some memory allocation failure
warnings seen under low memory conditions.

Signed-off-by: Brian King <brking(a)linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>

diff --git a/drivers/net/ehea/ehea_main.c b/drivers/net/ehea/ehea_main.c
index 8b92acb..3beba70 100644
--- a/drivers/net/ehea/ehea_main.c
+++ b/drivers/net/ehea/ehea_main.c
@@ -335,7 +335,7 @@ static struct net_device_stats *ehea_get_stats(struct net_device *dev)

 	memset(stats, 0, sizeof(*stats));

-	cb2 = (void *)get_zeroed_page(GFP_ATOMIC);
+	cb2 = (void *)get_zeroed_page(GFP_KERNEL);
 	if (!cb2) {
 		ehea_error("no mem for cb2");
 		goto out;


From: Dave Hansen on 16 Jul 2010 13:40

On Fri, 2010-07-16 at 11:56 +0200, Eric Dumazet wrote:
>
> > SLUB: Unable to allocate memory on node -1 (gfp=0x20)
> > cache: kmalloc-16384, object size: 16384, buffer size: 16384, default order: 2, min order: 0
> > node 0: slabs: 28, objs: 292, free: 0
> > ip: page allocation failure. order:0, mode:0x8020
> > Call Trace:
> > [...]
> >
> > The mainline 2.6.35-rc5 worked fine.
>
> Maybe you were lucky with 2.6.35-rc5
>
> Anyway ehea should not use GFP_ATOMIC in its ehea_get_stats() method,
> called in process context, but GFP_KERNEL.
>
> Another patch is needed for ehea_refill_rq_def() as well.

You're right that this is abusing GFP_ATOMIC.

But is this just a normal GFP_ATOMIC allocation failure? "SLUB:
Unable to allocate memory on node -1" seems like a somewhat
inappropriate error message for that.

It isn't immediately obvious where the -1 is coming from. Does it truly
mean "allocate from any node" here, or is that a buglet in and of
itself?

-- Dave


From: David Rientjes on 16 Jul 2010 15:20

On Fri, 16 Jul 2010, Dave Hansen wrote:

> > > SLUB: Unable to allocate memory on node -1 (gfp=0x20)
> > > cache: kmalloc-16384, object size: 16384, buffer size: 16384, default order: 2, min order: 0
> > > node 0: slabs: 28, objs: 292, free: 0
> > > ip: page allocation failure. order:0, mode:0x8020
> > > Call Trace:
> > > [...]
> > >
> > > The mainline 2.6.35-rc5 worked fine.
> >
> > Maybe you were lucky with 2.6.35-rc5
> >
> > Anyway ehea should not use GFP_ATOMIC in its ehea_get_stats() method,
> > called in process context, but GFP_KERNEL.
> >
> > Another patch is needed for ehea_refill_rq_def() as well.
>
> You're right that this is abusing GFP_ATOMIC.
>
> But is this just a normal GFP_ATOMIC allocation failure? "SLUB:
> Unable to allocate memory on node -1" seems like a somewhat
> inappropriate error message for that.
>

The slub message is separate and doesn't generate a call trace, even
though it is a (minimum) order-0 GFP_ATOMIC allocation as well. The page
allocation failure is a separate instance that calls the page allocator
directly, not the slab allocator.

> It isn't immediately obvious where the -1 is coming from. Does it truly
> mean "allocate from any node" here, or is that a buglet in and of
> itself?
>

Yes, slub uses -1 to indicate that the allocation need not come from a
specific node.

From: David Miller on 18 Jul 2010 18:00

From: Eric Dumazet <eric.dumazet(a)gmail.com>
Date: Fri, 16 Jul 2010 14:20:42 +0200

> On Friday, 16 July 2010 at 11:56 +0200, Eric Dumazet wrote:
>
>> [PATCH] ehea: ehea_get_stats() should use GFP_KERNEL
>>
>> ehea_get_stats() is called in process context and should use GFP_KERNEL
>> allocation instead of GFP_ATOMIC.
>>
>> Clearing stats at beginning of ehea_get_stats() is racy in case of
>> concurrent stat readers.
>>
>> get_stats() can also use netdev net_device_stats, instead of a private
>> copy.
>>
>> Reported-by: divya <dipraksh(a)linux.vnet.ibm.com>
>> Signed-off-by: Eric Dumazet <eric.dumazet(a)gmail.com>
>> ---
>> drivers/net/ehea/ehea.h | 1 -
>> drivers/net/ehea/ehea_main.c | 6 ++----
>> 2 files changed, 2 insertions(+), 5 deletions(-)
>>
>>
>
> Hmm, net-next-2.6 already contains the following patch:

If people think ehea usage is ubiquitous enough to deserve a backport
of this to net-2.6, fine. But personally I don't think it's worth it.

Can someone close kernel bugzilla 16406, which was created for this bug?
The patch we already have would obviously fix this issue.
