From: Chris Friesen on
Hi,

I've backported about 60 kmemleak patches to a modified 2.6.27 in an
attempt to track down a memory leak of about 4MB/hr that we're seeing on
an x86 blade.

For the most part the backport was straightforward, but now booting up
the kernel I see this:

Cannot insert 0xc312b880 into the object search tree (already existing)
Pid: 0, comm: swapper Not tainted 2.6.27-pne #23
[<c048babd>] create_object+0x21d/0x250
[<c014172b>] kmemleak_init+0x12a/0x1cf
[<c012980d>] start_kernel+0x225/0x338
[<c01293aa>] ? unknown_bootoption+0x0/0x1ed
[<c0129093>] _sinittext+0x93/0x99
=======================
Kernel memory leak detector disabled
Object 0xc312b880 (size 340):
comm "swapper", pid 0, jiffies 4294667296
min_count = 0
count = 0
flags = 0x1
checksum = 0
backtrace:
[<c01415f9>] log_early+0x90/0x98
[<c08132df>] kmemleak_alloc+0x5f/0x70
[<c013f1c9>] alloc_bootmem_core+0x2c3/0x2c8
[<c013f233>] ___alloc_bootmem_nopanic+0x65/0x91
[<c013f2c8>] ___alloc_bootmem+0x16/0x3c
[<c013f39e>] __alloc_bootmem+0x12/0x14
[<c0147c43>] con_init+0x8a/0x22d
[<c01474c4>] console_init+0x19/0x27
[<c01297b6>] start_kernel+0x1ce/0x338
[<c0129093>] _sinittext+0x93/0x99
[<ffffffff>] 0xffffffff


Have you got any suggestions as to how I should deal with this?

Thanks,
Chris
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Chris Friesen on
I realized that I could disable CONFIG_VT, and the previous issue went
away. The system got further into the boot and started bringing up
userspace, but then gave the error below.


BUG: unable to handle kernel paging request at 83234000
IP: [<c048ad32>] scan_block+0xb2/0x100
Oops: 0000 [#1] SMP
Modules linked in: ipmi_serial_terminal_mode ipmi_serial ipmi_msghandler
ipmi_devintf

Pid: 506, comm: kmemleak Not tainted (2.6.27-pne #25)
EIP: 0060:[<c048ad32>] EFLAGS: 00010293 CPU: 1
EIP is at scan_block+0xb2/0x100
EAX: 00000001 EBX: 00000000 ECX: 00000000 EDX: 8323779c
ESI: 83234000 EDI: 83237799 EBP: f6f8bf90 ESP: f6f8bf80
DS: 0068 ES: 007b FS: 00d8 GS: 0000 SS: 0068
Process kmemleak (pid: 506, ti=f6f8a000 task=f799da40 task.ti=f6f8a000)
Stack: 00000000 00000000 f60a5454 00000000 f6f8bfc0 c048affd 00000001
c0435330
00000206 00000001 00000000 00000002 f6f8bfb8 000927c0 c048b6c0
00000000
f6f8bfd0 c048b70e c08a66b0 00000000 f6f8bfe0 c043f6cc c043f690
00000000
Call Trace:
[<c048affd>] ? kmemleak_scan+0x11d/0x3c0
[<c0435330>] ? process_timeout+0x0/0x40
[<c048b6c0>] ? kmemleak_scan_thread+0x0/0xc0
[<c048b70e>] ? kmemleak_scan_thread+0x4e/0xc0
[<c043f6cc>] ? kthread+0x3c/0x70
[<c043f690>] ? kthread+0x0/0x70
[<c0404737>] ? kernel_thread_helper+0x7/0x10
=======================
Code: 15 50 7d 92 c0 c7 43 10 4c 7d 92 c0 89 43 14 89 10 89 ca 89 d8 e8
af c1 38 00 8d b4 26 00 00 00 00 83 c6 04 39 f7 76 21 8b 45 08 <8b> 1e 85
c0 0f 84 6c ff ff ff e8 ff aa 38 00 e8 fa fe ff ff 85



Looking at the function offset it appears that kmemleak_scan() is
calling scan_block() for the per-cpu section.

I'll keep digging, but any suggestions would be appreciated.

Chris


From: Chris Friesen on
After the previous message I commented out the per-cpu scan but now I'm
seeing the following:


root(a)10:/root> BUG: unable to handle kernel paging request at 8152a79c
IP: [<c048ad52>] scan_block+0xd2/0x120
Kcore timestamp : 1263971633.595000
Kcore HighResolution timestamp : 238BAE499D2
Oops: 0000 [#1] SMP
Modules linked in: kmemleak_test pmemfs

Pid: 889, comm: bash Not tainted (2.6.27-pne #32)
EIP: 0060:[<c048ad52>] EFLAGS: 00010082 CPU: 2
EIP is at scan_block+0xd2/0x120
EAX: 00000000 EBX: 8152a79c ECX: 00000000 EDX: 8e7d441b
ESI: 8152a79c EDI: 8152a79d EBP: de893ea8 ESP: de893e8c
DS: 0068 ES: 0068 FS: 00d8 GS: 0033 SS: 0068
Process bash (pid: 889, ti=de892000 task=df947200 task.ti=de892000)
Stack: c08a758d 8152a79c 8152a7a0 df9889a0 8152b79c 00000246 df9889a0
de893ed0
c048aee8 00000000 8152a79c 8152a7a0 8152a7a0 8152a79c df947200
c0919220
00020000 de893f00 c048b074 00000000 00000000 00000202 00000001
df940068
Call Trace:
[<c048aee8>] ? scan_gray_list+0x148/0x190
[<c048b074>] ? kmemleak_scan+0x144/0x390
[<c048b5aa>] ? kmemleak_write+0x1aa/0x2e0
[<c048a62b>] ? put_object+0x2b/0x40
[<c048e40c>] ? vfs_write+0x9c/0x140
[<c048b400>] ? kmemleak_write+0x0/0x2e0
[<c048e5c3>] ? sys_write+0x43/0xb0
[<c0403541>] ? system_call_done+0x0/0x4


The code is failing at the "pointer = *ptr;" line in scan_block().

Based on some added instrumentation, we're scanning an object from
8152a79c to 8152a7a0, which in turn means scanning a block from 8152a79c
to 8152a7a0, so the fault is on the first address in the block.
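
To make the failing access concrete, here is a small userspace model of
a word-by-word scan loop, written in Python. It is only a sketch of the
idea, not the kernel's scan_block(): the `memory` dict, the `PageFault`
exception and the `objects` list are hypothetical stand-ins for mapped
memory, the oops, and kmemleak's object tree.

```python
WORD = 4  # pointer size on 32-bit x86


class PageFault(Exception):
    """Hypothetical stand-in for the kernel paging-request oops."""


def scan_block(memory, start, end, objects):
    """Walk [start, end) one word at a time.

    memory  -- dict mapping word addresses to values (mapped words only)
    objects -- list of (obj_start, obj_end) ranges of tracked objects
    Returns (location, obj_start) for every word that points into an object.
    """
    found = []
    addr = start
    while addr + WORD <= end:
        if addr not in memory:
            # Models the failing "pointer = *ptr;" read.
            raise PageFault(hex(addr))
        pointer = memory[addr]
        for obj_start, obj_end in objects:
            if obj_start <= pointer < obj_end:
                found.append((addr, obj_start))
        addr += WORD
    return found


if __name__ == "__main__":
    try:
        # Nothing is mapped in this model, so the first read faults.
        scan_block({}, 0x8152a79c, 0x8152a7a0, [])
    except PageFault as fault:
        print("fault at", fault)
```

With a 4-byte block such as 8152a79c to 8152a7a0 there is exactly one
word to read, so an unmapped start address faults on the first
iteration.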

The odd thing is that as far as I can tell this should be a valid
address. This is a 32-bit x86 kernel, with CONFIG_FLATMEM=y, and the
memory map is:

virtual kernel memory layout:
fixmap : 0xfff81000 - 0xfffff000 ( 504 kB)
pkmap : 0xffa00000 - 0xffc00000 (2048 kB)
vmalloc : 0xe0800000 - 0xff9fe000 ( 497 MB)
lowmem : 0xc0000000 - 0xe0000000 ( 512 MB)
.init : 0xc0106000 - 0xc0400000 (3048 kB)
.data : 0xc0919000 - 0xc095634c ( 244 kB)
.text : 0xc0400000 - 0xc081bcdc (4207 kB)


If anyone has any suggestions, I'd appreciate it.

Thanks,
Chris

From: Catalin Marinas on
Hi,

First of all, apart from backporting the kmemleak patches I would
suggest you do a kernel grep for kmemleak_* function calls as there may
be explicit cases where allocated memory blocks are ignored from
scanning (like the AGP aperture which is unmapped from the standard
kernel linear mapping).

On Wed, 2010-01-20 at 17:51 +0000, Chris Friesen wrote:
> root(a)10:/root> BUG: unable to handle kernel paging request at 8152a79c
> IP: [<c048ad52>] scan_block+0xd2/0x120
[...]
> The code is failing at the "pointer = *ptr;" line in scan_block().

This happens when kmemleak was told about a memory block being allocated
but there isn't any valid virtual address for that location.

I can't really tell where it came from but I would suggest that you
disable the kmemleak automatic scanning and do an
"echo dump=0x8152a79c > /sys/kernel/debug/kmemleak" and you should get
the information that kmemleak has about that location.
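
The two steps suggested here can be scripted. The following is a minimal
sketch, assuming the backported kmemleak accepts the "scan=off" and
"dump=<addr>" commands through its debugfs file as mainline kmemleak
does. The helper names are hypothetical, and it has to run as root with
debugfs mounted.

```python
# Default location of the kmemleak control file when debugfs is mounted
# at /sys/kernel/debug.
KMEMLEAK = "/sys/kernel/debug/kmemleak"


def kmemleak_command(command, path=KMEMLEAK):
    """Write a single command string to the kmemleak control file."""
    with open(path, "w") as control:
        control.write(command)


def dump_object(addr, path=KMEMLEAK):
    """Stop the automatic scan thread, then ask for the object at addr."""
    kmemleak_command("scan=off", path)
    kmemleak_command("dump=0x%x" % addr, path)


# Example (as root):
#     dump_object(0x8152a79c)
```

The object description, if kmemleak tracks that address, should then
show up in the kernel log rather than on the terminal.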

Alternatively, until you get this to work, just modify kmemleak to dump
information it has about every object it scans via dump_object_info()
called from scan_object(). The amount of information is quite large but
at least you should see the last block that it fails to scan and maybe
add a kmemleak_ignore() on the block allocation site.

--
Catalin

From: Chris Friesen on
On 01/20/2010 04:58 PM, Catalin Marinas wrote:
> Hi,
>
> First of all, apart from backporting the kmemleak patches I would
> suggest you do a kernel grep for kmemleak_* function calls as there may
> be explicit cases where allocated memory blocks are ignored from
> scanning (like the AGP aperture which is unmapped from the standard
> kernel linear mapping).

Thanks for the suggestion. I'll do that.

> Alternatively, until you get this to work, just modify kmemleak to dump
> information it has about every object it scans via dump_object_info()
> called from scan_object(). The amount of information is quite large but
> at least you should see the last block that it fails to scan and maybe
> add a kmemleak_ignore() on the block allocation site.

I finally got access to a 64-bit lab system and there the backport seems
to be working just fine. Since that's where I actually need it, life is
now much better. :)

If I run into more issues I'll try using your ideas to work around the
issue.

Thanks,

Chris