From: Stephen Rothwell on
Hi Rusty,

Today's linux-next produced these messages during the boot of a Power7
box:

Starting udev: udevd[2739]: udev: missing sysfs features; please update the kernel or disable the kernel's CONFIG_SYSFS_DEPRECATED option; udev may fail to work correctly

Unable to handle kernel paging request for data at address 0x00330000
Faulting instruction address: 0xc0000000000a2f50
Oops: Kernel access of bad area, sig: 11 [#1]
SMP NR_CPUS=128 NUMA pSeries
last sysfs file: /sys/block/sda/uevent
Modules linked in:
NIP: c0000000000a2f50 LR: c0000000000a2f18 CTR: 0000000000000000
REGS: c00000000228f9a0 TRAP: 0300 Not tainted (2.6.35-rc1-autokern1-next-20100603)
MSR: 8000000000009032 <EE,ME,IR,DR> CR: 44422448 XER: 20000001
DAR: 0000000000330000, DSISR: 0000000042000000
TASK = c0000000fbe1ae40[2855] 'modprobe' THREAD: c00000000228c000 CPU: 12
GPR00: 0000000000000001 c00000000228fc20 c00000000099f620 d0000000010b7040
GPR04: d0000000010a4dac 0000000000000005 0000000000000000 0000000000000223
GPR08: 6c6f635f72656769 0000000000000000 0000000000000020 0000000000330000
GPR12: d9fb876c951b89f3 c00000000f331800 0000000000012510 0000000000010320
GPR16: 000000000000002a 0000000000000012 0000000000000000 c0000000fb90cc00
GPR20: 0000000000000010 0000000000000021 0000000000000029 d000000001098b68
GPR24: d000000001098ba8 c000000002174800 d0000000010b7040 d000000001097fb9
GPR28: d000000001088000 d000000001098128 c00000000092d120 d0000000010b7040
NIP [c0000000000a2f50] .load_module+0x990/0x1458
LR [c0000000000a2f18] .load_module+0x958/0x1458
Call Trace:
[c00000000228fc20] [c0000000000a2f18] .load_module+0x958/0x1458 (unreliable)
[c00000000228fd90] [c0000000000a3a78] .SyS_init_module+0x60/0x244
[c00000000228fe30] [c00000000000852c] syscall_exit+0x0/0x40
Instruction dump:
7c7a1b78 7fa30040 41dd09cc 39230210 38030220 f9230218 f8030228 f9230210
f8030220 e9230240 38000001 e96d0040 <7c09592e> e80d01b0 f8030230 38800004
---[ end trace 85cf1caaf6abfc45 ]---
udevd-work[2749]: '/sbin/modprobe -b of:NlheaT<NULL>CIBM,lhea' unexpected exit with status 0x000b

This was followed by several other similar OOPSes also in load_module.

I assume that this may have something to do with the module loading
fix/cleanup that is in progress ...

--
Cheers,
Stephen Rothwell sfr(a)canb.auug.org.au
http://www.canb.auug.org.au/~sfr/
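
As an aside on the oops above, the word marked <7c09592e> in the instruction dump can be decoded by hand and checked against the register dump. The sketch below is only an illustration: it assumes the word is a PowerPC X-form instruction (primary opcode 31 with extended opcode 151 is stwx, store word indexed) and copies the three register values it needs from the GPR lines. The helper names are made up for this example and are not from any kernel tool. It says nothing about which source line is at fault.

```python
# Values copied from the oops: GPR00, GPR09 and GPR11, plus DAR.
GPR = {0: 0x1, 9: 0x0, 11: 0x330000}
DAR = 0x330000
FAULTING_WORD = 0x7C09592E  # the word shown as <7c09592e> in the dump


def decode_x_form(word):
    """Split a 32-bit PowerPC X-form instruction word into its fields."""
    return {
        "opcode": (word >> 26) & 0x3F,  # primary opcode, top 6 bits
        "rs": (word >> 21) & 0x1F,      # source register
        "ra": (word >> 16) & 0x1F,      # base register (0 means literal 0)
        "rb": (word >> 11) & 0x1F,      # index register
        "xo": (word >> 1) & 0x3FF,      # extended opcode
    }


def effective_address(fields, gpr):
    """Compute (RA|0) + RB as an indexed store would, modulo 2**64."""
    base = 0 if fields["ra"] == 0 else gpr[fields["ra"]]
    return (base + gpr[fields["rb"]]) & 0xFFFFFFFFFFFFFFFF


fields = decode_x_form(FAULTING_WORD)
ea = effective_address(fields, GPR)
print(fields)
print(hex(ea), "matches DAR" if ea == DAR else "does not match DAR")
```

Under those assumptions the word decodes as stwx r0,r9,r11, and r9 + r11 equals the DAR value reported in the oops, so the decode is at least consistent with the fault address.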
From: Linus Torvalds on


On Thu, 3 Jun 2010, Stephen Rothwell wrote:
>
> Unable to handle kernel paging request for data at address 0x00330000
> Faulting instruction address: 0xc0000000000a2f50
> Oops: Kernel access of bad area, sig: 11 [#1]
> NIP [c0000000000a2f50] .load_module+0x990/0x1458
> LR [c0000000000a2f18] .load_module+0x958/0x1458
> Call Trace:
> [c00000000228fc20] [c0000000000a2f18] .load_module+0x958/0x1458 (unreliable)
> [c00000000228fd90] [c0000000000a3a78] .SyS_init_module+0x60/0x244
> [c00000000228fe30] [c00000000000852c] syscall_exit+0x0/0x40
> Instruction dump:
> 7c7a1b78 7fa30040 41dd09cc 39230210 38030220 f9230218 f8030228 f9230210
> f8030220 e9230240 38000001 e96d0040 <7c09592e> e80d01b0 f8030230 38800004
> ---[ end trace 85cf1caaf6abfc45 ]---
> udevd-work[2749]: '/sbin/modprobe -b of:NlheaT<NULL>CIBM,lhea' unexpected exit with status 0x000b
>
> This was followed by several other similar OOPSes also in load_module.
>
> I assume that this may have something to do with the module loading
> fix/cleanup that is in progress ...

That's a fairly safe assumption.

I can't read PPC oopses in my sleep the way I can do x86, and in
particular, I can't pinpoint that to the source code by just decoding the
instructions and matching them against what I have. Do you have that
binary, and can you do a 'gdb vmlinux' on it, and then have gdb tell you
where in load_module "load_module+0x990" and "load_module+0x958" are?

Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
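
For anyone wanting to do the lookup Linus asks for, here is a minimal sketch that turns the NIP and LR lines of the oops into gdb commands. It is an illustration, not an established tool: the regular expression and function names are invented here, and it assumes a vmlinux from exactly this build, compiled with debug info. It uses the absolute addresses from the oops rather than the symbol+offset form, since the two are equivalent and the absolute form avoids any question about how the symbol name resolves.

```python
import re

# The two lines of interest, copied from the oops.
OOPS_LINES = [
    "NIP [c0000000000a2f50] .load_module+0x990/0x1458",
    "LR [c0000000000a2f18] .load_module+0x958/0x1458",
]

# [address] optional-dot symbol +0xoffset /0xsize
PATTERN = re.compile(
    r"\[([0-9a-f]+)\]\s+\.?(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)"
)


def parse(line):
    """Extract address, symbol, offset and function size from one oops line."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError("not a symbol+offset line: " + line)
    addr, symbol, offset, size = match.groups()
    return {
        "addr": int(addr, 16),
        "symbol": symbol,
        "offset": int(offset, 16),
        "size": int(size, 16),
    }


entries = [parse(line) for line in OOPS_LINES]

# Sanity check: both addresses should imply the same function start.
starts = {entry["addr"] - entry["offset"] for entry in entries}

# Commands to type at the (gdb) prompt after running 'gdb vmlinux'.
commands = [f"info line *0x{entry['addr']:x}" for entry in entries]

for start in starts:
    print("function start:", hex(start))
for command in commands:
    print(command)
```

Both lines imply the same start address for load_module, which is a quick check that the offsets were read correctly. In gdb, 'list *0x...' with the same addresses would show the surrounding source instead of just the line number.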
From: Stephen Rothwell on
Hi Linus,

On Thu, 3 Jun 2010 07:51:35 -0700 (PDT) Linus Torvalds <torvalds(a)linux-foundation.org> wrote:
>
> I can't read PPC oopses in my sleep the way I can do x86, and in
> particular, I can't pinpoint that to the source code by just decoding the
> instructions and matching them against what I have. Do you have that
> binary, and can you do a 'gdb vmlinux' on it, and then have gdb tell you
> where in load_module "load_module+0x990" and "load_module+0x958" are?

I was hoping to avoid that :-)

Rusty seems to have already figured out an obvious cause, so I will see
if that fixes it tomorrow (later today).

--
Cheers,
Stephen Rothwell sfr(a)canb.auug.org.au
http://www.canb.auug.org.au/~sfr/