From: divya on
While running the fs_racer test from LTP on a POWER6 box against the latest git (2.6.35-rc3-git4, commit ID 984bc9601f64fd),
I came across the following warning followed by multiple oops.

------------[ cut here ]------------

Badness at kernel/mutex-debug.c:64
NIP: c0000000000be9e8 LR: c0000000000be9cc CTR: 0000000000000000
REGS: c00000010be8f6f0 TRAP: 0700 Not tainted (2.6.35-rc3-git4-autotest)
MSR: 8000000000029032<EE,ME,CE,IR,DR> CR: 24224422 XER: 00000012
TASK = c00000010727cf00[8211] 'fs_racer_file_c' THREAD: c00000010be8bb50 CPU: 2
GPR00: 0000000000000000 c00000010be8f970 c000000000d3d798 0000000000000001
GPR04: c00000010be8fa70 c00000010be8c000 c00000010727d9f8 0000000000000000
GPR08: c0000000043042f0 c0000000016534e8 000000000000017a c000000000c29a1c
GPR12: 0000000028228424 c00000000f600500 c00000010be8fc40 0000000020000000
GPR16: fffffffffffff000 c000000109c73000 c00000010be8fc30 0000000000010442
GPR20: 0000000000000000 0000000000000000 00000000000001b6 c00000010dd12250
GPR24: c00000000017c08c c00000010727cf00 c00000010dd12278 c00000010dd12210
GPR28: 0000000000000001 c00000010be8c000 c000000000ca2008 c00000010be8fa70
NIP [c0000000000be9e8] .mutex_remove_waiter+0xa4/0x130
LR [c0000000000be9cc] .mutex_remove_waiter+0x88/0x130
Call Trace:
[c00000010be8f970] [c00000010be8fa00] 0xc00000010be8fa00 (unreliable)
[c00000010be8fa00] [c00000000064a9f0] .mutex_lock_nested+0x384/0x430
Instruction dump:
e81f0010 e93d0000 7fa04800 41fe0028 482e96e5 60000000 2fa30000 419e0018
e93e8008 80090000 2f800000 409e0008<0fe00000> e93e8000 80090000 2f800000
Unable to handle kernel paging request for unknown fault
Faulting instruction address: 0xc00000000008d0f4
Oops: Kernel access of bad area, sig: 7 [#1]
SMP NR_CPUS=1024 NUMA
Unrecoverable FP Unavailable Exception 800 at c000000000648ed4
pSeries
last sysfs file: /sys/devices/system/cpu/cpu19/cache/index1/shared_cpu_map
Modules linked in: ipv6 fuse loop dm_mod sr_mod cdrom ibmveth sg
sd_mod crc_t10dif ibmvscsic scsi_transport_srp scsi_tgt scsi_mod
NIP: c00000000008d0f4 LR: c00000000008d0d0 CTR: 0000000000000000
REGS: c00000010978f900 TRAP: 0600 Tainted: G W (2.6.35-rc3-git4-autotest)
MSR: 8000000000009032
Unrecoverable FP Unavailable Exception 800 at c000000000648ed4
Unrecoverable FP Unavailable Exception 800 at c000000000648ed4
Unrecoverable FP Unavailable Exception 800 at c000000000648ed4
Unrecoverable FP Unavailable Exception 800 at c000000000648ed4
Unrecoverable FP Unavailable Exception 800 at c000000000648ed4
EE,ME,IR,DR> CR: 24022442 XER: 00000012
DAR: c000000000648f54, DSISR: 0000000040010000
TASK = c0000001096e4900[7353] 'fs_racer_file_s' THREAD: c00000010978c000 CPU: 10
GPR00: 0000000000004000 c00000010978fb80 c000000000d3d798 0000000000000001
GPR04: c00000000083539e c000000001610228 0000000000000000 c0000000054c6880
GPR08: 00000000000006a5 c000000000648f54 0000000000000007 00000000049b0000
GPR12: 0000000000000000 c00000000f601900 00000000ffffffff ffffffffffffffff
GPR16: 000000004b7dc520 0000000000000000 0000000000000000 c00000010978fea0
GPR20: 00000fffcca7e7a0 00000fffcca7e7a0 00000fffabf7dfd0 00000fffabf7dfd0
GPR24: 0000000000000000 0000000001200011 c000000000e1c0a8 c000000000648ed4
GPR28: 0000000000000000 c0000001096e4900 c000000000ca0458 c00000010725d400
NIP [c00000000008d0f4] .copy_process+0x310/0xf40
LR [c00000000008d0d0] .copy_process+0x2ec/0xf40
Call Trace:
[c00000010978fb80] [c00000000008d0d0] .copy_process+0x2ec/0xf40 (unreliable)
[c00000010978fc80] [c00000000008deb4] .do_fork+0x190/0x3cc
[c00000010978fdc0] [c000000000011ef4] .sys_clone+0x58/0x70
[c00000010978fe30] [c0000000000087f0] .ppc_clone+0x8/0xc
Instruction dump:
419e0010 7fe3fb78 480774cd 60000000 801f0014 e93f0008 7800b842 39290080
78004800 60000042 901f0014 38004000<7d6048a8> 7d6b0078 7d6049ad 40c2fff4

Kernel version 2.6.34-rc3-git3 works fine.

Thanks
Divya


From: Michael Neuling on
> While running fs_racer test from LTP on a POWER6 box against latest git (2.6.35-rc3-git4 - commitid 984bc9601f64fd)
> came across the following warning followed by multiple oops.
>
> [oops output snipped]
>
> Kernel version 2.6.34-rc3-git3 works fine.

Should this read 2.6.35-rc3-git3?

If so, there are only about 20 commits in:
5904b3b81d2516..984bc9601f64fd

The likely fs-related candidates are from Christoph and Nick Piggin
(added to CC).

No commits in that range relate to POWER6 or PPC.
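
With roughly 20 commits in the range, a bisection should need only about five build-and-test rounds. The following is a minimal Python sketch of that search, not the actual git bisect implementation. The commit names and the `is_bad` callback are hypothetical stand-ins, and the sketch assumes that every commit after the first bad one is also bad.

```python
import math


def first_bad_commit(commits, is_bad):
    """Binary-search an oldest-to-newest commit list for the first bad commit.

    Assumes the newest commit is bad and that every commit after the first
    bad one is also bad. Returns (commit, number_of_tests_run).
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[hi] is known bad
    tests = 0
    while lo < hi:
        mid = (lo + hi) // 2
        tests += 1
        if is_bad(commits[mid]):
            hi = mid          # first bad commit is at mid or earlier
        else:
            lo = mid + 1      # first bad commit is after mid
    return commits[lo], tests


# Hypothetical stand-ins for the ~20 commits in the range.
commits = [f"c{i:02d}" for i in range(20)]
culprit, tests = first_bad_commit(commits, lambda c: c >= "c13")
print(culprit, tests, "upper bound:", math.ceil(math.log2(len(commits))))
```

In practice each `is_bad` call would mean building that kernel and running fs_racer until it either oopses or survives long enough to be trusted.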

Mikey
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Maciej Rutecki on
On Wednesday, 30 June 2010 at 13:22:27, divya wrote:
> While running fs_racer test from LTP on a POWER6 box against latest
> git(2.6.35-rc3-git4 - commitid 984bc9601f64fd) came across the following
> warning followed by multiple oops.
>

I created a Bugzilla entry for your bug report at
https://bugzilla.kernel.org/show_bug.cgi?id=16324
Please add your address to the CC list there. Thanks!


--
Maciej Rutecki
http://www.maciek.unixy.pl
From: Michael Neuling on
In message <20100701105907.GK22976(a)laptop> you wrote:
> On Thu, Jul 01, 2010 at 03:04:54PM +1000, Michael Neuling wrote:
> > > While running fs_racer test from LTP on a POWER6 box against latest git (2.6.35-rc3-git4 - commitid 984bc9601f64fd)
> > > came across the following warning followed by multiple oops.
> > >
> > > [oops output snipped]
> > >
> > > Kernel version 2.6.34-rc3-git3 works fine.
> >
> > Should this read 2.6.35-rc3-git3?
> >
> > If so, there's only about 20 commits in:
> > 5904b3b81d2516..984bc9601f64fd
> >
> > The likely fs related candidates are from Christoph and Nick Piggin
> > (added to CC)
> >
> > No commits relating to POWER6 or PPC.
>
> Not sure what's happening here. The first warning looks like some mutex
> corruption, but it doesn't have a stack trace (these are 2 separate
> dumps, right? i.e. the copy_process stack doesn't relate to the mutex
> warning?), so I don't have much idea.
>
> If it is reproducible, can you try getting a better stack trace, or
> better yet, even bisecting if there is just a small window?

I can't reproduce the bug here on POWER6 or POWER7.

Divya, can you bisect this?
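
When comparing traces between bisection runs, it may help to pull the symbol+offset frames out of each oops mechanically. Below is a rough Python sketch. The regular expression and the `parse_frame` helper are assumptions made for illustration, not part of any kernel tool, and they only cover the frame format seen in this report.

```python
import re

# Matches "[address] .symbol+0xoffset/0xsize" as printed in the oops above.
FRAME_RE = re.compile(
    r"\[([0-9a-f]{16})\]\s+\.?(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)"
)


def parse_frame(line):
    """Return symbol details for one NIP/LR/call-trace line, or None."""
    m = FRAME_RE.search(line)
    if m is None:
        return None  # e.g. a raw address with no resolved symbol
    address = int(m.group(1), 16)
    offset = int(m.group(3), 16)
    return {
        "symbol": m.group(2),
        "address": address,
        "offset": offset,
        "size": int(m.group(4), 16),
        # Start of the function: instruction address minus offset.
        "function_start": address - offset,
    }


nip = parse_frame("NIP [c0000000000be9e8] .mutex_remove_waiter+0xa4/0x130")
frame = parse_frame("[c00000010978fc80] [c00000000008deb4] .do_fork+0x190/0x3cc")
raw = parse_frame("[c00000010be8f970] [c00000010be8fa00] 0xc00000010be8fa00 (unreliable)")
print(nip["symbol"], hex(nip["function_start"]), frame["symbol"], raw)
```

Frames that carry only a raw address, like the "(unreliable)" entry in the first dump, come back as None, which makes a missing stack trace easy to spot.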

Mikey