2.6.35-rc3: System unresponsive under load [Kernel]

From: Manfred Spraul on 26 Jun 2010 11:50

Hi Luca,

On 06/26/2010 02:52 PM, Luca Tettamanti wrote:
> They don't seem really hung as before, I see two different behaviours:
> * Near the end of the run ab is frozen for a few seconds, but in the
> end all requests are processed; however I see a few "length" errors,
> meaning that the received page does not match the expected content
> (I'm testing a static page):
>
>
That's consistent with what I see:
If I run:
#./semtimedop 100 100&
#./semtimedop 100 100&
#./semtimedop 100 100&
#./semtimedop 100 100&

(i.e., four concurrent instances of the attached test app), then the system
sometimes locks up for 10 to 20 seconds:
The keyboard is unresponsive; not even the Num Lock key is processed
(i.e., the LED no longer changes).
After 10 or 20 seconds, the keyboard reacts again (both to <enter> and
to Num Lock).
The stock Fedora 13 kernel (2.6.33.5) does not exhibit this behavior.
The load average is 300 or so, which is expected.

I have no idea why this happens or how to debug it. My kernel config:
# CONFIG_PREEMPT_NONE is not set
CONFIG_PREEMPT_VOLUNTARY=y
# CONFIG_PREEMPT is not set
> strace on apache shows:
> [pid 3787] restart_syscall(<... resuming interrupted call ...> <unfinished ...>
> [pid 3789] restart_syscall(<... resuming interrupted call ...> <unfinished ...>
> [pid 3788] restart_syscall(<... resuming interrupted call ...> <unfinished ...>
> [pid 3784] restart_syscall(<... resuming interrupted call ...> <unfinished ...>
> [pid 3783] restart_syscall(<... resuming interrupted call ...> <unfinished ...>
> [pid 3782] restart_syscall(<... resuming interrupted call ...> <unfinished ...>
> [pid 3239] restart_syscall(<... resuming interrupted call ...> <unfinished ...>
> [pid 3233] restart_syscall(<... resuming interrupted call ...> <unfinished ...>
> [pid 3238] restart_syscall(<... resuming interrupted call ...> <unfinished ...>
> [pid 3237] restart_syscall(<... resuming interrupted call ...>
>

That can't be semop:
sysv ipc and msg are among the (broken) parts of the kernel that do not
honor SA_RESTART.

--
Manfred

From: Luca Tettamanti on 26 Jun 2010 12:50

On Sat, Jun 26, 2010 at 5:47 PM, Manfred Spraul
<manfred(a)colorfullife.com> wrote:
> Hi Luca,
>
> On 06/26/2010 02:52 PM, Luca Tettamanti wrote:
>>
>> They don't seem really hung as before, I see two different behaviours:
>> * Near the end of the run ab is frozen for a few seconds, but in the
>> end all requests are processed; however I see a few "length" errors,
>> meaning that the received page does not match the expected content
>> (I'm testing a static page):
>>
>>
>
> That's consistent with what I see:
> If I run:
> #./semtimedop 100 100&
> #./semtimedop 100 100&
> #./semtimedop 100 100&
> #./semtimedop 100 100&
>
> (i.e.: 4 times the attached test app concurrently), then the system
> sometimes locks up for 10..20 seconds:
> The keyboard is unresponsive, not even the numlock key is processed (i.e.:
> the LED does not change anymore).
> After 10 or 20 seconds, the keyboard reacts again (both to <enter> and to
> Num Lock)
> The stock Fedora 13 kernel (2.6.33.5) does not exhibit this behavior
> The load average is 300 or so, that's expected.

Confirmed here: your test program freezes the system for a while under
2.6.35-rc3, while vanilla 2.6.34 copes fine.
sysrq-t was responsive during the freeze, so I took a snapshot during
it, file is attached.

> I have no idea why and how to debug the behavior.
> # CONFIG_PREEMPT_NONE is not set
> CONFIG_PREEMPT_VOLUNTARY=y
> # CONFIG_PREEMPT is not set

My kernel has PREEMPT enabled.

Luca

From: Manfred Spraul on 30 Jun 2010 15:10

Hi Luca,

On 06/26/2010 06:47 PM, Luca Tettamanti wrote:
>
> Confirmed here: your test program freezes the system for a while under
> 2.6.35-rc3, while vanilla 2.6.34 copes fine.
> sysrq-t was responsive during the freeze, so I took a snapshot during
> it, file is attached.
>
>
Ignore my test program:
If the master thread is interrupted in the right place, then there are
400 runnable tasks in the runqueue.
It seems that the scheduler just processes these 400 tasks first, ahead
of the keventd/ksoftirqd threads that are necessary for keyboard handling.

Attached is a new idea, could you try it with your httpd test?

Perhaps the race is actually in user space:
The exit path of semtimedop() does not contain an explicit memory barrier.
For the kernel, it does not matter: it merely reads one integer value.
If sysret is not a memory barrier either, then user space might observe
stale data.
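
To make the hypothesis concrete, here is a minimal sketch of the user-space pattern that would be affected: one process writes data and then raises a semaphore, the other waits in semtimedop() and then reads the data. The function name, the shared int and the explicit C11 fences are assumptions for illustration; they are not taken from the attached patch or from httpd, and whether such fences are actually required is exactly the open question here.

```c
#define _GNU_SOURCE                    /* for semtimedop() and MAP_ANONYMOUS */
#include <stdatomic.h>
#include <sys/ipc.h>
#include <sys/mman.h>
#include <sys/sem.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* On Linux the caller has to define union semun itself. */
union semun {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/*
 * Hypothetical helper: a child stores `value` in shared memory and wakes
 * the parent through a SysV semaphore. Returns what the parent read after
 * semtimedop() returned, or -1 on any failure or timeout.
 */
int handoff_through_semaphore(int value)
{
    int *shared = mmap(NULL, sizeof(int), PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (shared == MAP_FAILED)
        return -1;
    *shared = 0;

    int semid = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
    if (semid == -1) {
        munmap(shared, sizeof(int));
        return -1;
    }
    union semun arg;
    arg.val = 0;                       /* parent must block until the child posts */
    semctl(semid, 0, SETVAL, arg);

    pid_t pid = fork();
    if (pid == -1) {
        semctl(semid, 0, IPC_RMID);
        munmap(shared, sizeof(int));
        return -1;
    }
    if (pid == 0) {
        /* Writer: publish the data, then wake the waiter. */
        *shared = value;
        atomic_thread_fence(memory_order_release);  /* assumed explicit barrier */
        struct sembuf up = { .sem_num = 0, .sem_op = 1, .sem_flg = 0 };
        semop(semid, &up, 1);
        _exit(0);
    }

    /* Reader: sleep until woken (at most 5 s), then read the data. */
    struct sembuf down = { .sem_num = 0, .sem_op = -1, .sem_flg = 0 };
    struct timespec timeout = { .tv_sec = 5, .tv_nsec = 0 };
    int rc = semtimedop(semid, &down, 1, &timeout);
    atomic_thread_fence(memory_order_acquire);      /* assumed explicit barrier */
    int seen = (rc == 0) ? *shared : -1;

    waitpid(pid, NULL, 0);
    semctl(semid, 0, IPC_RMID);
    munmap(shared, sizeof(int));
    return seen;
}
```

If the return path of semtimedop() provided no ordering at all, the read of *shared in the parent is the kind of access that could in principle see stale data without the acquire fence.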

Which cpu do you have? I was unable to show any misbehavior on a Phenom X4.

--
Manfred

From: Luca Tettamanti on 6 Jul 2010 12:10

On Wed, Jun 30, 2010 at 9:07 PM, Manfred Spraul
<manfred(a)colorfullife.com> wrote:
> Attached is a new idea, could you try it with your httpd test?

Will test ASAP.

> Perhaps the race is actually a race in the user space:
> The exit path of semtimedop() does not contain an explicit memory barrier.
> For the kernel, it does not matter: It merely reads one integer value.
> If sysret is also no memory barrier, then user space might observe stale
> data.
>
> Which cpu do you have? I was unable to show any misbehavior on a Phenom X4.

Intel(R) Core(TM)2 Duo CPU T7500 @ 2.20GHz

I also have a Phenom X4, but I'm currently waiting for a replacement PSU...

Luca
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: Luca Tettamanti on 7 Jul 2010 15:40

On Wed, Jun 30, 2010 at 9:07 PM, Manfred Spraul
<manfred(a)colorfullife.com> wrote:
> Attached is a new idea, could you try it with your httpd test?

With kernel 2.6.35-rc3 your patch does not make any difference.
2.6.35-rc4, however, works fine with either one of your patches (yes,
I've checked multiple times).

Luca
