From: Andy Whitcroft on
> Can you see what they're doing?
>
>        watch -n0 cat /sys/kernel/debug/slow_work/runqueue

Turned on the debugging and applied the patch from Ted, and when
things are bad I see constant cycling of all four threads in the
output, similar to the below; note that only one thread shows at a
time:

Every 0.1s: cat /sys/kernel/debug/slow_work/runqueue Wed Jun 16 12:34:52 2010

THR PID   ITEM ADDR        FL MARK  DESC
=== ===== ================ == ===== ==========
  0   897 ffff88012bb07510 12  20ms DRM_CRTC_HELPER: i915(a)pci:0000:00:02.0

When things are working well I see the same output appearing for one
update, about every 10s.

-apw
From: Andy Whitcroft on
On Wed, Jun 16, 2010 at 12:37 PM, Andy Whitcroft <apw(a)canonical.com> wrote:
>> Can you see what they're doing?
>>
>>        watch -n0 cat /sys/kernel/debug/slow_work/runqueue
>
> Turned on the debugging and applied the patch from Ted, and when
> things are bad I see constant cycling of all four threads in the
> output, similar to the below; note that only one thread shows at a
> time:
>
> Every 0.1s: cat /sys/kernel/debug/slow_work/runqueue Wed Jun 16 12:34:52 2010
>
> THR PID   ITEM ADDR        FL MARK  DESC
> === ===== ================ == ===== ==========
>   0   897 ffff88012bb07510 12  20ms DRM_CRTC_HELPER: i915(a)pci:0000:00:02.0
>
> When things are working well I see the same output appearing for one
> update, about every 10s.

OK, I've added some additional debug, and these appear to be being
thrown for a loop by the HPD interrupt, which is firing continuously.
Here is a sample from some 4 minutes after boot:

[ 284.862670] APW: drm_helper_hpd_irq_event: cancel/submit
[ 284.882803] APW: drm_helper_hpd_irq_event: cancel/submit
[ 284.902691] APW: drm_helper_hpd_irq_event: cancel/submit
[ 284.922814] APW: drm_helper_hpd_irq_event: cancel/submit
[ 284.942620] APW: drm_helper_hpd_irq_event: cancel/submit
[ 284.962707] APW: drm_helper_hpd_irq_event: cancel/submit
[ 284.982937] APW: drm_helper_hpd_irq_event: cancel/submit
[ 285.004181] APW: drm_helper_hpd_irq_event: cancel/submit
[ 285.022622] APW: drm_helper_hpd_irq_event: cancel/submit
[ 285.042569] APW: drm_helper_hpd_irq_event: cancel/submit
[ 285.062593] APW: drm_helper_hpd_irq_event: cancel/submit
[ 285.082683] APW: drm_helper_hpd_irq_event: cancel/submit

Later, when things calm down, we still see them, but much less often:

[ 379.038239] APW: drm_helper_hpd_irq_event: cancel/submit
[ 379.044904] APW: drm_helper_hpd_irq_event: cancel/submit
[ 382.438106] APW: drm_helper_hpd_irq_event: cancel/submit
[ 382.444754] APW: drm_helper_hpd_irq_event: cancel/submit
[ 385.838072] APW: drm_helper_hpd_irq_event: cancel/submit
[ 385.844735] APW: drm_helper_hpd_irq_event: cancel/submit
[ 389.237876] APW: drm_helper_hpd_irq_event: cancel/submit
[ 389.244523] APW: drm_helper_hpd_irq_event: cancel/submit
[ 392.638048] APW: drm_helper_hpd_irq_event: cancel/submit
[ 392.644740] APW: drm_helper_hpd_irq_event: cancel/submit
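
For context, the additional debug above appears to be nothing more than a
printk in drm_helper_hpd_irq_event() in drivers/gpu/drm/drm_crtc_helper.c,
which is why the messages name that function. A minimal sketch of the
instrumented helper, assuming the 2.6.35-era version that cancels and
immediately re-queues the output-poll slow work (the output_poll_slow_work
field and the delayed_slow_work_* call names are recalled from memory, not
copied from the tree), looks roughly like this:

/*
 * Sketch only: approximately the 2.6.35-era drm_helper_hpd_irq_event()
 * plus the debug printk that produced the "cancel/submit" lines above.
 * Excerpt-style fragment, not a standalone file.
 */
void drm_helper_hpd_irq_event(struct drm_device *dev)
{
	printk(KERN_INFO "APW: %s: cancel/submit\n", __func__);

	/* cancel any pending poll and schedule an immediate one */
	delayed_slow_work_cancel(&dev->mode_config.output_poll_slow_work);
	delayed_slow_work_enqueue(&dev->mode_config.output_poll_slow_work, 0);
}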

-apw
From: Nick Bowler on
On 12:37 Wed 16 Jun, Andy Whitcroft wrote:
> > Can you see what they're doing?
> >
> >        watch -n0 cat /sys/kernel/debug/slow_work/runqueue
>
> Turned on the debugging and applied the patch from Ted, and when
> things are bad I see constant cycling of all four threads in the
> output, similar to the below; note that only one thread shows at a
> time:
>
> Every 0.1s: cat /sys/kernel/debug/slow_work/runqueue Wed Jun 16 12:34:52 2010
>
> THR PID   ITEM ADDR        FL MARK  DESC
> === ===== ================ == ===== ==========
>   0   897 ffff88012bb07510 12  20ms DRM_CRTC_HELPER: i915(a)pci:0000:00:02.0

Indeed, this looks very similar to mine, except my DESC field is blank
for some reason. The FL value is sometimes 12, sometimes 2.

When things are bad, all four of the kslowd threads are pegged at 16%
CPU usage. On my T500, they _never_ calm down once they're pegged.
However, things seem to be working normally at boot. It seems that there
is some initial trigger which pegs the threads, and the cursor lag and
other problems are just fallout from that. The system is essentially
unusable after this point.

The threads sometimes get pegged immediately after booting the system;
other times it can take an hour or more before any problems show up.
Unfortunately, this is making bisection essentially impossible.

This seems to have been introduced somewhere between 2.6.35-rc1 and
-rc2.

--
Nick Bowler, Elliptic Technologies (http://www.elliptictech.com/)
From: Andy Whitcroft on
On Wed, Jun 16, 2010 at 3:36 PM, Nick Bowler <nbowler(a)elliptictech.com> wrote:
> On 12:37 Wed 16 Jun, Andy Whitcroft wrote:
>> > Can you see what they're doing?
>> >
>> >        watch -n0 cat /sys/kernel/debug/slow_work/runqueue
>>
>> Turned on the debugging and applied the patch from Ted, and when
>> things are bad I see constant cycling of all four threads in the
>> output, similar to the below; note that only one thread shows at a
>> time:
>>
>> Every 0.1s: cat /sys/kernel/debug/slow_work/runqueue Wed Jun 16 12:34:52 2010
>>
>> THR PID   ITEM ADDR        FL MARK  DESC
>> === ===== ================ == ===== ==========
>>   0   897 ffff88012bb07510 12  20ms DRM_CRTC_HELPER: i915(a)pci:0000:00:02.0
>
> Indeed, this looks very similar to mine, except my DESC field is blank
> for some reason. The FL value is sometimes 12, sometimes 2.

DESC is only filled in if you have Ted's patch applied (earlier in this thread).

> When things are bad, all four of the kslowd threads are pegged at 16%
> CPU usage. On my T500, they _never_ calm down once they're pegged.
> However, things seem to be working normally at boot. It seems that there
> is some initial trigger which pegs the threads, and the cursor lag and
> other problems are just fallout from that. The system is essentially
> unusable after this point.
>
> The threads sometimes get pegged immediately after booting the system;
> other times it can take an hour or more before any problems show up.
> Unfortunately, this is making bisection essentially impossible.
>
> This seems to have been introduced somewhere between 2.6.35-rc1 and -rc2.

I suspect this is introduced by the commit which pulled polling of DRM
connectors into the core.
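
For reference, the core connector polling referred to here is a slow-work
item that walks the connector list, re-detects anything marked for
polling, and re-arms itself roughly every 10 seconds, which matches the
~10s cadence seen when things are healthy. A much-simplified sketch, with
the output_poll_slow_work field, the DRM_OUTPUT_POLL_PERIOD value and the
detect() handling recalled from the 2.6.35-era drm_crtc_helper.c and
therefore only approximate:

/* Simplified sketch of the 2.6.35-era core output polling; field, flag
 * and period names are from memory and may not match the tree exactly. */
static void output_poll_execute(struct slow_work *work)
{
	struct delayed_slow_work *dwork =
		container_of(work, struct delayed_slow_work, work);
	struct drm_device *dev =
		container_of(dwork, struct drm_device,
			     mode_config.output_poll_slow_work);
	struct drm_connector *connector;
	bool changed = false, repoll = false;

	list_for_each_entry(connector,
			    &dev->mode_config.connector_list, head) {
		enum drm_connector_status old_status = connector->status;

		/* only re-detect connectors that asked to be polled */
		if (!connector->polled)
			continue;
		repoll = true;

		connector->status = connector->funcs->detect(connector);
		if (connector->status != old_status)
			changed = true;
	}

	if (changed && dev->mode_config.funcs->output_poll_changed)
		dev->mode_config.funcs->output_poll_changed(dev);

	/* normally re-armed with a ~10s delay; drm_helper_hpd_irq_event()
	 * re-queues it immediately, hence the "cancel/submit" churn above */
	if (repoll)
		delayed_slow_work_enqueue(dwork, DRM_OUTPUT_POLL_PERIOD);
}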

-apw
From: Nick Bowler on
On 15:55 Wed 16 Jun, Andy Whitcroft wrote:
> On Wed, Jun 16, 2010 at 3:36 PM, Nick Bowler <nbowler(a)elliptictech.com> wrote:
> > The threads sometimes get pegged immediately after booting the system,
> > sometimes it can last an hour or more before showing any problems.
> > Unfortunately, this is making bisection essentially impossible.
> >
> > This seems to have been introduced somewhere between 2.6.35-rc1 and -rc2.
>
> I suspect this is introduced by the commit which pulled polling of DRM
> connectors into the core.

OK. From this hint I looked at the log between -rc1 and -rc2 and found
commit fbf81762e385d ("drm/kms: disable/enable poll around switcheroo
on/off").

Reverting this commit brings everything back to normal.
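
Going only by the commit subject quoted above rather than the actual
diff, the change in question wraps the switcheroo GPU on/off transitions
in calls of roughly the following shape; the callback below is a
stand-in for a driver's vga_switcheroo set_state hook, not the real
i915/nouveau/radeon code:

/*
 * Sketch based only on the subject "drm/kms: disable/enable poll around
 * switcheroo on/off"; names other than the two drm_kms_helper_poll_*
 * calls are placeholders. Excerpt-style fragment, not a standalone file.
 */
static void example_switcheroo_set_state(struct pci_dev *pdev,
					 enum vga_switcheroo_state state)
{
	struct drm_device *dev = pci_get_drvdata(pdev);

	if (state == VGA_SWITCHEROO_ON) {
		/* ... power the GPU back up ... */
		drm_kms_helper_poll_enable(dev);  /* resume connector polling */
	} else {
		drm_kms_helper_poll_disable(dev); /* stop polling first */
		/* ... power the GPU down ... */
	}
}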

--
Nick Bowler, Elliptic Technologies (http://www.elliptictech.com/)