From: Torsten Kaiser on
[CC:Alex for the radeon KMS problem]

On Sun, Jun 6, 2010 at 6:15 AM, Linus Torvalds
<torvalds(a)linux-foundation.org> wrote:
>
> So -rc2 is out there, and hopefully fixes way more problems than it
> introduces.

It fixes the crash that prevented -rc1 from booting for me, but my
system is still not working with it.

The first problem that shows up is that, after KMS switches to the
correct video mode (1280x1024 for a DVI-attached LCD), the display
begins to flicker. Every 1..2 seconds (guesstimated) the display turns
off and on again. Something in the new power saving code?

This continues during userspace bootup, but, probably around the time
Xorg starts, the display goes blank and does not come back on. I'm not
sure whether this final part is really a bug in KMS/radeon/Xorg, because
the system died at that point from the second problem with
2.6.35-rc2, but I wanted to mention it anyway.

The system has an X300 card that worked perfectly in 2.6.34 (and
previous versions) with KMS:
01:00.0 VGA compatible controller: ATI Technologies Inc RV370 5B60 [Radeon X300 (PCIE)] (prog-if 00 [VGA controller])
	Subsystem: ASUSTeK Computer Inc. Device 0083
	Flags: bus master, fast devsel, latency 0, IRQ 28
	Memory at e0000000 (32-bit, prefetchable) [size=128M]
	I/O ports at d000 [size=256]
	Memory at efbf0000 (32-bit, non-prefetchable) [size=64K]
	Expansion ROM at efbc0000 [disabled] [size=128K]
	Capabilities: [50] Power Management version 2
	Capabilities: [58] Express Endpoint, MSI 00
	Capabilities: [80] MSI: Enable+ Count=1/1 Maskable- 64bit+
	Kernel driver in use: radeon

01:00.1 Display controller: ATI Technologies Inc RV370 [Radeon X300SE]
	Subsystem: ASUSTeK Computer Inc. Device 0082
	Flags: bus master, fast devsel, latency 0
	Memory at efbe0000 (32-bit, non-prefetchable) [size=64K]
	Capabilities: [50] Power Management version 2
	Capabilities: [58] Express Endpoint, MSI 00

The second problem is more serious: an oops, after which the system
hangs. Ctrl+Alt+Del did not initiate a shutdown, although magic
SysRq still partly worked. (A first SysRq+S worked, but SysRq+U or a
second SysRq+S after that did not. SysRq+B still rebooted.)
[ 90.040053] general protection fault: 0000 [#1] SMP
[ 90.045062] last sysfs file: /sys/devices/pci0000:00/0000:00:06.0/0000:05:06.0/resource
[ 90.050007] CPU 0
[ 90.050007] Modules linked in: sg
[ 90.050007]
[ 90.050007] Pid: 335, comm: kblockd/0 Not tainted 2.6.35-rc2 #1 KFN5-D SLI/KFN5-D SLI
[ 90.050007] RIP: 0010:[<ffffffff8135aa64>] [<ffffffff8135aa64>] ata_find_dev+0x24/0x90
[ 90.050007] RSP: 0018:ffff88007ffdbda0 EFLAGS: 00010082
[ 90.050007] RAX: 0720072007200720 RBX: ffff88007ffc7000 RCX: 0720072007202558
[ 90.050007] RDX: ffff880007009e38 RSI: 0000000000000000 RDI: ffff880007008000
[ 90.050007] RBP: ffff880006cef700 R08: 0000000000000001 R09: 0000000000000008
[ 90.050007] R10: 0000000000000000 R11: ffff88000723edb0 R12: ffff88007f3a3800
[ 90.050007] R13: ffff880007008000 R14: ffffffff81340f80 R15: ffff88007ffc7138
[ 90.050007] FS: 00007f558bc58700(0000) GS:ffff880001c00000(0000) knlGS:0000000000000000
[ 90.050007] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 90.050007] CR2: 00007fffa9653000 CR3: 0000000006429000 CR4: 00000000000006f0
[ 90.050007] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 90.050007] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 90.050007] Process kblockd/0 (pid: 335, threadinfo ffff88007ffda000, task ffff88007ff8a7d0)
[ 90.050007] Stack:
[ 90.050007] ffffffff8135ab25 ffffffff8135e7cb ffff880006cef700 ffff88007f3a3800
[ 90.050007] <0> ffff88000723ecd0 0000000000000287 ffff88007ffc7048 ffffffff81341c49
[ 90.050007] <0> ffff88000723ecd0 ffff88007ffc7000 ffff880007290000 ffff88000723ecd0
[ 90.050007] Call Trace:
[ 90.050007] [<ffffffff8135ab25>] ? ata_scsi_find_dev+0x5/0x30
[ 90.050007] [<ffffffff8135e7cb>] ? ata_scsi_queuecmd+0x4b/0x2c0
[ 90.050007] [<ffffffff81341c49>] ? scsi_dispatch_cmd+0xd9/0x210
[ 90.050007] [<ffffffff81348530>] ? scsi_request_fn+0x300/0x3e0
[ 90.050007] [<ffffffff811e31e0>] ? blk_unplug_work+0x0/0x20
[ 90.050007] [<ffffffff811e4624>] ? generic_unplug_device+0x24/0x30
[ 90.050007] [<ffffffff8104ca6b>] ? worker_thread+0xeb/0x180
[ 90.050007] [<ffffffff81050690>] ? autoremove_wake_function+0x0/0x30
[ 90.050007] [<ffffffff8104c980>] ? worker_thread+0x0/0x180
[ 90.050007] [<ffffffff810501fe>] ? kthread+0x8e/0xa0
[ 90.050007] [<ffffffff81003194>] ? kernel_thread_helper+0x4/0x10
[ 90.050007] [<ffffffff81050170>] ? kthread+0x0/0xa0
[ 90.050007] [<ffffffff81003190>] ? kernel_thread_helper+0x0/0x10
[ 90.050007] Code: 1f 84 00 00 00 00 00 8b 87 00 29 00 00 85 c0 75 46 48 8b 87 38 1e 00 00 48 8d 97 38 1e 00 00 48 8d 88 38 1e 00 00 48 39 ca 74 4c <48> 3b 90 f8 28 00 00 74 43 ba 01 00 00 00 39 d6 7d 47 48 63 f6
[ 90.050007] RIP [<ffffffff8135aa64>] ata_find_dev+0x24/0x90
[ 90.050007] RSP <ffff88007ffdbda0>
[ 90.050007] ---[ end trace c14df2a6b8b3b357 ]---


(gdb) list *0xffffffff8135aa64
0xffffffff8135aa64 is in ata_find_dev (include/linux/libata.h:1201).
1196		return ap->nr_pmp_links != 0;
1197	}
1198
1199	static inline int ata_is_host_link(const struct ata_link *link)
1200	{
1201		return link == &link->ap->link || link == link->ap->slave_link;
1202	}
1203	#else /* CONFIG_SATA_PMP */
1204	static inline bool sata_pmp_supported(struct ata_port *ap)
1205	{

CONFIG_SATA_PMP is set to 'y' because my SiI 3132 should be PMP
capable. (But there are only two normal HDDs attached to this
controller.)

Please ask if you need more information or have something for me to try.

Thanks

Torsten
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Linus Torvalds on


On Sun, 6 Jun 2010, Torsten Kaiser wrote:
>
> The first problem that shows up is that, after KMS switches to the
> correct video mode (1280x1024 for a DVI-attached LCD), the display
> begins to flicker. Every 1..2 seconds (guesstimated) the display turns
> off and on again. Something in the new power saving code?

Or maybe a borderline display timing that the display has trouble syncing
up with?

> The second problem is more serious: an oops, after which the system
> hangs. Ctrl+Alt+Del did not initiate a shutdown, although magic
> SysRq still partly worked. (A first SysRq+S worked, but SysRq+U or a
> second SysRq+S after that did not. SysRq+B still rebooted.)

This one looks very much like ATA. Added Jeff and Tejun to the cc.

Jeff, Tejun, anything ring a bell?

Linus

From: Torsten Kaiser on
[CC:Jeff+Tejun not removed, because you might want to look at the
attached dmesgs]

On Sun, Jun 6, 2010 at 4:19 PM, Linus Torvalds
<torvalds(a)linux-foundation.org> wrote:
> On Sun, 6 Jun 2010, Torsten Kaiser wrote:
>>
>> The first problem that shows up is that, after KMS switches to the
>> correct video mode (1280x1024 for a DVI-attached LCD), the display
>> begins to flicker. Every 1..2 seconds (guesstimated) the display turns
>> off and on again. Something in the new power saving code?
>
> Or maybe a borderline display timing that the display has trouble syncing
> up with?

With 2.6.34 and all previous KMS kernels the output was always stable.
(I think I switched to radeon KMS with 2.6.32.)
The on-screen menu of the monitor showed 1280x1024@60.2Hz for
2.6.35-rc2, if I recall correctly.
Now, back on 2.6.34, it's 1280x1024@59.9Hz.

Comparing the DRM output from 2.6.34 and 2.6.35-rc2 I see the
following differences:
On 2.6.35-rc2 this block is missing:
[ 1.907716] [drm] GPU reset succeed (RBBM_STATUS=0x00000140)
[ 1.913403] [drm] 1 Power State(s)
[ 1.916810] [drm] State 0 Default (default)
[ 1.921017] [drm] 16 PCIE Lanes
[ 1.924255] [drm] 1 Clock Mode(s)
[ 1.927662] [drm] 0 engine/memory: 325000/200000
[ 1.932496] [drm] radeon: power management initialized
New on 2.6.35: [ 1.951340] [TTM] Initializing pool allocator.
Only on 2.6.34: [ 2.011963] [drm] radeon: cp idle (0x10000C03)
Only on 2.6.34: [ 2.020478] platform radeon_cp.0: firmware: using built-in firmware radeon/R300_cp.bin

On 2.6.34 the output for connector 1 is:
[ 2.090935] [drm] Connector 1:
[ 2.094002] [drm] DVI-I
[ 2.096629] [drm] HPD1
[ 2.099174] [drm] DDC: 0x64 0x64 0x64 0x64 0x64 0x64 0x64 0x64
[ 2.105210] [drm] Encoders:
[ 2.108189] [drm] CRT2: INTERNAL_DAC2
[ 2.112222] [drm] DFP1: INTERNAL_TMDS1
With 2.6.35-rc2 the 'HPD1' line changes to 'NONE'.

Everything else is identical. I have attached the complete dmesg output
from both kernels to this mail.

Torsten
From: Himanshu Chauhan on
>
> The first problem that shows up is that, after KMS switches to the
> correct video mode (1280x1024 for a DVI-attached LCD), the display
> begins to flicker. Every 1..2 seconds (guesstimated) the display turns
> off and on again. Something in the new power saving code?
>
> This keeps up during userspace bootup, but probably around the time
> Xorg starts the display goes blank and does not come back on. I'm not
> sure if this final part is really a bug with KMS/radeon/Xorg because
> the system died at that point because of the second problem with
> 2.6.35-rc2, but I wanted to mention it anyway.

It is the same for me. I have an Acer Aspire 3000 and an assembled Pentium 4
machine, and both show the same behaviour: the monitor flickers, then the
display turns off and never turns back on. X doesn't always start correctly.
I need to press Ctrl+Alt+F2 and Ctrl+Alt+F7 a couple of times before I get it
working again.

- Himanshu
From: Tejun Heo on
Hello, Torsten, Linus.

On 06/06/2010 04:19 PM, Linus Torvalds wrote:
>> [ 90.040053] general protection fault: 0000 [#1] SMP
>> [ 90.045062] last sysfs file: /sys/devices/pci0000:00/0000:00:06.0/0000:05:06.0/resource
>> [ 90.050007] CPU 0
>> [ 90.050007] Modules linked in: sg
>> [ 90.050007]
>> [ 90.050007] Pid: 335, comm: kblockd/0 Not tainted 2.6.35-rc2 #1 KFN5-D SLI/KFN5-D SLI
>> [ 90.050007] RIP: 0010:[<ffffffff8135aa64>] [<ffffffff8135aa64>] ata_find_dev+0x24/0x90
>> [ 90.050007] RSP: 0018:ffff88007ffdbda0 EFLAGS: 00010082
>> [ 90.050007] RAX: 0720072007200720 RBX: ffff88007ffc7000 RCX: 0720072007202558
>> [ 90.050007] RDX: ffff880007009e38 RSI: 0000000000000000 RDI: ffff880007008000
>> [ 90.050007] RBP: ffff880006cef700 R08: 0000000000000001 R09: 0000000000000008
>> [ 90.050007] R10: 0000000000000000 R11: ffff88000723edb0 R12: ffff88007f3a3800
>> [ 90.050007] R13: ffff880007008000 R14: ffffffff81340f80 R15: ffff88007ffc7138
>> [ 90.050007] FS: 00007f558bc58700(0000) GS:ffff880001c00000(0000) knlGS:0000000000000000
>> [ 90.050007] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>> [ 90.050007] CR2: 00007fffa9653000 CR3: 0000000006429000 CR4: 00000000000006f0
>> [ 90.050007] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> [ 90.050007] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
....
>> (gdb) list *0xffffffff8135aa64
>> 0xffffffff8135aa64 is in ata_find_dev (include/linux/libata.h:1201).
>> 1196 return ap->nr_pmp_links != 0;
>> 1197 }
>> 1198
>> 1199 static inline int ata_is_host_link(const struct ata_link *link)
>> 1200 {
>> 1201 return link == &link->ap->link || link == link->ap->slave_link;
>> 1202 }
>> 1203 #else /* CONFIG_SATA_PMP */
>> 1204 static inline bool sata_pmp_supported(struct ata_port *ap)
>> 1205 {

Hmmm... that's really odd. An ata_port contains one ata_link
structure in ap->link which is initialized by ata_link_init() during
ata_port_alloc(), so ap->link.ap == ap should always hold.

In the above case, ap->link.ap contains 0x0720072007200720 (RAX)
instead of the proper address 0xffff880007008000 (RDI), and
ata_is_host_link() causes the oops when it tries to dereference
ap->link.ap->slave_link.

Odd, I can't think of any change that could cause such a difference in
behavior. Where would 0x0720072007200720 come from? That's a
rather strange value to find there, and it doesn't seem to be a magic
value. I'll see whether I can reproduce the problem. Can you please
try w/o KMS just in case?

Thanks.

--
tejun