From: Török Edwin on
Hi,

Callgraph profiling 32-bit apps on a 64-bit kernel doesn't work.
The reason is that perf_callchain_user tries to read a stackframe with 64-bit
pointers, which is wrong for a 32-bit process.

This patch fixes that, and I am almost able to get nice callgraph profiles
from 32-bit apps now! (except for some problems with perf itself when tracing
kernel modules, see [1])
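
To illustrate the width mismatch described above, here is a minimal sketch in Python. It is not the patch itself (the kernel code is C) and it only simulates a frame-pointer walk over a byte buffer. The helper `walk_frames`, the stack base `BASE`, and the frame spacing are assumptions made for the example; the return addresses are the glxgears ones from the report below.

```python
import struct

def walk_frames(stack, base, fp, ptr_size, max_depth=16):
    """Follow a frame-pointer chain and return the saved return addresses.

    `stack` holds the raw stack bytes, `base` is the address of stack[0],
    and each frame is laid out as [saved frame pointer][return address],
    both `ptr_size` bytes wide (little-endian, as on x86).
    """
    fmt = "<II" if ptr_size == 4 else "<QQ"
    frame_size = struct.calcsize(fmt)
    chain = []
    while len(chain) < max_depth:
        off = fp - base
        if off < 0 or off + frame_size > len(stack):
            break  # frame pointer left the readable stack
        next_fp, ret = struct.unpack_from(fmt, stack, off)
        chain.append(ret)
        if next_fp <= fp:
            break  # callers live at higher addresses; anything else ends the walk
        fp = next_fp
    return chain

# Simulated stack of a 32-bit process. BASE is a made-up address.
BASE = 0xFFD00000
stack = bytearray(48)
struct.pack_into("<II", stack, 0, BASE + 16, 0x8049367)
struct.pack_into("<II", stack, 16, BASE + 32, 0x804A6BA)
struct.pack_into("<II", stack, 32, 0, 0x8049111)

narrow = walk_frames(stack, BASE, BASE, ptr_size=4)  # 32-bit frame layout
wide = walk_frames(stack, BASE, BASE, ptr_size=8)    # 64-bit frame layout
print([hex(a) for a in narrow])  # ['0x8049367', '0x804a6ba', '0x8049111']
print([hex(a) for a in wide])    # ['0x0']
```

With the 64-bit layout, the saved frame pointer and the return address of the first frame are fused into one bogus pointer, so the walk stops after a single meaningless entry.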

Page-faults can be traced nicely (sid-ia32 is a 32-bit chroot):

$ sudo perf record -e page-faults -f -g /home/edwin/sid-ia32/usr/bin/glxgears
$ sudo perf report
....
45.33% libc-2.10.2.so [.] __GI_memcpy
|
--- __GI_memcpy
_mesa_BufferDataARB
_mesa_meta_Clear
radeonUserClear
r700Clear
_mesa_Clear
0x8049367
0x804a6ba
__libc_start_main
0x8049111

16.96% libc-2.10.2.so [.] __GI_memset
|
--- __GI_memset
_tnl_init_vertices
_swsetup_CreateContext
r600CreateContext
driCreateNewContext
dri2CreateNewContext
0xf77ab7dd
0xf7783c67
0xf778514c
0x804974f
0x804a33d
__libc_start_main
0x8049111

And CPU cycles can be traced too in userspace:
$ sudo perf record -f -g /home/edwin/sid-ia32/usr/bin/glxgears
$ sudo perf report --sort comm,dso
[...]
44.97% glxgears r600_dri.so
|
|--5.85%-- r700SendSPIState
| radeonEmitState
| r700DrawPrims
| |
| |--95.45%-- vbo_save_playback_vertex_list
| | execute_list
| | _mesa_CallList
| | neutral_CallList
| | |
| | |--38.10%-- 0x80494a8
| | | 0x804a6ba
| | | __libc_start_main
| | | 0x8049111
[....]
40.00% glxgears [kernel]
|
|--3.14%-- copy_user_generic_string
| |
| |--71.70%-- 0xffffffffa01b4493
| | 0xffffffffa01b7c0b
| | 0xffffffffa018b45b
| | 0xffffffffa00ca927
| | 0xffffffffa01c524e
| | compat_sys_ioctl
| | sysenter_dispatch
| | 0xf77ca430
| | drmCommandWriteRead
| | 0xf74d7ab5
| | 0xf74d89a4
| | rcommonFlushCmdBufLocked
| | rcommonFlushCmdBuf
| | radeonFlush
| | _mesa_flush
| | _mesa_Flush
| | 0xf775f270
| | 0x804a6d5
| | __libc_start_main
| | 0x8049111
| |
| |--15.09%-- 0xffffffffa01c524e
| | compat_sys_ioctl
| | sysenter_dispatch
| | 0xf77ca430
| | drmCommandWriteRead

[1] But there is a problem with the perf tool: it can't resolve addresses in
kernel modules. This happens regardless of whether the traced app is 32-bit or
64-bit, and regardless of whether I do callgraph profiling or not.
See the above trace, where the kernel addresses in ffffffffa0* all appear
without a symbol name.

If I look at /proc/kallsyms I can guess the symbols, for example
0xffffffffa01b4493 is probably this one:
ffffffffa01b4411 t r600_cs_packet_parse [radeon]

If I record/report without callgraph it's the same problem:
[...]
24.01% glxgears [kernel] [k] 0xffffffffa01b4ee9
3.96% glxgears libdrm_radeon.so.1.0.0 [.] cs_gem_write_reloc
3.53% glxgears r600_dri.so [.] r700SendSPIState
2.77% glxgears r600_dri.so [.] r700DrawPrims
1.99% glxgears r600_dri.so [.] r700SendVSConsts

The kernel symbol for 0xffffffffa01b4ee9 is not shown; I can guess it is this
one (hey, it was an exact match!):
ffffffffa01b4ee9 t r600_packet3_check [radeon]
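
The manual guess above amounts to finding the last symbol whose start address is not greater than the sampled address. A minimal Python sketch of that lookup follows. The helper names `parse_kallsyms` and `resolve` are made up for the example, and the sample text holds only the two /proc/kallsyms lines quoted in this mail, so on a real system a closer symbol could sit between them.

```python
import bisect

def parse_kallsyms(text):
    """Parse lines of the form 'address type name [module]'."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue
        addr = int(parts[0], 16)
        # Built-in kernel symbols have no trailing [module] field.
        module = parts[3].strip("[]") if len(parts) > 3 else "kernel"
        symbols.append((addr, parts[2], module))
    symbols.sort()
    return symbols

def resolve(symbols, addr):
    """Return (name, module, offset) of the nearest symbol at or below addr."""
    starts = [s[0] for s in symbols]
    i = bisect.bisect_right(starts, addr) - 1
    if i < 0:
        return None  # address is below every known symbol
    start, name, module = symbols[i]
    return name, module, addr - start

SAMPLE = """\
ffffffffa01b4411 t r600_cs_packet_parse [radeon]
ffffffffa01b4ee9 t r600_packet3_check [radeon]
"""

symbols = parse_kallsyms(SAMPLE)
print(resolve(symbols, 0xFFFFFFFFA01B4493))  # ('r600_cs_packet_parse', 'radeon', 130)
print(resolve(symbols, 0xFFFFFFFFA01B4EE9))  # ('r600_packet3_check', 'radeon', 0)
```

An offset of 0 corresponds to the exact match; the first address lands 0x82 bytes past the start of the nearest listed symbol, which is why that one is only a probable match.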

It would be good if perf knew how to look up symbols in kernel modules!

Best regards,
--Edwin
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Török Edwin on
On 03/15/2010 05:34 PM, Török Edwin wrote:
> It would be good if perf knew how to lookup symbols in kernel modules!

BTW perf report -m -k /home/edwin/builds/linux-2.6/vmlinux doesn't show
the symbols either.

>
> Best regards,
> --Edwin

From: Török Edwin on
On 03/15/2010 06:23 PM, Török Edwin wrote:
> On 03/15/2010 05:34 PM, Török Edwin wrote:
>>
>> It would be good if perf knew how to lookup symbols in kernel modules!
>
> BTW perf report -m -k /home/edwin/builds/linux-2.6/vmlinux doesn't show
> the symbols either.

I always forget that, unlike every other program, perf doesn't install
by default to /usr/local!
So I was running the wrong version of perf (from an older kernel), since
perf was installed to $HOME/bin (which of course isn't in sudo's path).
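
A quick way to see which binary a given search path would pick is sketched below in Python. This is only an illustration: `which_under` is a made-up helper, and the throwaway directories stand in for $HOME/bin and a system bin directory; the real search path used by sudo depends on the local configuration.

```python
import os
import shutil
import tempfile

def which_under(tool, path_dirs):
    """Return the executable a lookup restricted to `path_dirs` finds, or None."""
    return shutil.which(tool, path=os.pathsep.join(path_dirs))

# Two fake perf binaries: a "new" one in home/bin and an "old" one in usr/bin.
with tempfile.TemporaryDirectory() as root:
    home_bin = os.path.join(root, "home", "bin")
    sys_bin = os.path.join(root, "usr", "bin")
    for directory, tag in ((home_bin, "new"), (sys_bin, "old")):
        os.makedirs(directory)
        exe = os.path.join(directory, "perf")
        with open(exe, "w") as f:
            f.write("#!/bin/sh\necho %s\n" % tag)
        os.chmod(exe, 0o755)
    # The user's path lists home/bin first; the restricted path omits it.
    as_user = which_under("perf", [home_bin, sys_bin])
    as_sudo = which_under("perf", [sys_bin])

print(as_user)  # ends in home/bin/perf
print(as_sudo)  # ends in usr/bin/perf
```

The same command name resolves to two different binaries, which matches the situation described above: the plain shell finds the new perf, the restricted path finds the old one.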

Sorry for the confusion; the 2.6.33 perf DOES know how to look up the
symbols:
9.92% glxgears [radeon] [k] r600_packet3_check
|
--- r600_packet3_check
|
|--96.80%-- r600_cs_parse
| radeon_cs_ioctl
| drm_ioctl
| radeon_kms_compat_ioctl
| compat_sys_ioctl
| sysenter_dispatch
| 0xf7759430
| drmCommandWriteRead
| cs_gem_emit
| radeon_cs_emit
| rcommonFlushCmdBufLocked
| rcommonFlushCmdBuf
| radeonFlush
| _mesa_flush
| _mesa_Flush
| 0xf76ee270
| 0x804a6d5
| __libc_start_main
| 0x8049111
[...]

Best regards,
--Edwin
From: Ingo Molnar on

* Török Edwin <edwintorok(a)gmail.com> wrote:

> On 03/15/2010 06:23 PM, Török Edwin wrote:
> > On 03/15/2010 05:34 PM, Török Edwin wrote:
> >>
> >> It would be good if perf knew how to lookup symbols in kernel modules!
> >
> > BTW perf report -m -k /home/edwin/builds/linux-2.6/vmlinux doesn't show
> > the symbols either.
>
> I always forget that, unlike every other program, perf doesn't install
> by default to /usr/local!
> So I was running the wrong version of perf (from an older kernel), since
> perf was installed to $HOME/bin (which of course isn't in sudo's path).
>
> Sorry for the confusion, the 2.6.33 perf DOES know how to lookup the
> symbols:
> 9.92% glxgears [radeon] [k] r600_packet3_check
> |
> --- r600_packet3_check
> |
> |--96.80%-- r600_cs_parse

Ok, great!

I suspect we could install into /usr/local too. Do you want to send a patch
for that?

Ingo
From: Török Edwin on
On 03/16/2010 10:47 AM, Ingo Molnar wrote:
> * Török Edwin <edwintorok(a)gmail.com> wrote:
>
>> On 03/15/2010 06:23 PM, Török Edwin wrote:
>>> On 03/15/2010 05:34 PM, Török Edwin wrote:
>>>> It would be good if perf knew how to lookup symbols in kernel modules!
>>> BTW perf report -m -k /home/edwin/builds/linux-2.6/vmlinux doesn't show
>>> the symbols either.
>> I always forget that, unlike every other program, perf doesn't install
>> by default to /usr/local!
>> So I was running the wrong version of perf (from an older kernel), since
>> perf was installed to $HOME/bin (which of course isn't in sudo's path).
>>
>> Sorry for the confusion, the 2.6.33 perf DOES know how to lookup the
>> symbols:
>> 9.92% glxgears [radeon] [k] r600_packet3_check
>> |
>> --- r600_packet3_check
>> |
>> |--96.80%-- r600_cs_parse
>
> Ok, great!

BTW the patch I sent yesterday for tracing 32-bit apps is still needed:
it is a kernel patch, and that problem wasn't caused by using the wrong perf.

>
> I suspect we could install into /usr/local too. Do you want to send a patch
> for that?

Sent.

BTW I think perf needs some documentation on how to install it, what
packages you need to build everything, what permissions it needs to
run, etc.

1. manpages
For example, by default the manpages don't get built and installed, so
perf report --help doesn't work. It needs 'make man' and 'make
install-man'.
This is fine, because building them requires asciidoc and xmlto, which
aren't installed on every system. But there should be some documentation
mentioning this.
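
As a rough illustration of the kind of check such documentation could describe, the Python sketch below reports which of the manpage build tools are missing from the search path. `missing_tools` is a made-up helper; only the tool names asciidoc and xmlto come from this mail.

```python
import shutil

def missing_tools(tools, path=None):
    """Return the tools that cannot be found on `path` (default: $PATH)."""
    return [tool for tool in tools if shutil.which(tool, path=path) is None]

# Tools needed before 'make man' and 'make install-man' can succeed.
MANPAGE_TOOLS = ["asciidoc", "xmlto"]

absent = missing_tools(MANPAGE_TOOLS)
if absent:
    print("manpages cannot be built, missing: " + ", ".join(absent))
else:
    print("asciidoc and xmlto found; 'make man' should be possible")
```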

2. privileges
I just found out that perf works without root privileges (I had
assumed it needed root, since oprofile does).

3. non-working targets?
Also there are some targets in Documentation that can't be built due to
missing files, like pdf which needs a non-existent user-manual.xml.

4. unresolved symbols
Sometimes I get symbol addresses that are not resolved, like this:
57.03% :32216 7fc7dc0acfa6 [.] 0x007fc7dc0acfa6
12.39% :32216 [radeon] [k] r600_packet3_check
4.92% :32216 [radeon] [k] r600_cs_packet_parse
2.70% :32216 [radeon] [k] r600_cs_parse

Is this due to ASLR? Does perf need ASLR disabled?
That address corresponds to this:
7fc7dc07e000-7fc7dc2ef000 r-xp 00000000 fd:03 27713 /opt/xorg/lib/dri/r600_dri.so
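
The correspondence between the unresolved address and the mapping can be checked mechanically. The Python sketch below parses a line in /proc/<pid>/maps format and computes the offset of the address inside the mapped file. The helper names `parse_maps_line` and `locate` are made up for the example, and the sketch says nothing about whether ASLR is the cause; it only reproduces the address arithmetic.

```python
def parse_maps_line(line):
    """Parse one 'start-end perms offset dev inode [path]' line."""
    fields = line.split(None, 5)
    start_s, end_s = fields[0].split("-")
    return {
        "start": int(start_s, 16),
        "end": int(end_s, 16),
        "perms": fields[1],
        "file_offset": int(fields[2], 16),
        # Anonymous mappings have no path field.
        "path": fields[5].strip() if len(fields) > 5 else "",
    }

def locate(maps_text, addr):
    """Return (path, offset in file) of the mapping containing addr, or None."""
    for line in maps_text.splitlines():
        if not line.strip():
            continue
        m = parse_maps_line(line)
        if m["start"] <= addr < m["end"]:
            return m["path"], addr - m["start"] + m["file_offset"]
    return None

SAMPLE_MAPS = (
    "7fc7dc07e000-7fc7dc2ef000 r-xp 00000000 fd:03 27713 "
    "/opt/xorg/lib/dri/r600_dri.so\n"
)

hit = locate(SAMPLE_MAPS, 0x7FC7DC0ACFA6)
print(hit[0], hex(hit[1]))  # /opt/xorg/lib/dri/r600_dri.so 0x2efa6
```

The resulting offset is what a symbol lookup in r600_dri.so would need, independent of where the library happened to be loaded.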

It'd of course be nice if there was a distro package for perf; I think
I'll file an RFP in Debian for that.

Best regards,
--Edwin