From: Frederic Weisbecker on
On Mon, Mar 15, 2010 at 05:34:20PM +0200, T�r�k Edwin wrote:
> When profiling a 32-bit process on a 64-bit kernel, callgraph tracing
> stopped after the first function, because it has seen a garbage memory address
> (tried to interpret the frame pointer, and return address as a 64-bit pointer).
>
> Fix this by using a struct stack_frame with 32-bit pointers when the TIF_IA32 flag is set.
>
> Note that TIF_IA32 flag must be used, and not is_compat_task(), because the
> latter is only set when the 32-bit process is executing a syscall,
> which may not always be the case (when tracing page fault events for example).
>
> Signed-off-by: T�r�k Edwin <edwintorok(a)gmail.com>
> ---
> arch/x86/kernel/cpu/perf_event.c | 33 +++++++++++++++++++++++++++++++++
> 1 files changed, 33 insertions(+), 0 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
> index 8c1c070..13ee83a 100644
> --- a/arch/x86/kernel/cpu/perf_event.c
> +++ b/arch/x86/kernel/cpu/perf_event.c
> @@ -2401,6 +2401,20 @@ static int copy_stack_frame(const void __user *fp, struct stack_frame *frame)
> return bytes == sizeof(*frame);
> }
>
> +struct stack_frame_ia32 {
> + u32 next_frame;
> + u32 return_address;
> +};
> +
> +static int copy_stack_frame_ia32(u32 fp, struct stack_frame_ia32 *frame)
> +{
> + unsigned long bytes;
> +
> + bytes = copy_from_user_nmi(frame, (const void __user*)(unsigned long)fp, sizeof(*frame));
> +
> + return bytes == sizeof(*frame);
> +}
> +
> static void
> perf_callchain_user(struct pt_regs *regs, struct perf_callchain_entry *entry)
> {
> @@ -2414,6 +2428,25 @@ perf_callchain_user(struct pt_regs *regs, struct perf_callchain_entry *entry)
>
> callchain_store(entry, PERF_CONTEXT_USER);
> callchain_store(entry, regs->ip);
> + if (test_thread_flag(TIF_IA32)) {
> + /* 32-bit process in 64-bit kernel. */
> + u32 fp = regs->bp;
> + struct stack_frame_ia32 frame;
> + while (entry->nr < PERF_MAX_STACK_DEPTH) {
> + frame.next_frame = 0;
> + frame.return_address = 0;
> +
> + if (!copy_stack_frame_ia32(fp, &frame))
> + break;
> +
> + if ((unsigned long)fp < regs->sp)
> + break;



Shouldn't it be this?

if (fp < (u32)regs->sp)

As the high part of fp is zeroed in the cast but
the high part of regs->sp remains. I don't know what could be
there, but since the user doesn't use it, perhaps just garbage?
And that could messup the comparison.

Otherwise the rest looks pretty good to me, nice catch.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/