From: Frederic Weisbecker on
On Thu, Mar 18, 2010 at 02:39:13PM +0800, Li Zefan wrote:
> David Miller wrote:
> > From: Frederic Weisbecker <fweisbec(a)gmail.com>
> > Date: Thu, 18 Mar 2010 05:49:33 +0100
> >
> >> While using the lock events through perf in a sparc box, I can see
> >> the following message repeated many times:
> >>
> >> Kernel unaligned access at TPC[49357c] perf_trace_lock_acquire+0xb4/0x180
> >>
> >> It actually hangs the box as the messages are sent to a serial console.
> >>
> >> When used with perf, the trace events use a per cpu buffer allocated
> >> in kernel/trace/trace_event_perf.c, and the allocation appears to return
> >> a misaligned percpu pointer. It is aligned to 4 while it seems it
> >> requires to be aligned to 8.
> >
> > Thanks I'll take a look at this.
> >
> > RAW locks (both rwlocks and spinlocks) on sparc64 are 4-bytes
> > in size, maybe some piece of code is assuming that locks
> > are cpu word sized.
> >
> > Where is perf_trace_lock_acquire() I can't find it in Linus's
> > tree? Does it get created by some crazy macro expansion?
> >
>
> Yes, it's expanded by some crazy macro in include/trace/ftrace.h..
>
> In linus' tree, it's called ftrace_profile_lock_acquire(), and it's
> renamed to perf_trace_lock_acquire() in -tip tree by commit
> 97d5a22005f38057b4bc0d95f81cd26510268794.
>
> #undef DECLARE_EVENT_CLASS
> #define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print) \
> static notrace void \
> ftrace_profile_templ_##call(struct ftrace_event_call *event_call, \
> proto) \
> { \
> struct ftrace_data_offsets_##call __maybe_unused __data_offsets;\
> struct ftrace_raw_##call *entry; \
> u64 __addr = 0, __count = 1; \
> unsigned long irq_flags; \
> int __entry_size; \
> int __data_size; \
> int rctx; \
> \
> ...
> }


Yeah, indeed. The problem happens in both Linus's tree and the -tip tree.
I debugged it in -tip, where the function has been renamed, and I forgot
about that. So in mainline the problem happens in
ftrace_profile_templ_lock_acquire (generated by the macro above).

Thanks.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Frederic Weisbecker on
On Thu, Mar 18, 2010 at 06:30:34PM +0900, Tejun Heo wrote:
> Hello,
>
> On 03/18/2010 01:49 PM, Frederic Weisbecker wrote:
> > Hi,
> >
> > While using the lock events through perf in a sparc box, I can see
> > the following message repeated many times:
> >
> > Kernel unaligned access at TPC[49357c] perf_trace_lock_acquire+0xb4/0x180
> >
> > It actually hangs the box as the messages are sent to a serial console.
> >
> > When used with perf, the trace events use a per cpu buffer allocated
> > in kernel/trace/trace_event_perf.c, and the allocation appears to return
> > a misaligned percpu pointer. It is aligned to 4 while it seems it
> > requires to be aligned to 8.
>
> Does this fix the problem?
>
> diff --git a/kernel/trace/trace_event_profile.c b/kernel/trace/trace_event_profile.c
> index c1cc3ab..d3f7d1b 100644
> --- a/kernel/trace/trace_event_profile.c
> +++ b/kernel/trace/trace_event_profile.c
> @@ -27,13 +27,15 @@ static int ftrace_profile_enable_event(struct ftrace_event_call *event)
> return 0;
>
> if (!total_profile_count) {
> - buf = (char *)alloc_percpu(perf_trace_t);
> + buf = (char *)__alloc_percpu(sizeof(perf_trace_t),
> + __alignof__(unsigned long));
> if (!buf)
> goto fail_buf;
>
> rcu_assign_pointer(perf_trace_buf, buf);
>
> - buf = (char *)alloc_percpu(perf_trace_t);
> + buf = (char *)__alloc_percpu(sizeof(perf_trace_t),
> + __alignof__(unsigned long));
> if (!buf)
> goto fail_buf_nmi;


Yep, it does the trick.

In case you test it: I see two other misalignments. One is in
perf_trace_buf_prepare, but that one is my fault and has nothing
to do with percpu. I'm going to fix it.
The other is in the ring buffer, and Steve has a pending fix.

Thanks.

From: David Miller on
From: Tejun Heo <tj(a)kernel.org>
Date: Thu, 18 Mar 2010 18:30:34 +0900

>
> if (!total_profile_count) {
> - buf = (char *)alloc_percpu(perf_trace_t);
> + buf = (char *)__alloc_percpu(sizeof(perf_trace_t),
> + __alignof__(unsigned long));
> if (!buf)
> goto fail_buf;

Why not make perf_trace_t have the proper alignment?

That's better than patching around it like this.

Defining it as an array of char[]'s is just asking
for lots of trouble.
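
[Illustration, not part of the original message.] The point about
char[] can be checked in user space. As far as I know,
alloc_percpu(type) passes __alignof__(type) to the allocator, so a
type defined as a bare char array only asks for 1-byte alignment.
The sketch below compares three ways of declaring the same storage.
All demo_* names and the buffer size are hypothetical and do not
come from the kernel source; the aligned attribute assumes GCC or
Clang.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical buffer size for illustration; not the kernel's value. */
#define DEMO_TRACE_SIZE 2048

/* A bare char array type: its alignment is that of char, i.e. 1. */
typedef char demo_trace_plain_t[DEMO_TRACE_SIZE];

/* The same storage wrapped in a struct that requests 8-byte alignment. */
typedef struct {
	char data[DEMO_TRACE_SIZE];
} __attribute__((aligned(8))) demo_trace_aligned_t;

/* The same number of bytes as an array of unsigned long, which
 * inherits the alignment of unsigned long. */
typedef unsigned long demo_trace_ulong_t[DEMO_TRACE_SIZE / sizeof(unsigned long)];

/* None of the variants changes the size of the buffer. */
static_assert(sizeof(demo_trace_plain_t) == DEMO_TRACE_SIZE, "size changed");
static_assert(sizeof(demo_trace_aligned_t) == DEMO_TRACE_SIZE, "size changed");
static_assert(sizeof(demo_trace_ulong_t) == DEMO_TRACE_SIZE, "size changed");

/* Alignment an allocator would be asked for with each type. */
size_t demo_plain_alignment(void)
{
	return _Alignof(demo_trace_plain_t);
}

size_t demo_aligned_alignment(void)
{
	return _Alignof(demo_trace_aligned_t);
}

size_t demo_ulong_alignment(void)
{
	return _Alignof(demo_trace_ulong_t);
}
```

With the plain typedef the requested alignment is 1, so any stricter
alignment of the returned pointer is incidental. Either of the other
two declarations carries the requirement in the type itself, which is
what "make perf_trace_t have the proper alignment" amounts to.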
From: Frederic Weisbecker on
On Thu, Mar 18, 2010 at 05:54:13PM -0700, David Miller wrote:
> From: Tejun Heo <tj(a)kernel.org>
> Date: Thu, 18 Mar 2010 18:30:34 +0900
>
> >
> > if (!total_profile_count) {
> > - buf = (char *)alloc_percpu(perf_trace_t);
> > + buf = (char *)__alloc_percpu(sizeof(perf_trace_t),
> > + __alignof__(unsigned long));
> > if (!buf)
> > goto fail_buf;
>
> Why not make perf_trace_t have the proper alignment?


So, making perf_trace_t as align(8) would do the trick?
I lack the knowledge about alignment layout for archs that
need aligned accesses.
At first glance, what I would expect is that every buffer
has an aligned base address, no?


>
> That's better than patching around it like this.
>
> Defining it as an array of char[]'s is just asking
> for lots of trouble.


Yeah but we need a generic type. This is because
our buffer can be of any random type to match all
the trace event layouts we have, all of them being
generated by macros.

From: Tejun Heo on
Hello,

On 03/19/2010 10:31 AM, Frederic Weisbecker wrote:
> On Thu, Mar 18, 2010 at 05:54:13PM -0700, David Miller wrote:
>> From: Tejun Heo <tj(a)kernel.org>
>> Date: Thu, 18 Mar 2010 18:30:34 +0900
>>
>>>
>>> if (!total_profile_count) {
>>> - buf = (char *)alloc_percpu(perf_trace_t);
>>> + buf = (char *)__alloc_percpu(sizeof(perf_trace_t),
>>> + __alignof__(unsigned long));
>>> if (!buf)
>>> goto fail_buf;
>>
>> Why not make perf_trace_t have the proper alignment?

Sure, I just wanted to verify the cause of the problem.

> So, making perf_trace_t as align(8) would do the trick?
> I lack the knowledge about alignment layout for archs that
> need aligned accesses.

If you can't make it a proper type, __alignof__(unsigned long long)
would be better.
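
[Illustration, not part of the original message.] A minimal
user-space sketch of the condition being discussed: a buffer whose
base is only 4-byte aligned cannot safely hold an 8-byte field at
offset 0 on a strict-alignment machine. On 32-bit ABIs unsigned long
is typically 4 bytes wide, which is presumably why unsigned long
long is the safer reference type here. The demo_* helpers and
DEMO_RECORD_ALIGN are hypothetical names, not kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical alignment wanted for records that contain 64-bit fields. */
#define DEMO_RECORD_ALIGN 8u

/* The mask arithmetic below only works for powers of two. */
static_assert((DEMO_RECORD_ALIGN & (DEMO_RECORD_ALIGN - 1)) == 0,
	      "alignment must be a power of two");

/* True if address p is a multiple of align (align is a power of two). */
bool demo_is_aligned(const void *p, size_t align)
{
	return ((uintptr_t)p & (align - 1)) == 0;
}

/* Round size up to the next multiple of align (align is a power of two). */
size_t demo_align_up(size_t size, size_t align)
{
	return (size + align - 1) & ~(align - 1);
}
```

A base address that passes demo_is_aligned(p, 4) but fails
demo_is_aligned(p, DEMO_RECORD_ALIGN) corresponds to the "aligned to
4 while it requires 8" case from the original report.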

> Yeah but we need a generic type. This is because
> our buffer can be of any random type to match all
> the trace event layouts we have, all of them being
> generated by macros.

I hope those macros align properly according to types.

Thanks.

--
tejun