From: Randy Dunlap on
On Fri, 19 Mar 2010 14:46:10 -0400 Mathieu Desnoyers wrote:

> * Randy Dunlap (randy.dunlap(a)oracle.com) wrote:
> > On 03/18/10 17:59, Mathieu Desnoyers wrote:
> > > * Steven Rostedt (rostedt(a)goodmis.org) wrote:
> > >> On Thu, 2010-03-18 at 16:26 -0700, Randy Dunlap wrote:
> > >>> I can build/boot 2.6.33 with CONFIG_TRACE/TRACING disabled successfully,
> > >>> but when I enable lots of tracing config options and then boot with
> > >>> ftrace=nop on the kernel command line, I see a GP fault when the parport &
> > >>> parport_pc modules are loading/initializing.
> > >>
> > >> Do you see it without adding the "ftrace=nop"? The only thing that
> > >> should do is expand the ring buffer on boot up.
> > >>
> > >>>
> > >>> It happens in drivers/parport/share.c::parport_register_device(), when that
> > >>> function calls try_module_get().
> > >>>
> > >>> If I comment out the trace_module_get() calls in include/linux/module.h,
> > >>> the kernel boots with no problems.
> > >>
> > >>
> > >> Interesting. Well, trace_module_get() is a TRACE_EVENT tracepoint. But
> > >> should be disabled here. It may be something to do with DEFINE_TRACE.
> > >>
> > >> (added Mathieu to Cc since he wrote that code)
> > >
> > > can you try replacing the "local_read(__module_ref_addr(module, cpu))" argument
> > > with "0" ?
> >
> > Yes, that boots with no problems.
>
> clickety-clicketa... git blame include/linux/module.h :
>
> commit 7ead8b8313d92b3a69a1a61b0dcbc4cd66c960dc
> Author: Li Zefan <lizf(a)cn.fujitsu.com>
> Date: Mon Aug 17 16:56:28 2009 +0800
>
> tracing/events: Add module tracepoints
>
> (Adding Li Zefan in CC)
>
> Two things:
>
> 1) In this commit, most of the tracepoints contain argument with side-effects.
> These do not belong there; they should be moved into TRACE_EVENT macros.
>
> 2) There seem to be a null-pointer bug with
> local_read(__module_ref_addr(module, cpu)) in try_module_get(). This should
> be investigated even if we move the argument to TRACE_EVENT.

Hi Li,

Fix this, please?


> Thanks,
>
> Mathieu
>
> >
> > > Arguments with side-effects are not skipped by the jump over disabled
> > > instrumentation. This is why we should do that part within the probe declaration
> > > in the TRACE_EVENT macros.
> > >
> > > But if we find out that the problem really is this argument, then it should be
> > > fixed, because something would be wrong with it (just moving it to TRACE_EVENT
> > > is not a proper solution).
> > >
> > > Thanks,
> > >
> > > Mathieu


---
~Randy
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Mathieu Desnoyers on
* Randy Dunlap (randy.dunlap(a)oracle.com) wrote:
> On Fri, 19 Mar 2010 14:46:10 -0400 Mathieu Desnoyers wrote:
>
> > * Randy Dunlap (randy.dunlap(a)oracle.com) wrote:
> > > On 03/18/10 17:59, Mathieu Desnoyers wrote:
> > > > * Steven Rostedt (rostedt(a)goodmis.org) wrote:
> > > >> On Thu, 2010-03-18 at 16:26 -0700, Randy Dunlap wrote:
> > > >>> I can build/boot 2.6.33 with CONFIG_TRACE/TRACING disabled successfully,
> > > >>> but when I enable lots of tracing config options and then boot with
> > > >>> ftrace=nop on the kernel command line, I see a GP fault when the parport &
> > > >>> parport_pc modules are loading/initializing.
> > > >>
> > > >> Do you see it without adding the "ftrace=nop"? The only thing that
> > > >> should do is expand the ring buffer on boot up.
> > > >>
> > > >>>
> > > >>> It happens in drivers/parport/share.c::parport_register_device(), when that
> > > >>> function calls try_module_get().
> > > >>>
> > > >>> If I comment out the trace_module_get() calls in include/linux/module.h,
> > > >>> the kernel boots with no problems.
> > > >>
> > > >>
> > > >> Interesting. Well, trace_module_get() is a TRACE_EVENT tracepoint. But
> > > >> should be disabled here. It may be something to do with DEFINE_TRACE.
> > > >>
> > > >> (added Mathieu to Cc since he wrote that code)
> > > >
> > > > can you try replacing the "local_read(__module_ref_addr(module, cpu))" argument
> > > > with "0" ?
> > >
> > > Yes, that boots with no problems.
> >
> > clickety-clicketa... git blame include/linux/module.h :
> >
> > commit 7ead8b8313d92b3a69a1a61b0dcbc4cd66c960dc
> > Author: Li Zefan <lizf(a)cn.fujitsu.com>
> > Date: Mon Aug 17 16:56:28 2009 +0800
> >
> > tracing/events: Add module tracepoints
> >
> > (Adding Li Zefan in CC)
> >
> > Two things:
> >
> > 1) In this commit, most of the tracepoints contain argument with side-effects.
> > These do not belong there; they should be moved into TRACE_EVENT macros.
> >
> > 2) There seem to be a null-pointer bug with
> > local_read(__module_ref_addr(module, cpu)) in try_module_get(). This should
> > be investigated even if we move the argument to TRACE_EVENT.
>
> Hi Li,
>
> Fix this, please?
>

While we wait for the sun to move to other time zones, can you check if the
following patch fixes your problem ?


module: fix __module_ref_addr()

__module_ref_addr() should use per_cpu_ptr() to obfuscate the pointer
(RELOC_HIDE is needed for per cpu pointers).

This non-standard per-cpu pointer use has been introduced by commit
720eba31f47aeade8ec130ca7f4353223c49170f

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
CC: Eric Dumazet <dada1(a)cosmosbay.com>
CC: Rusty Russell <rusty(a)rustcorp.com.au>
---
include/linux/module.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6-lttng/include/linux/module.h
===================================================================
--- linux-2.6-lttng.orig/include/linux/module.h 2010-03-23 18:11:14.000000000 -0400
+++ linux-2.6-lttng/include/linux/module.h 2010-03-23 18:14:07.000000000 -0400
@@ -467,7 +467,7 @@ void symbol_put_addr(void *addr);
static inline local_t *__module_ref_addr(struct module *mod, int cpu)
{
#ifdef CONFIG_SMP
- return (local_t *) (mod->refptr + per_cpu_offset(cpu));
+ return (local_t *) per_cpu_ptr(mod->refptr, cpu);
#else
return &mod->ref;
#endif


>
> > Thanks,
> >
> > Mathieu
> >
> > >
> > > > Arguments with side-effects are not skipped by the jump over disabled
> > > > instrumentation. This is why we should do that part within the probe declaration
> > > > in the TRACE_EVENT macros.
> > > >
> > > > But if we find out that the problem really is this argument, then it should be
> > > > fixed, because something would be wrong with it (just moving it to TRACE_EVENT
> > > > is not a proper solution).
> > > >
> > > > Thanks,
> > > >
> > > > Mathieu
>
>
> ---
> ~Randy

--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Li Zefan on
Mathieu Desnoyers wrote:
> * Randy Dunlap (randy.dunlap(a)oracle.com) wrote:
>> On Fri, 19 Mar 2010 14:46:10 -0400 Mathieu Desnoyers wrote:
>>
>>> * Randy Dunlap (randy.dunlap(a)oracle.com) wrote:
>>>> On 03/18/10 17:59, Mathieu Desnoyers wrote:
>>>>> * Steven Rostedt (rostedt(a)goodmis.org) wrote:
>>>>>> On Thu, 2010-03-18 at 16:26 -0700, Randy Dunlap wrote:
>>>>>>> I can build/boot 2.6.33 with CONFIG_TRACE/TRACING disabled successfully,
>>>>>>> but when I enable lots of tracing config options and then boot with
>>>>>>> ftrace=nop on the kernel command line, I see a GP fault when the parport &
>>>>>>> parport_pc modules are loading/initializing.
>>>>>> Do you see it without adding the "ftrace=nop"? The only thing that
>>>>>> should do is expand the ring buffer on boot up.
>>>>>>
>>>>>>> It happens in drivers/parport/share.c::parport_register_device(), when that
>>>>>>> function calls try_module_get().
>>>>>>>
>>>>>>> If I comment out the trace_module_get() calls in include/linux/module.h,
>>>>>>> the kernel boots with no problems.
>>>>>>
>>>>>> Interesting. Well, trace_module_get() is a TRACE_EVENT tracepoint. But
>>>>>> should be disabled here. It may be something to do with DEFINE_TRACE.
>>>>>>
>>>>>> (added Mathieu to Cc since he wrote that code)
>>>>> can you try replacing the "local_read(__module_ref_addr(module, cpu))" argument
>>>>> with "0" ?
>>>> Yes, that boots with no problems.
>>> clickety-clicketa... git blame include/linux/module.h :
>>>
>>> commit 7ead8b8313d92b3a69a1a61b0dcbc4cd66c960dc
>>> Author: Li Zefan <lizf(a)cn.fujitsu.com>
>>> Date: Mon Aug 17 16:56:28 2009 +0800
>>>
>>> tracing/events: Add module tracepoints
>>>
>>> (Adding Li Zefan in CC)
>>>
>>> Two things:
>>>
>>> 1) In this commit, most of the tracepoints contain argument with side-effects.
>>> These do not belong there; they should be moved into TRACE_EVENT macros.
>>>
>>> 2) There seem to be a null-pointer bug with
>>> local_read(__module_ref_addr(module, cpu)) in try_module_get(). This should
>>> be investigated even if we move the argument to TRACE_EVENT.
>> Hi Li,
>>
>> Fix this, please?
>>
>
> While we wait for the sun to move to other time zones, can you check if the
> following patch fixes your problem ?
>

Sorry, I overlooked this mail thread..

I'll make a patch to move side-effects arguments from trace_module_xxx()
to the definition of TRACE_EVENT().

But it's for reducing overhead when tracepoints are disabled, this should
not be the real cultprit of the bug here.

>
> module: fix __module_ref_addr()
>
> __module_ref_addr() should use per_cpu_ptr() to obfuscate the pointer
> (RELOC_HIDE is needed for per cpu pointers).
>
> This non-standard per-cpu pointer use has been introduced by commit
> 720eba31f47aeade8ec130ca7f4353223c49170f
>

So the uptream kernel is free from this bug, because __module_ref_addr()
has gone.

> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
> CC: Eric Dumazet <dada1(a)cosmosbay.com>
> CC: Rusty Russell <rusty(a)rustcorp.com.au>
> ---
> include/linux/module.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> Index: linux-2.6-lttng/include/linux/module.h
> ===================================================================
> --- linux-2.6-lttng.orig/include/linux/module.h 2010-03-23 18:11:14.000000000 -0400
> +++ linux-2.6-lttng/include/linux/module.h 2010-03-23 18:14:07.000000000 -0400
> @@ -467,7 +467,7 @@ void symbol_put_addr(void *addr);
> static inline local_t *__module_ref_addr(struct module *mod, int cpu)
> {
> #ifdef CONFIG_SMP
> - return (local_t *) (mod->refptr + per_cpu_offset(cpu));
> + return (local_t *) per_cpu_ptr(mod->refptr, cpu);
> #else
> return &mod->ref;
> #endif
>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Randy Dunlap on
On 03/23/10 18:20, Mathieu Desnoyers wrote:
>
> While we wait for the sun to move to other time zones, can you check if the
> following patch fixes your problem ?
>

Hi Mathieu,
Yes, this works. Thanks.

I'll test Li's patch now...

>
> module: fix __module_ref_addr()
>
> __module_ref_addr() should use per_cpu_ptr() to obfuscate the pointer
> (RELOC_HIDE is needed for per cpu pointers).
>
> This non-standard per-cpu pointer use has been introduced by commit
> 720eba31f47aeade8ec130ca7f4353223c49170f
>
> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
> CC: Eric Dumazet <dada1(a)cosmosbay.com>
> CC: Rusty Russell <rusty(a)rustcorp.com.au>
> ---
> include/linux/module.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> Index: linux-2.6-lttng/include/linux/module.h
> ===================================================================
> --- linux-2.6-lttng.orig/include/linux/module.h 2010-03-23 18:11:14.000000000 -0400
> +++ linux-2.6-lttng/include/linux/module.h 2010-03-23 18:14:07.000000000 -0400
> @@ -467,7 +467,7 @@ void symbol_put_addr(void *addr);
> static inline local_t *__module_ref_addr(struct module *mod, int cpu)
> {
> #ifdef CONFIG_SMP
> - return (local_t *) (mod->refptr + per_cpu_offset(cpu));
> + return (local_t *) per_cpu_ptr(mod->refptr, cpu);
> #else
> return &mod->ref;
> #endif


--
~Randy
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Steven Rostedt on
On Wed, 2010-03-24 at 13:21 -0700, Randy Dunlap wrote:
> On 03/23/10 18:20, Mathieu Desnoyers wrote:
> >
> > While we wait for the sun to move to other time zones, can you check if the
> > following patch fixes your problem ?
> >
>
> Hi Mathieu,
> Yes, this works. Thanks.
>
> I'll test Li's patch now...
>

If it works, can you send me your "Tested-by". I have it queued to go.

Thanks,

-- Steve


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/