From: Frederic Weisbecker on
On Wed, May 05, 2010 at 07:07:40PM +0400, Cyrill Gorcunov wrote:
> Steven reported
> |
> | I'm getting:
> |
> | Pid: 3477, comm: perf Not tainted 2.6.34-rc6 #2727
> | Call Trace:
> | [<ffffffff811c7565>] debug_smp_processor_id+0xd5/0xf0
> | [<ffffffff81019874>] p4_hw_config+0x2b/0x15c
> | [<ffffffff8107acbc>] ? trace_hardirqs_on_caller+0x12b/0x14f
> | [<ffffffff81019143>] hw_perf_event_init+0x468/0x7be
> | [<ffffffff810782fd>] ? debug_mutex_init+0x31/0x3c
> | [<ffffffff810c68b2>] T.850+0x273/0x42e
> | [<ffffffff810c6cab>] sys_perf_event_open+0x23e/0x3f1
> | [<ffffffff81009e6a>] ? sysret_check+0x2e/0x69
> | [<ffffffff81009e32>] system_call_fastpath+0x16/0x1b
> |
> | When running perf record in latest tip/perf/core
> |
>
> Because P4 counters are shared between HT threads, we synthetically
> divide the whole set of counters into two non-intersecting subsets.
> While we're borrowing counters from these subsets we must not be
> preempted, so use the get_cpu/put_cpu pair.
>
> Reported-by: Steven Rostedt <rostedt(a)goodmis.org>
> Tested-by: Steven Rostedt <rostedt(a)goodmis.org>
> CC: Steven Rostedt <rostedt(a)goodmis.org>
> CC: Peter Zijlstra <peterz(a)infradead.org>
> CC: Ingo Molnar <mingo(a)elte.hu>
> CC: Frederic Weisbecker <fweisbec(a)gmail.com>
> Signed-off-by: Cyrill Gorcunov <gorcunov(a)openvz.org>
> ---
> arch/x86/kernel/cpu/perf_event_p4.c | 9 ++++++---
> 1 file changed, 6 insertions(+), 3 deletions(-)
>
> Index: linux-2.6.git/arch/x86/kernel/cpu/perf_event_p4.c
> =====================================================================
> --- linux-2.6.git.orig/arch/x86/kernel/cpu/perf_event_p4.c
> +++ linux-2.6.git/arch/x86/kernel/cpu/perf_event_p4.c
> @@ -421,7 +421,7 @@ static u64 p4_pmu_event_map(int hw_event
>
> static int p4_hw_config(struct perf_event *event)
> {
> - int cpu = raw_smp_processor_id();
> + int cpu = get_cpu();
> u32 escr, cccr;
>
> /*
> @@ -440,7 +440,7 @@ static int p4_hw_config(struct perf_even
> event->hw.config = p4_set_ht_bit(event->hw.config);
>
> if (event->attr.type != PERF_TYPE_RAW)
> - return 0;
> + goto out;
>
> /*
> * We don't control raw events so it's up to the caller
> @@ -455,6 +455,8 @@ static int p4_hw_config(struct perf_even
> (p4_config_pack_escr(P4_ESCR_MASK_HT) |
> p4_config_pack_cccr(P4_CCCR_MASK_HT));
>
> +out:
> + put_cpu();
> return 0;
> }
>
> @@ -741,7 +743,7 @@ static int p4_pmu_schedule_events(struct
> {
> unsigned long used_mask[BITS_TO_LONGS(X86_PMC_IDX_MAX)];
> unsigned long escr_mask[BITS_TO_LONGS(ARCH_P4_TOTAL_ESCR)];
> - int cpu = raw_smp_processor_id();
> + int cpu = get_cpu();
> struct hw_perf_event *hwc;
> struct p4_event_bind *bind;
> unsigned int i, thread, num;
> @@ -777,6 +779,7 @@ reserve:
> }
>
> done:
> + put_cpu();
> return num ? -ENOSPC : 0;
> }
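
For anyone following along, here is a rough sketch of what the
get_cpu()/put_cpu() pair in the patch above boils down to (an illustration
built only from the stock <linux/smp.h>/<linux/preempt.h> helpers, not the
actual perf code; the helper name and the subset mapping are made up):

#include <linux/smp.h>
#include <linux/preempt.h>

/*
 * get_cpu() is essentially preempt_disable() followed by
 * smp_processor_id(), and put_cpu() is preempt_enable().  Between the
 * two calls the task cannot migrate, so the CPU number (and the
 * HT-thread counter subset derived from it) stays valid, and the
 * debug_smp_processor_id() check that produced the backtrace above no
 * longer fires.
 */
static int pick_counter_subset(void)
{
        int cpu = get_cpu();    /* preemption disabled from here on */
        int subset = cpu & 1;   /* stand-in for the driver's real HT-thread mapping */

        /* ... borrow counters from this HT thread's subset ... */

        put_cpu();              /* preemption enabled again */
        return subset;
}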



That's no big deal. But I think schedule_events() is called at
pmu::enable() time, when preemption is already disabled.
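
(If preemption really is already off on that path, the extra pair only nests
the preempt count, so the patch stays correct either way. A minimal sketch of
the nesting, assuming the usual preempt_count semantics:)

#include <linux/smp.h>
#include <linux/preempt.h>

/* Nesting get_cpu()/put_cpu() inside a region where the caller already
 * disabled preemption is harmless: the preempt count simply goes
 * 1 -> 2 -> 1, and the only cost is the extra inc/dec. */
static void nested_get_cpu_sketch(void)
{
        preempt_disable();              /* outer level, e.g. taken by the perf core */
        {
                int cpu = get_cpu();    /* inner level: count is now 2 */

                (void)cpu;              /* ... counters would be picked here ... */
                put_cpu();              /* back to 1, still non-preemptible */
        }
        preempt_enable();               /* back to 0 */
}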

From: Cyrill Gorcunov on
On Wed, May 05, 2010 at 06:57:34PM +0200, Frederic Weisbecker wrote:
....
> > @@ -741,7 +743,7 @@ static int p4_pmu_schedule_events(struct
> > {
> > unsigned long used_mask[BITS_TO_LONGS(X86_PMC_IDX_MAX)];
> > unsigned long escr_mask[BITS_TO_LONGS(ARCH_P4_TOTAL_ESCR)];
> > - int cpu = raw_smp_processor_id();
> > + int cpu = get_cpu();
> > struct hw_perf_event *hwc;
> > struct p4_event_bind *bind;
> > unsigned int i, thread, num;
> > @@ -777,6 +779,7 @@ reserve:
> > }
> >
> > done:
> > + put_cpu();
> > return num ? -ENOSPC : 0;
> > }
>
> That's no big deal. But I think schedule_events() is called at
> pmu::enable() time, when preemption is already disabled.
>

We'll be on the safe side using get/put_cpu here (i.e. in case
something gets changed one day).

-- Cyrill
From: Frederic Weisbecker on
On Wed, May 05, 2010 at 09:42:34PM +0400, Cyrill Gorcunov wrote:
> On Wed, May 05, 2010 at 06:57:34PM +0200, Frederic Weisbecker wrote:
> ...
> > > @@ -741,7 +743,7 @@ static int p4_pmu_schedule_events(struct
> > > {
> > > unsigned long used_mask[BITS_TO_LONGS(X86_PMC_IDX_MAX)];
> > > unsigned long escr_mask[BITS_TO_LONGS(ARCH_P4_TOTAL_ESCR)];
> > > - int cpu = raw_smp_processor_id();
> > > + int cpu = get_cpu();
> > > struct hw_perf_event *hwc;
> > > struct p4_event_bind *bind;
> > > unsigned int i, thread, num;
> > > @@ -777,6 +779,7 @@ reserve:
> > > }
> > >
> > > done:
> > > + put_cpu();
> > > return num ? -ENOSPC : 0;
> > > }
> >
> > That's no big deal. But I think schedule_events() is called at
> > pmu::enable() time, when preemption is already disabled.
> >
>
> We'll be on the safe side using get/put_cpu here (i.e. in case
> something gets changed one day).
>
> -- Cyrill


Sure, no problem with that.

From: Cyrill Gorcunov on
On Thu, May 06, 2010 at 09:45:24AM -0400, Steven Rostedt wrote:
....
> > > We want the one with the least runtime overhead. These are instrumentation
> > > routines, so we want to optimize them as much as possible.
>
>
> Yeah, my point was either disable preemption or keep the checks. In
> other words, if you don't disable preemption, do not use
> raw_smp_processor_id(), because then we will not catch it if it changes
> in the future.
>
> > ok, Ingo, don't apply this patch then for a while.
>
> Send another patch, I'll test it again ;-)
>
> -- Steve
>
>

Ingo, Steven, it seems we have potential preemption in
perf_event.c:validate_group:x86_pmu.schedule_events(), which
is reached via a syscall from the userspace perf_event_open() call,
so get_cpu is still needed. But I'm a bit lost in the call
graph at the moment :(
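
One quick way to settle it empirically (a debugging sketch, assuming a
CONFIG_PREEMPT kernel; not a proposed patch) would be a one-shot warning at
the top of p4_pmu_schedule_events():

#include <linux/kernel.h>
#include <linux/preempt.h>

/* Debugging sketch: fires a single backtrace the first time the
 * function is entered with preemption enabled, so the offending call
 * path (validate_group() via perf_event_open(), if the suspicion above
 * is right) shows up in dmesg. */
static inline void assert_not_preemptible(void)
{
        WARN_ON_ONCE(preemptible());
}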

-- Cyrill
From: Cyrill Gorcunov on
On Thu, May 06, 2010 at 06:48:54PM +0400, Cyrill Gorcunov wrote:
> On Thu, May 06, 2010 at 09:45:24AM -0400, Steven Rostedt wrote:
> ...
> > > > We want the one with the least runtime overhead. These are instrumentation
> > > > routines, so we want to optimize them as much as possible.
> >
> >
> > Yeah, my point was either disable preemption or keep the checks. In
> > other words, if you don't disable preemption, do not use
> > raw_smp_processor_id(), because then we will not catch it if it changes
> > in the future.
> >
> > > ok, Ingo, don't apply this patch then for a while.
> >
> > Send another patch, I'll test it again ;-)
> >
> > -- Steve
> >
> >
>
> Ingo, Steven, it seems we have potential preemption in
> perf_event.c:validate_group:x86_pmu.schedule_events(), which
> is reached via a syscall from the userspace perf_event_open() call,
> so get_cpu is still needed. But I'm a bit lost in the call
> graph at the moment :(
>
> -- Cyrill

Steve, while I'm diving through the call graph, could you give this
patch a shot? If preemption happens, it'll trigger the warning fast.

-- Cyrill
---
arch/x86/kernel/cpu/perf_event_p4.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)

Index: linux-2.6.git/arch/x86/kernel/cpu/perf_event_p4.c
=====================================================================
--- linux-2.6.git.orig/arch/x86/kernel/cpu/perf_event_p4.c
+++ linux-2.6.git/arch/x86/kernel/cpu/perf_event_p4.c
@@ -421,7 +421,7 @@ static u64 p4_pmu_event_map(int hw_event

static int p4_hw_config(struct perf_event *event)
{
- int cpu = raw_smp_processor_id();
+ int cpu = get_cpu();
u32 escr, cccr;

/*
@@ -440,7 +440,7 @@ static int p4_hw_config(struct perf_even
event->hw.config = p4_set_ht_bit(event->hw.config);

if (event->attr.type != PERF_TYPE_RAW)
- return 0;
+ goto out;

/*
* We don't control raw events so it's up to the caller
@@ -455,6 +455,8 @@ static int p4_hw_config(struct perf_even
(p4_config_pack_escr(P4_ESCR_MASK_HT) |
p4_config_pack_cccr(P4_CCCR_MASK_HT));

+out:
+ put_cpu();
return 0;
}

@@ -741,7 +743,7 @@ static int p4_pmu_schedule_events(struct
{
unsigned long used_mask[BITS_TO_LONGS(X86_PMC_IDX_MAX)];
unsigned long escr_mask[BITS_TO_LONGS(ARCH_P4_TOTAL_ESCR)];
- int cpu = raw_smp_processor_id();
+ int cpu = smp_processor_id();
struct hw_perf_event *hwc;
struct p4_event_bind *bind;
unsigned int i, thread, num;
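
Note what the last hunk does differently from the first version of the patch:
p4_pmu_schedule_events() keeps the checked smp_processor_id() instead of
taking get_cpu(). A rough sketch of why that acts as the tripwire mentioned
above (assuming CONFIG_DEBUG_PREEMPT is set, as it clearly is in Steven's
config, given the debug_smp_processor_id() frame in his backtrace):

#include <linux/smp.h>

/* With CONFIG_DEBUG_PREEMPT the checked variant goes through
 * debug_smp_processor_id(), which prints the kind of backtrace Steven
 * reported whenever the caller can be preempted; the raw_ variant never
 * complains.  So if some path really reaches p4_pmu_schedule_events()
 * with preemption enabled, it will show up quickly under the debug
 * config. */
static int sample_cpu_with_tripwire(void)
{
        return smp_processor_id();      /* warns here if preemptible */
}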