From: Peter Zijlstra on
Hi,

I've been going over perf_disable() usage in kernel/perf_event.c and
wondered if we actually need it at all.

Currently the only thing we seem to require it for is around pmu::enable
calls (and for that powerpc at least does it itself, on x86 we rely on
it to call ->enable_all and reprogram the pmu state).

But I can't really find any NMI races wrt data structures or the like as
seems implied by some comments.

There is a fun little recursion issue with perf_adjust_period(), where
if we fully removed perf_disable() we could end up calling pmu::stop()
twice and such.

But aside from that, it looks to me like it's mostly about optimizing
hardware writes.

If nobody else knows about or can find anything, I'm going to mostly remove
perf_disable() for now and later think about how to optimize the
hardware writes again.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Robert Richter on
On 11.06.10 12:29:44, Peter Zijlstra wrote:
> I've been going over perf_disable() usage in kernel/perf_event.c and
> wondered if we actually need it at all.
>
> Currently the only thing we seem to require it for is around pmu::enable
> calls (and for that powerpc at least does it itself, on x86 we rely on
> it to call ->enable_all and reprogram the pmu state).
>
> But I can't really find any NMI races wrt data structures or the like as
> seems implied by some comments.

Yes, it was originally used to disable NMIs for some critical sections
in the non-arch code. I do not remember exactly where this was needed,
but my feeling is also that this can be optimized and maybe reimplemented
as non-locking code.

We should also avoid the enable_all()/disable_all() functions in the x86
implementation, as they are expensive on some PMUs (namely AMD). It looks
like these functions could then be removed too, or at least made
model-specific.

-Robert

--
Advanced Micro Devices, Inc.
Operating System Research Center
email: robert.richter(a)amd.com

From: Frederic Weisbecker on
On Fri, Jun 11, 2010 at 06:29:44PM +0200, Peter Zijlstra wrote:
> Hi,
>
> I've been going over perf_disable() usage in kernel/perf_event.c and
> wondered if we actually need it at all.
>
> Currently the only thing we seem to require it for is around pmu::enable
> calls (and for that powerpc at least does it itself, on x86 we rely on
> it to call ->enable_all and reprogram the pmu state).
>
> But I can't really find any NMI races wrt data structures or the like as
> seems implied by some comments.



I suspect the problem is also one of per-context integrity. When you adjust
the period, or enable or disable a counter, that counter falls out of sync
with the rest of the group, or with the rest of the counters in the same
context, for a short while.

The longer you run your events, the more this jitter accumulates.

Take an example: when you adjust a period, you do:

perf_disable()
perf_event_stop()
left_period = 0
perf_event_start()
perf_enable()

During all this time, the given event is paused, but the whole rest of
the events running on the cpu continue to count.
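
To make the skew concrete, here is a minimal userspace sketch. It is a toy
model, not kernel code: struct toy_pmu, toy_tick() and toy_adjust_period()
are hypothetical names invented for illustration, and the "cost" of an
adjustment is simply assumed to be a number of hardware events that occur
while the counter is being reprogrammed.

```c
#include <assert.h>

#define TOY_NR_COUNTERS 2

/* Toy model only; none of these names exist in the kernel. */
struct toy_pmu {
	int all_disabled;                  /* models the perf_disable() state */
	int stopped[TOY_NR_COUNTERS];      /* models a per-event stop */
	unsigned long count[TOY_NR_COUNTERS];
};

/* Let n hardware events occur; only running counters observe them. */
static void toy_tick(struct toy_pmu *p, unsigned long n)
{
	for (int i = 0; i < TOY_NR_COUNTERS; i++)
		if (!p->all_disabled && !p->stopped[i])
			p->count[i] += n;
}

/*
 * Adjust the period of counter idx. The adjustment is assumed to take
 * `cost` events' worth of time. With global_disable set, every counter
 * is frozen for that window; without it, only counter idx is.
 */
static void toy_adjust_period(struct toy_pmu *p, int idx,
			      unsigned long cost, int global_disable)
{
	assert(idx >= 0 && idx < TOY_NR_COUNTERS);
	if (global_disable)
		p->all_disabled = 1;
	p->stopped[idx] = 1;
	toy_tick(p, cost);              /* time passes while we reprogram */
	p->stopped[idx] = 0;
	if (global_disable)
		p->all_disabled = 0;
}
```

In this model, an adjustment of counter 0 under the global disable leaves
both counters equal, at the price of also freezing counter 1. Without the
global disable, counter 1 keeps counting during the window and ends up
ahead of counter 0 by the cost of the adjustment.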

The problem is the same on context switch.

And I think this fine-grained per-context synchronisation matters,
especially for perf stat kinds of workflows.

(Although software events are not touched by perf_enable()/perf_disable().)


>
> There is a fun little recursion issue with perf_adjust_period(), where
> if we fully removed perf_disable() we could end up calling pmu::stop()
> twice and such.
>
> But aside from that, it looks to me like it's mostly about optimizing
> hardware writes.
>
> If nobody else knows about or can find anything, I'm going to mostly remove
> perf_disable() for now and later think about how to optimize the
> hardware writes again.


Not sure that's a good idea IMHO.

From: Peter Zijlstra on
On Fri, 2010-06-11 at 19:17 +0200, Frederic Weisbecker wrote:
> On Fri, Jun 11, 2010 at 06:29:44PM +0200, Peter Zijlstra wrote:
> > Hi,
> >
> > I've been going over perf_disable() usage in kernel/perf_event.c and
> > wondered if we actually need it at all.
> >
> > Currently the only thing we seem to require it for is around pmu::enable
> > calls (and for that powerpc at least does it itself, on x86 we rely on
> > it to call ->enable_all and reprogram the pmu state).
> >
> > But I can't really find any NMI races wrt data structures or the like as
> > seems implied by some comments.
>
>
>
> I suspect the problem is also one of per-context integrity. When you adjust
> the period, or enable or disable a counter, that counter falls out of sync
> with the rest of the group, or with the rest of the counters in the same
> context, for a short while.
>
> The longer you run your events, the more this jitter accumulates.
>
> Take an example: when you adjust a period, you do:
>
> perf_disable()
> perf_event_stop()
> left_period = 0
> perf_event_start()
> perf_enable()
>
> During all this time, the given event is paused, but the whole rest of
> the events running on the cpu continue to count.
>
> The problem is the same on context switch.
>
> And I think this fine-grained per-context synchronisation matters,
> especially for perf stat kinds of workflows.

I'm not sure what you're arguing, but the knife cuts both ways: the
above also stops counters that shouldn't be stopped.

> > There is a fun little recursion issue with perf_adjust_period(), where
> > if we fully removed perf_disable() we could end up calling pmu::stop()
> > twice and such.
> >
> > But aside from that, it looks to me like it's mostly about optimizing
> > hardware writes.
> >
> > If nobody else knows about or can find anything, I'm going to mostly remove
> > perf_disable() for now and later think about how to optimize the
> > hardware writes again.
>
>
> Not sure that's a good idea IMHO.

Well, we need to do something. The current weak mess needs to go, and
the alternative is basically a loop over all registered pmus calling
their respective pmu::disable_all, which is utter suckage, so removing
as many of these as possible is a good thing.

We can always come up with some lazy mode later that tries to batch the
hardware writes.
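
One possible shape for such a lazy mode, as a rough userspace sketch only:
keep a shadow copy of the desired register values, mark registers dirty
when the shadow differs from what the hardware holds, and write out only
the dirty ones at a single flush point. Everything here (struct lazy_pmu,
lazy_write(), lazy_flush(), the hw[] array standing in for real counter
registers) is a hypothetical illustration, not an existing kernel
interface.

```c
#include <assert.h>
#include <stdint.h>

#define LAZY_NR_REGS 4

/* Hypothetical model: hw[] stands in for the real PMU registers. */
struct lazy_pmu {
	uint64_t shadow[LAZY_NR_REGS];  /* values we want in hardware */
	uint64_t hw[LAZY_NR_REGS];      /* values hardware currently has */
	unsigned int dirty;             /* bit n set: register n needs a write */
	unsigned int hw_writes;         /* counts the expensive writes */
};

/* Record the desired value; no hardware access happens here. */
static void lazy_write(struct lazy_pmu *p, int reg, uint64_t val)
{
	assert(reg >= 0 && reg < LAZY_NR_REGS);
	p->shadow[reg] = val;
	if (p->shadow[reg] != p->hw[reg])
		p->dirty |= 1u << reg;
	else
		p->dirty &= ~(1u << reg);   /* change was undone, nothing to do */
}

/* Push all pending changes out in one pass. */
static void lazy_flush(struct lazy_pmu *p)
{
	for (int reg = 0; reg < LAZY_NR_REGS; reg++) {
		if (p->dirty & (1u << reg)) {
			p->hw[reg] = p->shadow[reg];
			p->hw_writes++;
		}
	}
	p->dirty = 0;
}
```

With this scheme, writing a register twice before the flush costs one
hardware write, and a change that is reverted before the flush costs none.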
From: Frederic Weisbecker on
On Fri, Jun 11, 2010 at 06:29:44PM +0200, Peter Zijlstra wrote:
> There is a fun little recursion issue with perf_adjust_period(), where
> if we fully removed perf_disable() we could end up calling pmu::stop()
> twice and such.



We could keep a local_t nesting level on the stop/start paths, which would
deal with this easily.
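
A minimal userspace sketch of the idea, under loud assumptions: a plain int
stands in for the local_t, and struct sketch_event, sketch_event_stop() and
sketch_event_start() are hypothetical names, not the kernel's. Only the
outermost stop and the matching outermost start reach the "hardware", so a
nested call from something like perf_adjust_period() cannot stop the
counter twice.

```c
#include <assert.h>

/* Hypothetical stand-in for per-event state; not struct perf_event. */
struct sketch_event {
	int stop_depth;   /* nesting level; a local_t in a kernel version */
	int hw_stops;     /* how often the hardware stop was really issued */
	int hw_starts;    /* how often the hardware start was really issued */
};

/* Stop the event; only the outermost call touches the hardware. */
static void sketch_event_stop(struct sketch_event *e)
{
	if (e->stop_depth++ == 0)
		e->hw_stops++;          /* stands in for pmu::stop() */
}

/* Start the event; only the call that unwinds to depth 0 restarts it. */
static void sketch_event_start(struct sketch_event *e)
{
	assert(e->stop_depth > 0);      /* an unbalanced start is a bug */
	if (--e->stop_depth == 0)
		e->hw_starts++;         /* stands in for pmu::start() */
}
```

Two nested stop/start pairs then result in exactly one hardware stop and
one hardware start.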
