From: Peter Zijlstra on
On Thu, 2010-06-10 at 05:49 +0200, Frederic Weisbecker wrote:
> In order to introduce new context exclusions, software events will
> have to eventually stop when needed. We'll want perf_event_stop() to
> act on every event.
>
> To achieve this, remove the stub stop/start pmu callbacks of software
> and tracepoint events.
>
> This may even optimize the case of hardware and software events
> running at the same time: now we only stop/start all hardware
> events when we reset a hardware event period, no longer when we
> reset a software event period.
>
> Signed-off-by: Frederic Weisbecker <fweisbec(a)gmail.com>
> Cc: Ingo Molnar <mingo(a)elte.hu>
> Cc: Peter Zijlstra <a.p.zijlstra(a)chello.nl>
> Cc: Arnaldo Carvalho de Melo <acme(a)redhat.com>
> Cc: Paul Mackerras <paulus(a)samba.org>
> Cc: Stephane Eranian <eranian(a)google.com>
> Cc: Cyrill Gorcunov <gorcunov(a)gmail.com>
> Cc: Zhang Yanmin <yanmin_zhang(a)linux.intel.com>
> Cc: Steven Rostedt <rostedt(a)goodmis.org>
> ---
> kernel/perf_event.c | 29 ++++++++++++++++-------------
> 1 files changed, 16 insertions(+), 13 deletions(-)
>
> diff --git a/kernel/perf_event.c b/kernel/perf_event.c
> index c772a3d..5c004f7 100644
> --- a/kernel/perf_event.c
> +++ b/kernel/perf_event.c
> @@ -1541,11 +1541,23 @@ static void perf_adjust_period(struct perf_event *event, u64 nsec, u64 count)
> hwc->sample_period = sample_period;
>
> if (local64_read(&hwc->period_left) > 8*sample_period) {
> - perf_disable();
> - perf_event_stop(event);
> + bool software_event = is_software_event(event);
> +
> + /*
> + * Only hardware events need their irq period to be
> + * reprogrammed
> + */
> + if (!software_event) {
> + perf_disable();
> + perf_event_stop(event);
> + }
> +
> local64_set(&hwc->period_left, 0);
> - perf_event_start(event);
> - perf_enable();
> +
> + if (!software_event) {
> + perf_event_start(event);
> + perf_enable();
> + }
> }
> }
>
> @@ -4286,16 +4298,9 @@ static void perf_swevent_void(struct perf_event *event)
> {
> }
>
> -static int perf_swevent_int(struct perf_event *event)
> -{
> - return 0;
> -}
> -
> static const struct pmu perf_ops_generic = {
> .enable = perf_swevent_enable,
> .disable = perf_swevent_disable,
> - .start = perf_swevent_int,
> - .stop = perf_swevent_void,
> .read = perf_swevent_read,
> .unthrottle = perf_swevent_void, /* hwc->interrupts already reset */
> };
> @@ -4578,8 +4583,6 @@ static int swevent_hlist_get(struct perf_event *event)
> static const struct pmu perf_ops_tracepoint = {
> .enable = perf_trace_enable,
> .disable = perf_trace_disable,
> - .start = perf_swevent_int,
> - .stop = perf_swevent_void,
> .read = perf_swevent_read,
> .unthrottle = perf_swevent_void,
> };

I really don't like this.. we should be removing differences between
software and hardware pmu implementations, not adding more :/

Something like the below would work, the only 'problem' is that it grows
hw_perf_event.

---
include/linux/perf_event.h | 1 +
kernel/perf_event.c | 27 ++++++++++++++++++---------
2 files changed, 19 insertions(+), 9 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 9073bde..2292659 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -531,6 +531,7 @@ struct hw_perf_event {
struct { /* software */
s64 remaining;
struct hrtimer hrtimer;
+ int stopped;
};
#ifdef CONFIG_HAVE_HW_BREAKPOINT
/* breakpoint */
diff --git a/kernel/perf_event.c b/kernel/perf_event.c
index 403d180..14b691e 100644
--- a/kernel/perf_event.c
+++ b/kernel/perf_event.c
@@ -4113,6 +4113,9 @@ static int perf_swevent_match(struct perf_event *event,
struct perf_sample_data *data,
struct pt_regs *regs)
{
+ if (event->hw.stopped)
+ return 0;
+
if (event->attr.type != type)
return 0;

@@ -4282,22 +4285,28 @@ static void perf_swevent_disable(struct perf_event *event)
hlist_del_rcu(&event->hlist_entry);
}

-static void perf_swevent_void(struct perf_event *event)
+static void perf_swevent_throttle(struct perf_event *event)
{
+ /* hwc->interrupts already reset */
}

-static int perf_swevent_int(struct perf_event *event)
+static int perf_swevent_start(struct perf_event *event)
{
- return 0;
+ event->hw.stopped = 0;
+}
+
+static void perf_swevent_stop(struct perf_event *event)
+{
+ event->hw.stopped = 1;
}

static const struct pmu perf_ops_generic = {
.enable = perf_swevent_enable,
.disable = perf_swevent_disable,
- .start = perf_swevent_int,
- .stop = perf_swevent_void,
+ .start = perf_swevent_start,
+ .stop = perf_swevent_stop,
.read = perf_swevent_read,
- .unthrottle = perf_swevent_void, /* hwc->interrupts already reset */
+ .unthrottle = perf_swevent_throttle,
};

/*
@@ -4578,10 +4587,10 @@ static int swevent_hlist_get(struct perf_event *event)
static const struct pmu perf_ops_tracepoint = {
.enable = perf_trace_enable,
.disable = perf_trace_disable,
- .start = perf_swevent_int,
- .stop = perf_swevent_void,
+ .start = perf_swevent_start,
+ .stop = perf_swevent_stop,
.read = perf_swevent_read,
- .unthrottle = perf_swevent_void,
+ .unthrottle = perf_swevent_throttle,
};

static int perf_tp_filter_match(struct perf_event *event,
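
The idea in the patch above can be tried outside the kernel. What follows is
only a minimal userspace sketch, not kernel code: struct sw_event and the
sw_event_*() helpers are hypothetical stand-ins for hw_perf_event and the
perf_swevent_*() callbacks. The start callback returns 0 because it takes
over from a stub that returned int.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the software part of struct hw_perf_event. */
struct sw_event {
	int type;             /* event type to match against */
	int stopped;          /* set by stop(), cleared by start() */
	unsigned long count;  /* occurrences recorded so far */
};

/* start callback: returns int, like the stub it replaces */
static int sw_event_start(struct sw_event *event)
{
	assert(event != NULL);
	event->stopped = 0;
	return 0;
}

/* stop callback: the event stays registered but no longer matches */
static void sw_event_stop(struct sw_event *event)
{
	assert(event != NULL);
	event->stopped = 1;
}

/* Mirrors the check added to perf_swevent_match(): a stopped event never matches */
static int sw_event_match(const struct sw_event *event, int type)
{
	if (event->stopped)
		return 0;

	if (event->type != type)
		return 0;

	return 1;
}

/* Record one occurrence, but only if the event currently matches */
static void sw_event_fire(struct sw_event *event, int type)
{
	if (sw_event_match(event, type))
		event->count++;
}
```

The cost is the one named above: one extra int per event, plus one extra
test in the match path.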


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Peter Zijlstra on
On Thu, 2010-06-10 at 12:46 +0200, Peter Zijlstra wrote:
>
> Something like the below would work, the only 'problem' is that it grows
> hw_perf_event.

If we do the whole PAUSEd thing right, we'd not need this I think.
From: Ingo Molnar on

* Peter Zijlstra <peterz(a)infradead.org> wrote:

> Something like the below would work, the only 'problem' is that it grows
> hw_perf_event.

> @@ -531,6 +531,7 @@ struct hw_perf_event {
> struct { /* software */
> s64 remaining;
> struct hrtimer hrtimer;
> + int stopped;

IMO that's ok.

Ingo
From: Peter Zijlstra on
On Thu, 2010-06-10 at 18:12 +0200, Frederic Weisbecker wrote:
> On Thu, Jun 10, 2010 at 01:10:42PM +0200, Peter Zijlstra wrote:
> > On Thu, 2010-06-10 at 12:46 +0200, Peter Zijlstra wrote:
> > >
> > > Something like the below would work, the only 'problem' is that it grows
> > > hw_perf_event.
> >
> > If we do the whole PAUSEd thing right, we'd not need this I think.
>
>
> It's not needed, and moreover software_pmu:stop/start() can be the same
> as software_pmu:disable/enable() without the need to add another check
> in the fast path.
>
> But we need perf_event_stop/start() to work on software events. And in fact
> now that we use hlist_del_init, it's safe, but a bit wasteful in
> the period reset path. That's another problem that is not critical, but
> if you want to solve this by ripping out the differences between software
> and hardware (which I agree with), we need a ->reset_period callback.
>
Why? ->start() should reprogram the hardware, so a
->stop()/poke-at-state/->start() cycle is much more flexible.
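
To make the stop/poke/start cycle concrete, here is a rough userspace sketch.
Everything in it is made up for illustration: struct toy_pmu, struct
toy_event and the toy_*() helpers are not kernel interfaces, and the re-arm
rule in toy_start() is only loosely modeled on how a hardware pmu might
reprogram its counter.

```c
#include <assert.h>
#include <stddef.h>

struct toy_event;

/* Hypothetical, reduced callback table; the real struct pmu has more. */
struct toy_pmu {
	int  (*start)(struct toy_event *event);
	void (*stop)(struct toy_event *event);
};

struct toy_event {
	const struct toy_pmu *pmu;
	long long sample_period;
	long long period_left;
	long long programmed;  /* value last written to the simulated counter */
	int running;
};

static void toy_stop(struct toy_event *event)
{
	event->running = 0;
}

/* start() re-arms the counter from whatever state it finds */
static int toy_start(struct toy_event *event)
{
	long long left = event->period_left;

	if (left <= 0)
		left += event->sample_period;

	event->programmed = left;
	event->running = 1;
	return 0;
}

static const struct toy_pmu toy_pmu_ops = {
	.start = toy_start,
	.stop  = toy_stop,
};

/* Same shape as the period reset path: stop, poke at state, start */
static void toy_adjust_period(struct toy_event *event, long long sample_period)
{
	assert(event != NULL && event->pmu != NULL);

	event->sample_period = sample_period;

	if (event->period_left > 8 * sample_period) {
		event->pmu->stop(event);
		event->period_left = 0;    /* state change while stopped */
		event->pmu->start(event);  /* start() picks up the new state */
	}
}
```

Because start() always reprograms from the current state, the caller can
change any field between stop() and start() without a dedicated callback per
field.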
From: Peter Zijlstra on
On Thu, 2010-06-10 at 18:29 +0200, Frederic Weisbecker wrote:

> Imagine you have several software and hardware events running on the
> same cpu. Each time you reset the period of a software event, you do
> a hw_pmu_disable() / hw_pmu_enable(), which writes/reads the hardware
> registers for each hardware event, amongst other wasteful things.

hw_perf_disable/enable() are on their way out. They should be replaced
with a struct pmu callback. We must remove all these weak functions if
we want to support multiple pmus.
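
A rough userspace sketch of what per-pmu callbacks buy us. All of it is
hypothetical: the pmu_disable/pmu_enable members, struct toy_pmu, struct
toy_event and the toy_*() helpers are names picked for this illustration,
not the existing interface. The point is only that a global weak function
cannot tell pmus apart, while a callback lets a software pmu skip the
hardware entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-pmu disable/enable callbacks instead of global weak functions */
struct toy_pmu {
	int hw_writes;  /* simulated hardware register writes */
	void (*pmu_disable)(struct toy_pmu *pmu);
	void (*pmu_enable)(struct toy_pmu *pmu);
};

struct toy_event {
	struct toy_pmu *pmu;
	long long period_left;
};

/* A hardware pmu has to touch its registers on disable and enable */
static void toy_hw_disable(struct toy_pmu *pmu) { pmu->hw_writes++; }
static void toy_hw_enable(struct toy_pmu *pmu)  { pmu->hw_writes++; }

/* A software pmu has no registers, so both callbacks do nothing */
static void toy_sw_nop(struct toy_pmu *pmu) { (void)pmu; }

static struct toy_pmu toy_hw_pmu = {
	.pmu_disable = toy_hw_disable,
	.pmu_enable  = toy_hw_enable,
};

static struct toy_pmu toy_sw_pmu = {
	.pmu_disable = toy_sw_nop,
	.pmu_enable  = toy_sw_nop,
};

/* The period reset only disables the pmu that owns the event */
static void toy_reset_period(struct toy_event *event)
{
	assert(event != NULL && event->pmu != NULL);

	event->pmu->pmu_disable(event->pmu);
	event->period_left = 0;
	event->pmu->pmu_enable(event->pmu);
}
```

With this shape the reset path is identical for both kinds of event, and
resetting a software event leaves the hardware counters alone.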

