perf: Store active software events in a hashlist [Kernel]


From: Peter Zijlstra on 6 Apr 2010 11:30

On Mon, 2010-04-05 at 16:08 +0200, Frederic Weisbecker wrote:
> Each time a software event triggers, we need to walk through
> the entire list of events from the current cpu and task contexts
> to retrieve a running perf event that matches.
> We also need to check a matching perf event is actually counting.
>
> This walk is wasteful and makes the event fast path scaling
> down with a growing number of events running on the same
> contexts.
>
> To solve this, we store the running perf events in a hashlist to
> get an immediate access to them against their type:event_id when
> they trigger.

So we have a hash-table per-cpu, each event takes a ref on the hash
table, when the thing is empty we free it.

When the event->cpu == -1 (all cpus) we take a ref on all possible cpu's
hash-table (should be online I figure, but that requires adding a
hotplug handler).

Then on event enable/disable we actually add the event to the hash-table
belonging to the cpu the event/task gets scheduled on, since each event
can only ever be active on one cpu.

Right?

So looks good, although I think we want to do that online/hotplug thing.
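
To make the scheme summarized above concrete, here is a minimal userspace sketch. It is not the code from the patch: every name in it (cpu_hlists, hlist_get_cpu, sw_event_enable, ...) is hypothetical, the hash function is an arbitrary choice, and it omits the locking/RCU a kernel version would need. It only shows the three pieces: a refcounted table per cpu, a ref taken on every possible cpu when event->cpu == -1, and insertion into the table of the one cpu the event is scheduled on.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define NR_CPUS    4                    /* hypothetical cpu count */
#define HLIST_BITS 8
#define HLIST_SIZE (1u << HLIST_BITS)

struct sw_event {
	uint32_t type;
	uint64_t event_id;
	int cpu;                        /* -1 means "all cpus" */
	struct sw_event *next;          /* bucket chaining */
};

struct cpu_hlist {
	int refcount;                   /* one ref per event using this cpu */
	struct sw_event **buckets;      /* allocated on first ref, freed on last put */
};

static struct cpu_hlist cpu_hlists[NR_CPUS];

/* Hash type:event_id down to HLIST_BITS bits (multiplicative hash, an assumption). */
static unsigned int sw_hash(uint32_t type, uint64_t event_id)
{
	uint64_t key = ((uint64_t)type << 32) ^ event_id;

	return (unsigned int)((key * 0x9E3779B97F4A7C15ull) >> (64 - HLIST_BITS));
}

static int hlist_get_cpu(int cpu)
{
	struct cpu_hlist *h = &cpu_hlists[cpu];

	if (h->refcount == 0) {
		h->buckets = calloc(HLIST_SIZE, sizeof(*h->buckets));
		if (!h->buckets)
			return -1;
	}
	h->refcount++;
	return 0;
}

static void hlist_put_cpu(int cpu)
{
	struct cpu_hlist *h = &cpu_hlists[cpu];

	assert(h->refcount > 0);
	if (--h->refcount == 0) {       /* table is unused: free it */
		free(h->buckets);
		h->buckets = NULL;
	}
}

/* Event creation: take a ref on one cpu, or on all of them for cpu == -1. */
static int sw_event_init(struct sw_event *e)
{
	if (e->cpu != -1)
		return hlist_get_cpu(e->cpu);
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (hlist_get_cpu(cpu) < 0) {
			while (cpu-- > 0)       /* roll back refs already taken */
				hlist_put_cpu(cpu);
			return -1;
		}
	}
	return 0;
}

static void sw_event_destroy(struct sw_event *e)
{
	if (e->cpu != -1) {
		hlist_put_cpu(e->cpu);
		return;
	}
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		hlist_put_cpu(cpu);
}

/* Enable: hash the event into the table of the cpu it is scheduled on. */
static void sw_event_enable(struct sw_event *e, int cpu)
{
	struct sw_event **head = &cpu_hlists[cpu].buckets[sw_hash(e->type, e->event_id)];

	e->next = *head;
	*head = e;
}

/* Disable: unlink the event from that cpu's table. */
static void sw_event_disable(struct sw_event *e, int cpu)
{
	struct sw_event **pp = &cpu_hlists[cpu].buckets[sw_hash(e->type, e->event_id)];

	while (*pp && *pp != e)
		pp = &(*pp)->next;
	if (*pp)
		*pp = e->next;
}

/* Fast path: walk a single bucket instead of the whole context list. */
static struct sw_event *sw_event_lookup(int cpu, uint32_t type, uint64_t event_id)
{
	struct cpu_hlist *h = &cpu_hlists[cpu];

	if (!h->buckets)
		return NULL;
	for (struct sw_event *e = h->buckets[sw_hash(type, event_id)]; e; e = e->next)
		if (e->type == type && e->event_id == event_id)
			return e;
	return NULL;
}
```

In this sketch an event with cpu == -1 holds a ref on every table for its whole lifetime, but at any moment it sits in at most one of them.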

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: Peter Zijlstra on 7 Apr 2010 05:10

On Mon, 2010-04-05 at 16:08 +0200, Frederic Weisbecker wrote:
> +#define SWEVENT_HLIST_BITS 8
> +#define SWEVENT_HLIST_SIZE ((1 << (SWEVENT_HLIST_BITS + 1)) - 1)

That seems to result in 9 bits worth, doesn't it?
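
The arithmetic behind that remark can be checked with a small standalone sketch. The two macro names with a suffix and the bits_needed() helper are made up for illustration; the 8-bit variant is only one possible correction, not something taken from the patch.

```c
#include <assert.h>

#define SWEVENT_HLIST_BITS 8

/* As posted: (1 << 9) - 1 = 511, i.e. binary 1 1111 1111. */
#define SWEVENT_HLIST_SIZE_POSTED ((1 << (SWEVENT_HLIST_BITS + 1)) - 1)

/* Hypothetical 8-bit variant: 256 buckets, indices 0..255. */
#define SWEVENT_HLIST_SIZE_8BIT (1 << SWEVENT_HLIST_BITS)

/* Number of bits needed to represent v. */
static unsigned int bits_needed(unsigned int v)
{
	unsigned int n = 0;

	while (v) {
		n++;
		v >>= 1;
	}
	return n;
}
```

bits_needed(SWEVENT_HLIST_SIZE_POSTED) is 9, while the largest index of a 256-entry table, 255, needs only 8 bits.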

From: Peter Zijlstra on 7 Apr 2010 05:10

On Tue, 2010-04-06 at 17:27 +0200, Peter Zijlstra wrote:
> On Mon, 2010-04-05 at 16:08 +0200, Frederic Weisbecker wrote:
> > Each time a software event triggers, we need to walk through
> > the entire list of events from the current cpu and task contexts
> > to retrieve a running perf event that matches.
> > We also need to check a matching perf event is actually counting.
> >
> > This walk is wasteful and makes the event fast path scaling
> > down with a growing number of events running on the same
> > contexts.
> >
> > To solve this, we store the running perf events in a hashlist to
> > get an immediate access to them against their type:event_id when
> > they trigger.
>
> So we have a hash-table per-cpu, each event takes a ref on the hash
> table, when the thing is empty we free it.
>
> When the event->cpu == -1 (all cpus) we take a ref on all possible cpu's
> hash-table (should be online I figure, but that requires adding a
> hotplug handler).
>
> Then on event enable/disable we actually add the event to the hash-table
> belonging to the cpu the event/task gets scheduled on, since each event
> can only ever be active on one cpu.
>
> Right?
>
> So looks good, although I think we want to do that online/hotplug thing.

Alternatively, you can simply put the hash table into the per-cpu
structure and not allocate it; it's only a single page (half a page if
you use 32bit or actually use 8 bits).
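
A quick sketch of the size estimate, assuming a 4096-byte page and one pointer per bucket head (as with the kernel's struct hlist_head). The helper name hlist_table_bytes() is hypothetical, and the pointer size is passed in explicitly so the 32-bit case can be computed on any machine.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes used by a table of (1 << bits) bucket heads, one pointer each. */
static size_t hlist_table_bytes(unsigned int bits, size_t ptr_size)
{
	return ((size_t)1 << bits) * ptr_size;
}
```

With 9 bits and 8-byte pointers this gives 4096 bytes, one page under the assumption above. Either 4-byte pointers or 8 bits halves it to 2048. The 511-entry size from the posted macro comes to 4088 bytes with 8-byte pointers, still just under a page.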

From: Frederic Weisbecker on 7 Apr 2010 08:00

On Wed, Apr 07, 2010 at 11:04:53AM +0200, Peter Zijlstra wrote:
> On Tue, 2010-04-06 at 17:27 +0200, Peter Zijlstra wrote:
> > On Mon, 2010-04-05 at 16:08 +0200, Frederic Weisbecker wrote:
> > > Each time a software event triggers, we need to walk through
> > > the entire list of events from the current cpu and task contexts
> > > to retrieve a running perf event that matches.
> > > We also need to check a matching perf event is actually counting.
> > >
> > > This walk is wasteful and makes the event fast path scaling
> > > down with a growing number of events running on the same
> > > contexts.
> > >
> > > To solve this, we store the running perf events in a hashlist to
> > > get an immediate access to them against their type:event_id when
> > > they trigger.
> >
> > So we have a hash-table per-cpu, each event takes a ref on the hash
> > table, when the thing is empty we free it.
> >
> > When the event->cpu == -1 (all cpus) we take a ref on all possible cpu's
> > hash-table (should be online I figure, but that requires adding a
> > hotplug handler).
> >
> > Then on event enable/disable we actually add the event to the hash-table
> > belonging to the cpu the event/task gets scheduled on, since each event
> > can only ever be active on one cpu.
> >
> > Right?
> >
> > So looks good, although I think we want to do that online/hotplug thing.
>
> Alternatively, you can simply put the hash table into the per-cpu
> structure and not allocate it; it's only a single page (half a page if
> you use 32bit or actually use 8 bits).

As you prefer. This would indeed make it more simple, but that would also
make these pages unused most of the time.

From: Peter Zijlstra on 7 Apr 2010 08:00

On Wed, 2010-04-07 at 13:56 +0200, Frederic Weisbecker wrote:
>
> >
> > So looks good, although I think we want to do that online/hotplug thing.
>
>
> That would let us allocate on online cpus instead of possibles? Yeah right.

Right, so if you want to go this route (and not simply embed it in the
percpu data), the complication I thought of is that you want the
refcount on offline cpus but not the allocation, since it's very hard to
reconstruct the number of events that had event->cpu == -1.
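
One way to picture that split between refcount and allocation is the userspace sketch below. It is an illustration only: the callbacks cpu_online_cb()/cpu_offline_cb() stand in for a hotplug handler, all names are hypothetical, and locking is omitted. The refcount is tracked for every possible cpu, while the buckets exist only while the cpu is online and referenced.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define NR_POSSIBLE_CPUS 4              /* hypothetical */
#define HLIST_SIZE       256

struct cpu_hlist {
	int refcount;                   /* kept for every possible cpu, online or not */
	bool online;
	void **buckets;                 /* allocated only while online && refcount > 0 */
};

static struct cpu_hlist hl[NR_POSSIBLE_CPUS];

/* Take a ref; allocate only if the cpu is currently online. */
static int hlist_get(int cpu)
{
	struct cpu_hlist *h = &hl[cpu];

	if (h->online && !h->buckets) {
		h->buckets = calloc(HLIST_SIZE, sizeof(*h->buckets));
		if (!h->buckets)
			return -1;
	}
	h->refcount++;
	return 0;
}

static void hlist_put(int cpu)
{
	struct cpu_hlist *h = &hl[cpu];

	assert(h->refcount > 0);
	if (--h->refcount == 0) {
		free(h->buckets);       /* free(NULL) is a no-op for offline cpus */
		h->buckets = NULL;
	}
}

/* Stand-in for the hotplug "cpu coming up" path. */
static int cpu_online_cb(int cpu)
{
	struct cpu_hlist *h = &hl[cpu];

	h->online = true;
	if (h->refcount > 0 && !h->buckets) {
		/* The refs taken while offline tell us a table is needed now. */
		h->buckets = calloc(HLIST_SIZE, sizeof(*h->buckets));
		if (!h->buckets) {
			h->online = false;
			return -1;
		}
	}
	return 0;
}

/* Stand-in for the hotplug "cpu going down" path: drop memory, keep refs. */
static void cpu_offline_cb(int cpu)
{
	struct cpu_hlist *h = &hl[cpu];

	h->online = false;
	free(h->buckets);
	h->buckets = NULL;
}
```

Because the count survives offline periods, the online callback never has to reconstruct how many event->cpu == -1 events exist; it only checks whether the count is non-zero.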
