From: Corey Ashford on


On 3/11/2010 11:14 AM, Ingo Molnar wrote:
>
> * Corey Ashford<cjashfor(a)linux.vnet.ibm.com> wrote:
[snip]
>> I'm not sure how that would work. The issue I am trying to solve
>> here is that Power arch chips have a large number of very
>> hardware-specific events that are not generalizable. Many of these
>> events not only have names, but other user-configurable bits as well
>> that select or narrow the scope of which exact events are recorded.
>> This issue is dealt with nicely in libpfm4, as it has mechanisms for
>> parsing event names and attributes (aka modifiers or unit masks),
>> and then produces a usable config field for the perf_event_attr
>> struct.
>>
>> Should I take it from the above that you are completely against the
>> idea of using an external library for hardware-specific event and
>> attribute naming?
>
> Could you give a few relevant examples of events in question, and the kind of
> configurability/attributes they have on Power?

Here are a few examples for the Power A2 processor. I've distorted the names
because the PMU architecture isn't publicly released yet.

PM_DE_PMC_9:hrd_mask=0xff:hrd=0x22:pma_mask=0x3fff:pma=0x1b2d:culling_mode=3
PM_EX_0x03:lane=2:vlane=1
PM_OWE_ENG_MAC_FULL:usu=3

Note that the attribute fields shown above are fitted into the config field of
the perf_event_attr struct.
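
As a rough illustration of the point above, here is a minimal Python sketch that
splits an event string of the form NAME:attr=value:... and packs the values into
a single 64-bit config integer. The event code table (EVENT_CODES) and the bit
positions (FIELDS) are assumptions made up for this example; the thread does not
give the real layout, and this is not libpfm4's API.

```python
# Hypothetical tables: real event codes and bit layouts are PMU-specific
# and are not given anywhere in this thread.
EVENT_CODES = {"PM_EX_0x03": 0x03}            # assumed event code, bits 0-7
FIELDS = {"lane": (8, 2), "vlane": (10, 2)}   # assumed (shift, width) pairs


def parse_event(spec):
    """Split 'NAME:attr=val:attr=val' into (name, {attr: int})."""
    name, *parts = spec.split(":")
    attrs = {}
    for part in parts:
        key, sep, value = part.partition("=")
        if not sep or not key:
            raise ValueError(f"malformed attribute: {part!r}")
        attrs[key] = int(value, 0)  # base 0 accepts both 3 and 0x1b2d
    return name, attrs


def encode_config(spec):
    """Pack the event code and attribute values into one integer."""
    name, attrs = parse_event(spec)
    config = EVENT_CODES[name]
    for key, value in attrs.items():
        shift, width = FIELDS[key]
        if value >> width:
            raise ValueError(f"{key}={value:#x} does not fit in {width} bits")
        config |= value << shift
    return config


print(hex(encode_config("PM_EX_0x03:lane=2:vlane=1")))
```

With the assumed layout, the resulting integer is what a tool would place in the
config field of perf_event_attr; a real implementation would take both tables
from the PMU documentation.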

>
> Thanks,
>
> Ingo

Regards,

- Corey

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Corey Ashford on
On 03/11/2010 06:41 PM, Paul Mackerras wrote:
> On Thu, Mar 11, 2010 at 01:46:08PM +0100, Ingo Molnar wrote:
>>
>> * Corey Ashford<cjashfor(a)linux.vnet.ibm.com> wrote:
>>
>>> On 3/3/2010 6:30 PM, Corey Ashford wrote:
>>>> For your review, this patch adds support for arch-dependent symbolic
>>>> event names to the "perf stat" tool, and could be expanded to other
>>>> "perf *" commands fairly easily, I suspect.
>
>> I'm quite much against stop-gap measures like this - they tend to become
>> tomorrow's impossible-to-remove quirk.
>>
>> If you want extensible events you can already do it by providing an ftrace
>> tracepoint event via TRACE_EVENT. They are easy to add and ad-hoc, and are
>> supported throughout by perf.
>
> If I've understood correctly what Corey is doing, I think you're
> missing the point. The idea, I thought, was to provide a way to be
> able to use symbolic names for raw hardware events rather than just
> numbers.

Yes, that's what I meant.

> I don't see how ftrace tracepoint events are relevant to
> that.
>
> Now as to whether an external .so is the best way to provide the
> processor-specific mapping of names to raw events, I'm not sure.
> If the kernel can provide that mapping via procfs, sysfs or eventfs,
> that would be an alternative, but it does mean the kernel has those
> tables in unswappable memory (and potentially the tables for all the
> processors that the kernel supports), which seems unnecessary. Or
> they can just be added to the perf source code.

In addition to the names and attributes, we'd also need text-based
descriptions of the events and attributes.

I'm not opposed to the idea of placing them in sysfs (or other pseudo
fs), but it's also not clear to me how to represent the event data in a
clean, extensible, and space/performance efficient way. That said, I do
like the idea of being able to navigate events by looking through a
directory structure which is possibly organized by the physical topology
of the system and its PMUs.
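
To make the directory idea concrete, here is a small Python sketch of how a tool
might enumerate events from such a tree. The layout (one directory per chip and
PMU, one file per event holding a text description) and all names in it are
hypothetical; the sketch builds a mock tree in a temporary directory rather than
reading any real sysfs path.

```python
import os
import tempfile


def list_events(root):
    """Return {'path/to/event': description} for every file below root."""
    events = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in sorted(filenames):
            path = os.path.join(dirpath, filename)
            with open(path) as f:
                desc = f.read().strip()
            key = os.path.relpath(path, root).replace(os.sep, "/")
            events[key] = desc
    return events


def build_mock_tree(root):
    """Create a stand-in tree; every name and description here is made up."""
    layout = {
        "chip0/accel_a/EVENT_FOO": "hypothetical event on accelerator A",
        "chip0/accel_b/EVENT_BAR": "hypothetical event on accelerator B",
    }
    for rel, desc in layout.items():
        path = os.path.join(root, *rel.split("/"))
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "w") as f:
            f.write(desc + "\n")


with tempfile.TemporaryDirectory() as root:
    build_mock_tree(root)
    for name, desc in sorted(list_events(root).items()):
        print(f"{name}: {desc}")
```

One file per event keeps each description independently readable, at the cost
of many small entries; whether that is space-efficient enough is exactly the
open question raised above.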

- Corey


From: Corey Ashford on
On 3/11/2010 12:46 PM, Corey Ashford wrote:
>
>
> On 3/11/2010 11:14 AM, Ingo Molnar wrote:
>>
>> * Corey Ashford<cjashfor(a)linux.vnet.ibm.com> wrote:
> [snip]
>>> I'm not sure how that would work. The issue I am trying to solve
>>> here is that Power arch chips have a large number of very
>>> hardware-specific events that are not generalizable. Many of these
>>> events not only have names, but other user-configurable bits as well
>>> that select or narrow the scope of which exact events are recorded.
>>> This issue is dealt with nicely in libpfm4, as it has mechanisms for
>>> parsing event names and attributes (aka modifiers or unit masks),
>>> and then produces a usable config field for the perf_event_attr
>>> struct.
>>>
>>> Should I take it from the above that you are completely against the
>>> idea of using an external library for hardware-specific event and
>>> attribute naming?
>>
>> Could you give a few relevant examples of events in question, and the
>> kind of
>> configurability/attributes they have on Power?
>
> Here are a few examples for the Power A2 processor. I've distorted the
> names because PMU architecture isn't publicly released yet.
>
> PM_DE_PMC_9:hrd_mask=0xff:hrd=0x22:pma_mask=0x3fff:pma=0x1b2d:culling_mode=3
>
> PM_EX_0x03:lane=2:vlane=1
> PM_OWE_ENG_MAC_FULL:usu=3

Just a follow-up note to this...

I learned that much of the high-level architecture of the new chip that IBM
is working on has been publicly released recently, so I have "undistorted" the
event names below:

PM_DC_PMC_9:lpid_mask=0xff:lpid=0x22:pid_mask=0x3fff:pid=0x1b2d:marking_mode=3
PM_REGX_0x03:lane=2:vlane=1
PM_XML_ENG_MAC_FULL:sus=3


DC = Decompression/Compression accelerator
PMC_9 = Performance monitoring event 9
REGX = Regular eXpression accelerator
XML = XML parsing accelerator
pid = process id to match
pid_mask = process id match mask
lpid = logical partition id
lpid_mask = logical partition id mask
sus = source unit select
lane, vlane = signal routing fields
marking_mode = used to determine which accelerator work units to mark for
performance monitoring
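
The pid/pid_mask and lpid/lpid_mask pairs read like a conventional value-and-mask
filter. The Python sketch below shows that interpretation only; the comparison
rule is an assumption, since the thread does not spell out how the hardware
applies the mask, and masked_match is a name invented for this example.

```python
def masked_match(value, match, mask):
    """True when value equals match on every bit that is set in mask."""
    return (value & mask) == (match & mask)


# Values taken from the PM_DC_PMC_9 example: pid_mask=0x3fff, pid=0x1b2d.
print(masked_match(0x1b2d, 0x1b2d, 0x3fff))  # identical pid
print(masked_match(0x5b2d, 0x1b2d, 0x3fff))  # differs only above the mask
print(masked_match(0x1b2e, 0x1b2d, 0x3fff))  # differs inside the mask
```

Under this reading, a mask of all ones selects exactly one pid, and a mask of
zero matches every pid.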

- Corey

From: Ingo Molnar on

* Paul Mackerras <paulus(a)samba.org> wrote:

> On Thu, Mar 11, 2010 at 01:46:08PM +0100, Ingo Molnar wrote:
> >
> > * Corey Ashford <cjashfor(a)linux.vnet.ibm.com> wrote:
> >
> > > On 3/3/2010 6:30 PM, Corey Ashford wrote:
> > > >For your review, this patch adds support for arch-dependent symbolic
> > > >event names to the "perf stat" tool, and could be expanded to other
> > > >"perf *" commands fairly easily, I suspect.
>
> > I'm quite much against stop-gap measures like this - they tend to become
> > tomorrow's impossible-to-remove quirk.
> >
> > If you want extensible events you can already do it by providing an ftrace
> > tracepoint event via TRACE_EVENT. They are easy to add and ad-hoc, and are
> > supported throughout by perf.
>
> If I've understood correctly what Corey is doing, I think you're missing the
> point. The idea, I thought, was to provide a way to be able to use symbolic
> names for raw hardware events rather than just numbers. I don't see how
> ftrace tracepoint events are relevant to that.

tracepoints are relevant because they are currently the best way we have of
assigning symbolic names to various kernel-internal events. For ad-hoc use cases
like this:

http://dri.freedesktop.org/wiki/IntelPerformanceTuning

I'd much rather see that facility used (and, to the extent needed, extended)
to provide support for rare arch events that we don't want to enumerate in a
generic way.

Or, if the events are important enough to be hardcoded into the perf ABI
itself, they should be generalized in a meaningful way - even if you don't
expect them to show up on other CPUs.

Ingo
From: Ingo Molnar on

* Corey Ashford <cjashfor(a)linux.vnet.ibm.com> wrote:

> On 3/11/2010 12:46 PM, Corey Ashford wrote:
> >
> >
> >On 3/11/2010 11:14 AM, Ingo Molnar wrote:
> >>
> >>* Corey Ashford<cjashfor(a)linux.vnet.ibm.com> wrote:
> >[snip]
> >>>I'm not sure how that would work. The issue I am trying to solve
> >>>here is that Power arch chips have a large number of very
> >>>hardware-specific events that are not generalizable. Many of these
> >>>events not only have names, but other user-configurable bits as well
> >>>that select or narrow the scope of which exact events are recorded.
> >>>This issue is dealt with nicely in libpfm4, as it has mechanisms for
> >>>parsing event names and attributes (aka modifiers or unit masks),
> >>>and then produces a usable config field for the perf_event_attr
> >>>struct.
> >>>
> >>>Should I take it from the above that you are completely against the
> >>>idea of using an external library for hardware-specific event and
> >>>attribute naming?
> >>
> >>Could you give a few relevant examples of events in question, and the
> >>kind of
> >>configurability/attributes they have on Power?
> >
> >Here are a few examples for the Power A2 processor. I've distorted the
> >names because PMU architecture isn't publicly released yet.
> >
> >PM_DE_PMC_9:hrd_mask=0xff:hrd=0x22:pma_mask=0x3fff:pma=0x1b2d:culling_mode=3
> >
> >PM_EX_0x03:lane=2:vlane=1
> >PM_OWE_ENG_MAC_FULL:usu=3
>
> Just a follow-up note to this...
>
> I learned that much of the high-level architecture of the new
> chip that IBM is working on has been publicly released recently, so
> I have "undistorted" the event names below:
>
> PM_DC_PMC_9:lpid_mask=0xff:lpid=0x22:pid_mask=0x3fff:pid=0x1b2d:marking_mode=3
> PM_REGX_0x03:lane=2:vlane=1
> PM_XML_ENG_MAC_FULL:sus=3
>
>
> DC = Decompression/Compression accelerator
> PMC_9 = Performance monitoring event 9
> REGX = Regular eXpression accelerator
> XML = XML parsing accelerator
> pid = process id to match
> pid_mask = process id match mask
> lpid = logical partition id
> lpid_mask = logical partition id mask
> sus = source unit select
> lane, vlane = signal routing fields
> marking_mode = used to determine which accelerator work units to
> mark for performance monitoring

Are these special-purpose instructions for compression/regex/xml-parsing
speedups?

I think it would be rather useful to merge the hw (and sw) perf events with
the ftrace/tracepoints symbolic events space. That would be a one-stop-shop
for both perf and other tools to figure out the events we offer, their
characteristics, format, relationship to other events, etc.

Ingo