>>>>> I'm not sure how that would work. The issue I am trying to solve
>>>>> here is that Power arch chips have a large number of very
>>>>> hardware-specific events that are not generalizable. Many of these
>>>>> events not only have names, but other user-configurable bits as well
>>>>> that select or narrow the scope of which exact events are recorded.
>>>>> This issue is dealt with nicely in libpfm4, as it has mechanisms for
>>>>> parsing event names and attributes (aka modifiers or unit masks),
>>>>> and then produces a usable config field for the perf_event_attr
>>>>> struct.
>>>>> Should I take it from the above that you are completely against the
>>>>> idea of using an external library for hardware-specific event and
>>>>> attribute naming?
>>>> Could you give a few relevant examples of events in question, and the
>>>> kind of configurability/attributes they have on Power?
>>> Here are a few examples for the Power A2 processor. I've distorted the
>>> names because the PMU architecture isn't publicly released yet.
>>> PM_DE_PMC_9:hrd_mask=0xff:hrd=0x22:pma_mask=0x3fff:pma=0x1b2d:culling_mode=3
>>> PM_EX_0x03:lane=2:vlane=1
>> Just a follow-up note to this...
>> I learned that much of the high-level architecture of the new
>> chip that IBM is working on has been publicly released recently, so
>> I have "undistorted" the event names below:
>> PM_DC_PMC_9:lpid_mask=0xff:lpid=0x22:pid_mask=0x3fff:pid=0x1b2d:marking_mode=3
>> PM_REGX_0x03:lane=2:vlane=1
>> DC = Decompression/Compression accelerator
>> PMC_9 = Performance monitoring event 9
>> REGX = Regular eXpression accelerator
>> XML = XML parsing accelerator
>> pid = process id to match
>> pid_mask = process id match mask
>> lpid = logical partition id
>> lpid_mask = logical partition id mask
>> sus = source unit select
>> lane, vlane = signal routing fields
>> marking_mode = used to determine which accelerator work units to
>> mark for performance monitoring
> Are these special-purpose instructions for compression/regex/xml-parsing
> speedups?

No, these events are for nest (aka uncore) accelerators for
compression/regex/xml-parsing. These accelerators operate independently of the
CPU threads and are given work units via request blocks which are then queued up
by the accelerator.
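
To make the event syntax quoted above concrete, here is a minimal sketch of
splitting such a string into an event name and its numeric attributes. It is
an illustration only and is not libpfm4's API: parse_event_spec is a
hypothetical helper, and it assumes the NAME:attr=value:attr=value form shown
in the examples. Packing the attributes into the config field depends on the
PMU's bit layout, which is not given here, so that step is left out.

```python
def parse_event_spec(spec):
    """Split 'NAME:attr=value:...' into (name, {attr: int}).

    Hypothetical helper for illustration; not part of libpfm4 or perf.
    """
    name, *fields = spec.split(":")
    if not name:
        raise ValueError("missing event name")

    attrs = {}
    for field in fields:
        key, sep, value = field.partition("=")
        if not sep or not key:
            raise ValueError(f"malformed attribute: {field!r}")
        # Base 0 lets int() accept both hex (0x...) and decimal values.
        attrs[key] = int(value, 0)
    return name, attrs


if __name__ == "__main__":
    name, attrs = parse_event_spec("PM_REGX_0x03:lane=2:vlane=1")
    print(name, attrs)
```

A real tool would additionally check each attribute name against the set the
named event accepts before encoding anything.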

> I think it would be rather useful to merge the hw (and sw) perf events with
> the ftrace/tracepoints symbolic events space. That would be a one-stop-shop
> for both perf and other tools to figure out the events we offer, their
> characteristics, format, relationship to other events, etc.
> Ingo

Ok, I will look into this. Thank you for your advice.

- Corey

