From: Corey Ashford on
In the last couple of days, I've run across what appears to be a kernel bug in 2.6.33.3 (haven't tested later kernels yet) having to do with using the PERF_FORMAT_GROUP feature in combination with enable_on_exec and reading counts from a remote task.

What happens is that when we go to read the entire group, only the first counter in the group can be accessed; the read() call returns too few bytes. This problem doesn't occur when measuring from the same task.

I have attached a cut-down test case, though it's not terribly small. It is cut down from the "task" example program in libpfm4/perf_examples. In addition to trimming the program down a lot, I've removed the dependency on libpfm4 and made modifications so that it will compile in the tools/perf subdirectory. If you copy the attachment into your tools/perf subdir, you should be able to compile it with just:

gcc -o show_fg_bug show_fg_bug.c

Then invoke it by passing it an executable that will give it something to chew on a little, e.g.:

./show_fg_bug md5sum `which gdb`

The test case creates two counters, places them in the same group, and sets the PERF_FORMAT_GROUP option on the first counter. It fork/execs the child, and when the child is done executing, it reads back the counter values.
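
For readers without the attachment, the setup described above might be sketched roughly as follows. This is not the attached show_fg_bug.c. The helper names (init_leader_attr, init_member_attr, open_counter) are hypothetical, and the exact flags are assumptions: the two time fields in read_format are inferred from the "(enabled : running)" columns in the output, and setting disabled and enable_on_exec only on the group leader is a guess at how the test is arranged.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/perf_event.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: attr for the group leader (cycles).
 * Assumption: the leader starts disabled and is enabled at exec time. */
void init_leader_attr(struct perf_event_attr *attr)
{
    memset(attr, 0, sizeof(*attr));
    attr->type = PERF_TYPE_HARDWARE;
    attr->size = sizeof(*attr);
    attr->config = PERF_COUNT_HW_CPU_CYCLES;
    /* One read() on the leader should return the whole group. */
    attr->read_format = PERF_FORMAT_GROUP |
                        PERF_FORMAT_TOTAL_TIME_ENABLED |
                        PERF_FORMAT_TOTAL_TIME_RUNNING;
    attr->disabled = 1;
    attr->enable_on_exec = 1;
}

/* Hypothetical helper: attr for the second counter (instructions). */
void init_member_attr(struct perf_event_attr *attr)
{
    memset(attr, 0, sizeof(*attr));
    attr->type = PERF_TYPE_HARDWARE;
    attr->size = sizeof(*attr);
    attr->config = PERF_COUNT_HW_INSTRUCTIONS;
}

/* Thin wrapper around the raw syscall; glibc provides no
 * perf_event_open() function. Returns an fd, or -1 on error. */
int open_counter(struct perf_event_attr *attr, pid_t pid, int group_fd)
{
    assert(attr->size == sizeof(*attr));
    return (int)syscall(__NR_perf_event_open, attr, pid,
                        -1 /* any CPU */, group_fd, 0UL);
}
```

Under these assumptions the parent would call open_counter(&leader, child_pid, -1) first, then pass the returned fd as group_fd when opening the second counter, and finally read() from the leader's fd once the child has exited.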

When I run it, I see this output:

% ./show_fg_bug md5sum `which gdb`
825b15d7279ef21d6c9d018d775758ae /usr/bin/gdb
Error! tried to read 40 bytes, but got 32
58684138 PERF_COUNT_HW_CPU_CYCLES (35469840 : 35469840)
0 PERF_COUNT_HW_INSTRUCTIONS (35469840 : 35469840)

Oddly enough, if you look at the "nr" (number of counters) value that gets read up as part of the group, it is two, but it can only read the first of the two counters. Another data point is that it doesn't matter how many counters you add to the group, only the first can be read up.
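
The 40-versus-32 figures in the error message are consistent with a group read laid out as "nr", time enabled, time running, and then one 64-bit value per counter. The sketch below only illustrates that arithmetic. It assumes read_format is PERF_FORMAT_GROUP plus the two total-time fields and that PERF_FORMAT_ID is not set, which is inferred from the byte counts rather than taken from the attachment. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of u64 header fields before the counter values:
 * nr, time_enabled, time_running (assumed layout, see above). */
#define GROUP_HEADER_FIELDS 3

/* Bytes a complete group read should return for num_counters counters. */
size_t group_read_size(size_t num_counters)
{
    return (GROUP_HEADER_FIELDS + num_counters) * sizeof(uint64_t);
}

/* How many counter values fit in a read() that returned `got` bytes. */
size_t values_in_read(size_t got)
{
    size_t header = GROUP_HEADER_FIELDS * sizeof(uint64_t);

    /* The kernel hands back whole 64-bit fields. */
    assert(got % sizeof(uint64_t) == 0);
    if (got < header)
        return 0;
    return (got - header) / sizeof(uint64_t);
}
```

With two counters, group_read_size(2) is 40, while values_in_read(32) is 1: the short read has room for the header and a single value, matching the observation that "nr" says two but only the first counter comes back.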

Please let me know if you have any questions about this.

Thanks for your consideration,

- Corey

Corey Ashford
Software Engineer
IBM Linux Technology Center, Linux Toolchain
Beaverton, OR
cjashfor(a)us.ibm.com