register_timer_hook use in arch/sh/oprofile [Kernel]

From: Paul Mundt on 16 Dec 2009 00:00

Hi Martin,

On Wed, Jun 24, 2009 at 02:54:06PM +0200, Martin Schwidefsky wrote:
> On Wed, 24 Jun 2009 21:34:16 +0900
> Paul Mundt <lethal(a)linux-sh.org> wrote:
> > On Wed, Jun 24, 2009 at 02:28:28PM +0200, Martin Schwidefsky wrote:
> > > On Wed, 24 Jun 2009 20:29:29 +0900
> > > Paul Mundt <lethal(a)linux-sh.org> wrote:
> > > > No. oprofile_timer_init() is only entered if the performance counters
> > > > fail to register in the SH7750 case, so there is only one timer hook user
> > > > at a time:
> > > >
> > > > static int __init oprofile_init(void)
> > > > {
> > > > 	int err;
> > > >
> > > > 	err = oprofile_arch_init(&oprofile_ops);
> > > >
> > > > 	if (err < 0 || timer) {
> > > > 		printk(KERN_INFO "oprofile: using timer interrupt.\n");
> > > > 		oprofile_timer_init(&oprofile_ops);
> > > > 	}
> > > > 	...
> > >
> > > Oh, I see. That is the reason why the s390 version of
> > > oprofile_arch_init returns -ENODEV. It does so to trigger the fallback
> > > to the timer_hook. That should work for sh as well, no?
> > >
> > It would, yes, but it would also disable access to the SH7750 counters at
> > the same time, so we don't really want to do that. The sh7750 counters
> > are more like timer based profiling with some extra events that can be
> > set and read, so reverting to oprofile_timer_init() would reduce
> > functionality.
> >
> > My current plan is to migrate things over to the perf_counter API and
> > annoy Ingo with my interrupt deprived counters ;-)
> >
> > Given that hrtimers are already generically supported there, it should
> > tie in much cleaner there than in the oprofile case at least.
>
> Ok, that makes sense. So I guess for now I should stop trying to get
> rid of the timer_hook and concentrate on converting the fallback code
> in timer_int.c to hrtimer. Then after sh is fully converted to the
> perf_counter API we can do the cleanup.
>

This is just a follow-up to let you know that the first-step transition
to the perf_counter API is done. There's still more work to do, but we're
already more functional on the perf_counter side than we ever were on
oprofile. As such, I've subsequently killed off the bitrotted oprofile
bits, which includes the timer hook references.

There is now nothing remaining on the SH side blocking the timer hook
removal.
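
For anyone reading this thread without the history: the fallback behaviour discussed in the quoted exchange can be modelled in user space roughly as below. This is only an illustrative sketch, not the kernel code. `struct profiler_ops`, `enum backend`, `arch_init()`, `timer_init()` and `profiler_init()` are hypothetical stand-ins for `struct oprofile_operations`, `oprofile_arch_init()`, `oprofile_timer_init()` and `oprofile_init()`, and the `have_counters` and `force_timer` parameters are assumptions added so the two paths can be exercised.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical: which sampling backend ended up registered. */
enum backend { BACKEND_NONE, BACKEND_HW_COUNTERS, BACKEND_TIMER };

/* Hypothetical stand-in for the kernel's struct oprofile_operations. */
struct profiler_ops {
	enum backend backend;
};

/*
 * Hypothetical arch hook. Returns 0 when hardware counters were
 * registered, or -ENODEV when there are none, which is how the
 * s390 variant mentioned in the thread triggers the fallback.
 */
static int arch_init(struct profiler_ops *ops, int have_counters)
{
	if (!have_counters)
		return -ENODEV;
	ops->backend = BACKEND_HW_COUNTERS;
	return 0;
}

/* Hypothetical timer fallback: replaces whatever backend was set. */
static void timer_init(struct profiler_ops *ops)
{
	ops->backend = BACKEND_TIMER;
}

/*
 * Mirrors the control flow of the quoted oprofile_init(): the timer
 * path is taken only if the arch hook failed or the timer was forced,
 * so at most one backend is active at a time.
 */
static enum backend profiler_init(struct profiler_ops *ops,
				  int have_counters, int force_timer)
{
	int err;

	assert(ops != NULL);
	err = arch_init(ops, have_counters);
	if (err < 0 || force_timer)
		timer_init(ops);
	return ops->backend;
}
```

In this model, returning -ENODEV from the arch hook always lands on the timer backend, which is why doing so on sh would have hidden the SH7750 counters.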
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: Ingo Molnar on 16 Dec 2009 02:30

* Paul Mundt <lethal(a)linux-sh.org> wrote:

> This is just a follow-up to let you know that the first-step transition to
> the perf_counter API is done. There's still more work to do, but we're
> already more functional on the perf_counter side than we ever were on
> oprofile. [...]

Nice to hear!

Please let us know if you have any problems on the tools/perf/ side as well -
in particular everyday usability. Even small details matter.

Also, embedded-system usability would be of interest as well. Most of the perf
tooling gets tested on x86 so it's quite possible that some aspects are not as
slim as they could be.

One key point of performance is the bona fide overhead of a default "perf
record" profile recording session. On x86, it's roughly in the following
range:

aldebaran:~> perf stat --repeat 3 -e instructions ./loop_10b_instructions

Performance counter stats for './loop_10b_instructions' (3 runs):

10009061648 instructions # 0.000 IPC ( +- 0.067% )

2.407052637 seconds time elapsed ( +- 1.259% )

aldebaran:~> perf stat --repeat 3 -e instructions perf record -f ./loop_10b_instructions
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.073 MB perf.data (~3191 samples) ]
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.073 MB perf.data (~3183 samples) ]
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.073 MB perf.data (~3190 samples) ]

Performance counter stats for 'perf record -f ./loop_10b_instructions' (3 runs):

10029183818 instructions # 0.000 IPC ( +- 0.004% )

2.377064570 seconds time elapsed ( +- 0.215% )

I.e. 10029183818/10009061648 ~= 1.002, or about 0.2% direct overhead.

For something very system and task intense (such as hackbench) it can go up to
2.5%.
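
For comparing numbers from another platform, the same arithmetic can be written as a small helper. This is a minimal sketch; `overhead_percent()` is a hypothetical name and not part of the perf tooling. It takes the instruction count of the plain run and of the run wrapped in "perf record" and returns the relative difference in percent.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Relative overhead, in percent, of a profiled run compared with a
 * baseline run of the same workload. The inputs are event counts
 * such as the "instructions" totals printed by perf stat.
 */
static double overhead_percent(uint64_t baseline, uint64_t profiled)
{
	assert(baseline > 0);
	return ((double)profiled - (double)baseline) * 100.0
		/ (double)baseline;
}
```

With the two instruction counts above, overhead_percent(10009061648ULL, 10029183818ULL) comes out at roughly 0.2.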

Thanks,

Ingo
