From: Cyrill Gorcunov on
On Mon, Aug 09, 2010 at 09:48:29PM +0200, Robert Richter wrote:
> On 06.08.10 10:21:31, Don Zickus wrote:
> > On Fri, Aug 06, 2010 at 08:52:03AM +0200, Robert Richter wrote:
>
> > > I was playing around with it yesterday trying to fix this. My idea is
> > > to skip an unkown nmi if the privious nmi was a *handled* perfctr
> >
> > You might want to add a little more logic that says *handled* _and_ had
> > more than one perfctr trigger. Most of the time only one perfctr is
> > probably triggering, so you might be eating unknown_nmi's needlessly.
> >
> > Just a thought.
>
> Yes, that's true. It could be implemented on top of the patch below.
>
> >
> > > nmi. I will probably post an rfc patch early next week.
>
> Here it comes:
>

Thanks Robert! Looks good to me, one nit below.

> From d2739578199d881ae6a9537c1b96a0efd1cdea43 Mon Sep 17 00:00:00 2001
> From: Robert Richter <robert.richter(a)amd.com>
> Date: Thu, 5 Aug 2010 16:19:59 +0200
> Subject: [PATCH] perf, x86: try to handle unknown nmis with running perfctrs
>
....
> diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
> index f2da20f..c3cd159 100644
> --- a/arch/x86/kernel/cpu/perf_event.c
> +++ b/arch/x86/kernel/cpu/perf_event.c
> @@ -1200,12 +1200,16 @@ void perf_events_lapic_init(void)
> apic_write(APIC_LVTPC, APIC_DM_NMI);
> }
>
> +static DEFINE_PER_CPU(unsigned int, perfctr_handled);
> +
> static int __kprobes
> perf_event_nmi_handler(struct notifier_block *self,
> unsigned long cmd, void *__args)
> {
> struct die_args *args = __args;
> struct pt_regs *regs;
> + unsigned int this_nmi;
> + unsigned int prev_nmi;
>
> if (!atomic_read(&active_events))
> return NOTIFY_DONE;
> @@ -1214,7 +1218,26 @@ perf_event_nmi_handler(struct notifier_block *self,
> case DIE_NMI:
> case DIE_NMI_IPI:
> break;
> -
> + case DIE_NMIUNKNOWN:
> + /*
> + * This one could be our NMI, two events could trigger
> + * 'simultaneously' raising two back-to-back NMIs. If
> + * the first NMI handles both, the latter will be
> + * empty and daze the CPU.
> + *
> + * So, we drop this unknown NMI if the previous NMI
> + * was handling a perfctr. Otherwise we pass it and
> + * let the kernel handle the unknown nmi.
> + *
> + * Note: this could be improved if we drop unknown
> + * NMIs only if we handled more than one perfctr in
> + * the previous NMI.
> + */
> + this_nmi = percpu_read(irq_stat.__nmi_count);
> + prev_nmi = __get_cpu_var(perfctr_handled);
> + if (this_nmi == prev_nmi + 1)
> + return NOTIFY_STOP;
> + return NOTIFY_DONE;
> default:
> return NOTIFY_DONE;
> }
> @@ -1222,14 +1245,12 @@ perf_event_nmi_handler(struct notifier_block *self,
> regs = args->regs;
>
> apic_write(APIC_LVTPC, APIC_DM_NMI);

If only I'm not missing something this apic_write should go up to
"case DIE_NMIUNKNOWN" site, no?

-- Cyrill
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Cyrill Gorcunov on
On Tue, Aug 10, 2010 at 09:42:00AM +0200, Robert Richter wrote:
> On 09.08.10 16:02:45, Cyrill Gorcunov wrote:
>
> > > @@ -1222,14 +1245,12 @@ perf_event_nmi_handler(struct notifier_block *self,
> > > regs = args->regs;
> > >
> > > apic_write(APIC_LVTPC, APIC_DM_NMI);
> >
> > If only I'm not missing something this apic_write should go up to
> > "case DIE_NMIUNKNOWN" site, no?
>
> This seems to be code from the non-nmi implementation and can be
> removed at all, which should be a separate patch. The vector is
> already set up.
>
> -Robert
>

No, this is just a short way to unmask LVTPC (which is required for
cpus). Actually lookig into this snippet I found that in p4 pmu
I've made one redundant unmaksing operation. will update as only
this area settle down.

-- Cyrill
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Cyrill Gorcunov on
On Tue, Aug 10, 2010 at 06:41:24PM +0200, Robert Richter wrote:
> On 10.08.10 12:16:27, Cyrill Gorcunov wrote:
> > On Tue, Aug 10, 2010 at 09:42:00AM +0200, Robert Richter wrote:
> > > On 09.08.10 16:02:45, Cyrill Gorcunov wrote:
> > >
> > > > > @@ -1222,14 +1245,12 @@ perf_event_nmi_handler(struct notifier_block *self,
> > > > > regs = args->regs;
> > > > >
> > > > > apic_write(APIC_LVTPC, APIC_DM_NMI);
> > > >
> > > > If only I'm not missing something this apic_write should go up to
> > > > "case DIE_NMIUNKNOWN" site, no?
> > >
> > > This seems to be code from the non-nmi implementation and can be
> > > removed at all, which should be a separate patch. The vector is
> > > already set up.
> > >
> > > -Robert
> > >
> >
> > No, this is just a short way to unmask LVTPC (which is required for
> > cpus). Actually lookig into this snippet I found that in p4 pmu
> > I've made one redundant unmaksing operation. will update as only
> > this area settle down.
>
> The vector is setup in hw_perf_enable() and then never masked. The
> perfctrs nmi is alwayes enabled since then. I still see no reason for
> unmasking it again with every nmi. Once you handle the nmi it is also
> enabled.
>

It gets masked on NMI arrival, at least for some models (Core Duo, P4,
P6 M and I suspect more theh that, that was the reason oprofile has
it, also there is a note in SDM V3a page 643).

> -Robert
>
> --
> Advanced Micro Devices, Inc.
> Operating System Research Center
>
-- Cyrill
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Cyrill Gorcunov on
On Tue, Aug 10, 2010 at 09:05:41PM +0200, Robert Richter wrote:
> On 10.08.10 13:24:51, Cyrill Gorcunov wrote:
>
> > It gets masked on NMI arrival, at least for some models (Core Duo, P4,
> > P6 M and I suspect more theh that, that was the reason oprofile has
> > it, also there is a note in SDM V3a page 643).
>
> Yes, that's right, I never noticed that. Maybe it is better to
> implement the apic write it in model specific code then.
>

Perhaps we can make it simplier I think, ie like it was before -- we just
move it under your new DIE_NMIUNKNOWN, in separate patch of course. Though
I'm fine with either way.

(actually it's interesting to know wouldn't we leave lvt masked when
we hit 'second delayed nmi has arrived' situation, I guess we didn't
hit it before in real yet :-)

> -Robert
>
> --
> Advanced Micro Devices, Inc.
> Operating System Research Center
>
-- Cyrill
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Frederic Weisbecker on
On Tue, Aug 10, 2010 at 04:48:56PM -0400, Don Zickus wrote:
> On Mon, Aug 09, 2010 at 09:48:29PM +0200, Robert Richter wrote:
> > On 06.08.10 10:21:31, Don Zickus wrote:
> > > On Fri, Aug 06, 2010 at 08:52:03AM +0200, Robert Richter wrote:
> >
> > > > I was playing around with it yesterday trying to fix this. My idea is
> > > > to skip an unkown nmi if the privious nmi was a *handled* perfctr
> > >
> > > You might want to add a little more logic that says *handled* _and_ had
> > > more than one perfctr trigger. Most of the time only one perfctr is
> > > probably triggering, so you might be eating unknown_nmi's needlessly.
> > >
> > > Just a thought.
> >
> > Yes, that's true. It could be implemented on top of the patch below.
>
> I did, but the changes basically revert the bulk of your patch.
>
> >
> > >
> > > > nmi. I will probably post an rfc patch early next week.
> >
> > Here it comes:
> >
> > From d2739578199d881ae6a9537c1b96a0efd1cdea43 Mon Sep 17 00:00:00 2001
> > From: Robert Richter <robert.richter(a)amd.com>
> > Date: Thu, 5 Aug 2010 16:19:59 +0200
> > Subject: [PATCH] perf, x86: try to handle unknown nmis with running perfctrs
>
> On top of Robert's patch:
> (compiled tested only because I don't have a fancy button to trigger
> unknown nmis)
>
> From 548cf5148f47618854a0eff22b1d55db71b6f8fc Mon Sep 17 00:00:00 2001
> From: Don Zickus <dzickus(a)redhat.com>
> Date: Tue, 10 Aug 2010 16:40:03 -0400
> Subject: [PATCH] perf, x86: only skip NMIs when multiple perfctrs trigger
>
> A small optimization on top of Robert's patch that limits the
> skipping of NMI's to cases where we detect multiple perfctr events
> have happened.



Yeah, I think that's more reasonable. This lowers even more the chances of
losing important hardware errors.

One comment though:


>
> Signed-off-by: Don Zickus <dzickus(a)redhat.com>
>
> ---
> arch/x86/kernel/cpu/perf_event.c | 34 ++++++++++++++++++++--------------
> 1 files changed, 20 insertions(+), 14 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
> index c3cd159..066046d 100644
> --- a/arch/x86/kernel/cpu/perf_event.c
> +++ b/arch/x86/kernel/cpu/perf_event.c
> @@ -1154,7 +1154,7 @@ static int x86_pmu_handle_irq(struct pt_regs *regs)
> /*
> * event overflow
> */
> - handled = 1;
> + handled += 1;
> data.period = event->hw.last_period;
>
> if (!x86_perf_event_set_period(event))
> @@ -1200,7 +1200,7 @@ void perf_events_lapic_init(void)
> apic_write(APIC_LVTPC, APIC_DM_NMI);
> }
>
> -static DEFINE_PER_CPU(unsigned int, perfctr_handled);
> +static DEFINE_PER_CPU(unsigned int, perfctr_skip);
>
> static int __kprobes
> perf_event_nmi_handler(struct notifier_block *self,
> @@ -1208,8 +1208,7 @@ perf_event_nmi_handler(struct notifier_block *self,
> {
> struct die_args *args = __args;
> struct pt_regs *regs;
> - unsigned int this_nmi;
> - unsigned int prev_nmi;
> + int handled = 0;
>
> if (!atomic_read(&active_events))
> return NOTIFY_DONE;
> @@ -1229,14 +1228,11 @@ perf_event_nmi_handler(struct notifier_block *self,
> * was handling a perfctr. Otherwise we pass it and
> * let the kernel handle the unknown nmi.
> *
> - * Note: this could be improved if we drop unknown
> - * NMIs only if we handled more than one perfctr in
> - * the previous NMI.
> */
> - this_nmi = percpu_read(irq_stat.__nmi_count);
> - prev_nmi = __get_cpu_var(perfctr_handled);
> - if (this_nmi == prev_nmi + 1)
> + if (__get_cpu_var(perfctr_skip)){
> + __get_cpu_var(perfctr_skip) -=1;
> return NOTIFY_STOP;
> + }
> return NOTIFY_DONE;
> default:
> return NOTIFY_DONE;
> @@ -1246,11 +1242,21 @@ perf_event_nmi_handler(struct notifier_block *self,
>
> apic_write(APIC_LVTPC, APIC_DM_NMI);
>
> - if (!x86_pmu.handle_irq(regs))
> + handled = x86_pmu.handle_irq(regs);
> + if (!handled)
> + /* not our NMI */
> return NOTIFY_DONE;
> -
> - /* handled */
> - __get_cpu_var(perfctr_handled) = percpu_read(irq_stat.__nmi_count);
> + else if (handled > 1)
> + /*
> + * More than one perfctr triggered. This could have
> + * caused a second NMI that we must now skip because
> + * we have already handled it. Remember it.
> + *
> + * NOTE: We have no way of knowing if a second NMI was
> + * actually triggered, so we may accidentally skip a valid
> + * unknown nmi later.
> + */
> + __get_cpu_var(perfctr_skip) +=1;



May be make it just a pending bit. I mean not something that can
go further 1, because you can't have more than 1 pending anyway. I don't
know how that could happen you get accidental perctr_skip > 1, may be
expected pending NMIs that don't happen somehow, but better be paranoid with
that, as it's about trying not to miss hardware errors.

Thanks.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/