From: Don Zickus on
On Fri, Aug 06, 2010 at 08:52:03AM +0200, Robert Richter wrote:
> On 04.08.10 15:26:34, Cyrill Gorcunov wrote:
>
> > yes, that is what I meant by the nmi_sc register. I think we need to restructure
> > the current default_do_nmi handler, but I don't know at the moment how to deal
> > with perf: if a perf register has overflowed (i.e. already has a pending nmi) but
> > we handle it in an earlier nmi cycle, this would lead to strange results. Need to think.
> >
> > >
> > > So you can decide to either get an unrecovered nmi panic triggered by
> > > a perfctr or lose unknown nmis from other sources. Maybe this can be
> > > fixed by implementing handlers for those sources.
>
> I was playing around with it yesterday trying to fix this. My idea is
> to skip an unknown nmi if the previous nmi was a *handled* perfctr

You might want to add a little more logic that says *handled* _and_ had
more than one perfctr trigger. Most of the time only one perfctr is
probably triggering, so you might be eating unknown_nmi's needlessly.

Just a thought.
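
A minimal user-space sketch of the heuristic suggested above, assuming the perf handler can report how many counters it found overflowed. The struct and function names (nmi_state, note_perfctr_nmi, should_skip_unknown_nmi) are hypothetical and not taken from any posted patch; the point is only that a following unknown nmi is swallowed when the previous nmi was handled and involved more than one counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-CPU bookkeeping; not the kernel's actual data structure. */
struct nmi_state {
    bool skip_next_unknown;   /* set when a back-to-back PMI may be pending */
};

/*
 * Record the outcome of a handled perfctr nmi.
 * 'overflowed' is the number of counters the handler found overflowed.
 * Only when more than one counter fired can a second PMI be latched.
 */
void note_perfctr_nmi(struct nmi_state *st, int overflowed)
{
    assert(overflowed >= 0);
    st->skip_next_unknown = overflowed > 1;
}

/*
 * Called when no handler claimed an nmi.
 * Returns true if the nmi should be silently dropped; the flag is
 * consumed so at most one unknown nmi is skipped per perfctr nmi.
 */
bool should_skip_unknown_nmi(struct nmi_state *st)
{
    bool skip = st->skip_next_unknown;

    st->skip_next_unknown = false;
    return skip;
}
```

With a single overflowing counter the flag is never set, so an unknown nmi from the chipset or an nmi button would still be reported.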

> nmi. I will probably post an rfc patch early next week.
>
> Another problem I encountered is that unknown nmis from the chipset
> are not reenabled, thus when hitting the nmi button I only see one
> unknown nmi message per boot, if I reenable it, I get an nmi
> storm firing nmi_watchdog. Uhh....

Interesting.

Cheers,
Don
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Andi Kleen on
Peter Zijlstra <peterz(a)infradead.org> writes:
>
> Suppose you have 4 counters (AMD, intel-nhm+), when more than 2 overflow
> the first will raise the PMI, if the other 2+ overflow before we disable
> the PMU it will try to raise 2+ more PMIs, but because hardware only has
> a single interrupt pending bit it will at most cause a single extra
> interrupt after we finish servicing the first one.
>
> So then the first interrupt will see 3+ overflows, return 3+, and will
> thus eat 2+ NMIs, only one of which will be the pending interrupt,
> leaving 1+ NMIs from other sources to consume unhandled.
>
> In which case Yinghai will have to press his NMI button 2+ times before
> it registers.
>
> That said, that might be a better situation than always consuming
> unknown NMIs..
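
The arithmetic in the scenario quoted above can be sketched as a small model. It assumes a scheme in which the handler returns the number of overflowed counters and the nmi code then eats that many minus one following unknown nmis, while the hardware latches at most one extra PMI. The function name lost_external_nmis is made up for illustration, and the model further assumes the latched PMI is consumed before any external nmi.

```c
#include <assert.h>

/*
 * Model of the "eat N-1 unknown nmis" scheme.
 *   overflows     - counters seen overflowed by the first PMI handler
 *   external_nmis - unknown nmis arriving afterwards from other sources
 * Returns how many of those external nmis are swallowed by mistake.
 */
int lost_external_nmis(int overflows, int external_nmis)
{
    assert(overflows >= 0 && external_nmis >= 0);

    /* nmis the code decides to eat after the first interrupt */
    int budget = overflows > 1 ? overflows - 1 : 0;
    /* only one interrupt pending bit, so at most one extra PMI exists */
    int pending_pmi = overflows > 1 ? 1 : 0;
    /* whatever budget is left over is spent on unrelated nmis */
    int spare = budget - pending_pmi;

    return spare < external_nmis ? spare : external_nmis;
}
```

For three overflows the budget is two, one of which matches the latched PMI, so one external nmi (for example a press of the nmi button) is lost; for one or two overflows nothing is lost in this model.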

One alternative would be to stop using NMIs for perf counters in common cases.

If you have PEBS and your events support PEBS then PEBS can give you a
lot of information inside the irq off region. That works for common
events at least.

Also, interrupt-off regions have traditionally been shrinking in Linux,
so is it really still worth all the trouble just to profile inside them?

e.g. one could make nmi profiling an option with default off.

-Andi

--
ak(a)linux.intel.com -- Speaking for myself only.
From: Andi Kleen on
Don Zickus <dzickus(a)redhat.com> writes:

> On Wed, Aug 04, 2010 at 08:10:46PM +0400, Cyrill Gorcunov wrote:
>> On Wed, Aug 04, 2010 at 11:50:02AM -0400, Don Zickus wrote:
>> ...
>> > >
>> > > Well, first I guess having Yinghai CC'ed is a bonus ;)
>> > > The second thing is that I don't get why the perf handler can't be the _last_
>> > > call in default_do_nmi. If there were any nmi with a reason (serr or parity),
>> > > I think those should be handled first, which of course doesn't eliminate
>> > > the former issue but somewhat weakens it.
>> >
>> > Because the reason registers are never set. If they were, then the code
>> > wouldn't have to walk the notify_chain. :-)
>> >
>>
>> maybe we're talking about different things. I meant that if there is an nmi
>> with a reason (from 0x61), the handling of such an nmi should be done before
>> notify_die, I think (unless I'm missing something).
>
> No we are talking about the same thing. :-) And that code is already
> there. The problem is the bits in register 0x61 are not always set
> correctly in the case of SERRs (well at least in all the cases I have
> dealt with). So you can easily get a flood of unknown nmis from an SERR
> and register 0x61 would have the PERR/SERR bits set to 0. Fun, huh?
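
As a rough illustration of the failure mode described above, the sketch below classifies the reason byte read from port 0x61. It assumes the two error bits are bit 7 (0x80) and bit 6 (0x40), matching the 0xc0 mask checked by the x86 nmi code. The macro, enum, and function names are illustrative only, and the byte is passed in as a parameter rather than read from hardware so the example runs in user space.

```c
#include <assert.h>

/* Error bits in the port 0x61 reason byte; names are illustrative. */
#define REASON_BIT_SERR   0x80u  /* system/parity error reported */
#define REASON_BIT_IOCHK  0x40u  /* I/O channel check reported */
#define REASON_MASK       (REASON_BIT_SERR | REASON_BIT_IOCHK)

enum nmi_source {
    NMI_SRC_SERR,
    NMI_SRC_IOCHK,
    NMI_SRC_UNKNOWN   /* no reason bit set: falls through to the notify chain */
};

/* Classify an nmi from the reason byte alone. */
enum nmi_source classify_nmi_reason(unsigned char reason)
{
    if (reason & REASON_BIT_SERR)
        return NMI_SRC_SERR;
    if (reason & REASON_BIT_IOCHK)
        return NMI_SRC_IOCHK;
    return NMI_SRC_UNKNOWN;
}
```

If the platform raises an SERR nmi but leaves both bits clear, this classification reports NMI_SRC_UNKNOWN, which is why such nmis show up as a flood of unknown nmis.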

Some of this can be handled by APEI on newer systems (if the platform
supports that).

But not all of it, unfortunately, if you consider older systems.

-Andi
--
ak(a)linux.intel.com -- Speaking for myself only.