From: Yong Zhang on
On Tue, Mar 09, 2010 at 08:58:11AM +0100, Thomas Gleixner wrote:
> On Tue, 9 Mar 2010, Lars-Peter Clausen wrote:
>
> > If the kernel has been compiled with preemtion support and handle_level_irq is
> > called from process context for a oneshot irq there is a race between
> > irq_finalize_oneshot and handle_level_irq which results in the irq not being
> > unmasked after its handlers have been run.
> >
> > irq_finalize_oneshot is expected to unmask the irq after the threaded irq
> > handler has been run. It only does so if IRQ_MASKED is set for the irqs status.
> > IRQ_MASKED gets set in the lower part of handle_level_irq after handle_IRQ_event
> > has been called.
> > handle_IRQ_event will wakeup the oneshot irqs threaded handler and if the
> > kernel has been build with preemption there is a chance that the threaded irq
> > handler will finish before execution is returned to handle_level_irq.
> > As a result irq_finalize_oneshot will not unmask the irq and handle_level_irq
> > will set the IRQ_MASKED flag. Thus the irq will stay masked and stalls.
> >
> > In case of an race the call-graph would look like this:
> > handle_level_irq
> > |- mask_ack_irq
> > |- handle_IRQ_event
> > |- wake_up_process
> > |- irq_thread
> > |- action->thread_fn
> > |- irq_finalize_oneshot # Does not unmask the irq
> > |- # Set IRQ_MASKED status flag
>
> Errm, a thread _CANNOT_ preempt a hard interrupt handler.

But on SMP, an irq thread could run on another cpu, right?

Thanks,
Yong

>
> Thanks,
>
> tglx
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo(a)vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Yong Zhang on
On Wed, Mar 10, 2010 at 08:56:22AM +0100, Thomas Gleixner wrote:
> > How about the following patch(maybe a little ugly). I think it will
> > resolve your concerns.
>
> No it does not, but you are right that it's ugly. And it is patently
> wrong as well.
>
> > diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
> > index d70394f..23b79c6 100644
> > --- a/kernel/irq/chip.c
> > +++ b/kernel/irq/chip.c
> > @@ -461,9 +461,24 @@ handle_level_irq(unsigned int irq, struct irq_desc *desc)
> > raw_spin_lock(&desc->lock);
> > mask_ack_irq(desc, irq);

I must say this patch isn't based on your previous one in which
mask_ack_irq() is modified to flag IRQ_MASKED.
I let IRQ_MASKED serialise interrupt-handler and irq-thread in oneshot
mode.

> >
> > - if (unlikely(desc->status & IRQ_INPROGRESS))
> > - goto out_unlock;
> > + /*
> > + * if we are in oneshot mode and the irq thread is running on
> > + * another cpu, just return because the irq thread will unmask
> > + * the irq
> > + */
> > + if (unlikely(desc->status & IRQ_ONESHOT)) {
> > + if (unlikely(desc->status & (IRQ_INPROGRESS | IRQ_MASKED)
> > + == IRQ_INPROGRESS | IRQ_MASKED))
> > + goto out_unlock;
> > + }
> > + else {
> > + if (unlikely(desc->status & IRQ_INPROGRESS))
> > + goto out_unlock;
> > + }
>
> In case of IRQ_SHOT and IRQ_INPROGRESS and the other CPU having
> unmasked the interrupt already you are reentering the handler which
> is a nono.

I have thought of that kind of reentering you point above,
if IRQ_MASKED is cleared by irq_finalize_oneshot(), the thread_fn()
is done as well as the hardware opration. If another irq comes in,
the reentering does happen, but I think it's harmless because at
this time we just set IRQTF_RUNTHREAD and IRQ_MASKED and wake
irq thread up and then the irq thread loops for another time.
So irq will not return unmask in case of oneshot.

If I miss something, correct me please.

Thanks,
Yong


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Yong Zhang on
On Thu, Mar 11, 2010 at 09:41:45AM +0100, Thomas Gleixner wrote:
> No, it is _NOT_ harmless. interrupt handlers are guaranteed _NOT_ to
> be reentered. That's why we check the INPROGRESS flag.
>
> And you totally miss the case which caused the disussion in the first
> place: When the thread finishes _before_ the hard irq handler on the
> other CPU has set IRQ_MASKED. So this patch solves nothing at all, it
> just makes stuff worse than it was.
>
> Here is the solution which solves the inconsistent lock state _AND_
> the reentrancy race.
>
> http://git.kernel.org/?p=linux/kernel/git/tip/linux-2.6-tip.git;a=commit;h=0b1adaa031a55e44f5dd942f234bf09d28e8a0d6
>
> The change log has a full explanation of the scenarios.

Yeah, I see. Thanks for your point.
It will smooth the concerns.

Thanks,
Yong
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/