From: Brian King on
Is IRQF_NODELAY something specific to the RT kernel? I don't see it in mainline...

-Brian


On 05/18/2010 04:33 PM, dvhltc(a)linux.vnet.ibm.com wrote:
>>From ad81664794e33d785f533c5edee37aaba20dd92d Mon Sep 17 00:00:00 2001
> From: Darren Hart <dvhltc(a)us.ibm.com>
> Date: Tue, 18 May 2010 11:07:13 -0700
> Subject: [PATCH RT] ehea: make receive irq handler non-threaded (IRQF_NODELAY)
>
> The underlying hardware is edge triggered but presented by XICS as level
> triggered. The edge triggered interrupts are not reissued after masking. This
> is not a problem in mainline which does not mask the interrupt (relying on the
> EOI mechanism instead). The threaded interrupts in PREEMPT_RT do mask the
> interrupt, and can lose interrupts that occurred while masked, resulting in a
> hung ethernet interface.
>
> The receive handler simply calls napi_schedule(), as such, there is no
> significant additional overhead in making this non-threaded, since we either
> wakeup the threaded irq handler to call napi_schedule(), or just call
> napi_schedule() directly to wakeup the softirqs. As the receive handler is
> lockless, there is no need to convert any of the ehea spinlock_t's to
> atomic_spinlock_t's.
>
> Without this patch, a simple scp file copy loop would fail quickly (usually
> seconds). We have over two hours of sustained scp activity with the patch
> applied.
>
> Credit goes to Will Schmidt for lots of instrumentation and tracing which
> clarified the scenario and to Thomas Gleixner for the incredibly simple
> solution.
>
> Signed-off-by: Darren Hart <dvhltc(a)us.ibm.com>
> Acked-by: Will Schmidt <will_schmidt(a)vnet.ibm.com>
> Cc: Thomas Gleixner <tglx(a)linuxtronix.de>
> Cc: Jan-Bernd Themann <themann(a)de.ibm.com>
> Cc: Nivedita Singhvi <niv(a)us.ibm.com>
> Cc: Brian King <bjking1(a)us.ibm.com>
> Cc: Michael Ellerman <ellerman(a)au1.ibm.com>
> Cc: Doug Maxey <doug.maxey(a)us.ibm.com>
> ---
> drivers/net/ehea/ehea_main.c | 2 +-
> 1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/drivers/net/ehea/ehea_main.c b/drivers/net/ehea/ehea_main.c
> index 977c3d3..2c53df2 100644
> --- a/drivers/net/ehea/ehea_main.c
> +++ b/drivers/net/ehea/ehea_main.c
> @@ -1263,7 +1263,7 @@ static int ehea_reg_interrupts(struct net_device *dev)
> "%s-queue%d", dev->name, i);
> ret = ibmebus_request_irq(pr->eq->attr.ist1,
> ehea_recv_irq_handler,
> - IRQF_DISABLED, pr->int_send_name,
> + IRQF_DISABLED | IRQF_NODELAY, pr->int_send_name,
> pr);
> if (ret) {
> ehea_error("failed registering irq for ehea_queue "


--
Brian King
Linux on Power Virtualization
IBM Linux Technology Center
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Nivedita Singhvi on
Brian King wrote:
> Is IRQF_NODELAY something specific to the RT kernel? I don't see it in mainline...

Yep, this is RT only.

thanks,
Nivedita
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Darren Hart on
On 05/18/2010 02:52 PM, Brian King wrote:
> Is IRQF_NODELAY something specific to the RT kernel? I don't see it in mainline...

Yes, it basically says "don't make this handler threaded".

--
Darren

>
> -Brian
>
>
> On 05/18/2010 04:33 PM, dvhltc(a)linux.vnet.ibm.com wrote:
>> > From ad81664794e33d785f533c5edee37aaba20dd92d Mon Sep 17 00:00:00 2001
>> From: Darren Hart<dvhltc(a)us.ibm.com>
>> Date: Tue, 18 May 2010 11:07:13 -0700
>> Subject: [PATCH RT] ehea: make receive irq handler non-threaded (IRQF_NODELAY)
>>
>> The underlying hardware is edge triggered but presented by XICS as level
>> triggered. The edge triggered interrupts are not reissued after masking. This
>> is not a problem in mainline which does not mask the interrupt (relying on the
>> EOI mechanism instead). The threaded interrupts in PREEMPT_RT do mask the
>> interrupt, and can lose interrupts that occurred while masked, resulting in a
>> hung ethernet interface.
>>
>> The receive handler simply calls napi_schedule(), as such, there is no
>> significant additional overhead in making this non-threaded, since we either
>> wakeup the threaded irq handler to call napi_schedule(), or just call
>> napi_schedule() directly to wakeup the softirqs. As the receive handler is
>> lockless, there is no need to convert any of the ehea spinlock_t's to
>> atomic_spinlock_t's.
>>
>> Without this patch, a simple scp file copy loop would fail quickly (usually
>> seconds). We have over two hours of sustained scp activity with the patch
>> applied.
>>
>> Credit goes to Will Schmidt for lots of instrumentation and tracing which
>> clarified the scenario and to Thomas Gleixner for the incredibly simple
>> solution.
>>
>> Signed-off-by: Darren Hart<dvhltc(a)us.ibm.com>
>> Acked-by: Will Schmidt<will_schmidt(a)vnet.ibm.com>
>> Cc: Thomas Gleixner<tglx(a)linuxtronix.de>
>> Cc: Jan-Bernd Themann<themann(a)de.ibm.com>
>> Cc: Nivedita Singhvi<niv(a)us.ibm.com>
>> Cc: Brian King<bjking1(a)us.ibm.com>
>> Cc: Michael Ellerman<ellerman(a)au1.ibm.com>
>> Cc: Doug Maxey<doug.maxey(a)us.ibm.com>
>> ---
>> drivers/net/ehea/ehea_main.c | 2 +-
>> 1 files changed, 1 insertions(+), 1 deletions(-)
>>
>> diff --git a/drivers/net/ehea/ehea_main.c b/drivers/net/ehea/ehea_main.c
>> index 977c3d3..2c53df2 100644
>> --- a/drivers/net/ehea/ehea_main.c
>> +++ b/drivers/net/ehea/ehea_main.c
>> @@ -1263,7 +1263,7 @@ static int ehea_reg_interrupts(struct net_device *dev)
>> "%s-queue%d", dev->name, i);
>> ret = ibmebus_request_irq(pr->eq->attr.ist1,
>> ehea_recv_irq_handler,
>> - IRQF_DISABLED, pr->int_send_name,
>> + IRQF_DISABLED | IRQF_NODELAY, pr->int_send_name,
>> pr);
>> if (ret) {
>> ehea_error("failed registering irq for ehea_queue "
>
>


--
Darren Hart
IBM Linux Technology Center
Real-Time Linux Team
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Darren Hart on
On 05/18/2010 06:25 PM, Michael Ellerman wrote:
> On Tue, 2010-05-18 at 15:22 -0700, Darren Hart wrote:
>> On 05/18/2010 02:52 PM, Brian King wrote:
>>> Is IRQF_NODELAY something specific to the RT kernel? I don't see it in mainline...
>>
>> Yes, it basically says "don't make this handler threaded".
>
> That is a good fix for EHEA, but the threaded handling is still broken
> for anything else that is edge triggered isn't it?

No, I don't believe so. Edge triggered interrupts that are reported as
edge triggered interrupts will use the edge handler (which was the
approach Sebastien took to make this work back in 2008). Since XICS
presents all interrupts as Level Triggered, they use the fasteoi path.

>
> The result of the discussion about two years ago on this was that we
> needed a custom flow handler for XICS on RT.

I'm still not clear on why the ultimate solution wasn't to have XICS
report edge triggered as edge triggered. Probably some complexity of the
entire power stack that I am ignorant of.

> Apart from the issue of loosing interrupts there is also the fact that
> masking on the XICS requires an RTAS call which takes a global lock.

Right, one of may reasons why we felt this was the right fix. The other
is that there is no real additional overhead in running this as
non-threaded since the receive handler is so short (just napi_schedule()).

--
Darren Hart
IBM Linux Technology Center
Real-Time Linux Team
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Jan-Bernd Themann on
Hi,


Michael Ellerman <michael(a)ellerman.id.au> wrote on 20.05.2010 03:34:08:


> Subject:
>
> Re: [PATCH RT] ehea: make receive irq handler non-threaded (IRQF_NODELAY)
>
> On Wed, 2010-05-19 at 23:08 +0200, Thomas Gleixner wrote:
> > On Wed, 19 May 2010, Thomas Gleixner wrote:
> > > > I'm still not clear on why the ultimate solution wasn't to
> have XICS report
> > > > edge triggered as edge triggered. Probably some complexity of
> the entire power
> > > > stack that I am ignorant of.
> > > >
> > > > > Apart from the issue of loosing interrupts there is also thefact
that
> > > > > masking on the XICS requires an RTAS call which takes a global
lock.
> > >
> > > Right, I'd love to avoid that but with real level interrupts we'd run
> > > into an interrupt storm. Though another solution would be to issue
the
> > > EOI after the threaded handler finished, that'd work as well, but
> > > needs testing.
> >
> > Thought more about that. The case at hand (ehea) is nasty:
> >
> > The driver does _NOT_ disable the rx interrupt in the card in the rx
> > interrupt handler - for whatever reason.
>
> Yeah I saw that, but I don't know why it's written that way. Perhaps
> Jan-Bernd or Doug will chime in and enlighten us? :)

From our perspective there is no need to disable interrupts for the RX side
as
the chip does not fire further interrupts until we tell the chip to do so
for a
particular queue. We have multiple receive queues with an own interrupt
each
so that the interrupts can arrive on multiple CPUs in parallel.
Interrupts are enabled again when we leave the NAPI Poll function for the
corresponding
receive queue.

Michael, does this answer your question?

Regards,
Jan-Bernd

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/