From: Américo Wang on
2010/3/15 Américo Wang <xiyou.wangcong(a)gmail.com>:
> 2010/3/15 Américo Wang <xiyou.wangcong(a)gmail.com>:
>> On Sat, Mar 13, 2010 at 01:58:38PM -0800, Paul E. McKenney wrote:
>>>On Sat, Mar 13, 2010 at 01:33:56PM +0800, Américo Wang wrote:
>>>> On Fri, Mar 12, 2010 at 02:37:38PM +0100, Eric Dumazet wrote:
>>>> >Le vendredi 12 mars 2010 à 21:11 +0800, Américo Wang a écrit :
>>>> >
>>>> >> Oh, but lockdep complains about rcu_read_lock(), it said
>>>> >> rcu_read_lock() can't be used in softirq context.
>>>> >>
>>>> >> Am I missing something?
>>>> >
>>>> >Well, lockdep might be dumb, I dont know...
>>>> >
>>>> >I suggest you read rcu_read_lock_bh kernel doc :
>>>> >
>>>> >/**
>>>> > * rcu_read_lock_bh - mark the beginning of a softirq-only RCU critical
>>>> >section
>>>> > *
>>>> > * This is equivalent of rcu_read_lock(), but to be used when updates
>>>> > * are being done using call_rcu_bh(). Since call_rcu_bh() callbacks
>>>> > * consider completion of a softirq handler to be a quiescent state,
>>>> > * a process in RCU read-side critical section must be protected by
>>>> > * disabling softirqs. Read-side critical sections in interrupt context
>>>> > * can use just rcu_read_lock().
>>>> > *
>>>> > */
>>>> >
>>>> >
>>>> >Last sentence being perfect :
>>>> >
>>>> >Read-side critical sections in interrupt context
>>>> >can use just rcu_read_lock().
>>>> >
>>>>
>>>> Yeah, right, then it is more likely to be a bug of rcu lockdep.
>>>> Paul is looking at it.
>>>
>>>Except that it seems to be working correctly for me...
>>>
>>
>> Hmm, then I am confused. The only possibility here is that this is
>> a lockdep bug...
>>
>
> I believe so...
>
> Peter, this looks odd:
>
>  kernel:  (usbfs_mutex){+.?...}, at: [<ffffffff8146419f>]
> netif_receive_skb+0xe7/0x819
>
> netif_receive_skb() never has a chance to take usbfs_mutex. How can this
> comes out?
>

Ok, I think I found what lockdep really complains about, it is that we took
spin_lock in netpoll_poll_lock() which is in hardirq-enabled environment,
later, we took another spin_lock with spin_lock_irqsave() in netpoll_rx(),
so lockdep thought we broke the locking rule.

I don't know why netpoll_rx() needs irq disabled, it looks like that no one
takes rx_lock in hardirq context. So can we use spin_lock(&rx_lock)
instead? Or am I missing something here? Eric? David?

Thanks!
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Eric Dumazet on
Le lundi 15 mars 2010 à 17:39 +0800, Américo Wang a écrit :

>
> Ok, I think I found what lockdep really complains about, it is that we took
> spin_lock in netpoll_poll_lock() which is in hardirq-enabled environment,
> later, we took another spin_lock with spin_lock_irqsave() in netpoll_rx(),
> so lockdep thought we broke the locking rule.
>
> I don't know why netpoll_rx() needs irq disabled, it looks like that no one
> takes rx_lock in hardirq context. So can we use spin_lock(&rx_lock)
> instead? Or am I missing something here? Eric? David?

I am a bit lost.

Could you give the complete picture, because I cannot find it in my
netdev archives.



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Américo Wang on
On Mon, Mar 15, 2010 at 6:04 PM, Eric Dumazet <eric.dumazet(a)gmail.com> wrote:
> Le lundi 15 mars 2010 à 17:39 +0800, Américo Wang a écrit :
>
>>
>> Ok, I think I found what lockdep really complains about, it is that we took
>> spin_lock in netpoll_poll_lock() which is in hardirq-enabled environment,
>> later, we took another spin_lock with spin_lock_irqsave() in netpoll_rx(),
>> so lockdep thought we broke the locking rule.
>>
>> I don't know why netpoll_rx() needs irq disabled, it looks like that no one
>> takes rx_lock in hardirq context. So can we use spin_lock(&rx_lock)
>> instead? Or am I missing something here? Eric? David?
>
> I am a bit lost.
>
> Could you give the complete picture, because I cannot find it in my
> netdev archives.
>

Sure, sorry for this.

Here is the whole thread:

http://lkml.org/lkml/2010/3/11/100
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Eric Dumazet on
Le lundi 15 mars 2010 à 18:12 +0800, Américo Wang a écrit :
> On Mon, Mar 15, 2010 at 6:04 PM, Eric Dumazet <eric.dumazet(a)gmail.com> wrote:
> > Le lundi 15 mars 2010 à 17:39 +0800, Américo Wang a écrit :
> >
> >>
> >> Ok, I think I found what lockdep really complains about, it is that we took
> >> spin_lock in netpoll_poll_lock() which is in hardirq-enabled environment,
> >> later, we took another spin_lock with spin_lock_irqsave() in netpoll_rx(),
> >> so lockdep thought we broke the locking rule.
> >>
> >> I don't know why netpoll_rx() needs irq disabled, it looks like that no one
> >> takes rx_lock in hardirq context. So can we use spin_lock(&rx_lock)
> >> instead? Or am I missing something here? Eric? David?
> >
> > I am a bit lost.
> >
> > Could you give the complete picture, because I cannot find it in my
> > netdev archives.
> >
>
> Sure, sorry for this.
>
> Here is the whole thread:
>
> http://lkml.org/lkml/2010/3/11/100

OK thanks

netpoll_rx() can be called from hard irqs (netif_rx()), so rx_lock
definitly needs irq care.

netpoll_poll_lock() does take a spinlock with irq enabled, but its not
rx_lock, its napi->poll_lock.

I dont see what could be the problem, is it reproductible with vanilla
kernel ?


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Américo Wang on
On Mon, Mar 15, 2010 at 6:41 PM, Eric Dumazet <eric.dumazet(a)gmail.com> wrote:
> Le lundi 15 mars 2010 à 18:12 +0800, Américo Wang a écrit :
>> On Mon, Mar 15, 2010 at 6:04 PM, Eric Dumazet <eric.dumazet(a)gmail.com> wrote:
>> > Le lundi 15 mars 2010 à 17:39 +0800, Américo Wang a écrit :
>> >
>> >>
>> >> Ok, I think I found what lockdep really complains about, it is that we took
>> >> spin_lock in netpoll_poll_lock() which is in hardirq-enabled environment,
>> >> later, we took another spin_lock with spin_lock_irqsave() in netpoll_rx(),
>> >> so lockdep thought we broke the locking rule.
>> >>
>> >> I don't know why netpoll_rx() needs irq disabled, it looks like that no one
>> >> takes rx_lock in hardirq context. So can we use spin_lock(&rx_lock)
>> >> instead? Or am I missing something here? Eric? David?
>> >
>> > I am a bit lost.
>> >
>> > Could you give the complete picture, because I cannot find it in my
>> > netdev archives.
>> >
>>
>> Sure, sorry for this.
>>
>> Here is the whole thread:
>>
>> http://lkml.org/lkml/2010/3/11/100
>
> OK thanks
>
> netpoll_rx() can be called from hard irqs (netif_rx()), so rx_lock
> definitly needs irq care.
>
> netpoll_poll_lock() does take a spinlock with irq enabled, but its not
> rx_lock, its napi->poll_lock.

Yeah, I knew, but besides rcu locks, these two locks are the only
locks that can be taken in the call chain. I suspect lockdep got
something wrong.

>
> I dont see what could be the problem, is it reproductible with vanilla
> kernel ?
>

No. I don't know why, my patch doesn't touch any function in the
call chain.

I already "fix" this in another way, so no need to worry this any more.

Thanks for your help!
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/