From: Alan Cox on
> The do_tty_hangup()->tty_fasync() path takes the locks in the
> file_list_lock()->lock_kernel() direction whereas most other code takes
> them in the other direction, which cannot be good. But I'm not sure

That's a bug - the BKL does want to be taken first (and will
sleep/yield/drop)

> Have a trace. I'm actually wondering if perhaps there's a missing
> unlock_kernel() somewhere else, and the tty code is just the victim of
> that.

It's introduced by the BKL shuffle. Try the following


commit 1f61f07a985c7e8cfc20ad8fcced2f3d226bd0dc
Author: Alan Cox <alan(a)linux.intel.com>
Date: Sat Dec 12 10:32:36 2009 +0000

tty: Fix the AB-BA locking bug introduced in the BKL split

The fasync path takes the BKL (it probably doesn't need to in fact) but
this causes lock inversions and deadlocks so we can't do that. Leave the
BKL over that bit for the moment.

Identified by AKPM.

Signed-off-by: Alan Cox <alan(a)linux.intel.com>

diff --git a/drivers/char/tty_io.c b/drivers/char/tty_io.c
index 684f0e0..f15df40 100644
--- a/drivers/char/tty_io.c
+++ b/drivers/char/tty_io.c
@@ -516,7 +516,6 @@ static void do_tty_hangup(struct work_struct *work)
 	/* inuse_filps is protected by the single kernel lock */
 	lock_kernel();
 	check_tty_count(tty, "do_tty_hangup");
-	unlock_kernel();

 	file_list_lock();
 	/* This breaks for file handles being sent over AF_UNIX sockets ? */
@@ -531,7 +530,6 @@ static void do_tty_hangup(struct work_struct *work)
 	}
 	file_list_unlock();

-	lock_kernel();
 	tty_ldisc_hangup(tty);

 	read_lock(&tasklist_lock);
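
The inversion fixed above can be illustrated with a small userspace model.
This is a sketch only, not kernel code: the lock IDs, the seen_order table
and the path functions are hypothetical names invented for illustration. It
records which lock is taken while another is held, treats a nested
acquisition of an already-held lock as recursive (so it adds no new
ordering), and counts an inversion once both A->B and B->A have been seen.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for the two locks discussed in the thread. */
enum { LOCK_BKL, LOCK_FILE_LIST, NR_LOCKS };

static int depth[NR_LOCKS];                /* nesting depth per lock */
static int seen_order[NR_LOCKS][NR_LOCKS]; /* [a][b]: b taken while a held */
static int inversions;

static void model_lock(int id)
{
	int other;

	/* A recursive acquisition adds no new ordering. */
	if (depth[id]++ > 0)
		return;

	for (other = 0; other < NR_LOCKS; other++) {
		if (other == id || depth[other] == 0)
			continue;
		if (seen_order[id][other])
			inversions++;	/* opposite order was seen earlier */
		seen_order[other][id] = 1;
	}
}

static void model_unlock(int id)
{
	assert(depth[id] > 0);
	depth[id]--;
}

/* The usual order elsewhere: BKL first, then file_list_lock. */
static void common_path(void)
{
	model_lock(LOCK_BKL);
	model_lock(LOCK_FILE_LIST);
	model_unlock(LOCK_FILE_LIST);
	model_unlock(LOCK_BKL);
}

/* Before the fix: the BKL is dropped, then re-taken under file_list_lock. */
static void hangup_before_fix(void)
{
	model_lock(LOCK_BKL);		/* lock_kernel() */
	model_unlock(LOCK_BKL);		/* unlock_kernel() */
	model_lock(LOCK_FILE_LIST);	/* file_list_lock() */
	model_lock(LOCK_BKL);		/* tty_fasync() takes the BKL */
	model_unlock(LOCK_BKL);
	model_unlock(LOCK_FILE_LIST);	/* file_list_unlock() */
	model_lock(LOCK_BKL);		/* lock_kernel() */
	model_unlock(LOCK_BKL);
}

/* After the fix: the BKL stays held, so tty_fasync() only recurses. */
static void hangup_after_fix(void)
{
	model_lock(LOCK_BKL);		/* lock_kernel() */
	model_lock(LOCK_FILE_LIST);	/* file_list_lock() */
	model_lock(LOCK_BKL);		/* recursive, no new ordering */
	model_unlock(LOCK_BKL);
	model_unlock(LOCK_FILE_LIST);	/* file_list_unlock() */
	model_unlock(LOCK_BKL);
}

/* Run the common path plus one hangup variant; return inversions seen. */
int count_inversions(int fixed)
{
	memset(depth, 0, sizeof(depth));
	memset(seen_order, 0, sizeof(seen_order));
	inversions = 0;

	common_path();
	if (fixed)
		hangup_after_fix();
	else
		hangup_before_fix();
	return inversions;
}
```

In this model count_inversions(0) sees file_list_lock -> BKL after the
common path has already established BKL -> file_list_lock, while
count_inversions(1) only ever sees the one order.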
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Linus Torvalds on


On Sat, 12 Dec 2009, Thomas Gleixner wrote:
>
> Just patched the following in and it caught your problem nicely. With
> your AB/BA fix patch applied everything is fine.

Actually, the patch I just suggested might be better. Admittedly the
CONFIG_PREEMPT case is the more important one (and the one that will catch
more cases), but even without preemption the might_sleep() in
_lock_kernel() (one underscore) would trigger on the case of somebody
doing lock_kernel with local interrupts disabled.

Linus
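
The difference between the two placements can be sketched as a userspace
model. Everything here is an assumption made for illustration: the
ctx_model struct and the function names are hypothetical, and the model
assumes that without CONFIG_PREEMPT a held spinlock is not visible to the
might_sleep() check, so only disabled interrupts are detected there.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical description of the calling context. */
struct ctx_model {
	bool config_preempt;	/* kernel built with CONFIG_PREEMPT? */
	bool irqs_disabled;	/* local interrupts off? */
	int spinlocks_held;	/* spinlocks held by the caller */
};

/* Would a might_sleep() check complain in this context (model only)? */
static bool might_sleep_would_warn(const struct ctx_model *c)
{
	if (c->irqs_disabled)
		return true;
	/* Assumption: held spinlocks are only detectable with preemption. */
	return c->config_preempt && c->spinlocks_held > 0;
}

/* Check inside __lock_kernel(), which exists only under CONFIG_PREEMPT. */
bool inner_check_warns(const struct ctx_model *c)
{
	return c->config_preempt && might_sleep_would_warn(c);
}

/* Check inside _lock_kernel() (one underscore), built for both configs. */
bool outer_check_warns(const struct ctx_model *c)
{
	return might_sleep_would_warn(c);
}
```

With interrupts disabled on a PREEMPT=n build, only the outer placement
warns in this model; on a PREEMPT=y build both placements behave the same.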
From: Linus Torvalds on


On Sat, 12 Dec 2009, Thomas Gleixner wrote:
> >
> > Anybody willing to be the guinea-pig?
>
> Replaced my patch with yours and it works the same way (except for the
> PREEMPT=n case)
>
> Acked-and-Tested-by: Thomas Gleixner <tglx(a)linutronix.de>

OK, I decided to test it myself too (after applying the tty layer fix),
and it doesn't seem to cause any problems, so I've committed it.

If there is dubious BKL usage that triggers the new might_sleep() warning
(and it turns out that it's necessary and not really fixable), we can
always just remove it again. But on the other hand, maybe it shows some
other potential problems that should just be fixed.

We've had quite a bit of BKL work this merge-window. Maybe we'll even get
rid of it one of these days. There are "only" about 600 instances of
"lock_kernel()" in the tree right now ;)

Linus
From: Thomas Gleixner on
On Sat, 12 Dec 2009, Alan Cox wrote:
> > I think we could possibly add a "__might_sleep()" to _lock_kernel(). It
> > doesn't really sleep, but it's invalid to take the kernel lock in an
> > atomic region, so __might_sleep() might be the right thing anyway.
>
> It's only invalid if you don't already hold the lock. The old tty code
> worked because every path into tty_fasync already held the lock ! That
> specific case - taking it the first time should definitely
> __might_sleep().
>
> Mind you it's probably still rather dumb and would be a good debugging
> aid for -next to be able to warn on all offences if only to catch this
> stuff for the future BKL removal work.

Just patched the following in and it caught your problem nicely. With
your AB/BA fix patch applied everything is fine.

Thanks,

tglx
---
Subject: BKL: Add might sleep to __lock_kernel
From: Thomas Gleixner <tglx(a)linutronix.de>
Date: Sat, 12 Dec 2009 20:29:00 +0100

Catches all offenders which take the BKL for the first time in an atomic
region. Recursive lock_kernel calls are not affected.

Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
---
lib/kernel_lock.c | 2 ++
1 file changed, 2 insertions(+)

Index: linux-2.6/lib/kernel_lock.c
===================================================================
--- linux-2.6.orig/lib/kernel_lock.c
+++ linux-2.6/lib/kernel_lock.c
@@ -64,6 +64,8 @@ void __lockfunc __release_kernel_lock(vo
 #ifdef CONFIG_PREEMPT
 static inline void __lock_kernel(void)
 {
+	might_sleep();
+
 	preempt_disable();
 	if (unlikely(!_raw_spin_trylock(&kernel_flag))) {
 		/*
From: Thomas Gleixner on
On Sat, 12 Dec 2009, Linus Torvalds wrote:

>
> On Sat, 12 Dec 2009, Alan Cox wrote:
>
> > > I think we could possibly add a "__might_sleep()" to _lock_kernel(). It
> > > doesn't really sleep, but it's invalid to take the kernel lock in an
> > > atomic region, so __might_sleep() might be the right thing anyway.
> >
> > It's only invalid if you don't already hold the lock.
>
> True.
>
> > The old tty code worked because every path into tty_fasync already held
> > the lock ! That specific case - taking it the first time should
> > definitely __might_sleep().
>
> That would give us at least somewhat better debugging. And it's a very
> natural thing to do. IOW, just something like the appended.
>
> But maybe it complains about valid (but unusual) things. For example, it's
> not strictly speaking _wrong_ to take the kernel lock while preemption is
> disabled, even though it's a really bad idea.
>
> Anybody willing to be the guinea-pig?

Replaced my patch with yours and it works the same way (except for the
PREEMPT=n case)

Acked-and-Tested-by: Thomas Gleixner <tglx(a)linutronix.de>

> Linus
>
> ---
> lib/kernel_lock.c | 4 +++-
> 1 files changed, 3 insertions(+), 1 deletions(-)
>
> diff --git a/lib/kernel_lock.c b/lib/kernel_lock.c
> index 4ebfa5a..5526b46 100644
> --- a/lib/kernel_lock.c
> +++ b/lib/kernel_lock.c
> @@ -122,8 +122,10 @@ void __lockfunc _lock_kernel(const char *func, const char *file, int line)
>
>  	trace_lock_kernel(func, file, line);
>
> -	if (likely(!depth))
> +	if (likely(!depth)) {
> +		might_sleep();
>  		__lock_kernel();
> +	}
>  	current->lock_depth = depth;
>  }
>
>
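
The depth check in the quoted patch can be sketched in userspace as
follows. This is a model under stated assumptions, not the kernel
implementation: task_model, atomic_count and the scenario functions are
hypothetical names, and lock_depth is assumed to start at -1 when the BKL
is not held. The point it illustrates is that only the first acquisition
runs the check, so a recursive lock_kernel() inside an atomic region does
not warn.

```c
#include <assert.h>

/* Hypothetical per-task state. */
struct task_model {
	int lock_depth;		/* assumed -1 when the BKL is not held */
	int atomic_count;	/* > 0 while inside an atomic region */
};

static int warnings;

/* Stand-in for might_sleep(): complain when called in an atomic region. */
static void model_might_sleep(const struct task_model *t)
{
	if (t->atomic_count > 0)
		warnings++;
}

/* Mirrors the shape of the patched _lock_kernel(). */
static void model_lock_kernel(struct task_model *t)
{
	int depth = t->lock_depth + 1;

	if (!depth) {
		model_might_sleep(t);
		/* the real lock would be acquired here */
	}
	t->lock_depth = depth;
}

static void model_unlock_kernel(struct task_model *t)
{
	assert(t->lock_depth >= 0);
	t->lock_depth--;
}

/* First acquisition inside an atomic region: should warn. */
int warnings_first_take_in_atomic(void)
{
	struct task_model t = { -1, 0 };

	warnings = 0;
	t.atomic_count++;		/* e.g. a spinlock is taken */
	model_lock_kernel(&t);
	model_unlock_kernel(&t);
	t.atomic_count--;
	return warnings;
}

/* BKL already held, then re-taken inside an atomic region: no warning. */
int warnings_recursive_take_in_atomic(void)
{
	struct task_model t = { -1, 0 };

	warnings = 0;
	model_lock_kernel(&t);		/* first take, process context */
	t.atomic_count++;		/* e.g. a spinlock is taken */
	model_lock_kernel(&t);		/* recursive, check is skipped */
	model_unlock_kernel(&t);
	t.atomic_count--;
	model_unlock_kernel(&t);
	return warnings;
}
```

The first scenario corresponds to do_tty_hangup() before the tty fix, the
second to the code after it.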