From: Arjan van de Ven on
On Sun, 13 Dec 2009 07:58:44 +0100
Ingo Molnar <mingo(a)elte.hu> wrote:

>
> * Linus Torvalds <torvalds(a)linux-foundation.org> wrote:
>
> > We've had quite a bit of BKL work this merge-window. Maybe we'll
> > even get rid of it one of these days. There are "only" about 600
> > instances of "lock_kernel()" in the tree right now ;)
>
> I tend to use unlock_kernel() as the metric. (as it's more precisely
> greppable and it is also more indicative of the underlying complexity
> of locking, as it gets used more in more complex scenarios)

another metric is... how many times do we take the BKL for some
workload. (For example booting or compiling a kernel).
A counter like "BKLs-per-second" would be nice to expose
(and then we can track that number going up as a regression etc)

For me, a secondary metric would be "how many times do we depend on the
magic auto-drop/reget behavior".. also easy to build a counter for.

--
Arjan van de Ven Intel Open Source Technology Centre
For development, discussion and tips for power savings,
visit http://www.lesswatts.org
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Ingo Molnar on

* Linus Torvalds <torvalds(a)linux-foundation.org> wrote:

>
>
> On Sun, 13 Dec 2009, Ingo Molnar wrote:
> >
> > In the last ~4.5 years:
> >
> > earth4:~/tip> git checkout v2.6.12
> > Date: Fri Jun 17 12:48:29 2005 -0700
> > earth4:~/tip> git grep -w unlock_kernel | wc -l
> > 713
> >
> > earth4:~/tip> git checkout linus
> > Date: Fri Dec 11 20:58:20 2009 -0800
> > earth4:~/tip> git grep -w unlock_kernel | wc -l
> > 841
>
> Git hint of the day: you don't need to check out a kernel to do "git
> grep".
>
> Do this:
>
> git grep -w unlock_kernel v2.6.12 | wc
>
> and it will JustWork(tm).

/me adds it to the metal toolbox

> > Also, a lot of BKL use was hidden before, and due to the BKL removal
> > activities (by Thomas, Frederic, Jon, Alan and others) the remaining BKL using
> > sites are a lot more well defined, a lot more isolated and thus a lot more
> > removable than ever before.
>
> That's the main thing. We've been pushing them down a lot.
>
> We still have a few annoying core ones, though. I hate the execve() and file
> locking ones.

When we did the BKL-as-a-mutex trick and let lockdep loose on it, three areas
were particularly tricky: tty, reiser3 and NFS. tty and reiserfs should be ok
now, but i havent seen much activity on the NFS front.

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Trond Myklebust on
On Sun, 2009-12-13 at 19:17 +0100, Ingo Molnar wrote:
> When we did the BKL-as-a-mutex trick and let lockdep loose on it, three areas
> were particularly tricky: tty, reiser3 and NFS. tty and reiserfs should be ok
> now, but i havent seen much activity on the NFS front.

I've got a couple of NFS bkl removal patches queued up that I'll send on
to Linus today, but they will not suffice to fully remove BKL from the
NFS code.

The main remaining problem area is that of file locking (i.e. anything
that references inode->i_flock). I've started work on that, but a couple
of higher interrupts have prevented me from pulling it all together in
time for this merge window...

Trond

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Linus Torvalds on


On Sun, 13 Dec 2009, Trond Myklebust wrote:
>
> The main remaining problem area is that of file locking (i.e. anything
> that references inode->i_flock). I've started work on that, but a couple
> of higher interrupts have prevented me from pulling it all together in
> time for this merge window...

I'm pretty sure we've had at least two trees with the file locking code
fixed, but NFS in a status of "unknown".

If I recall correctly, the file locking code itself is not that hard:
we've done it without the kernel lock in the past (long long ago), and the
lock usage doesn't nest (or at least it didn't at some point back then ;).
In fact, I think we even do the actual lock data structure allocations
outside of the kernel lock exactly because we at one time had a patch that
used a spinlock for protection of the lists.

(Again, not only my memory, but the code itself may have bitrotted in the
meantime, of course).

Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Frederic Weisbecker on
On Sun, Dec 13, 2009 at 07:17:26PM +0100, Ingo Molnar wrote:
>
> * Linus Torvalds <torvalds(a)linux-foundation.org> wrote:
>
> >
> >
> > On Sun, 13 Dec 2009, Ingo Molnar wrote:
> > >
> > > In the last ~4.5 years:
> > >
> > > earth4:~/tip> git checkout v2.6.12
> > > Date: Fri Jun 17 12:48:29 2005 -0700
> > > earth4:~/tip> git grep -w unlock_kernel | wc -l
> > > 713
> > >
> > > earth4:~/tip> git checkout linus
> > > Date: Fri Dec 11 20:58:20 2009 -0800
> > > earth4:~/tip> git grep -w unlock_kernel | wc -l
> > > 841
> >
> > Git hint of the day: you don't need to check out a kernel to do "git
> > grep".
> >
> > Do this:
> >
> > git grep -w unlock_kernel v2.6.12 | wc
> >
> > and it will JustWork(tm).
>
> /me adds it to the metal toolbox
>
> > > Also, a lot of BKL use was hidden before, and due to the BKL removal
> > > activities (by Thomas, Frederic, Jon, Alan and others) the remaining BKL using
> > > sites are a lot more well defined, a lot more isolated and thus a lot more
> > > removable than ever before.
> >
> > That's the main thing. We've been pushing them down a lot.
> >
> > We still have a few annoying core ones, though. I hate the execve() and file
> > locking ones.
>
> When we did the BKL-as-a-mutex trick and let lockdep loose on it, three areas
> were particularly tricky: tty, reiser3 and NFS. tty and reiserfs should be ok
> now, but i havent seen much activity on the NFS front.
>
> Ingo


Nfs was a problem in the BKL-as-a-mutex tree because it is acquired
recursively and then locked up. And I suspect this recursion happens
only on mount time because vfs acquires it too there.

But looking at it more closely, it doesn't look that dramatic at
a first glance.

git-grep unlock_kernel fs/nfs

fs/nfs/callback.c: unlock_kernel();
fs/nfs/callback.c: unlock_kernel();

In nfs4_callback_svc() it embraces the socket
listening/processing callback thread.

svc_recv() might sleep so the bkl can be
lost in the middle.

And then svc_process().

This doesn't seem to protect anything there.

In nfs4_callback_svc() it's about the same thing, plus
a list manipulation but protected by a spinlock.


fs/nfs/delegation.c: unlock_kernel();
fs/nfs/delegation.c: unlock_kernel();
fs/nfs/nfs4state.c: unlock_kernel();
fs/nfs/nfs4state.c: unlock_kernel();



In the above cases we have the following comment:

/* Protect inode->i_flock using the BKL */

And really it doesn't seem to protect anything else,
fortunately it is acquired in a short path.


fs/nfs/super.c: unlock_kernel();
fs/nfs/super.c: unlock_kernel();

Protect the mount/unmount path, a bit trickier there.

But really that looks way much easier to fix than it was
with reiserfs.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/