From: Frederic Weisbecker on
On Fri, Jul 02, 2010 at 03:44:27PM +0200, Peter Zijlstra wrote:
> On Fri, 2010-07-02 at 15:12 +0200, Frederic Weisbecker wrote:
> >
> > I don't think the deadlock can really happen, as we can't release the directory while
> > we are reading it. Plus I guess we can't mmap a directory (someone correct me if
> > I'm wrong).
> >
>
> > Is there someone who could give me a hint here?
>
> If its purely directories you can try and give directory inode locks a
> different class.


We have a static layout in include/linux/fs.h:

enum inode_i_mutex_lock_class
{
I_MUTEX_NORMAL,
I_MUTEX_PARENT,
I_MUTEX_CHILD,
I_MUTEX_XATTR,
I_MUTEX_QUOTA
};


I fear none of them fits in our scheme, except the normal one.

And playing with a supplementary set of classes only used in
a single place would make the lockdep checks useless there,
even worse it would make us missing a lot of right checks.

vfs_readdir() locks the directory before calling the fs, and none
of the nested classes would be in its good right there, as the
directory is the current inode to work on, not a parent nor a
child or so.

And unfortunately it's the same in reiserfs_file_release() that
can close about whatever inode, we are locking the current inode,
not a parent, child, xattr, or so.


So changing the nesting class here would not be a good thing to do
I fear.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Peter Zijlstra on
On Fri, 2010-07-02 at 16:34 +0200, Frederic Weisbecker wrote:
> vfs_readdir() locks the directory before calling the fs, and none
> of the nested classes would be in its good right there, as the
> directory is the current inode to work on, not a parent nor a
> child or so.

Nah, what I was referring to is sticking the i_mutex into a different
class on inode creation, but apparently we already do, see
unlock_new_inode().

So its not a pure directory thing..
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Al Viro on
On Fri, Jul 02, 2010 at 03:12:52PM +0200, Frederic Weisbecker wrote:
> Right.
>
>
> The problem is:
>
> vfs_readdir() { do_munmap() {
> mutex_lock(inode); read or write(don't know)_lock(mm->mmap_sem)
> reiserfs_readdir() { reiserfs_file_release() {
> read_lock(mm->mmap_sem) mutex_lock(inode);
> } }
> } }
>
>
>
> I don't think the deadlock can really happen, as we can't release the directory while
> we are reading it. Plus I guess we can't mmap a directory (someone correct me if
> I'm wrong).

Gyah... For the 1001st time: readdir() is far from being the only thing that
nests mmap_sem inside i_mutex. In particular, write() does the same thing.

So yes, it *is* a real deadlock, TYVM, with no directories involved. Open the
same file twice, mmap one fd, close it, then have munmap() hitting i_mutex
in reiserfs_file_release() race with write() through another fd.

Incidentally, reiserfs_file_release() checks in the fastpath look completely
bogus. Checking i_count? What the hell is that one about? And no, these
checks won't stop open() coming between them and grabbing i_mutex, so they
couldn't prevent the deadlock in question anyway.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Al Viro on
On Sat, Jul 03, 2010 at 10:24:42AM +0100, Al Viro wrote:

> Gyah... For the 1001st time: readdir() is far from being the only thing that
> nests mmap_sem inside i_mutex. In particular, write() does the same thing.
>
> So yes, it *is* a real deadlock, TYVM, with no directories involved. Open the
> same file twice, mmap one fd, close it, then have munmap() hitting i_mutex
> in reiserfs_file_release() race with write() through another fd.
>
> Incidentally, reiserfs_file_release() checks in the fastpath look completely
> bogus. Checking i_count? What the hell is that one about? And no, these
> checks won't stop open() coming between them and grabbing i_mutex, so they
> couldn't prevent the deadlock in question anyway.

.... and unfortunately it's been that way since the the initial merge in 2.4.early.
FWIW, it seems that i_count check was a misguided attempt to check that no other
opened struct file are there, but it's
a) wrong, since way, _way_ back - open() affects d_count, not i_count
b) wrong even with such modification (consider hardlinks)
c) wrong for even more reasons since forever - i_count and d_count could
be bumped by many things at any time
d) hopelessly racy anyway, since another open() could very well have
happened just as we'd finished these checks.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/