From: Peter Zijlstra on
On Fri, 2009-09-04 at 16:51 +1000, npiggin(a)suse.de wrote:
> @@ -932,6 +984,7 @@ static int select_parent(struct dentry *
> int found = 0;
>
> spin_lock(&dcache_lock);
> + spin_lock(&this_parent->d_lock);
> repeat:
> next = this_parent->d_subdirs.next;
> resume:
> @@ -939,8 +992,9 @@ resume:
> struct list_head *tmp = next;
> struct dentry *dentry = list_entry(tmp, struct dentry, d_u.d_child);
> next = tmp->next;
> + BUG_ON(this_parent == dentry);
>
> - spin_lock(&dentry->d_lock);
> + spin_lock_nested(&dentry->d_lock, DENTRY_D_LOCK_NESTED);

Right, so this isn't going to work well, this dentry recursion is
basically unbounded afaict, so the 2nd subdir will also be locked using
DENRTY_D_LOCKED_NESTED, resulting in the 1st and 2nd subdir both having
the same (sub)class and lockdep doesn't like that much.

Do we really need to keep the whole path locked? One of the comments
seems to suggest we could actually drop some locks and re-acquire.

> dentry_lru_del_init(dentry);
> /*
> * move only zero ref count dentries to the end
> @@ -950,33 +1004,45 @@ resume:
> dentry_lru_add_tail(dentry);
> found++;
> }
> - spin_unlock(&dentry->d_lock);
>
> /*
> * We can return to the caller if we have found some (this
> * ensures forward progress). We'll be coming back to find
> * the rest.
> */
> - if (found && need_resched())
> + if (found && need_resched()) {
> + spin_unlock(&dentry->d_lock);
> goto out;
> + }
>
> /*
> * Descend a level if the d_subdirs list is non-empty.
> */
> if (!list_empty(&dentry->d_subdirs)) {
> + spin_unlock(&this_parent->d_lock);
> + spin_release(&dentry->d_lock.dep_map, 1, _RET_IP_);
> this_parent = dentry;
> + spin_acquire(&this_parent->d_lock.dep_map, 0, 1, _RET_IP_);
> goto repeat;
> }
> +
> + spin_unlock(&dentry->d_lock);
> }
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Nick Piggin on
On Thu, Jun 17, 2010 at 05:13:35PM +0200, Peter Zijlstra wrote:
> On Fri, 2009-09-04 at 16:51 +1000, npiggin(a)suse.de wrote:
> > @@ -932,6 +984,7 @@ static int select_parent(struct dentry *
> > int found = 0;
> >
> > spin_lock(&dcache_lock);
> > + spin_lock(&this_parent->d_lock);
> > repeat:
> > next = this_parent->d_subdirs.next;
> > resume:
> > @@ -939,8 +992,9 @@ resume:
> > struct list_head *tmp = next;
> > struct dentry *dentry = list_entry(tmp, struct dentry, d_u.d_child);
> > next = tmp->next;
> > + BUG_ON(this_parent == dentry);
> >
> > - spin_lock(&dentry->d_lock);
> > + spin_lock_nested(&dentry->d_lock, DENTRY_D_LOCK_NESTED);
>
> Right, so this isn't going to work well, this dentry recursion is
> basically unbounded afaict, so the 2nd subdir will also be locked using
> DENRTY_D_LOCKED_NESTED, resulting in the 1st and 2nd subdir both having
> the same (sub)class and lockdep doesn't like that much.

No it's a bit of a trucky loop, but it is not unbounded. It takes the
parent, then the child, then it may continue again with the child as
the new parent but in that case it drops the parent lock and tricks
lockdep into not barfing.


> Do we really need to keep the whole path locked? One of the comments
> seems to suggest we could actually drop some locks and re-acquire.

As far as I can tell, RCU should be able to cover it without taking more
than 2 locks at a time. John saw some issues in the -rt tree (I haven't
reproduced yet) so he's locking the full chains there but I hope that
won't be needed.

>
> > dentry_lru_del_init(dentry);
> > /*
> > * move only zero ref count dentries to the end
> > @@ -950,33 +1004,45 @@ resume:
> > dentry_lru_add_tail(dentry);
> > found++;
> > }
> > - spin_unlock(&dentry->d_lock);
> >
> > /*
> > * We can return to the caller if we have found some (this
> > * ensures forward progress). We'll be coming back to find
> > * the rest.
> > */
> > - if (found && need_resched())
> > + if (found && need_resched()) {
> > + spin_unlock(&dentry->d_lock);
> > goto out;
> > + }
> >
> > /*
> > * Descend a level if the d_subdirs list is non-empty.
> > */
> > if (!list_empty(&dentry->d_subdirs)) {
> > + spin_unlock(&this_parent->d_lock);
> > + spin_release(&dentry->d_lock.dep_map, 1, _RET_IP_);
> > this_parent = dentry;
> > + spin_acquire(&this_parent->d_lock.dep_map, 0, 1, _RET_IP_);
> > goto repeat;

^^^ That's what we do when descending.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Peter Zijlstra on
On Fri, 2010-06-18 at 02:53 +1000, Nick Piggin wrote:

> > Right, so this isn't going to work well, this dentry recursion is
> > basically unbounded afaict, so the 2nd subdir will also be locked using
> > DENRTY_D_LOCKED_NESTED, resulting in the 1st and 2nd subdir both having
> > the same (sub)class and lockdep doesn't like that much.
>
> No it's a bit of a trucky loop, but it is not unbounded. It takes the
> parent, then the child, then it may continue again with the child as
> the new parent but in that case it drops the parent lock and tricks
> lockdep into not barfing.

Ah, indeed the thing you pointed out below should work.

> > Do we really need to keep the whole path locked? One of the comments
> > seems to suggest we could actually drop some locks and re-acquire.
>
> As far as I can tell, RCU should be able to cover it without taking more
> than 2 locks at a time. John saw some issues in the -rt tree (I haven't
> reproduced yet) so he's locking the full chains there but I hope that
> won't be needed.

Right, so I was staring at the -rt splat, so its John who created that
wreckage?

static int select_parent(struct dentry * parent)
{
struct dentry *this_parent;
struct list_head *next;
unsigned seq;
int found;

rename_retry:
found = 0;
this_parent = parent;
seq = read_seqbegin(&rename_lock);

spin_lock(&this_parent->d_lock);
repeat:
next = this_parent->d_subdirs.next;
resume:
while (next != &this_parent->d_subdirs) {
struct list_head *tmp = next;
struct dentry *dentry = list_entry(tmp, struct dentry, d_u.d_child);
next = tmp->next;

spin_lock_nested(&dentry->d_lock, DENTRY_D_LOCK_NESTED);
dentry_lru_del_init(dentry);
/*
* move only zero ref count dentries to the end
* of the unused list for prune_dcache
*/
if (!atomic_read(&dentry->d_count)) {
dentry_lru_add_tail(dentry);
found++;
}

/*
* We can return to the caller if we have found some (this
* ensures forward progress). We'll be coming back to find
* the rest.
*/
if (found && need_resched()) {
spin_unlock(&dentry->d_lock);
goto out;
}

/*
* Descend a level if the d_subdirs list is non-empty.
* Note that we keep a hold on the parent lock while
* we descend, so we don't have to reacquire it on
* ascend.
*/
if (!list_empty(&dentry->d_subdirs)) {
this_parent = dentry;
goto repeat;
}

spin_unlock(&dentry->d_lock);
}
/*
* All done at this level ... ascend and resume the search.
*/
if (this_parent != parent) {
struct dentry *tmp;
struct dentry *child;

tmp = this_parent->d_parent;
child = this_parent;
next = child->d_u.d_child.next;
spin_unlock(&this_parent->d_lock);
this_parent = tmp;
goto resume;
}

out:
/* Make sure we unlock all the way back up the tree */
while (this_parent != parent) {
struct dentry *tmp = this_parent->d_parent;
spin_unlock(&this_parent->d_lock);
this_parent = tmp;
}
spin_unlock(&this_parent->d_lock);
if (read_seqretry(&rename_lock, seq))
goto rename_retry;
return found;
}


> > > /*
> > > * Descend a level if the d_subdirs list is non-empty.
> > > */
> > > if (!list_empty(&dentry->d_subdirs)) {
> > > + spin_unlock(&this_parent->d_lock);
> > > + spin_release(&dentry->d_lock.dep_map, 1, _RET_IP_);
> > > this_parent = dentry;
> > > + spin_acquire(&this_parent->d_lock.dep_map, 0, 1, _RET_IP_);
> > > goto repeat;
>
> ^^^ That's what we do when descending.

You can write that as:
lock_set_subclass(&this_parent->d_lock.dep_map, 0, _RET_IP_);

See kernel/sched.c:double_unlock_balance().



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Nick Piggin on
On Mon, Jun 21, 2010 at 03:35:22PM +0200, Peter Zijlstra wrote:
> On Fri, 2010-06-18 at 02:53 +1000, Nick Piggin wrote:
>
> > > Right, so this isn't going to work well, this dentry recursion is
> > > basically unbounded afaict, so the 2nd subdir will also be locked using
> > > DENRTY_D_LOCKED_NESTED, resulting in the 1st and 2nd subdir both having
> > > the same (sub)class and lockdep doesn't like that much.
> >
> > No it's a bit of a trucky loop, but it is not unbounded. It takes the
> > parent, then the child, then it may continue again with the child as
> > the new parent but in that case it drops the parent lock and tricks
> > lockdep into not barfing.
>
> Ah, indeed the thing you pointed out below should work.
>
> > > Do we really need to keep the whole path locked? One of the comments
> > > seems to suggest we could actually drop some locks and re-acquire.
> >
> > As far as I can tell, RCU should be able to cover it without taking more
> > than 2 locks at a time. John saw some issues in the -rt tree (I haven't
> > reproduced yet) so he's locking the full chains there but I hope that
> > won't be needed.
>
> Right, so I was staring at the -rt splat, so its John who created that
> wreckage?

It was, but apparently they saw an RCU bug there somewhere and hit it
with the big hammer. I haven't been able to reproduce it on a non-rt
kernel yet, and I see yet why RCU is not good enough here.

> > > > /*
> > > > * Descend a level if the d_subdirs list is non-empty.
> > > > */
> > > > if (!list_empty(&dentry->d_subdirs)) {
> > > > + spin_unlock(&this_parent->d_lock);
> > > > + spin_release(&dentry->d_lock.dep_map, 1, _RET_IP_);
> > > > this_parent = dentry;
> > > > + spin_acquire(&this_parent->d_lock.dep_map, 0, 1, _RET_IP_);
> > > > goto repeat;
> >
> > ^^^ That's what we do when descending.
>
> You can write that as:
> lock_set_subclass(&this_parent->d_lock.dep_map, 0, _RET_IP_);
>
> See kernel/sched.c:double_unlock_balance().

OK I'll keep that in mind, thanks!

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Peter Zijlstra on
On Tue, 2010-06-22 at 00:48 +1000, Nick Piggin wrote:
> > Right, so I was staring at the -rt splat, so its John who created that
> > wreckage?
>
> It was, but apparently they saw an RCU bug there somewhere and hit it
> with the big hammer. I haven't been able to reproduce it on a non-rt
> kernel yet, and I see yet why RCU is not good enough here.

John, could you describe the failure you spotted?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/