From: Paul E. McKenney on
On Wed, Feb 24, 2010 at 03:59:58PM -0800, Paul E. McKenney wrote:
> On Thu, Feb 25, 2010 at 12:07:47AM +0100, Arnd Bergmann wrote:
> > On Wednesday 24 February 2010 23:17:49 Paul E. McKenney wrote:
> > > > Ok, this is a significant limitation of the list rcu annotation
> > > > then, it's not possible to pass the same list into list_for_each_entry
> > > > and list_for_each_entry_rcu with the way I changed the rcu list
> > > > definition. I would be possible to do a __list_for_each_entry_rcu
> > > > macro that takes an rcu_list_head but does not actually use
> > > > rcu_dereference, but I'm not sure if that's good enough.
> > >
> > > Hmmm... If the __rcu annotation was visible at runtime, it would be
> > > easy provide an annotated version of list_for_each_entry_rcu() that
> > > checks for module_mutex being held under lockdep.
> >
> > Well, if we keep the struct rcu_list_head and make it mandatory for
> > rcu protected lists, it could be defined as
> >
> > struct rcu_list_head {
> > struct list_head head;
> > #ifdef CONFIG_PROVE_RCU
> > bool (*check)(void);
> > #endif
> > };
> >
> > #ifdef CONFIG_PROVE_RCU
> > #define RCU_LIST_HEAD_INIT_CHECK(__head, __check) \
> > { .head = LIST_HEAD_INIT((__head).head), .check = (__check) }
> > #else
> > #define RCU_LIST_HEAD_INIT_CHECK(__list,__check) {.head = LIST_HEAD_INIT((__head).head) }
> > #endif
> >
> > #define RCU_LIST_HEAD_INIT(head) RCU_LIST_HEAD_INIT_CHECK(head,&rcu_read_lock_held)
> > #define RCU_LIST_HEAD_INIT_BH(head) RCU_LIST_HEAD_INIT_CHECK(head,&rcu_read_lock_bh_held)
> >
> > #define list_entry_rcu_check(ptr, type, member, check) \
> > container_of(rcu_dereference_check((void __rcu __force *)(ptr), check), type, member)
> >
> > #define list_for_each_entry_rcu(pos, __head, member) \
> > for (pos = list_entry_rcu_check((__head)->head.next, typeof(*pos), \
> > member, (__head)->check); \
> > prefetch(pos->member.next), &pos->member != (head); \
> > pos = list_entry_rcu_check(pos->member.next, typeof(*pos), \
> > member, (__head)->check)))
> >
> > That would let us check all the heads for correct usage, and at the same
> > time avoid having to annotate all the list entries.
>
> Cool!!!
>
> The nice thing about this is that we don't end up with the API explosion
> for the RCU list primitives. However, it does require that a given
> rcu_list_head have a single synchronization-design rule for all uses.
> Of course, if there were multiple rules, one could construct a check
> that was simply the union of all the rules, but that would miss some
> types of errors.
>
> Of course, if this became a problem, there could be an argument to the
> ->check function that the normal list_for_each_entry_rcu() defaults to
> "no effect".
>
> Or is there a better way to handle this?

One approach would be to use your original sparse-based approach, but
use an rcu_deference_const(ptr,lockdep_condition) for cases when the
value cannot change, for example, when the update-side lock is held.
This should eliminate most of the false positives, in particular,
eliminate the need for otherwise-useless rcu_read_lock()s -- and also
for the compiler constraints in the normal rcu_dereference().

Your pointer-to-function idea could be a really cool way to handle the
tree algorithms that can be protected either by RCU or by locking.
The tree nodes could have the pointer to check function, and the
current rcu_dereference_raw() calls could be replaced by an invocation
of rcu_dereference_check() that calls the check function. A check
function for an RCU-protected tree would use "rcu_read_lock_held() ||
lockdep_is_held(&update_side_lock)", while a lock-protected tree would
just use "lockdep_is_held(&update_side_lock)".

Thoughts?

Thanx, Paul
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Arnd Bergmann on
On Thursday 25 February 2010, Paul E. McKenney wrote:
> >
> > The nice thing about this is that we don't end up with the API explosion
> > for the RCU list primitives. However, it does require that a given
> > rcu_list_head have a single synchronization-design rule for all uses.
> > Of course, if there were multiple rules, one could construct a check
> > that was simply the union of all the rules, but that would miss some
> > types of errors.

What would it miss? E.g. having the module code check for
(mutex_is_locked(&module_lock) || rcu_read_lock_held) should
cover all cases as far as I can tell.

> > Of course, if this became a problem, there could be an argument to the
> > ->check function that the normal list_for_each_entry_rcu() defaults to
> > "no effect".

I've also been thinking about adding a list_for_each_entry_norcu()
macro that takes an rcu_list_head but then just performs a simple
list_for_each_entry().

> > Or is there a better way to handle this?
>
> One approach would be to use your original sparse-based approach, but
> use an rcu_deference_const(ptr,lockdep_condition) for cases when the
> value cannot change, for example, when the update-side lock is held.
> This should eliminate most of the false positives, in particular,
> eliminate the need for otherwise-useless rcu_read_lock()s -- and also
> for the compiler constraints in the normal rcu_dereference().

Right.

> Your pointer-to-function idea could be a really cool way to handle the
> tree algorithms that can be protected either by RCU or by locking.
> The tree nodes could have the pointer to check function, and the
> current rcu_dereference_raw() calls could be replaced by an invocation
> of rcu_dereference_check() that calls the check function. A check
> function for an RCU-protected tree would use "rcu_read_lock_held() ||
> lockdep_is_held(&update_side_lock)", while a lock-protected tree would
> just use "lockdep_is_held(&update_side_lock)".

I've postponed that problem for now, and updated my series to split
the rculist annotations from the basic __rcu pointer annotations,
as well as to apply on top of your patches in tip/core/rcu,
see http://git.kernel.org/?p=linux/kernel/git/arnd/playground.git;\
a=shortlog;h=refs/heads/rcu-annotate-tip.

Should we merge the simple annotations in this merge window and
then think about rculist and trees separately?

Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Paul E. McKenney on
On Thu, Feb 25, 2010 at 07:10:34PM +0100, Arnd Bergmann wrote:
> On Thursday 25 February 2010, Paul E. McKenney wrote:
> > >
> > > The nice thing about this is that we don't end up with the API explosion
> > > for the RCU list primitives. However, it does require that a given
> > > rcu_list_head have a single synchronization-design rule for all uses.
> > > Of course, if there were multiple rules, one could construct a check
> > > that was simply the union of all the rules, but that would miss some
> > > types of errors.
>
> What would it miss? E.g. having the module code check for
> (mutex_is_locked(&module_lock) || rcu_read_lock_held) should
> cover all cases as far as I can tell.

My concern is single data structures used in different parts of the code
with different update-side locks, perhaps also different flavors of RCU.
Some of the tree data structures in the Linux kernel can be protected by
either locking or RCU, for example.

> > > Of course, if this became a problem, there could be an argument to the
> > > ->check function that the normal list_for_each_entry_rcu() defaults to
> > > "no effect".
>
> I've also been thinking about adding a list_for_each_entry_norcu()
> macro that takes an rcu_list_head but then just performs a simple
> list_for_each_entry().

We might need to do something like this, but if we do, we need to
minimize the need to use it.

> > > Or is there a better way to handle this?
> >
> > One approach would be to use your original sparse-based approach, but
> > use an rcu_deference_const(ptr,lockdep_condition) for cases when the
> > value cannot change, for example, when the update-side lock is held.
> > This should eliminate most of the false positives, in particular,
> > eliminate the need for otherwise-useless rcu_read_lock()s -- and also
> > for the compiler constraints in the normal rcu_dereference().
>
> Right.
>
> > Your pointer-to-function idea could be a really cool way to handle the
> > tree algorithms that can be protected either by RCU or by locking.
> > The tree nodes could have the pointer to check function, and the
> > current rcu_dereference_raw() calls could be replaced by an invocation
> > of rcu_dereference_check() that calls the check function. A check
> > function for an RCU-protected tree would use "rcu_read_lock_held() ||
> > lockdep_is_held(&update_side_lock)", while a lock-protected tree would
> > just use "lockdep_is_held(&update_side_lock)".
>
> I've postponed that problem for now, and updated my series to split
> the rculist annotations from the basic __rcu pointer annotations,
> as well as to apply on top of your patches in tip/core/rcu,
> see http://git.kernel.org/?p=linux/kernel/git/arnd/playground.git;\
> a=shortlog;h=refs/heads/rcu-annotate-tip.
>
> Should we merge the simple annotations in this merge window and
> then think about rculist and trees separately?

I haven't given up on the possibility of getting the whole thing into
this merge window, but if that is not possible, it would be good to
start on the annotations. Of course, the annotations would need to be
done so that they don't rain false positives on people who are not
actively looking to see them.

Thanx, Paul
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Paul E. McKenney on
On Thu, Feb 25, 2010 at 12:05:32PM -0800, Paul E. McKenney wrote:
> On Thu, Feb 25, 2010 at 07:10:34PM +0100, Arnd Bergmann wrote:

[ . . . ]

> > I've postponed that problem for now, and updated my series to split
> > the rculist annotations from the basic __rcu pointer annotations,
> > as well as to apply on top of your patches in tip/core/rcu,
> > see http://git.kernel.org/?p=linux/kernel/git/arnd/playground.git;\
> > a=shortlog;h=refs/heads/rcu-annotate-tip.

At first glance, this looks reasonably sane. I have looked up through
the "scheduler: __rcu annotations" commit.

Some comments:

o The name rcu_dereference_const() makes more sense to me than
does __rcu_dereference(), as it documents when you can safely
use it -- when something is preventing the RCU-protected
pointer in question from changing.

o Uses of __rcu_dereference() in your playground.git that are
safe because some lock is held should be changed to
rcu_dereference_check(), mentioning that lock. Ditto zero
reference counts.

For example, in your first change to put_ctx() in
kernel/perf_event.c, the:

put_ctx(__rcu_dereference(ctx->parent_ctx));

should instead be:

put_ctx(rcu_dereference_check(ctx->parent_ctx,
ctx->refcount == 0));

This does take a bit more space, but very clearly documents
the synchronization design and enables the combination of
sparse and lockdep to enforce it. And yes, this example has
the "if" right above the use, but many other cases are not
so easy to see so quickly. And a future change might well
rearrange the code so that the "if" is a long ways away from
the dereference.

o Whatever we choose for the name of what is __rcu_dereference()
in your tree, uses should be commented, just as things like
smp_mb() are commented. For example:

q = __rcu_dereference(p->next); /* Initialization. */

to indicate that the structure is still being initialized so
that no other CPU or task has access to it.

Again, looks promising!

Thanx, Paul

> > Should we merge the simple annotations in this merge window and
> > then think about rculist and trees separately?
>
> I haven't given up on the possibility of getting the whole thing into
> this merge window, but if that is not possible, it would be good to
> start on the annotations. Of course, the annotations would need to be
> done so that they don't rain false positives on people who are not
> actively looking to see them.
>
> Thanx, Paul
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/