From: Paul E. McKenney on
On Tue, May 04, 2010 at 11:41:49PM +0200, Arnd Bergmann wrote:
> On Tuesday 04 May 2010 23:26:31 Stephen Hemminger wrote:
> > > The new versions of the rcu_dereference() APIs require that any pointers
> > > passed to one of these APIs be fully defined. The ->br_port field
> > > in struct net_device points to a struct net_bridge_port, which is an
> > > incomplete type. This commit therefore changes ->br_port to be a void*,
> > > and introduces a br_port() helper function to convert the type to struct
> > > net_bridge_port, and applies this new helper function where required.
> >
> > I would rather make the bridge hook generic and not take a type argument.
>
> Not sure if you were confused by the comment in the same way that I was.
>
> The bridge hook is not impacted by this at all, since we can either pass
> a void* or a struct net_bridge_port* to it. The br_port() helper
> is used for all the places where we actually want to dereference
> dev->br_port and access its contents.

What should I change in the commit message to clear this up?

Of course, if the code needs to change, please let me know what should
change there as well.

Thanx, Paul
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Paul E. McKenney on
On Thu, May 06, 2010 at 04:09:25PM +0200, Arnd Bergmann wrote:
> On Thursday 06 May 2010, Paul E. McKenney wrote:
> > On Tue, May 04, 2010 at 11:41:49PM +0200, Arnd Bergmann wrote:
> > > On Tuesday 04 May 2010 23:26:31 Stephen Hemminger wrote:
> > > > > The new versions of the rcu_dereference() APIs require that any pointers
> > > > > passed to one of these APIs be fully defined. The ->br_port field
> > > > > in struct net_device points to a struct net_bridge_port, which is an
> > > > > incomplete type. This commit therefore changes ->br_port to be a void*,
> > > > > and introduces a br_port() helper function to convert the type to struct
> > > > > net_bridge_port, and applies this new helper function where required.
> > > >
> > > > I would rather make the bridge hook generic and not take a type argument.
> > >
> > > Not sure if you were confused by the comment in the same way that I was.
> > >
> > > The bridge hook is not impacted by this at all, since we can either pass
> > > a void* or a struct net_bridge_port* to it. The br_port() helper
> > > is used for all the places where we actually want to dereference
> > > dev->br_port and access its contents.
> >
> > What should I change in the commit message to clear this up?
> >
> > Of course, if the code needs to change, please let me know what should
> > change there as well.
>
> I think it's both ok, I was mostly confused by the discussion we had earlier.
> Maybe add a sentence like:
>
> The br_handle_frame_hook now needs a forward declaration of struct net_bridge_port.

Done!

> Or you just change br_handle_frame_hook to take a void* to avoid the forward
> declaration. Not sure what Stephen was referring to really.

This sounds like a way to make things quite a bit more intrusive, so
holding off on this.

Thanx, Paul
--
From: Stephen Hemminger on
On Wed, 12 May 2010 14:33:23 -0700
"Paul E. McKenney" <paulmck(a)linux.vnet.ibm.com> wrote:

> diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
> index 9101a4e..3f66cd1 100644
> --- a/net/bridge/br_fdb.c
> +++ b/net/bridge/br_fdb.c
> @@ -246,7 +246,7 @@ int br_fdb_test_addr(struct net_device *dev, unsigned char *addr)
>  		return 0;
>
>  	rcu_read_lock();
> -	fdb = __br_fdb_get(dev->br_port->br, addr);
> +	fdb = __br_fdb_get(br_port(dev)->br, addr);
>  	ret = fdb && fdb->dst->dev != dev &&
>  		fdb->dst->state == BR_STATE_FORWARDING;
>  	rcu_read_unlock();
> diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
> index 846d7d1..4fedb60 100644
> --- a/net/bridge/br_private.h
> +++ b/net/bridge/br_private.h
> @@ -229,6 +229,14 @@ static inline int br_is_root_bridge(const struct net_bridge *br)
>  	return !memcmp(&br->bridge_id, &br->designated_root, 8);
>  }
>
> +static inline struct net_bridge_port *br_port(const struct net_device *dev)
> +{
> +	if (!dev)
> +		return NULL;
> +
> +	return rcu_dereference(dev->br_port);
> +}

Looks like this is wrapping existing problems, and hurting not helping.

Why introduce a wrapper that could return NULL and not check the
result?

I would rather that:
1. dev should never be NULL in these cases, so the first if() is
unnecessary, and confuses the semantics.
2. don't use wrapper br_port()
3. have callers check that rcu_dereference(dev->br_port) did not
return NULL.
If the dereference does return NULL, then it means another CPU
has started teardown and this CPU should just go home quietly.
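
A minimal userspace sketch of point 3 above, where the caller checks the result of rcu_dereference() before using it. The rcu_read_lock(), rcu_read_unlock() and rcu_dereference() macros here are do-nothing stand-ins, not the kernel primitives, and the pared-down structs and port_is_forwarding() are hypothetical names used only for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in stubs (assumption): the kernel primitives do far more. */
#define rcu_read_lock()    do { } while (0)
#define rcu_read_unlock()  do { } while (0)
#define rcu_dereference(p) (p)

/* Illustrative value for this sketch. */
#define BR_STATE_FORWARDING 3

/* Pared-down, hypothetical versions of the real structures. */
struct net_bridge_port {
	int state;
};

struct net_device {
	struct net_bridge_port *br_port;
};

/* Returns 1 if dev has a bridge port in forwarding state, else 0. */
int port_is_forwarding(struct net_device *dev)
{
	struct net_bridge_port *port;
	int ret = 0;

	/* Point 1: dev is never NULL here, so state that, don't test it. */
	assert(dev != NULL);

	rcu_read_lock();
	port = rcu_dereference(dev->br_port);
	/* A NULL port means teardown has started: leave ret at 0. */
	if (port)
		ret = port->state == BR_STATE_FORWARDING;
	rcu_read_unlock();

	return ret;
}
```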

--
From: Paul E. McKenney on
On Wed, May 12, 2010 at 02:44:53PM -0700, Stephen Hemminger wrote:
> On Wed, 12 May 2010 14:33:23 -0700
> "Paul E. McKenney" <paulmck(a)linux.vnet.ibm.com> wrote:
>
> > diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
> > index 9101a4e..3f66cd1 100644
> > --- a/net/bridge/br_fdb.c
> > +++ b/net/bridge/br_fdb.c
> > @@ -246,7 +246,7 @@ int br_fdb_test_addr(struct net_device *dev, unsigned char *addr)
> >  		return 0;
> >
> >  	rcu_read_lock();
> > -	fdb = __br_fdb_get(dev->br_port->br, addr);
> > +	fdb = __br_fdb_get(br_port(dev)->br, addr);
> >  	ret = fdb && fdb->dst->dev != dev &&
> >  		fdb->dst->state == BR_STATE_FORWARDING;
> >  	rcu_read_unlock();
> > diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
> > index 846d7d1..4fedb60 100644
> > --- a/net/bridge/br_private.h
> > +++ b/net/bridge/br_private.h
> > @@ -229,6 +229,14 @@ static inline int br_is_root_bridge(const struct net_bridge *br)
> >  	return !memcmp(&br->bridge_id, &br->designated_root, 8);
> >  }
> >
> > +static inline struct net_bridge_port *br_port(const struct net_device *dev)
> > +{
> > +	if (!dev)
> > +		return NULL;
> > +
> > +	return rcu_dereference(dev->br_port);
> > +}
>
> Looks like this is wrapping existing problems, and hurting not helping.
>
> Why introduce a wrapper that could return NULL and not check the
> result?

Fair point!

> I would rather that:
> 1. dev should never be NULL in these cases, so the first if() is
> unnecessary, and confuses the semantics.
> 2. don't use wrapper br_port()
> 3. have callers check that rcu_dereference(dev->br_port) did not
> return NULL.
> If the dereference does return NULL, then it means another CPU
> has started teardown and this CPU should just go home quietly.

OK.

The reason for br_port() is to allow ->br_port to be a void*. If we
eliminate br_port(), then it is necessary to make the definition of the
struct net_bridge_port available everywhere that ->br_port is given to
rcu_dereference(). The reason for this is that Arnd's sparse-based RCU
checking code uses __rcu to tag the data pointed to by an RCU-protected
pointer. This in turn means that rcu_dereference() and friends must
now have access to the pointed-to type, as is done in patch 6 in this
series.

One way to make struct net_bridge_port available is to put:

#include "../../net/bridge/br_private.h"

in include/linux/netdevice.h.

However, when I try this, I get lots of build errors, which was what led
us to the path of making ->br_port be a void*, thus requiring the br_port()
helper function in cases where the caller needs the underlying type.
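
A minimal userspace sketch of that void* plus helper arrangement. The rcu_dereference() macro is a pass-through stand-in rather than the kernel primitive, the structs are pared down, and bridge_id_of() and the id field are hypothetical additions for illustration. The dev NULL check from the posted patch is left out, in line with the review comment above:

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in stub (assumption): not the kernel's rcu_dereference(). */
#define rcu_dereference(p) (p)

/* What generic code sees: the field is opaque, so no bridge type is needed. */
struct net_device {
	void *br_port;
};

/* What bridge-private code sees: the complete types (pared down here). */
struct net_bridge {
	int id;		/* hypothetical field for this sketch */
};

struct net_bridge_port {
	struct net_bridge *br;
};

/* Converts the opaque field back to its real type; may return NULL. */
struct net_bridge_port *br_port(const struct net_device *dev)
{
	return rcu_dereference(dev->br_port);
}

/* Hypothetical caller: returns the bridge id, or -1 if there is no port. */
int bridge_id_of(const struct net_device *dev)
{
	struct net_bridge_port *port;

	assert(dev != NULL);
	port = br_port(dev);
	return port ? port->br->id : -1;
}
```

Only code that can see the full struct net_bridge_port definition needs br_port(); everything else can keep passing the field around as an opaque pointer.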

What should we be doing instead?

Thanx, Paul
--
From: Stephen Hemminger on
On Wed, 12 May 2010 15:35:25 -0700
"Paul E. McKenney" <paulmck(a)linux.vnet.ibm.com> wrote:

> On Wed, May 12, 2010 at 02:44:53PM -0700, Stephen Hemminger wrote:
> > On Wed, 12 May 2010 14:33:23 -0700
> > "Paul E. McKenney" <paulmck(a)linux.vnet.ibm.com> wrote:
> >
> > > diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
> > > index 9101a4e..3f66cd1 100644
> > > --- a/net/bridge/br_fdb.c
> > > +++ b/net/bridge/br_fdb.c
> > > @@ -246,7 +246,7 @@ int br_fdb_test_addr(struct net_device *dev, unsigned char *addr)
> > > >  		return 0;
> > > >
> > > >  	rcu_read_lock();
> > > > -	fdb = __br_fdb_get(dev->br_port->br, addr);
> > > > +	fdb = __br_fdb_get(br_port(dev)->br, addr);
> > > >  	ret = fdb && fdb->dst->dev != dev &&
> > > >  		fdb->dst->state == BR_STATE_FORWARDING;
> > > >  	rcu_read_unlock();
> > > diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
> > > index 846d7d1..4fedb60 100644
> > > --- a/net/bridge/br_private.h
> > > +++ b/net/bridge/br_private.h
> > > @@ -229,6 +229,14 @@ static inline int br_is_root_bridge(const struct net_bridge *br)
> > > >  	return !memcmp(&br->bridge_id, &br->designated_root, 8);
> > > >  }
> > > >
> > > > +static inline struct net_bridge_port *br_port(const struct net_device *dev)
> > > > +{
> > > > +	if (!dev)
> > > > +		return NULL;
> > > > +
> > > > +	return rcu_dereference(dev->br_port);
> > > > +}
> >
> > Looks like this is wrapping existing problems, and hurting not helping.
> >
> > Why introduce a wrapper that could return NULL and not check the
> > result?
>
> Fair point!
>
> > I would rather that:
> > 1. dev should never be NULL in these cases, so the first if() is
> > unnecessary, and confuses the semantics.
> > 2. don't use wrapper br_port()
> > 3. have callers check that rcu_dereference(dev->br_port) did not
> > return NULL.
> > If the dereference does return NULL, then it means another CPU
> > has started teardown and this CPU should just go home quietly.
>
> OK.
>
> The reason for br_port() is to allow ->br_port to be a void*. If we
> eliminate br_port(), then it is necessary to make the definition of the
> struct net_bridge_port available everywhere that ->br_port is given to
> rcu_dereference(). The reason for this is that Arnd's sparse-based RCU
> checking code uses __rcu to tag the data pointed to by an RCU-protected
> pointer. This in turn means that rcu_dereference() and friends must
> now have access to the pointed-to type, as is done in patch 6 in this
> series.

Then OK, leave the wrapper, but get rid of the !dev part.

I can do it if you want.

I still don't like changing working code to conform to code checking tools,
especially when the code checking tool is missing bad RCU usage that already
exists (as in this case). It is a big problem if code assumes that
rcu_dereference() always returns non-NULL.

--