From: David Miller
From: David Miller <davem(a)davemloft.net>
Date: Tue, 02 Feb 2010 17:40:45 -0800 (PST)

> From: "Kevin Pedretti" <ktpedre(a)sandia.gov>
> Date: Tue, 2 Feb 2010 18:24:02 -0700
>
>> 2. May want to use alloc_netdev() -> Didn't do this. Would there be a
>> substantial advantage to doing this?
>
> I think you're going to end up having to make this change.

Ignore this, using alloc_etherdev() should be just fine.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: David Miller
From: "Kevin Pedretti" <ktpedre(a)sandia.gov>
Date: Tue, 2 Feb 2010 18:24:02 -0700

> 4. Device only supports IPv4? -> Yes, that's correct. No IPv6 support.
> The driver squashes everything but IPv4 in eth2ss().

Not just IPV6, what about other ethernet protocols?

What about ARP? How does IPV4 work if you only accept ETH_P_IP? You
need to accept at least ETH_P_ARP for things to work.

> 2. May want to use alloc_netdev() -> Didn't do this. Would there be a
> substantial advantage to doing this?

I think you're going to end up having to make this change.
From: Randy Dunlap
On 02/02/10 17:24, Kevin Pedretti wrote:
> Thank you all for the review comments. I believe most of the issues
> have been addressed in the patch just posted. I apologize if there are
> still issues, and certainly appreciate further comments.
>
>
> Randy Dunlap's comments:
> 2. Odd spacing -> I'm not seeing this. Spacing looks correct to me.

Thanks. Seems to be something that Thunderbird 3.0 is doing for me. :(

--
~Randy
From: Kevin Pedretti
On Tue, 2010-02-02 at 18:40 -0700, David Miller wrote:
> From: "Kevin Pedretti" <ktpedre(a)sandia.gov>
> Date: Tue, 2 Feb 2010 18:24:02 -0700
>
> > 4. Device only supports IPv4? -> Yes, that's correct. No IPv6 support.
> > The driver squashes everything but IPv4 in eth2ss().
>
> Not just IPV6, what about other ethernet protocols?
>
> What about ARP? How does IPV4 work if you only accept ETH_P_IP? You
> need to accept at least ETH_P_ARP for things to work.


The only thing the driver supports currently is point-to-point IPv4,
nothing else. The limitation is that the header format for datagram
messages is fixed, and it isn't really set up for Ethernet encapsulation:

Ethernet Frame: [6 bytes h_dest][6 bytes h_source][2 bytes h_proto][data...]
SeaStar DG Message: [2 bytes length][1 byte MBZ][1 byte msg type ((2 << 5) for IP)][data...]
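
To make the mismatch concrete, here is a rough userspace C sketch of the two
headers as laid out above. It is an illustration only, not the driver's
eth2ss(): the struct and function names are hypothetical, and the byte order
of the length field (and whether it counts only the payload) is an assumption.
Only the field sizes and the (2 << 5) IP message type come from the layout
above.

```c
#include <stdint.h>

#define SKETCH_ETH_P_IP     0x0800u    /* IPv4 EtherType (standard value) */
#define SKETCH_SS_TYPE_IP   (2u << 5)  /* IP message type, per the layout above */

/* 14-byte Ethernet header, fields as in the layout above. */
struct sketch_eth_hdr {
	uint8_t h_dest[6];
	uint8_t h_source[6];
	uint8_t h_proto[2];     /* EtherType, network byte order */
};

/* 4-byte SeaStar datagram header, fields as in the layout above. */
struct sketch_ss_dg_hdr {
	uint8_t length[2];      /* ASSUMPTION: little-endian payload length */
	uint8_t mbz;            /* must be zero */
	uint8_t msg_type;
};

/*
 * Hypothetical stand-in for the Ethernet -> SeaStar header conversion.
 * Only the protocol survives the translation; both MAC addresses are
 * dropped, which is why nothing but IPv4 fits in the fixed header.
 * Returns 0 on success, -1 if the frame is not IPv4.
 */
static int sketch_eth_to_ss(const struct sketch_eth_hdr *eth,
			    uint16_t payload_len,
			    struct sketch_ss_dg_hdr *ss)
{
	uint16_t proto = (uint16_t)((eth->h_proto[0] << 8) | eth->h_proto[1]);

	if (proto != SKETCH_ETH_P_IP)
		return -1;              /* everything but IPv4 is squashed */

	ss->length[0] = (uint8_t)(payload_len & 0xff);
	ss->length[1] = (uint8_t)(payload_len >> 8);
	ss->mbz = 0;
	ss->msg_type = (uint8_t)SKETCH_SS_TYPE_IP;
	return 0;
}
```

The 14-byte header collapses into 4 bytes with no room left for a source or
destination MAC, so the receiver has to rebuild a fake Ethernet header from
what it knows about the sender.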

I think it would be possible to re-factor it so that the Ethernet frame
is encapsulated in its entirety within a seastar message, rather than
the current scheme of jamming the critical info from the Ethernet header
into the seastar datagram header. I will pursue that if you want... the
drawback is that it would break compatibility with Cray's existing
proprietary IP over SeaStar driver, making this driver pretty much
useless for the kinds of things we and others would like to do (e.g.,
leave service nodes booted with Cray's proprietary software stack and
talking to compute nodes running this driver).

As far as ARP goes, it isn't supported since the underlying network is
point-to-point only with no hardware broadcast. At bootup, each node's
ARP table is pre-populated with entries for every node in the system,
and each node's MAC address encodes its node ID on the mesh. This
driver uses the NID in the target MAC address to know who to send the
skb to.
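
As a sketch of what "the MAC address encodes the node ID" could look like in
C: the helper names are hypothetical, and the actual bit layout is not given
in this thread. This example simply assumes the NID sits big-endian in the
last four octets of a locally administered unicast address.

```c
#include <stdint.h>

#define SKETCH_ETH_ALEN 6

/*
 * ASSUMPTION: octets 2..5 carry the NID in big-endian order and the
 * first octet is 0x02 (locally administered, unicast). The real
 * encoding used on Cray XT systems may differ.
 */
static void sketch_nid_to_mac(uint32_t nid, uint8_t mac[SKETCH_ETH_ALEN])
{
	mac[0] = 0x02;
	mac[1] = 0x00;
	mac[2] = (uint8_t)(nid >> 24);
	mac[3] = (uint8_t)(nid >> 16);
	mac[4] = (uint8_t)(nid >> 8);
	mac[5] = (uint8_t)nid;
}

/* Recover the destination NID from the target MAC of an outgoing frame. */
static uint32_t sketch_mac_to_nid(const uint8_t mac[SKETCH_ETH_ALEN])
{
	return ((uint32_t)mac[2] << 24) |
	       ((uint32_t)mac[3] << 16) |
	       ((uint32_t)mac[4] << 8)  |
	        (uint32_t)mac[5];
}
```

With a mapping like this, a static ARP entry per node is enough: the stack
resolves the peer's IP to its MAC from the pre-populated table, and the
transmit path pulls the NID back out of h_dest without any broadcast.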

I think it would be possible to emulate ARP in software using
point-to-point messages (send this packet to my 6 nearest neighbors,
neighbors send to their neighbors, etc.) but that would be quite a bit
more complicated compared to the static ARP table solution. Again, it
would also break compatibility with Cray's proprietary driver.

Please let me know if these issues are show-stoppers as far as inclusion
goes. We would like to get this open-source driver into the kernel so
we and others with Cray XT systems can start to benefit from it, and
continue to automatically track kernel API changes.

Kevin


From: Alan Cox
> Ethernet Frame: [6 bytes h_dest][6 bytes h_source][2 bytes h_proto][data...]
> SeaStar DG Message: [2 bytes length][1 byte MBZ][1 byte msg type ((2 << 5) for IP)][data...]
>
> I think it would be possible to re-factor it so that the Ethernet frame
> is encapsulated in its entirety within a seastar message, rather than
> the current scheme of jamming the critical info from the Ethernet header
> into the seastar datagram header. I will pursue that if you want... the
> drawback is that it would break compatibility with Cray's existing
> proprietary IP over SeaStar driver, making this driver pretty much
> useless for the kinds of things we and others would like to do (e.g.,
> leave service nodes booted with Cray's proprietary software stack and
> talking to compute nodes running this driver).

Perhaps it shouldn't be pretending to be an ethernet driver? That is
sort of the root cause of all the confusion, and of the fact that things
like the bridging layer will try to grab it. If it claimed to be a new
hardware type you'd take a brief hit on getting the new hw type into the
tools, but it would mean:

- tcpdump etc once coaxed would display seastar frames not fake ethernet
- the config tools would actually report what it really was
- non IP layers and userspace won't keep trying to do things you don't
want (what does it do right now with vlans I wonder 8))
- there will be no ARP confusion

If it wants to stay compatible and pretend to be ethernet, you probably
need a message type for "encapsulated ethernet". You can then encapsulate
anything that is not IP and stay compatible by keeping IP sent the way it
is now. That's if it wants to in the first place.
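
A rough C sketch of that second option, purely as an illustration: IPv4 keeps
the existing header-stripping path, and anything else is carried whole under
a separate message type. The function name, the value chosen for the
"encapsulated ethernet" type, and the little-endian length field are all
assumptions; only the 4-byte header shape and the (2 << 5) IP type come from
the layout quoted above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_ETH_HLEN         14u
#define SKETCH_ETH_P_IP         0x0800u    /* IPv4 EtherType */
#define SKETCH_SS_HLEN          4u
#define SKETCH_SS_TYPE_IP       (2u << 5)  /* from the quoted layout */
#define SKETCH_SS_TYPE_ETHENC   (3u << 5)  /* HYPOTHETICAL new type */

/*
 * Build a SeaStar datagram from an Ethernet frame.
 *  - IPv4: drop the Ethernet header, send as today (stays compatible).
 *  - Anything else (ARP, IPv6, ...): carry the whole frame unchanged.
 * Returns the number of bytes written to out, or 0 on error.
 */
static size_t sketch_ss_encap(const uint8_t *frame, size_t frame_len,
			      uint8_t *out, size_t out_cap)
{
	uint16_t proto;
	const uint8_t *body;
	size_t body_len;
	int is_ip;

	if (frame_len < SKETCH_ETH_HLEN)
		return 0;

	proto = (uint16_t)((frame[12] << 8) | frame[13]);
	is_ip = (proto == SKETCH_ETH_P_IP);

	body = is_ip ? frame + SKETCH_ETH_HLEN : frame;
	body_len = is_ip ? frame_len - SKETCH_ETH_HLEN : frame_len;

	if (body_len > 0xffffu || SKETCH_SS_HLEN + body_len > out_cap)
		return 0;

	out[0] = (uint8_t)(body_len & 0xff);   /* ASSUMPTION: little-endian */
	out[1] = (uint8_t)(body_len >> 8);
	out[2] = 0;                            /* MBZ */
	out[3] = (uint8_t)(is_ip ? SKETCH_SS_TYPE_IP : SKETCH_SS_TYPE_ETHENC);
	memcpy(out + SKETCH_SS_HLEN, body, body_len);
	return SKETCH_SS_HLEN + body_len;
}
```

A peer that only understands the IP message type would presumably drop or
ignore the new type, so this only helps between nodes that both run a driver
aware of it.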

Alan