From: Rahul on
I usually use "jumbo frames" on a computational cluster of machines since
all the connected adapters are large-MTU capable. This is a private VLAN
and has its own associated subnet.

But recently we wanted to have a border-server straddle the network. This
server has two adapters: the one in the private VLAN can do large MTUs,
while the public-internet adapter can only do normal MTUs.

But now we were planning on running NAT (via iptables and masquerade) on the
border-server. Is this a problem? I know that Jumbo-frames are problematic
unless all hardware end-to-end supports large MTU's. Unfortunately I don't
know how exactly NAT affects this. If an interior node tries to communicate
with the wider internet (via NAT) will it still use a large MTU and cause
problems?

Does NAT operate at the network layer and hence this will be a problem? Or
not? Are there ways of getting around this?

--
Rahul
From: Rick Jones on
Ignoring NAT for a moment, when a JF host tries to establish a TCP
connection to a non-JF host, 99 times out of 10, the MSS options in
the SYNchronize segments will mean that the JF host will actually use
a non-JF MSS for that connection. The real problem arises with UDP
communications - there is no MSS exchange there, so when a JF host
sends the 9Kish UDP datagram to the non-JF host it will hit a point
where the MTU goes non-JF and likely be dropped as a giant frame or
somesuch.
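
To put rough numbers on the MSS point, here is a minimal Python sketch.
It assumes IPv4 and TCP headers of 20 bytes each with no options, and
the function names are made up for illustration; a real stack also
accounts for header options and any cached path MTU.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
TCP_HEADER = 20   # bytes, assuming no TCP options


def advertised_mss(mtu):
    """MSS a host would typically put in its SYN for a given interface MTU."""
    return mtu - IPV4_HEADER - TCP_HEADER


def effective_mss(local_mtu, remote_mtu):
    """Each side sends no more than the smaller of the two advertised values."""
    return min(advertised_mss(local_mtu), advertised_mss(remote_mtu))


if __name__ == "__main__":
    print(advertised_mss(9000))        # jumbo-frame host advertises 8960
    print(advertised_mss(1500))        # standard host advertises 1460
    print(effective_mss(9000, 1500))   # the connection uses 1460
```

So the jumbo-frame host ends up sending standard-sized segments on that
connection, which is why TCP usually copes. Note that this only reflects
the two endpoints; a smaller MTU somewhere in the middle of the path is
not visible in the MSS exchange.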

Now, the above was for a single broadcast domain. If there is a
router between the JF and non-JF networks, the same TCP stuff applies,
the UDP datagram will be received by the router and then one of two
things happens when the router tries to forward it out the non-JF
interface. Either DF was *not* set in the IP header, in which case
the router will simply fragment the IP datagram carrying the UDP
datagram. If DF (don't fragment) is set in the IP header (I'm
assuming IPv4 here) then the router will drop the IP datagram and
may send back an ICMP Datagram Too Big message.
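
The two outcomes can be sketched in Python as below. This is a toy
model, not how any particular router is implemented: the function name
and return values are invented for illustration, and it assumes a
20-byte IPv4 header with no options. In IPv4 the ICMP message in
question is Destination Unreachable with code "fragmentation needed and
DF set".

```python
IPV4_HEADER = 20  # bytes, assuming no IP options


def forward(total_length, df_set, egress_mtu):
    """Decide what a router does with one IPv4 datagram (toy model)."""
    if total_length <= egress_mtu:
        return ("forward", [total_length])
    if df_set:
        # Drop, and report the egress MTU back to the sender via ICMP.
        return ("drop+icmp", egress_mtu)
    # Fragment: every fragment except the last carries a payload that is
    # a multiple of 8 bytes, and each fragment gets its own IP header.
    max_payload = (egress_mtu - IPV4_HEADER) // 8 * 8
    payload = total_length - IPV4_HEADER
    sizes = []
    while payload > 0:
        chunk = min(max_payload, payload)
        sizes.append(chunk + IPV4_HEADER)
        payload -= chunk
    return ("fragment", sizes)


if __name__ == "__main__":
    print(forward(9000, False, 1500))  # six 1500-byte fragments plus one of 120
    print(forward(9000, True, 1500))   # dropped, ICMP reports MTU 1500
```

A 9000-byte datagram without DF therefore leaves the non-JF interface
as seven fragments, and losing any one of them loses the whole datagram.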

rick jones
--
The computing industry isn't as much a game of "Follow The Leader" as
it is one of "Ring Around the Rosy" or perhaps "Duck Duck Goose."
- Rick Jones
these opinions are mine, all mine; HP might not want them anyway... :)
feel free to post, OR email to rick.jones2 in hp.com but NOT BOTH...
From: Rahul on
Rick Jones <rick.jones2(a)hp.com> wrote in news:hl2beh$3s0$1@usenet01.boi.hp.com:

Thanks Rick! That explanation is very helpful.

> Ignoring NAT for a moment, when a JF host tries to establish a TCP
> connection to a non-JF host, 99 times out of 10, the MSS options in
> the SYNchronize segments will mean that the JF host will actually use
> a non-JF MSS for that connection. The real problem arises with UDP
> communications - there is no MSS exchange there, so when a JF host
> sends the 9Kish UDP datagram to the non-JF host it will hit a point
> where the MTU goes non-JF and likely be dropped as a giant frame or
> somesuch.

Luckily all my UDP communication is on the private VLAN. I don't expect
any UDP to be NAT'ed. So I should be safe. All hosts on the private VLAN
are Jumbo-Frame compliant.

>
> Now, the above was for a single broadcast domain. If there is a
> router between the JF and non-JF networks, the same TCP stuff applies,

If I do NAT+masquerade via iptables then that is my "router", I assume?
Just making sure the iptables-NAT does not pack any nasty surprises as
opposed to a "physical" router.

> the UDP datagram will be received by the router and then one of two
> things happens when the router tries to forward it out the non-JF
> interface. Either DF was *not* set in the IP header, in which case
> the router will simply fragment the IP datagram carrying the UDP
> datagram. If DF (don't fragment) is set in the IP header (I'm
> assuming IPv4 here) then the router will drop the IP datagram and
> may send back an ICMP Datagram Too Big message.

And, out of curiosity, is there a way to tell routers (e.g. iptables-NAT
mode) that: "Even if DF was set for a UDP (or any other) datagram AND
Datagram is larger than a certain MTU; ignore the DF bit and please
fragment the datagram and send out". Or am I dabbling in fantasy here?
(or worse).

Disobeying the application layer's DF request seems the lesser evil than
dropping the datagram entirely because it was a JF. Or not?

[Of course, a worse option for the router is just to send a JF datagram
with the DF bit out into the larger world and then have some other non-JF
hardware drop it silently. Glad the router is smarter than that!]

--
Rahul
From: Rick Jones on
In comp.os.linux.networking Rahul <nospam(a)nospam.invalid> wrote:
> And, out of curiosity, is there a way to tell routers
> (e.g. iptables-NAT mode) that: "Even if DF was set for a UDP (or any
> other) datagram AND Datagram is larger than a certain MTU; ignore
> the DF bit and please fragment the datagram and send out". Or am I
> dabbling in fantasy here? (or worse).

I have heard that in the past various pieces of broken kit could be
configured to behave that way. I don't recall which but I would not
touch any of it with a 10 meter pole.

> Disobeying the application layer's DF request seems the lesser evil
> than dropping the datagram entirely because it was a JF. Or not?

NOT! Dropping the datagram, *and* sending the ICMP message about it
back to the source IP is the correct thing to do - it is how PathMTU
discovery works. It also happens to be what the specs for IPv4 say
should be done :)
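
For what it's worth, here is a small Python simulation of why the
drop-plus-ICMP behaviour matters. It is only a sketch under simplifying
assumptions: the path is a plain list of per-hop MTUs, every ICMP
message gets back to the sender, and the function name is invented for
illustration.

```python
def discover_path_mtu(local_mtu, hop_mtus):
    """Simulate a sender that sets DF and shrinks its packets on ICMP feedback.

    Returns (path_mtu, packets_sent_until_one_got_through).
    """
    pmtu = local_mtu
    attempts = 0
    while True:
        attempts += 1
        # The first hop whose MTU is too small drops the packet and
        # reports its own MTU back to the sender.
        blocker = next((mtu for mtu in hop_mtus if mtu < pmtu), None)
        if blocker is None:
            return pmtu, attempts
        pmtu = blocker  # the sender lowers its estimate and retries


if __name__ == "__main__":
    # Jumbo-frame host, then a standard Ethernet hop, then a 1492-byte link.
    print(discover_path_mtu(9000, [9000, 1500, 1492]))  # (1492, 3)
```

If a router fragmented despite DF, or dropped silently without the ICMP
message, the sender would never learn the smaller MTU and would keep
sending packets that are too big.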

rick jones
--
denial, anger, bargaining, depression, acceptance, rebirth...
where do you want to be today?