2008.02.19 NANOG 42 JumboFrames across the Internet

Matthew Petach mpetach at netflight.com
Wed Feb 20 02:08:45 UTC 2008


*whew*  Two more to go.  :)

Matt


2008.02.19
Tom Scholl, ATT labs, Jumbo packets on the internet

MTU   = maximum transmission unit
PMTUD = path MTU discovery

Every medium has its own 'standard' MTU.
Most have a large MTU of at least 4470.
Ethernet MTU is 1500, which makes the
larger sizes in the core mostly useless.

benefits of bigger packets==fewer packets
per second, fewer lookups, fewer interrupts.
on the host, does it really help?
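
To put rough numbers on "fewer packets per second", here is a
minimal sketch, not from the talk. It assumes a 10 Gb/s Ethernet
link kept full of maximum-size packets and counts 38 bytes of
per-frame overhead (14 header + 4 FCS + 8 preamble + 12
inter-frame gap). The function name and the link speed are
illustrative assumptions.

```python
# Assumed per-frame Ethernet overhead on the wire, in bytes:
# 14 (header) + 4 (FCS) + 8 (preamble/SFD) + 12 (inter-frame gap).
ETHERNET_WIRE_OVERHEAD = 38


def packets_per_second(link_bps: float, mtu_bytes: int) -> float:
    """Packets per second needed to fill a link with MTU-sized packets."""
    wire_bits = (mtu_bytes + ETHERNET_WIRE_OVERHEAD) * 8
    return link_bps / wire_bits


if __name__ == "__main__":
    link = 10e9  # hypothetical 10 Gb/s link
    for mtu in (1500, 9000):
        print(f"MTU {mtu}: {packets_per_second(link, mtu):,.0f} pps")
```

Under these assumptions the 9000-byte case needs a bit under one
sixth of the packets (and so lookups and interrupts) of the
1500-byte case.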

What about jumboframes?
what's a jumbo frame?  anything bigger than 1500;
modern gear can do larger than 1500; driven by
customers, not by a standard.

Path MTU discovery
fragmenting packets and reassembling is difficult
PMTUD detects lower MTU, sends message to host
asking it to readjust packet size.
PMTUD is very easy to break.
filtering ICMP kills it if not done right.
If MTUs are mismatched on a link, PMTUD can't
detect it.

path mtu mismatch can't be communicated back
if the far end of a link has an MTU too small,
as upstream never gets the message.
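
A toy model of the PMTUD behaviour described above, as a sketch
only: the sender starts at its first-hop MTU with DF set, and
each hop that cannot forward the packet is assumed to reply with
an ICMP "fragmentation needed" message carrying its own MTU. The
function name, the hop list, and the single icmp_filtered flag
are all illustrative assumptions, not a real API.

```python
from typing import List, Optional


def discover_path_mtu(hop_mtus: List[int],
                      icmp_filtered: bool = False) -> Optional[int]:
    """Return the discovered path MTU, or None if the flow blackholes.

    hop_mtus is the MTU of each link along the path, sender side
    first; it must not be empty.
    """
    size = hop_mtus[0]  # start at the sender's own link MTU
    while True:
        # First link along the path that is too small for this size.
        bottleneck = next((m for m in hop_mtus if m < size), None)
        if bottleneck is None:
            return size  # packet fits on every link
        if icmp_filtered:
            # The "too big" message never reaches the sender, so it
            # keeps sending oversized DF packets that get dropped.
            return None
        size = bottleneck  # sender lowers its packet size and retries


if __name__ == "__main__":
    path = [9000, 4470, 1500]
    print(discover_path_mtu(path))
    print(discover_path_mtu(path, icmp_filtered=True))
```

With ICMP passing, the sender converges on the smallest link MTU;
with ICMP filtered, it never learns and the large packets are
silently lost.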

Interprovider jumboframes work great on
point to point links; but what about point
to multipoint links?  Can't negotiate to a
given router, so has to be set across the
fabric/subnet.

how about adaptive arp protocol; use existing
arp to find MAC address, then use that to probe
for MTU on path.
good luck getting it implemented.
hacking it in BGP might work, but would
require everyone to run BGP.

some exchanges have different vlans for
jumboframe (NetNOD)
but we already have so many vlans!

what's a good target MTU?  How about 8192
plus lots of header room and encapsulation
room.

anything bigger than 1500 makes sense.

What about going all the way up to 65k?

how do you specify the MTU on an interface?
is it frame payload
frame payload plus headers
...plus FCS
...plus 802.1q
depends on vendors and cards you're running.
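
The ambiguity in the list above can be made concrete with
standard Ethernet field sizes (14-byte header, 4-byte FCS, 4-byte
802.1q tag). This sketch only does the arithmetic; which of the
numbers a given vendor or card expects in its config is not
something it can tell you. The function name is illustrative.

```python
ETH_HEADER = 14  # destination MAC + source MAC + EtherType
FCS = 4          # frame check sequence
DOT1Q_TAG = 4    # one 802.1q VLAN tag


def mtu_interpretations(payload_bytes: int) -> dict:
    """Sizes the same link could be labelled with, by counting convention."""
    return {
        "frame payload": payload_bytes,
        "payload + headers": payload_bytes + ETH_HEADER,
        "payload + headers + FCS": payload_bytes + ETH_HEADER + FCS,
        "payload + headers + FCS + 802.1q":
            payload_bytes + ETH_HEADER + FCS + DOT1Q_TAG,
    }


if __name__ == "__main__":
    for payload in (1500, 9000):
        for label, size in mtu_interpretations(payload).items():
            print(f"{payload:>5} {label:<34} {size}")
```

Two boxes both "set to 9000" can therefore disagree by up to 22
bytes about the largest payload they will actually pass.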

Not all vendors can do 9k
 older cisco gear
 older juniper PICs
enabling jumbo can be production impacting.

may be unrealistic to expect jumboframes to
reach all the way to home users any time
soon.

Actions required
have IEEE standardize on a new MTU value?
Need a negotiation to discover neighbor MTU
need less breakable replacement for PMTUd

http://darkwing.uoregon.edu/~joe/jumbo-clean-gear.html

Q: better PMTUd--could end host report back to sending
host that the biggest sized fragment I got back was Y,
don't send packets bigger than that please?
Sounds reasonable.

Q: Todd, Renesys.  Lots of challenges, and a few benefits;
does this mean we're stuck with 1500?  Will larger MTUs
actually happen?  Can we at least support 1500 across
the board?
larger would give more room for encapsulation as we add
more and more header foo into the mix.

Q: Danny notes that options aren't being calculated
correctly; is MD5 calculated before or after CRC,
etc.

Q: Kevin Oberman, ESnet, RE networks around the world
are generally running jumboframes, the joint engineering
task force, under department of defense came up with
recommended value of 9000 bytes, and the RnD community
has standardized on 9000 payload bytes to be usable from
end to end; intra-AS is your own business, but if you want
to talk to the

Q: Darrel, Calren, when you have a boundary, you have a
big cloud of 9000 byte MTU, your edge device will have
to handle the fragmentation; be aware of the performance
impacts that you may cause in having edge router have
to process and deal with or discard.
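
For a sense of the work an edge router takes on at such a
boundary, here is a sketch of IPv4 fragmentation arithmetic. It
assumes a 20-byte IP header with no options and DF not set;
fragment payloads other than the last must be a multiple of 8
bytes. The function name is illustrative.

```python
from typing import List


def ipv4_fragment_sizes(packet_len: int, next_hop_mtu: int,
                        header_len: int = 20) -> List[int]:
    """Total length of each IPv4 fragment (header included)."""
    if packet_len <= next_hop_mtu:
        return [packet_len]  # fits as-is, no fragmentation needed
    # Largest per-fragment payload that is a multiple of 8 bytes.
    per_fragment = (next_hop_mtu - header_len) // 8 * 8
    if per_fragment <= 0:
        raise ValueError("next-hop MTU too small to carry any payload")
    remaining = packet_len - header_len
    sizes = []
    while remaining > 0:
        chunk = min(per_fragment, remaining)
        sizes.append(chunk + header_len)
        remaining -= chunk
    return sizes


if __name__ == "__main__":
    frags = ipv4_fragment_sizes(9000, 1500)
    print(len(frags), frags)
```

Under these assumptions, each 9000-byte packet crossing into a
1500-byte network becomes seven packets that the edge device has
to build and the receiver has to reassemble.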

Q: Patrick--we're basically talking about distributing
the fragmentation processing challenge out towards the
edge; sounds like a reasonable idea to implement.

Q: David Sinn notes that he's dealt with issues at the
gigapop where setting the DF bit is actually nicer than
making the edge router try to handle the fragmentation.

What about using rate limiters?

Q: RAS, why are we doing it at all?  IPv6 doesn't
support it at all; why not just stop doing
fragmentation across the board--if you forget to
set DF, you just blackhole yourself.


OK, Break time.

Survey is linked off nanog.org, go fill it out!

In spite of late start first two days, we start
at 9am tomorrow--wake up extra early.


