IP tunnel MTU

Ray Soucy rps at maine.edu
Mon Oct 29 19:17:46 UTC 2012


Sorry, glanced at this and thought it was someone having problems with
tunnel MTU without adjusting TCP MSS.

Nice work, though my preference is to avoid tunnels at all costs :-)




On Mon, Oct 29, 2012 at 12:39 PM, Templin, Fred L
<Fred.L.Templin at boeing.com> wrote:
> Hi Ray,
>
> MSS rewriting has been well known and broadly applied for a long
> time now, but only applies to TCP. The subject of MSS rewriting
> comes up all the time in the IETF wg discussions, but has failed
> to reach consensus as a long-term alternative.
>
> Plus, MSS rewriting does no good for tunnels-within-tunnels. If
> the innermost tunnel rewrites MSS to a value that *it* thinks is
> safe there is no guarantee that the packets will fit within any
> outer tunnels that occur further down the line.
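
The nesting problem described above can be sketched with a little arithmetic. This is a minimal illustration, not anything from the thread: the overhead values (GRE over IPv4 for the inner tunnel, IP-in-IP for the outer one) and the function name are assumptions chosen for the example.

```python
# Hypothetical scenario: an inner GRE tunnel clamps the MSS, then the
# traffic enters an outer IP-in-IP tunnel the inner one knows nothing about.
LINK_MTU = 1500      # bytes, typical Ethernet MTU
IPV4_HEADER = 20     # bytes, no IP options
TCP_HEADER = 20      # bytes, no TCP options

def packet_size_on_wire(mss, encap_overheads):
    """Size of a full-sized TCP segment after every encapsulation layer."""
    return mss + TCP_HEADER + IPV4_HEADER + sum(encap_overheads)

inner_overhead = IPV4_HEADER + 4   # assumed: GRE over IPv4, 4-byte GRE header
outer_overhead = IPV4_HEADER       # assumed: IP-in-IP

# The inner tunnel picks an MSS that is safe for its own encapsulation only.
clamped_mss = LINK_MTU - inner_overhead - IPV4_HEADER - TCP_HEADER

inner_only = packet_size_on_wire(clamped_mss, [inner_overhead])
nested = packet_size_on_wire(clamped_mss, [inner_overhead, outer_overhead])

print(clamped_mss, inner_only, nested)
print("fits link MTU after nesting:", nested <= LINK_MTU)
```

Under these assumptions the clamped segment exactly fills the link MTU after the inner encapsulation, so any further encapsulation pushes it over.
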
>
> What I want to get to is an indefinite tunnel MTU; i.e., admit
> any packet into the tunnel regardless of its size then make any
> necessary adaptations from within the tunnel. That is exactly
> what SEAL does:
>
>  https://datatracker.ietf.org/doc/draft-templin-intarea-seal/
>
> Thanks - Fred
> fred.l.templin at boeing.com
>
>> -----Original Message-----
>> From: Ray Soucy [mailto:rps at maine.edu]
>> Sent: Monday, October 29, 2012 7:55 AM
>> To: Templin, Fred L
>> Cc: Dobbins, Roland; NANOG list
>> Subject: Re: IP tunnel MTU
>>
>> The core issue here is TCP MSS. PMTUD is a dynamic process that
>> effectively adjusts the segment size, but it depends on ICMP being
>> permitted along the path.  The realistic alternative, in a world that
>> filters all ICMP traffic, is to manually rewrite the MSS.  In IOS this
>> can be achieved via "ip tcp adjust-mss"; on Linux-based systems,
>> netfilter can be used to adjust the MSS.
>>
>> Keep in mind that the MSS will be smaller than your MTU.
>> Consider the following example:
>>
>>  ip mtu 1480
>>  ip tcp adjust-mss 1440
>>  tunnel mode ipip
>>
>> The outer IP header adds 20 bytes of overhead, leaving 1480 bytes for
>> data.  So for an IP-in-IP tunnel, you'd set the MTU of your tunnel
>> interface to 1480.  Subtract another 20 bytes for the tunneled IP
>> header and 20 bytes (typical) for your TCP header and you're left with
>> 1440 bytes for data in a TCP connection.  So in this case we write the
>> MSS as 1440.
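
The IP-in-IP arithmetic above can be written out as a short Python sketch. The function names are hypothetical and exist only for this illustration; the header sizes are the ones given in the message (20-byte IPv4 header, 20-byte TCP header, no options).

```python
# Illustrative helpers; these names do not come from any router OS or library.
IPV4_HEADER = 20   # bytes, no IP options
TCP_HEADER = 20    # bytes, no TCP options

def tunnel_mtu(link_mtu, encap_overhead):
    """MTU left for the tunneled packet after the encapsulation overhead."""
    return link_mtu - encap_overhead

def tcp_mss(mtu, ip_header=IPV4_HEADER, tcp_header=TCP_HEADER):
    """TCP payload that fits in one packet of the given MTU."""
    return mtu - ip_header - tcp_header

# IP-in-IP: the only overhead is the outer IPv4 header.
ipip_mtu = tunnel_mtu(1500, IPV4_HEADER)
ipip_mss = tcp_mss(ipip_mtu)
print(ipip_mtu, ipip_mss)  # 1480 1440
```

These are the same two values used in the "ip mtu" and "ip tcp adjust-mss" lines of the example configuration.
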
>>
>> I use IP-in-IP as an example because it's simple.  GRE tunnels can be
>> a little more complex.  While the GRE header is typically 4 bytes, it
>> can grow up to 16 bytes depending on options used.
>>
>> So for a typical GRE tunnel (4 byte header), you would subtract 20
>> bytes for the IP header and 4 bytes for the GRE header from your base
>> MTU of 1500.  This would mean an MTU of 1476, and a TCP MSS of 1436.
>>
>> Keep in mind that a TCP header can be up to 60 bytes in length, so you
>> may want to subtract more than the typical 20 bytes when calculating
>> your MSS if you're seeing problems.
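
The GRE case, including a larger GRE header or a larger TCP header, follows the same pattern. The sketch below follows the reasoning in the message; the function name and the range checks are assumptions added for illustration, with the 4 to 16 byte GRE range and the 20 to 60 byte TCP range taken from the text above.

```python
IPV4_HEADER = 20  # bytes, no IP options

def gre_mtu_and_mss(link_mtu=1500, gre_header=4, tcp_header=20):
    """Return (tunnel MTU, TCP MSS) for a GRE-over-IPv4 tunnel.

    gre_header: 4 bytes typically, up to 16 depending on options.
    tcp_header: 20 bytes without options, up to 60 with options.
    """
    if not 4 <= gre_header <= 16:
        raise ValueError("GRE header is expected to be 4..16 bytes")
    if not 20 <= tcp_header <= 60:
        raise ValueError("TCP header is 20..60 bytes")
    mtu = link_mtu - IPV4_HEADER - gre_header   # outer IP + GRE
    mss = mtu - IPV4_HEADER - tcp_header        # inner IP + TCP
    return mtu, mss

print(gre_mtu_and_mss())                  # typical case
print(gre_mtu_and_mss(gre_header=16))     # largest GRE header
print(gre_mtu_and_mss(tcp_header=60))     # largest TCP header
```

The typical case reproduces the 1476 / 1436 figures; the other two calls show how far the values drop in the worst cases mentioned.
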
>>
>>
>>
>>
>> On Tue, Oct 23, 2012 at 10:07 AM, Templin, Fred L
>> <Fred.L.Templin at boeing.com> wrote:
>> > Hi Roland,
>> >
>> >> -----Original Message-----
>> >> From: Dobbins, Roland [mailto:rdobbins at arbor.net]
>> >> Sent: Monday, October 22, 2012 6:49 PM
>> >> To: NANOG list
>> >> Subject: Re: IP tunnel MTU
>> >>
>> >>
>> >> On Oct 23, 2012, at 5:24 AM, Templin, Fred L wrote:
>> >>
>> >> > Since tunnels always reduce the effective MTU seen by data packets
>> >> > due to the encapsulation overhead, the only two ways to accommodate
>> >> > the tunnel MTU are either through the use of path MTU discovery or
>> >> > through fragmentation and reassembly.
>> >>
>> >> Actually, you can set your tunnel MTU manually.
>> >>
>> >> For example, the typical MTU folks set for a GRE tunnel is 1476.
>> >
>> > Yes; I was aware of this. But, what I want to get to is
>> > setting the tunnel MTU to infinity.
>> >
>> >> This isn't a new issue; it's been around ever since tunneling
>> >> technologies have been around, and tons have been written on this
>> >> topic.  Look at your various router/switch vendor Web sites, archives
>> >> of this list and others, etc.
>> >
>> > Sure. I've written a fair amount about it too over the span
>> > of the last ten years. What is new is that there is now a
>> > solution near at hand.
>> >
>> >> So, it's been known about, dealt with, and documented for a long time.
>> >> In terms of doing something about it, the answer there is a) to allow
>> >> the requisite ICMP for PMTU-D to work to/through any networks within
>> >> your span of administrative control and b)
>> >
>> > That does you no good if there is some other network further
>> > beyond your span of administrative control that does not allow
>> > the ICMP PTBs through. And, studies have shown this to be the
>> > case in a non-trivial number of instances.
>> >
>> >> b) adjusting your own tunnel MTUs to
>> >> appropriate values based upon experimentation.
>> >
>> > Adjust it down to what? 1280? Then, if your tunnel with the
>> > adjusted MTU enters another tunnel with its own adjusted MTU
>> > there is an MTU underflow that might not get reported if the
>> > ICMP PTB messages are lost. An alternative is to use IP
>> > fragmentation, but recent studies have shown that more and
>> > more operators are unconditionally dropping IPv6 fragments
>> > and IPv4 fragmentation is not an option due to wrapping IDs
>> > at high data rates.
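
The point about wrapping IDs can be made concrete with a rough calculation. The IPv4 Identification field is 16 bits wide, so a sender has 65536 values before it must reuse one. The sketch below is a simplification that assumes a single ID counter, a constant line rate, and fixed-size packets; the function name is hypothetical.

```python
ID_SPACE = 2 ** 16  # the IPv4 Identification field is 16 bits wide

def seconds_until_id_wrap(bits_per_second, packet_bytes):
    """Time for a sender to cycle through every IPv4 ID value once.

    Assumes one ID counter, a constant rate, and fixed-size packets.
    """
    packets_per_second = bits_per_second / (packet_bytes * 8)
    return ID_SPACE / packets_per_second

wrap_1g = seconds_until_id_wrap(1e9, 1500)    # 1 Gb/s, 1500-byte packets
wrap_10g = seconds_until_id_wrap(10e9, 1500)  # 10 Gb/s, 1500-byte packets
print(f"{wrap_1g:.3f} s at 1 Gb/s, {wrap_10g:.4f} s at 10 Gb/s")
```

Under these assumptions the ID space is exhausted in well under a second, which is much shorter than the time a receiver may hold fragments for reassembly, so fragments from different packets can end up sharing an ID.
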
>> >
>> > Nested tunnels-within-tunnels occur in operational scenarios
>> > more and more, and adjusting the MTU for only one tunnel in
>> > the nesting does you no good if there are other tunnels that
>> > adjust their own MTUs.
>> >
>> >> Enterprise endpoint networks are notorious for blocking *all* ICMP (as
>> >> well as TCP/53 DNS) at their edges due to 'security' misinformation
>> >> propagated by Confused Information Systems Security Professionals and
>> >> their ilk.  Be sure that your own network policies aren't part of the
>> >> problem affecting your userbase, as well as anyone else with a need to
>> >> communicate with properties on your network via tunnels.
>> >
>> > Again, all an operator can control is that which is within their
>> > own administrative domain. That does no good for ICMPs that are
>> > lost beyond their administrative domain.
>> >
>> > Thanks - Fred
>> > fred.l.templin at boeing.com
>> >
>> >> -----------------------------------------------------------------------
>> >> Roland Dobbins <rdobbins at arbor.net> // <http://www.arbornetworks.com>
>> >>
>> >>         Luck is the residue of opportunity and design.
>> >>
>> >>                      -- John Milton
>> >>
>> >
>> >
>>
>>
>>
>> --
>> Ray Patrick Soucy
>> Network Engineer
>> University of Maine System
>>
>> T: 207-561-3526
>> F: 207-561-3531
>>
>> MaineREN, Maine's Research and Education Network
>> www.maineren.net



-- 
Ray Patrick Soucy
Network Engineer
University of Maine System

T: 207-561-3526
F: 207-561-3531

MaineREN, Maine's Research and Education Network
www.maineren.net



