IPv6 day and tunnels

Owen DeLong owen at delong.com
Wed Jun 6 04:44:59 UTC 2012


On Jun 5, 2012, at 6:02 PM, Jimmy Hess wrote:

> On 6/5/12, Owen DeLong <owen at delong.com> wrote:
>> This is a horrible misconfiguration of the devices on that link.
>> If your MTU setting on your interface is larger than the smallest MTU
>> of any L2 forwarder on the link, then, you have badly misconfigured
> 
> Not really; the network layer and L2 protocols should both be
> designed to handle this; it is a design error in the protocol that it
> doesn't. You say it's "misconfiguration", but if IP handled the
> situation reasonably, it shouldn't be necessary to configure anything
> in the first place. Whether the neighbors are LAN or cross-tunnel,
> the issues are similar.
> 

Really, no. The L3 MTU on an interface should be configured to the
lowest MTU reachable via that link without crossing a router. It's
just that simple. Anything else _IS_ a misconfiguration.

First, your idea of handling the situation reasonably is a layering
violation.

Second, you are correct. All L2 bridges for a given media type
should support the largest configurable MTU for that media
type, so, it is arguably a design flaw in the bridges. However,
in an environment where you have broken L2 devices (design
flaw), you have to configure appropriately for that.

> It's only a misconfiguration because of flaws in the protocol.

No, it's a misconfiguration because of the limitations of the
hardware due to its design defects. L3 should not need
to test the end-to-end L2 capabilities. It should be able to
depend on what the OS tells it.

> Just like you expect to plug devices in a typical LAN and it's not a
> configuration error to fail to manually find every switch in the LAN
> and enter MAC addresses into a forwarding table by hand; likewise,
> you shouldn't expect to key an MTU into every device by hand.

You don't expect to ever care about the MAC addresses of any of
the switches in the LAN let alone enter them into any form of forwarding
table at all.

You do expect to need to know about the MAC addresses of adjacent
systems you are trying to reach, and, you use either ND or ARP to
map L3 addresses onto their corresponding L2 addresses as needed.

I will note that this depends on sending a packet out to an address that
reaches all of the candidate hosts (in the case of ND, this is a multicast
to the solicited-node group, i.e. all hosts whose addresses share the same
low-order 24 bits; in the case of ARP, this is a broadcast packet) and
expecting them (at L3) to answer "That's ME!". Of course you can enter
them by hand in situations where ARP or ND don't work for whatever reason.
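
To make the ND case concrete, here is a minimal Python sketch of how the
solicited-node multicast group follows from the low-order 24 bits of a
unicast address, and how that group maps onto an Ethernet multicast MAC.
The function names are made up for illustration; the ff02::1:ffXX:XXXX
and 33:33 prefixes are the standard ones for IPv6 over Ethernet.

```python
import ipaddress


def solicited_node_multicast(addr: str) -> ipaddress.IPv6Address:
    """Return the solicited-node group (ff02::1:ffXX:XXXX) for a unicast address."""
    # Keep only the low-order 24 bits of the target address.
    low24 = int(ipaddress.IPv6Address(addr)) & 0xFFFFFF
    # Append them to the fixed 104-bit prefix ff02:0:0:0:0:1:ff00::/104.
    base = int(ipaddress.IPv6Address("ff02::1:ff00:0"))
    return ipaddress.IPv6Address(base | low24)


def multicast_mac(group: ipaddress.IPv6Address) -> str:
    """Map an IPv6 multicast group to its Ethernet MAC: 33:33 + low 32 bits."""
    low32 = int(group) & 0xFFFFFFFF
    octets = [0x33, 0x33] + list(low32.to_bytes(4, "big"))
    return ":".join(f"{o:02x}" for o in octets)


if __name__ == "__main__":
    group = solicited_node_multicast("2001:db8::1:800:200e:8c6c")
    print(group, multicast_mac(group))
```

Any two hosts whose addresses end in the same 24 bits listen on the same
group, which is why a neighbor solicitation reaches "all of the candidate
hosts" without being a full broadcast.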

You expect ARP or ND to work and a bridge that didn't forward ARP
would be just as broken as a bridge which doesn't support the full
interface MTU.

I would expect to have to enter MAC adjacencies manually if I had
a bridge that didn't pass ARP/ND traffic, just as I expect to have to
enter the MTU manually if I have a bridge that doesn't support the
correct full MTU of the network.

> IP should be designed so that devices on the link that _can_ handle
> the large transmission unit, which provides efficiency gains, should
> be allowed to fully utilize those capabilities, without breakage of
> connectivity to devices on the same link that have more limited
> capabilities and can only receive the minimum required frame size
> (smaller MTU), and without separating the subnet or installing
> dividing Proxy ARP servers to send ICMP TooBig packets.

No, it really shouldn't. For one, doing this is a serious layering
violation, and, for another, it can't be achieved efficiently. It adds
lots of overhead and is very error prone. There's no signaling mechanism
for L3 to be informed when the L2 topology changes, for example, which
might necessitate a recalculation of the MTU.

A given link should have a single MTU, period. I don't know of ANY L3
protocol which supports anything else. Not IP, not IPX, not DECnet,
not AppleTalk, not Banyan VINES, not XNS; none of them support the
idea of MTU per adjacency.

If you can only have one MTU per link, then, it must be the lowest common
denominator of all participants and forwarders on that link.
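
As a rough sketch of that rule in Python: the usable link MTU is the
minimum over every host and every L2 forwarder on the link, and any host
configured above it is misconfigured. The device names and the
dictionary layout are hypothetical, and all values are assumed to be
expressed as the L3 payload size each device can carry.

```python
def link_mtu(host_mtus: dict, forwarder_mtus: dict) -> int:
    """Largest L3 MTU that every host and every L2 forwarder can carry."""
    all_mtus = list(host_mtus.values()) + list(forwarder_mtus.values())
    if not all_mtus:
        raise ValueError("no devices on the link")
    return min(all_mtus)


def misconfigured_hosts(host_mtus: dict, forwarder_mtus: dict) -> list:
    """Names of hosts whose interface MTU exceeds what the link supports."""
    limit = link_mtu(host_mtus, forwarder_mtus)
    return sorted(name for name, mtu in host_mtus.items() if mtu > limit)


if __name__ == "__main__":
    # Hypothetical link: one jumbo-capable host behind a 1500-byte bridge.
    hosts = {"host-a": 9000, "host-b": 1500}
    forwarders = {"bridge-1": 9216, "bridge-2": 1500}
    print(link_mtu(hosts, forwarders))
    print(misconfigured_hosts(hosts, forwarders))
```

Note that the input here is configuration knowledge about the link, not
something L3 discovers by probing.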

>> Adding probing to compensate for this misconfiguration merely
>> serves to perpetuate such errant configurations.
> 
> Just like adding MAC address learning to Ethernet switches to
> compensate for the misconfiguration of failing to manually enter
> hardware addresses into your switches, serves to perpetuate such errant
> configurations, where the state of the forwarding tables
> is unreliably left in a non-deterministic state.

Apples and oranges. See above.

In fact, MAC address learning on the switches is utterly unrelated to the
MAC adjacency table maintained by ARP/ND.

One is an L2 forwarding tree never learned by anything at L3 (the MAC
forwarding table learned on the switches) and the other is a MAC adjacency
table for a given link used by the L2 software on the host to populate the
L2 packet header based on the L3 information.
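
To illustrate how separate the two tables are, here is a toy Python
sketch. Both classes are simplified illustrations with made-up names,
not a model of any real switch or host stack: the switch table is keyed
by MAC and never sees an L3 address, while the host's neighbor table is
keyed by L3 address and knows nothing about switch ports.

```python
class LearningSwitch:
    """Toy L2 bridge: learns MAC -> port from the source of each frame."""

    def __init__(self, ports):
        self.ports = set(ports)
        self.fdb = {}  # forwarding table: MAC address -> port

    def receive(self, in_port, src_mac, dst_mac):
        """Learn the source, then return the list of ports to forward to."""
        self.fdb[src_mac] = in_port
        out_port = self.fdb.get(dst_mac)
        if out_port is None:
            # Unknown (or broadcast) destination: flood to all other ports.
            return sorted(self.ports - {in_port})
        # Known destination: forward, unless it sits on the ingress port.
        return [] if out_port == in_port else [out_port]


class NeighborCache:
    """Toy host-side adjacency table filled by ARP/ND: L3 address -> MAC."""

    def __init__(self):
        self.entries = {}

    def learn(self, l3_addr, mac):
        self.entries[l3_addr] = mac

    def resolve(self, l3_addr):
        # None means "not resolved yet": the host would send ARP/ND here.
        return self.entries.get(l3_addr)
```

The switch learns passively from traffic passing through it; the host
learns by asking a question at L3 and getting an answer.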

>>> You've got an issue if there are 100ms between two peers on your LAN.
>>> You're right, you don't need to probe for possible MTUs below 1280.
>> LAN, sure. However, consider that there are intercontinental L2 links.
> 
> Intercontinental multi-access L2 links, perhaps, are a horrible
> misconfiguration.
> 

No, they are not. They may be a horribly bad idea in many cases, but, there
are actually legitimate applications for them and they conform to the existing
documented standards.

Owen




