[NANOG] Microsoft.com PMTUD black hole?

Bjørn Mork bjorn at mork.no
Wed May 7 08:10:35 UTC 2008


Iljitsch van Beijnum <iljitsch at muada.com> writes:

> Many years ago I had occasion to terminate dial-up service over L2TP  
> from modem pools operated by a service provider who shall remain  
> nameless to protect the guilty. This service had the unfortunate  
> tendency to drop all packets larger than 576 bytes. So we needed to  
> negotiate a 576-byte MTU over PPP.
>
> We then got many complaints from users who dialed in using ISDN  
> routers (yes this was a while ago) because of broken path MTU  
> discovery. The behavior that Microsoft exhibits was EXTREMELY common  
> in those days, and I have no reason to assume it's any less common  
> today. (I also see it regularly with IPv6.) What I did was clear the  
> DF bit on packets going out to the L2TP virtual interfaces so the  
> packets could be fragmented.

Right.  I once stumbled across a SOHO-router doing just that.  I never
understood why, but now you've given at least one explanation of how it
could appear to be a good idea.

I can also provide the reason why we found it to be an extremely bad
idea at the time: Some (most? all?) systems won't set both the DF flag
and the identification field at the same time.  With DF set the id is
not needed for reassembly, so it may be left at zero or otherwise
non-unique.  If you clear the DF flag without changing the
identification field, you might end up with fragments of different
packets carrying the same id, and those are impossible to reassemble
correctly.  That was how I stumbled across the DF-clearing SOHO-router
in the first place.  The random problems it generated were extremely
difficult to debug, and when we started we truly believed that we had a
problem with a layer 4 load balancing switch.
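
To illustrate the failure mode, here is a minimal sketch of a naive
reassembler keyed on (source, destination, protocol, id), as RFC 791
describes.  Everything in it is made up for illustration: the function
names, the addresses, the 8-byte fragment size, and the rule that a
later fragment at the same offset overwrites an earlier one (real
stacks differ on overlap handling).  Offsets are plain byte counts
rather than 8-byte units.

```python
from collections import defaultdict


def fragment(src, dst, proto, ident, payload, frag_size):
    """Split a payload into fragments that all carry the same reassembly key."""
    frags = []
    for off in range(0, len(payload), frag_size):
        frags.append({
            "key": (src, dst, proto, ident),
            "off": off,
            "mf": off + frag_size < len(payload),  # "more fragments" flag
            "data": payload[off:off + frag_size],
        })
    return frags


def _try_complete(buf):
    """Return the datagram if fragments are contiguous from 0 to the last one."""
    data, off = b"", 0
    while off in buf:
        frag = buf[off]
        data += frag["data"]
        if not frag["mf"]:
            return data
        off += len(frag["data"])
    return None


def reassemble(fragments):
    """Naive receiver: one buffer per (src, dst, proto, id) key."""
    buffers = defaultdict(dict)
    completed = []
    for frag in fragments:
        buf = buffers[frag["key"]]
        buf[frag["off"]] = frag  # assumption: later fragment overwrites
        data = _try_complete(buf)
        if data is not None:
            completed.append(data)
            del buffers[frag["key"]]
    return completed


def demo(id_a, id_b):
    """Two packets between the same hosts whose fragments arrive interleaved."""
    a = fragment("10.0.0.1", "10.0.0.2", 6, id_a, b"A" * 8 + b"a" * 8, 8)
    b = fragment("10.0.0.1", "10.0.0.2", 6, id_b, b"B" * 8 + b"b" * 8, 8)
    return reassemble([a[0], b[0], a[1], b[1]])


if __name__ == "__main__":
    print(demo(0, 0))  # both packets left at id 0, DF cleared in transit
    print(demo(1, 2))  # distinct ids
```

With both ids at zero the receiver glues the first half of one packet
to the second half of the other and never completes the remaining
fragment.  With distinct ids both packets come out intact.  In practice
a layer 4 checksum will probably reject the mixed-up datagram, so what
you observe is seemingly random loss.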

Note: There are solutions that will both clear the DF flag and generate
a new id.  E.g. http://www.openbsd.org/faq/pf/scrub.html 

This is the proper way to clear DF, if you must.  Never just clear it.
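
As a rough sketch of what "clear DF and generate a new id" means at the
header level (this is not pf's implementation, and the function names
are my own), the rewrite has to touch three IPv4 header fields: the id,
the flags, and the header checksum.  It assumes a well-formed IPv4
packet as input.

```python
import secrets
import struct

DF_BIT = 0x4000  # "don't fragment" bit in the flags/fragment-offset word


def ip_checksum(header: bytes) -> int:
    """Ones'-complement sum over 16-bit words of the IPv4 header."""
    total = sum(struct.unpack("!%dH" % (len(header) // 2), header))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def clear_df_with_new_id(packet: bytes, new_id=None) -> bytes:
    """Clear DF, replace the id, and fix the header checksum.

    Packets without DF set are returned unchanged.
    """
    ihl = (packet[0] & 0x0F) * 4  # header length in bytes
    hdr = bytearray(packet[:ihl])
    flags_frag = struct.unpack_from("!H", hdr, 6)[0]
    if not flags_frag & DF_BIT:
        return packet
    if new_id is None:
        new_id = secrets.randbelow(0x10000)  # random 16-bit id
    struct.pack_into("!H", hdr, 4, new_id)
    struct.pack_into("!H", hdr, 6, flags_frag & ~DF_BIT & 0xFFFF)
    struct.pack_into("!H", hdr, 10, 0)  # zero checksum before recomputing
    struct.pack_into("!H", hdr, 10, ip_checksum(bytes(hdr)))
    return bytes(hdr) + packet[ihl:]
```

The important part is that the id is replaced in the same step that
clears DF, so any fragments created further down the path carry an id
that is at least unlikely to collide with other packets in flight.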



Bjørn



