Did your BGP crash today?
cjeker at diehard.n-r-g.com
Sat Aug 28 15:26:44 UTC 2010
On Sat, Aug 28, 2010 at 02:51:17PM +0200, Thomas Mangin wrote:
> We had ASN4, AS-PATH and this one. More or less we hit this session
> reset problem once a year but nothing was done yet to change the RFC.
You are mixing up three totally different problems. Sure the result was the
same (session drops). This time an IOS XR device was corrupting an
attribute before sending it out. The corruption had to be in the header
section of the attribute or the other side would not have detected it
(since the neighbor did not know about this attribute either). Now if a
system sends out corrupted BGP messages there is no way out, you need to
close the session because not doing so may result in much bigger mayhem.
It was not mentioned what the corruption actually was: was the length
wrong, or was the optional flag missing (making the attribute well-known)?
Unlike in the ASN4 issue this time the session to the faulty system was
dropped and by doing so stopped further issues.
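
To make the header point concrete, here is a rough Python sketch of how a
receiver that does not know the attribute might react. It assumes the
RFC 4271 path attribute layout (flags octet, type octet, 1- or 2-octet
length). The names parse_attr_header, classify and KNOWN_TYPES are made up
for illustration and are not taken from any real implementation:

```python
import struct

# Flag bits of the attribute flags octet (RFC 4271, section 4.3).
OPTIONAL, TRANSITIVE, PARTIAL, EXT_LEN = 0x80, 0x40, 0x20, 0x10

# Assumption: a hypothetical speaker that only knows type codes 1..7
# (ORIGIN through AGGREGATOR).
KNOWN_TYPES = {1, 2, 3, 4, 5, 6, 7}

def parse_attr_header(buf):
    """Return (flags, type_code, value_len, header_len) or raise ValueError."""
    if len(buf) < 3:
        raise ValueError("truncated attribute header")
    flags, type_code = buf[0], buf[1]
    if flags & EXT_LEN:
        if len(buf) < 4:
            raise ValueError("truncated extended length")
        (value_len,) = struct.unpack("!H", buf[2:4])
        header_len = 4
    else:
        value_len, header_len = buf[2], 3
    if header_len + value_len > len(buf):
        raise ValueError("attribute length overruns the message")
    return flags, type_code, value_len, header_len

def classify(buf):
    """Decide what to do with one attribute, looking at the header only."""
    try:
        flags, type_code, _, _ = parse_attr_header(buf)
    except ValueError:
        return "reset: attribute length error"
    if type_code in KNOWN_TYPES:
        return "known: validate the value"
    if not flags & OPTIONAL:
        # Unknown type without the optional bit must be treated as well-known.
        return "reset: unrecognized well-known attribute"
    if flags & TRANSITIVE:
        return "unknown optional transitive: pass along"
    return "unknown optional non-transitive: ignore"
```

With an unknown type code 99, classify(bytes([0xC0, 99, 2, 0, 1])) passes
the attribute along, while the same bytes with the optional bit cleared
(0x40) or with a length of 200 both end in a session reset. Corruption in
the value of such an attribute would not be noticed at all.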
> So I am to blame as much as every network engineer to not have pushed
> for a change or at least a comprehensive explanation of why the session
> teardown behaviour is like it is and should not be changed.
> It is only our fault for not having dealt with the problem the first
> time correctly, and will be next time if nothing is changed once more.
> I agree a correctly framed invalid packet should be discarded without
> tearing the session down.
Great, corrupting your RIB and FIB and every one of your peers' RIBs. Thanks a
lot for routing loops and wrong announcements. The only things you can drop
without causing trouble are (transitive) optional attributes. This is
covered by draft-ietf-idr-optional-transitive and hopefully it will be
adopted as an RFC and implemented by vendors.
If a well-known attribute like AS-PATH is corrupted then there is no
choice, the session needs to be reset. Which is bad when the AS-PATH
validation code has a bug.
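
The rule argued for above can be written down as a tiny policy function.
This is only a sketch of the argument in this mail, not the wording of
draft-ietf-idr-optional-transitive. The Action enum and on_malformed_value
are hypothetical names:

```python
from enum import Enum

OPTIONAL = 0x80  # high bit of the attribute flags octet (RFC 4271)

class Action(Enum):
    DROP_ATTRIBUTE = "drop the attribute, keep the session"
    RESET_SESSION = "send NOTIFICATION, reset the session"

def on_malformed_value(flags: int) -> Action:
    """Pick a reaction for an attribute whose value failed validation."""
    if flags & OPTIONAL:
        # Optional attribute: routing still works without it.
        return Action.DROP_ATTRIBUTE
    # Well-known attribute (e.g. AS-PATH): the route cannot be trusted.
    return Action.RESET_SESSION
```

So on_malformed_value(0xC0) drops the attribute, while on_malformed_value(0x40),
the flags of a well-known transitive attribute such as AS-PATH, resets the
session.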