Did your BGP crash today?
rbf+nanog at panix.com
Sun Aug 29 14:11:03 UTC 2010
On Sun, Aug 29, 2010 at 12:30:21AM -0700, Paul Ferguson wrote:
> It would seem to me that there should actually be a better option, e.g.
> recognizing the malformed update, and simply discarding it (and sending the
> originator an error message) instead of resetting the session.
> Resetting of BGP sessions should only be done in the most dire of
> circumstances, to avoid a widespread instability incident.
The only thing you know for sure when you receive a malformed update
is that the router on the other end of the connection is broken (or
that there's something in between the other router and you that is
corrupting messages, but for the purposes of this, that's essentially
the same thing).
Accepting information received from a router known to be broken, and
then passing that on to other routers, is a bad idea and something that
could lead to a widespread instability incident. Of course, in theory,
you discard the bad updates and only pass on the good updates, but
doing that relies on the assumption that the known-to-be-broken router
on the other end of the connection is broken in such a way that ensures
that all the corrupted messages it sends will be recognizable as
malformed and can be discarded. There's plenty of corruption that
can't be detected on the receiving end.
On top of that, there are problems with being out of sync with the router
on the other end. For example, suppose a router developed a condition
that caused it to malform all withdraw messages (or, more precisely,
all UPDATE messages where the withdrawn routes length field is
non-zero). If we implement what you suggest above, then we'll accept
all the advertisements from that router, but ignore all the withdraws,
and end up sending that router a bunch of traffic that it won't
actually be able to handle.
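
To make the out-of-sync case concrete, here is a rough sketch in Python.
It is a toy model, not a BGP implementation: the apply_updates function,
the dict layout, and the "malformed" flag are all made up for
illustration, and the prefixes are documentation ranges. It only
compares what the local table looks like under "discard and continue"
versus "reset the session" when every withdraw from the peer arrives
malformed.

```python
# Toy model of one peer's updates; all names here are hypothetical.
def apply_updates(updates, discard_malformed):
    """Return (routes, session_up) after processing updates from one peer.

    Each update is a dict with 'announce' and 'withdraw' prefix lists and
    a 'malformed' flag standing in for whatever parse check failed.
    """
    routes = set()
    for update in updates:
        if update["malformed"]:
            if discard_malformed:
                continue  # drop this update, keep the session up
            # Session reset: flush everything learned from this peer.
            return set(), False
        routes.update(update["announce"])
        routes.difference_update(update["withdraw"])
    return routes, True


# A peer whose bug malforms every UPDATE that carries withdrawn routes.
updates = [
    {"announce": ["192.0.2.0/24", "198.51.100.0/24"],
     "withdraw": [], "malformed": False},
    {"announce": [], "withdraw": ["192.0.2.0/24"], "malformed": True},
]

# Discard policy: the withdraw is lost, so the stale route stays installed.
stale_routes, stale_up = apply_updates(updates, discard_malformed=True)

# Reset policy: the session drops and nothing from the peer is kept.
reset_routes, reset_up = apply_updates(updates, discard_malformed=False)
```

Under the discard policy, 192.0.2.0/24 is still in the table even though
the peer tried to withdraw it, which is exactly the traffic that ends up
going somewhere it can't be handled. Under the reset policy the table
is empty, which is disruptive but at least not wrong.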