Persistent BGP peer flapping - do you care?

Dickson, Brian brian.dickson at velocita.com
Fri Jan 18 00:30:00 UTC 2002


Here's my two cents...

A good rule of thumb (Postel's robustness principle, from RFC 793) is, be
liberal in what you accept and strict in what you send.

When applied to BGP, I would suggest that any implementation should choose a
canonical form for constructing updates, but use a parser that allows for
rule-bending without rule-breaking.

On the issue of existing vendor implementations, and how to build the specs
to prevent meltdowns:

I would suspect that during implementation, brand C routers were the victims
during testing, and perhaps the change was made to avoid that happening.

The current state of affairs is very much like the classical game-theory
"prisoner's dilemma".

The new spec should have two goals - discourage any implementation which can
lead to meltdowns, and encourage strict adherence to the spec. The latter
can be achieved via the former, in fact, if the mechanisms are well chosen.

My suggestion would be, rather than backing off between BGP session resets,
to first attempt strict interpretation (to insulate against completely
insane routers), and then fall back to loose interpretation. The model is
"Fool me once, shame on you, fool me twice, shame on me."

On first receiving a bad update, reset. If upon re-establishing the session,
the same bad update is heard, drop the bad update but keep the session up
(along with the messages back, etc.)
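
To make the two-step behaviour concrete, here is a minimal sketch, not any
vendor's implementation. The names (PeerPolicy, on_bad_update, Action) are
hypothetical, and recognising "the same bad update" by comparing raw bytes
is my assumption for illustration.

```python
from enum import Enum


class Action(Enum):
    RESET_SESSION = "reset"   # strict interpretation: tear the session down
    DROP_UPDATE = "drop"      # loose interpretation: discard, keep session up


class PeerPolicy:
    """Per-peer 'fool me once' handling of malformed UPDATE messages.

    Hypothetical sketch: a bad update is keyed by its raw bytes, which is
    an assumed way of recognising a repeat after the session comes back.
    """

    def __init__(self):
        # Bad updates that have already cost one session reset with this peer.
        self.seen_bad = set()

    def on_bad_update(self, raw_update: bytes) -> Action:
        if raw_update in self.seen_bad:
            # Second time around: drop the update but keep the session up.
            return Action.DROP_UPDATE
        # First time: remember it, then reset the session.
        self.seen_bad.add(raw_update)
        return Action.RESET_SESSION
```

The state has to outlive the session reset, so it would hang off the
configured peer rather than the TCP connection.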

One additional optional behaviour I would suggest - look at the AS path
and/or path length and/or announcing router IP address. If heard from the
originator, drop the session (and either keep it down, or try one more time
before requiring operator intervention); it may be the case that only these
conditions strictly require a reset, and that all other situations may only
require the "ignore bad routes" behaviour.
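
One possible way to express the originator check, as a sketch under stated
assumptions: the function names are hypothetical, the AS path is taken to be
a flat list of AS numbers, and "heard from the originator" is approximated
as "every AS in the path is the peer's own AS" (which also covers simple
prepending). A real implementation might use path length or the announcing
router's IP address instead.

```python
def heard_from_originator(as_path, peer_as):
    """Assumed heuristic: the path is non-empty and contains only the
    peer's own AS, so the peer itself originated the route."""
    return len(as_path) > 0 and all(asn == peer_as for asn in as_path)


def action_for_bad_update(as_path, peer_as):
    """Reset only when the bad update comes straight from its originator;
    otherwise just ignore the bad route and keep the session up."""
    if heard_from_originator(as_path, peer_as):
        return "reset"
    return "ignore"
```

For example, a bad update from peer AS 65001 with path [65001] would give
"reset", while the same update with path [65001, 65002] would give "ignore".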

Resetting BGP more than a small, finite number of times is, IMHO, a bad
idea. After all, BGP is a stateful protocol, and state changes should be
triggered deterministically, even if that requires operator input.
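
A small sketch of what "a small, finite number of times" could look like.
The class name, the default limit of 2, and the operator_clear hook are all
hypothetical choices for illustration, not taken from any spec.

```python
class ResetBudget:
    """Allow at most max_resets automatic session resets for a peer.

    Once the budget is spent, the session stays down until an operator
    explicitly clears it (hypothetical policy sketch).
    """

    def __init__(self, max_resets=2):
        self.max_resets = max_resets
        self.resets = 0

    def may_reset(self):
        # True while automatic resets are still permitted.
        return self.resets < self.max_resets

    def record_reset(self):
        if not self.may_reset():
            raise RuntimeError(
                "reset budget exhausted; operator intervention required"
            )
        self.resets += 1

    def operator_clear(self):
        # Explicit operator input is the only way to restore the budget.
        self.resets = 0
```

The point of the explicit operator_clear step is that every state change
after the budget runs out is triggered by a person, not by a timer.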

Brian Dickson
Velocita


