Persistent BGP peer flapping - do you care?

Susan Hares skh at
Sat Jan 19 02:35:35 UTC 2002


Thank you for your two cents.  I'm gathering all the input until
Sunday night.  I really appreciate your comments.  I'll summarize
all the input to the list at that time, and suggest some ideas.

I'll try to boil all the input on this problem into a document that
I can post to IDR and NANOG.


PS - I'm away from email from now until Monday am. Thanks nanog folks!!

At 07:30 PM 1/17/2002 -0500, Dickson, Brian wrote:

>Here's my two cents...
>A good rule of thumb (the robustness principle from RFC 793) is: be liberal
>in what you accept and strict in what you send.
>When applied to BGP, I would suggest that any implementation should choose a
>canonical form for constructing updates, but use a parser that allows for
>rule-bending without rule-breaking.
>On the issue of existing vendor implementations, and how to build the specs
>to prevent meltdowns:
>I would suspect that during implementation, brand C routers were the victims
>during testing, and perhaps the change was made to avoid that happening.
>The current state of affairs is very much like the classical game-theory
>"prisoner's dilemma".
>The new spec should have two goals - discourage any implementation which can
>lead to meltdowns, and encourage strict adherence to the spec. The latter
>can be achieved via the former, in fact, if the mechanisms are well chosen.
>My suggestion would be, rather than a back-off scheme for resetting BGP
>sessions, to first attempt strict interpretation (to insulate against
>completely insane routers), and then loose interpretation. The model is
>"Fool me once, shame on you; fool me twice, shame on me."
>On first receiving a bad update, reset. If upon re-establishing the session,
>the same bad update is heard, drop the bad update but keep the session up
>(along with the messages back, etc.)
>One additional optional behaviour I would suggest - look at the AS path
>and/or path length and/or announcing router IP address. If heard from the
>originator, drop the session (and either keep it down, or try one more time
>before requiring operator intervention); it may be the case that only these
>conditions strictly require a reset, and that all other situations may only
>require the "ignore bad routes" behaviour.
>Resetting BGP more than a small, finite number of times is, IMHO, a bad
>idea. After all, BGP is a stateful protocol, and state changes should be
>triggered deterministically, even if that requires operator input.
>Brian Dickson
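
The "fool me once / fool me twice" policy Brian describes above can be
sketched roughly as follows. This is only an illustration, not anything
from an actual BGP implementation: the class name, the SHA-256
fingerprint used to recognise "the same bad update", the max_resets
default of 3, and the choice to drop (rather than reset) once the reset
budget is used up are all assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass, field
from enum import Enum


class Action(Enum):
    RESET_SESSION = "reset session"
    DROP_UPDATE = "drop update, keep session up"
    HOLD_DOWN = "keep session down until operator intervenes"


@dataclass
class BadUpdatePolicy:
    """Per-peer decision logic (hypothetical names throughout)."""
    max_resets: int = 3                      # assumed "small, finite number"
    seen: set = field(default_factory=set)   # fingerprints of bad updates
    resets: int = 0                          # resets already triggered

    def on_bad_update(self, raw: bytes, from_originator: bool = False) -> Action:
        fingerprint = hashlib.sha256(raw).hexdigest()
        repeat = fingerprint in self.seen
        self.seen.add(fingerprint)
        budget_spent = self.resets >= self.max_resets

        if from_originator:
            # Optional stricter rule: the peer itself originated the bad
            # route. Try one reset, then stay down.
            if repeat or budget_spent:
                return Action.HOLD_DOWN
            self.resets += 1
            return Action.RESET_SESSION

        # Fool me twice: same bad update after a reset, so just drop it.
        # Dropping when the budget is spent is an assumption of this sketch.
        if repeat or budget_spent:
            return Action.DROP_UPDATE

        # Fool me once: strict interpretation, reset the session.
        self.resets += 1
        return Action.RESET_SESSION
```

With one policy object per peer, the first on_bad_update(raw) call for a
given malformed update returns RESET_SESSION and a second call with the
same bytes returns DROP_UPDATE, so a single persistent bad update cannot
cause repeated flapping.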
