Persistent BGP peer flapping - do you care?
Christopher A. Woodfield
rekoil at semihuman.com
Thu Jan 17 22:00:06 UTC 2002
See the "BGP Noise Tonight?" thread from the NANOG archives, October 2001.
A bogus prefix (a leaked confederation string) originated from $NETWORK, and that
network's peers/upstreams, whether or not they dropped the peer, propagated the
prefix, in violation of RFC behavior. Said prefix propagated to most of the 'net,
and every RFC-compliant router that got it dropped its peering session. The
customer I was working with, for example, lost connectivity completely because
all three of his upstream providers sent him the bad route.
If everyone had followed the RFC, the prefix would have never made it past the
first peering sessions, and the damage would have been contained. But because most
Cisco routers don't follow the RFC, it became a much more widespread operational
issue as the bad prefix hit many, many RFC-compliant routers and caused many, many
peers to drop unnecessarily. We witnessed exactly the kind of looping you're
talking about; session gets bad prefix, drop, reopens, gets same bad prefix again,
drops, and on and on; this could definitely benefit from the holddown timer you're
The problem here, ironically, is that it's not the Cisco networks that fail in this
case, but the hardware that actually tries to do the right thing, which is fine on
paper, but in this case, turned out to be exactly the Wrong Thing.
My opinion would be to modify the spec as follows:
1. Upon receipt of an invalid prefix advertisement, notify the sender of the error
and flush the advertisement. Do not drop the peering session unless a large number
of bad prefixes are received (possibly as a percentage of total route updates
received over a given amount of time).
2. Upon receipt of invalid BGP control/negotiation data (i.e. data that's not part
of a prefix advertisement, such as keepalives, etc), notify sender of the error and
drop the peering session.
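
As a rough illustration of the two rules above, here is a minimal Python sketch of
a per-peer policy. Everything in it is hypothetical: the class name, the action
strings, and the default window, ratio, and minimum sample size are my own
placeholders, not values from the RFC or from any vendor implementation.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class PeerErrorPolicy:
    """Hypothetical per-peer policy: tolerate isolated bad prefixes,
    drop the session only when bad prefixes dominate recent updates."""
    window_seconds: float = 60.0   # assumed sliding window length
    max_bad_ratio: float = 0.10    # assumed tolerated share of bad updates
    min_updates: int = 20          # assumed minimum sample before dropping
    _events: deque = field(default_factory=deque)  # (timestamp, valid) pairs

    def _expire(self, now: float) -> None:
        # Forget updates that have fallen out of the sliding window.
        while self._events and now - self._events[0][0] > self.window_seconds:
            self._events.popleft()

    def on_prefix_update(self, now: float, valid: bool) -> str:
        """Rule 1: an invalid prefix is reported and flushed; the session
        is dropped only if the bad share of recent updates is too high."""
        self._events.append((now, valid))
        self._expire(now)
        if valid:
            return "accept"
        total = len(self._events)
        bad = sum(1 for _, ok in self._events if not ok)
        if total >= self.min_updates and bad / total > self.max_bad_ratio:
            return "notify_and_drop_session"
        return "notify_and_discard_prefix"

    def on_control_error(self) -> str:
        """Rule 2: invalid control/negotiation data always drops the session."""
        return "notify_and_drop_session"
```

With these placeholder defaults, a single bad prefix among otherwise valid updates
is discarded and the session stays up, while a burst of bad prefixes eventually
crosses the ratio and drops the peer.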
I agree with your holddown timer proposal in cases of the peer being dropped due to
errors, as the resultant loops can result in extreme prefix dampening. But my
assertion is that BGP peering sessions should be a bit more robust and not drop
everything at the first sign of trouble.
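
For the holddown side, a minimal sketch of what I have in mind might look like the
following. It assumes the delay grows exponentially after each error-caused drop
and resets once a session stays up; the class name, the doubling factor, and the
initial and maximum values are all hypothetical and not taken from the spec.

```python
class IdleHoldTimer:
    """Hypothetical holddown timer: each error-caused drop lengthens the
    time the peer sits in Idle before the next automatic restart."""

    def __init__(self, initial: float = 60.0, factor: float = 2.0,
                 maximum: float = 3600.0) -> None:
        self.initial = initial    # assumed first holddown, in seconds
        self.factor = factor      # assumed growth multiplier per drop
        self.maximum = maximum    # assumed ceiling on the holddown
        self.current = initial

    def on_error_drop(self) -> float:
        """Return the delay to wait before restarting, then lengthen
        the delay that will apply to the next drop (capped at maximum)."""
        delay = self.current
        self.current = min(self.current * self.factor, self.maximum)
        return delay

    def on_stable_session(self) -> None:
        """Reset the holddown once the session has stayed established."""
        self.current = self.initial
```

The point of the cap is that a peer stuck behind a persistent bad prefix retries
less and less often instead of flapping at a constant rate, but still comes back
within a bounded time once the prefix is withdrawn.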
On Thu, Jan 17, 2002 at 04:14:10PM -0500, Susan Hares wrote:
> Thanks for the input. This is the time to revisit the
> specification. Just to confirm your
> answer, I'll paraphrase it and let you know what happened.
> the persistent bgp peer flapping
> happens when you (one of the paths)
> 1) Error causes stop
> (bad prefix --> drop connection)
> 2) BGP peer goes to IDLE state
> 3) Automatic restart happens (cisco doesn't utilize the
> 4) Open sent
> 5) active
> 6) error due to bad prefix still being sent
> 7) Idle Hold time (time delay here)
> --> go back to #1
> The specification says to slow down the cycle of
> re-establishing the session by increasing the time delay
> in step #7.
> I think we are describing the same problem. Could you
> please confirm?
> At 03:10 PM 1/17/2002 -0500, Christopher A. Woodfield wrote:
> >This has been bandied about before, but one should note that the "drop the
> >peer if an error is received" is only really effective if the session that
> >initiated the error does not propagate it. Most Cisco routers running
> >common IOS images not only do not drop the session, but pass along the
> >bad prefix, which
> >leads to the occasional bad route dropping peering sessions on most of
> >the Enterasys(*) routers on the planet.
> Do the peering sessions drop once or repeatedly until
> the bad prefix gets cleared out?
> >I guess the main question is what is considered an "error" - if the peer
> >starts obviously misbehaving, then yes, drop the peer. But don't drop the
> >peer due to an invalid prefix that most likely did not originate on that
> >router - it would be much better for the 'net as a whole to
> >just drop the bad prefix and carry on. Maybe an
> >algorithm could be built in where the peer could be dropped if the number
> >of bad
> >prefixes exceeds a set threshold...
> The algorithms for what constitutes a "drop" can be an implementation
> detail or be specified as an optional portion of the next version
> of the BGP specification.
> >In short, the "drop the session when you get a bad prefix" only serves its
> >purpose when every router that speaks BGP does this. If that can't be had,
> >we should really revisit the spec in that regard.
> The specification says "recommended" (should) now and as we noted with
> cisco, not all vendors implement it. We are documenting
> existing practice so recommended/should will remain.
> If you think it is a very serious operational issue, you
> can always input to the idr mailing list that the "should" needs
> to be "must" due to an operational issue.
> Thanks again for answering the cry for help!
Christopher A. Woodfield rekoil at semihuman.com
PGP Public Key: http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xB887618B