What Worked - What Didn't

Iljitsch van Beijnum iljitsch at muada.com
Mon Sep 17 21:31:34 UTC 2001


On Mon, 17 Sep 2001, Randy Bush wrote:

> > When a BGP router loses power, it takes minutes for the peer on the
> > other side of the connection to notice something is wrong and reroute
> > the traffic.

> as i do not see this in rfc 1771 or draft-ietf-idr-bgp4-1[23].txt, i
> suspect that this is implementation specific.

You are right. From the RFC:

"The suggested value for the Hold Time is 90 seconds.  The suggested value
for the KeepAlive timer is 30 seconds."

Cisco's defaults seem to be twice that: a 180-second hold time and a
60-second keepalive. Is there any reason these values should be this
high? I mean, other than to mimic RIP behavior?

Fortunately, the lower of the hold time values configured on the two peers
is used, so this can easily be brought down to a 3-second hold time with
1-second keepalives. But people still have to do it.
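
To make the negotiation concrete, here is a rough Python sketch of how I
read the RFC 1771 rule: the session uses the smaller of the two proposed
hold times, and a hold time must be either zero or at least three seconds.
The function names are my own, and the one-third keepalive ratio is only
the RFC's suggestion; a real implementation may pick its keepalive
interval differently.

```python
def negotiate_hold_time(local: int, remote: int) -> int:
    """Return the hold time (seconds) the session would use.

    The smaller of the two proposed values wins. A value of 0 means
    no keepalives at all; any other value must be at least 3 seconds.
    """
    for value in (local, remote):
        if value != 0 and value < 3:
            raise ValueError(f"hold time {value} is invalid: must be 0 or >= 3")
    return min(local, remote)


def keepalive_interval(hold_time: int) -> int:
    """Keepalive interval, ASSUMING the suggested one-third ratio."""
    if hold_time == 0:
        return 0  # hold time 0: no KEEPALIVE messages are sent
    return max(1, hold_time // 3)


if __name__ == "__main__":
    # (local, remote) proposals: both at 180, one side at the RFC's
    # suggested 90, and one side tuned down to the 3-second minimum.
    for local, remote in [(180, 180), (180, 90), (180, 3)]:
        hold = negotiate_hold_time(local, remote)
        print(f"local={local} remote={remote} -> "
              f"hold={hold}s keepalive={keepalive_interval(hold)}s")
```

Note that only one side has to lower its hold time for the whole session
to use the shorter value.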

Don't think this is a trivial issue. When the Amsterdam Internet Exchange
lost power a couple of months ago, we couldn't reach most of Europe for
about ten minutes when iBGP sessions of one of our transit ISPs started to
time out as they ran out of battery power.

I'm still not sure why all of this took this long, but a three-minute
delay (the 180-second hold time) is pretty much guaranteed on any Cisco
running BGP with "out of the box" timers, either over a switched layer 2
network or when "no fast-external-fallover" is in effect.
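
As a back-of-the-envelope check, the sketch below estimates how long a
router waits before declaring a silently dead peer down. It assumes the
hold timer is restarted only by keepalives (no updates in flight), that
nothing at layer 1 or 2 signals the failure, and it ignores timer jitter.
The function name is mine, not anything from an actual implementation.

```python
def detection_window(hold_time: int, keepalive: int) -> tuple[int, int]:
    """Return (best, worst) seconds from peer failure to hold timer expiry.

    Worst case: the peer dies right after sending a keepalive, so the
    full hold time has to run out. Best case: it dies just before the
    next keepalive was due, so one keepalive interval has already passed.
    """
    if keepalive > hold_time:
        raise ValueError("keepalive interval cannot exceed the hold time")
    return (hold_time - keepalive, hold_time)


if __name__ == "__main__":
    # hold time / keepalive pairs: 180/60, 90/30, and the tuned 3/1
    for hold, keep in [(180, 60), (90, 30), (3, 1)]:
        best, worst = detection_window(hold, keep)
        print(f"hold={hold}s keepalive={keep}s -> "
              f"peer declared down after {best}-{worst}s")
```

With 180/60 timers that works out to somewhere between two and three
minutes per session; with 3/1 it is two to three seconds.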



