Multi-homed clients and BGP timers

Iljitsch van Beijnum iljitsch at muada.com
Sat May 23 12:54:49 UTC 2009


On 23 May 2009, at 0:58, Zaid Ali wrote:

>> From experience I found that you need to keep all the timers in  
>> sync with all your peers. Something like this for every peer in  
>> your bgp config.

> neighbor xxx.xx.xx.x timers 30 60


30 60 isn't a good choice: counting from the last message received,
the first keepalive comes in after 30.1 seconds, but the second would
arrive at 60.1 seconds, just after the hold timer expires at 60.0
seconds. Only one keepalive fits within the hold time.

The other side will typically use hold time / 3 as its keepalive
interval. If you set the hold time to something not divisible by 3,
all three of those keepalives arrive within the hold timer.
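
To make the arithmetic concrete, here is a minimal Python sketch. It
assumes keepalives are sent at a fixed interval and arrive with a small
constant delay of 0.1 seconds, and it counts from the last message
received at t = 0. The function name and the delay value are
illustrative only, not taken from any BGP implementation.

```python
def keepalives_within_hold(keepalive, hold, delay=0.1):
    """Return arrival times (seconds) of keepalives that land before the
    hold timer expires, counting from the last message received at t = 0.

    Assumption: every keepalive is delayed by the same `delay` seconds.
    """
    arrivals = []
    n = 1
    while n * keepalive + delay < hold:
        arrivals.append(n * keepalive + delay)
        n += 1
    return arrivals


if __name__ == "__main__":
    # Pairs are (keepalive, hold), in the same order as "timers 30 60".
    for keepalive, hold in [(30, 60), (5, 15), (5, 16)]:
        count = len(keepalives_within_hold(keepalive, hold))
        print(f"timers {keepalive} {hold}: {count} keepalive(s) before expiry")
```

Under these assumptions 30 60 leaves room for one keepalive, 5 15 for
two, and 5 16 for all three.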

I often recommended 5 16 in the past, but that's a bit on the short
side: some less robust BGP implementations are single-threaded and
may not be able to get a keepalive out within the 16-second hold time
when they're very busy.

The minimum possible hold time is 3 seconds (other than 0, which
disables keepalives altogether).

If you only change the setting at your end, you can raise it again
when bad stuff happens. If the other end also sets it, you'll have to
change it at both ends, because the hold time is negotiated and the
lowest value is used.
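
The negotiation can be sketched in a few lines of Python. This is an
illustration under stated assumptions: the keepalive interval is taken
as one third of the negotiated hold time, rounded down (the typical
behaviour mentioned above; real implementations may differ), and the
function name is hypothetical.

```python
def negotiated_timers(local_hold, remote_hold):
    """Return (hold, keepalive) in seconds for a session.

    The lower of the two proposed hold times wins. A hold time of 0
    means keepalives are disabled; otherwise it must be at least 3.
    Assumption: keepalive = hold // 3.
    """
    for proposed in (local_hold, remote_hold):
        if proposed != 0 and proposed < 3:
            raise ValueError("hold time must be 0 or at least 3 seconds")
    hold = min(local_hold, remote_hold)
    keepalive = hold // 3  # 0 when hold is 0, i.e. no keepalives
    return hold, keepalive


if __name__ == "__main__":
    # Only one side lowers its hold time; the session still uses 16.
    print(negotiated_timers(180, 16))
    # That side goes back to 180 and the session follows.
    print(negotiated_timers(180, 180))
```

This is why a low value set at only one end is easy to back out: raise
it there and the negotiated value follows at the next session setup.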

If you really want fast failover, terminate the fiber on the BGP router
itself and make sure fast-external-fallover is on (I think it's the
default).

For manual failover, simply shut down the BGP sessions on the router
that you don't want to handle traffic at that time. If you have
peer groups, you can do "neighbor peergroup shutdown" for the fastest
results. Shutting down interfaces is not such a good idea, because
then the routing protocols have to time out.



>
> Make sure that this is communicated to your peer as well, so that
> their timer settings match yours.
>
> Zaid
> ----- Original Message -----
> From: "Steve Bertrand" <steve at ibctech.ca>
> To: "nanog list" <nanog at nanog.org>
> Sent: Friday, May 22, 2009 3:45:20 PM GMT -08:00 US/Canada Pacific
> Subject: Multi-homed clients and BGP timers
>
> Hi all,
>
> I've got numerous single-site 100Mb fibre clients who have backup SDSL
> links to my PoP. The two services terminate on separate
> distribution/access routers.
>
> The CPE that peers to my fibre router sets a community, and my end sets
> the pref to 150 based on it. The CPE also sets a higher pref for
> prefixes from the fibre router. The SDSL router's session to the CPE
> leaves the default preference in place. Both of my PE routers send
> default-originate to the CPE. There is (generally) no traffic that
> should ever be on the SDSL link while the fibre is up.
>
> Both of the PE routers then advertise the learnt client route up into
> the core:
>
> *>i208.70.107.128/28
>                    172.16.104.22             0    150      0 64762 i
> * i                 172.16.104.23             0    100      0 64762 i
>
> My problem is the noticeable delay for switchover when the fibre
> happens to go down (God forbid).
>
> I would like to know if BGP timer adjustment is the way to adjust this,
> or if there is a better/different way. It's fair to say that the fibre
> doesn't 'flap'. Based on operational experience, if there is a problem
> with the fibre network, it's down for the count.
>
> While I'm at it, I've got another couple of questions:
>
> - whatever technique you might recommend to reduce convergence time
> throughout the network, can the same principles be applied to iBGP as
> well?
>
> - if I need to down core2, what is the quickest and easiest way to
> ensure that all gear connected to the cores will *quickly* switch to
> preferring core1?
>
> Steve
>




