route converge time
bill at herrin.us
Sat Nov 21 23:09:47 UTC 2015
On Sat, Nov 21, 2015 at 8:44 AM, Baldur Norddahl
<baldur.norddahl at gmail.com> wrote:
> I got a network with two routers and two IP transit providers, each with
> the full BGP table. Router A is connected to provider A and router B to
> provider B. We use MPLS with a L3VPN with a VRF called "internet".
> Everything happens inside that VRF.
> Now if I interrupt one of the IP transit circuits, the routers will take
> several minutes to remove the now bad routes and move everything to the
> remaining transit provider. This is very noticeable to the customers. I am
> looking into ways to improve that.
Buy a router with a beefier CPU. It takes a lot of operations to
remove the hundreds of thousands of stale routes from the RIB and
completely recalculate the FIB.
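
To illustrate why the work scales with table size, here is a minimal
sketch, not any vendor's actual implementation. The RIB layout, the
prefix names, and the use of AS-path length as the only tie-breaker are
all assumptions made for the example.

```python
def withdraw_provider(rib, provider):
    """Remove every path learned from `provider`.

    `rib` maps prefix -> {provider_name: as_path_length} (a toy model).
    Returns how many prefixes changed best path, i.e. how many FIB
    entries would have to be rewritten.
    """
    fib_updates = 0
    for prefix, paths in rib.items():
        old_best = min(paths, key=paths.get) if paths else None
        paths.pop(provider, None)          # drop the stale path
        new_best = min(paths, key=paths.get) if paths else None
        if new_best != old_best:           # best path changed: touch the FIB
            fib_updates += 1
    return fib_updates


# Hypothetical table: 1000 prefixes, each preferred via provider A.
rib = {f"prefix-{i}": {"A": 2, "B": 3} for i in range(1000)}
updates = withdraw_provider(rib, "A")
print(updates)
```

Every prefix that preferred the failed provider needs its own best-path
decision and FIB write, so the cost grows with the number of routes
carried.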
> I added a default static route 0.0.0.0 to provider A on router A and did
> the same to provider B on router B. This is supposed to be a trick that
> allows the network to move packets before everything is fully converged.
> Traffic might not leave the most optimal link, but it will be delivered.
No. The router already has the alternate route in its RIB and will
start using it just as soon as the CPU can find time to remove the
dead one and recalculate the FIB. It won't get around to it any faster
just because you also have a default route in the RIB.
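
A rough sketch of the reason, assuming ordinary longest-prefix-match
forwarding: as long as the stale specific route is still installed, it
wins over any default. The prefixes and next-hop names below are made up
for illustration.

```python
import ipaddress


def lookup(fib, dst):
    """Return the next hop of the longest matching prefix, or None."""
    addr = ipaddress.ip_address(dst)
    best = None
    for prefix, next_hop in fib.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, next_hop)
    return best[1] if best else None


# Hypothetical FIB just after provider A's circuit has failed.
fib = {
    "0.0.0.0/0": "provider_B",        # default toward the surviving transit
    "198.51.100.0/24": "provider_A",  # stale specific route, not yet removed
}

before = lookup(fib, "198.51.100.7")  # stale /24 still beats the default
del fib["198.51.100.0/24"]            # the CPU finally removes the dead route
after = lookup(fib, "198.51.100.7")   # only now does traffic follow the default
print(before, after)
```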
You -could- elect not to receive a full routing table -at all- and
then tie default routes to something in the partial table you accept.
Fewer routes = less recalculation. The trade-off is that when the
problem is upstream from your particular link to a service provider,
it's less likely that you will recover from the error -at all- since
your router knows fewer of the individual routes. This will also
damage your ability to balance the load between the service providers.
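
The upstream-failure case can be sketched as follows. This is a toy
model with hypothetical names: `specific_routes` holds whatever
per-prefix paths the router accepted, and `default_exit` is the provider
the default route points at.

```python
def choose_exit(specific_routes, default_exit, prefix):
    """Pick an exit for `prefix`: best specific path if known, else default.

    `specific_routes` maps prefix -> {provider_name: as_path_length}.
    """
    candidates = specific_routes.get(prefix)
    if candidates:
        return min(candidates, key=candidates.get)
    return default_exit


# Assumed scenario: provider A's link to us is up, but A has lost its own
# upstream path to 203.0.113.0/24 and has withdrawn that prefix.
full_table = {"203.0.113.0/24": {"B": 3}}   # A's path is gone, B's remains
partial_table = {}                          # we never accepted this prefix

with_full = choose_exit(full_table, "A", "203.0.113.0/24")
with_partial = choose_exit(partial_table, "A", "203.0.113.0/24")
print(with_full, with_partial)
```

With the full table the router sees the withdrawal and shifts to
provider B. With only a default it keeps sending the traffic to provider
A, which cannot deliver it.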
William Herrin ................ herrin at dirtside.com bill at herrin.us
Owner, Dirtside Systems ......... Web: <http://www.dirtside.com/>