route converge time

Baldur Norddahl baldur.norddahl at
Sat Nov 28 23:36:11 UTC 2015


The IP transit links are direct links (not multihop). It is my impression
that a link down event is handled with no significant delay by the router
that has the link. The problem is the other router, the one that has to go
through the first router to access the link that went down.

The transit links are not unstable and in fact they have never been down
due to a fault. But we are a young network and still frequently have to
change things while we build it out. There have been cases where I have had
to take down the link for various reasons. There seems to be no way to do
this without causing significant disruption to the network.

Our routers are 2015 hardware. The spec sheet lists 2M IPv4 + 1M IPv6
routes in FIB and 10M routes in RIB. Route convergence is specified as 15k
routes/second. The route engines have 8 GB of RAM.
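
To put the 15k routes/second figure in perspective, here is a rough
back-of-envelope sketch. The route count is an assumption (roughly the size
of a full IPv4 table in late 2015), not a measured number, and the
calculation assumes the spec rate is actually sustained when routes are
reprocessed one by one.

```python
# Back-of-envelope estimate only. SPEC_RATE comes from the spec quoted
# above; ASSUMED_FULL_TABLE is an assumption, not a measured value.
SPEC_RATE = 15_000            # routes/second, from the router spec
ASSUMED_FULL_TABLE = 570_000  # assumed IPv4 full-table size (late 2015)


def convergence_seconds(route_count, routes_per_second=SPEC_RATE):
    """Time to reprocess route_count routes one by one at the given rate."""
    return route_count / routes_per_second


print(convergence_seconds(ASSUMED_FULL_TABLE))  # 38.0
```

Under those assumptions, per-route processing of a full table takes tens of
seconds, which would be consistent with a noticeable disruption.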

Say transit T1 is connected to router R1 and transit T2 is connected to
router R2.

I believe the underlying problem is that, because of MPLS L3VPN, the next
hop on R2 for routes out through T1 is not the transit provider's router as
usual. Instead it is the loopback IP of R1. This means that when T1 goes
down, the next hop is still valid, so R2 cannot deactivate the now-invalid
routes as a single group operation triggered by an invalid next hop.
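
A toy model of why the group operation does not fire on R2. This is only a
sketch: the address and prefix names are made up, and real BGP next-hop
tracking is more involved. The point is that R2 judges routes by whether
their next hop is reachable in the IGP, and R1's loopback stays reachable
when T1 goes down, so presumably the routes only disappear as withdrawals
are processed one at a time.

```python
R1_LOOPBACK = "10.255.0.1"  # hypothetical loopback address of R1

# R2's view: prefix -> BGP next hop. All T1 routes point at R1's loopback.
routes_on_r2 = {f"prefix-{i}": R1_LOOPBACK for i in range(1000)}

# Next hops R2 can currently reach according to the IGP.
igp_reachable = {R1_LOOPBACK}


def active_routes(routes, reachable):
    """Prefixes whose next hop is still reachable in the IGP."""
    return [p for p, nh in routes.items() if nh in reachable]


# T1 goes down on R1. Nothing changes in R2's IGP view: R1's loopback
# is still up, so every route via T1 still looks usable on R2.
still_active = active_routes(routes_on_r2, igp_reachable)

# The routes only go away as individual withdrawals are processed.
withdrawals_processed = 0
for prefix in list(routes_on_r2):
    del routes_on_r2[prefix]
    withdrawals_processed += 1
```

In this model, all 1000 routes remain active right after the link failure,
and it takes 1000 separate operations to clear them.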

I am considering adding a loopback2 interface that tracks the transit
interface, so that loopback2 is shut down whenever the transit interface
goes down, and then forcing the next hop for the transit routes to be
loopback2. That way our IGP will signal that the next hop is gone, and that
should invalidate all the routes as a group operation.
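
The loopback2 idea can be sketched with a small model. Everything here is
hypothetical: the class, the "lo1"/"lo2" names and the prefix names are
illustrative only, and whether a real router can tie a loopback to an
interface state this way depends on the vendor.

```python
class R1:
    """Toy model of R1: loopback2 follows the state of the transit link."""

    def __init__(self):
        self.transit_up = True

    @property
    def loopback2_up(self):
        # The hypothetical trigger: loopback2 is up only while transit is up.
        return self.transit_up

    def igp_advertised(self):
        """Addresses R1 announces into the IGP."""
        addrs = {"lo1"}  # primary loopback, stays up regardless of transit
        if self.loopback2_up:
            addrs.add("lo2")
        return addrs


def active_routes(routes, reachable):
    """Prefixes whose next hop is still reachable in the IGP."""
    return [p for p, nh in routes.items() if nh in reachable]


r1 = R1()
# R2's view of the T1 routes, with the next hop forced to loopback2.
routes_on_r2 = {f"prefix-{i}": "lo2" for i in range(1000)}

before = active_routes(routes_on_r2, r1.igp_advertised())
r1.transit_up = False  # T1 goes down, which takes loopback2 down with it
after = active_routes(routes_on_r2, r1.igp_advertised())
```

Here a single IGP change (loopback2 disappearing) makes all 1000 routes
inactive at once, while the primary loopback and anything that depends on
it is unaffected.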


