internet routing table in a vrf

Adam Vitkovsky adam.vitkovsky at swan.sk
Fri Mar 8 23:42:07 UTC 2013


There's some fundamental misunderstanding here. 
By default, with the vpnv4 and vpnv6 address families, the PE sets
next-hop-self.

Local-Repair and label-retention were around many years before PIC came
along.
It worked nicely with eibgp multipath and allowed the primary PE to work
around the failed PE-CE link and send traffic to alternate PE that
advertised the same prefix. 
The added value with PIC is that you don't need equal attributes in order
to have an alternate path installed into the FIB.
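
To illustrate the difference, here is a rough Python sketch. It is not any
vendor's best-path code: the Path tuple, the two-attribute ranking (local-pref,
then AS-path length) and the next-hop names are assumptions made up for this
example. Multipath only keeps paths that tie with the best one, while a
PIC-style selection keeps the best path plus the next-best one as a backup.

```python
from typing import List, NamedTuple, Optional, Tuple


class Path(NamedTuple):
    next_hop: str      # hypothetical next-hop name
    local_pref: int
    as_path_len: int


def rank(p: Path) -> Tuple[int, int]:
    # Heavily simplified best-path order: higher local-pref, then shorter AS path.
    return (-p.local_pref, p.as_path_len)


def multipath_set(paths: List[Path]) -> List[Path]:
    # Multipath-style: only paths whose compared attributes equal the best path's.
    best = min(paths, key=rank)
    return [p for p in paths if rank(p) == rank(best)]


def pic_primary_and_backup(paths: List[Path]) -> Tuple[Path, Optional[Path]]:
    # PIC-style: best path plus the next-best one, even if its attributes differ.
    ordered = sorted(paths, key=rank)
    backup = ordered[1] if len(ordered) > 1 else None
    return ordered[0], backup


paths = [Path("CE1", 100, 2), Path("PE2", 100, 3)]
print(multipath_set(paths))           # only the CE1 path qualifies
print(pic_primary_and_backup(paths))  # CE1 as primary, PE2 as backup
```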

There are no micro-loops involved on an alternate PE. 
During normal operation a packet arriving on the primary PE is
label-switched, based on the per-prefix or per-CE label, out the PE-CE link
as directed by the L2 overwrite in the FIB.
If the local PE-CE link fails, PIC or Local-Repair will just swap the
incoming label for the label advertised by the alternate PE.
Once the alternate PE receives the labeled packet it will just label-switch
it out its own PE-CE link.
During normal operation and during the failure there is no recursive lookup
done, just label-switching.
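
The forwarding steps above can be sketched as a toy model in Python. This is
only an illustration under assumptions: the LfibEntry class, the label values
100 and 200, and the interface names are invented, and the transport label
needed to reach the alternate PE is left out.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class LfibEntry:
    """Toy LFIB entry for one VPN label (hypothetical model)."""
    primary_if: str                     # PE-CE interface used in normal operation
    backup_pe: Optional[str] = None     # alternate PE that advertised the same prefix
    backup_label: Optional[int] = None  # VPN label advertised by that alternate PE


def forward(entry: LfibEntry, link_up: Dict[str, bool]) -> tuple:
    """Return the action for a labeled packet; no IP lookup takes place."""
    if link_up.get(entry.primary_if, False):
        return ("pop", entry.primary_if)             # normal: out the PE-CE link
    if entry.backup_pe is not None:
        # Local repair: swap to the alternate PE's label and send it there.
        return ("swap", entry.backup_label, entry.backup_pe)
    return ("drop",)


# The primary PE holds label 100 with a backup; the alternate PE holds label 200.
lfib_pe1 = {100: LfibEntry("ce-link-1", backup_pe="PE2", backup_label=200)}
lfib_pe2 = {200: LfibEntry("ce-link-2")}
links = {"ce-link-1": True, "ce-link-2": True}

normal = forward(lfib_pe1[100], links)
links["ce-link-1"] = False                       # local PE-CE link fails
repaired = forward(lfib_pe1[100], links)
at_alternate = forward(lfib_pe2[repaired[1]], links)
print(normal, repaired, at_alternate)
```

The alternate PE's entry has no backup pointing back at the primary PE, so in
this model the repaired packet can only leave via the alternate PE-CE link.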

As Ytti pointed out already, you don't want the PE-CE links to be carried by
the IGP, since you can fast reroute over their failure and perform a
"local-repair" until BGP converges and the ingress PE starts forwarding
traffic to the alternate PE/NH.
The only case where you experience an excessive loss of connectivity is when
the egress PE fails - in that case you need to rely on the speed of IGP
convergence to inform the ingress PE to switch to a preprogrammed backup
path/NH (PIC CORE).
There are already some RFCs that propose P-core to fast reroute to alternate
PE in case the primary PE fails - can't wait :). 
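
As a rough illustration of the preprogrammed backup path/NH idea, the Python
sketch below models many prefixes sharing one next-hop object, so a single
update triggered by the IGP moves all of them at once. The PathList class, the
PE names and the prefixes are assumptions for this example, not a real FIB
implementation.

```python
class PathList:
    """Shared next-hop object that many prefixes point at (toy model)."""

    def __init__(self, primary: str, backup: str) -> None:
        self.primary = primary
        self.backup = backup      # preprogrammed before any failure happens
        self.active = primary

    def next_hop_down(self, pe: str) -> None:
        # Called when the IGP declares a PE loopback unreachable.
        if pe == self.primary:
            self.active = self.backup


# 1000 prefixes all reference the same PathList instance.
shared = PathList("PE1", "PE2")
fib = {f"10.{i // 256}.{i % 256}.0/24": shared for i in range(1000)}

shared.next_hop_down("PE1")       # one update, no per-prefix walk
print(len(fib), {entry.active for entry in fib.values()})
```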
 

adam
-----Original Message-----
From: Matt Newsom [mailto:matt.newsom at RACKSPACE.COM] 
Sent: Friday, March 08, 2013 7:18 PM
To: Saku Ytti; nanog at nanog.org
Subject: RE: internet routing table in a vrf

     If you run PIC and hide the next hop information behind a loopback,
which is what will happen in a VPN environment, you will lose awareness of
the failure of an edge link on a remote PE. The remote PE will continue to
send traffic to the PE with the failed link until it has completely
converged both at the control plane, and written to the FIB. If the remote
PE has PIC running he can bounce that traffic back to his backup path via
another PE. There will be some percentage of your traffic that will then
form a transient micro loop though, because that remote PE will have his
primary path through the failed link due to shortest AS path length etc., and
he will not have converged yet around the failure on the remote PE and has
no awareness of the failure. One possible solution to this is to guarantee
that a PE will never use another PE for a primary transit route. This can be
accomplished via metrics such as weight etc. Again, one of the downsides of
this is that you need to run VRF labels so that a local IP lookup can be done
on the PE with the failed link and it can execute a local repair when it sees
the link drop.

-----Original Message-----
From: Saku Ytti [mailto:saku at ytti.fi] 
Sent: Friday, March 08, 2013 11:23 AM
To: nanog at nanog.org
Subject: Re: internet routing table in a vrf

On (2013-03-08 16:40 +0000), Matt Newsom wrote:

> 2) forward plane (recursive lookup issues)
>           Most platforms program prefixes with associated labels slower, so
> your base convergence will suffer.

Do you have any reference you could share? What level of penalty per prefix
have you observed in each platform tested?

> In addition, if you want to run PIC you will likely be left with a bit of
> custom engineering to make it work. VPNs hide the next hop behind the
> loopback of the PE so next hop failure awareness of an edge tie will be
> lost. If you can stomach the double lookup you can run per-vrf labels (per
> prefix isn't feasible on most platforms) and weight up your edge ties and
> force a bounce back to another PE, otherwise you will be stuck with BGP
> control plane based convergence with per-ce labels.

PIC is about converging each prefix at the same time. It makes no statement
about where the next hop points, whether that is loop0 (next-hop-self in
INET) or the edge CE.

If your IGP carries all edge links, and you don't run next-hop-self, the far
end PE can converge faster in the INET scenario. But current efforts are not
to fix this; current efforts are to make the local PE do hitless repair when
an arriving frame points to a dead edge interface.
It seems to be very rare to run INET in this way; the majority don't carry
edge links in the IGP and do run next-hop-self.

--
  ++ytti






