BGP offloading (fixing legacy router BGP scalability issues)

Łukasz Bromirski lukasz at bromirski.net
Thu Apr 9 12:56:36 UTC 2015


Hi Frederik,

> On 09 Apr 2015, at 13:24, Frederik Kriewitz <frederik at kriewitz.eu> wrote:
> 
> Thank you very much for all your responses.
> 
> First of all, the problems we see are really RIB (Processor memory)
> and CPU related.
> The TCAM/FIB limits are properly configured. From a FIB capacity
> point of view they should last a couple more years. Software routing
> doesn't cause the problem.
> The most extreme case of Cisco 6500/SUP720 abuse I'm aware of is a
> setup with 4 full table transit connections + 2 RR sessions + ~20
> peerings, no downstreams. Besides the IPv4 and IPv6 peerings it's
> pretty much only handling a small amount of OSPF and MPLS (<5k
> prefixes ~500 routers). No netflow or any other memory hog. Under
> normal conditions it's running at 20% CPU and 90% processor memory
> (1G/SUP720 XL).

The main limit here, apart from the rather slow RP CPU, is
the amount of memory you can have. I’d set up a CSR1000v as an RR
and offload the control plane from the 6500 completely. It’s a nice
box for very fast hardware forwarding as long as the FIB fits
in the TCAM, which it seems it does in your scenario.

> In case a session with a lot of prefixes (e.g. a transit) fails, it
> takes up to 5 minutes for the BGP Router process to recompute the RIB,
> etc. During that time it's running at 100% CPU. Low priority
> processes are completely ignored (e.g. SNMP based monitoring stops
> working). Occasionally it even drops OSPF neighbours or other BGP
> sessions due to expired hold timers causing further havoc.

You can tune this with process time tweaks.

> Applying a /22 filter was suggested. In order to actually save the RIB
> memory we would have to disable soft-reconfiguration on the
> corresponding sessions.
> I don't like that option for various reasons as it trades less memory
> usage for longer convergence times and significantly bigger impacts on
> route map updates.
> Due to the IPv4 exhaustion we expect to see more small prefixes in the
> future which can't be aggregated (considering the AS path). Simply
> dropping them would result in less optimal routing.

If you have to filter somewhere on something, I’d rather try to filter
by AS_PATH (neighbors, etc.) than by prefix length.
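To illustrate the difference between the two filtering approaches, here
is a rough Python sketch on a toy table. It is not router configuration.
Everything in it is an assumption made for illustration: the prefixes
are documentation ranges, the AS numbers are from the private and
documentation ranges, the helper names are hypothetical, and reading
"neighbors, etc." as "at most two AS hops away" is just one possible
interpretation of the idea.

```python
from ipaddress import ip_network

# Toy routes as (prefix, AS_PATH) pairs. All values are made up.
ROUTES = [
    ("198.51.100.0/24", [64500, 64510]),              # neighbor's customer
    ("203.0.113.0/24", [64501, 64501, 64501]),        # neighbor, prepended
    ("192.0.2.0/24", [64500, 64511, 64512, 64513]),   # far away
    ("10.0.0.0/8", [64500, 64520, 64521]),            # short prefix, far away
]


def as_hops(path):
    """Count distinct AS hops, collapsing consecutive prepends."""
    hops = []
    for asn in path:
        if not hops or hops[-1] != asn:
            hops.append(asn)
    return len(hops)


def filter_by_prefix_length(routes, max_len=22):
    """Keep only prefixes of length /max_len or shorter."""
    return [p for p, _ in routes if ip_network(p).prefixlen <= max_len]


def filter_by_as_path(routes, max_hops=2):
    """Keep routes whose origin is at most max_hops AS hops away.

    Assumption: anything further away would follow a default route.
    """
    return [p for p, path in routes if as_hops(path) <= max_hops]


if __name__ == "__main__":
    print("prefix-length filter keeps:", filter_by_prefix_length(ROUTES))
    print("AS_PATH filter keeps:      ", filter_by_as_path(ROUTES))
```

In this toy table the /22 filter drops the nearby /24s, which are the
routes where a specific path matters most, while the AS_PATH filter
keeps them and drops the distant ones regardless of prefix length.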

-- 
"There's no sense in being precise when |               Łukasz Bromirski
 you don't know what you're talking     |      jid:lbromirski at jabber.org
 about."               John von Neumann |    http://lukasz.bromirski.net



