Scalability issues in the Internet routing system

Daniel Senie dts at senie.com
Tue Oct 18 16:50:23 UTC 2005


At 11:30 AM 10/18/2005, Andre Oppermann wrote:

>I guess it's time to have a look at the actual scalability issues we
>face in the Internet routing system.  Maybe the area of action becomes
>a bit more clear with such an assessment.
>
>In the current Internet routing system we face two distinct scalability
>issues:
>
>1. The number of prefixes*paths in the routing table and interdomain
>    routing system (BGP)
>
>This problem scales with the number of prefixes and available paths
>to a particular router/network in addition to constant churn in the
>reachability state.  The required capacity for a router's control
>plane is:
>
>  capacity = prefix * path * churnfactor / second
>
>I think it is safe, even with projected AS and IP uptake, to assume
>Moore's law can cope with this.

Moore's law will keep up reasonably well with both the CPU needed to keep
BGP perking and the memory requirements for the RIB, as well as other
non-data-path functions of routers.



>2. The number of longest match prefixes in the forwarding table
>
>This problem scales with the number of prefixes and the number of
>packets per second the router has to process under full or expected
>load.  The required capacity for a router's forwarding plane is:
>
>  capacity = prefixes * packets / second
>
>This one is much harder to cope with as the number of prefixes and
>the link speeds are rising.  Thus the problem is multiplicative to
>quadratic.
>
>Here I think Moore's law doesn't cope with the increase in projected
>growth in longest prefix match prefixes and link speed.  Doing longest
>prefix matches in hardware is relatively complex.  Even more so for
>the additional bits in IPv6.  Doing perfect matches in hardware is
>much easier though...
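
To put rough numbers on the packets/second term above, here is a
back-of-the-envelope sketch in Python. It assumes Ethernet framing with
minimum-size 64-byte frames plus 20 bytes of preamble and inter-frame
gap on the wire, which is the worst case for lookups per second. The
function name and the link speeds chosen are mine, purely for
illustration.

```python
def packets_per_second(link_bps, frame_bytes=64, overhead_bytes=20):
    """Worst-case packet rate for a link, assuming Ethernet framing.

    frame_bytes:    minimum Ethernet frame size (64 bytes)
    overhead_bytes: preamble + start delimiter (8) and inter-frame gap (12)
    """
    bits_on_wire = (frame_bytes + overhead_bytes) * 8
    return link_bps / bits_on_wire


for gbps in (1, 10):
    pps = packets_per_second(gbps * 1e9)
    budget_ns = 1e9 / pps  # time available for one lookup at line rate
    print(f"{gbps:>2} Gb/s: {pps / 1e6:.2f} Mpps, {budget_ns:.1f} ns per packet")
```

The per-packet time budget shrinks linearly with link speed, no matter
how many prefixes the lookup has to consider.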

Several items regarding FIB lookup:

1) The design of the FIB need not be the same as the RIB. There is 
plenty of room for creativity in router design in this space. 
Specifically, the FIB could be dramatically reduced in size via 
aggregation. The number of egress points (real or virtual) and/or 
policies within a router is likely FAR smaller than the total number 
of routes. It's unclear if any significant effort has been put into this.

2) Nothing says the design of the FIB lookup hardware has to be 
longest match. Other designs are quite possible. Again, some 
creativity in design could go a long way. The end result must match 
that which would be provided by longest-match lookup, but that 
doesn't mean the ASIC/FPGA or general purpose CPUs on the line card 
actually have to implement the mechanism in that fashion.

3) Don't discount novel uses of commodity components. There are fast 
CPU chips available today that may be appropriate to embed on line 
cards with a bit of firmware, and may be a lot more cost effective 
and sufficiently fast compared to custom ASICs of a few years ago. 
The definition of what's hardware and what's software on line cards 
need not be entirely defined by whether the design is executed 
entirely by a hardware engineer or a software engineer.
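
To make items 1 and 2 a little more concrete, here is a minimal Python
sketch, not a description of how any shipping router works. It
aggregates a toy RIB into a smaller FIB using two simple rules (drop a
more-specific whose nearest covering prefix has the same next hop, and
merge sibling prefixes that share a next hop), then does the lookup as
a series of exact-match probes, one hash table per prefix length,
rather than a trie walk. The prefixes, the next-hop labels "A"/"B"/"C",
and all function names are hypothetical. A real design would also need
to handle route updates, which this ignores.

```python
import ipaddress


def nearest_cover(fib, net):
    """Next hop of the closest less-specific prefix in fib, or None."""
    while net.prefixlen > 0:
        net = net.supernet()
        if net in fib:
            return fib[net]
    return None


def aggregate(rib):
    """Return a smaller table that forwards identically to rib."""
    fib = dict(rib)
    changed = True
    while changed:
        changed = False
        # Work from the most specific prefixes upward.
        for net in sorted(fib, key=lambda n: n.prefixlen, reverse=True):
            if net not in fib:
                continue  # already merged away during this pass
            hop = fib[net]
            # Rule 1: redundant more-specific, the covering route agrees.
            if nearest_cover(fib, net) == hop:
                del fib[net]
                changed = True
                continue
            if net.prefixlen == 0:
                continue
            # Rule 2: both halves of the parent share a next hop, so the
            # parent alone can stand in for them.
            parent = net.supernet()
            sibling = next(s for s in parent.subnets() if s != net)
            if fib.get(sibling) == hop:
                del fib[net], fib[sibling]
                fib[parent] = hop
                changed = True
    return fib


def build_exact_tables(fib):
    """One exact-match table per prefix length, keyed by masked address."""
    tables = {}
    for net, hop in fib.items():
        tables.setdefault(net.prefixlen, {})[int(net.network_address)] = hop
    return tables


def lookup_exact(tables, addr, width=32):
    """Probe exact-match tables from longest to shortest prefix length."""
    a = int(addr)
    for plen in sorted(tables, reverse=True):
        key = (a >> (width - plen)) << (width - plen)
        hop = tables[plen].get(key)
        if hop is not None:
            return hop
    return None


def lookup_lpm(rib, addr):
    """Reference longest-prefix match by brute force over the full RIB."""
    matches = [n for n in rib if addr in n]
    return rib[max(matches, key=lambda n: n.prefixlen)] if matches else None


net = ipaddress.ip_network
rib = {
    net("10.0.0.0/8"): "A",
    net("10.1.0.0/16"): "A",   # redundant: covered by the /8 with same hop
    net("10.2.0.0/24"): "B",
    net("10.2.1.0/24"): "B",
    net("10.2.2.0/24"): "B",
    net("10.2.3.0/24"): "B",   # the four /24s collapse into one /22
    net("192.0.2.0/24"): "C",
}
fib = aggregate(rib)
tables = build_exact_tables(fib)
print(len(rib), "RIB routes ->", len(fib), "FIB entries")  # 7 -> 3

# The aggregated, exact-match FIB must agree with plain longest match.
for text in ("10.1.2.3", "10.2.0.5", "10.2.3.200", "10.2.4.1",
             "192.0.2.9", "198.51.100.1"):
    addr = ipaddress.ip_address(text)
    assert lookup_exact(tables, addr) == lookup_lpm(rib, addr)
```

The point is only that the forwarding result is what has to match
longest-prefix semantics. The table the line card actually holds, and
the mechanism used to search it, can be quite different from the RIB.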

Finally, don't discount the value and performance of software-based 
routers. MPLS was first "sold" as a way to deal with core routers not 
handling Gigabit links. The idea was to get the edge routers to take 
over. Present CPU technology, especially with good embedded systems 
software design, is quite capable of performing the functions needed 
for edge routers in many circumstances. It may well make sense to 
consider a mix of router types based on port count and speed at edges 
and/or chassis routers with line cards that are using general purpose 
CPUs for forwarding engines instead of ASICs for lower-volume sites. 
If we actually wind up with the core of most backbones running MPLS 
after all, well, we've got the technology so use it. Inter-AS routers 
for backbones will likely need to continue to be large, power-hungry 
boxes so that policy can be separately applied on the borders.

I should point out that none of this is really about scalability of 
the routing system of the Internet; it's all about hardware and 
software design to allow the present system to scale. Looking at 
completely different and more scalable routing would require finding 
a better way to do things than the present BGP approach.




