size of the routing table is a big deal, especially in IPv6

Daniel Senie dts at senie.com
Mon Nov 29 23:55:40 UTC 2004


At 06:33 PM 11/29/2004, Iljitsch van Beijnum wrote:

>On 28-nov-04, at 5:20, Daniel Roesen wrote:
>
>>>I find it interesting that no operators are screaming that there will be
>>>too many routes, but that all the IPv6 researchers are bringing forth
>>>this view.
>
>>ACK. All the "oh our IPv4 DFZ table explodes today" is similarly
>>unfounded as far as I'm aware. I have not heard of anybody being
>>able to crystal-ball the scaling limits of BGP4 yet, and currently
>>used BGP implementations seem to cope quite well with 150k routes
>>(set aside the notorious vendor C artificial RAM limits in older gear
>>to make you buy new gear when table gets bigger).
>
>Ok, I'll do this one more time.
>
>There are basically two issues: the forwarding table and BGP processing. 
>Information in the forwarding table needs to be found *really* fast. 
>Fortunately, it's possible to create data structures where this is 
>possible, to all intents and purposes, regardless of the size of the 
>table. However, memory is a concern here, as you only have a few hundred 
>nanoseconds to look up something in the routing table at 10 Gbps speeds.
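
A rough check of the per-packet budget mentioned above can be sketched as follows. This is illustrative arithmetic only: it assumes Ethernet framing with 20 bytes of per-frame overhead (preamble plus inter-frame gap), and the frame sizes are arbitrary sample values, not measurements. The function name packet_time_ns is made up for this example.

```python
# Per-packet time budget on a 10 Gbps link (illustrative arithmetic only).
LINK_BPS = 10e9  # 10 Gbps line rate


def packet_time_ns(frame_bytes, overhead_bytes=20):
    """Nanoseconds one frame occupies on the wire.

    overhead_bytes is an assumption: Ethernet preamble (8 bytes)
    plus the minimum inter-frame gap (12 bytes).
    """
    bits = (frame_bytes + overhead_bytes) * 8
    return bits / LINK_BPS * 1e9


# Sample frame sizes (assumed): minimum, mid-size, and full-size frames.
for size in (64, 400, 1500):
    print(f"{size:5d}-byte frame: {packet_time_ns(size):7.1f} ns")
```

Under these assumptions a mid-size frame leaves a few hundred nanoseconds per lookup, while back-to-back minimum-size frames leave well under one hundred.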

This is a solvable problem. Hardware lookups are quite sufficient. 
Forwarding bases stored in line cards can be aggregated to the extent the 
data permits. Any router with 10GigE interfaces that's going to care about 
actually filling such pipes will have advanced hardware forwarding 
technology and a price tag to support the development of same.

>  When the forwarding table gets too large and the packet rates too high, 
> you may run into memory bandwidth problems and/or have to use much more 
> expensive memory. On any line card, but especially on a fast one, a 
> bigger fdb simply costs more money.

Right. And anyone on the edge just needs enough memory to hold the table in 
their software-based routers that have little or no lookup assistance.


>For the BGP routing information base this isn't much of a problem, as you 
>can use much cheaper and slower memory. Unfortunately, there is also the 
>processing. Because of stuff like the longest match first rule and the 
>presence of multiple BGP routes towards the same destination, it's much 
>harder to use very efficient data structures for this. And to add insult 
>to injury, the contents of the BGP table change all the time. Now this 
>appears to be a linear problem, but it isn't: when the routing table gets 
>twice as big, generally this means twice as many updates (probably more, 
>as deaggregated routes tend to flap more) but you also need to search 
>through twice as many routes in the routing table to process each update. 
>So the work doesn't increase as O(n) but either O(n*n) or O(n*log(n)).
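
The scaling argument above can be put into numbers with a small sketch. It assumes, as the quoted text does, that update volume is proportional to table size n, and it models the per-update search cost as either n (scan every route) or log2(n) (a tree-style search). The function names and the two cost models are assumptions made for illustration, not a description of any real BGP implementation.

```python
import math


def relative_work(n, per_update_cost):
    """Total work if the number of updates is proportional to n
    and each update costs per_update_cost(n)."""
    return n * per_update_cost(n)


def growth_factor(n, per_update_cost):
    """How much total work grows when the table doubles from n to 2n."""
    return relative_work(2 * n, per_update_cost) / relative_work(n, per_update_cost)


def linear_scan(n):
    """Assumed cost model: every route is examined per update."""
    return n


def tree_search(n):
    """Assumed cost model: a balanced-tree style search per update."""
    return math.log2(n)


n = 150_000  # table size quoted earlier in the thread
print("linear scan:", growth_factor(n, linear_scan))
print("tree search:", growth_factor(n, tree_search))
```

With the scan model, doubling the table quadruples the work; with the logarithmic model the work grows by only slightly more than a factor of two.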

Even 10 years ago it was evident the routing table structures chosen by 
different manufacturers had significantly different performance 
characteristics. As there is no single data structure to define the storage 
of this information, it may follow that there is no singular formula for 
the impact of scaling.
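
To illustrate how the choice of structure changes the cost without changing the answer, here is a minimal sketch of two toy longest-match tables. Neither is claimed to be what any vendor ships; the class names, prefixes, and next-hop labels are invented for the example.

```python
import ipaddress


class LinearTable:
    """Toy table: scan every route, keep the longest matching prefix."""

    def __init__(self):
        self.routes = []  # list of (network, next_hop)

    def add(self, prefix, next_hop):
        self.routes.append((ipaddress.ip_network(prefix), next_hop))

    def lookup(self, addr):
        a = ipaddress.ip_address(addr)
        best = None
        for net, hop in self.routes:  # cost grows with table size
            if a in net and (best is None or net.prefixlen > best[0].prefixlen):
                best = (net, hop)
        return best[1] if best else None


class PerLengthTable:
    """Toy IPv4 table: one dict per prefix length, probed longest first."""

    def __init__(self):
        self.by_len = {}  # prefix length -> {network as int: next_hop}

    def add(self, prefix, next_hop):
        net = ipaddress.ip_network(prefix)
        self.by_len.setdefault(net.prefixlen, {})[int(net.network_address)] = next_hop

    def lookup(self, addr):
        a = int(ipaddress.ip_address(addr))
        for plen in sorted(self.by_len, reverse=True):  # at most 33 probes
            mask = (0xFFFFFFFF << (32 - plen)) & 0xFFFFFFFF
            hop = self.by_len[plen].get(a & mask)
            if hop is not None:
                return hop
        return None


tables = [LinearTable(), PerLengthTable()]
for t in tables:
    t.add("0.0.0.0/0", "default")
    t.add("10.0.0.0/8", "A")
    t.add("10.1.0.0/16", "B")

for addr in ("10.1.2.3", "10.2.0.1", "192.0.2.1"):
    print(addr, [t.lookup(addr) for t in tables])
```

Both tables return the same next hop for every address, but the first does work proportional to the number of routes while the second is bounded by the number of distinct prefix lengths.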


>Now all of this doesn't mean we can't have any growth in the global 
>routing table, but it does mean that such growth must be considerably 
>below the Moore's Law rate (a factor 2 in 18 months or about a factor 10 
>in 5 years). Over the past few years the routing table growth has been 
>very modest, but it looks like it's picking up speed again. This isn't 
>good, although we're certainly not at dangerous levels yet.
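
The Moore's Law figures quoted above are easy to check. This sketch only verifies the arithmetic (a doubling every 18 months compounded over 5 years); the function name is made up for the example.

```python
def growth_over(months, doubling_period_months=18):
    """Growth factor after `months` if capacity doubles every doubling period."""
    return 2 ** (months / doubling_period_months)


five_years = growth_over(5 * 12)
print(f"growth over 5 years: {five_years:.2f}x")
```

The result is very close to 10, matching the "factor 10 in 5 years" figure.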

Over the past several years, the CPUs in routers have been considerably 
slower than the fastest on the market. I suspect there's a fair bit of headroom 
at present between the route processing engines in core routers and the 
fastest CPUs presently offered for sale. As such, I have to wonder just how 
much growth we could handle instantaneously, and still stay within the CPU 
capabilities of today's available processors. Also consider that CPU power 
is far from the only issue. Higher speed memory continues to be developed 
along with higher speed bus architectures. System performance is made up of 
many factors.


>>>8 years too late guys.  We've figured out table management.
>
>>ACK, looks like that.
>
>Yes, it's surprising how effective hoping for the best can be sometimes.
>
>>And even if all active ASses would immediately adopt IPv6, we would
>>land at about 18k IPv6 routes. "big deal".
>
>I have a slightly bigger deal for you. Unfortunately, I can't find the 
>current number right now, but the number of individual /24s in the BGP 
>table was always something like half the table when I looked. Now for an 
>ISP, a /24 is small change, so it's likely that most of those /24s are 
>real or de facto PI blocks that are often announced under the AS of the ISP 
>of the week rather than under the AS of the holder of the block. If you 
>take this number you're at around 50k. I'm not sure about how this works 
>out in actual implementations, but it's likely a 50 to 75 k IPv6 table 
>takes the same amount of memory as a 150k IPv4 table.

Deaggregating the entire IPv4 space into /24's is today the worst case 
design for the RIB of a router. Designing a router to handle that case is 
not beyond today's technology.
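
For scale, the worst case described above can be counted directly. This is an upper bound only: it counts every /24 in the 32-bit address space, including private, multicast, and reserved ranges that would never appear in the global table. The 150k figure is the table size quoted earlier in the thread.

```python
# Upper bound on the number of /24 routes in a fully deaggregated IPv4 table.
ALL_SLASH_24S = 2 ** 24      # one route per possible 24-bit prefix
QUOTED_TABLE_SIZE = 150_000  # table size cited earlier in the thread

print("possible /24s:", ALL_SLASH_24S)
print("ratio to quoted table size:", round(ALL_SLASH_24S / QUOTED_TABLE_SIZE))
```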


>Next step. In IPv4, there is downward pressure on multihoming because you 
>can't get a route advertised that's longer than a /24. And yes, even a /24 
>is somewhat hard to get for most people. In IPv6, _everyone_ can get a 
>/48. So if we allow /48 PI blocks in the routing table, how do we make 
>sure we only allow "legitimate" PI users and not ISPs deaggregating a /32 
>into 64k /48s or people announcing PA /48s?
>
>This deal is getting bigger by the minute.
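
The "64k /48s" figure can be reproduced with Python's standard ipaddress module. The prefix used here is the IPv6 documentation prefix 2001:db8::/32, chosen purely as a stand-in for an ISP allocation.

```python
import ipaddress
from itertools import islice

# Documentation prefix, used here as a stand-in for an ISP's /32.
isp_block = ipaddress.ip_network("2001:db8::/32")

# Each extra prefix bit doubles the number of possible more-specifics.
count_48s = 2 ** (48 - isp_block.prefixlen)
print("number of /48s in a /32:", count_48s)

# List only the first few rather than materialising all of them.
first_three = [str(n) for n in islice(isp_block.subnets(new_prefix=48), 3)]
print(first_three)
```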

Look out above! The sky is falling.


>In IPv4 it took a while before we managed to get it right, resulting in 
>the 192.x.x.x swamp and lots of address space and AS numbers that are as 
>good as unreclaimable. And this was all before 1993, before pretty much 
>anyone had even heard of the internet. If we get it wrong to the same 
>degree in IPv6 it will be much worse because the potential influx of new 
>IPv6 users in a week is larger than the influx of new IPv4 users in any 
>year before 1993. (For instance, if there is a land rush on AS numbers 
>because they are a free ticket towards an IPv6 PI prefix.)
>
>Now I'm not saying that all kinds of bad things are going to happen.

Really? You've set the stage to say exactly that. At least that's how it 
read to me.

>  I'm just saying we should be very conservative in allowing irreversible 
> changes in unscalable aspects of IPv6.

I'd sure like to see a lot more thorough analysis than what you provided 
above before reaching that conclusion. History has certainly not sided with 
you. Back in the mid-1990s, we were told routers wouldn't scale, so we 
needed MPLS. While MPLS has found useful roles in the network, it wasn't 
needed as a replacement for IPv4 routing in the core. Several companies, 
including some startups, figured out ways to route packets quite quickly.

In the long run, I'd rather provide the ability to offer the services 
needed. This permits the companies looking for those services to flourish 
and help the economies of the world. While there are challenges to be 
addressed, I believe those challenges will be well met by the equipment 
marketplace, and that innovation also will help the economies of the world. 
Artificial restraint does not result in expanded services or product 
innovations. If I had a way to vote on this, I'd vote to let the markets work. 



