size of the routing table is a big deal, especially in IPv6

Iljitsch van Beijnum iljitsch at muada.com
Mon Nov 29 23:33:04 UTC 2004


On 28-nov-04, at 5:20, Daniel Roesen wrote:

>> I find it interesting that no operators are screaming that there
>> will be too many routes, but that all the IPv6 researchers are
>> bringing forth this view.

> ACK. All the "oh our IPv4 DFZ table explodes today" is similarly
> unfounded as far as I'm aware. I have not heard of anybody being
> able to crystal-ball the scaling limits of BGP4 yet, and currently
> used BGP implementations seem to cope quite well with 150k routes
> (set aside the notorious vendor C artificial RAM limits in older gear
> to make you buy new gear when table gets bigger).

Ok, I'll do this one more time.

There are basically two issues: the forwarding table and BGP 
processing. Information in the forwarding table needs to be found 
*really* fast. Fortunately, it's possible to create data structures
where this is possible, to all intents and purposes, regardless of the
size of the table. However, memory is a concern here, as you only have 
a few hundred nanoseconds to look up something in the routing table at 
10 Gbps speeds. When the forwarding table gets too large and the
packet rates too high, you may run into memory bandwidth problems
and/or have to use much more expensive memory. On any line card, but
especially on a fast one, a bigger forwarding table simply costs more
money.
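
To put rough numbers on the paragraph above, here is a minimal Python
sketch. It computes how long one packet occupies a 10 Gbps link (the
time available for one lookup) and shows a plain binary trie whose
lookup cost is bounded by the address length rather than by the number
of routes. The packet sizes, the names `Node`, `insert` and `lookup`,
and the next-hop labels are assumptions made for illustration; real
line cards use far more compact structures, usually in hardware.

```python
import ipaddress


def lookup_budget_ns(packet_bytes, link_bps=10_000_000_000):
    """Time one packet of the given size occupies the link, in nanoseconds."""
    return packet_bytes * 8 * 1_000_000_000 / link_bps


class Node:
    """One bit position in a binary trie (hypothetical illustration type)."""
    __slots__ = ("children", "next_hop")

    def __init__(self):
        self.children = [None, None]
        self.next_hop = None


def insert(root, prefix, next_hop):
    """Store next_hop under prefix; use one trie per address family."""
    net = ipaddress.ip_network(prefix)
    bits = int(net.network_address)
    node = root
    for i in range(net.prefixlen):
        bit = (bits >> (net.max_prefixlen - 1 - i)) & 1
        if node.children[bit] is None:
            node.children[bit] = Node()
        node = node.children[bit]
    node.next_hop = next_hop


def lookup(root, address):
    """Longest-prefix match: at most one step per address bit."""
    addr = ipaddress.ip_address(address)
    bits = int(addr)
    node = root
    best = root.next_hop
    for i in range(addr.max_prefixlen):
        node = node.children[(bits >> (addr.max_prefixlen - 1 - i)) & 1]
        if node is None:
            break
        if node.next_hop is not None:
            best = node.next_hop  # remember the most specific match so far
    return best


if __name__ == "__main__":
    # Assumed packet size of 400 bytes; smaller packets leave less time.
    print(lookup_budget_ns(400))  # 320.0
    table = Node()
    insert(table, "10.0.0.0/8", "A")
    insert(table, "10.1.0.0/16", "B")
    print(lookup(table, "10.1.2.3"))  # B
```

The lookup loop never runs more than 32 times for IPv4 however many
prefixes are stored, but every stored prefix still costs memory, which
is the concern raised above.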

For the BGP routing information base this isn't much of a problem, as 
you can use much cheaper and slower memory. Unfortunately, there is 
also the processing. Because of stuff like the longest match first rule 
and the presence of multiple BGP routes towards the same destination, 
it's much harder to use very efficient data structures for this. And to 
add insult to injury, the contents of the BGP table change all the
time. Now this appears to be a linear problem, but it isn't: when the
routing table gets twice as big, generally this means twice as many
updates (probably more, as deaggregated routes tend to flap more) but
you also need to search through twice as many routes in the routing
table to process each update. So the work doesn't increase as O(n) but
as either O(n*n) or O(n*log(n)), depending on whether that search is
linear or logarithmic in the size of the table.
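
The scaling argument above can be sketched numerically. This is a toy
model, not a measurement: it assumes the number of updates is exactly
proportional to the table size, and that each update costs either a
linear scan or a tree search. The function names and the 150k starting
size (taken from the quoted text) are illustrative choices.

```python
import math


def total_work(n_routes, per_update_cost):
    """Work per unit of time: updates (assumed ~ n) times cost per update."""
    updates = n_routes
    return updates * per_update_cost(n_routes)


def linear_scan(n_routes):
    """Cost of one update if every route has to be examined."""
    return n_routes


def tree_search(n_routes):
    """Cost of one update if routes sit in a balanced search tree."""
    return math.log2(n_routes)


def growth_when_doubling(n_routes, per_update_cost):
    """Factor by which total work grows when the table doubles."""
    return (total_work(2 * n_routes, per_update_cost)
            / total_work(n_routes, per_update_cost))


if __name__ == "__main__":
    print(growth_when_doubling(150_000, linear_scan))  # 4.0
    print(growth_when_doubling(150_000, tree_search))  # a little over 2
```

In both cases doubling the table more than doubles the work, which is
the point of the paragraph above.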

Now all of this doesn't mean we can't have any growth in the global 
routing table, but it does mean that such growth must be considerably 
below the Moore's Law rate (a factor of 2 in 18 months, or about a
factor of 10 in 5 years). Over the past few years the routing table
growth has
been very modest, but it looks like it's picking up speed again. This 
isn't good, although we're certainly not at dangerous levels yet.
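
The Moore's Law figures above are easy to check. The short sketch
below only restates the arithmetic; the function name
`growth_factor` is made up for this example.

```python
def growth_factor(months, doubling_months=18):
    """Growth over the given period if capacity doubles every 18 months."""
    return 2 ** (months / doubling_months)


if __name__ == "__main__":
    print(round(growth_factor(60), 2))  # 10.08, i.e. about 10x in 5 years
    print(round(growth_factor(12), 2))  # 1.59, i.e. about 59% per year
```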

>> 8 years too late guys.  We've figured out table management.

> ACK, looks like that.

Yes, it's surprising how effective hoping for the best can be sometimes.

> And even if all active ASses would immediately adopt IPv6, we would
> land at about 18k IPv6 routes. "big deal".

I have a slightly bigger deal for you. Unfortunately, I can't find the 
current number right now, but the number of individual /24s in the BGP 
table was always something like half the table when I looked. Now for 
an ISP, a /24 is small change, so it's likely that most of those /24s 
are real or de facto PI blocks that are often announced under the AS of
the ISP of the week rather than under the AS of the holder of the 
block. If you take this number you're at around 50k. I'm not sure about 
how this works out in actual implementations, but it's likely that a 50k
to 75k IPv6 table takes the same amount of memory as a 150k IPv4 table.

Next step. In IPv4, there is downward pressure on multihoming because 
you can't get a route advertised that's longer than a /24. And yes, 
even a /24 is somewhat hard to get for most people. In IPv6, _everyone_ 
can get a /48. So if we allow /48 PI blocks in the routing table, how 
do we make sure we only allow "legitimate" PI users and not ISPs 
deaggregating a /32 into 64k /48s or people announcing PA /48s?
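
The "64k" figure above follows from the 16 bits between a /32 and a
/48. A quick check with Python's standard `ipaddress` module, using the
documentation prefix 2001:db8::/32 as a stand-in for an ISP allocation
(that prefix and the function name are assumptions for the example):

```python
import ipaddress


def count_more_specifics(prefix, new_prefixlen):
    """Number of new_prefixlen-sized blocks contained in prefix."""
    net = ipaddress.ip_network(prefix)
    return 2 ** (new_prefixlen - net.prefixlen)


if __name__ == "__main__":
    print(count_more_specifics("2001:db8::/32", 48))  # 65536

    # Cross-check by actually enumerating the /48s.
    net = ipaddress.ip_network("2001:db8::/32")
    print(sum(1 for _ in net.subnets(new_prefix=48)))  # 65536
```

So a single ISP deaggregating one /32 completely could add more than
three times the 18k routes mentioned in the quoted text.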

This deal is getting bigger by the minute.

In IPv4 it took a while before we managed to get it right, resulting in 
the 192.x.x.x swamp and lots of address space and AS numbers that are 
as good as unreclaimable. And this was all before 1993, before pretty 
much anyone had even heard of the internet. If we get it wrong to the 
same degree in IPv6 it will be much worse because the potential influx 
of new IPv6 users in a week is larger than the influx of new IPv4 users 
in any year before 1993. (For instance, if there is a land rush on AS 
numbers because they are a free ticket towards an IPv6 PI prefix.)

Now I'm not saying that all kinds of bad things are going to happen. 
I'm just saying we should be very conservative in allowing irreversible
changes in unscalable aspects of IPv6.



