size of the routing table is a big deal, especially in IPv6
Iljitsch van Beijnum
iljitsch at muada.com
Mon Nov 29 23:33:04 UTC 2004
On 28-nov-04, at 5:20, Daniel Roesen wrote:
>> I find it interesting that no operators are screaming that there will
>> be too many routes, but that all the IPv6 researchers are bringing
>> forth this view.
> ACK. All the "oh our IPv4 DFZ table explodes today" is similarly
> unfounded as far as I'm aware. I have not heard of anybody being
> able to crystal-ball the scaling limits of BGP4 yet, and currently
> used BGP implementations seem to cope quite well with 150k routes
> (set aside the notorious vendor C artificial RAM limits in older gear
> to make you buy new gear when table gets bigger).
Ok, I'll do this one more time.
There are basically two issues: the forwarding table and BGP
processing. Information in the forwarding table needs to be found
*really* fast. Fortunately, there are data structures that make this
possible, to all intents and purposes, regardless of the
size of the table. However, memory is a concern here, as you only have
a few hundred nanoseconds to look up something in the routing table at
10 Gbps speeds. When the forwarding table gets too large and the
packet rates too high, you may run into memory bandwidth problems
and/or have to use much more expensive memory. On any line card, but
especially on a fast one, a bigger forwarding table simply costs more money.
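
To put a number on the lookup time budget, here is a rough back-of-the-envelope sketch. It assumes back-to-back Ethernet frames with 20 bytes of per-frame wire overhead (preamble plus inter-frame gap); the frame sizes are only illustrative and the function name is made up for this example. The budget depends heavily on packet size, so treat the results as order-of-magnitude figures.

```python
# Per-packet time budget at line rate, assuming back-to-back Ethernet frames.
LINK_BPS = 10e9           # 10 Gbps, as in the text
WIRE_OVERHEAD_BYTES = 20  # assumption: 8-byte preamble + 12-byte inter-frame gap


def lookup_budget_ns(frame_bytes, link_bps=LINK_BPS):
    """Time between packet arrivals, in nanoseconds, at full line rate."""
    bits_on_wire = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return bits_on_wire / link_bps * 1e9


# Illustrative frame sizes: minimum, mid-sized, and full-sized frames.
for size in (64, 400, 1500):
    print(f"{size:5d}-byte frames: {lookup_budget_ns(size):7.1f} ns per packet")
```

Under these assumptions, mid-sized packets leave a few hundred nanoseconds per lookup, and minimum-sized packets leave considerably less.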
For the BGP routing information base this isn't much of a problem, as
you can use much cheaper and slower memory. Unfortunately, there is
also the processing. Because of stuff like the longest match first rule
and the presence of multiple BGP routes towards the same destination,
it's much harder to use very efficient data structures for this. And to
add insult to injury, the contents of the BGP table change all the
time. Now this appears to be a linear problem, but it isn't: when the
routing table gets twice as big, generally this means twice as many
updates (probably more, as deaggregated routes tend to flap more) but
you also need to search through a table with twice as many routes to
process each update. So the work doesn't increase as O(n) but as either
O(n*n) or O(n*log(n)), depending on how the search is done.
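
The scaling argument can be sketched as a toy model: total work is the number of updates times the cost of handling one update. The function name, the one-update-per-route rate, and the two search cost models are assumptions made for illustration, not measurements of any BGP implementation.

```python
import math


def total_work(n_routes, updates_per_route=1.0, search="log"):
    """Toy model: work = number of updates x cost of handling one update."""
    updates = updates_per_route * n_routes   # updates grow with table size
    if search == "linear":
        per_update = n_routes                # scan every route per update
    else:
        per_update = math.log2(n_routes)     # tree-like search per update
    return updates * per_update


# Compare a 150k-route table with one twice as large.
for search in ("linear", "log"):
    ratio = total_work(300_000, search=search) / total_work(150_000, search=search)
    print(f"{search:6s} search: doubling the table multiplies work by {ratio:.2f}")
```

With a linear search, doubling the table quadruples the work. With a logarithmic search the factor is only slightly above two, but that is still more than linear growth.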
Now all of this doesn't mean we can't have any growth in the global
routing table, but it does mean that such growth must be considerably
below the Moore's Law rate (a factor of 2 in 18 months, or about a factor
of 10 in 5 years). Over the past few years the routing table growth has
been very modest, but it looks like it's picking up speed again. This
isn't good, although we're certainly not at dangerous levels yet.
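
The Moore's Law figures above can be checked with a couple of lines of arithmetic. This is only a sketch; the function name is made up, and the 18-month doubling period is the one used in the text.

```python
def growth_factor(months, doubling_months=18):
    """Growth factor after `months`, given a doubling every `doubling_months`."""
    return 2 ** (months / doubling_months)


five_years = growth_factor(60)   # doubling every 18 months, over 5 years
one_year = growth_factor(12)     # the equivalent yearly growth factor

print(f"5 years: x{five_years:.2f}")
print(f"1 year:  x{one_year:.2f}")
```

So a doubling every 18 months is indeed roughly a factor of 10 in 5 years, or somewhat under 60% per year. Routing table growth has to stay considerably below that.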
>> 8 years too late guys. We've figured out table management.
> ACK, looks like that.
Yes, it's surprising how effective hoping for the best can be sometimes.
> And even if all active ASses would immediately adopt IPv6, we would
> land at about 18k IPv6 routes. "big deal".
I have a slightly bigger deal for you. Unfortunately, I can't find the
current number right now, but the number of individual /24s in the BGP
table was always something like half the table when I looked. Now for
an ISP, a /24 is small change, so it's likely that most of those /24s
are real or de facto PI blocks that are often announced under the AS of
the ISP of the week rather than under the AS of the holder of the
block. If you take this number you're at around 50k. I'm not sure about
how this works out in actual implementations, but it's likely that a
50k to 75k IPv6 table takes the same amount of memory as a 150k IPv4 table.
Next step. In IPv4, there is downward pressure on multihoming because
you can't get a route advertised that's longer than a /24. And yes,
even a /24 is somewhat hard to get for most people. In IPv6, _everyone_
can get a /48. So if we allow /48 PI blocks in the routing table, how
do we make sure we only allow "legitimate" PI users and not ISPs
deaggregating a /32 into 64k /48s or people announcing PA /48s?
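
To illustrate the deaggregation numbers, here is a small sketch using Python's standard ipaddress module. The documentation prefix 2001:db8::/32 stands in for an ISP's PA block, and falls_inside() is a hypothetical helper, not a real filter from any router. The hard part in practice is knowing which aggregates are PA space in the first place; the sketch simply assumes that list is given.

```python
import ipaddress
from itertools import islice

# Assumption: the documentation prefix stands in for an ISP's PA block.
isp_block = ipaddress.ip_network("2001:db8::/32")


def count_subnets(block, new_prefix):
    """Number of distinct prefixes of length new_prefix inside block."""
    return 2 ** (new_prefix - block.prefixlen)


def falls_inside(prefix, aggregates):
    """True if prefix is a more-specific of any listed aggregate (hypothetical check)."""
    return any(prefix != agg and prefix.subnet_of(agg) for agg in aggregates)


print(count_subnets(isp_block, 48))  # 2**16 = 65536, the "64k" in the text

# The first few /48s that would show up if the /32 were fully deaggregated.
first_three = [str(n) for n in islice(isp_block.subnets(new_prefix=48), 3)]
print(first_three)

# A /48 announced out of the PA block is recognizable only if the /32 is known.
announced = ipaddress.ip_network("2001:db8:1234::/48")
print(falls_inside(announced, [isp_block]))
```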
This deal is getting bigger by the minute.
In IPv4 it took a while before we managed to get it right, resulting in
the 192.x.x.x swamp and lots of address space and AS numbers that are
as good as unreclaimable. And this was all before 1993, before pretty
much anyone had even heard of the internet. If we get it wrong to the
same degree in IPv6 it will be much worse because the potential influx
of new IPv6 users in a week is larger than the influx of new IPv4 users
in any year before 1993. (For instance, if there is a land rush on AS
numbers because they are a free ticket towards an IPv6 PI prefix.)
Now I'm not saying that all kinds of bad things are going to happen.
I'm just saying we should be very conservative in allowing irreversible
changes in unscalable aspects of IPv6.