Fwd: Weekly Routing Table Report

Zartash Uzmi zartash at gmail.com
Mon Aug 3 20:38:37 UTC 2009


A response I received from Mark Tinka on the afnog mailing list. Reposting
for the nanog community:

---------- Forwarded message ----------
From: Mark Tinka <mtinka at globaltransit.net>
Date: Mon, Aug 3, 2009 at 7:55 AM
Subject: Re: Weekly Routing Table Report
To: Zartash Uzmi <zartash at gmail.com>
Cc: pfs at cisco.com, afnog at afnog.org


On Monday 03 August 2009 07:29:28 am Zartash Uzmi wrote:

<deleting other mailing lists and such>

> Apologies if this is too naive to ask...

Any question is a good question.

> but is there some
> detail available about the items listed in the summary?

This is probably the link you seek:

http://thyme.apnic.net/about.html

> 1) In particular, what exactly is the difference between
> the "BGP routing table entries examined (292961)" and
> "Unique aggregates announced to Internet (145391)"?

What this tells us is, assuming all routing entries
"currently" present in the routing table (as seen by the
infrastructure that enables this weekly CIDR report) were
aggregated, the maximum number of routing entries, as at the
point this report was generated, would be 145,391; just a
little shy of half what we're seeing today.
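
To make "aggregated" concrete, here is a minimal sketch using Python's
standard ipaddress module. The prefixes are made up for illustration and are
not taken from the report, and this only collapses adjacent address blocks;
it is an assumption that this resembles the report's idea of aggregation,
which may also take things like origin AS into account.

```python
import ipaddress

def aggregate(prefixes):
    """Collapse a list of prefix strings into the smallest covering set."""
    nets = [ipaddress.ip_network(p) for p in prefixes]
    return [str(n) for n in ipaddress.collapse_addresses(nets)]

# Hypothetical table: four contiguous /24s plus one unrelated /24.
table = [
    "10.0.0.0/24",
    "10.0.1.0/24",
    "10.0.2.0/24",
    "10.0.3.0/24",
    "192.168.5.0/24",
]

aggregated = aggregate(table)
print(len(table), "->", len(aggregated), aggregated)

# Ratio of the two figures quoted from the report.
ratio = round(145391 / 292961, 3)
print(ratio)
```

The four contiguous /24s collapse into a single /22, while the unrelated /24
stays as it is. The last two lines simply divide the two numbers quoted
above.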

> 2) I believe 292961 is the worst case routing table size
> for any router.

Well, different networks will see slightly different views
(more or less), but yes, the number each router in the DFZ
(default-free zone) is seeing would be just about there.

But if you mean the number of routing entries each router
can support, then that's a different issue. This depends on
a number of factors, particularly, the router's
architecture.

<At the risk of massively derailing this thread, little
primer on routing architectures follows, for perspective>

Software-based routers will, generally, hold as many routing
entries as the amount of RAM they have can support. A number
of software-based routers today support 2GB of RAM, e.g.,
Cisco's 7201 or 7206-VXR/NPE-G2, or Juniper's J4350 and
J6350 routers. The limitation, obviously, is that because
packet forwarding occurs in the CPU path (a software interrupt
process, hence the name, "software-based routers"), there is
a finite amount of traffic software-based platforms can
forward, especially with other features enabled, before they
run out of ideas, so to speak :-).

Hardware-based routers, on the other hand, segment the
process of handling routing entries and forwarding traffic,
in a somewhat distributed fashion. Hardware-based routers
typically have what are known as control planes and data
planes. Control planes handle management and such
housekeeping functions, which include BGP. This function is
generally similar to what you see in software-based routers,
in that RAM is abundant (up to 4GB in today's largest
systems) to hold as many routing entries from as many paths
as possible.

Where hardware-based routers differ from their software-
based cousins is how routing entries are used to forward
traffic. Once BGP has chosen the best path to all
destinations in the control plane, those best paths are then
"downloaded", if you will, to the router's data plane, which
is normally a special chip that has a single function -
forward traffic as quickly as possible. Dumb, but very
efficient. The vendors call these ASICs (Application
Specific Integrated Circuits) or Programmable Chips.

These ASICs or Programmable Chips usually hold only one
entry per destination prefix (the best path), even though
the control plane can have several copies of the same
routing entry, as seen from multiple BGP paths. These
ASICs/Programmable Chips use expensive, specialized memory
to hold these entries; it could be TCAM (Ternary Content
Addressable Memory), SSRAM (Synchronous Static RAM), RLDRAM
(Reduced Latency DRAM), etc. These are high-speed types of
memory that are built, in most cases, for networking
applications, e.g., routers, switches, etc., and work at
very high bandwidths, supporting high-speed route entry
look-ups, which allows hardware-based routers to forward
traffic at the speeds they do, e.g., 1Gbps, 10Gbps, 40Gbps,
etc.
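
A toy sketch of that split, for illustration only: the control plane keeps
every path it has learned, and only the best one per prefix is "downloaded"
to the forwarding table. The prefixes, next hops, and the use of AS path
length as the sole tie-breaker are all assumptions made for the example; real
BGP best-path selection has many more steps, and real hardware does not do a
linear scan.

```python
import ipaddress

# Hypothetical control-plane view: several BGP paths per prefix.
# Each path is (next_hop, as_path_length); all values are made up.
rib = {
    "203.0.113.0/24": [("192.0.2.1", 3), ("192.0.2.2", 2)],
    "10.1.2.0/24": [("192.0.2.1", 4)],
    "10.1.0.0/16": [("192.0.2.3", 1), ("192.0.2.1", 5)],
}

def build_fib(rib):
    """Keep one next hop per prefix (shortest AS path stands in for best path)."""
    fib = {}
    for prefix, paths in rib.items():
        best_next_hop, _ = min(paths, key=lambda path: path[1])
        fib[ipaddress.ip_network(prefix)] = best_next_hop
    return fib

def lookup(fib, address):
    """Longest-prefix match by linear scan; returns None if nothing covers it."""
    addr = ipaddress.ip_address(address)
    matches = [net for net in fib if addr in net]
    if not matches:
        return None
    return fib[max(matches, key=lambda net: net.prefixlen)]

fib = build_fib(rib)
total_paths = sum(len(paths) for paths in rib.values())
print(total_paths, "paths in the control plane,", len(fib), "entries in the FIB")
print(lookup(fib, "203.0.113.7"))
print(lookup(fib, "10.1.2.9"))
print(lookup(fib, "10.1.7.1"))
```

Five paths in the control plane become three forwarding entries, one per
prefix, which is why the data plane's memory is sized by the number of
prefixes rather than the number of paths.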

The reason I brought this up is because the explosion of the
IPv4 Internet routing table is putting pressure on routers,
more specifically, hardware-based routers. This is because
TCAM, SSRAM, RLDRAM, etc., are very expensive, and as such,
have a finite number of entries they can hold (I say entries
because on some platforms, entries include IPv4 routes,
IPv6 routes, MPLS LSPs, ACLs, NetFlow data, etc.).
Upgrading these means swapping out expensive data plane
infrastructure, which many service providers would like to
avoid, if at all possible.

</At the risk of massively derailing this thread, little
primer on routing architectures follows, for perspective>

> If the unique aggregates announced to the
> Internet is 145391, how does the routing table size
> anywhere may exceeds this number?

For the very reasons we receive this report on a regular
basis, to remind us of what we could do to reduce the
pollution of the routing table, and hence, increase the
lifetime of the (powerful?) routers we have in the network
that can no longer serve us because they can't hold any more
routing entries without falling over.

I wouldn't do Philip Smith's live presentation of the state
of the Internet routing table any justice by trying to
explain it here :-), but basically, we see about twice as
many routing entries as we should (292,961 versus 145,391),
mainly due to de-aggregation.

This happens for a number of reasons, but one of the main
ones is traffic engineering, where networks announce longer
versions of their prefixes in order to balance inbound
traffic to their network. This is usually a noble,
commercially-driven decision, but with the side effects
explained above.
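
A rough sketch of how that plays out, assuming a hypothetical network that
holds 10.20.0.0/16 and has two upstreams (the names and prefixes are made
up). It announces the aggregate through both, plus one /17 through each to
steer inbound traffic. Since the longest matching prefix wins, each half of
the address space is pulled in through a different upstream, at the cost of
three prefixes in everyone's table instead of one.

```python
import ipaddress

# Hypothetical announcements: (prefix, upstream it is announced through).
announcements = [
    ("10.20.0.0/16", "upstream-A"),
    ("10.20.0.0/16", "upstream-B"),
    ("10.20.0.0/17", "upstream-A"),
    ("10.20.128.0/17", "upstream-B"),
]

def inbound_upstreams(announcements, address):
    """Return the upstreams attached to the longest prefix covering address."""
    addr = ipaddress.ip_address(address)
    covering = [
        (ipaddress.ip_network(prefix), upstream)
        for prefix, upstream in announcements
        if addr in ipaddress.ip_network(prefix)
    ]
    if not covering:
        return []
    longest = max(net.prefixlen for net, _ in covering)
    return sorted(up for net, up in covering if net.prefixlen == longest)

# Number of distinct prefixes the rest of the world has to carry.
distinct_prefixes = {prefix for prefix, _ in announcements}
print(len(distinct_prefixes))
print(inbound_upstreams(announcements, "10.20.1.1"))
print(inbound_upstreams(announcements, "10.20.200.1"))
```

If one upstream fails, its /17 is withdrawn and the /16 through the other
upstream still covers the space, which is usually why the aggregate is kept
alongside the more-specifics.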

Cake, eat, both... :-).

> 3) Is aggregation done at a particular router for (i)
> reducing the table size in that router, or (ii) reducing
> the number of announced prefixes by that router, or (iii)
> both?

Border and peering routers are generally the ones that face
the world. They connect to the Internet either by purchasing
transit from upstream providers, or peering with other
networks privately or at public exchange points.

Aggregation is typically done inside your network, but
whatever the case, the prefixes that these routers announce
to the outside should, ideally, be aggregates of the
allocations a network receives from its RIR.

Different networks implement this differently, e.g., some
networks configure and announce aggregates on their border
and peering routers, while others, like us, do the same on
the route reflectors, which I think scales better if you
have multiple border and peering routers spread across the
network. But as you can tell, this is an internal design
issue - the end goal is to announce to the world only what
you need to announce to the world, and hopefully, keep the
Internet routing table lean & mean.
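
As a sketch of that end goal, and not a description of how any particular
network or router implements it: given the allocations received from the RIR
and the prefixes in use internally, announce only the allocations that are
actually in use, never the internal more-specifics. The function name and all
prefixes below are hypothetical, and the example assumes IPv4 only.

```python
import ipaddress

def prefixes_to_announce(allocations, internal_prefixes):
    """Return each allocation that covers at least one internal prefix."""
    allocs = [ipaddress.ip_network(a) for a in allocations]
    internal = [ipaddress.ip_network(p) for p in internal_prefixes]
    announce = []
    for alloc in allocs:
        # subnet_of() requires both networks to be the same IP version.
        if any(prefix.subnet_of(alloc) for prefix in internal):
            announce.append(str(alloc))
    return announce

# Hypothetical allocations from the RIR; the second one is not yet in use.
allocations = ["10.20.0.0/16", "10.99.0.0/20"]

# Hypothetical prefixes carried inside the network.
internal_prefixes = ["10.20.1.0/24", "10.20.64.0/18", "10.20.255.0/30"]

announced = prefixes_to_announce(allocations, internal_prefixes)
print(announced)
```

Three internal prefixes go in and a single /16 comes out; the unused
allocation is left unannounced.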

Hope this helps.

Cheers,

Mark.