routing meltdown

Nicolas Williams nmw at tremere.ios.com
Fri Aug 11 05:16:54 UTC 1995


If the problem we're talking about is that the number of paths an
exchange point (XP) router will soon be seeing is too large, and
having a single route server at each exchange is not politically
acceptable, then here is a possible solution.

One way to cut down the number of paths each XP router sees would be for
every NSP/ISP/whatever at the XP to install a router _and_ a BGP4 "proxy"
server. That is, each XP router would peer via IBGP with a large
(64-128MB of RAM) computer, which in turn would peer, via EBGP, with all
of that AS's peers' routers or route servers. This way each XP router so
configured would learn only one path per route at the XP, which would be
a substantial reduction from the number of paths most XP routers tend to
carry nowadays; the computer next to the router would be the one to
absorb the heavy load of handling so many neighbors, routes and paths,
and these computers can be upgraded more easily than the routers can.
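
To make the reduction concrete, here is a minimal sketch in Python
(purely illustrative). The function name select_best_paths, the peer
names, the prefixes, the AS numbers and the shortest-AS-path tie-break
are all assumptions made for the example; a real BGP4 daemon such as
gated applies a much fuller decision process.

```python
def select_best_paths(paths):
    """Reduce many EBGP-learned paths to one best path per prefix.

    paths: iterable of (peer, prefix, as_path) tuples (hypothetical format).
    Returns {prefix: (peer, as_path)}, which is what the proxy would
    hand to the XP router over IBGP.
    """
    best = {}
    for peer, prefix, as_path in paths:
        current = best.get(prefix)
        # Simplified tie-break (an assumption): shortest AS path wins,
        # then the lexically lowest peer name.
        candidate = (len(as_path), peer)
        if current is None or candidate < (len(current[1]), current[0]):
            best[prefix] = (peer, as_path)
    return best


# Made-up paths as seen by the proxy from three EBGP neighbors.
ebgp_paths = [
    ("peer-a", "192.0.2.0/24", [65001, 65010]),
    ("peer-b", "192.0.2.0/24", [65002, 65020, 65010]),
    ("peer-c", "192.0.2.0/24", [65003, 65010]),
    ("peer-a", "198.51.100.0/24", [65001, 65030, 65040]),
    ("peer-b", "198.51.100.0/24", [65002, 65040]),
]

best = select_best_paths(ebgp_paths)
print(len(ebgp_paths), "paths at the proxy,", len(best), "sent to the router")
```

The proxy holds every path; the router behind it only ever sees one
entry per prefix, no matter how many neighbors the proxy has.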

This approach has two major benefits:

 - The load on the XP routers would be lowered considerably by relieving
   the router of part of the path selection process as well as the need
   to have enough memory to carry hundreds of thousands of paths. The
   pressure on router vendors to beef up their products' "path" capacity
   would be considerably reduced, thus lowering router costs in the long
   run.

 - NAP members would be freed from having to wait for their router
   vendor to implement better route filtering features. After all, each
   member would then have better control over the software used for the
   BGP4 server (there's gated, there could be new PD/free/shareware BGP4
   router daemons, or even commercial ones if everyone tried this
   setup) and could, perhaps, hack their BGP4 daemon any which way they
   desire. A Pentium-class PC running some sort of Unix or Unix-like OS,
   or a Sun or DEC Alpha, or something like that, populated with 64MB of
   RAM or more would do for such a server; these, unlike Cisco routers,
   tend to be easily upgradable to larger amounts of RAM or faster CPUs
   too!

If you think about it, freeing ourselves from our router vendors' BGP4
limitations would allow more experimenting with route dampening, and
even a model that allows for some temporary entropy within the /18
model.  For example, a BGP4 daemon could be developed that keeps track
of each prefix's route flaps, thus allowing policies that filter
unstable prefixes/paths, or that would temporarily allow in prefix
announcements that are too long (this would allow a /18+Entropy model,
which would help ASes deal with rare AS partitioning incidents and the
like). The number and size of holes in large aggregates could also be
controlled; a policy could specify that no more than 4 /24 holes in /18s
be accepted for example.
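
As a rough sketch of what two such policies might look like (Python,
for illustration only): the class names FlapTracker and HoleLimiter,
the 900-second window, the flap limit of 3 and the example prefixes are
all assumptions of mine, not features of gated or of any existing BGP4
daemon. Only the limit of 4 holes per aggregate comes from the text
above.

```python
import ipaddress
from collections import defaultdict


class FlapTracker:
    """Counts recent flaps per prefix (hypothetical helper)."""

    def __init__(self, max_flaps=3, window=900.0):
        self.max_flaps = max_flaps      # flaps tolerated inside the window
        self.window = window            # window length in seconds
        self.flaps = defaultdict(list)  # prefix -> flap timestamps

    def record_flap(self, prefix, now):
        self.flaps[prefix].append(now)

    def is_unstable(self, prefix, now):
        # Forget flaps older than the window, then compare to the limit.
        recent = [t for t in self.flaps[prefix] if now - t <= self.window]
        self.flaps[prefix] = recent
        return len(recent) > self.max_flaps


class HoleLimiter:
    """Accepts at most max_holes more-specifics per aggregate."""

    def __init__(self, max_holes=4):
        self.max_holes = max_holes
        self.holes = defaultdict(set)   # aggregate -> accepted holes

    def accept(self, aggregate, hole):
        agg = ipaddress.ip_network(aggregate)
        net = ipaddress.ip_network(hole)
        if net == agg or not net.subnet_of(agg):
            return False                # not a hole in this aggregate
        accepted = self.holes[agg]
        if net in accepted:
            return True                 # already let in earlier
        if len(accepted) >= self.max_holes:
            return False                # aggregate already has its quota
        accepted.add(net)
        return True


tracker = FlapTracker()
for t in (0, 100, 200, 300):
    tracker.record_flap("10.0.5.0/24", t)
unstable_now = tracker.is_unstable("10.0.5.0/24", now=310)
unstable_later = tracker.is_unstable("10.0.5.0/24", now=1150)

limiter = HoleLimiter(max_holes=4)
decisions = [limiter.accept("10.0.0.0/18", f"10.0.{i}.0/24") for i in range(5)]
print(unstable_now, unstable_later, decisions)
```

A prefix that flaps four times in quick succession is flagged until the
old flaps age out of the window, and the fifth /24 punched into the same
/18 is refused.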

All of the above features would take some time before they are
implemented, but it would probably be less than the time it would take
router vendors (after all, many of us would have an interest in helping
to develop these features).

One other, less important, benefit of doing this would be that of saving
router vendors the headache of having to add more and more CPU and RAM
capacity to their routers. Why would this be good? Well, IPv6, if it
ever is accepted by the Internet (and I bet it will be), will only
require XP routers to carry a few hundred routes at most (IPv6 would
be used with CIDR from the beginning), wasting, as a result, the large
amounts of RAM and fast CPUs everyone will have put in their XP routers
by the time IPv6 replaces IPv4; general purpose servers can easily be
reused, and if anything, are far less expensive than large routers.

NAPs could even segregate all BGP4 traffic off of the FDDI or ATM
switches, since members' XP route servers could be configured to peer
over a lower bandwidth, separate LAN.


Would this exchange point configuration be acceptable?

Nick

PS: A Cisco 4500 with two high speed interface boards and one ether
    card, along with a Pentium-class, 128MB box in a small case with no
    monitor or keyboard and with one ethernet interface would be enough
    of a start up kit for most NAP newcomers and it would all fit in the
    limited rack space Sprint offers at its NAP. I'd love to see lower
    entry barriers.


