What is the limit? (was RE: multi-homing fixes)

Martin, Christian cmartin at gnilink.net
Wed Aug 29 16:21:23 UTC 2001


As much as arguing with Sean is generally a lost cause (face it folks, he
knows his sh*t), I would like to comment on the disasters he describes and
offer a view on why they happened, from someone who wasn't there.  Sometimes
an outside view is a clearer one.


> The Internet *FAILED* in 1994.  The number of prefixes 
> carried globally
> exceeded the ability of the global routing system to carry it, and
> as a result, some parts of the world deliberately summarized away
> whole large-scale ISPs merely to survive.

> The Internet *FAILED* again in 1996, when the dynamicism in
> the global routing system exceeded the ability of many border
> routers to handle, and as a result, some networks (not just core ones)
> deliberately made the Internet less quick to adjust to changes in
> topology. 
> 
> Thus, the routing system FAILED twice: once because of memory, once
> because of processor power.

And both times because the routers themselves were not designed exclusively
for the Internet and its requirements.  In 1994, the Internet was running on
AGS+ and 4000 boxen.  Compare those boxes, 7 years later (or 3.5 Moore
cycles later), to a 124xx, an M160, a TSR, or any other vendor's new gear.
Today's systems are 10-20x more memory-rich (though still CPU-constrained),
and the routing algorithms are adapting to deal with excessive dynamism.
Route damping is of huge importance, but that is already well known.  ORF
capabilities in BGP, redistribution communities, etc., will continue to keep
update frequency in check.  And again, routers today are generations behind
today's Moore cycle.  Furthermore, we are still using general-purpose CPUs.
Imagine a BGP ASIC that could process a zillion updates per second!  8->
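
For what it's worth, the arithmetic roughly backs that up.  A trivial
sketch, assuming a Moore cycle is one doubling roughly every two years
(itself a simplification):

    # Sanity check on the "3.5 Moore cycles" figure: seven years at one
    # doubling every ~2 years is about 2**3.5, i.e. a bit over 10x.
    years, years_per_doubling = 7, 2
    print("expected gain: ~%.0fx" % 2 ** (years / years_per_doubling))
    # -> ~11x, in line with the 10-20x memory growth mentioned above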

The point is that the vendors did a good job of dealing with the speed
issue.  In 1994, even Sean would have been impressed to see an OC-192c
router interface performing at wire speed with 1000-line ACLs in and out,
etc, etc.  The speed problem has been addressed (the number of
OC-192c/STM-64 ports in use is pretty low compared to the number of
OC-48/STM-16 ports that were in use 12 months after those cards were
released), so now the vendors can start working on the global routing
system.  Then again, most of them are working on MPLS, which does nothing
to address the GRS.  *sigh*

(Note to vendor gurus - isn't it interesting how an increasing number of
Internet "greybeards" - sorry, Sean, I don't recall you being all that grey,
or having a beard for that matter - continue to concern themselves with GRS
stability and scaling while at the same time puking on all the MPLS hype?)

> If the size or the dynamicism of the global routing system grows
> for a sustained period faster than the price/performance curve of
> EITHER memory OR processing power, the Internet will FAIL again.
<-snip->
> So, Moore's Law, or more specifically the underlying curve
> which tracks the growth of useful computational power, is
> exactly what we should compare with the global routing system's
> growth curve.

This is only the case if the algorithmic complexity is constant.  Reducing
the algorithmic complexity from O(N^2) to O(N), for example, turns quadratic
growth of the data-processing requirements into linear growth.  Going from
O(N) to O(logN) makes the requirements sublinear.  An example of this would
be the introduction of IS-IS mesh groups.
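
To make the point concrete, here is a rough, purely illustrative sketch
(the router counts are invented; only the shape of the curves matters) of
flooding work with and without mesh groups:

    def full_mesh_floods(n):
        # In a full mesh every router re-floods each LSP to every other
        # neighbor, so roughly n*(n-1) copies move per LSP -- O(N^2).
        return n * (n - 1)

    def mesh_group_floods(n):
        # With IS-IS mesh groups the redundant re-flooding over the mesh
        # is suppressed, leaving on the order of n copies per LSP -- O(N).
        return n

    for n in (50, 100, 200, 400):
        print("%4d routers: %6d vs %4d LSP copies"
              % (n, full_mesh_floods(n), mesh_group_floods(n)))

A faster CPU moves you down the left-hand column a notch; fixing the
algorithm moves you to the right-hand column entirely.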

> 
> Note that when Moore is doing better than the Internet,
> it allows for either cheaper supply of dynamic connectivity,
> or it allows for the deployment of more complex handling
> of the global NLRI.
> 
> The major problem, as you have pointed out, is that processing
> requirement is often bursty, such as when everyone is trying
> to do a dynamic recovery from a router crash or major line fault.
> We could still use 68030s in our core routers, it's just that it'd
> take alot longer than it used to perform a global partition repair,
> which means your TCP sessions or your patience will probably time out
> alot more frequently.

Are you kidding?  My patience times out during the restoration of a single
failed BGP session... ;)

> | I think that what we need to do is have a fourth group, call them
> | Internet Engineers for lack of a better word, come in and determine
> | what the sign should read.
> 
> Structures built according to best known engineering practices
> still fall down from time to time.  That's the problem in anticipating
> unforeseen failures.  Consider yourself lucky that you haven't had
> to experience a multi-day degredation (or complete failure!) of
> service due to resource constraints.   And that you haven't 
> run into a sign that says: "please note: if you try to have an
> automatic partition repair in the event this path towards {AS SET}
> fails, your local routing system will destabilize".

Since I wasn't there during your problems in 1994 and 1996, I ask you, "Did
you not know that you were approaching a resource constraint?"  With regard
to memory, we all know that we can triple or even quadruple the size of the
table, today, with headroom remaining.  The update frequency ceiling is less
tractable, due to implementation dependence, but a correlation between table
size and update frequency would certainly be of use in approximating it.  I
think the table size issue is solvable.  Processing the table is where we
may be hurting, which we are attempting to address by constraining the
update frequency.
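
As a purely hypothetical back-of-envelope (every figure below is an
assumption for illustration, not a measurement), the kind of headroom
arithmetic I have in mind looks something like this:

    prefixes       = 100000   # assumed global table size, circa 2001
    paths_per_pfx  = 4        # assumed average BGP paths kept per prefix
    bytes_per_path = 250      # assumed per-path RIB overhead -- a guess
    dram_mb        = 512      # assumed route-processor memory

    rib_mb = prefixes * paths_per_pfx * bytes_per_path / 2.0 ** 20
    print("table eats ~%.0f MB, roughly %.1fx headroom in %d MB of DRAM"
          % (rib_mb, dram_mb / rib_mb, dram_mb))

Swap in your own platform's numbers; the point is that the memory side of
the problem can at least be estimated, while the update-processing ceiling
cannot be pinned down nearly as easily.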

> | Finally, we have a sixth group, call them the IETF, come in
> | and invent a flying car that doesn't need the bridge at all.  
> 
> As Randy (with his IETF hat) says: "send code".  
> 

I wish I could.  But alas, I am too busy counseling my routers, ensuring
that they are happy, trying to keep them stable.  They are temperamental
little b*st*rds, I tell ya.

  -chris

> 	Sean.
> 


