Links on the blink - what will/should mci & sprint do?

Mon Nov 27 08:43:12 UTC 1995

[Sorry for being late in responding, I've been on vacation...]

Let me start by pointing out that the global problem here is to
maintain and grow a routing infrastructure at exponential rates and at
"interesting" flap rates.  

Using a level 2 switching infrastructure as a means of bandwidth
control is clearly not a comprehensive solution as it does not provide
the necessary level 3 routing [hopefully this is trivially obvious to
anyone reading this, but...].  Is a level 2 infrastructure a cost
effective means of doing bandwidth control?  Possibly, but the exact
economics vary widely depending on the level of intra-POP bandwidth
necessary and the prices that each ISP is able to obtain.  

Using a level 2 architecture clearly does nothing to avoid the "brick
wall".  This becomes quite clear as we can easily demonstrate brick
wall phenomena at any interconnect, regardless of the router (or make
of router) and the fabric behind it.

The question then focuses on the maintainability of the routing
infrastructure.  We know that we have two limits: perfectly stable
routing, and completely unstable routing.  Neither is interesting.
Perfectly stable routing is simply utopian thinking.  Completely
unstable routing is unusable even given arbitrarily fast technology.
In between, we have an enormous gray area, with no good way to
quantify it: routes that flap, and packets that interact with that
flap.

Clearly if a route flap does not affect any packets, that should not
perturb the infrastructure.  [We're pleased to say that extensive
testing of a 7000 and 7500 points out that in fact, it doesn't.  ;-)]
For the sake of brevity, let's call this "uninteresting flap."

What then is the response of the routing system to interesting flap?  We
can start by characterizing the flap by T_{down}, the period of time
that a route is withdrawn, measured from the "down edge" of the
withdrawal of the route to the "up edge" of the installation of the
replacement.  Subsequent to the "up edge", we have a further
T_{recover}, which is the time that it takes for the infrastructure to
forward at peak rate again.
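
To make the two intervals concrete, here is a minimal sketch in Python.
The class name, field names and timestamps are hypothetical and chosen
only for illustration; nothing here comes from a real measurement.

```python
from dataclasses import dataclass


@dataclass
class Flap:
    """One route flap, described by three timestamps in seconds.

    All names here are hypothetical and exist only for illustration.
    """
    down_edge: float      # route is withdrawn
    up_edge: float        # replacement route is installed
    peak_restored: float  # forwarding is back at peak rate

    @property
    def t_down(self) -> float:
        # T_{down}: from the down edge to the up edge.
        return self.up_edge - self.down_edge

    @property
    def t_recover(self) -> float:
        # T_{recover}: from the up edge until peak-rate forwarding resumes.
        return self.peak_restored - self.up_edge


# Made-up numbers: withdrawn at t=10s, replaced at t=40s, peak rate at t=45s.
flap = Flap(down_edge=10.0, up_edge=40.0, peak_restored=45.0)
print(flap.t_down, flap.t_recover)
```

With these made-up numbers T_{down} is 30 seconds and T_{recover} is 5
seconds.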

Using this, we can begin to characterize the behaviors that we'd like.
Clearly, during T_{down}, we have a problem: we have no route and (as
the interesting part of the problem is in the default-free portion of
the net), we have no other shorter prefix for it.  One can then either
drop packets or forward them along the old route anyhow.  The Internet
philosophy has always been to drop packets as quickly as possible, as
close to the source as possible.  The alternative is to forward the
packet anyhow, hoping that the packet is not forwarded into a
sustained forwarding loop.  As the routing protocols aren't truly
architected to protect against such a forwarding loop, the latter
behavior seems risky.  The implication is that during T_{down}, the
infrastructure will drop packets from all sources to the destination.
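
The drop behavior during T_{down} can be sketched with a toy
longest-prefix-match lookup in Python.  This is only an illustration
under stated assumptions: the table is a plain dictionary, the lookup()
helper and next-hop names are hypothetical, the prefixes are
documentation addresses, and no real router does a linear scan like
this.

```python
import ipaddress


def lookup(table, dst):
    """Return the next hop for dst by longest prefix match, or None.

    None means no covering prefix exists, so the packet is dropped.
    """
    addr = ipaddress.ip_address(dst)
    best = None
    for prefix, next_hop in table.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, next_hop)
    return best[1] if best else None


# A default-free table: there is no 0.0.0.0/0 entry to fall back on.
table = {
    "192.0.2.0/24": "peer-a",
    "198.51.100.0/24": "peer-b",
}

before = lookup(table, "192.0.2.7")   # route present

del table["192.0.2.0/24"]             # the "down edge": route withdrawn

during = lookup(table, "192.0.2.7")   # no shorter prefix covers it
other = lookup(table, "198.51.100.9") # unrelated destination is unaffected
```

After the withdrawal the lookup finds nothing for the flapping prefix,
while traffic to the other prefix still has a next hop.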

Next, consider T_{recover}.  As no one has invented FTL hardware (yet
;-), T_{recover} is non-zero on all interesting architectures that I
know of.  Further, during T_{recover}, packet loss should also be
expected.  This includes the case of a router which fully distributes
the routing table, as the distribution itself takes finite time.  What
then are interesting values for T_{recover}?  Clearly, \infinity is
unacceptable.  If T_{recover} << T_{down}, it hardly seems relevant.
What then is an acceptable upper bound?  And what is the sensitivity
of this value?  I leave this as an open question for discussion...

Selective packet discard (a new method we're putting in for
prioritizing routing protocol packets over transit packets) will
ensure that T_{recover} is finite, and based on our lab tests, what I
consider to be acceptable.  I should note in fairness that credit for
driving selective discard should be given to both Sprint and ANS: the
former for demonstrating the need in the real world, the latter for
focusing cisco's management on the subject.
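
The general idea of prioritizing routing protocol packets over transit
packets can be sketched in Python as below.  This is not cisco's
implementation, whose details are not given here; the class, its
eviction policy and the packet labels are all assumptions made purely
for illustration.

```python
from collections import deque


class SelectiveDiscardQueue:
    """Toy bounded input queue that favors routing protocol packets.

    Hypothetical policy: when the queue is full, an arriving transit
    packet is discarded, while an arriving routing protocol packet
    evicts the newest queued transit packet instead.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.routing = deque()
        self.transit = deque()

    def __len__(self):
        return len(self.routing) + len(self.transit)

    def enqueue(self, packet, is_routing):
        """Return True if the packet was queued, False if discarded."""
        if len(self) < self.capacity:
            (self.routing if is_routing else self.transit).append(packet)
            return True
        if is_routing and self.transit:
            self.transit.pop()            # make room for the routing packet
            self.routing.append(packet)
            return True
        return False                      # full: discard the arrival

    def dequeue(self):
        """Serve routing protocol packets first, then transit packets."""
        if self.routing:
            return self.routing.popleft()
        if self.transit:
            return self.transit.popleft()
        return None


q = SelectiveDiscardQueue(capacity=2)
results = [
    q.enqueue("transit-1", is_routing=False),
    q.enqueue("transit-2", is_routing=False),
    q.enqueue("transit-3", is_routing=False),   # queue full, discarded
    q.enqueue("bgp-update", is_routing=True),   # evicts transit-2
]
served = [q.dequeue(), q.dequeue(), q.dequeue()]
```

Under overload, the routing update still gets through and is served
first, which is the property that keeps T_{recover} bounded in this
toy model.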

I'd also like to correct some misperceptions that have been
distributed:

- Distributing the full routing table is not the only way to achieve
acceptable response to interesting flap.  If someone has a technical
argument suggesting that this is the only true way, then we have
missed hearing it, and we'd really appreciate it if you could tell us
again.  In detail.

- Distributing the full routing table to the SSE is not a logical way
to proceed.  For both obvious and non-obvious reasons, spending
significant resources developing software for the 7000 at this point
makes little sense, and without some convincing argument that it would
move T_{recover} from the unacceptable to the acceptable, is beyond
the bounds of rationality.

- The interesting flap rate introduced by end users can be controlled
by deployment of CIDR.  I trust no one on this list remains to be
convinced.

- Certain statements about communications of problems to cisco
mentioned in this forum have been patently incorrect.  Sending a mail
message to your favorite engineer saying "when will you do XYZ" is not
a reasonable way to ensure that your mission-critical business
feature is being implemented.  Still, if you believe you've attempted
to communicate with us and you don't think we got the message, then we
can only ask that you have patience, increase your retry count, and
retransmit.  We can't necessarily do what you might like, but we can
hear you and acknowledge your concerns.

Finally, I'd like to echo what Dennis said: we're all behind the eight
ball, and we all know it.  Recriminations are much less interesting
than solutions.  What should MCI & Sprint do?  Well, let's just say
that we've privately offered them our opinions, and they don't seem to
object too strenuously.  Assuming that things actually go that way,
well, "You ain't seen nothing yet, baby."

Tony

p.s. I'm not on the mailing list, so please cc me if you wish me to
respond.