What frame relay switch is causing MCI/Worldcom such grief?

Vadim Antonov avg at kotovnik.com
Tue Aug 10 00:46:23 UTC 1999


The traffic-engineering reason for L2 "routing" is only valid for
complex-topology networks.  In simple topologies, the penalty for
suboptimal paths effectively cancels gains from spreading traffic around.
Physical fiber plants do have rather simple topologies (the rich topologies
are usually "optical illusions" created by the SONET layer).

From a customer's point of view, the performance of the network is _not_
measured as available bandwidth, but rather as the performance of his
TCP streams, which depends heavily on latency and loss.  Increasing
latency while there's a lossy component in the path (which is increasingly
found not in the backbone but at ingress tail circuits, outside of the
ISP's control) reduces throughput approximately in inverse proportion
to the latency.
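
A rough way to see this inverse dependence is the well-known Mathis et al.
approximation for steady-state TCP throughput, rate ~ C * MSS / (RTT * sqrt(p)).
The sketch below is only illustrative: the function name, the 1460-byte MSS,
the 1% loss rate and the constant C = sqrt(3/2) (no delayed ACKs) are
assumptions chosen for the example, not figures from this thread.

```python
import math

def tcp_throughput_bps(mss_bytes, rtt_s, loss_rate, c=math.sqrt(1.5)):
    """Approximate steady-state TCP throughput in bits per second.

    Uses the Mathis et al. approximation:
        rate ~ c * MSS / (RTT * sqrt(loss_rate))
    c = sqrt(3/2) is the commonly quoted constant when delayed ACKs
    are not used (an assumption of this sketch).
    """
    if rtt_s <= 0:
        raise ValueError("rtt_s must be positive")
    if not 0 < loss_rate < 1:
        raise ValueError("loss_rate must be strictly between 0 and 1")
    return c * mss_bytes * 8 / (rtt_s * math.sqrt(loss_rate))

if __name__ == "__main__":
    # Same lossy tail circuit (1% loss, assumed), different backbone latency.
    for rtt_ms in (20, 40, 80):
        rate = tcp_throughput_bps(1460, rtt_ms / 1000.0, 0.01)
        print(f"RTT {rtt_ms:3d} ms: about {rate / 1e6:.2f} Mbit/s")
```

With the loss rate held fixed, doubling the round-trip time halves the
estimated throughput, which is the effect described above.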

In other words: excluding grossly overloaded circuits, you want the
path with least latency!  This is because your performance is limited
by the tail-circuit (or exchange point) loss _and_ the backbone latency.
MPLS does nothing to help avoid these lossy places (avoiding IXP loss
would require propagation of interior routing information into peer
backbones).

Additionally, suboptimal paths as a rule involve more hops, which
increases latency variance in proportion to the hop count.

Now, no matter how one jumps, most congestion events last only seconds.
Expecting any traffic engineering mechanism to take care of these is
unrealistic.  A useful time scale for traffic engineering is therefore
at least days, which can be perfectly accommodated by capacity planning
in a fixed topology.  At these time scales traffic matrices do not change
rapidly.  In fact, as long as there are more than three backbones, one
can safely assume that most traffic goes from customers (proportionally
to size of their pipes) to the nearest exchange point; and from exchange
points randomly to all customers (again proportionally to their access
pipe sizes).
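
The rule of thumb above can be turned into a crude demand estimate for
capacity planning.  This is a minimal sketch under stated assumptions: the
function name, the flat utilization factor, the equal split of inbound
traffic across exchange points, and the sample customers are all
hypothetical and not taken from any real network.

```python
def estimate_demands(customers, ixps, utilization=0.3):
    """Estimate traffic demands in Mbit/s from access pipe sizes.

    customers:   dict of name -> (access_mbps, nearest_ixp)
    ixps:        list of exchange point names
    utilization: assumed average fraction of each access pipe in use

    Returns (outbound, inbound), two dicts keyed by (source, destination).
    """
    if not ixps:
        raise ValueError("at least one exchange point is required")
    total_access = sum(mbps for mbps, _ in customers.values())

    # Customers send toward their nearest exchange point,
    # in proportion to the size of their pipes.
    outbound = {}
    for name, (mbps, nearest) in customers.items():
        outbound[(name, nearest)] = utilization * mbps

    # Traffic entering at each exchange point is spread over all
    # customers, again in proportion to access pipe size.  Assumption:
    # total inbound equals total outbound and is split evenly over IXPs.
    inbound = {}
    per_ixp_in = utilization * total_access / len(ixps)
    for ixp in ixps:
        for name, (mbps, _) in customers.items():
            inbound[(ixp, name)] = per_ixp_in * mbps / total_access
    return outbound, inbound

if __name__ == "__main__":
    sample = {"cust-a": (100, "east"), "cust-b": (300, "west")}
    out, inc = estimate_demands(sample, ["east", "west"])
    for (src, dst), mbps in sorted({**out, **inc}.items()):
        print(f"{src:>7} -> {dst:<7} {mbps:7.1f} Mbit/s")
```

Summing these demands over the links of a fixed topology gives a first cut
at where capacity has to be added, without any L2 rerouting machinery.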

Backbones which neglect capacity planning because they can "reroute"
traffic at the L2 level simply cheat their customers.  If they _do not_
neglect capacity planning, they do not particularly need the L2 traffic
engineering facilities.

Anyway, the simplest solution (having enough capacity, and physical
topology matching L3 topology) appears to be the sanest way to
build a stable and manageable network.  Raw capacity is getting cheap
fast; engineers aren't.  And there is no magic recipe for writing
complex _and_ reliable software.  The simpler it is, the better it works.

--vadim



