links on the blink (fwd)

Curtis Villamizar curtis at ans.net
Wed Nov 8 21:14:53 UTC 1995


In message <Pine.LNX.3.91.951107212422.18960D-100000 at okjunc.junction.net>, Michael Dillon writes:
> 
> So in order to state what percentage of loss is acceptable and be 
> unambiguously understood, you need to specify the time element. Of 
> course, I am also guilty of not explicitly stating this time element.

The 10^-4 and 10^-5 figures are determined by a test involving a bit
under 10^5 packets per run.  <g>.  The test is generally run many
times during circuit acceptance, and it is also run on circuits that
are suspected of trouble.  Since the traffic source is pps limited,
the test can be run on a live, lightly loaded DS3.
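
For concreteness, here is a minimal Python sketch of the arithmetic
behind such a run: compare sent and received counts and grade the
result against the 10^-4 and 10^-5 thresholds.  The packet count and
function names are illustrative placeholders, not the actual test
tool.  Note that at 10^-5 a run of roughly 90,000 packets only
expects about one lost packet, so a single run says little about a
10^-5 rate.

    # Hypothetical sketch of the loss-rate arithmetic behind a circuit
    # acceptance run: just under 10^5 packets are sent at a fixed pps,
    # the received count is compared, and the run is graded against
    # the 10^-4 / 10^-5 thresholds discussed above.

    def grade_run(sent, received, thresholds=(1e-4, 1e-5)):
        """Return the loss rate and which thresholds (if any) it exceeds."""
        lost = sent - received
        rate = lost / float(sent)
        exceeded = [t for t in thresholds if rate > t]
        return rate, exceeded

    if __name__ == "__main__":
        sent = 90000                  # "a bit under 10^5 packets per run"
        for received in (90000, 89999, 89990, 89900):
            rate, exceeded = grade_run(sent, received)
            print("lost %5d of %d  rate=%.1e  exceeds=%s"
                  % (sent - received, sent, rate, exceeded))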

> To get a REASONABLE standard of packet loss you have to qualify your 
> numbers by saying that a loss rate of .001% 23 hours out of 24 is 
> desirable but a loss rate of 10% 1 hour out of 24 is acceptable. This 
> recognizes the reality of today's global Internet which is not anywhere 
> near fully meshed and which is experiencing sustained surges of growth. 
> Just like in a race condition, sometimes the NSP's will fall behind due 
> to line failures and equipment failures and new equipment shipping 
> failures and so on.

We also allow a very small number of SES (severely errored seconds).
I think it is 60 or 90 SES per day, with a lower threshold on any 15
minute interval.  This is reported in the DS3 MIB.  One hour of loss
has never been acceptable.
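
As a rough illustration, a Python sketch of that kind of SES check,
assuming per-15-minute SES counts of the sort the DS3 MIB keeps (96
intervals per day).  The daily limit of 90 is taken from the figure
above; the per-interval limit is a made-up placeholder, since the
text only says it is lower.

    DAILY_SES_LIMIT = 90          # "60 or 90 SES per day"
    INTERVAL_SES_LIMIT = 10       # hypothetical lower per-interval limit

    def ses_violations(interval_ses):
        """interval_ses: list of SES counts, one per 15 minute interval."""
        problems = []
        total = sum(interval_ses)
        if total > DAILY_SES_LIMIT:
            problems.append("daily SES total %d > %d"
                            % (total, DAILY_SES_LIMIT))
        for i, ses in enumerate(interval_ses):
            if ses > INTERVAL_SES_LIMIT:
                problems.append("interval %d: %d SES > %d"
                                % (i, ses, INTERVAL_SES_LIMIT))
        return problems

    if __name__ == "__main__":
        day = [0] * 96
        day[30] = 25              # one bad 15 minute interval
        for p in ses_violations(day):
            print(p)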

The NSS routers are pps limited but rock solid below a certain pps
ceiling.  We have striven to work around this by arranging our
topology to avoid exceeding the pps limits until we can replace these
routers.  The goal here is to keep below 10^-4 packet loss over 15
minute periods, including circuit or FDDI congestion within our core.
Only tail circuits to customers are allowed to congest (as long as
that is all the customer is willing to pay for, for their attachment).
Lately we have been having difficulty with the pps limits, but
nowhere near 10% loss even briefly.  At certain hot spots we have
occasionally set up temporary shell programs checking error rates on
a 1 second interval, saving intervals with a high error rate.  We
will probably have to adjust our topology again to distribute traffic
differently, and we have ordered circuits specifically to avoid loss.
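
Something along these lines (sketched here in Python rather than
shell, with a fabricated counter source standing in for real
interface statistics) is what such a 1 second check might look like:

    # Loose reconstruction of the temporary check described above:
    # poll packet and error counters once a second and save only the
    # seconds whose error rate crosses a threshold.  read_counters()
    # just fabricates numbers; in practice it would parse interface
    # statistics, which is outside this sketch.

    import random
    import time

    LOSS_THRESHOLD = 1e-4         # the 10^-4 goal mentioned above

    def read_counters():
        """Stand-in for cumulative (packets, errors) from an interface."""
        read_counters.pkts += random.randint(9000, 11000)
        read_counters.errs += random.choice([0, 0, 0, 0, 5])
        return read_counters.pkts, read_counters.errs
    read_counters.pkts = 0
    read_counters.errs = 0

    def watch(seconds=10, logfile="high_error_seconds.log"):
        last_p, last_e = read_counters()
        with open(logfile, "a") as log:
            for _ in range(seconds):
                time.sleep(1)
                p, e = read_counters()
                dp, de = p - last_p, e - last_e
                last_p, last_e = p, e
                if dp and de / float(dp) > LOSS_THRESHOLD:
                    log.write("%s  %d errs / %d pkts\n"
                              % (time.ctime(), de, dp))

    if __name__ == "__main__":
        watch(seconds=5)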

The point is that not everyone accepts high loss rates as "normal".

> Michael Dillon

Curtis


