MCI and SprintLink are partitioned (fwd)

Curtis Villamizar curtis at ans.net
Wed Oct 4 18:37:42 UTC 1995


In message <199510041535.IAA23092 at upeksa.sdsc.edu>, Hans-Werner Braun writes:
> 
>  . are all three (four?) NAPs really being used (I know they are
>    there, but despite repeated requests to at least one NAP service
>    provider I appear to be unable to get an answer). I do know that the
>    NY NAP is heavily used, including as my traffic to the Bay area
>    sites I need access to traverses it (modulo all the losses in
>    Sprintlink for at least weeks (reported to and confirmed by the
>    regional network that serves SDSC, though from rumors I am hearing
>    Sprintlink is rather not the exception, and many natives in the
>    community are starting to get restless)).

We primarily use MaeEast and the Sprint NAP with backup through E144
(FixE), and soon MaeWest.  We don't connect to AADS and only use
PacBell for customers not reachable by any other means.

>  . Is there any evidence that the NAPs are really backing each other
>    up? Did someone test and document it, e.g., with a few "test" networks
>    in a bunch of regional networks? What are the time delays for a
>    switch? Does someone have consecutive traceroute outputs where a
>    switch among the NAPs really happened?

Yes there is.  The NAPs can back each other up, but traffic can be a
real problem if MaeEast goes down.  Since the gigaswitch was added, the
Sprint NAP has become much more viable as a backup, and MaeWest is
promising since they too may go with switched FDDI.
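
For anyone wanting to document a switch from consecutive traceroutes,
here is a minimal sketch of the comparison.  It assumes you already
have the hop addresses from each run and a hand-maintained table
mapping hop addresses to interchange names.  The function names, the
table, and the addresses (documentation prefixes) are all hypothetical,
not anything we run.

```python
def exchanges_on_path(hops, exchange_of):
    """Return the interchange names seen along a path, in order, no repeats."""
    seen = []
    for hop in hops:
        name = exchange_of.get(hop)
        if name is not None and name not in seen:
            seen.append(name)
    return seen


def find_switches(samples, exchange_of):
    """Report changes of interchange between consecutive traceroute samples.

    samples is a list of (timestamp_seconds, hops) in time order.
    Returns a list of (previous_ts, ts, old_exchanges, new_exchanges).
    """
    switches = []
    prev = None
    for ts, hops in samples:
        current = exchanges_on_path(hops, exchange_of)
        if prev is not None and current != prev[1]:
            switches.append((prev[0], ts, prev[1], current))
        prev = (ts, current)
    return switches


# Hypothetical hop-to-interchange table and three samples one minute apart.
exchange_of = {"192.0.2.1": "MaeEast", "192.0.2.2": "Sprint NAP"}
samples = [
    (0, ["198.51.100.1", "192.0.2.1", "203.0.113.9"]),
    (60, ["198.51.100.1", "192.0.2.1", "203.0.113.9"]),
    (120, ["198.51.100.1", "192.0.2.2", "203.0.113.9"]),
]
switches = find_switches(samples, exchange_of)
```

The gap between the two timestamps in each reported switch only bounds
the switch delay from above, so the sampling interval sets the
resolution.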

>  . do we have some regular examples from *any* site A initiating a
>    connection from A to B, A to C, and A to D, where the three
>    verifiably (via traceroute, I guess) traverse different NAPs
>    (and hopefully only one each)?

There are tons of examples.  If the load wasn't split, we'd drown in
the traffic load at MaeEast.

>  . Are there routing stability reports accessible online from the RA
>    (or whoever else feels responsible for this) that graph fluctuations
>    at the NAPs, including correlation among them? What are the quality
>    metrics for routing stability?

We have very reliable statistics on the peering session stability with
our peers at every interchange.  We also have some very unreliable
data (sorry folks, the data really isn't very good) on prefix
stability.  On a bad day (a few times a month) we might have a total
disconnect time on a given peer of 5-15 minutes over a 24 hour period.
This is the worst single peer, not the NAP as a whole.  We
occasionally (a few times a month) see the entire set of peers drop at
MaeEast, we think due to route flap.  The normal case is many days
without losing a peering session *anywhere*, interrupted by the loss
of a single peer or a small number of peers lasting from a few seconds
to a few minutes, followed by another few days of uninterrupted peering.
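
To make the "total disconnect time on a given peer over a 24 hour
period" figure concrete, here is a rough sketch of the arithmetic.  It
assumes the session drops are available as (peer, down, up) records in
seconds; the record format, function names, and peer names are
hypothetical and are not our actual collection tooling.

```python
from collections import defaultdict


def downtime_per_peer(outages, window_start, window_end):
    """Sum outage seconds per peer, clipped to [window_start, window_end)."""
    totals = defaultdict(float)
    for peer, down, up in outages:
        start = max(down, window_start)
        end = min(up, window_end)
        if end > start:
            totals[peer] += end - start
    return dict(totals)


def worst_peer(totals):
    """Return (peer, seconds) for the peer with the most downtime, or None."""
    if not totals:
        return None
    return max(totals.items(), key=lambda item: item[1])


# Hypothetical outage records for one day (86400 seconds).
outages = [
    ("peer-a", 100, 400),
    ("peer-a", 1000, 1300),
    ("peer-b", 86000, 87000),  # runs past midnight, so it is clipped
]
totals = downtime_per_peer(outages, 0, 86400)
worst = worst_peer(totals)
```

With these made-up records the worst single peer is down for 600
seconds (10 minutes), which is the kind of per-peer number quoted above.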

The stability of the prefixes announced by those peers is another
story.  Unfortunately the data collection we have in place has been
somewhat broken for a while now.  The external route flap reporting is
seen as a low priority (not officially supported, you can't get a
lower priority), and I haven't had the time to fix it.

>  . Do all the NAPs provide online statistics?
> 
>  . Are the NAP and RA regular reports to NSF publicly (hopefully via
>    the Web) available?

You have reporting requirements?  Great.

We regularly show summary information on internal routing stability
and external peer stability (I think that is still publicly
available).  The more detailed daily summary and the incident logs are
not made public, though we should be proud of our record, so I was
never able to figure out why.  Perhaps you (NSF) can get a copy for
reference.

>  . Is there any way NANOG can be used to exchange status information
>    about networks, rather than getting comments and rumors second or
>    third hand. I understand that it is painful for a service provider to
>    see problems on their network being posted, but if the alternative is
>    a few bad incidents and rumors spreading that the network is always
>    bad, I'd take a few hits and show I fix things quickly. Even better
>    than posting (e.g., via some mailing list) would be an accessible
>    distributed data base covering all the service providers and
>    accessible via the network. Is someone already working on that?
>    Would not NANOG be *the* forum to cooperate on that?

This would be great, but I can't see it happening.

> I think this is prime NANOG business. Otherwise, whose problems are
> these? Who is or should be taking responsibility? Am I all off base
> here?

We should confirm that there actually was a problem and the problem
duration first.

Curtis


