outages, quality monitoring, trouble tickets, etc

Alan Hannan alan at gi.net
Thu Nov 23 22:10:06 UTC 1995


.........  Sean Donelan is rumored to have said:
] 
] >From: Scott Huddle <huddle at mci.net>
] > I consider this list a place for ISPs to discuss general policy and
] > planning issues that affect all of us.  It is a very inappropriate
] > place to discuss problems with a specific provider.

  Scott, if a certain provider *cough you_know_who* is causing our
  connectivity to go to hell through their lame non aggregation
  policies (now fixed) then where else would the issue be discussed?

] As a general policy I wish all network providers would implement
] at a minimum a network notification list.  Bonus points for a network
] status WWW page, NetNews, FAX, and PR Newswire distribution.

  Hmm, I wonder if the Trib' would be interested in knowing when the
  DS3 from Pensaulen is down.....

] Who cares if you connect to three (plus one) NAPs.  In the post-NSFNET
] era, Internet-wide reliability requires Internet-wide information.
] Reporting network reliability problems is more important than how
] many NAPs your network connects.

  At a cursory glance I would agree with you.  However, let us
  delve into this a bit deeper.  Donning my idiot hat may I point out
  that the _most_ important thing is network reliability -Period-.

  While I agree w/ you that accountability is important I can't
  agree that a simple outage list is terribly useful.  With all
  due respect to the Sprint folx, their lists are often vague and
  noninformative.  As of late the MCI tickets have been more and
  more coming and less and less useful.

  Back to accountability, (with kindest respect) Sean, you haven't a
  lamb's foot to stand on when Barrnet isn't looking into a problem.  In
  my opinion this is an issue that needs to be brought to fruition.
  Either develop some world policy of Internet Connectivity or
  perhaps we should all realize that the only person we can hold
  accountable is Mr. Upstream.  There are a few ways I can think of
  this working, in (my opinion of the) order of their potential for
  success:

	o Sprint, MCI, ANS jointly fund an Inet trouble tracking NOC

	o The Federal Government's FCC encompasses USA's Inet
	  Traffic as a medium

	o NSPs voluntarily subscribe to a policy of notification of
	  problems to a global mail list.

  Your page at DRA is quite good, however the consensus among
  upper management (not just at our site) is "Why should other
  people know when we're broke?".  And the sad thing is, I am
  tempted to agree with them.

  If you call our NOC and you ask about a connectivity issue, you
  will get a straight answer.  Perhaps not from the first person you
  get, maybe not the second, but my people will escalate it until
  you do.

  The fact that we don't advertise this does not detract from the
  quality of the information, only the convenience.

] >Has anybody else noticed how hard it is to get trouble tickets these
] >days?  Once upon a time, I just called the NSF NOC, and got a report to
] >them in real time, so the problem could be fixed quickly.  Nowadays,
] >NOCs seem to want you to send email with 24 or 48 hour turnaround, or go
] >through 2 layers of service representatives.  Pretty hard to send email
] >to them when their link is down, or go through "regular" support in the
] >middle of the night!

  I don't know how all the other NSPs work, but if there is ever an
  issue wrt connectivity or systems we HAVE a trouble ticket and we
  WILL provide it on request.  With kindest respect, I understand
  your desire to get it "on demand" but with a bit more work you can
  get it from our NOC.

] Welcome to the new and improved Internet.  More clueless people call
] NOCs these days (is it plugged in?) 

  You can't imagine how humorous this is.... :)  I truly feel sorry
  for the poor chaps at INSC....

] so more caller screening is done.

  To a point, but if the person on our end of the phone doesn't know
  the answer, they aren't allowed to just say so; they escalate the
  issue until it's resolved.  "I don't know" has to be followed by a
  promise here.

  Is this not common?

] NOC-to-NOC communication has been a long standing Internet problem.  

  Hmm, I'm not sure I would terribly agree.  When MCI or Sprint has
  a problem, we have not had any latency issue getting to them.
  Likewise w/ wacky issues causing us to get with Sura, Barr, Cerf,
  Westnet, etc...

] no common conventions.  Even though it's out of date, I still keep my
] Internet Manager's Phonebook published by BBN in 1990.

  Sounds like a good market... :)

] In the meantime, keep a stack of business cards and a special rolodex,
] with the magic names and telephone numbers that get you directly to
] someone who can understand (and maybe even fix) the problem.  Interestingly
] enough, the people usually don't change; but the employers do.
							^
                                     You have openings? ;-)

  Enter the "Backbone Cabal".  I can call you when I need to know
  what's up w/ DRA.  If apropos, you shoot me to the less clueful
  person.  You call us, ditto.  I've got the same folx at ANS,
  Sprint, MCI, etc..  That's why we're important, we know who can 
  do what and occasionally how to find them.

  Do you really want outage and downtime on public record, or do you
  want easier access to clueful folx?

  -alan


