InfoWorld Column on Netcom-Cisco Ampersand Collapse
Bob Metcalfe
bob_metcalfe at infoworld.com
Tue Jul 9 15:49:02 UTC 1996
Dear NANOG,
You saw that Craig Huegen has caught me in ANOTHER error.
He's of course right (way below) that:
The Network Works. No Excuses.
is Cisco's slogan, not Netcom's, as I incorrectly wrote in my current
InfoWorld column. Netcom doesn't seem to have a slogan. I stand corrected
AGAIN. A copy of my incorrect InfoWorld column is below FYI. Nothing
slips by NANOG (;->).
Not to get back at Mr. Huegen, but he should note that Cisco is not "cisco"
anymore. Gotcha!
By the way, Mr. Huegen, the well-known fact that the Internet offers no
service guarantees has not, as you've written, escaped me. This well-known
fact is one of those we are working to FIX.
Also, tell us, what was the "original purpose" of the Internet? Not that
it matters much.
Ever your fan and loyal opposition,
/Bob Metcalfe, InfoWorld
---------------------------------
InfoWorld, July 8, 1996 (www.infoworld.com)
Netcom-Cisco outage could foreshadow
much bigger collapses ahead
When a tropical storm grows large enough -- winds exceeding 75
mph -- we call it a hurricane and give it a name. It's time we do
something similar with Internet outages.
Borrowing a threshold used by our Federal Communications
Commission in the reporting of telephone outages, when more than
50,000 people are denied their Internet access for more than an
hour, let's call it an Internet collapse and give it a name. Let's
call the threshold a 50Kx1 collapse, or a 50 kilolapse.
Then we can say that, two weeks ago, the Internet suffered a
400Kx13 collapse, or a 5.2 megalapse. Beginning in the afternoon of
June 18, the 400,000 customers of Netcom On-Line Communication
Services Inc. (http://www.netcom.com) experienced not just the
usual worsening afternoon Internet brownout but lost their Internet
mail and Web access for 13 hours. (See "Netcom service forced into
12-hour shutdown," June 24, page 3.)
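
The naming arithmetic above can be sketched in a few lines of
Python. This is an illustration only, not something from the
column: the function names lapse, is_collapse, and format_lapse are
hypothetical, and the sketch assumes a "lapse" is simply affected
users multiplied by outage hours, which is what the 400Kx13 = 5.2
megalapse figure implies.

```python
def lapse(users, hours):
    """Outage size in person-hours (assumed meaning of the 'lapse' unit)."""
    return users * hours


def is_collapse(users, hours, min_users=50_000, min_hours=1):
    """True when an outage exceeds the column's 50Kx1 threshold.

    The comparisons are strict because the column says "more than
    50,000 people" and "more than an hour".
    """
    return users > min_users and hours > min_hours


def format_lapse(person_hours):
    """Render person-hours with a kilo/mega/giga prefix."""
    for factor, prefix in ((1e9, "giga"), (1e6, "mega"), (1e3, "kilo")):
        if person_hours >= factor:
            return f"{person_hours / factor:g} {prefix}lapse"
    return f"{person_hours:g} lapse"


# Netcom outage as described in the column: 400,000 customers, 13 hours.
netcom = lapse(400_000, 13)
print(is_collapse(400_000, 13), format_lapse(netcom))
```

With the column's figures, the threshold case lapse(50_000, 1)
formats as a 50 kilolapse and the Netcom outage as a 5.2 megalapse.
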
Netcom says that this 5.2 megalapse was triggered by an engineer
incorrectly typing an ampersand into a router made by Cisco Systems
Inc. (http://www.cisco.com). This typo was followed by "a flood of
non-Netcom BGP [Border Gateway Protocol] routes being
introduced into our OSPF [Open Shortest Path First] network
backbone. This led to a chain reaction of routing protocol
fluctuations, which in turn overloaded a majority of the gateway
routers on the Netcom WAN. Our network support staff diagnosed
the problem early and worked through the night rebuilding the
routing tables of our hub and POP routers."
So let's name this the Netcom-Cisco Ampersand Collapse.
Netcom CEO Dave Garrison apologized to his customers on KGO
talk radio in San Francisco. He explained that the collapse was
caused by human error. He admitted that the Ampersand Collapse
had overwhelmed Netcom's telephone support. He promised to meet
with Cisco, maker of most of Netcom's 100 routers, about
preventing future outages.
Interviewed by The Boston Globe, Garrison explained the Internet is
growing rapidly and there is plenty of room for competition. Then he
said, "Internet companies face ruthless competition and don't have
billions to spend on reliability upgrades."
Uh-oh, this despite Netcom's trademarked slogan: The Network
Works. No Excuses.
Now, to err is human, and Internet fogies ask us to accept this
latest megalapse as nothing new, no big deal. But Garrison's upcoming
meeting with Cisco is important. Cisco should continuously improve
the software with which its routers are programmed so that
catastrophic human errors are less likely.
Ed Kozel, Cisco's chief technology officer, writes that "network
routing is quite susceptible to human error... complete flexibility is
driving routing architecture development... in recent years a lot of
work has gone into creating interdomain routing firewalls and
untrusted routing gateway functions, the result being that, in general,
routing misbehavior is usually confined to a specific domain."
So we should be encouraged that the Netcom-Cisco Ampersand
Collapse did not escape Netcom and go Internetwide, this time.
While Netcom and Cisco are at it, they should find a way to make
Internet error messages more informative. Throughout the
Ampersand Collapse, Netcom customers were told that their user
names and passwords were incorrect, their calls were failing, their
network connections were lost, or nothing at all as their starting
session screens hung.
Now why has Netcom not offered each of its 400,000 customers a
refund for the access lost during the megalapse? Let's see, that would
be, say, half a day out of 30, or typically 33 cents each. Seems only
fair.
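
The refund figure can be checked with a short Python sketch. The
column does not state a monthly fee, so the $19.95 used here is an
assumption chosen because it is consistent with "33 cents each";
the function name prorated_refund is likewise hypothetical.

```python
def prorated_refund(monthly_fee, outage_days, days_in_month=30):
    """Refund for the share of the billing month lost to an outage."""
    return monthly_fee * outage_days / days_in_month


ASSUMED_MONTHLY_FEE = 19.95  # assumption: not stated in the column
CUSTOMERS = 400_000          # customer count given in the column

# "Half a day out of 30", as the column puts it.
per_customer = prorated_refund(ASSUMED_MONTHLY_FEE, outage_days=0.5)
total = per_customer * CUSTOMERS

print(f"per customer: ${per_customer:.2f}")
print(f"all customers: ${total:,.0f}")
```

Under that assumed fee the per-customer refund rounds to 33 cents,
matching the column; a different fee scales the result linearly.
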
The Netcom-Cisco Ampersand Collapse and other major outages
should be prominent agenda items at upcoming meetings of Internet
service providers.
Unfortunately, my favorite of such meetings, those of the North
American Network Operators Group (NANOG), are not likely to
take systematic outage analysis seriously. As one NANOG wag put
it, "This is the 'net, people, deal with it."
What's needed is for NANOG to deal with it. Another NANOG
participant minimized the Netcom-Cisco 5.2 megalapse with this
arithmetic: Since the Internet has 60 million users, the Netcom
outage inconvenienced far fewer than 1 percent -- some collapse. He
has a point. There is ample room for much bigger Internet collapses
ahead, maybe eventually some gigalapses. (See what else the NANOG
wags are writing about at
http://www.merit.edu/mail.archives/html/nanog.)
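
The participant's arithmetic, and the scale a gigalapse would
imply, can be worked through in Python. This is a rough sketch
using only the column's round numbers; it assumes, as the 5.2
megalapse figure suggests, that a lapse is users multiplied by
hours, and the variable names are illustrative.

```python
INTERNET_USERS = 60_000_000  # figure quoted in the column
NETCOM_USERS = 400_000       # Netcom customers, per the column
GIGALAPSE = 1_000_000_000    # person-hours, assuming lapse = users x hours

# Share of all Internet users affected by the Netcom outage.
share_percent = NETCOM_USERS / INTERNET_USERS * 100

# Hours every Internet user would have to be offline to reach one gigalapse.
hours_for_gigalapse = GIGALAPSE / INTERNET_USERS

print(f"affected share: {share_percent:.2f}%")
print(f"Internet-wide hours for a gigalapse: {hours_for_gigalapse:.1f}")
```

On these numbers the Netcom outage touched roughly two-thirds of
one percent of users, while a gigalapse would take something like
an Internet-wide outage lasting most of a day.
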
Bob Metcalfe invented Ethernet in 1973 and founded 3Com Corp. in
1979. He receives E-mail at bob_metcalfe at infoworld.com via the
Internet.
Copyright © 1996 by InfoWorld Publishing Company
At 4:15 PM 7/8/96, Craig A. Huegen wrote:
>Date: Mon, 8 Jul 1996 16:14:36 -0700 (PDT)
>From: "Craig A. Huegen" <c-huegen at quad.quadrunner.com>
>To: Michael Dillon <michael at memra.com>
>cc: nanog at merit.edu, bob_metcalfe at infoworld.com
>Subject: Re: Hurricanes redefined!
>In-Reply-To: <Pine.BSI.3.93.960708104725.22916I-100000 at sidhe.memra.com>
>Message-ID: <Pine.QUAD.3.94.960708155947.11988A-100000 at quad.quadrunner.com>
>MIME-Version: 1.0
>Content-Type: TEXT/PLAIN; charset=US-ASCII
>
>On Mon, 8 Jul 1996, Michael Dillon wrote:
>
>==>The official definition of a hurricane is winds in excess of 5 km/hr
>==>for a duration of at least 1 hour. Read all about it at
>==>http://www.infoworld.com/cgi-bin/displayNew.pl?/metcalfe/metcalfe.htm
>
>Interesting as well is Bob's incorrectness once again:
>
> "Uh-oh, this despite Netcom's
> trademarked slogan: The Network
> Works. No Excuses. "
>
>See http://www.cisco.com/public/copyright.html, in which you'll find:
>
>"All rights reserved. No portion of this service may be reproduced in any
>form, or by any means, without prior written permission from Cisco
>Systems, Inc. [...], The Network Works. No Excuses. are service marks;
>[...] of Cisco Systems, Inc.[...]"
>
>Bob also states that:
>
> "Cisco should
> continuously improve the software with
> which its routers are programmed so
> that catastrophic human errors are less
> likely. "
>
>Which gives the connotation that cisco Systems doesn't constantly improve
>IOS; everyone who's worked with cisco's software knows it's constantly
>being improved.
>
>Bob also states the following:
>
> "Unfortunately, my favorite of such
> meetings, those of the North American
> Network Operators Group (NANOG),
> are not likely to take systematic outage
> analysis seriously. As one NANOG
> wag put it, "This is the 'net, people, deal
> with it.""
>
>What Bob fails to mention is that no one has a service-level agreement
>with the Internet. The Internet is designed this way--my network connects
>to your network. It is _NOT_ under control of one body. It's very hard
>to GUARANTEE outages to _anyone_ without monetary value involved. And
>generally, "my network connects to your network" does not have enough
>monetary value to warrant SLA contracts of service. Bob, I challenge you
>to find an Internet Service Provider that gives an END-TO-END service
>level agreement for the Internet. That is, if my web site isn't fast
>enough, you have escalation/remedy procedures. If Joe Blow's sendmail has
>crapped out, you have escalation/remedy procedures. I'll buy you dinner
>if you find one.
>
>Bob once again forgets the original purpose behind the Internet, and he
>apparently has permanently doffed his engineer hat for his non-technical
>businessperson hat long ago.
>
>/cah
______________________________________________
Dr. Robert M. ("Bob") Metcalfe
Executive Correspondent, InfoWorld and
VP Technology, International Data Group
Internet Messages: bob_metcalfe at infoworld.com
Voice Messages: 617-534-1215
Conference Chairman for
ACM97: The Next 50 Years of Computing
San Jose Convention Center
March 1-5, 1997
______________________________________________