Network Operations Luminaries?
riz at boogers.sf.ca.us
Tue Dec 18 18:14:27 UTC 2001
On Mon, Dec 10, 2001 at 11:15:25AM -0800, Clayton Fiske wrote:
> On Mon, Dec 10, 2001 at 10:18:48AM -0700, Joel Baker wrote:
> > I don't have a timeline to know which happened first; 2551 was down, or at
> > least the majority of it was, for something on the order of 48-72 hours.
> > The rendition I heard assigned the moniker because one of the major news
> > outlets said that the mistake which triggered it was "like a misplaced
> > ampersand". It was certainly the one on the wall beside his legacy Chevy's
> > sombrero.
> My oh my, how the versions differ.
> As was recounted to me, the outage was about 19 hours, and was due to
> the semantics of Cisco config mode. Something like:
> router ospf 1234
> redistribute bgp subnets route-map blah
> (everything fine, now let's turn it off...)
> no redistribute bgp subnets route-map blah
> So tell me, does this turn off the redistribution, or just remove the
> route-map... :)
> And this is certainly worth remembering. Had I not known of this, I
> could likely have made the same mistake at some point.
Ye gods. Guess I'd better keep a closer eye on NANOG.
For the record: This version (Clay's) is more or less correct. :)
The "oops" was pretty much *exactly* that... removing a route-map
when intending to remove a redistribution.
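For anyone who wants to see the trap spelled out, here is a toy model in Python. It is only a sketch of the behavior as recounted in this thread, not a real IOS parser: the function names, the `route_map` option key, and the sample prefixes are all made up for illustration, and real behavior may differ between IOS versions. The idea is that a "no" form which names an option strips only that option, so the redistribution stays in place with no filter at all.

```python
# Toy model of the "no redistribute ... route-map" trap described above.
# Everything here is hypothetical illustration, not actual IOS behavior.

def redistribute(config, protocol, **options):
    """Add or update a redistribution statement and its options."""
    config.setdefault(protocol, {}).update(options)


def no_redistribute(config, protocol, *option_names):
    """Toy 'no' semantics: if options are named, strip only those options;
    if none are named, remove the whole redistribution statement."""
    if protocol not in config:
        return
    if option_names:
        for name in option_names:
            config[protocol].pop(name, None)
    else:
        del config[protocol]


def redistributed_routes(config, protocol, routes, route_maps):
    """Return the routes that leak from `protocol` into this process.
    A statement with no route-map applies no filter at all."""
    if protocol not in config:
        return []
    map_name = config[protocol].get("route_map")
    if map_name is None:
        return list(routes)
    allowed = route_maps[map_name]
    return [route for route in routes if route in allowed]


# Sample data (documentation prefixes, chosen arbitrarily).
bgp_routes = ["10.0.0.0/8", "192.0.2.0/24", "198.51.100.0/24"]
route_maps = {"blah": {"192.0.2.0/24"}}

ospf = {}
redistribute(ospf, "bgp", route_map="blah")
filtered = redistributed_routes(ospf, "bgp", bgp_routes, route_maps)

# The "oops": the option is named, so only the route-map goes away.
no_redistribute(ospf, "bgp", "route_map")
leaked = redistributed_routes(ospf, "bgp", bgp_routes, route_maps)

# What was intended: remove the redistribution itself.
no_redistribute(ospf, "bgp")
after_fix = redistributed_routes(ospf, "bgp", bgp_routes, route_maps)
```

In this model `filtered` holds only the one prefix the route-map allows, `leaked` holds every BGP route, and `after_fix` is empty. The middle step is the outage in miniature.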
In my defense, however, we discovered the problem fairly quickly, but
an OSPF bug extant in the IOS 10.whatever that we were running prevented us
from cleaning it up in a timely manner. We basically wound up
partitioning the network and rebooting *every* ospf-speaking device
on it, because they were getting poisoned with gobs of LSA data that was
never getting cleared, and was propagating the bogus information.
Also for the record: I have *no idea* where Bob Metcalfe got that
"ampersand" thing. It did make for an amusing engraving on
a going-away gift I was given some years later:
Jeff Rizzo http://boogers.sf.ca.us/~riz