Level3 worldwide emergency upgrade?

Thu Feb 7 01:18:22 UTC 2013

ah - those were the days of glory... :)

On Wed, Feb 06, 2013 at 06:06:39PM -0700, Brett Watson wrote:
> Hell, we used to not have to bother notifying customers of anything, we just fixed the problem. Reminds me of a story I've probably shared in the past.
> 
> 1995, IETF in Dallas. The "big ISP" I worked for at the time got tripped up on a 24-day IS-IS timer bug (maybe all of them at the time did, I don't recall) where all adjacencies reset at once. That's like, entire network down. Working with our engineering team, in the *terminal* lab mind you, and Ravi Chandra (then at Cisco), we reloaded the entire network of routers with new code from Cisco once they'd fixed the bug. I seem to remember this being my first exposure to Tony Li's infamous line, "... Confidence Level: boots in the lab."
> 
> Good times.
> 
> -b
> 
> 
> On Feb 6, 2013, at 5:41 PM, Brandt, Ralph wrote:
> 
> > David. I am on an evening shift and am just now reading this thread.   
> > 
> > I was almost tempted to write an explanation that would have been
> > identical in content to yours, based simply on Level3 doing something and
> > keeping the information close.
> > 
> > Responsible vendors do not try to hide what is being done unless it is
> > an Op Sec issue, and I have never seen Level3 act less than
> > responsibly, so it had to be Op Sec.
> > 
> > When it is that, it is best if the remainder of us sit quietly on the
> > sidelines.
> > 
> > Ralph Brandt
> > 
> > 
> > -----Original Message-----
> > From: Siegel, David [mailto:David.Siegel at Level3.com] 
> > Sent: Wednesday, February 06, 2013 12:01 PM
> > To: 'Ray Wong'; nanog at nanog.org
> > Subject: RE: Level3 worldwide emergency upgrade?
> > 
> > Hi Ray,
> > 
> > This topic reminds me of yesterday's discussion at the conference around
> > getting some BCOPs drafted.  It would be useful to confirm my own view
> > of the BCOP around communicating security issues.  My understanding of
> > the best practice is to limit knowledge distribution of security-related
> > problems both before and after the patches are deployed.  You limit
> > knowledge before the patch is deployed to prevent yourself from being
> > exploited, but you also limit knowledge afterwards in order to limit
> > potential damage to others (customers, competitors...the Internet at
> > large).  You also do not want to announce that you will be deploying a
> > security patch until you have a fix in hand and know when you will
> > deploy it (typically, next available maintenance window unless the cat
> > is out of the bag and danger is real and imminent).
> > 
> > As a service provider, you should stay on top of security alerts from
> > your vendors so that you can make your own decision about what action is
> > required.  I would not recommend relying on service provider maintenance
> > bulletins or public operations mailing lists for obtaining this type of
> > information.  There is some information that can cause more harm than
> > good if it is distributed in the wrong way and information relating to
> > security vulnerabilities definitely falls into that category.
> > 
> > Dave
> > 
> > -----Original Message-----
> > From: Ray Wong [mailto:rayw at rayw.net] 
> > Sent: Wednesday, February 06, 2013 9:16 AM
> > To: nanog at nanog.org
> > Subject: Re: Level3 worldwide emergency upgrade?
> > 
> > 
> > OK, having had that first cup of coffee, I can say perhaps the main
> > reason I was wondering is I've gotten used to Level3 always being on top
> > of things (and admittedly, rarely communicating). They've reached the
> > top by often being a black box of reliability, so it's (perhaps
> > unrealistically) surprising to see them caught by surprise. Anything
> > that pushes them into scramble mode causes me to lose a little sleep
> > anyway. The alternative to what they did seems likely for at least a few
> > providers who'll NOT manage to fix things in time, so I may well be
> > looking at longer outages from other providers, and need to issue
> > guidance to others on what to do if/when other links go down for periods
> > long enough that all the cost-bounding monitoring alarms start to scream
> > even louder.
> > 
> > I was also grumpy at myself for having not noticed advance
> > communication, which I still don't seem to have, though since I
> > outsourced my email to bigG, I've noticed I'm more likely to miss
> > things. Perhaps giving up maintaining that massive set of procmail rules
> > has cost me a bit more edge.
> > 
> > Related, of course, just because you design/run your network to tolerate
> > some issues doesn't mean you can also budget to be under support contract
> > as well. :) Knowing more about the exploit/fix might mean trying to find
> > a way to get free upgrades to some kit to prevent more localized attacks
> > to other types of gear, as well, though in this case it's all about
> > Juniper PR839412 then, so vendor specific, it seems?
> > 
> > There are probably more reasons to wish for more info, too. There's
> > still more of them (exploiters/attackers) than there are those of us
> > trying to keep things running smoothly and transparently, so anything
> > that smells of "OMG new exploit found!" also triggers my desire to share
> > information. The network bad guys share information far more quickly and
> > effectively than we do, it often seems.
> > 
> > -R>
> > 
> > 
> > 
> 
>