San Francisco Power Outage

Stephen Wilcox steve.wilcox at packetrade.com
Wed Jul 25 11:04:17 UTC 2007


On Tue, Jul 24, 2007 at 11:57:37PM +0000, Paul Vixie wrote:
> 
> sethm at rollernet.us (Seth Mattinen) writes:
> 
> > I have a question: does anyone seriously accept "oh, power trouble" as a 
> > reason your servers went offline? Where's the generators? UPS? Testing 
> > said combination of UPS and generators? What if it was important? I 
> > honestly find it hard to believe anyone runs a facility like that and 
> > people actually *pay* for it.
> > 
> > If you do accept this is a good reason for failure, why?
> 
> sometimes the problem is in the redundancy gear itself.  PAIX lost power
> twice during its first five years of operation, and both times it was due
> to faulty GFI in the UPS+redundancy gear.  which had passed testing during
> construction and subsequently, but eventually some component just wore out.

I had an issue with exactly that 7 or 8 years ago at Via Networks. The switchover gear shorted and died horrifically, leading to an outage that lasted well through the night (something like 16 hours in total). As it was a Friday evening, it was difficult to get people on site promptly.

The lesson learned was 'the big switch': a huge thing that took the weight of two adults to move, but it did mean that, should something similar occur, we could manually transfer the whole building's power directly to the generator.

I doubt such a beast would scale to the power loads of a large datacentre, but then those are generally not on a single grid/UPS feed.

Steve


