"They all suck!" Re: UPS failure modes (was: fire at NAC)

Alex Rubenstein alex at nac.net
Thu May 29 20:39:56 UTC 2003




> UPSes (and UPS batteries) do fail, sometimes in catastrophic ways.  I
> would not design any critical system on the assumption that any particular
> component won't fail.  High availability is about designing for failure.
> Sometimes there is a long time between failures, other times they occur
> early and often.  The most annoying thing about UPSes is they fail at
> exactly the time they are needed most.

Except that:

Even where 'high availability' is designed in, if one of the units fails
in a way that causes a fire and an FM200 dump, either the FM200 system
will still trigger an EPO, or the fire department will.

So the second 'highly available' unit will generally not prevent you from
dropping the critical load; instead, it will help you get back online
quicker.

An external make-before-break maintenance bypass, which is much cheaper
and easier to implement, will accomplish the same thing.

I've heard many a story of the paralleling gear causing the problem in the
first place, as well...



-- Alex Rubenstein, AR97, K2AHR, alex at nac.net, latency, Al Reuben --
--    Net Access Corporation, 800-NET-ME-36, http://www.nac.net   --



