Never push the Big Red Button (New York City subway failure)

Tom Beecher beecher at beecher.cc
Wed Sep 15 18:35:43 UTC 2021


>
> If the generators are "emergency power", and you need to switch back to
> "utility power", obviously the way to do this must be the big red button,
> clearly marked as "EMERGENCY POWER OFF", no?!
>

The owner of my previous company did the same thing to us many years ago
because there was a small smudge on the placard between POWER and OFF that
he interpreted as a dash.

He was never happy with the custom sign I hung after that, REVENUE
REDUCTION SWITCH. But he never tried to be helpful after that, so
mission accomplished.

On Fri, Sep 10, 2021 at 4:35 PM Warren Kumari <warren at kumari.net> wrote:

>
>
> On Fri, Sep 10, 2021 at 4:21 PM Baldur Norddahl <baldur.norddahl at gmail.com>
> wrote:
>
>> A nearby datacenter once had a delayed power loss because someone hit the
>> switch to transfer from city power to generator power and then failed to
>> notice. The power went out the day after when there was no fuel left.
>>
>
> :-)
>
> A story, told to me by a friend...
>
> The utility let them know that they were going to be doing some
> maintenance work in the area. No impact expected, but out of an abundance
> of caution, they transfer over to generators. After the utility lets them
> know that the maintenance work is all finished, they want to switch back.
> If the generators are "emergency power", and you need to switch back to
> "utility power", obviously the way to do this must be the big red button,
> clearly marked as "EMERGENCY POWER OFF", no?!
>
> I suspect it is apocryphal, but it's still entertaining,
> W
>
>
>
>>
>> On Fri, Sep 10, 2021 at 9:24 PM Matthew Huff <mhuff at ox.com> wrote:
>>
>>> Since we are telling power horror stories…
>>>
>>>
>>>
>>>
>>>
>>> How about the call from the night operator that arrived at 10:00pm
>>> asking “Is there any reason there is no power in the data center?”
>>>
>>>
>>>
>>> Turns out someone had plugged in a new high end workgroup laser printer
>>> to the outside wall of the datacenter. The power receptacle was wired into
>>> the data center’s UPS and completely smoked the UPS. Luckily the static
>>> transfer switch worked, but the three mainframes weren’t happy…
>>>
>>>
>>>
>>>
>>>
>>> Or
>>>
>>>
>>>
>>> Our building had a major ground fault issue that took years to find and
>>> resolve. We got hit with lightning that caused the mainframe to fault and
>>> recycle…and two minutes in, we got hit by lightning again. When the system
>>> failed to start, we called IBM support. When we explained what happened
>>> there was a very long pause…then some mumbling off phone, then the manager
>>> got on the line and said someone would be flying out and be onsite within
>>> 12 hours. We were down for 3 days, and got fined $250,000 by the insurance
>>> regulators since we couldn’t pay claims.
>>>
>>>
>>>
>>> *Matthew Huff* | Director of Technical Operations | OTA Management LLC
>>>
>>>
>>>
>>> *Office: 914-460-4039*
>>>
>>> *mhuff at ox.com <mhuff at ox.com> | **www.ox.com <http://www.ox.com>*
>>>
>>>
>>> *...........................................................................................................................................*
>>>
>>>
>>>
>>> *From:* Chris Kane <ccie14430 at gmail.com>
>>> *Sent:* Friday, September 10, 2021 3:16 PM
>>> *To:* Christopher Morrow <morrowc.lists at gmail.com>
>>> *Cc:* Matthew Huff <mhuff at ox.com>; nanog at nanog.org
>>> *Subject:* Re: Never push the Big Red Button (New York City subway
>>> failure)
>>>
>>>
>>>
>>> True EPO story; maintenance crew carrying new drywall into the data
>>> center backed into the EPO that didn't have a cover on it. One of the most
>>> eerie sounds in networking...a completely silent data center.
>>>
>>>
>>>
>>> -chris
>>>
>>>
>>>
>>> On Fri, Sep 10, 2021 at 2:48 PM Christopher Morrow <
>>> morrowc.lists at gmail.com> wrote:
>>>
>>>
>>>
>>>
>>>
>>> On Fri, Sep 10, 2021 at 1:49 PM Matthew Huff <mhuff at ox.com> wrote:
>>>
>>> Reminds me of something that happened about 25 years ago when an
>>> elementary school visited the data center of the insurance company where I
>>> worked. One of our operators strategically positioned himself between the
>>> kids and the mainframe, leaned back and hit its EPO button.
>>>
>>>
>>>
>>> Or when your building engineering team cuts themselves a new key for the
>>> 'main breaker' for the facility... and tests it at 2pm on a Tuesday.
>>>
>>> Or when that same team cuts a second key (gotta have 2 keys!) and tests
>>> that key on the same 'main breaker' ... at 2pm on the following Tuesday.
>>>
>>>
>>>
>>> <quadruple face palm>
>>>
>>>
>>>
>>> not fakenews, a real story from a large building full of gov't employees
>>> and computers and all manner of 'critical infrastructure' for the agency
>>> occupying said building.
>>>
>>>
>>>
>>> Matthew Huff | Director of Technical Operations | OTA Management LLC
>>>
>>> Office: 914-460-4039
>>> mhuff at ox.com | www.ox.com
>>>
>>> ...........................................................................................................................................
>>>
>>> -----Original Message-----
>>> From: NANOG <nanog-bounces+mhuff=ox.com at nanog.org> On Behalf Of Sean Donelan
>>> Sent: Friday, September 10, 2021 12:38 PM
>>> To: nanog at nanog.org
>>> Subject: Never push the Big Red Button (New York City subway failure)
>>>
>>> NEW YORK CITY TRANSIT RAIL CONTROL CENTER POWER
>>> OUTAGE ISSUE ON AUGUST 29, 2021
>>> Key Findings
>>> September 8, 2021
>>>
>>>
>>>
>>> https://www.governor.ny.gov/sites/default/files/2021-09/WSP_Key_Findings_Summary-for_release.pdf
>>>
>>> Key Findings
>>> [...]
>>>
>>> 3. Based on the electrical equipment log readings and the manufacturer’s
>>> official assessment, it was determined that the most likely cause of RCC
>>> shutdown was the “Emergency Power Off” button being manually activated.
>>>
>>> Secondary Findings
>>>
>>> 1. The “Emergency Power Off” button did not have a protective cover at the
>>> time of the shutdown or the following WSP investigation.
>>>
>>> [...]
>>> Mitigation Steps
>>>
>>> 1. Set up the electrical equipment Control and Communication systems
>>> properly to stay active so that personnel can monitor RCC electrical
>>> system operations.
>>>
>>> [...]
>>>
>>>
>>>
>>>
>>> --
>>>
>>> Chris Kane
>>>
>>
>
> --
> The computing scientist’s main challenge is not to get confused by the
> complexities of his own making.
>   -- E. W. Dijkstra
>