Power cut if temps are too high

Warren Kumari warren at kumari.net
Tue May 28 13:45:41 UTC 2019

I used to work for a small, fairly crappy ISP -- the "datacenter" was
a converted brick garage / loading dock. In order to provide cooling,
they had chipped out a bunch of bricks, and mounted in 8 or so AC
units, all in a line.

We monitored everything with WhatsUp Gold[0] - one (hot) night I'm
on call, and at 3:30AM I get an alert that the environmental sensor on
one of the routers thinks it's too hot. I'm tired and grumpy, and it's
only slightly too hot, so I ack it and go back to bed. A short while
later I get paged again - another router now thinks it is
uncomfortably warm. Still grumpy, so I ack that too, and back to bed.
Sure enough, 20 minutes later, another page.... Fine, I get dressed,
drive over to the location -- and realize that bricks / mortar are
strong in compression, but weak in tension - the AC window units have
been quietly vibrating for many years, and the entire row of bricks
above the AC units has popped out. All the AC units are lying outside
the building on the grass, still running.... :-) I stared at them for
a bit, unsure what to do -- so I turned them off, bumped up the
monitoring levels, and went back to bed... Next day we blocked up the
hole, installed some temporary chillers, and then finally installed
real cooling....

There isn't much point to this story, but I've got a cold, and wanted
to share... :-P

[0]: Wow, I just realized that WUG still exists... huh.

On Tue, May 28, 2019 at 9:13 AM Thomas Bellman <bellman at nsc.liu.se> wrote:
> On 2019-05-27 18:18 +0000, Mel Beckman wrote:
> > Before the trigger temperature is reached, the NMS would have sent
> > various escalating alarms to on call staffers, who hopefully would
> > intervene before this point.
>
> Would they actually have time to react and do something?  In our
> datacenters, we reach our cut-off temperature in about 20 minutes
> if cooling stops.
>
> > This system has triggered one time, successfully shutting down the data
> > center on a holiday weekend when people missed their notifications, and
> > undoubtedly saved a lot of hard drives. When we got to the room the
> > temperature was over 115°, but the power was cut at 95°.
> Presumably that was °F, not °C.
>
> I have heard from people who did *not* have automatic cutting of the
> power at high temperatures.  Their computer room reached 100°C in
> places; some keyboards apparently looked like a certain Salvador Dali
> painting afterwards...  (But I think they had very few actual servers
> or disk drives breaking.)  The reason it didn't get even hotter, was
> that as temperature rose, servers started overheating and shut
> themselves down, thus lowering power dissipation more and more.
> Our system for cutting power at high temperatures is part of the PLC
> monitoring power and temperature in the computer rooms.  It sends a
> signal to the large breakers connecting the power subcentrals (where
> all the 16A fuses are) to the power rail feeding the room.  I believe
> our PLCs are from Schneider Electric, but anyone who delivers PLCs
> for controlling power and cooling in a datacenter should be capable
> of programming their PLCs to do the same.  You just need to remember
> to put it in the specifications when you contract the building. :-)
>         /Bellman

I don't think the execution is relevant when it was obviously a bad
idea in the first place.
This is like putting rabid weasels in your pants, and later expressing
regret at having chosen those particular rabid weasels and that pair
of pants.
