Power cut if temps are too high
kaze0010 at umn.edu
Tue May 28 00:41:38 UTC 2019
Where granular temperature readings are available to control scripts, it
would also be possible to implement something like the tiers described
below. Adjust thresholds as deemed appropriate for the facility and
equipment, and also for the expected rates of temperature rise. System
performance throttling and/or quiescing may also be ways to reduce load (and
thus cooling requirements and heat buildup rates) during periods of
reduced or completely lost cooling capacity.
1.) Elevated temperature watch at 77 F / 25 C. Send alerts to on-call staff
but take no other action.
2.) Elevated temperature warning at 81.5 F / 27.5 C. Begin performance
throttling and engage other measures to reduce heat buildup to compensate
for insufficient cooling capacity.
3.) Elevated temperature severe warning at 86 F / 30 C. Begin automated
clean system shutdowns.
4.) Critical temperature limit exceeded at 95 F / 35 C. Trigger EPO to
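As a rough illustration only, the tiers above could be mapped to actions in a control script along these lines. The names (Tier, THRESHOLDS_C, ACTIONS, classify) are made up for this sketch, the action strings are just labels rather than calls into any real system, and treating each threshold as "at or above" is an assumption.

```python
from enum import IntEnum


class Tier(IntEnum):
    """Hypothetical tier names for the four levels listed above."""
    NORMAL = 0
    WATCH = 1
    WARNING = 2
    SEVERE = 3
    CRITICAL = 4


# Thresholds in degrees C, checked from highest to lowest.
# Values come from the list above; adjust for the facility.
THRESHOLDS_C = [
    (35.0, Tier.CRITICAL),
    (30.0, Tier.SEVERE),
    (27.5, Tier.WARNING),
    (25.0, Tier.WATCH),
]

# Placeholder action labels; a real script would call site-specific tooling.
ACTIONS = {
    Tier.NORMAL: "no action",
    Tier.WATCH: "alert on-call staff",
    Tier.WARNING: "begin performance throttling",
    Tier.SEVERE: "begin automated clean shutdowns",
    Tier.CRITICAL: "trigger EPO",
}


def classify(temp_c: float) -> Tier:
    """Return the highest tier whose threshold the reading meets or exceeds."""
    for threshold, tier in THRESHOLDS_C:
        if temp_c >= threshold:
            return tier
    return Tier.NORMAL


if __name__ == "__main__":
    for reading in (24.0, 26.0, 28.0, 31.0, 36.0):
        tier = classify(reading)
        print(f"{reading:.1f} C -> {tier.name}: {ACTIONS[tier]}")
```

A real deployment would probably also want some hysteresis or a minimum dwell time before stepping up a tier, so that a single noisy sample does not start shutdowns.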
On sensor redundancy: 3x or higher redundancy allows for voting methods to
be used to rule out potential false readings.
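One simple voting method is to take the median of three or more sensors and flag any sensor that strays too far from it. The sketch below assumes readings in degrees C; the function name voted_reading and the 3-degree disagreement limit are arbitrary choices for illustration, not values from any particular product.

```python
from statistics import median


def voted_reading(readings_c, max_spread_c=3.0):
    """Median-vote across redundant sensors.

    Returns (median reading, indexes of sensors that differ from the
    median by more than max_spread_c). With three or more sensors, a
    single bad sensor cannot move the median to its own false value.
    """
    if len(readings_c) < 3:
        raise ValueError("voting needs at least three sensors")
    mid = median(readings_c)
    outliers = [
        i for i, r in enumerate(readings_c) if abs(r - mid) > max_spread_c
    ]
    return mid, outliers


if __name__ == "__main__":
    # The third sensor has failed high; the vote ignores it and flags it.
    value, suspects = voted_reading([26.0, 26.4, 85.0])
    print(value, suspects)
```

The voted value can then be fed to whatever threshold logic is in use, with the flagged sensor indexes sent along in the alert so someone can check the hardware.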
On series vs parallel wiring: either can be used...what makes most sense
depends on the design of the system being integrated with (basically NC vs
On Mon, May 27, 2019, 13:18 Mel Beckman <mel at beckman.org> wrote:
> We use Intermapper, an SNMP network monitoring system, which supports UNIX
> scripting. Intermapper probes two Weathergoose temperature sensors, and
> calls a script with the values it retrieves. When both sensors exceed a
> certain threshold, the script sends an SNMP relay trip signal to the
> Weathergoosen, which close a pair of dry contacts wired in series to the
> emergency power off contacts for the whole-room UPS.
> We chose to use two sensors and two dry contact relays to protect against
> false trips, and thus false shut downs. Before the trigger temperature is
> reached, the NMS would have sent various escalating alarms to on call
> staffers, who hopefully would intervene before this point. This protection
> is for the worst case scenario where nobody responds and the equipment is
> at risk of damage.
> We could have commanded an orderly shut down to all servers, but decided
> that it would be better to kill the power in the event of a runaway heat
> event than to try to make it through all the disk activity necessary for a
> clean shut down.
> This system has triggered one time, successfully shutting down the data
> center on a holiday weekend when people missed their notifications, and
> undoubtedly saved a lot of hard drives. When we got to the room the
> temperature was over 115°F, but the power was cut at 95°F.
> On May 27, 2019, at 11:01 AM, Dovid Bender <dovid at telecurve.com> wrote:
> Is anyone aware of a device that will cut the power if the room goes above
> X degrees? I am looking for something as a just in case.