What to expect after a cooling failure

Jay Ashworth jra at baylink.com
Wed Jul 10 04:04:23 UTC 2013

----- Original Message -----
> From: "Erik Levinson" <erik.levinson at uberflip.com>

> For those who have gone through such events in the past, what can one
> expect in terms of long-term impact...should we expect some premature
> component failures? Does anyone have any stats to share?

If the HDDs were spinning while above rated maximum ambient intake temp,
*especially* if they're not *right out front in the intake path* (is
anything not built that way anymore?  Yeah; the back side of 45-drive
Supermicro racks, among other things), you should probably plan on doing
a preemptive replacement cycle, or at the very least, pay *very* close
attention to smartd (the smartmontools daemon), and have a good stock of
pre-trayed replacements.
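
As a rough illustration of what "paying close attention" could look like,
here is a minimal Python sketch that scans text in the attribute-table layout
printed by `smartctl -A` and flags a few counters that commonly move when a
drive starts to go. The WATCHED set, the function name, and the sample text
are assumptions made up for this example, not vendor guidance; tune them to
your own fleet.

```python
# Sketch only: flag drives whose SMART counters look unhealthy.
# WATCHED and SAMPLE are illustrative assumptions, not vendor guidance.

WATCHED = {
    "Reallocated_Sector_Ct",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
}

def suspicious_attributes(smartctl_output):
    """Return {attribute_name: raw_value} for watched attributes with raw > 0."""
    flagged = {}
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows have the columns:
        # ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
        if len(fields) < 10 or not fields[0].isdigit():
            continue  # skip headers, blank lines, anything that isn't a row
        name, raw = fields[1], fields[9]
        if name in WATCHED and raw.isdigit() and int(raw) > 0:
            flagged[name] = int(raw)
    return flagged

# Hypothetical sample in the attribute-table layout (not from a real drive).
SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       8
194 Temperature_Celsius     0x0022   058   041   000    Old_age   Always       -       42 (Min/Max 19/59)
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
"""

print(suspicious_attributes(SAMPLE))
```

In practice you would feed this the captured output for each drive and alert
on any non-empty result; smartd can do similar tracking for you if configured.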

Remember that you may fall in the RAID Hole if you wait for failures,
and hence lose data which isn't backed up anyway -- if more drives in a
RAID group fail *during rebuilds* than the group's redundancy can absorb,
you're essentially screwed.
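
To put rough numbers on the rebuild risk, here is a back-of-envelope Python
sketch. It assumes each surviving drive fails independently with the same
probability during the rebuild window, which is optimistic after a shared
heat event (failures will likely be correlated). The function name and the
probabilities are hypothetical inputs, not measured stats.

```python
# Sketch only: chance that at least one more drive dies during a rebuild.
# Assumes independent, identical per-drive failure odds (an optimistic
# simplification); the example probabilities are made up.

def p_second_failure(drives_left, p_each):
    """Probability that at least one of drives_left drives fails,
    given each fails with probability p_each during the rebuild."""
    return 1.0 - (1.0 - p_each) ** drives_left

# An 8-drive group with one drive already dead leaves 7 survivors.
for p in (0.02, 0.10):
    print(f"p_each={p:.2f} -> {p_second_failure(7, p):.3f}")
```

With 7 survivors, a 2% per-drive chance works out to roughly 13% for the
group, and 10% per drive to roughly 52%, which is why waiting for failures
on heat-stressed drives is a gamble.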

If your RAID groups were properly dispersed across drive build dates, then
this will probably be *slightly* less dangerous, but still.

Also watch bearing-type fans.

-- jra
Jay R. Ashworth                  Baylink                       jra at baylink.com
Designer                     The Things I Think                       RFC 2100
Ashworth & Associates     http://baylink.pitas.com         2000 Land Rover DII
St Petersburg FL USA               #natog                      +1 727 647 1274
