Most energy efficient (home) setup

Leo Bicknell bicknell at ufp.org
Mon Apr 16 12:39:34 UTC 2012


In a message written on Sun, Apr 15, 2012 at 09:54:14PM -0400, Luke S. Crawford wrote:
> On my current fleet (well under 100 servers)  single bit errors are so rare
> that if I get one, I schedule that machine for removal from production. 

In a previous life, in a previous time, I worked at a place that
had a bunch of Ciscos with parity RAM.  For the time, these boxes
had a lot of RAM, as they had distributed line cards each with their
own processor memory.

Cisco was rather famous for these parity errors, mostly because of
their stock answer: sunspots.  The answer was in fact largely
correct, but it's just not a great response from a vendor.  They
had a bunch of statistics though, collected from many of these
deployed boxes.

We ran the statistics, and given hundreds of routers, each with
many line cards, the math told us we should expect approximately one
router every 9-10 months to get a single parity error from sunspots
and other random activity (i.e. not a failing RAM module with
hundreds of repeatable errors).  This was, in fact, close to what we
observed.
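
For anyone who wants to redo that kind of math for their own fleet,
here is a rough sketch.  The router count, cards per router, and
per-card error interval below are made-up numbers picked only so the
result lands in the 9-10 month range; they are not Cisco's statistics.
It also assumes errors are independent and arrive at a constant rate.

```python
import math

def fleet_interval_months(routers, cards_per_router, card_interval_months):
    """Expected months between random errors anywhere in the fleet.

    Assumes every card fails independently at the same constant rate,
    so the fleet-wide rate is the per-card rate times the card count.
    """
    total_cards = routers * cards_per_router
    return card_interval_months / total_cards

def prob_at_least_one(months, interval_months):
    """Chance of seeing one or more errors within `months` (Poisson model)."""
    return 1.0 - math.exp(-months / interval_months)

# Hypothetical fleet: 300 routers, 10 line cards each, and one random
# parity error per card every 28,500 months (about 2,375 years).
interval = fleet_interval_months(300, 10, 28500)
print(f"fleet-wide: one error every {interval:.1f} months")
print(f"chance of an error within a year: {prob_at_least_one(12, interval):.0%}")
```

The point is only that a per-card event that looks like "never" turns
into a routine occurrence once it is divided by thousands of cards.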

This experience gave me two takeaways.  First, single bit flips are
rare, but when you have enough boxes, rare shows up often.  It's
much like running petabytes of storage: disks fail every couple of
days because you have so many of them.  At the same time, a home
user might not see a failure (of disk or memory) in their lifetime.

Second, though, if you're running a business, ECC is a must because
the message without it is so bad.  "This was caused by sunspots" is
not a confidence-inspiring response to a customer, no matter how
correct.  "We could have prevented this by spending an extra $50 on
proper RAM for your $1M box" is even worse.

A quick look at Newegg: 4GB DDR3 1333 ECC DIMM, $33.99.  4GB
DDR3 1333 non-ECC DIMM, $21.99.  Savings, $12.  (Yes, I realize the
motherboard also needs some extra circuitry; I expect it's less than
$1 in quantity, though.)

Pretty much everyone I know values their data at more than $12 if it
is lost.
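
To scale that $12 to a whole box, a minimal sketch using the two
Newegg prices quoted above.  The DIMM counts are hypothetical examples,
and the motherboard cost is left out since I only have a guess for it.

```python
# Prices quoted above, held in cents to avoid floating-point rounding.
ECC_DIMM_CENTS = 3399      # 4GB DDR3 1333 ECC
NON_ECC_DIMM_CENTS = 2199  # 4GB DDR3 1333 non-ECC

def ecc_premium_dollars(dimm_count):
    """Extra cost of populating `dimm_count` slots with ECC DIMMs."""
    premium_cents = dimm_count * (ECC_DIMM_CENTS - NON_ECC_DIMM_CENTS)
    return premium_cents / 100

# Hypothetical builds: a single DIMM, then 4 and 8 slots filled.
for dimms in (1, 4, 8):
    print(f"{dimms} x 4GB: ${ecc_premium_dollars(dimms):.2f} extra for ECC")
```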

-- 
       Leo Bicknell - bicknell at ufp.org - CCIE 3440
        PGP keys at http://www.ufp.org/~bicknell/