Akamai server reliability
Vinny Abello
vinny at tellurian.com
Mon Nov 28 19:02:17 UTC 2005
At 01:39 PM 11/28/2005, Roy wrote:
>Hi,
>
>Many moons ago, we got a set of Akamai servers. Over the years I
>think they replaced every one of them at least once. Last August we
>got another set of servers due to a move and now two of those
>three servers have failed.
>I still have the original server that started garlic.com in
>production after 11+ years so I know servers can last a long
>time. I don't understand why Akamai failure rates are so high.
>
>Is anyone else seeing high failure rates of Akamai servers at their
>facilities?
Out of the total three Akamai servers we have, I think we've had two
of them replaced in the past three or four years that we've had them.
One was replaced several times. The replacement servers tend to be
refurbished and I've seen multiple things wrong with them when they
arrive. If I recall correctly, one replacement wouldn't even boot
successfully... Just kept crashing. Reloading the OS from an Akamai
recovery CD had no effect. Shipping also causes problems, since
parts can come loose during transit.
The most common problem we see is failed hard drives and/or SCSI bus
errors, which are likely related to the hard drive failures. I'm
surprised Akamai doesn't have any hardware RAID with hot swap yet (at
least not in the boxes we have). It would be much less costly for
them to ship a new hard drive than a whole new server each time a
hard drive fails. I know the idea is to have very cheap boxes in
clusters, but I wonder how much they're paying in shipping for
replacing the cheap hardware.
Lately, we've had no known problems with our Akamai boxes. That
one box does occasionally have weird SCSI hangs, whereas the other two
work nonstop. For the most part it is fine, though.
Vinny Abello
Network Engineer
Server Management
vinny at tellurian.com
(973)300-9211 x 125
(973)940-6125 (Direct)
PGP Key Fingerprint: 3BC5 9A48 FC78 03D3 82E0 E935 5325 FBCB 0100 977A
Tellurian Networks - The Ultimate Internet Connection
http://www.tellurian.com (888)TELLURIAN
"Courage is resistance to fear, mastery of fear - not absence of
fear" -- Mark Twain