Quick question.

Alexei Roudnev alex at relcom.net
Wed Aug 4 05:54:39 UTC 2004


No need.

Remove the disk. Insert the disk into the spare. Start the spare server. Let the
techs analyze the broken server the next day.

1 minute. But in reality, 2-CPU servers are redundant against most CPU failures
(had a few cases). Anyway, CPU failure is not a major reason for server
failures (and never was).




>
> On Sun, Aug 01, 2004 at 09:44:13AM -0700, Michel Py wrote:
> > In other words, I don't really care if the second processor reduces the
> > MTBF from 200k hours to 60k hours, but I do care if the second processor
> > reduces the time to restore service from 24 hours to 20 minutes (7.5
> > minutes for SNMP to fail the query twice, 1.5 minute for the tech to
> > find out that either it's frozen or there's a BSOD, 6 minutes to have
> > someone go there and reset, 5 minutes to reboot).
>
> With the right form factor (nice easy-to-open rackmount unit) it will take
> just as little time to swap in an on-site cold-spare. That way you get the
> nice MTBF and the short restore time. Also, if you have multiple similar
> machines, you drastically reduce your spares inventory.
>
> > Insignificant in my experience, and does not balance what Alexei
> > mentioned yesterday: a duallie will keep the system up when a faulty
> > process hogs 100% CPU, because the second one is still available. That
> > also increases availability ratio.
>
> These days you can achieve the same using hyper-threading for example,
> and keep the long MTBF :)
>
> -- 
> Colm MacCárthaigh                        Public Key: colm+pgp at stdlib.net
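
For anyone who wants to put numbers on the MTBF versus restore-time trade-off
discussed above, here is a rough sketch. It assumes the usual steady-state
formula, availability = MTBF / (MTBF + MTTR), which nobody in the thread stated
explicitly. The 200k and 60k hour figures and the 20 minute breakdown are the
illustrative numbers from Michel's message; the 20 minute cold-spare swap is an
assumption based on the "just as little time" remark. The function and
variable names are made up for the example.

```python
def availability(mtbf_hours, mttr_hours):
    """Steady-state availability: fraction of time up = MTBF / (MTBF + MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)

# Restore-time breakdown quoted in the thread, in minutes:
# SNMP failing twice, tech diagnosis, walking over to reset, reboot.
restore_minutes = 7.5 + 1.5 + 6 + 5

# Scenario labels are hypothetical; the inputs are the thread's example figures.
scenarios = {
    "single CPU, 24 h restore": availability(200_000, 24),
    "dual CPU, 20 min restore": availability(60_000, restore_minutes / 60),
    # Assumption: swapping in a cold spare also takes about 20 minutes.
    "single CPU + cold spare, 20 min restore": availability(200_000, restore_minutes / 60),
}

for name, a in scenarios.items():
    downtime_min_per_year = (1 - a) * 8760 * 60
    print(f"{name}: availability {a:.6%}, about {downtime_min_per_year:.1f} min/year down")
```

Under these assumptions the restore time dominates: cutting it from 24 hours to
20 minutes helps far more than the lower MTBF hurts, and the cold spare keeps
both the long MTBF and the short restore time.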



