Quick question.

Paul Jakma paul at clubi.ie
Sun Aug 1 22:45:42 UTC 2004


On Sun, 1 Aug 2004, John Underhill wrote:

> Not necessarily. There have been a number of innovations in recent years in
> the area of integrated fault tolerance, including BIOS-level controls over
> component monitoring / management. Some of the more upscale Compaq G3
> servers, for instance, can remove a processor from operation if it exceeds a
> threshold of critical errors (this is also true for memory).

Interesting to know. Those are usually due to ECC errors in CPU 
caches, often caused by overheating. The CPU is still functional to a 
degree though: a marginal failure as opposed to a catastrophic one.

But what of electrical failures? Even P4 class machines still share a 
host bus amongst CPUs, no?

Anyway, CPUs (if kept sufficiently cool) tend to be one of the more 
reliable components in a system, if they are good to begin with.

> Alphas can boot even if the bootstrap processor fails at system 
> start, and simply select the next available processor.

Alphas are quite nice, they have support for lockstep operation too. 
Tandem were supposed to have been moving to Alpha for their Himalaya 
F-T servers after Compaq bought them. Also the 21164 and up (not sure 
about 21064) AXPs used a point-to-point bus for SMP[1], so the CPUs 
were all electrically isolated from each other - at least, a failure 
of one CPU couldn't affect the other CPUs.

> So that if a process runs amok on a single bus architecture, the 
> second processor will not have the resources it needs to run 
> effectively.

Processes running amok still only have access to those resources 
granted to them. Processes generally do not have access to bare IO. 
What the OS giveth, it can take away (or constrain).
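
As a rough sketch of that last point, assuming a POSIX system and 
Python's standard resource module (the helper name run_constrained and 
the one-second budget are made up for illustration), the OS can cap a 
runaway process's CPU time so that it gets killed rather than starving 
its neighbours:

```python
import resource
import subprocess
import sys


def run_constrained(code, cpu_seconds=1):
    """Run a Python snippet in a child process with a hard CPU-time cap.

    Hypothetical helper for illustration only. Returns the child's exit
    status; a negative value means the kernel killed it with a signal.
    """

    def apply_limits():
        # Runs in the child just before exec: the kernel enforces this
        # limit no matter what the process does afterwards.
        resource.setrlimit(resource.RLIMIT_CPU, (cpu_seconds, cpu_seconds))

    result = subprocess.run(
        [sys.executable, "-c", code],
        preexec_fn=apply_limits,
        capture_output=True,
    )
    return result.returncode


if __name__ == "__main__":
    # A well-behaved process finishes normally.
    print("well behaved:", run_constrained("pass"))
    # A process spinning forever is terminated once its CPU budget is gone.
    print("runaway:", run_constrained("while True: pass"))
```

The same idea extends to memory, open files and so on via the other 
RLIMIT_* constants; which signal the runaway child dies from depends on 
the kernel, so only the sign of the return code is relied on here.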

1. Still alive and well in a sense, but now developed into a general 
purpose PtP local CPU/IO interconnect: AMD's HyperTransport as used 
in K8.

regards,
-- 
Paul Jakma	paul at clubi.ie	paul at jakma.org	Key ID: 64A2FF6A
Fortune:
Don't get stuck in a closet -- wear yourself out.


