Operate until failure

Eric A. Hall ehall at ehsco.com
Mon Jan 8 23:49:15 UTC 2001



> One issue with highly redundant data centers is that the failure modes
> are "interesting."  You don't want to shut down due to a single UPS
> failure, so you don't use something simple like PowerChute Plus. You
> most likely don't want to shut down based on any automatic signal.
> However, you do want a way for an operator to gracefully shut down a
> lot of equipment quickly when the decision is made.

The old Deltec stuff was good about this. Their server daemon would
notify different groups at different stages of a power failure:

	Power lost   -> notify group A (printers, PCs)
	Low battery  -> notify group B (secondary servers)
	Dead battery -> notify group C (primary servers, comms)

They also put different outlets in different "groups", so if a device
couldn't understand the network alert (routers and firewalls don't have
agents), it could still be powered off as part of its group.
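
To make the staging concrete, here is a rough Python sketch of how such a
daemon might decide what to do at each stage. This is not Deltec's actual
software or API; the stage names, device names, the has_agent flag, and the
plan_actions function are all made up for illustration. It also assumes the
outlet group is switched off only after the agents in that group have been
notified.

```python
from dataclasses import dataclass

# Hypothetical stage -> group mapping, taken from the three stages above.
STAGE_TO_GROUP = {
    "power_lost": "A",
    "low_battery": "B",
    "dead_battery": "C",
}


@dataclass(frozen=True)
class Device:
    name: str
    group: str       # notification/outlet group: "A", "B", or "C"
    has_agent: bool  # False for gear that cannot receive the network alert


def plan_actions(stage, devices):
    """Return the ordered list of actions for one power stage.

    Devices with an agent get a network notification. If the group also
    contains agentless devices (assumed: routers, firewalls), the whole
    outlet group is switched off afterwards.
    """
    group = STAGE_TO_GROUP[stage]
    members = [d for d in devices if d.group == group]

    actions = [("notify", d.name) for d in members if d.has_agent]
    if any(not d.has_agent for d in members):
        actions.append(("switch_off_outlet_group", group))
    return actions


# Illustrative inventory only; none of these names come from a real site.
DEVICES = [
    Device("printer-1", "A", True),
    Device("secondary-db", "B", True),
    Device("primary-db", "C", True),
    Device("core-router", "C", False),
]

if __name__ == "__main__":
    for stage in STAGE_TO_GROUP:
        print(stage, plan_actions(stage, DEVICES))
```

In a highly redundant data center you would presumably have an operator
trigger each stage by hand rather than wire it to an automatic UPS signal,
which is the point made in the quoted message.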

Deltec got bought by somebody and I'm sure a lot of this stuff has changed
since I last looked at it, but it was a good design.

-- 
Eric A. Hall                                        http://www.ehsco.com/
Internet Core Protocols          http://www.oreilly.com/catalog/coreprot/



