Operate until failure

Eric Whitehill ericw at mercury.xtratyme.com
Mon Jan 8 14:49:17 UTC 2001



1.  When I had a power supply fail in a fileserver about a year ago, I
limped it along until my next maintenance window (which happened to be in 24
hours, thank goodness) and replaced it then.  It was only a 10-minute
downtime for my users, who were very happy because there was no downtime
during business hours.  Usually this is what I will do.  The less
downtime I can have outside my maintenance windows, the better.

2.  Depends.  If there is a chance I'll break something if I don't shut it
all down, I will.  If it's unlikely I'll break anything, then great, I'll
keep working.

If I have to shut down my database server, I'll switch over to the backup
and keep working and then do the repairs and bring my backup online.  

We've had issues here with power outages, and usually the UPSes will hold.
The one time they didn't, we brought all the machines down gracefully by
hand, as we didn't have auto-shutdown installed on the systems.
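
For what it's worth, the decision an auto-shutdown agent has to make in an
outage like that one can be sketched roughly as below.  This is only a
minimal illustration, not any UPS vendor's API: UpsStatus, should_shut_down,
and the 120-second default margin are all hypothetical names and values, and
it assumes the UPS can report whether it is on battery and how many seconds
of runtime it estimates it has left.

```python
from dataclasses import dataclass


@dataclass
class UpsStatus:
    """Hypothetical snapshot of what a UPS reports (not a real vendor API)."""
    on_battery: bool          # True when utility power has been lost
    runtime_remaining_s: int  # UPS's own estimate of remaining runtime, seconds


def should_shut_down(status: UpsStatus,
                     shutdown_duration_s: int,
                     safety_margin_s: int = 120) -> bool:
    """Return True once remaining runtime no longer covers a clean shutdown.

    shutdown_duration_s is how long the machine needs to stop services and
    halt; safety_margin_s is extra slack (an assumed value, tune to taste).
    """
    if not status.on_battery:
        # Utility power is present, so keep running.
        return False
    # On battery: ride it out until the runtime left is about to drop below
    # what a graceful shutdown needs, then start shutting down.
    return status.runtime_remaining_s <= shutdown_duration_s + safety_margin_s
```

With a 300-second shutdown, a box on battery with 1800 seconds left keeps
running, and one with 400 seconds left starts going down.  A database
server would presumably want a larger shutdown_duration_s than a router.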

While I do realize this is describing the "perfect" problem, there will be
times when a NIC will fail or someone will cut the fiber, and then you
just have to handle it the best way you know how to get the issue
resolved, then take a blunt object (like the clue phone) to the person who
cut the fiber.  ;-)

-Eric

-- 
Eric Whitehill		ericw at xtratyme.com
Network Engineer	XtraTyme Technologies
320.864.8513		http://www.xtratyme.com


> > 1. Do you attempt to preserve service as long as possible, including
> > running equipment to the point of destruction?

> > 2. Do you attempt to minimize recovery time by shutting down equipment
> > to a "safe" condition before failure?

> > If you are running a database/transaction oriented system, I would expect
> > you want to put the database into a stable condition.  On the other hand,
> > if you are operating mostly communication equipment, you would want to
> > leave it operating as long as possible.

> > I'm aware of a variety of proprietary software shutdown programs associated
> > with UPS vendors.  But I'm wondering do any "open standards" exist for
> > initiating soft shutdowns?




