Data Center testing
cabenth at gmail.com
Wed Aug 26 01:09:48 CDT 2009
Most provider-type datacenters I've worked with get a lot of flak from
customers when they announce they're doing network failover testing, because
there's always at least some chance of disruption. I think it's the exception
to find a provider that does it (or maybe just one that admits it when
they're doing it). Power tests are a
As for testing your own equipment, there are a couple of ways to do that:
regular failover tests (quarterly, or more likely at 6-month intervals),
and/or routing traffic so that you have some of your traffic on all paths
(i.e., internal traffic on one path, external traffic on another). The latter
doesn't necessarily tell you that your failover will work perfectly, only
that all your gear in the second path is functioning. I prefer doing both.
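As a rough illustration of the "traffic on all paths" check, here is a
minimal sketch that compares two samples of per-path byte counters and flags
any path that carried no traffic in between. The path names, the counter
values, and the idle_paths helper are all hypothetical; how you actually
collect the counters (SNMP, CLI scraping, etc.) is up to you. As noted above,
a path showing traffic only tells you the gear is functioning, not that
failover onto it will work.

```python
def idle_paths(before, after, min_bytes=1):
    """Return the paths whose byte counters grew by less than min_bytes.

    before/after map a path name to a cumulative byte counter sampled at
    two points in time. A path missing from `after` is treated as idle.
    """
    return sorted(
        path for path in before
        if after.get(path, before[path]) - before[path] < min_bytes
    )


# Hypothetical counter samples taken a few minutes apart.
before = {"path-a": 1_000_000, "path-b": 500_000}
after = {"path-a": 1_750_000, "path-b": 500_000}

print(idle_paths(before, after))
```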
When doing the failover tests, no matter how good your setup is, there's
always a chance of taking a hit, so I always do this kind of work during a
maintenance window, not too close to quarter end, etc.
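The scheduling rule above can be sketched as a small date check. This is
only an illustration under stated assumptions: it assumes calendar quarters
(your fiscal quarters may differ), a weekend maintenance window, and a 14-day
quiet period before quarter end. The function names and defaults are made up
for the example.

```python
from datetime import date, timedelta


def quarter_end(d):
    """Last day of the calendar quarter containing d (assumes calendar quarters)."""
    end_month = ((d.month - 1) // 3 + 1) * 3
    if end_month == 12:
        return date(d.year, 12, 31)
    # First day of the following month, minus one day.
    return date(d.year, end_month + 1, 1) - timedelta(days=1)


def ok_to_test(d, window_weekdays=(5, 6), quiet_days=14):
    """True if d falls in the maintenance window and is not near quarter end.

    window_weekdays uses date.weekday() numbering (5 = Saturday, 6 = Sunday).
    quiet_days is how many days before quarter end are off limits.
    """
    in_window = d.weekday() in window_weekdays
    days_to_end = (quarter_end(d) - d).days
    return in_window and days_to_end > quiet_days


print(ok_to_test(date(2009, 8, 29)))  # a Saturday, well before quarter end
print(ok_to_test(date(2009, 9, 26)))  # a Saturday, four days before quarter end
```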
Of course, if you have your equipment set up correctly, it goes like butter
and is a total non-event.
For test procedure, I usually pull cables. I'll go all the way to line cards
or power cables if I really want to test, though that can be hard on
On Mon, Aug 24, 2009 at 10:45 AM, Jack Bates <jbates at brightok.net> wrote:
> Dan Snyder wrote:
>> We have done power tests before and had no problem. I guess I am looking
>> for someone who does testing of the network equipment outside of just
>> tests. We had an outage due to a configuration mistake that became
>> when a switch failed. It didn't cause a problem however when we did a
>> test for the whole data center.
> The plus side of failure testing is that it can be controlled. The downside
> to failure testing is that you can induce a failure. Maintenance windows are
> cool, but some people really dislike failures of any type which limits how
> often you can test. I personally try for once a year. However, a lot can go
> wrong in a year.