GoDaddy.com shuts down entire data center?

Steve Gibbard scg at gibbard.org
Tue Jan 17 01:04:48 UTC 2006


On Sun, 15 Jan 2006, Elijah Savage wrote:

>
> Any validity to this? And if so, I am surprised that our team has gotten no calls 
> about not being able to get to certain websites.
>
> http://webhostingtalk.com/showthread.php?t=477562

Casting blame may be a fun exercise.  Listening to others cast blame gets 
old fast.  The more useful question here is whether there are lessons the 
rest of us can learn from this incident.

The most important lesson is probably that your problems will almost 
always be more important to you than to somebody else. If you end up with 
a business-killing problem, it doesn't matter if it's somebody else's 
fault -- you're the one who will be out of business.  Likewise, you 
shouldn't go wandering out into heavy traffic just because the drivers are 
required by law to stop for you.

Choosing your vendors carefully is important.  Having a backup plan for 
what to do if your vendors fail you is a good thing, but it's nice not to 
have to use the backup plan.  Likewise, if something is really important 
to you, make sure your vendors know that.  Nobody wants to suddenly find 
out in the middle of the night that they're responsible for something 
critical.

Knowing what's important to you in advance can help you figure out what 
arrangements need to be made.  If your hosting operation won't run without 
power, Internet connectivity, and DNS, making sure your power, 
connectivity, and DNS are robust matters a lot.  If your business can 
continue to operate for a few days without toner for your laser printer, 
choosing a less reliable toner supplier is probably ok.

If you do need to call your vendors, having a clear explanation of what's 
going on is often a good thing.  "An entire datacenter" is an awfully 
vague term.  If that were all of, say, Equinix Ashburn, it would be a big 
enough deal that government regulators would probably be concerned.  But a 
room in the back of somebody's office with a rack of servers in it could 
also be justifiably called a "datacenter" (and a rack of servers in the 
back of somebody's office could also be important to somebody).  It's 
probably better to be able to say, "x number of domains are down, 
representing y amount of revenue for our company and z critical service 
that the rest of the Internet relies on.  This might put us out of 
business."  This still may not get the desired response -- it's not your 
vendor who is going to be put out of business -- but it at least gives the 
person on the other end of the phone call some idea of what they're 
dealing with.

Protecting everything you've decided is important may be expensive.  It 
may not be worth the cost.  It's best to have made that calculation before 
the problem starts, when there's still time to spend money on protection 
if you do decide it's worth it.

Not having all your DNS servers in the same domain, or registered through 
the same registrar, isn't a "best practice" that has previously occurred 
to me, but it makes a lot of sense now that I think about it.  Looking at 
the big TLDs, .com and .net have all their servers in the gtld-servers.net 
domain, but Verisign controls .net and can presumably fix gtld-servers.net 
if it breaks.  UltraDNS has their TLD servers (for .org and others) in 
several different TLDs.  Maybe that is to protect against this sort of 
thing.
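
(As a quick sanity check, something like the following sketch -- assuming 
the third-party dnspython library is installed, and using a deliberately 
crude "last two labels" notion of parent domain -- groups a zone's NS 
hostnames by the domain they sit under, which makes the "all of our 
nameservers live in one domain" case easy to spot:

    # Sketch: group a zone's nameservers by the domain they sit under.
    # Assumes the third-party dnspython library (import dns.resolver).
    import dns.resolver
    from collections import defaultdict

    def ns_parent_domains(zone):
        # Crude grouping by the last two labels of each NS hostname.
        # It ignores multi-label public suffixes like .co.uk, but it is
        # enough to spot every server sitting in a single domain.
        groups = defaultdict(list)
        for rdata in dns.resolver.resolve(zone, 'NS'):
            host = str(rdata.target).rstrip('.')
            parent = '.'.join(host.split('.')[-2:])
            groups[parent].append(host)
        return dict(groups)

    if __name__ == '__main__':
        for zone in ('com.', 'net.', 'org.'):
            print(zone, ns_parent_domains(zone))

Running it against your own zones, rather than the TLDs, is the more 
interesting exercise.)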

And there's a PR lesson here, too.  I'd never heard of Nectartech before 
this, and I'm guessing that's the case for a lot of NANOG readers.  Having 
heard this story, I'd be hesitant to register a domain with GoDaddy, and 
that was presumably the goal.  But I'd be hesitant to rely on a company 
with a name like GoDaddy anyway, just because of the name.  Now that I've 
heard of Nectartech, I know them as the company that had the outage. 
That's not exactly a selling point.

I've certainly got sympathy for Mr. Perkel.  I've learned a lot of the 
lessons above the hard way, some due to my own miscalculations and some 
due to working for companies that didn't value my time and stress levels 
as highly as I would have liked (choosing your employers carefully is 
important too...).

These lessons don't apply just to networking.  The loss prevention 
department of a bank once locked my account for "suspicious activity" on a 
Friday afternoon and then left for the weekend.  I had two dollars in my 
wallet, and didn't have much food.  Escalating as far as I could through 
the ranks of people working the bank's customer service lines on Friday 
evening, I didn't manage to find anybody who didn't think I should just 
wait until Monday.  Multiple accounts at different banks, neither of which 
is the bank that locked my account, now seem like a very good idea.

-Steve


