Journal of Internet Disasters

Michael Dillon michael at memra.com
Sat Nov 14 19:33:29 UTC 1998


On Fri, 13 Nov 1998, Marc Slemko wrote:

> What you are discussing is a problem, but not "the" problem and not a
> problem that causes a significant impact over the short term.

What I'm getting at is that on a network you cannot simply point the
finger at the bad guys, NSI, and say that since they screwed up, everything
is their fault. Everyone who interacts with NSI's servers also has a
responsibility to arrange their operations so that an NSI problem cannot
cause cascading failures. Especially so since NSI is known to regularly
screw up like this.

That means that the other root nameserver operators have a responsibility
to limit the damage that NSI can do to them. You will also note that some
ISPs attempt to mitigate the damage by running their own root zones, which
allows them to fix things without waiting for the NSI bureaucracy to get
around to fixing its servers.

> It is important to keep that clear in messages; NSI has already spread
> enough lies, so any confusion about the issue isn't wise.

Nevertheless, there are other lessons to be learned from the incident
besides the fact that NSI's internal operations are a mess.

> The big issue that needs to be addressed is why the heck it took NSI over
> two hours after they were notified to fix it, 

Precisely! Part of NSI's problem is that they simply do not have the
skilled professionals available to build a proper robust architecture.
This is evident not only in their nameserver operations but also in the
domain name registry. But NSI also suffers from the bureaucratic
disease that does not give front-line people the authority and the
responsibility to fix things fast.

> The organization that controls the root nameservers should have one of the
> best operations departments, not one of the worst.

The solution to this problem is to take this operational responsibility
away from NSI. And then to run it totally transparently so that if a
problem like this occurred there would be no veil of secrecy. In such an
important infrastructure operation, every detail of the event logs
complete with names and dates and times and the content of internal email
messages should all be open to the public. This would be a very positive
outcome of the new ICANN and would, in fact, be a resurrection of the way
things used to be done on the net where everyone shared their data openly
and jointly figured out how to do things better.

--
Michael Dillon                 -               E-mail: michael at memra.com
Check the website for my Internet World articles -  http://www.memra.com        




