Out-of-band paging
Steve Gibbard
scg at gibbard.org
Wed Jul 28 16:54:29 UTC 2010
On Wed, 28 Jul 2010, Joel M Snyder wrote:
> But... you can take this sort of 'single point of failure' argument almost as
> far as you want. In the security business (where I spend most of my time), I
> see people do this a lot--they get deep into the ultra-ultra-ultra marginal
> risk, which takes then an enormous amount of money to mitigate. It's an easy
> rat hole to explore, and often fun.
I think people are getting lost in the weeds here, and confusing
technologies with paths.
My current employer has been upgrading its transit circuits, and spent
time in the last few months worrying about diversity of the transit paths.
But we didn't insist that one provider come in via metro ethernet, one via
SONET, and one via a GRE tunnel. What we did was have them bring in
network maps, and make them sell us circuits that weren't running down the
same streets as our other providers.
The same goes for your paging network. If it's running over IP, that's
not a huge problem. If anything, if you're an IP engineer, it probably
makes it easier for you to audit the setup. Where you do have a problem
is if it's running over YOUR IP network, but that's just a more acute
version of the problem you'd have if your paging company were using fiber
along the same path as somebody you were buying fiber from.
So, for paging, or out-of-band management, or redundant capacity, the
rules seem pretty simple. Buy from somebody who's not your customer.
Audit whatever information you can get about their network paths to verify
that they're not sharing segments with you. And, for good measure, have
some backup plans in case the notifications don't work.
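
As a rough illustration of the audit step, here is a minimal Python
sketch. It assumes you have already reduced each vendor's network map to
a list of named segments (conduits, bridges, buildings). The segment
names, the circuit names, and the shared_segments() helper are all
hypothetical; real maps are rarely this tidy, and this only catches
overlaps the vendor has actually disclosed.

```python
def shared_segments(our_paths, vendor_paths):
    """Return, per vendor circuit, the segments it shares with any of our paths."""
    # Flatten all of our own circuits into one set of segment names.
    ours = {seg for path in our_paths.values() for seg in path}
    shared = {}
    for name, path in vendor_paths.items():
        overlap = sorted(ours.intersection(path))
        if overlap:
            shared[name] = overlap
    return shared


# Hypothetical data transcribed from vendor-supplied network maps.
our_paths = {
    "transit-a": ["pop-1", "main-st-conduit", "carrier-hotel-x"],
    "transit-b": ["pop-1", "river-bridge", "carrier-hotel-y"],
}
paging_paths = {
    "paging-primary": ["paging-site", "main-st-conduit", "carrier-hotel-z"],
    "paging-backup": ["paging-site", "hill-rd-conduit", "carrier-hotel-z"],
}

result = shared_segments(our_paths, paging_paths)
print(result)
```

Any circuit that shows up in the result shares at least one segment with
your own network and deserves a follow-up question to the vendor. An
empty result only means no overlap was found in the data you were given.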
You probably are better off if you have humans in a NOC, rather than a
purely automated alerting system. Those people can notice if you're not
responding, and be creative. Maybe they can figure out how to fix
problems themselves. If all else fails, they may be able to dispatch
somebody to your house. Remember, organizations have been tracking down
critical personnel for far longer than there have been telephones.
Or are people here worried about a scenario in which the entire world is
run off of one big interconnected IP network, and that when it fails it's
not only not possible to make a phone call, but also not possible to get
across town to alert the people who could fix it? It seems to me that if
things really got that bad, it might be pretty hard for even the most
oblivious on-call person to miss.
-Steve