Mitigating human error in the SP
nanog at 85d5b20a518b8f6864949bd940457dc124746ddc.nosense.org
Tue Feb 2 07:16:29 CST 2010
On Mon, 1 Feb 2010 21:21:52 -0500
Chadwick Sorrell <mirotrem at gmail.com> wrote:
> Hello NANOG,
> Long time listener, first time caller.
> A recent organizational change at my company has put someone in charge
> who is determined to make things perfect. We are a service provider,
> not an enterprise company, and our business is doing provisioning work
> during the day. We recently experienced an outage when an engineer,
> troubleshooting a failed turn-up, changed the ethertype on the wrong
> port losing both management and customer data on said device. This
> isn't a common occurrence, and the engineer in question has a pristine
> track record.
Why didn't the customer have a backup link if their service was so
important to them and, indirectly, to your upper management? If your
upper management are taking this problem that seriously, then your
*sales people* didn't do their job properly - they should be ensuring
that customers with high availability requirements have a backup link,
or aren't led to believe that the single-point-of-failure service will
be highly available.
> This outage, of a high profile customer, triggered upper management to
> react by calling a meeting just days after. Put bluntly, we've been
> told "Human errors are unacceptable, and they will be completely
> eliminated. One is too many."
If upper management don't understand that human error is a risk factor
that can't be completely eliminated, then I suggest "self-eliminating"
and finding yourself a job somewhere else. The only way you'll avoid
human error having any impact on production services is to not change
anything - which pretty much means not having a job anyway ...
> I am asking the respectable NANOG engineers....
> What measures have you taken to mitigate human mistakes?
> Have they been successful?
> Any other comments on the subject would be appreciated, we would like
> to come to our next meeting armed and dangerous.