Mitigating human error in the SP
hiersd at gmail.com
Tue Feb 2 20:38:40 CST 2010
If your manager pretends that they can manage humans without a few
well-worn human factors books on their shelf, quit.
On Tue, Feb 2, 2010 at 5:36 PM, Michael Dillon
<wavetossed at googlemail.com> wrote:
>> The actual error happened while someone was troubleshooting a turn-up
>> for a customer whose ethertype had been set wrong in the past. It
>> wasn't a provisioning problem so much as someone troubleshooting why
>> it didn't come up with the customer. Ironically, the NOC was on the
>> phone when it happened; the switch was rebooted almost immediately,
>> and the outage lasted 5 minutes.
> This is why large operators have a "ready for service" (RFS) protocol. The
> customer is never billed until the service is officially RFS, and making it
> RFS requires more than an operational network; it also requires the customer
> to agree in writing that they have a fully functional connection.
> This is another way of hiding human error, because now the up-down-up is
> just part of the provisioning process. There is a record of the RFS date-time
> so if the customer complains about an outage BEFORE that point, they can
> be politely reminded of when RFS happened and that charging does not
> start until AFTER that point.
> --Michael Dillon