DOs and DONTs for small ISP

William Herrin bill at herrin.us
Wed Jun 5 18:45:03 UTC 2019


On Wed, Jun 5, 2019 at 5:44 AM William Waites <ww at styx.org> wrote:
> It's not enough to have monitoring and a ticket system. You need to pay
> attention to them, care for them and feed them. I can't count the number
> of ticket systems full of ancient and irrelevant things or monitoring
> systems that people have forgotten about or don't know how to add new
> stuff to. Even the cycle of,

Some points to consider when monitoring your network:

1. Beware early automation. If you write a generator to go and monitor all
your stuff without addressing how operators will make one-off changes
(which is hard to design well), the other operators will find the
monitoring system unusable. That means they won't update it when stuff is
added and changed, which quickly makes it useless.

2. Be careful aggregating alarms. That big green or red light is useless.
The operator has to be able to start with the alarm and immediately trace
back to exactly what tests and results bubbled up into the aggregate and
from there to the malfunctioning component. If you lose this information
during the aggregation process, you're just producing noise.

3. Every alarm must be actionable. When the light goes red, what -exactly-
do you want the operator to do as a result? Don't create an alarm until you
can offer a detailed and specific answer, and link that answer to the alarm
so the operator doesn't have to hunt for it.
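
To make the last two points concrete, here is a minimal Python sketch of an
aggregate alarm that keeps every contributing test result and refuses to
exist without a linked answer for the operator. The class names, fields, and
the idea of storing the answer as a "runbook" string are all assumptions for
illustration; they are not taken from any particular monitoring system.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class CheckResult:
    """One individual test outcome (hypothetical structure)."""
    component: str  # the thing that was tested
    test: str       # which test was run against it
    ok: bool        # True if the test passed
    detail: str     # raw result the operator can read


@dataclass
class Alarm:
    """An aggregate alarm that never discards the results behind it."""
    name: str
    runbook: str  # what exactly the operator should do when this goes red
    results: List[CheckResult] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Point 3: no alarm without a specific, linked answer.
        if not self.runbook.strip():
            raise ValueError(f"alarm {self.name!r} has no runbook")

    @property
    def red(self) -> bool:
        # The aggregate is red if any contributing test failed.
        return any(not r.ok for r in self.results)

    def failures(self) -> List[CheckResult]:
        # Point 2: trace from the light back to the failing tests.
        return [r for r in self.results if not r.ok]

    def report(self) -> str:
        if not self.red:
            return f"{self.name}: GREEN"
        lines = [f"{self.name}: RED", f"  runbook: {self.runbook}"]
        for r in self.failures():
            lines.append(f"  {r.component} / {r.test}: {r.detail}")
        return "\n".join(lines)
```

Calling report() on a red alarm gives the operator the runbook location and
the specific component and test that failed in one place, rather than just
the colour of the light.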

Regards,
Bill Herrin

-- 
William Herrin
bill at herrin.us
https://bill.herrin.us/