Operations task management software?
Lee
ler762 at gmail.com
Wed Jul 27 23:19:45 UTC 2016
On 7/27/16, David Hubbard <dhubbard at dino.hostasaurus.com> wrote:
> Hi all, curious if anyone has recommendations on software that helps manage
> routine duties assigned to operations staff?
Have computers do the routine scut work - not people.
> For example, let’s say we have a P&P that says someone from the netops group
> must check that Rancid is successfully backing up all router configs
> bi-weekly.
You've got the source code for rancid, so change rancid-run to do something like
LOGFILE=$LOGDIR/$GROUP.`date +%Y%m%d.%H%M%S`; export LOGFILE
change the
) >$LOGDIR/$GROUP.`date +%Y%m%d.%H%M%S` 2>&1
to
) >$LOGFILE 2>&1
and then in control_rancid do something like
grep "clogin error:" $LOGFILE | sort | uniq -c >$TMP.fail
if [ -s $TMP.fail ]; then
# got some output, mail the report
...
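A rough sketch of how that check could look as standalone shell functions. The names summarize_failures and have_failures are made up for illustration and are not part of rancid; the sketch only assumes that failed logins show up in the log as lines containing "clogin error:", as in the grep above.

```shell
#!/bin/sh
# Hypothetical helpers, not part of rancid.

# summarize_failures LOGFILE
# Prints each distinct "clogin error:" line from LOGFILE, prefixed with
# the number of times it occurred. Prints nothing if there are no matches.
summarize_failures() {
    grep "clogin error:" "$1" | sort | uniq -c
}

# have_failures LOGFILE REPORT
# Writes the summary to REPORT and returns 0 only if REPORT is non-empty,
# i.e. at least one failure was found.
have_failures() {
    summarize_failures "$1" >"$2"
    [ -s "$2" ]
}
```

It could then be called as `if have_failures "$LOGFILE" "$TMP.fail"; then ... fi`, with whatever mail command you already use for reports inside the `if`.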
Do the same type of thing for checking on
> backup failures, backup internet circuit status, out of band interfaces, etc.
Automate the checks, put the scripts in crontab & mail out an
"OhNoes!" or "all clear" msg at the end. At which point you're left
with the problem of making sure the managers are looking at the emails
& making sure whatever problems are found actually get fixed :)
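As a minimal sketch of that idea, something like the wrapper below could run a list of check commands and produce one report. The function name run_checks and the report layout are assumptions for illustration; the individual checks are whatever scripts you already have.

```shell
#!/bin/sh
# Hypothetical wrapper; the name and report layout are illustrative only.

# run_checks CMD...
# Runs each argument as a shell command. Prints a report on stdout whose
# first line is "OhNoes!" if any command exited non-zero, or "all clear"
# otherwise. Returns 1 if anything failed, 0 if everything passed.
run_checks() {
    failed=""
    for check in "$@"; do
        # Capture stdout and stderr so a failing check's output lands in the report
        if ! out=$(sh -c "$check" 2>&1); then
            failed="${failed}FAILED: ${check}
${out}
"
        fi
    done
    if [ -n "$failed" ]; then
        printf 'OhNoes!\n%s' "$failed"
        return 1
    fi
    echo "all clear"
}
```

Saved in a script that pipes the output of run_checks to your mail command, it could go in crontab with a line such as `0 6 * * 1 /path/to/netops-checks.sh` (the path and schedule are placeholders), so the report goes out every Monday whether or not anything broke.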
Regards,
Lee
> Ideally, it would send an email reminder to this pre-defined
> group of people saying hey, it’s Monday, someone needs to check this and
> come acknowledge the task as having been completed. If that doesn’t occur,
> pre-defined manager X is notified on Tuesday. If manager X doesn’t get
> someone to complete the task, director Y is notified, so on and so forth.
> Then, perhaps periodically it emails manager X anyway and says hey, it’s
> been three months, you need to audit netops to ensure they’re actually doing
> the Rancid audit and not just checking that it was done. This could be
> applied to the staff who check on backup failures, backup internet circuit
> status, out of band interfaces, etc.
>
> A data center I looked at recently had QR code stickers on all of their
> infrastructure stuff and there were staff assigned to check and log certain
> displayed values each day. The software would at least ensure they actually
> visited the equipment by requiring they scan the relevant QR code when in
> front of it. So I figure something that does what I’m looking for properly
> already exists.
>
> Thanks,
>
> David
>