SLA monitoring and reporting to customers

Rubens Kuhl Jr. rubensk at
Sun Mar 18 23:11:02 UTC 2007

> > What open-source or low-budget tools are operators using for SLA
> > monitoring when the reports (current state and historical) should be
> > available to customers ?
> Please define SLA in terms of monitoring.

- 99.x% availability (defined by packet loss and response time) monthly
- A certain number of hours from service interruption to service recovery
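
As a rough illustration of the two criteria above, here is a minimal Python sketch that derives a monthly availability figure and the longest outage from periodic probe results. Everything in it is an assumption for the sake of the example: the `Probe` record, the loss and response-time thresholds, and the fixed probe interval are hypothetical and do not come from any particular monitoring tool.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Probe:
    timestamp: float          # seconds since epoch (hypothetical field)
    loss_pct: float           # packet loss seen by this probe, 0-100
    rtt_ms: Optional[float]   # response time in ms, None if nothing came back


# Assumed thresholds; a real SLA would state its own values.
MAX_LOSS_PCT = 5.0
MAX_RTT_MS = 200.0


def is_available(p: Probe) -> bool:
    """A probe counts as 'available' only if loss and response time are both in bounds."""
    return (
        p.rtt_ms is not None
        and p.loss_pct <= MAX_LOSS_PCT
        and p.rtt_ms <= MAX_RTT_MS
    )


def availability_pct(probes: List[Probe]) -> float:
    """Percentage of probes in the period that met both thresholds."""
    if not probes:
        raise ValueError("no probes in the reporting period")
    good = sum(1 for p in probes if is_available(p))
    return 100.0 * good / len(probes)


def longest_outage_seconds(probes: List[Probe], interval_s: float) -> float:
    """Longest run of consecutive failed probes, assuming a fixed probe interval."""
    longest = current = 0.0
    for p in sorted(probes, key=lambda p: p.timestamp):
        if is_available(p):
            current = 0.0
        else:
            current += interval_s
            longest = max(longest, current)
    return longest


# Ten probes one minute apart; probes 3 and 4 fail (one lost, one too slow).
sample = [Probe(60.0 * i, 0.0, 20.0) for i in range(10)]
sample[3] = Probe(180.0, 100.0, None)
sample[4] = Probe(240.0, 0.0, 950.0)

monthly = availability_pct(sample)
outage = longest_outage_seconds(sample, interval_s=60.0)
```

Counting probes rather than wall-clock time keeps the sketch short; with uneven probe spacing the same idea would need to weight each sample by the time it covers.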

> > Looking at NANOG archives, NAGIOS is the most prevalent tool, but its
> > authorization mechanisms fall somewhat short of what I would like:
> > customers should not be able to change anything, either in the
> > configuration or in the SLA software state
> You can set it up so that a customer only sees the status data for the
> services he or she has access to, by adding the customer as a contact
> for those hosts or services.

There are 2 main issues on my reading of
- Users can issue commands for hosts/services they are a contact for.
They could acknowledge an outage even when we should know about it.
- Some devices of interest to a customer are not specific to that
customer: a switch, a router. If customers are listed as contacts for
such devices, they can issue commands for them.

> Do you think that your customers should or
> should not have such access to your central nagios system?

That's something I would like to hear opinions on, but even with NAGIOS
such an issue could be solved by having one NOC-only NAGIOS and one
customers-only NAGIOS. Using NagiosQL would probably make
replication easier.

> > I'm looking for something more like Cacti, where customers can be
> > contained to only see some of the generated graphs.
> Would you be satisfied with a graphing extension to nagios that is
> tied to, and replicates, the nagios security mechanism, where a customer
> can see graphs for the services he/she is listed as a contact for?

Is it ? Can a user be a
nagiosgraph contact without being a NAGIOS contact ?


