SLA Monitoring

Saku Ytti saku at ytti.fi
Wed Apr 12 16:17:14 UTC 2017


On 12 April 2017 at 01:57, Mike Hammett <nanog at ics-il.net> wrote:

Hey,

> What do you guys use for monitoring of SLAs, be it an upstream or a downstream SLA? I know of a couple services, just looking to see who's doing what and how they like it.

It might be useful to understand what type of data you are expecting
to get out. There is a bunch of kit out there: Accedian, Creanord,
Netrounds, JDSU, Exfo, Polystar...


Personally, the important things for me are:
  a) full-mesh, any pop to any pop
  b) high resolution, I want to know at least down to 10ms (100pps *
cos * pops - may not be a trivial amount of traffic)
  c) multiple CoS for all paths
  d) ability to discriminate measurement by SPORT (to troubleshoot ECMP issues)
  e) 1us or better precision for 2-way jitter and latency
  f) good API to configure, to get data out, to get alerts out
  g) verify that received data is the same as what was sent (to find
out if the network has mangled bits) - this is a very rare feature for
some reason
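
As a rough sketch of the arithmetic behind (b): 10ms resolution means
100 probes per second per path, and in a full mesh every pop sends to
every other pop in every CoS. The function name and the 100-byte probe
size below are assumptions for illustration only, not figures from any
vendor.

```python
def probe_load(pops, cos_classes, pps=100, packet_bytes=100):
    """Estimate probe traffic for a full mesh of measurement endpoints.

    pps is the rate per path per CoS (100 pps gives 10ms resolution).
    packet_bytes is an assumed probe size, adjust it to your kit.
    """
    if pops < 2:
        raise ValueError("a mesh needs at least two pops")
    peers = pops - 1                         # every pop probes every other pop
    per_pop_pps = pps * cos_classes * peers  # what one pop has to send
    mesh_pps = per_pop_pps * pops            # all directed pop-to-pop paths
    per_pop_bps = per_pop_pps * packet_bytes * 8
    return per_pop_pps, mesh_pps, per_pop_bps


if __name__ == "__main__":
    per_pop, mesh, bps = probe_load(pops=50, cos_classes=4)
    print(f"{per_pop} pps per pop, {mesh} pps in the mesh, "
          f"{bps / 1e6:.2f} Mbit/s sent per pop")
```

With 50 pops and 4 classes, each pop already sends 19600 pps, which is
why the probe traffic may not be trivial.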


1us or better precision basically rules out all virtualised setups,
because SR-IOV does not provide access to HW timestamping today. So
you'll need dedicated HW for it, and the vast majority of these shops
only offer HW timestamping in the upper range of their products.

Personally I like Creanord; in a previous life I worked with them and
found them to be a knowledgeable and responsive partner. They've
recently released new small/affordable boxes with HW timestamping. But
they are lacking in some departments: there is no data validity
checking today, and creating a full-mesh measurement in the GUI is
quite a chore, as you need to pick interfaces individually. The latter
isn't such a big deal for me, as I'd do it programmatically anyhow, but
it may be a big deal to others. I know that both are on the table to
be fixed.
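
To illustrate what doing the full mesh programmatically could look
like, here is a minimal sketch that only enumerates the sessions. The
pop and CoS names are made up, and pushing the result to any vendor's
API is deliberately left out, since that part depends entirely on the
product.

```python
from itertools import permutations


def full_mesh_sessions(pops, cos_classes):
    """Return one (source, destination, cos) tuple per directed path and class."""
    return [
        (src, dst, cos)
        for src, dst in permutations(pops, 2)  # ordered pairs, src != dst
        for cos in cos_classes
    ]


if __name__ == "__main__":
    # Hypothetical pop and class names, for illustration only.
    sessions = full_mesh_sessions(["hel", "sto", "fra"], ["be", "af", "ef"])
    print(len(sessions), "sessions to configure")
    for session in sessions[:3]:
        print(session)
```

The count grows as pops * (pops - 1) * cos, which is what makes
picking interfaces by hand in a GUI tedious.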

If precision and resolution are not important and you're happy to
write your own tooling to present the data and alert, you can probably
get away with CSCO IP SLA and/or JNPR RPM. A coworker of mine has
written a very convenient and high-performance IP SLA responder, so
that you don't have to buy several expensive Cisco boxes just to
respond to the queries - https://github.com/cmouse/ip-sla-responder
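
As a sketch of the "write your own tooling" part, the snippet below
summarises a batch of RTT samples and checks them against SLA limits.
It assumes you have already collected the samples somehow (how to pull
them out of IP SLA or RPM is not shown), and all names and thresholds
are hypothetical. Jitter here is simply the mean absolute difference
between consecutive RTTs, which is one of several possible definitions.

```python
import statistics


def summarize(rtts_ms, sent):
    """Summarise answered probes; rtts_ms is in send order, sent is probes sent."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    loss_pct = 100.0 * (sent - len(rtts_ms)) / sent
    if not rtts_ms:
        return {"loss_pct": loss_pct, "avg_ms": None,
                "max_ms": None, "jitter_ms": None}
    # Differences between consecutive samples, used as a simple jitter measure.
    diffs = [abs(b - a) for a, b in zip(rtts_ms, rtts_ms[1:])]
    return {
        "loss_pct": loss_pct,
        "avg_ms": statistics.fmean(rtts_ms),
        "max_ms": max(rtts_ms),
        "jitter_ms": statistics.fmean(diffs) if diffs else 0.0,
    }


def violations(summary, limits):
    """Return the names of metrics whose value exceeds its limit."""
    return [name for name, limit in limits.items()
            if summary.get(name) is not None and summary[name] > limit]


if __name__ == "__main__":
    summary = summarize([10.0, 12.0, 11.0, 15.0], sent=5)
    limits = {"loss_pct": 1.0, "avg_ms": 20.0, "jitter_ms": 2.0}  # made-up SLA
    print(summary)
    print("violated:", violations(summary, limits))
```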


-- 
  ++ytti


