Ahoy, SLA boffins!

Patrick W. Gilmore patrick at ianai.net
Wed Jul 29 04:42:42 UTC 2009

On Jul 29, 2009, at 12:34 AM, Bill Woodcock wrote:

> So I've embarked on the no-doubt-futile task of trying to interpret  
> SLAs as empirically-verifiable technical specifications, rather than  
> as marketing blather.  And there's something that I'm finding  
> particularly puzzling:
> In most SLAs, there seem to be two separate guarantees proffered:  
> one concerning "network availability" and one concerning "packet  
> loss."  Now, if I were to put my engineer hat on, and try to  
> _imagine_ what the difference might be, I might imagine "network  
> availability" to have something to do with layer-2 link status being  
> presented as "up," while packet loss would be the percentage of  
> packets dropped.  But when I actually read SLAs, "network  
> availability" is generally defined as the portion of the month that  
> the path from the customer's local loop to the transit or peering  
> routers was "available" to transmit packets.  Packet loss, on the  
> other hand, is generally defined as the portion of packets which are  
> lost while crossing that exact same piece of network.
> Now, what am I missing here?  Is this one of those Heisenberg  
> things, where "network availability" is the time the network _could  
> have_ delivered a packet _when you weren't actually doing so_, while  
> "packet loss" is the time the network _couldn't_ deliver a packet  
> when you _were_ actually doing so?
> Is "network availability" inherently unmeasurable on a network  
> that's less than 100% utilized?
> Am I over-thinking this?

Yes.  Not because you are coming to strange conclusions, but because  
(as you say in your first sentence) you are trying to put empirical /  
objective meaning to marketing blather.

I had a simple way to fix this.  I defined a network as "down" with  
more than X% packet loss (usually with X in the 2-5 range, depending  
on other deal parameters).  IMHO, a network with 5% packet loss -is-  
down.  I don't know about you, but none of my customers will use my  
service if they have 5% loss.  TCP is finicky!  Being "down" receives  
the strongest credit because you cannot use the service.

Below X, you are not "down", just degraded, and therefore the link has  
some utility, but not 100% utility.  This receives a credit, but not  
as strong a credit as being unable to use a link.

Oh, and, of course, if there is no light on the fiber, then we are  
(obviously) "down" as well.
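
Since the question was how to translate this into code, here is a minimal Python sketch of the scheme above. The 5% default, the Sample fields, the state names, and the "unknown" result for an interval in which nothing was sent are all assumptions made for illustration; a real SLA would have to pin each of them down, along with the measurement interval and the credit amounts.

```python
from collections import Counter
from dataclasses import dataclass

# The "X" threshold. Assumed value; the text only says X is usually 2-5.
DOWN_LOSS_PCT = 5.0


@dataclass
class Sample:
    """One measurement interval (hypothetical structure)."""
    link_up: bool  # False means no light on the fiber
    sent: int      # packets sent during the interval
    lost: int      # packets lost during the interval


def classify(sample: Sample, down_loss_pct: float = DOWN_LOSS_PCT) -> str:
    """Return "down", "degraded", "up", or "unknown" for one interval."""
    if not sample.link_up:
        return "down"      # no light: down regardless of loss figures
    if sample.sent == 0:
        return "unknown"   # nothing sent, so loss cannot be measured (assumption)
    loss_pct = 100.0 * sample.lost / sample.sent
    if loss_pct > down_loss_pct:
        return "down"      # more than X% loss counts as down
    if loss_pct > 0:
        return "degraded"  # some loss, but at or below X%
    return "up"


def summarize(samples: list[Sample]) -> Counter:
    """Count how many intervals fell into each state."""
    return Counter(classify(s) for s in samples)
```

With these definitions, summarize() over a month of intervals gives the counts of "down" and "degraded" intervals, and the two credit tiers can be computed from those counts.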

Make sense?

Or am I over-thinking it? :)


P.S. Now you get to think about things like "packet loss to / from  
where?" and whether the last mile should count.

> Seriously, though, I know there are people who don't consider SLAs  
> to be fantasy-fiction, and some of them must not be innumerate, and  
> some subset of those must be on NANOG, and the intersection set  
> might be equal to or greater than one, right?  Can anybody explain  
> this to me in a way I can translate into code, while still taking  
> myself seriously?
>                                -Bill
