Ahoy, SLA boffins!
Patrick W. Gilmore
patrick at ianai.net
Tue Jul 28 23:42:42 CDT 2009
On Jul 29, 2009, at 12:34 AM, Bill Woodcock wrote:
> So I've embarked on the no-doubt-futile task of trying to interpret
> SLAs as empirically-verifiable technical specifications, rather than
> as marketing blather. And there's something that I'm finding
> particularly puzzling:
> In most SLAs, there seem to be two separate guarantees proffered:
> one concerning "network availability" and one concerning "packet
> loss." Now, if I were to put my engineer hat on, and try to
> _imagine_ what the difference might be, I might imagine "network
> availability" to have something to do with layer-2 link status being
> presented as "up," while packet loss would be the percentage of
> packets dropped. But when I actually read SLAs, "network
> availability" is generally defined as the portion of the month that
> the path from the customer's local loop to the transit or peering
> routers was "available" to transmit packets. Packet loss, on the
> other hand, is generally defined as the portion of packets which are
> lost while crossing that exact same piece of network.
> Now, what am I missing here? Is this one of those Heisenberg
> things, where "network availability" is the time the network _could
> have_ delivered a packet _when you weren't actually doing so_, while
> "packet loss" is the time the network _couldn't_ deliver a packet
> when you _were_ actually doing so?
> Is "network availability" inherently unmeasurable on a network
> that's less than 100% utilized?
> Am I over-thinking this?
Yes. Not because you are coming to strange conclusions, but
because (as you say in your first sentence) you are trying to put
empirical / objective meaning to marketing blather.
I had a simple way to fix this. I defined a network as "down" with
more than X% packet loss (usually with X in the 2-5 range, depending
on other deal parameters). IMHO, a network with 5% packet loss -is-
down. I don't know about you, but none of my customers will use my
service if they have 5% loss. TCP is finicky! A "down" network receives
the strongest credit because you cannot use the service.
Below X, you are not "down", just degraded, and therefore the link has
some utility, but not 100% utility. This receives a credit, but not
as strong a credit as being unable to use a link.
Oh, and, of course, if there is no light on the fiber, then we are
(obviously) "down" as well.
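For anyone who wants that in code, here is a rough Python sketch of the
down / degraded split described above. It is only an illustration: the
names (LinkState, classify, availability_pct), the default X of 5, the
treatment of exactly 0% loss as "up", and the idea of equal-length
measurement intervals are all assumptions, not terms from any real SLA.

```python
from enum import Enum


class LinkState(Enum):
    UP = "up"
    DEGRADED = "degraded"
    DOWN = "down"


def classify(loss_pct, has_light=True, down_threshold_pct=5.0):
    """Classify one measurement interval.

    down_threshold_pct is the "X" from the text (assumed default: 5).
    No light on the fiber, or loss above X, counts as DOWN.
    Any nonzero loss up to X counts as DEGRADED (assumption: 0% is UP).
    """
    if not has_light or loss_pct > down_threshold_pct:
        return LinkState.DOWN
    if loss_pct > 0:
        return LinkState.DEGRADED
    return LinkState.UP


def availability_pct(samples, down_threshold_pct=5.0):
    """Percentage of intervals that were not DOWN.

    samples is a list of (loss_pct, has_light) tuples, one per
    equal-length measurement interval (a hypothetical input format).
    """
    if not samples:
        raise ValueError("need at least one sample")
    not_down = sum(
        1
        for loss_pct, has_light in samples
        if classify(loss_pct, has_light, down_threshold_pct) is not LinkState.DOWN
    )
    return 100.0 * not_down / len(samples)
```

With this split, "availability" and "packet loss" stop overlapping:
intervals above X feed the availability number (and the larger credit),
while loss at or below X only feeds the smaller degraded credit.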
Or am I over-thinking it? :)
P.S. Now you get to think about things like "packet loss to / from
where?" and whether the last mile should count.
> Seriously, though, I know there are people who don't consider SLAs
> to be fantasy-fiction, and some of them must not be innumerate, and
> some subset of those must be on NANOG, and the intersection set
> might be equal to or greater than one, right? Can anybody explain
> this to me in a way I can translate into code, while still taking
> myself seriously?