Extreme congestion (was Re: inter-domain link recovery)

Thu Aug 16 19:03:15 UTC 2007

On Thu, 16 Aug 2007, Deepak Jain wrote:

> Depends on your traffic type and I think this really depends on the 
> granularity of your study set (when you are calculating 80-90% usage). If you 
> upgrade early, or your (shallow) packet buffers convince you to upgrade late, the 
> effects might be different.

My guess is that the value comes from MRTG or something similar, i.e. it is 
a 5-minute average utilization.
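
As a rough illustration of why the averaging interval matters, here is a 
small sketch. The per-second samples are invented for the example (they are 
not measurements), and the function names are hypothetical. It shows how a 
5-minute average can sit in the 80-90% range while the link is actually 
saturated for a full minute of that interval.

```python
# Hypothetical per-second utilization samples for one 5-minute polling interval.
# The traffic pattern is invented for illustration; it is not measured data.

def five_minute_average(samples):
    """Average utilization over the interval, as a 5-minute poller would report it."""
    return sum(samples) / len(samples)

def saturated_seconds(samples, threshold=1.0):
    """Count the seconds in which the link ran at or above the threshold."""
    return sum(1 for s in samples if s >= threshold)

# 240 seconds at 80% load plus a 60-second burst at line rate.
samples = [0.8] * 240 + [1.0] * 60

print(f"5-minute average: {five_minute_average(samples):.0%}")
print(f"seconds at 100%:  {saturated_seconds(samples)}")
```

With these made-up numbers the graph would show 84%, even though the link 
was full for 60 of the 300 seconds.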

> If you do upgrades assuming the same amount of latency and packet loss on any 
> circuit, you should see the same effect irrespective of buffer depth. (for 
> any production equipment by a main vendor).

I do not agree. A shallow-buffer device will give you packet loss without 
any major latency increase, whereas a deep-buffer device will give you 
latency without packet loss. Most users out there will not have a large 
enough TCP window to keep their throughput up when buffering adds 300+ ms 
of latency, so they throttle back their usage of the link, and it can stay 
at 100% utilization without packet loss for quite some time.
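
To put rough numbers on the window-size effect: a single TCP flow can send 
at most one window per round trip, so its throughput is bounded by 
window / RTT. The sketch below assumes a 64 KB window (no window scaling) 
and a 20 ms base RTT; both values are assumptions chosen for illustration, 
and the function name is hypothetical.

```python
def max_tcp_throughput_bps(window_bytes, rtt_s):
    """Upper bound on one TCP flow: at most one window of data per round trip."""
    return window_bytes * 8 / rtt_s

WINDOW_BYTES = 65535      # assumption: 64 KB window, no window scaling
BASE_RTT_S = 0.020        # assumption: 20 ms path RTT with empty queues
QUEUE_DELAY_S = 0.300     # 300 ms of added buffering delay

uncongested = max_tcp_throughput_bps(WINDOW_BYTES, BASE_RTT_S)
congested = max_tcp_throughput_bps(WINDOW_BYTES, BASE_RTT_S + QUEUE_DELAY_S)

print(f"empty queue: {uncongested / 1e6:.1f} Mbit/s per flow")
print(f"full queue:  {congested / 1e6:.1f} Mbit/s per flow")
```

Under these assumptions the per-flow ceiling drops from about 26 Mbit/s to 
under 2 Mbit/s once the queue is full, which is why the flows back off 
without a single packet being dropped.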

Yes, both cases allow the link to reach 100% utilization on average, and 
in most cases users will actually complain less about the shallow-buffer 
case, as the packet loss will most likely be less noticeable to them in 
traceroute than the latency increase due to buffering.

Anyhow, I still consider a congested backbone an operational failure as 
one is failing to provide adequate service to the customers. Congestion 
should happen on the access line to the customer, nowhere else.

> Deeper buffers allow you to run closer to 100% (longer) with fewer packet 
> drops at the cost of higher latency. The assumption being that more congested 
> devices with smaller buffers are dropping some packets here and there and 
> causing those sessions to back off in a way the deeper buffer systems don't.

Correct.
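
For a feel of the latency side of that trade-off, the worst-case queueing 
delay of a full FIFO buffer is simply buffer size divided by link rate. The 
link speed and buffer sizes below are assumptions picked for illustration, 
not figures for any particular vendor's equipment.

```python
def queue_delay_ms(buffer_bytes, link_bps):
    """Time to drain a full FIFO buffer at line rate, in milliseconds."""
    return buffer_bytes * 8 / link_bps * 1000

LINK_BPS = 1_000_000_000        # assumption: 1 Gbit/s link
SHALLOW_BUFFER = 512 * 1024     # assumption: 512 KB of packet buffer
DEEP_BUFFER = 37_500_000        # assumption: 37.5 MB of packet buffer

print(f"shallow buffer: {queue_delay_ms(SHALLOW_BUFFER, LINK_BPS):.1f} ms")
print(f"deep buffer:    {queue_delay_ms(DEEP_BUFFER, LINK_BPS):.1f} ms")
```

With these values the shallow buffer adds about 4 ms before it starts 
dropping, while the deep buffer can add 300 ms before the first drop.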

> It's a business case whether it's better to upgrade early or buy gear that lets 
> you upgrade later.

It depends on your bandwidth cost. If your link is very expensive, it 
might make sense to spend manpower opex and equipment capex to prolong the 
life of that link by squeezing everything you can out of it. In the long 
run there is of course no way to avoid the upgrade, as users will notice 
the congestion anyhow.

-- 
Mikael Abrahamsson    email: swmike at swm.pp.se