Shady areas of TCP window autotuning?

Leo Bicknell bicknell at
Tue Mar 17 15:39:13 UTC 2009

In a message written on Tue, Mar 17, 2009 at 08:46:50AM +0100, Mikael Abrahamsson wrote:
> In my mind, the problem is that they tend to use FIFO, not that the queues 
> are too large.

We could quickly get lost in queuing science, but at a high level you
are mostly correct, though both are a problem.

> What we need is ~100ms of buffer and fair-queue or equivalent, at both 
> ends of the end-user link (unless it's 100 meg or more, where 5ms buffers 
> and FIFO tail-drop seems to work just fine), because 1 meg uplink (ADSL) 
> and 200ms buffer is just bad for the customer experience, and if they 
> can't figure out how to do fair-queue properly, they might as well just do 
> WRED 30 ms 50 ms (100% drop probability at 50ms) or even taildrop at 50ms.
> It's very rare today that an end user is helped by anything buffering 
> their packet more than 50ms.

Some of this technology exists, just not where it can do a lot of
good.  Some fancier CPE devices know how to queue VOIP in a priority
queue, and elevate some games.  This works great when the cable
modem or DSL modem are integrated, but when you buy a "router" and
hook it to your provider supplied DSL or Cable Modem it's doing no
good.  I hate to suggest such a thing, but perhaps a protocol for a
modem to communicate a committed rate to a router would be a good
idea.

I'd also like to point out, where this technology exists today it's
almost never used.  How many 2600's and 3600's have you seen
terminating T1's or DS-3's that don't have anything changed from
the default FIFO queue?  I am particularly fond of the DS-3 frame
circuits with 100 PVC's, each with 40 packets of buffer.  4000
packets of buffer on a DS-3.  No wonder performance is horrid.
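
As a rough check on that 4000 packet figure, here is a back-of-the-envelope
sketch in Python.  The 1500 byte packet size and the 44.736 Mbit/s DS-3
line rate are my assumptions (the usable payload rate is somewhat lower),
and the function name is only for illustration:

```python
# Worst-case queueing delay if every PVC queue on the DS-3 is full.
# Assumptions (not from the circuit itself): 1500 byte packets and the
# nominal DS-3 line rate of 44.736 Mbit/s.
DS3_BPS = 44.736e6
PACKET_BYTES = 1500

def queue_drain_seconds(packets, packet_bytes, link_bps):
    """Time to transmit `packets` packets of `packet_bytes` at `link_bps`."""
    return packets * packet_bytes * 8 / link_bps

total_packets = 100 * 40  # 100 PVCs, 40 packets of buffer each
worst_case = queue_drain_seconds(total_packets, PACKET_BYTES, DS3_BPS)

print(f"{total_packets} packets -> {worst_case:.2f} s of buffering")
```

Under those assumptions a full set of queues holds a little over one
second of traffic, far more than the 50ms discussed above.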

In a message written on Tue, Mar 17, 2009 at 09:47:39AM +0100, Marian Ďurkovič wrote:
> Reducing buffers to 50 msec clearly avoids excessive queueing delays,
> but let's look at this from the wider perspective:
> 1) initially we had a system where hosts were using fixed 64 kB buffers
> This was unable to achieve good performance over high BDP paths

Note that the host buffer, which generally should be 2 * Bandwidth
* Delay, is, well, basically unrelated to the hop by hop network
buffers.
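
To put numbers on the host side, a small sketch.  I am reading "Delay" as
one-way delay, so 2 * Bandwidth * Delay is one RTT worth of data; the
100 Mbit/s path and 100ms RTT are made-up example values, and both
function names are hypothetical:

```python
# Host (socket) buffer sizing versus a fixed 64 kB buffer.
# Assumption: "Delay" is one-way delay, so 2 * B * D covers one full RTT.

def host_buffer_bytes(bandwidth_bps, one_way_delay_s):
    """Buffer needed to keep the path full: 2 * Bandwidth * Delay, in bytes."""
    return 2 * bandwidth_bps * one_way_delay_s / 8

def window_limited_bps(buffer_bytes, rtt_s):
    """Best-case throughput when at most one buffer of data is sent per RTT."""
    return buffer_bytes * 8 / rtt_s

# Example values only: a 100 Mbit/s path with 50ms one-way delay (100ms RTT).
needed = host_buffer_bytes(100e6, 0.050)
capped = window_limited_bps(64 * 1024, 0.100)

print(f"buffer needed: {needed / 1e6:.2f} MB")
print(f"64 kB buffer tops out near {capped / 1e6:.2f} Mbit/s")
```

Neither number depends on how deep any router queue along the path is,
which is the point: host buffer size is set by the end to end RTT.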

> 2) OS maintainers have fixed this by means of buffer autotuning, where
> the host buffer size is no longer the problem. 
> 3) the above fix introduces unacceptable delays into networks and users
> are complaining, especially if autotuning approach #2 is used
> 4) network operators will fix the problem by reducing buffers to e.g. 50 msec
> So at the end of the day, we'll again have a system which is unable to
> achieve good performance over high BDP paths, since with reduced buffers
> we'll have an underbuffered bottleneck in the path which will prevent full
> link utilization if RTT>50 msec. Thus all the above exercises will end up
> in having almost the same situation as before (of course YMMV). 

This is an incorrect conclusion.  The host buffer has to wait for
an RTT for an ack to return, so it has to buffer a full RTT of data
and then some.  Hop by hop buffers only have to buffer until an
output port on the same device is free.  This is why a router with
20 10GE interfaces can have a 75 packet deep queue on each interface
and work fine; the packet only sits there until a 10GE output
interface is available (a few microseconds).
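
A quick sketch of why the same 75 packet queue is harmless at 10GE and
painful on a slow link.  The 1500 byte packet size and the list of link
speeds are my assumptions for illustration:

```python
# How long a 75 packet queue takes to drain at different link speeds.
# Assumption: every packet is 1500 bytes.
PACKET_BYTES = 1500
QUEUE_PACKETS = 75

def drain_seconds(packets, link_bps, packet_bytes=PACKET_BYTES):
    """Time to serialize `packets` packets onto a link of `link_bps`."""
    return packets * packet_bytes * 8 / link_bps

links = {
    "10GE": 10e9,
    "DS-3": 44.736e6,
    "1 Mbit/s ADSL uplink": 1e6,
}

for name, bps in links.items():
    per_packet_us = drain_seconds(1, bps) * 1e6
    full_queue_ms = drain_seconds(QUEUE_PACKETS, bps) * 1e3
    print(f"{name:>20}: {per_packet_us:10.1f} us/packet, "
          f"{full_queue_ms:9.3f} ms for a full queue")
```

At 10GE a packet serializes in about 1.2 microseconds and the whole queue
drains in well under a millisecond; on the 1 Mbit/s uplink the same queue
is most of a second.  Queue depth in packets says little until you divide
by the link rate.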

The problems are related: as TCP goes faster there is an increased
probability it will fill the buffer at any particular hop; but that
means a link is full and TCP is hitting the maximum speed for that
path anyway.  Reducing the buffer size (to a point) /does not slow/
TCP, it reduces the feedback loop time.  It provides less jitter
to the user, which is good for VoIP and ssh and the like.
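
To illustrate the feedback loop point, a simplified sketch.  It treats
the time for TCP to notice a full buffer as roughly the base RTT plus the
queueing delay at the bottleneck, which ignores a lot of real TCP
behavior.  The 1 Mbit/s uplink and the 200ms and 50ms buffers come from
the quoted message; the 40ms base RTT is a made-up value:

```python
# Buffer size for a target maximum queueing delay, and the resulting RTT
# when that buffer is full.  Simplified model, hypothetical names.

def buffer_bytes_for_delay(link_bps, max_delay_s):
    """Bytes of buffer that take `max_delay_s` to drain at `link_bps`."""
    return link_bps * max_delay_s / 8

def loaded_rtt_s(base_rtt_s, queue_delay_s):
    """RTT seen by a flow when the bottleneck queue is full."""
    return base_rtt_s + queue_delay_s

UPLINK_BPS = 1e6      # 1 meg ADSL uplink from the quoted message
BASE_RTT_S = 0.040    # assumed unloaded RTT, for illustration only

for max_delay in (0.200, 0.050):
    size = buffer_bytes_for_delay(UPLINK_BPS, max_delay)
    rtt = loaded_rtt_s(BASE_RTT_S, max_delay)
    print(f"{max_delay * 1e3:.0f} ms buffer = {size:.0f} bytes, "
          f"loaded RTT about {rtt * 1e3:.0f} ms")
```

In this model the link is full in both cases, so throughput is the same;
what changes is how long every packet, including the VoIP and ssh ones,
waits behind the bulk transfer.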

However, if the hop-by-hop buffers are filling and there is lag and
jitter, that's a sign the hop-by-hop buffers were always too large.
99.99% of devices ship with buffers that are too large.

       Leo Bicknell - bicknell at - CCIE 3440
        PGP keys at