TCP congestion control and large router buffers

Mikael Abrahamsson swmike at swm.pp.se
Tue Dec 21 07:18:44 UTC 2010


On Mon, 20 Dec 2010, Jim Gettys wrote:

> Common knowledge among whom?  I'm hardly a naive Internet user.

Anyone actually looking into the matter. The Cisco "fair-queue" command 
was introduced in IOS 11.0 according to 
<http://www.cisco.com/en/US/docs/ios/12_2/qos/command/reference/qrfcmd1.html#wp1098249> 
to somewhat handle the problem. I don't know exactly when that release 
came out, but I'd guess the early 1990s?

> And the statement is wrong: the large router buffers have effectively 
> destroyed TCP's congestion avoidance altogether.

Routers have had large buffers since well before residential broadband 
came around. The basic premise of TCP is that routers have buffers, and 
quite a lot of them.

> 200ms is good; but it is often up to multiple *seconds*. Resulting latencies 
> on broadband gears are often horrific: see the netalyzr plots that I posted 
> in my blog. See:

I know about the problem; it's not news to me. You don't have to convince 
me. I've been using Cisco routers as CPE because of this for a long time.

> Dave Clark first discovered bufferbloat on his DSLAM: he used the 6 
> second latency he saw to DDOS his son's excessive WOW playing.

When I procured a DSLAM around 2003 or so, it had 40ms of buffering at 
24meg ADSL2+ speed. The buffer length in bytes was constant, so when the 
line speed went down, the buffering time went up. It didn't do any AQM 
either, but at least it did 802.1p prioritization and had 4 buffers, so 
there was some possibility of doing things upstream of it.
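
To put rough numbers on that, here is a minimal sketch of the arithmetic. 
It assumes "24meg" means 24 Mbit/s and that the buffer is sized to hold 
exactly 40ms of traffic at that rate; the function name and the lower line 
rates are made up for illustration, not taken from any real DSLAM.

```python
def buffer_delay_ms(buffer_bytes: int, link_bps: int) -> float:
    """Time to drain a full buffer at the given line rate, in milliseconds."""
    return buffer_bytes * 8 / link_bps * 1000


# Assumption: buffer sized for 40 ms at 24 Mbit/s, i.e. 120000 bytes.
BUFFER_BYTES = 24_000_000 * 40 // 1000 // 8

if __name__ == "__main__":
    # Hypothetical sync rates; the buffer size in bytes stays the same.
    for mbps in (24, 8, 2, 1):
        delay = buffer_delay_ms(BUFFER_BYTES, mbps * 1_000_000)
        print(f"{mbps:>2} Mbit/s: {delay:.0f} ms of buffering")
```

Under those assumptions the same buffer that gives 40ms at full speed 
gives 120ms on an 8 Mbit/s line and close to a second on a 1 Mbit/s line.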

> All broadband technologies are affected, as are, it turns out, all operating 
> systems and likely all home routers as well (see other posts I've made 
> recently). DSL, cable and FIOS all have problems.

Yes.

> How many of retail ISP's service calls have been due to this terrible 
> performance?

A lot, I'm sure.

> Secondly, any modern operating system (anything other than Windows XP), 
> implements window scaling, and will, within about 10 seconds, *fill* the 
> buffers with a single TCP connection, and they stay full until traffic 
> drops enough to allow them to empty (which may take seconds).  Since 
> congestion avoidance has been defeated, you get nasty behaviour out of 
> TCP.

That is exactly what TCP was designed to do: use as much bandwidth as it 
can. Congestion is detected in two ways: latency goes up and/or packets 
are lost. TCP was designed with router buffers in mind.

Anyhow, one thing that might help would be ECN in conjunction with WRED, 
but even that is way over most CPE manufacturers' heads.

> is a good idea, you aren't old enough to have experienced the NSFnet collapse 
> during the 1980's (as I did).  I have post-traumatic stress disorder from 
> that experience; I'm worried about the confluence of these changes, folks.

I'm happy you were there. I was under the impression that routers had 
large buffers back then as well?

> The best you can do is what Ooma has done; bandwidth shaping along with being 
> closest to the broadband connection (or by fancy home routers with 
> classification and bandwidth shaping).  That won't help the downstream 
> direction where a single other user (or yourself), can inject large packet 
> bursts routinely by browsing web sites like YouTube or Google images (unless 
> some miracle occurs, and the broadband head ends are classifying traffic in 
> the downstream direction over those links).

There is definitely a lot of improvement to be had. For FTTH, if you use 
an L2 switch with a few ms of buffering as the ISP handoff device, you 
don't get this problem. There are even TCP algorithms to handle this case, 
where you have small buffers and just tail-drop.

But yes, I agree that we'd all be much better off if manufacturers of 
both ends of all links had the common decency to introduce a WRED (with 
ECN marking) AQM that had 0% drop probability at 40ms and 100% drop 
probability at 200ms (with a linear increase in between).
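
A minimal sketch of that drop curve follows. It is only an illustration: 
real WRED implementations typically work on an averaged queue depth rather 
than on queueing delay directly, and the function names, the "ecn_capable" 
flag and the choice to drop everything once the upper threshold is reached 
are assumptions made here, not any vendor's behaviour.

```python
import random


def drop_probability(queue_delay_ms: float,
                     min_ms: float = 40.0,
                     max_ms: float = 200.0) -> float:
    """Linear ramp: 0.0 at or below min_ms, 1.0 at or above max_ms."""
    if queue_delay_ms <= min_ms:
        return 0.0
    if queue_delay_ms >= max_ms:
        return 1.0
    return (queue_delay_ms - min_ms) / (max_ms - min_ms)


def handle_packet(queue_delay_ms: float, ecn_capable: bool,
                  rng: random.Random) -> str:
    """Return 'forward', 'mark' (set ECN CE) or 'drop' for one packet."""
    p = drop_probability(queue_delay_ms)
    if p >= 1.0:
        return "drop"  # assumption: past the upper threshold, always drop
    if rng.random() < p:
        # ECN-capable traffic is marked instead of dropped.
        return "mark" if ecn_capable else "drop"
    return "forward"


if __name__ == "__main__":
    rng = random.Random(1)
    for delay in (20, 40, 120, 200):
        print(delay, drop_probability(delay), handle_packet(delay, True, rng))
```

Halfway up the ramp, at 120ms of queueing delay, this gives a 50% 
mark/drop probability.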

-- 
Mikael Abrahamsson    email: swmike at swm.pp.se



