State of QoS peering in Nanog

Mon Apr 4 20:55:26 UTC 2011

On 04/03/2011 12:50 PM, Stefan Fouant wrote:
>> -----Original Message-----
>> From: Leo Bicknell [mailto:bicknell at ufp.org]
>> Sent: Saturday, April 02, 2011 10:24 PM
>>
>> But it also only affects priority queue traffic.  I realize I'm making
>> a value judgment, but many customers under DDoS would find things
>> vastly improved if their video conferencing went down, but everything
>> else continued to work (if slowly), compared to today when everything
>> goes down.
> I'd like to observe that discussion when the Netflix guys come calling on
> the support line - "Hey Netflix, yeah you're under attack and your
> subscribers can't watch videos at the moment, but the good news is that all
> other apps running on our network are currently unaffected". ;>
>
>> In closing, I want to push folks back to the buffer bloat issue though.
>> More than once I've been asked to configure QoS on the network to
>> support VoIP, Video Conferencing or the like.  These things were
>> deployed and failed to work properly.  I went into the network and
>> _reduced_ the buffer sizes, and _increased_ packet drops.  Magically
>> these applications worked fine, with no QoS.
>>
>> Video conferencing can tolerate a 1% packet drop, but can't tolerate a
>> 4 second buffer delay.  Many people today who want QoS are actually
>> suffering from buffer bloat. :(
> Concur 100%.  In my experience, I've gotten much better performance w/
> VoIP/Video Conferencing and other delay-intolerant applications when setting
> buffer sizes to a temporal value rather than based on a _fixed_ number of
> packets.
>

There is no magic here at all.

There are dark buffers all over the Internet: some network operators run
routers and broadband gear without RED enabled, our broadband gear suffers
from excessive buffering, and so do our home routers and hosts.

What is happening, as I outlined at the transport area meeting at the 
IETF in Prague, is that by putting in excessive buffers everywhere in 
the name of avoiding packet loss, we've destroyed TCP congestion 
avoidance and badly damaged slow start while adding terrible latency and 
jitter.  Tail drop with long buffers delays notification of congestion
to TCP and defeats its algorithms.  Even without this additional
problem (which causes further havoc), TCP will always fill the buffers on
either side of the bottleneck link in your path.

So your large buffers add latency, and when a link is saturated, the
buffers on either side of the saturated link fill and stay full (most
commonly in the broadband gear, but often also in the hosts/home routers
over 802.11 links).
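
As a rough illustration of the arithmetic, assuming a plain FIFO that
drains at the link rate: the worst-case delay a full buffer adds is its
size divided by the link speed.  The function names and the figures
below are hypothetical, chosen only to show the scale of the effect.

```python
def queue_delay_seconds(buffer_bytes: int, link_bits_per_second: float) -> float:
    """Time for a full FIFO buffer to drain at the given link rate."""
    if link_bits_per_second <= 0:
        raise ValueError("link rate must be positive")
    return buffer_bytes * 8 / link_bits_per_second


def buffer_bytes_for_delay(target_delay_seconds: float,
                           link_bits_per_second: float) -> int:
    """Largest buffer (in bytes) whose full-queue delay stays within the target."""
    if link_bits_per_second <= 0:
        raise ValueError("link rate must be positive")
    return int(target_delay_seconds * link_bits_per_second / 8)


if __name__ == "__main__":
    # Hypothetical example: a 256 KiB buffer in front of a 1 Mbit/s uplink.
    print(queue_delay_seconds(256 * 1024, 1_000_000))
    # Hypothetical example: bytes of buffer that fit in half a second at 1 Mbit/s.
    print(buffer_bytes_for_delay(0.5, 1_000_000))
```

Under these assumptions a 256 KiB buffer on a 1 Mbit/s uplink holds
about two seconds of traffic, which is why sizing buffers by time
rather than by a fixed packet count matters.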

By running with AQM (or small buffers), you reduce the need for QoS
(which doesn't yet seriously exist at the network edge).
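
For readers who have not looked at how an AQM such as RED decides to
drop, here is a minimal sketch of the classic drop-probability curve.
It is a simplification, not any vendor's implementation: it omits the
packet-count adjustment and idle-time handling of full RED, and the
threshold values in the example are made up.

```python
def update_average(avg_queue: float, current_queue: float,
                   weight: float = 0.002) -> float:
    """Exponentially weighted moving average of the queue length."""
    return (1.0 - weight) * avg_queue + weight * current_queue


def red_drop_probability(avg_queue: float, min_th: float,
                         max_th: float, max_p: float) -> float:
    """Simplified RED: no drops below min_th, certain drop at or above
    max_th, and a linear ramp up to max_p in between."""
    if min_th >= max_th:
        raise ValueError("min_th must be below max_th")
    if avg_queue < min_th:
        return 0.0
    if avg_queue >= max_th:
        return 1.0
    return max_p * (avg_queue - min_th) / (max_th - min_th)


if __name__ == "__main__":
    # Hypothetical thresholds, in packets.
    for avg in (5, 20, 30):
        print(avg, red_drop_probability(avg, min_th=10, max_th=30, max_p=0.1))
```

The point of the ramp is that TCP hears about congestion early, while
the queue is still short, instead of only after a long tail-drop
buffer has filled.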

See my talk at http://mirrors.bufferbloat.net/Talks/PragueIETF/
(slightly updated since the Prague IETF); you can listen to it at

  http://ietf80streaming.dnsalias.net/ietf80/ietf80-ch4-wed-am.mp3

A longer version of that talk is at: http://mirrors.bufferbloat.net/Talks/BellLabs01192011/

Note that there is a lot you can do immediately to reduce your personal
suffering, by using bandwidth shaping to reduce/eliminate the buffer
problem in your home broadband gear, and by ensuring that your 802.11
wireless bandwidth is always greater than your home broadband bandwidth
(since the bloat in current home routers can be even worse than in the
broadband gear).
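
To make the shaping idea concrete, here is a minimal token-bucket
sketch.  It is an illustration only, not a description of any
particular router's shaper: the class name, the rate and the burst
size are hypothetical.  The assumption is that you shape to a rate
somewhat below what the broadband link actually delivers, so that
packets queue in a device you control rather than in the modem's
oversized buffer.

```python
class TokenBucket:
    """Token-bucket shaper: allows `rate_bps` on average, with bursts
    of up to `burst_bytes`.  Time is passed in explicitly (seconds)."""

    def __init__(self, rate_bps: float, burst_bytes: int):
        self.rate_bytes_per_s = rate_bps / 8
        self.burst_bytes = burst_bytes
        self.tokens = float(burst_bytes)  # start with a full bucket
        self.last = 0.0

    def allow(self, packet_bytes: int, now: float) -> bool:
        """Return True if the packet may be sent at time `now`."""
        elapsed = max(0.0, now - self.last)
        self.last = now
        # Refill tokens for the elapsed time, capped at the burst size.
        self.tokens = min(float(self.burst_bytes),
                          self.tokens + elapsed * self.rate_bytes_per_s)
        if packet_bytes <= self.tokens:
            self.tokens -= packet_bytes
            return True
        return False


if __name__ == "__main__":
    # Hypothetical: shape to 800 kbit/s with a two-packet burst allowance.
    shaper = TokenBucket(rate_bps=800_000, burst_bytes=3000)
    print([shaper.allow(1500, now=0.0) for _ in range(3)])
    print(shaper.allow(1500, now=1.0))
```

A packet that is refused would be held or dropped by the shaper
itself, which is where you want the queue to form.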

See http://gettys.wordpress.com for more detail.  Please come help fix this mess at bufferbloat.net.
The bloat mailing list is bloat at lists.bufferbloat.net.

We're all in this bloat together.
				- Jim