sigs wanted for a response to the fcc's NOI for faster broadband speeds

Dave Taht dave.taht at gmail.com
Sat Dec 2 12:55:40 UTC 2023


On Sat, Dec 2, 2023 at 2:30 AM Stephen Satchell <list at satchell.net> wrote:
>
> On 12/1/23 5:27 PM, Mike Hammett wrote:
> > It would be better to keep the government out of it altogether, but that has little chance of happening.
> >
>
> I agree.  But I do have a question: is there a Best Practices RFC for
> setting buffer sizes in the existing corpus?  The Internet community has
> been pretty good at setting reasonable standards and recommendations, so
> a pointer to a BCP or RFC would go much farther to solve the bufferbloat
> problem, as I believe administrators would prefer the "suggestion"
> instead of ham-handed regulation.

I too!

The IETF recommends universal AQM: see RFC7567.

However, since that was a consensus document, it was impossible to
get the working group to agree to fair (flow) queuing also being
deployed universally (although it is discussed extensively there).
I feel that that technology - breaking up bursts, letting all flows
to a destination multiplex better, and isolating problematic flows -
is more important than AQM. A lot of FQ is implicit: packet pacing
from the hosts does it end to end, and switches naturally multiplex
traffic from multiple ports. But anywhere it is not implicit, having
it explicitly helps when downshifting from one rate to another; a
pure AQM tends to react late to bursts otherwise. "Flow Queuing" is
a real advance over conventional fair queueing, IMHO.
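
To make the "flow queuing" distinction concrete, here is a toy
sketch in Python (mine, not the real fq_codel code; the queue count
and hash are arbitrary): brand-new, sparse flows get one quick pass
at the link ahead of the backlogged ones, which are then served
round-robin.

# Toy flow-queuing scheduler: hash each packet's 5-tuple into a
# per-flow queue, give brand-new (sparse) flows one quick shot at
# the link, then round-robin the backlogged ones. Roughly the FQ
# half of fq_codel/CAKE, heavily simplified.
import hashlib
from collections import deque

NUM_QUEUES = 1024

def flow_hash(src, dst, sport, dport, proto):
    key = f"{src}|{dst}|{sport}|{dport}|{proto}".encode()
    return int(hashlib.sha256(key).hexdigest(), 16) % NUM_QUEUES

class ToyFQ:
    def __init__(self):
        self.queues = {}          # flow id -> deque of packets
        self.new_flows = deque()  # sparse flows, served first
        self.old_flows = deque()  # backlogged flows, round-robin

    def enqueue(self, pkt, five_tuple):
        fid = flow_hash(*five_tuple)
        q = self.queues.setdefault(fid, deque())
        if not q and fid not in self.new_flows and fid not in self.old_flows:
            self.new_flows.append(fid)   # new/sparse flow: priority
        q.append(pkt)

    def dequeue(self):
        # Serve the new-flows list first, then old flows round-robin.
        for flows in (self.new_flows, self.old_flows):
            while flows:
                fid = flows.popleft()
                q = self.queues[fid]
                if q:
                    pkt = q.popleft()
                    if q:
                        self.old_flows.append(fid)  # still backlogged
                    return pkt
        return None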

PIE has a delay target of 16ms, CoDel 5ms - but these are targets
that take a while to hit: both absorb bursts, then drain the queue
to a steady, smaller state gradually. I highly recommend VJ's talk
on this:

https://www.bufferbloat.net/projects/cerowrt/wiki/Bloat-videos/#van-jacobson-of-google-introduces-the-codel-solution-and-the-packet-fountain-analogy-at-the-ietf84-conference-july-august-2012

(he has also been heavily involved in BBR and so many other things)
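
For a rough feel of what CoDel does with that 5ms target, here is a
toy controller in Python (my simplification, not the real codel
implementation): it only starts dropping once the measured queue
delay has stayed above the target for a full 100ms interval, and
then drops more frequently the longer the queue stays bad.

# Toy CoDel-ish controller: drop only after the sojourn time has
# stayed above TARGET for at least INTERVAL, then shorten the gap
# between drops by 1/sqrt(count) so drops ramp up while the queue
# remains too deep.
import math, time

TARGET = 0.005     # 5 ms sojourn-time target
INTERVAL = 0.100   # 100 ms of "bad" before we believe it

class ToyCodel:
    def __init__(self):
        self.first_above = None   # deadline set when delay crosses TARGET
        self.dropping = False
        self.count = 0
        self.next_drop = 0.0

    def should_drop(self, sojourn, now=None):
        now = time.monotonic() if now is None else now
        if sojourn < TARGET:
            self.first_above = None
            self.dropping = False
            return False
        if self.first_above is None:
            self.first_above = now + INTERVAL
            return False
        if not self.dropping and now >= self.first_above:
            self.dropping = True
            self.count = 1
            self.next_drop = now + INTERVAL / math.sqrt(self.count)
            return True
        if self.dropping and now >= self.next_drop:
            self.count += 1
            self.next_drop += INTERVAL / math.sqrt(self.count)
            return True
        return False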

If all you have is a FIFO, I personally would recommend no more than
30ms of buffering in a byte FIFO, if that option is available. A
packet FIFO limited that way might have trouble swallowing enough
acks. Either way, given the advent of "packet pacing" from the
hosts, some pretty good experimental evidence that 100ms is too much
(I can supply more links), and the rise of unclassified interactive
traffic like WebRTC with tighter delay constraints, I still lean
strongly towards 30ms as the outside figure in most cases.
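
Turning 30ms into a byte limit is just rate times delay; a quick
Python helper (my arithmetic, scale to your link):

# 30 ms worth of bytes at a given line rate: rate_bps / 8 * 0.030
def fifo_limit_bytes(rate_mbit, delay_ms=30):
    return int(rate_mbit * 1e6 / 8 * delay_ms / 1e3)

for rate in (100, 1000, 10000):   # Mbit/s
    print(rate, "Mbit/s ->", fifo_limit_bytes(rate), "bytes")
# 100 Mbit/s  ->    375,000 bytes (~366 KB)
# 1   Gbit/s  ->  3,750,000 bytes (~3.6 MB)
# 10  Gbit/s  -> 37,500,000 bytes (~36 MB)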

Aside from incast traffic, you only need even that much buffering
when a link is oversubscribed. Try not to do that, but test. We got
back a lot of good data from the Level 3 outage showing that much of
the core seemed to have 250ms of buffering, or more. I can dig up
that research.

For more RFCs from the now closed IETF AQM working group, see:

https://datatracker.ietf.org/group/aqm/documents/

> But that's just me.  I do know there has been academic research on the
> subject, but don't recall seeing the results published as a practical
> operational RFC.

I too would like a practical operational RFC.

It is becoming increasingly practical, but the big vendors are
lagging behind on support for advanced FQ and AQM techniques. There
has been some decent work in P4. In my case, on problematic links I
just slap CAKE in front of them on a whitebox. An example of the
generally pleasing results is here: https://blog.cerowrt.org/post/juniper/

(this blog also references the controversy over sizing buffers to
the BDP relative to the number of flows)

That blog also shows the degradation in TCP performance once buffers
crack 250ms - a sea of retransmits with ever-decreasing goodput.
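
For anyone who wants to reproduce that whitebox trick, it is
basically one tc command; here is a minimal sketch wrapped in Python
(the interface name and shaped rate are placeholders - pick a rate a
bit below the real link rate, and this assumes sch_cake and a recent
iproute2 are present):

# Minimal sketch: put CAKE on an interface, shaped slightly below
# the link rate so the queue builds here (where CAKE manages it)
# rather than in the device downstream. Run as root.
import subprocess

IFACE = "eth1"          # placeholder: the port facing the bad link
SHAPE_MBIT = 950        # placeholder: ~95% of a 1Gbit link

subprocess.run(
    ["tc", "qdisc", "replace", "dev", IFACE, "root",
     "cake", "bandwidth", f"{SHAPE_MBIT}mbit", "diffserv4"],
    check=True,
)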

Cisco has AFD (Approximate Fair Drop). I have zero reports from the
field from anyone configuring it.

I hear good things about Juniper's RED implementation but have not
torn it apart, and few people I know of configure it.

I would love it if more people slapped LibreQoS on an oversubscribed
link (it's good to well past 25Gbit and pushes CAKE to about
10Gbit/s per core/destination), and observed what happened to user
satisfaction, packet drops, RFC 3168 ECN, flow collisions in the
8-way set-associative hash, and so on. We've produced some good
tools for it, notably TCP RTT sampling, as well as nearly live "mice
and elephants" plots from the 2002 paper on it.

>
> (And this is very much on-topic for NANOG, as it is about encouraging
> our peers to implement effective operation in their networks, and in
> their connections with others.)

I too encourage everyone.

-- 
:( My old R&D campus is up for sale: https://tinyurl.com/yurtlab
Dave Täht CSO, LibreQos

