LibreQos

Dave Taht dave.taht at gmail.com
Fri May 12 15:59:49 UTC 2023


Changing the topic...

On Fri, May 12, 2023 at 7:11 AM Mark Tinka <mark at tinka.africa> wrote:
>
>
>
> On 5/12/23 15:03, Dave Taht wrote:
>
> > Libreqos is free software, working as a bridge, you can plug it in
> > between any two points on your network, and on cheap (350 bucks off of
> > ebay) xeon gold hardware easily cracks 25Gbits while shaping with a
> > goal of cracking 100Gbits one day soon.
>
> This is fantastic!

:blush:

We have done a couple podcasts about it, like this one:

https://packetpushers.net/podcast/heavy-networking-666-improving-quality-of-experience-with-libreqos/

and have perhaps made a mistake by doing development and support in
Matrix chat rather than a web forum, too invisibly, but it has been a
highly entertaining way to get a better picture of the real problems
caring ISPs have.

I see you are in Africa? We have a few ISPs playing with this in Kenya...

>
> I also found your post about it here:
>
> https://www.reddit.com/r/HomeNetworking/comments/11pmc9a/a_latency_on_the_internet_update_bufferbloat_sqm/
>
> If you could throw more hardware at it, could it do several 100's of Gbps?

We do not know. Presently our work is supported by Equinix's open
source program, with four servers in their Dallas DC, on 25Gbit
ports. Putting together enough dough to get to 100Gbit, or finding
someone willing to send traffic through more bare metal at that data
center or elsewhere, is on my mind. In other words, we can easily spin
up the ability to L2 route some traffic through a box in their DCs, if
only we knew where to find that traffic. :)

If you assume linearity with cores (which is a lousy assumption, ok?),
64 Xeon cores could do about 200Gbit, running flat out. I am certain
it will not scale linearly and that we will hit multiple bottlenecks
on the way to that goal.

Limits we know about:

A) Driving tens of gbits of realistic traffic through this requires
more test clients and servers than we have, or someone daring with
that kind of real traffic in the first place. For example, one of our
most gung-ho clients has 100Gbit ports, but nowhere near that amount
of inbound traffic. (They are crazy enough to pull git head, try it
for a few minutes in production, and then roll back or leave it up.)

B) In a brief test, a 64-core AMD + Nvidia ethernet card was severely
outperformed by our current choice of a 20-core Xeon Gold + Intel 710
or 810 card. The ethernet card is by far the dominating factor. I
would kill to find one that did an LPM -> CPU mapping (e.g. instead of
an LPM -> route mapping, LPM to which CPU to interrupt). We also tried
an 80-core ARM early on, with inconclusive results.

Tests on the latest Ubuntu release are ongoing. I am not prepared to
bless it or release any results yet.

C) A single cake instance on one of the higher-end Xeons can *almost*
push 10Gbit/sec while eating a core.

D) Our model is one cake instance per subscriber, plus the ability to
establish trees emulating links further down the chain (a rough sketch
of what that looks like follows this list). One ISP is modeling 10
mmwave hops. Another is just putting in multiple boxes closer to the
towers.

So, in other words, hundreds of gbits is achievable today if you throw
boxes at it, and it is more cost effective to do it that way. We will,
of course, keep striving to crack 100Gbit natively on a single box
with multiple cards. It is a nice goal to have.

E) In our present target markets, 10k typical residential subscribers
only eat 11Gbit/sec at peak. A LOT of the smaller ISPs and networks
fit into that space, so of late we have been focusing more on
analytics and polish than on pushing more traffic. Some of our new R/T
analytics break down at 10k cake instances (that is 40 million
fq_codel queues, ok?), and we cannot sample at 10ms rates, presently
falling back, conservatively, to 1s.
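
Since D) above is a bit abstract, here is a rough sketch, in Python,
of what the per-subscriber model looks like at the tc level. The
interface name, rates, and tree shape here are invented purely for
illustration; the real LibreQoS code builds these commands in batch,
per CPU, from its configured site hierarchy:

import subprocess

def tc(*args):
    # shell out to tc; the real code batches these for speed
    subprocess.run(["tc", *args], check=True)

IFACE = "eth1"  # hypothetical subscriber-facing interface

# root HTB, plus a class emulating a constrained backhaul hop
tc("qdisc", "replace", "dev", IFACE, "root", "handle", "1:", "htb")
tc("class", "add", "dev", IFACE, "parent", "1:", "classid", "1:10",
   "htb", "rate", "900mbit", "ceil", "900mbit")

# one subscriber: an HTB class at the plan rate, cake as the leaf
tc("class", "add", "dev", IFACE, "parent", "1:10", "classid", "1:11",
   "htb", "rate", "95mbit", "ceil", "100mbit")
tc("qdisc", "add", "dev", IFACE, "parent", "1:11", "cake", "besteffort")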

We are nearing putting out a v1.4-rc7, which is just features and
polish. You can get a .deb of v1.4-rc6 here:

https://github.com/LibreQoE/LibreQoS/releases/tag/v1.4-rc6

There is an optional, anonymized reporting facility built into that.
In the last two months, 44,404 cake-shaped devices, shaping 0.19Tbits,
have come online that we know of. Aside from that we have no idea how
many ISPs have picked it up! A best guess would be well over 100k subs
at this point.

Putting in LibreQoS is massively cheaper than upgrading all the CPE to
good queue management (it takes about 8 minutes to get it going in
monitor mode, but exporting shaping data into it requires glue, and
time), but better CPE remains desirable, especially CPE whose uplink
component also does sane shaping natively.
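
To give a flavor of that glue, a sketch: pull subscribers out of
whatever billing/CRM system you have and emit LibreQoS's
ShapedDevices.csv. The column names below only approximate the v1.4
format (check the example file in your install), and get_subscribers()
is a hypothetical stand-in for your own API:

import csv

FIELDS = ["Circuit ID", "Circuit Name", "Device ID", "Device Name",
          "Parent Node", "MAC", "IPv4", "IPv6",
          "Download Min Mbps", "Upload Min Mbps",
          "Download Max Mbps", "Upload Max Mbps", "Comment"]

def get_subscribers():
    # hypothetical: replace with your billing/CRM export
    return [{"id": "1001", "name": "Sub 1001", "ip": "100.64.0.2",
             "site": "tower-3", "down": 100, "up": 20}]

with open("ShapedDevices.csv", "w", newline="") as f:
    w = csv.DictWriter(f, fieldnames=FIELDS)
    w.writeheader()
    for s in get_subscribers():
        w.writerow({"Circuit ID": s["id"], "Circuit Name": s["name"],
                    "Device ID": s["id"], "Device Name": s["name"],
                    "Parent Node": s["site"], "IPv4": s["ip"],
                    "Download Min Mbps": s["down"] // 4,
                    "Upload Min Mbps": s["up"] // 4,
                    "Download Max Mbps": s["down"],
                    "Upload Max Mbps": s["up"]})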

"And dang, it, ISPs of the world, please ship decent wifi!?", because
we can see the wifi going south in many cases from this vantage point
now. In the past year mikrotik in particular has done a nice update to
fq_codel and cake in RouterOS, eero 6s have got quite good, much of
openwifi/openwrt, evenroute  is good...

It feels good, after 14 years of trying to fix the internet, to be
seeing such progress on fixing bufferbloat and on understanding and
explaining the internet better. joooooiiiiiiiin us..

> Also, when you say "bridge", if the server dies, does it become a wire,
> or would that require specialized hardware builds?

What we do now is put it inline via ospf/olsr/bgp with a low cost, and
a wire with a higher cost if it fails. Things have stabilized a lot in
the last few months; the last crash I can remember was in January. (In
Rust we trust!) You have to watch out for breaking spanning tree in
that case. The most common install bug is someone flipping the inbound
and outbound interfaces in the setup.

Among other things, we replaced the Linux native bridge code with
about 600 lines of eBPF C. The enormous speedup from that is getting
us closer to what DPDK could do, but DPDK cannot queue worth a darn,
just forward willy-nilly.

I hope, in particular, that far, far more folk start leveraging
variants of inband measurement with pping. The standalone code for
that is here: https://github.com/thebracket/cpumap-pping
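
The core trick is simple enough to sketch outside of eBPF: match TCP
timestamp TSval/TSecr pairs across the two directions of a flow, and
every echoed value yields one passive RTT sample. A toy Python/scapy
version (the interface name is hypothetical, and cpumap-pping does
this in the kernel at vastly higher packet rates):

import time
from scapy.all import sniff, IP, TCP

seen = {}  # (src, dst, sport, dport, TSval) -> time first seen

def handle(pkt):
    if IP not in pkt or TCP not in pkt:
        return
    ts = dict(pkt[TCP].options).get("Timestamp")
    if ts is None:
        return
    tsval, tsecr = ts
    ip, tcp = pkt[IP], pkt[TCP]
    now = time.monotonic()
    # remember when this TSval first left in this direction
    seen.setdefault((ip.src, ip.dst, tcp.sport, tcp.dport, tsval), now)
    # a reverse packet echoing a recorded TSval is one RTT sample
    key = (ip.dst, ip.src, tcp.dport, tcp.sport, tsecr)
    if key in seen:
        print("%s <-> %s rtt ~%.1f ms" %
              (ip.src, ip.dst, (now - seen.pop(key)) * 1000))

sniff(iface="eth0", filter="tcp", prn=handle, store=False)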

> Mark.



-- 
Podcast: https://www.linkedin.com/feed/update/urn:li:activity:7058793910227111937/
Dave Täht CSO, LibreQos

