Proving Gig Speed

James Bensley jwbensley at gmail.com
Tue Jul 17 17:41:08 UTC 2018


On 17 July 2018 at 09:54, Saku Ytti <saku at ytti.fi> wrote:
> On Tue, 17 Jul 2018 at 10:53, James Bensley <jwbensley at gmail.com> wrote:
>
>> Virtually any modern day laptop with a 1G NIC will saturate a 1G link
>> using UDP traffic in iPerf with ease. A crummy i3 netbook with 1G NIC
>> can do it on one core/thread.
>
> I guess if you use large packets this might be true. But personally,
> if I'm testing network, I'm interested in latency, jitter, packet, bps
> _AND_ pps goals as well, not just bps goal.

Hi Saku,

Yeah, I fully agree with what you are saying; however, the OP's question
sounds like he "only" needed to prove bandwidth. With 1500 byte frames
I've run it up to nearly 10Gbps before (it was between VMs in two
different DCs that were having slow transfers and the hypervisors had
10G NICs, so I dare say on bare metal with large frames it will do
10Gbps).

> And I've never seen clean
> 1Gbps on iperf with small packets. It just cannot be done, even if
> iPerf was written half decently and it used recvmmsg, it still
> wouldn't be anywhere near.
> Clean 1Gbps with small packets in user space is actually very much
> doable today, just you can't use UDP socket, you must use AF_PACKET on
> Linux or BPF on OSX and you can write portable 1Gbps UDP
> sender/receiver.
> I'm very surprised we don't have iperf like program for netengs which
> does this and reports latency, jitter, packet loss with binary search
> for highest lossless pps/bps rates.

I absolutely agree there is a gap in the open source market for this
exact application. A tool that sends traffic between Tx and Rx (or
bidirectionally) at a specified frame size and frame rate, which can
max out 10Gbps at 64 byte frames if required (I say 10Gbps instead of
1Gbps because 10Gbps as an access circuit speed is becoming increasingly
common), and throughout the test it should report RTT and one way
latency, jitter and packet loss etc. and then output the results in a
format that is easy to parse. It should also have a JSON API and be
able to run in a "daemon" mode like an iPerf server that is always on
ready for people to test to/from.
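To make the reporting side concrete, here is a minimal sketch in Python
of turning per-frame timestamps into a parseable JSON report. Everything
in it is an assumption for illustration: the summarise() name, the
{sequence number: timestamp} inputs, the field names, and the definition
of jitter as the mean absolute difference between consecutive latencies
(other definitions exist). One-way latency only means something if the
Tx and Rx clocks are synchronised.

```python
import json
import statistics


def summarise(sent, received):
    """Summarise one test run.

    sent:     {sequence number: Tx timestamp in seconds}
    received: {sequence number: Rx timestamp in seconds}
    Both dicts and all field names below are hypothetical.
    """
    # Latency of every frame that arrived, in sequence order.
    latencies = [received[seq] - sent[seq]
                 for seq in sorted(received) if seq in sent]
    # Jitter taken as the change in latency between consecutive frames.
    deltas = [abs(b - a) for a, b in zip(latencies, latencies[1:])]
    lost = len(sent) - len(latencies)

    def to_us(seconds):
        return round(seconds * 1e6, 3)

    return {
        "frames_sent": len(sent),
        "frames_received": len(latencies),
        "frames_lost": lost,
        "loss_pct": 100.0 * lost / len(sent) if sent else 0.0,
        "latency_min_us": to_us(min(latencies)) if latencies else None,
        "latency_avg_us": to_us(statistics.mean(latencies)) if latencies else None,
        "latency_max_us": to_us(max(latencies)) if latencies else None,
        "jitter_avg_us": to_us(statistics.mean(deltas)) if deltas else None,
    }


if __name__ == "__main__":
    tx = {0: 0.000, 1: 0.001, 2: 0.002, 3: 0.003}
    rx = {0: 0.0005, 1: 0.0017, 3: 0.0036}  # frame 2 was lost
    print(json.dumps(summarise(tx, rx), indent=2))
```

The same dict could be served by a daemon-mode API or written to a file;
the point is only that the results stay machine readable.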

> I started to write one with Anton Aksola in Rust (using libpnet[0]),
> and implemented quite flexible protocol (server/client, client can ask
> server exactly what kind of packet to construct/expect, what rate to
> send/receive over JSON based protocol), so you could also use it to
> ask it to DDoS your routers control-plane in lab etc. And actually got
> it working, OSX+Linux ~wire rate (still needs higher end laptop to do
> 1.5Mpps on single core and we didn't implement multicore support). But
> as both of us are trash in Rust (and every other applicable language
> in this domain), we kind of dropped the project once we had sufficient
> POC running on our laptops.
> Someone who actually can code, could easily implement such program in
> a weekend. I'm happy to share the trash we've done if someone intends
> to check this box in open source world. May use it for inspiration, or
> just straight up add polish and enough CLI to make it usable as-is.

I went through a similar process. AF_PACKET is definitely what you
need to use if you want to stay in user space on Linux (I don't know
about Mac, I only use Linux). I wrote a basic multi-threaded load
generator and load sinker (Tx and Rx) in C using various kernel methods
(send(), sendmsg(), sendmmsg(), and PACKET_MMAP) with AF_PACKET to
compare them all: https://github.com/jwbensley/EtherateMT

The problem is that while C is a great language to write high
performance stuff in, it's a shit language to create a JSON API in. I
have two back to back lab servers at work with 10G links between them,
low end 2.1GHz Xeons, and I get 1Mpps per core; 8 cores minus 1 for the
OS means I max out at 7Mpps :(
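As a rough sanity check on why 7Mpps is disappointing, here is the
line-rate arithmetic as a small Python sketch. It assumes standard
Ethernet framing, where every frame costs an extra 20 bytes on the wire
(7 byte preamble, 1 byte start-of-frame delimiter, 12 byte inter-frame
gap) and the frame size includes the 4 byte FCS. The function names are
made up for illustration.

```python
import math

# Per-frame overhead on the wire that is not part of the frame itself:
# 7 byte preamble + 1 byte start-of-frame delimiter + 12 byte inter-frame gap.
ETH_OVERHEAD_BYTES = 20


def line_rate_pps(link_bps, frame_bytes):
    """Maximum frames per second; frame_bytes includes the 4 byte FCS."""
    return link_bps / ((frame_bytes + ETH_OVERHEAD_BYTES) * 8)


def cores_needed(link_bps, frame_bytes, pps_per_core):
    """Cores required to fill the link, assuming perfect linear scaling."""
    return math.ceil(line_rate_pps(link_bps, frame_bytes) / pps_per_core)


if __name__ == "__main__":
    print(f"10G, 64 byte frames: {line_rate_pps(10e9, 64):,.0f} pps")
    print(f"1G, 64 byte frames:  {line_rate_pps(1e9, 64):,.0f} pps")
    print(f"cores at 1Mpps each: {cores_needed(10e9, 64, 1e6)}")
```

So 10Gbps at 64 byte frames is roughly 14.88Mpps, and 7Mpps is only
about 47% of that. At 1Mpps per core it would take 15 cores, and that
assumes perfect scaling, which is optimistic.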

I know that XDP is coming to Linux user space so we'll see where that
goes, as it promises the magic performance levels we want. Also
TPACKETv4 is coming for AF_PACKET in Linux which should also get us to
that magic level of performance in user land (it is effectively Kernel
bypass). I'll add this to EtherateMT when I get some time to check
its performance: https://lwn.net/Articles/737947/

So EtherateMT works OK as a proof of concept, but nothing more. It
requires 100% CPU utilisation to send/receive at such high pps rates,
so there is no CPU time left for stats collection or fancy
RTT/latency/jitter etc. That can only be done (right now) with
something like DPDK, because with it we only need one or two cores for
Tx/Rx and then we have free cores for stats collection/generation etc.
I looked into
MoonGen, it creates Lua bindings for DPDK which means you can rapidly
develop DPDK based tools without knowing much about DPDK. It had some
RFC2544 Lua scripts for DPDK and I started to re-write them as they
were old and didn't work with the latest version of MoonGen:

https://github.com/jwbensley/MoonGen-Scripts

The throughput script works OK-ish (10Gbps on one core no problems):
https://github.com/jwbensley/MoonGen-Scripts/blob/master/throughput.lua
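For anyone who hasn't seen one, a throughput test in the RFC2544 style
is commonly implemented as a binary search for the highest rate that
shows no loss. Below is a minimal Python sketch of that idea; it is not
what throughput.lua does line for line. The run_trial callback is
hypothetical: it stands in for "send at this pps for a fixed duration
and return the number of frames lost".

```python
def find_max_lossless_rate(run_trial, max_pps, resolution_pps=1000):
    """Binary search for the highest rate (in pps) with zero frame loss.

    run_trial(rate_pps) is a hypothetical callback that runs one trial
    at the given rate and returns how many frames were lost.
    """
    # If the link takes the full rate cleanly there is nothing to search.
    if run_trial(max_pps) == 0:
        return max_pps

    low, high = 0, max_pps  # low: known lossless (or zero), high: known lossy
    while high - low > resolution_pps:
        mid = (low + high) // 2
        if run_trial(mid) == 0:
            low = mid   # no loss, try faster
        else:
            high = mid  # loss seen, back off
    return low


if __name__ == "__main__":
    # Fake device under test that starts dropping above 7Mpps.
    def fake_trial(rate_pps):
        return max(0, rate_pps - 7_000_000)

    print(find_max_lossless_rate(fake_trial, 14_880_952))
```

A real tool would also want to repeat each trial and hold the rate for
long enough that buffers can't hide the loss.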

Lua would allow one to easily provide parseable output and more
easily implement a JSON API. However, since MoonGen uses DPDK, we can
only use the NICs that DPDK supports and not "any Ethernet NIC
supported by Linux", which is what I really want by using AF_PACKET +
TPACKETv4.

> I think very important quality is multiplatform with static binaries.
> Because important use case is, that you can ask modestly informed
> customer to copy paste one line to download server and copy paste
> another line to have it running.
> If use case is that both ends have arbitrary clued people, then there
> are plenty of good solutions, like Cisco's trex[1]. But what I need is
> iPerf-like program, which actually a) performs and b) reports the
> correct things.

Yeah, agreed, so DPDK is out the window for me for this specific
requirement: it's Linux only (ignoring the minor level of BSD support)
and NIC specific too.

Python *yuk* is multi-OS, it has JSON libraries, and it has some
support for AF_PACKET:
https://stackoverflow.com/questions/1117958/how-do-i-use-raw-socket-in-python#6374862
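For reference, a minimal sketch of what that looks like with the
standard library socket module. It is Linux only, needs root or
CAP_NET_RAW, and a plain send() loop like this will get nowhere near the
pps rates discussed above. The helper names, the 8 byte sequence number
payload, and the choice of EtherType 0x88B5 (one of the IEEE local
experimental EtherTypes) are my own assumptions for illustration.

```python
import socket
import struct

ETHERTYPE_EXPERIMENTAL = 0x88B5  # IEEE 802 local experimental EtherType
MIN_FRAME_NO_FCS = 60            # 64 byte minimum frame minus the 4 byte FCS


def mac_to_bytes(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' to 6 raw bytes."""
    return bytes.fromhex(mac.replace(":", ""))


def build_frame(dst_mac, src_mac, seq, size=MIN_FRAME_NO_FCS):
    """Ethernet header + 8 byte sequence number, zero padded to size."""
    header = (mac_to_bytes(dst_mac) + mac_to_bytes(src_mac)
              + struct.pack("!H", ETHERTYPE_EXPERIMENTAL))
    return (header + struct.pack("!Q", seq)).ljust(size, b"\x00")


def send_frames(ifname, dst_mac, src_mac, count):
    """Send count frames out of ifname via an AF_PACKET raw socket."""
    with socket.socket(socket.AF_PACKET, socket.SOCK_RAW) as sock:
        sock.bind((ifname, 0))
        for seq in range(count):
            sock.send(build_frame(dst_mac, src_mac, seq))


if __name__ == "__main__":
    # Interface name and MAC addresses are placeholders.
    send_frames("eth0", "ff:ff:ff:ff:ff:ff", "02:00:00:00:00:01", 10)
```

The NIC appends the FCS itself, which is why the frame is built to 60
bytes rather than 64.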

I don't know enough about it, but it might be that TPACKETv4 could be
leveraged through Python. That still only covers Linux though, as
Windows and Mac have very different network stacks (but then again I do
only care about link sooooo...).

I'm keen to have another go at this problem now that I've got a better
understanding of it having written EtherateMT and played with DPDK
etc. Not sure where to go though - so just waiting on TPACKETv4 right
now.

Cheers,
James.


