RFC2544 Testing Equipment

James Bensley jwbensley at gmail.com
Wed May 31 11:23:46 UTC 2017


On 31 May 2017 at 11:56, Saku Ytti <saku at ytti.fi> wrote:
> Cool. Seems you're using AF_PACKET, which makes it actually unique.
> iperf/netperf etc use UDP or TCP socket, so UDP performance is just
> abysmal, you can't saturate 1GE link with any reliability. So
> measuring for example packet loss is not possible at all.
>
> I've been meaning to write AF_PACKET based UDP sender/receiver and
> have gotten pretty far with friend of mine on rust version, we can
> congest 1GE (on minimum size frames) on Linux reliably and actually
> tell if you're lossy. It has server/client design, where client
> requests via JSON based messages through control-channel server to
> receive or send, and what exactly.
> Alas, we're only 80% there, and seem to struggle to find time to
> polish it for initial release.
>
> We definitely need tool like iperf, which performs at least to 1GE,
> and AF_PACKET can do that, UDP socket cannot. Alas 10GE is still pipe
> dream for anything as portable as iperf, as you'd need to use DPDK,
> netmap or equivalent which will remove the NIC from userland, there
> are quite few options for that use-case, but no good option for
> use-case when you want at least 1GE but you cannot remove NIC from
> userland.

Hi Saku,

Yeah AF_PACKET sockets are used and you really need to be on a 4.x
Kernel for better performance (update your NIC firmware etc). The
problem with Etherate is that it uses Ethernet for both the test data
and the control data, and since Ethernet has no delivery guarantee it
does some strange (read: lame) things like send some control or data
frames three times to try and ensure the other side receives them when
there is frame loss.
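
To illustrate the idea (this is a minimal sketch, not Etherate's actual
code), the block below builds a raw Ethernet frame and sends the same
frame several times over an AF_PACKET socket. The names build_frame and
send_repeated are hypothetical, and the EtherType 0x88B5 (the IEEE
"local experimental" value) is an assumption made for the example.

```c
#include <arpa/inet.h>        /* htons */
#include <linux/if_packet.h>  /* struct sockaddr_ll */
#include <net/ethernet.h>     /* ETH_ALEN, ETH_HLEN, ETH_ZLEN */
#include <net/if.h>           /* if_nametoindex */
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#define SKETCH_ETHERTYPE 0x88B5  /* assumption: IEEE local experimental EtherType */

/* Write dst MAC, src MAC, EtherType and payload into buf, zero-padded to
 * the 60-byte minimum (the NIC appends the 4-byte FCS itself).
 * Returns the frame length, or 0 if buf is too small. */
size_t build_frame(uint8_t *buf, size_t buf_len,
                   const uint8_t dst[ETH_ALEN], const uint8_t src[ETH_ALEN],
                   const void *payload, size_t payload_len)
{
    size_t len = ETH_HLEN + payload_len;
    uint16_t type = htons(SKETCH_ETHERTYPE);

    if (len < ETH_ZLEN)
        len = ETH_ZLEN;
    if (len > buf_len)
        return 0;

    memset(buf, 0, len);
    memcpy(buf, dst, ETH_ALEN);
    memcpy(buf + ETH_ALEN, src, ETH_ALEN);
    memcpy(buf + 2 * ETH_ALEN, &type, sizeof(type));
    memcpy(buf + ETH_HLEN, payload, payload_len);
    return len;
}

/* Open an AF_PACKET socket (needs CAP_NET_RAW) and send the same frame
 * `copies` times. Returns how many copies were sent, or -1 on error. */
int send_repeated(const char *ifname, const uint8_t *frame, size_t len,
                  int copies)
{
    unsigned int ifindex = if_nametoindex(ifname);
    if (ifindex == 0)
        return -1;

    int fd = socket(AF_PACKET, SOCK_RAW, htons(SKETCH_ETHERTYPE));
    if (fd < 0)
        return -1;

    struct sockaddr_ll addr;
    memset(&addr, 0, sizeof(addr));
    addr.sll_family = AF_PACKET;
    addr.sll_ifindex = (int)ifindex;
    addr.sll_halen = ETH_ALEN;
    memcpy(addr.sll_addr, frame, ETH_ALEN);  /* destination MAC from the frame */

    int sent = 0;
    for (int i = 0; i < copies; i++) {
        if (sendto(fd, frame, len, 0, (struct sockaddr *)&addr,
                   sizeof(addr)) == (ssize_t)len)
            sent++;
    }
    close(fd);
    return sent;
}
```

A caller would build a frame once and then call something like
send_repeated("eth0", buf, len, 3), where "eth0" is only a placeholder
interface name. Blind repetition only lowers the chance of losing a
control frame, it does not remove it.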

Yeah 1G with large frames is do-able. 10G with large frames is also
do-able with a fast CPU. Etherate is single threaded though so you’ll
not get anywhere near 10G with 64 byte frames in Etherate. I have
started writing a multi-threaded version which will use TCP sockets to
exchange control data but still use AF_PACKET sockets for data plane
traffic.

10G with 64 byte packets should be achievable (still writing it so not
100% confirmed yet) when using the PACKET_MMAP Tx/Rx rings in
AF_PACKET which is what the new aptly named EtherateMT
(multi-threaded) uses. One can then use multiple threads (each on a
different CPU core), each with its own Tx or Rx ring buffer, to push
packets to the NIC, and we can use RSS on the NIC and assign each NIC
Tx queue to a separate core for processing NET_TX and NET_RX IRQs.
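
A rough sketch of the per-thread setup described above, assuming the
default TPACKET_V1 ring format. This is not EtherateMT's code: struct
tx_ring and the functions ring_geometry, tx_ring_open and pin_to_core
are hypothetical names for this example, and binding the socket to an
interface is left out for brevity.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE           /* for CPU_SET and sched_setaffinity */
#endif
#include <linux/if_packet.h>  /* struct tpacket_req, PACKET_TX_RING */
#include <sched.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/socket.h>
#include <unistd.h>

/* One worker thread's ring state (hypothetical struct for this sketch). */
struct tx_ring {
    int fd;
    void *map;
    size_t map_len;
    struct tpacket_req req;
};

/* Compute a ring layout that passes the kernel's sanity checks: blocks
 * are whole pages, frames are multiples of TPACKET_ALIGNMENT and larger
 * than TPACKET_HDRLEN, and tp_frame_nr is frames-per-block * tp_block_nr. */
int ring_geometry(struct tpacket_req *req, unsigned int frame_size,
                  unsigned int pages_per_block, unsigned int block_nr)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || pages_per_block == 0 || block_nr == 0)
        return -1;

    unsigned int block_size = (unsigned int)page * pages_per_block;
    if (frame_size <= TPACKET_HDRLEN ||
        frame_size % TPACKET_ALIGNMENT != 0 || frame_size > block_size)
        return -1;

    req->tp_block_size = block_size;
    req->tp_block_nr = block_nr;
    req->tp_frame_size = frame_size;
    req->tp_frame_nr = (block_size / frame_size) * block_nr;
    return 0;
}

/* Create an AF_PACKET socket with a memory-mapped Tx ring (needs
 * CAP_NET_RAW). Protocol 0 means the socket receives nothing. */
int tx_ring_open(struct tx_ring *r, unsigned int frame_size,
                 unsigned int pages_per_block, unsigned int block_nr)
{
    if (ring_geometry(&r->req, frame_size, pages_per_block, block_nr) != 0)
        return -1;

    r->fd = socket(AF_PACKET, SOCK_RAW, 0);
    if (r->fd < 0)
        return -1;
    if (setsockopt(r->fd, SOL_PACKET, PACKET_TX_RING,
                   &r->req, sizeof(r->req)) != 0) {
        close(r->fd);
        return -1;
    }
    r->map_len = (size_t)r->req.tp_block_size * r->req.tp_block_nr;
    r->map = mmap(NULL, r->map_len, PROT_READ | PROT_WRITE, MAP_SHARED,
                  r->fd, 0);
    if (r->map == MAP_FAILED) {
        close(r->fd);
        return -1;
    }
    return 0;
}

/* Pin the calling thread to one core; on Linux a pid of 0 means the
 * calling thread. */
int pin_to_core(int core)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(core, &set);
    return sched_setaffinity(0, sizeof(set), &set);
}
```

Each worker thread would call pin_to_core() with its own core number
and then tx_ring_open(), so that no ring is shared between cores. The
thread then fills frames in the mapped ring, marks them with
TP_STATUS_SEND_REQUEST and calls send() on the socket to have the
Kernel transmit them.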

So it might take 12 or 16 cores but it should be do-able in EtherateMT
still with the iperf like portability, whereas DPDK can do this on a
single core (pkt-gen and moon-gen etc). However EtherateMT would
ideally use only Kernel native features (no 3rd party libraries
required or custom Kernel compilation to enable optional modules).
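
For a sense of scale, the arithmetic behind "10G with 64 byte frames"
can be sketched as below. Every frame also occupies 20 bytes on the
wire for preamble, start-of-frame delimiter and inter-frame gap. The
function names are hypothetical, and dividing the load evenly across
cores is a simplifying assumption, not a measurement.

```c
#include <stdint.h>

/* 7 bytes preamble + 1 byte SFD + 12 bytes inter-frame gap */
#define WIRE_OVERHEAD_BYTES 20u

/* Maximum frames per second on a link, for a frame size that includes
 * the 4-byte FCS (so 64 is the Ethernet minimum). Rounds down. */
uint64_t max_fps(uint64_t link_bps, uint32_t frame_bytes)
{
    uint64_t bits_on_wire = ((uint64_t)frame_bytes + WIRE_OVERHEAD_BYTES) * 8u;
    return link_bps / bits_on_wire;
}

/* Frames per second each core must sustain if the load is split evenly
 * (an idealised assumption). */
uint64_t fps_per_core(uint64_t link_bps, uint32_t frame_bytes,
                      unsigned int cores)
{
    return max_fps(link_bps, frame_bytes) / cores;
}
```

With these definitions max_fps(10000000000, 64) gives 14880952 frames
per second, so 12 cores would each need to sustain about 1.24 million
frames per second and 16 cores about 0.93 million. At 1518 byte frames
the same link needs only 812743 frames per second in total, which is
why large frames are so much easier.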

Yeah Rust seems cool, it's on my "to-learn" list along with Go and
seven thousand other things, so I'm writing in C for now.

Cheers,
James.


