Validating multi-path in production?

Tom Beecher beecher at beecher.cc
Mon Nov 15 14:07:46 UTC 2021


It sounds like you want something like this:

https://github.com/facebookarchive/fbtracert

We have an internal tool that works on generally similar principles, and it
works pretty well.

( I have no relationship with Facebook; I just always remember their presos
on UDPinger and FBTracert from my first NANOG meeting for whatever reason.
:) )

On Sun, Nov 14, 2021 at 11:21 AM Adam Thompson <athompson at merlin.mb.ca>
wrote:

> The problem I'm looking to solve is the logical opposite, I think: I want
> to demonstrate that no links are malfunctioning in such a way that packets
> on a certain path are getting silently dropped.  Which has some "proving a
> negative" aspects to it, unfortunately.
> I think the only way I can demonstrate it is to determine that every
> single multi-path/hashed-member link is working, which is... hard.
> Especially if I need to deal with the combinatoric explosion - I *think* I
> can skip that part.
> -Adam
>
> ------------------------------
> *From:* James Bensley <jwbensley+nanog at gmail.com>
> *Sent:* Sunday, November 14, 2021 5:29:25 AM
> *To:* Adam Thompson <athompson at merlin.mb.ca>; nanog <nanog at nanog.org>
> *Subject:* Re: Validating multi-path in production?
>
> On Fri, 12 Nov 2021 at 16:54, Adam Thompson <athompson at merlin.mb.ca>
> wrote:
>
> The best I've come up with so far is to have two test systems (typically
> VMs) that use adjacent IP addresses and adjacent MAC addresses, and test
> both inbound and outbound to/from those, blindly trusting/hoping that
> hashing algorithms will *probably* exercise both paths.
>
>
> If the goal is to test that traffic *is* being distributed across multiple
> links based on traffic headers, then you can definitely roll your own. I
> think the problem is orchestrating it (feeding your topology data into the
> tool, running the tool, getting the results out, and interpreting the
> results etc).
>
> A couple of public examples:
> https://github.com/facebookarchive/UdpPinger
> https://www.youtube.com/watch?v=PN-4JKjCAT0
>
> If you do roll your own, you need to tailor the tests to your topology and
> your equipment. For example, you can have two VMs as you mentioned, each at
> opposite ends of the network. Then, if your network uses a 5-tuple for ECMP
> inside the core, you could send many flows between the two VMs, rotating
> the source port, to ensure all links in a LAG or all ECMP paths are used.
>
> It's tricky to know the hashing algo for every type of device you have in
> your network, and for each traffic type for each device type, if you have a
> multi-vendor network. Also, if your network carries a mix of IPv4, IPv6,
> PPP, MPLS L3 VPNs, MPLS L2 VPNs, GRE, GTP, IPSEC, etc., the number of
> permutations of tests you need to run and the result sets you need to
> parse grows very rapidly.
>
> Cheers,
> James.
>
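
For anyone who wants to try the source-port rotation James describes above,
here is a minimal sketch in Python. It is not how UdpPinger or fbtracert work
internally. It assumes a UDP echo responder is listening on the far-end VM,
and the function names, port numbers and timeout are made up for
illustration. It sends one probe per source port and reports which source
ports never got an echo back.

```python
import socket
import time
from typing import Dict, Iterable, List


def rotate_source_ports(base: int, count: int) -> List[int]:
    """Return `count` consecutive UDP source ports starting at `base`."""
    if base < 1024 or base + count - 1 > 65535:
        raise ValueError("source port range must fit in 1024..65535")
    return list(range(base, base + count))


def probe(dst_ip: str, dst_port: int, src_ports: Iterable[int],
          timeout: float = 1.0) -> Dict[int, bool]:
    """Send one UDP probe per source port; True means the echo came back.

    Assumes something at (dst_ip, dst_port) echoes UDP payloads unchanged.
    """
    results: Dict[int, bool] = {}
    for sport in src_ports:
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
            s.settimeout(timeout)
            s.bind(("", sport))  # fix the source port so the 5-tuple varies
            payload = f"probe {sport} {time.time()}".encode()
            s.sendto(payload, (dst_ip, dst_port))
            try:
                data, _ = s.recvfrom(2048)
                results[sport] = data == payload
            except OSError:  # timeout or ICMP unreachable counts as a loss
                results[sport] = False
    return results


def suspect_flows(results: Dict[int, bool]) -> List[int]:
    """Return the source ports whose probe was not echoed, sorted."""
    return sorted(port for port, ok in results.items() if not ok)
```

A single lost probe proves little. Running something like
suspect_flows(probe("192.0.2.10", 7, rotate_source_ports(33434, 64)))
repeatedly and seeing the same source ports fail each time would hint that
one hashed member is dropping traffic, though which member it is still
depends on each device's hashing.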