interesting troubleshooting

Matthew Petach mpetach at netflight.com
Fri Mar 20 22:23:19 UTC 2020


On Fri, Mar 20, 2020 at 3:09 PM Saku Ytti <saku at ytti.fi> wrote:

> Hey Nimrod,
>
> > I was contacted by my NOC to investigate a LAG that was not distributing
> > traffic evenly among the members to the point where one member was
> > congested while the utilization on the LAG was reasonably low. Looking at
> > my netflow data, I was able to confirm that this was caused by a single
> > large flow of ESP traffic. Fortunately, I was able to shift this flow to
> > another path that had enough headroom available so that the flow could be
> > accommodated on a single member link.
> >
> > With the increase in remote workers and VPN traffic that won't hash
> > across multiple paths, I thought this anecdote might help someone else
> > track down a problem that might not be so obvious.
>
> This problem is called an elephant flow. Some vendors have a solution
> for this: they dynamically monitor utilisation and remap the
> hashResult => egressInt table to create a bias that offsets the
> elephant flow.
>
> One particular example:
>
> https://www.juniper.net/documentation/en_US/junos/topics/reference/configuration-statement/adaptive-edit-interfaces-aex-aggregated-ether-options-load-balance.html
>
> Ideally VPN providers would be defensive and would use SPORT for
> entropy, like MPLSoUDP does.
>
> --
>   ++ytti
>
>
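
To make the hashing point above concrete, here is a minimal sketch of why a
single ESP flow pins to one member while a UDP encapsulation with a varying
source port spreads out. Everything in it is an assumption for illustration:
the CRC32-over-a-string hash, the four-member LAG, the documentation-range
addresses, and the placeholder port numbers. Real line cards use their own
hash functions and field selections.

```python
import zlib

MEMBERS = 4  # hypothetical number of LAG member links


def pick_member(src_ip, dst_ip, proto, sport=0, dport=0, members=MEMBERS):
    """Pick a LAG member by hashing the header fields (illustrative hash only)."""
    key = f"{src_ip}|{dst_ip}|{proto}|{sport}|{dport}".encode()
    return zlib.crc32(key) % members


# ESP is IP protocol 50 and carries no ports, so every packet between the
# same two VPN gateways produces the same key and lands on the same member.
esp_members = {
    pick_member("192.0.2.1", "198.51.100.1", 50) for _ in range(1000)
}

# The same gateway pair with a hypothetical UDP encapsulation (protocol 17)
# that varies the source port per inner flow, in the spirit of MPLSoUDP.
# The port values are placeholders.
udp_members = {
    pick_member("192.0.2.1", "198.51.100.1", 17, sport=49152 + i, dport=6635)
    for i in range(1000)
}

print("members used by ESP:", sorted(esp_members))
print("members used by UDP encapsulation:", sorted(udp_members))
```

The ESP case always uses exactly one member, no matter how much traffic the
tunnel carries; the UDP case gives the hash something to work with.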

There are *several* caveats to doing dynamic monitoring and remapping of
flows; one of the biggest challenges is that it puts extra demands on the
line cards tracking the flows, especially as the number of flows rises to
large values.  I recommend reading
https://www.juniper.net/documentation/en_US/junos/topics/topic-map/load-balancing-aggregated-ethernet-interfaces.html#id-understanding-aggregated-ethernet-load-balancing
before configuring it.

"Although the feature performance is high, it consumes significant amount
of line card memory. Approximately, 4000 logical interfaces or 16
aggregated Ethernet logical interfaces can have this feature enabled on
supported MPCs. However, when the Packet Forwarding Engine hardware memory
is low, depending upon the available memory, it falls back to the default
load balancing mechanism."
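
As a rough illustration of where that state comes from, the sketch below
keeps a byte counter per hash bucket and moves buckets from the busiest
member to the least busy one. It is a toy greedy scheme under assumed
numbers (8 buckets, 2 members, made-up byte counts), not a description of
how Junos or any other vendor implements adaptive load balancing.

```python
from collections import defaultdict


def member_load(table, bucket_bytes):
    """Sum the per-bucket byte counters into a per-member load."""
    load = defaultdict(int)
    for bucket, member in enumerate(table):
        load[member] += bucket_bytes[bucket]
    return load


def rebalance(table, bucket_bytes, members):
    """Move one bucket from the busiest member to the least busy member.

    Returns True if a bucket was moved, False if no move narrows the gap.
    """
    load = member_load(table, bucket_bytes)
    hot = max(range(members), key=lambda m: load[m])
    cold = min(range(members), key=lambda m: load[m])
    gap = load[hot] - load[cold]
    # Moving a bucket of size b changes the gap to |gap - 2b|, so only
    # buckets with 0 < b < gap actually help.
    candidates = [
        b for b, m in enumerate(table)
        if m == hot and 0 < bucket_bytes[b] < gap
    ]
    if not candidates:
        return False
    best = min(candidates, key=lambda b: abs(gap - 2 * bucket_bytes[b]))
    table[best] = cold
    return True


# hashResult => egressInt table: bucket index -> member index.
table = [0, 1, 0, 1, 0, 1, 0, 1]
# Per-bucket byte counters; bucket 0 holds the elephant flow.
bucket_bytes = [900, 10, 20, 10, 30, 10, 50, 10]

before = max(member_load(table, bucket_bytes).values())
for _ in range(len(table) * 4):  # bounded number of passes
    if not rebalance(table, bucket_bytes, members=2):
        break
after = max(member_load(table, bucket_bytes).values())

print("busiest member before:", before)
print("busiest member after:", after)
print("final table:", table)
```

Two things fall out of the toy. The busiest member can never drop below the
size of the elephant itself, so remapping only helps when the elephant fits
on one link. And the counters and the remappable table are per-LAG state
that has to live somewhere, which is the memory cost the documentation
warns about.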

What is that old saying?

Oh, right--There Ain't No Such Thing As A Free Lunch.   ^_^;;

Matt