Long-haul 100Mbps EPL circuit throughput issue

alvin nanog nanogml at Mail.DDoS-Mitigator.net
Thu Nov 5 23:19:12 UTC 2015


hi eric

On 11/05/15 at 04:48pm, Eric Dugas wrote:
...
> Linux test machine in customer's VRF <-> SRX100 <-> Carrier CPE (Cisco
> 2960G) <-> Carrier's MPLS network <-> NNI - MX80 <-> Our MPLS network <->
> Terminating edge - MX80 <-> Distribution switch - EX3300 <-> Linux test
> machine in customer's VRF
> 
> We can fill the link with UDP traffic with iperf but with TCP, we can reach
> 80-90% and then the traffic drops to 50% and slowly increases back up to 90%.
 
if i were involved with these tests, i'd start looking for "not enough tcp send 
and tcp receive buffers"

for flooding at 100Mbit/s, you'd need about 12MB of buffer ... that's roughly one 
second of data at 100Mbit/s (~12.5MB/s); the rule of thumb is buffer >= bandwidth 
x round-trip time

udp does NOT care much about data dropped because the buffers ran out,
but tcp does care about "not enough buffers" ... somebody resend packet# 1357902456 :-)

at least double or triple the buffer size to compensate for all kinds of 
network wackiness: 
data in transit, misconfigured hardware in the path, misconfigured iperfs, 
misconfigured kernels, interrupt handling, etc, etc
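
rough python sketch of that buffer math if you want to plug in your own numbers -- 
the 100Mbit/s rate is from the thread, the RTT values below are just assumptions, 
measure your own:

    # bandwidth-delay-product estimate for sizing tcp buffers:
    # buffer >= link rate * round-trip time, plus headroom
    def bdp_bytes(link_mbps, rtt_ms):
        """bytes in flight needed to keep the pipe full: rate * RTT"""
        return int(link_mbps * 1_000_000 / 8 * rtt_ms / 1000)

    link_mbps = 100.0                        # the EPL circuit in the thread
    for rtt_ms in (10, 50, 100, 500, 1000):  # assumed RTTs -- measure yours
        bdp = bdp_bytes(link_mbps, rtt_ms)
        # double/triple it for all the wackiness listed above
        print(f"RTT {rtt_ms:4d} ms -> BDP {bdp/1e6:6.2f} MB, "
              f"buffer ~{3*bdp/1e6:6.2f} MB")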

- how many "iperf flows" are you also running ??
	- running dozens or 100s of them does affect thruput too

- does the same thing happen with socat ??

- if iperf and socat agree on network thruput, it's the hw somewhere
  (rough python cross-check sketched below this list)

- slowly increasing thruput doesn't make sense to me ... it sounds like 
something is caching some of the data
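
if you want a third opinion besides iperf and socat, here's a bare-bones python 
tcp probe -- only a sketch; the port, 8MB buffer request and 64KB chunk size are 
assumptions, and the kernel may clamp the buffers (see net.core.rmem_max / 
net.core.wmem_max):

    # bare-bones tcp throughput probe with explicit socket buffers (python 3)
    # usage:  python3 probe.py server 5001        (receiving side)
    #         python3 probe.py <server-ip> 5001   (sending side)
    import socket, sys, time

    BUF = 8 * 1024 * 1024    # requested SO_SNDBUF/SO_RCVBUF -- kernel may clamp
    CHUNK = 64 * 1024        # bytes per send/recv call
    SECONDS = 10             # how long the client blasts data

    def server(port):
        srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        # set the receive buffer before listen() so accepted sockets inherit it
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, BUF)
        srv.bind(("", port))
        srv.listen(1)
        conn, _ = srv.accept()
        total, start = 0, time.time()
        while True:
            data = conn.recv(CHUNK)
            if not data:
                break
            total += len(data)
        secs = time.time() - start
        print(f"{total/1e6:.1f} MB in {secs:.1f}s = {total*8/secs/1e6:.1f} Mbit/s")

    def client(host, port):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        # request a big send buffer before connect()
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, BUF)
        sock.connect((host, port))
        payload = b"\0" * CHUNK
        deadline = time.time() + SECONDS
        while time.time() < deadline:
            sock.sendall(payload)
        sock.close()

    if __name__ == "__main__":
        if sys.argv[1] == "server":
            server(int(sys.argv[2]))
        else:
            client(sys.argv[1], int(sys.argv[2]))

if this, iperf and socat all stall at the same number, the limit is probably in 
the path, not in the test tools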

magic pixie dust
alvin

> Has anyone dealt with this kind of problem in the past? We've tested by
> forcing ports to 100-FD at both ends, policing the circuit on our side, and
> calling the carrier and escalating to L2/L3 support. They also tried to
> police the circuit but as far as I know, they didn't modify anything else.
> I've told our support to have them look for underrun errors on their Cisco
> switch and they can see some. They're pretty much in the same boat as us
> and they're not sure where to look.
> 


