Long-haul 100Mbps EPL circuit throughput issue
greg at foletta.org
Thu Nov 5 23:35:13 UTC 2015
Along with the receive window/buffer sizing needed for your particular
bandwidth-delay product, it appears you're also seeing TCP move from
slow start into a congestion avoidance mechanism (Reno, Tahoe, CUBIC, etc.).
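[A quick sketch of why congestion avoidance produces exactly the pattern Eric describes. This is an illustrative Reno-style AIMD model, not a measurement; the starting window, path limit, and round counts are made-up numbers.]

```python
# Sketch of TCP Reno-style AIMD (additive increase, multiplicative
# decrease). On loss the congestion window is halved, then grows by
# one segment per RTT, which matches throughput dropping to ~50% of
# line rate and slowly climbing back.

def aimd_cwnd(start, limit, rounds):
    """Track the congestion window (in segments) over RTT rounds:
    grow by 1 per RTT, halve when the window exceeds the path limit."""
    cwnd, history = start, []
    for _ in range(rounds):
        if cwnd > limit:      # loss detected: multiplicative decrease
            cwnd = cwnd // 2
        else:                 # congestion avoidance: additive increase
            cwnd += 1
        history.append(cwnd)
    return history

history = aimd_cwnd(start=80, limit=100, rounds=50)
print(min(history), max(history))  # sawtooth between ~50% and ~100%
```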
greg at foletta.org
On 6 November 2015 at 10:19, alvin nanog <nanogml at mail.ddos-mitigator.net>
> hi eric
> On 11/05/15 at 04:48pm, Eric Dugas wrote:
> > Linux test machine in customer's VRF <-> SRX100 <-> Carrier CPE (Cisco
> > 2960G) <-> Carrier's MPLS network <-> NNI - MX80 <-> Our MPLS network <->
> > Terminating edge - MX80 <-> Distribution switch - EX3300 <-> Linux test
> > machine in customer's VRF
> > We can fill the link with UDP traffic using iperf, but with TCP we can
> > reach 80-90%, then the traffic drops to 50% and slowly increases back up to 90%.
> if i was involved with these tests, i'd start looking for "not enough tcp
> send and tcp receive buffers"
> for flooding at 100Mbit/s, you'd need about 12MB of buffers ...
> udp does NOT care too much about data dropped due to full buffers,
> but tcp cares about "not enough buffers" .. somebody resend packet#
> 1357902456 :-)
> at least double or triple the buffers needed, to compensate for all kinds of
> network wackiness:
> data in transit, misconfigured hardware in the path, misconfigured iperfs,
> misconfigured kernels, interrupt handling, etc, etc
> - how many "iperf flows" are you also running ??
> - running dozens or hundreds of them does affect throughput too
> - does the same thing happen with socat ??
> - if iperf and socat agree on network throughput, it's the hw somewhere
> - slowly increasing throughput doesn't make sense to me ... it sounds like
> something is caching some of the data
> magic pixie dust
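[The buffer-sizing advice above can be sketched as a back-of-envelope calculation: size for the bandwidth-delay product, then double or triple it. The 80 ms RTT below is an assumed example value for a long-haul path, not a figure from this thread.]

```python
# Back-of-envelope socket buffer sizing for a 100 Mbit/s circuit:
# the bandwidth-delay product is the minimum data in flight needed
# to keep the pipe full; then apply the 2-3x safety factor suggested
# above for queuing, retransmits, and other "network wackiness".

def bdp_bytes(bandwidth_bps, rtt_seconds):
    """Bandwidth-delay product in bytes (bandwidth is in bits/s)."""
    return int(bandwidth_bps / 8 * rtt_seconds)

link = 100e6    # 100 Mbit/s EPL circuit
rtt = 0.080     # assumed 80 ms round-trip time for a long-haul path

bdp = bdp_bytes(link, rtt)    # 1,000,000 bytes, i.e. ~1 MB in flight
safe = 3 * bdp                # tripled per the advice above
print(f"BDP: {bdp} bytes, suggested buffer: {safe} bytes")
```

Note that at 100 Mbit/s the link moves 12.5 MB per second, so the ~12 MB figure quoted above corresponds to buffering roughly one full second of traffic.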
> > Has anyone dealt with this kind of problem in the past? We've tested by
> > forcing ports to 100-FD at both ends, policing the circuit on our side,
> > and calling the carrier and escalating to L2/L3 support. They also tried
> > to police the circuit but, as far as I know, they didn't modify anything.
> > I've told our support to have them look for underrun errors on their
> > switch, and they can see some. They're pretty much in the same boat as us
> > and aren't sure where to look.
More information about the NANOG mailing list