TCP Performance

Tue Aug 27 16:56:14 UTC 2013

No QoS is in use anywhere.
To the best of my ability I've eliminated packet loss. However, I've not 
found any better way to check for it than ICMP/MTR/ping -f, etc.
The reason flow control has been mentioned is to correct buffer overflow at 
the microwave links. They physically link at gigabit full duplex (GigFDX), 
but the radio interface is only capable of ~360Mb/s, so it's possible for 
the sending device to overflow the buffer between the fiber/Ethernet and 
the radio interface. We've had an issue like this in the past, where 
forcing 100Mb/s FDX on a licensed radio fixed the problem, because the 
Ethernet was then slower than the radio interface. However, the downside 
is that it limits the link to 100Mb/s, which isn't sufficient anymore.
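
A back-of-the-envelope sketch of the overflow described above, in Python. 
The 1Gb/s and ~360Mb/s rates come from this post; the 512 KiB buffer depth 
is purely an assumed figure for illustration (the radio's real buffer size 
is not stated here), and the function name is hypothetical.

```python
def buffer_fill_time_ms(buffer_bytes, ingress_bps, egress_bps):
    """Milliseconds until a FIFO overflows during a sustained line-rate burst.

    Returns infinity when the ingress is not faster than the egress,
    because the queue then never builds up.
    """
    if ingress_bps <= egress_bps:
        return float("inf")
    net_fill_bps = ingress_bps - egress_bps  # rate at which the queue grows
    return buffer_bytes * 8 / net_fill_bps * 1000.0


GIG_E_BPS = 1_000_000_000          # Ethernet side of the radio (from the post)
FAST_E_BPS = 100_000_000           # the forced 100Mb/s workaround (from the post)
RADIO_BPS = 360_000_000            # approximate radio capacity (from the post)
ASSUMED_BUFFER_BYTES = 512 * 1024  # ASSUMPTION: real buffer depth is unknown

gig_ms = buffer_fill_time_ms(ASSUMED_BUFFER_BYTES, GIG_E_BPS, RADIO_BPS)
fast_ms = buffer_fill_time_ms(ASSUMED_BUFFER_BYTES, FAST_E_BPS, RADIO_BPS)

print(f"GigE into radio:  buffer overflows after {gig_ms:.2f} ms of burst")
print(f"100Mb/s into radio: buffer overflows after {fast_ms} ms (never)")
```

Under these assumptions a burst only has to last a few milliseconds at 
gigabit line rate before the radio starts tail-dropping, which would be 
consistent with why forcing the port to 100Mb/s hid the problem.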
In terms of congestion, there is none from my point of view. Every link in 
question runs =>30% utilization.

Nick Olsen
Network Operations (855) FLSPEED  x106

----------------------------------------
From: "Blake Dunlap" <ikiris at gmail.com>
Sent: Tuesday, August 27, 2013 11:42 AM
To: nick at flhsi.com
Cc: nanog at thedaileyplanet.com, "nanog at nanog.org" <nanog at nanog.org>
Subject: Re: TCP Performance

This really sounds like you aren't testing the correct flow type in 
i/jperf, or you have some QoS queues for http traffic but not the perf 
traffic that are filled.

Regardless, your problem looks like either tail drops or packet loss, which 
you showed originally. The task is to find out where this is occurring, and 
which of the two it is. If you want to confirm what is going on, there are 
some great bandwidth calculators on the internet which will show you what 
bandwidth you can get with a given ms delay and % packet loss.
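
A rough sketch of the estimate such calculators produce, in Python. Many of 
them are based on the Mathis et al. approximation, throughput <= (MSS / RTT) 
* (C / sqrt(p)), with C about 1.22. The 1460-byte MSS and 20 ms RTT below 
are assumed values for illustration, not measurements from this thread, and 
the function name is hypothetical.

```python
import math


def mathis_throughput_bps(mss_bytes, rtt_s, loss_rate, c=math.sqrt(1.5)):
    """Upper bound on steady-state single-stream TCP throughput (bits/s).

    Mathis et al. approximation: (MSS / RTT) * (C / sqrt(p)).
    c = sqrt(3/2) corresponds to periodic loss without delayed ACKs.
    """
    if not 0 < loss_rate <= 1:
        raise ValueError("loss_rate must be in (0, 1]")
    if rtt_s <= 0:
        raise ValueError("rtt_s must be positive")
    return (mss_bytes * 8 / rtt_s) * (c / math.sqrt(loss_rate))


ASSUMED_MSS_BYTES = 1460  # ASSUMPTION: typical MSS for a 1500-byte MTU
ASSUMED_RTT_S = 0.020     # ASSUMPTION: 20 ms round trip, not measured here

for loss in (0.0001, 0.001, 0.01):
    bps = mathis_throughput_bps(ASSUMED_MSS_BYTES, ASSUMED_RTT_S, loss)
    print(f"loss {loss:.2%}: at most ~{bps / 1e6:.1f} Mbit/s per stream")
```

The bound scales with 1/sqrt(p), so a hundredfold increase in loss cuts the 
achievable single-stream rate by about a factor of ten.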

As far as flow control, it's really outside the scope. If you ever need flow 
control, there is usually a specific reason like FCoE, and if not, it's 
generally better to just fix the backplane congestion issue if you can, 
than to ever worry about using FC. The problem with FC isn't node to node; 
it's when you have node to node to node with additional devices. It isn't 
smart enough to discriminate, and can crater your network 3 devices over 
when it would be much better to just lose a few packets.

-Blake

On Tue, Aug 27, 2013 at 9:49 AM, Nick Olsen <nick at flhsi.com> wrote:
 Duplex mismatch has been checked across the board. On every device.

Nick Olsen
Network Operations (855) FLSPEED  x106

----------------------------------------
From: "Chad Dailey" <nanog at thedaileyplanet.com>
Sent: Tuesday, August 27, 2013 10:48 AM
To: nick at flhsi.com
Subject: Re: TCP Performance

Check for duplex mismatch at the server.

On Mon, Aug 26, 2013 at 2:07 PM, Nick Olsen <nick at flhsi.com> wrote:
Greetings all, I've got an issue I was hoping to put a few more eyes on.
Here's the scenario. Downloading a file at our border is multiple orders
of magnitude faster than a few hops out. Using the same 128MB test file, I
tested at two different locations, as well as between them. Using multiple
connections improves throughput; however, it's the single-stream issue
we're looking at right now. All testing servers in question are CentOS
Linux.

Orlando Datapath:
  Cogent > Orlando Border Router (Mikrotik) > HP Procurve Switch > Server
Results:
  2013-08-29 05:04:09 (52.6 MB/s) - `128mbfile.tgz' saved
  [127926272/127926272]

Cocoa NOC Datapath:
  Cogent > Orlando Border Router (Mikrotik) >
  Licensed Microwave Link (300+Mb/s Capacity) >
  East Orange Router (Mikrotik) >
  Licensed Microwave Link (300+Mb/s Capacity) >
  Cocoa Router (Mikrotik) >
  Licensed Microwave Link (300+Mb/s Capacity) >
  Colo Router (Mikrotik) > NOC Router (Mikrotik) >
  HP Procurve Switch > Server
Results:
  2013-08-26 13:42:25 (398 KB/s) - `128mbfile.tgz' saved
  [127926272/127926272]

Orlando-Cocoa NOC Datapath:
  Orlando Server > HP Procurve Switch > Orlando Border Router (Mikrotik) >
  Licensed Microwave Link (300+Mb/s Capacity) >
  East Orange Router (Mikrotik) >
  Licensed Microwave Link (300+Mb/s Capacity) >
  Cocoa Router (Mikrotik) >
  Licensed Microwave Link (300+Mb/s Capacity) >
  Colo Router (Mikrotik) > NOC Router (Mikrotik) >
  HP Procurve Switch > Server
Results:
  2013-08-26 13:56:25 (3.31 MB/s) - `128mbfile.tgz' saved
  [134217728/134217728]
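
To put the three wget results above on one scale, here is a small Python 
sketch that converts them to Mbit/s and compares them. It assumes wget's 
KB and MB are 1024-based units; the dictionary and function names are 
illustrative only.

```python
# ASSUMPTION: wget reports rates in 1024-based units (KB = 1024 bytes).
UNIT_BYTES = {"KB/s": 1024, "MB/s": 1024 ** 2}

# Rates exactly as reported in the wget output quoted above.
RESULTS = {
    "Orlando": (52.6, "MB/s"),
    "Cocoa NOC": (398, "KB/s"),
    "Orlando-Cocoa NOC": (3.31, "MB/s"),
}


def to_mbit_per_s(value, unit):
    """Convert a wget-style rate to megabits per second (10^6 bits/s)."""
    return value * UNIT_BYTES[unit] * 8 / 1e6


rates = {name: to_mbit_per_s(v, u) for name, (v, u) in RESULTS.items()}
baseline = rates["Orlando"]

for name, mbps in rates.items():
    print(f"{name:18s} {mbps:8.2f} Mbit/s  ({baseline / mbps:6.1f}x slower than Orlando)")
```

With that unit assumption, the border download is roughly 135 times faster 
than the Cocoa NOC download from Cogent, and roughly 16 times faster than 
the Orlando-to-Cocoa transfer.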

Now, for the fun of it, I ran a single-TCP-stream iperf between our Cocoa
and Orlando POPs, just like the HTTP test above. (The server has a 100Mb/s
port.) It maxes out the port, unlike the HTTP test.

[root at ded01 ~]# iperf -c 208.90.219.18
------------------------------------------------------------
Client connecting to 208.90.219.18, TCP port 5001
TCP window size: 16.0 KByte (default)
------------------------------------------------------------
[  3] local 206.208.56.130 port 47281 connected with 208.90.219.18 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec   114 MBytes  95.7 Mbits/sec
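
A quick Python sanity check on the iperf figures above, using the relation 
throughput = window / RTT. The 16.0 KByte window and 95.7 Mbits/sec come 
from the output; the 10 ms RTT in the second calculation is an assumed 
value, and the function names are hypothetical. Note that on Linux the 
window iperf prints is, as far as I know, only the default socket buffer, 
which the kernel normally autotunes upward during the transfer.

```python
def max_rtt_for_window_s(window_bytes, rate_bps):
    """Largest RTT (seconds) at which a fixed window can sustain rate_bps."""
    return window_bytes * 8 / rate_bps


def window_limited_bps(window_bytes, rtt_s):
    """Throughput ceiling (bits/s) of one stream with a fixed window."""
    return window_bytes * 8 / rtt_s


REPORTED_WINDOW_BYTES = 16 * 1024  # "TCP window size: 16.0 KByte (default)"
REPORTED_RATE_BPS = 95.7e6         # "95.7 Mbits/sec"
ASSUMED_RTT_S = 0.010              # ASSUMPTION: 10 ms, not measured here

max_rtt_ms = max_rtt_for_window_s(REPORTED_WINDOW_BYTES, REPORTED_RATE_BPS) * 1000
ceiling_mbps = window_limited_bps(REPORTED_WINDOW_BYTES, ASSUMED_RTT_S) / 1e6

print(f"A fixed 16 KByte window reaches 95.7 Mbit/s only if RTT <= {max_rtt_ms:.2f} ms")
print(f"At an assumed 10 ms RTT it would cap out near {ceiling_mbps:.1f} Mbit/s")
```

If the real round trip across the three microwave hops is longer than about 
1.4 ms, the window must have grown well beyond 16 KByte for iperf to fill 
the 100Mb/s port.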

Here are the associated packet captures for each transfer, as well as full
wget output and traceroutes for each test. As you can see, the tests
crossing the wireless links show about 3x more TCP re-transmits/dup ACKs.
But I'm not sure I'm sold that this could cause such a huge drop in
throughput. Other than that, nothing really stands out to me as to why
these transfers are so much slower. Intra-network iperf testing shows full
throughput the whole way with a single connection, as does UDP testing. One
thing to note is that the iperf testing has far fewer TCP re-transmits/dup
ACKs than any of the HTTP testing, crossing the same microwave links and
routers.
http://cdn.141networks.com/files/captures.zip
I appreciate any insight anyone might have. Thanks!
 Nick Olsen
Network Operations (855) FLSPEED  x106