Lossy cogent p2p experiences?
Tom Beecher
beecher at beecher.cc
Tue Sep 5 21:13:01 UTC 2023
>
> Cogent support has been about as bad as you can get. Everything is great,
> clean your fiber, iperf isn’t a good test, install a physical loop oh wait
> we don’t want that so go pull it back off, new updates come at three to
> seven day intervals, etc. If the performance had never been good to begin
> with I’d have just attributed this to their circuits, but since it worked
> until late June, I know something has changed. I’m hoping someone else has
> run into this and maybe knows of some hints I could give them to
> investigate. To me it sounds like there’s a rate limiter / policer defined
> somewhere in the circuit, or an overloaded interface/device we’re forced to
> traverse, but they assure me this is not the case and claim to have
> destroyed and rebuilt the logical circuit.
>
Sure smells like port buffer issues somewhere in the middle (mismatched
deep/shallow buffers, or something configured to support jumbo frames with
buffers not sized for them).
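For a rough sense of why buffer depth matters on this path: a quick sketch of the bandwidth-delay product for the 10 Gbit/s, 52 ms circuit described below. The 12 MB shallow-buffer figure is a hypothetical example of a small-buffer switch, not a known device in this path.

```python
# Back-of-envelope buffer math for the mismatch theory above.
# Assumed figures: 10 Gbit/s circuit and 52 ms RTT (from the original
# post); the 12 MB shared buffer is a hypothetical shallow-buffer device.

LINE_RATE_BPS = 10e9          # 10 Gbit/s circuit
RTT_S = 0.052                 # 52 ms round-trip time

# Bandwidth-delay product: bytes in flight needed to keep the pipe full.
bdp_bytes = LINE_RATE_BPS * RTT_S / 8
print(f"BDP: {bdp_bytes / 1e6:.0f} MB")  # ~65 MB

# A shallow-buffered switch can absorb only a fraction of a
# full-BDP burst before it starts tail-dropping.
shallow_buffer_bytes = 12e6
print(f"Burst absorbed: {shallow_buffer_bytes / bdp_bytes:.0%}")
```

A device like that dropping the tail of every burst would line up with the pattern of heavy initial loss on new flows reported below.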
On Thu, Aug 31, 2023 at 11:57 AM David Hubbard <
dhubbard at dino.hostasaurus.com> wrote:
> Hi all, curious if anyone who has used Cogent as a point-to-point provider
> has gone through packet loss issues with them and was able to resolve them
> successfully? I’ve got a non-rate-limited 10gig circuit between two geographic
> locations that have about 52ms of latency. Mine is set up to support both
> jumbo frames and vlan tagging. I do know Cogent packetizes these circuits,
> so they’re not like waves, and that the expected single session TCP
> performance may be limited to a few gbit/sec, but I should otherwise be
> able to fully utilize the circuit given enough flows.
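The single-flow expectation above follows directly from window-limited TCP: throughput is bounded by window size divided by RTT. A minimal sketch, using the 52 ms RTT from the post; the window sizes are illustrative defaults, not measured values from this circuit.

```python
# Why one TCP flow tops out at a few Gbit/s over 52 ms.
# Window sizes below are illustrative, not measured on this circuit.

RTT_S = 0.052  # 52 ms round-trip time

def max_throughput_gbps(window_bytes: float, rtt_s: float = RTT_S) -> float:
    """Upper bound on a single TCP flow: window / RTT."""
    return window_bytes * 8 / rtt_s / 1e9

print(max_throughput_gbps(4e6))    # 4 MB window  -> ~0.6 Gbit/s
print(max_throughput_gbps(16e6))   # 16 MB window -> ~2.5 Gbit/s
print(max_throughput_gbps(65e6))   # ~65 MB window needed for 10 Gbit/s
```

This is why parallel flows (or an aggressively tuned stack) are needed to fill the circuit even when it is behaving.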
>
>
>
> Circuit went live earlier this year, had zero issues with it. Testing
> with common tools like iperf would allow several gbit/sec of TCP traffic
> using single flows, even without an optimized TCP stack. Using parallel
> flows or UDP we could easily get close to wire speed. Starting about ten
> weeks ago we had a significant slowdown, to even complete failure, of
> bursty data replication tasks between equipment that was using this
> circuit. Rounds of testing demonstrate that new flows often experience
> significant initial packet loss of several thousand packets, and will then
> have ongoing lesser packet loss every five to ten seconds after that.
> There are times we can’t do better than 50 Mbit/sec, and it’s rare to
> achieve even gigabit speeds unless we run many streams with a lot of
> tuning. With UDP we also see the loss, but can still push many gigabits
> through with one sender, or wire speed with several nodes.
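The 50 Mbit/sec floor above is consistent with even a tiny random loss rate at this RTT. A rough sanity check using the Mathis approximation, rate ≈ MSS / (RTT × √p); the 1448-byte MSS is an assumed typical value, not taken from the circuit, and the model assumes Reno-style loss response.

```python
# Mathis-model sanity check of the observed numbers:
#   rate ≈ MSS / (RTT * sqrt(p))
# MSS of 1448 bytes is an assumed typical value for a 1500-byte MTU;
# RTT of 52 ms is from the original post.

import math

MSS_BYTES = 1448
RTT_S = 0.052

def mathis_rate_bps(loss_prob: float) -> float:
    """Approximate steady-state TCP throughput under random loss."""
    return MSS_BYTES * 8 / (RTT_S * math.sqrt(loss_prob))

# Loss probability implied by the ~50 Mbit/sec floor reported above:
target_bps = 50e6
p = (MSS_BYTES * 8 / (RTT_S * target_bps)) ** 2
print(f"~{p:.2e} loss caps a single flow near 50 Mbit/s at 52 ms RTT")
```

In other words, loss on the order of 0.002% is enough to explain the observed floor, which is why the periodic loss every few seconds is so damaging to untunable stacks.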
>
>
>
> For equipment that doesn’t have a tunable TCP stack, such as storage
> arrays or VMware, the retransmits completely ruin performance or cause
> ongoing failures we can’t overcome.
>
>
>
> Cogent support has been about as bad as you can get. Everything is great,
> clean your fiber, iperf isn’t a good test, install a physical loop oh wait
> we don’t want that so go pull it back off, new updates come at three to
> seven day intervals, etc. If the performance had never been good to begin
> with I’d have just attributed this to their circuits, but since it worked
> until late June, I know something has changed. I’m hoping someone else has
> run into this and maybe knows of some hints I could give them to
> investigate. To me it sounds like there’s a rate limiter / policer defined
> somewhere in the circuit, or an overloaded interface/device we’re forced to
> traverse, but they assure me this is not the case and claim to have
> destroyed and rebuilt the logical circuit.
>
>
>
> Thanks!
>