Lossy cogent p2p experiences?

Tom Beecher beecher at beecher.cc
Tue Sep 5 21:13:01 UTC 2023


>
> Cogent support has been about as bad as you can get: everything is
> great, clean your fiber, iperf isn’t a good test, install a physical
> loop (oh wait, we don’t want that, so go pull it back off), new
> updates come at three- to seven-day intervals, etc.  If the
> performance had never been good to begin with I’d have just
> attributed this to their circuits, but since it worked until late
> June, I know something has changed.  I’m hoping someone else has run
> into this and maybe knows of some hints I could give them to
> investigate.  To me it sounds like there’s a rate limiter / policer
> defined somewhere in the circuit, or an overloaded interface or
> device we’re forced to traverse, but they assure me this is not the
> case and claim to have destroyed and rebuilt the logical circuit.
>

Sure smells like port buffer issues somewhere in the middle (mismatched
deep/shallow buffers, or something configured to support jumbo frames
but with buffers not optimized for them).
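
A rough back-of-the-envelope sketch of why that would match the symptoms:
at 10 Gbit/s and ~52 ms, the bandwidth-delay product is around 65 MB, far
more than a shallow-buffered hop can absorb when a sender overshoots. All
numbers below are illustrative assumptions (including the 4 MB buffer),
not measurements from this circuit:

# Why a shallow-buffered hop can tail-drop the start of a fast flow on a
# long-RTT path. All values here are illustrative assumptions.
LINE_RATE_BPS = 10e9        # 10 Gbit/s circuit
RTT_S = 0.052               # ~52 ms round-trip time
MTU_BYTES = 9000            # jumbo frames, as in the original post

# Bandwidth-delay product: how much data can be "in flight" on the path.
bdp_bytes = LINE_RATE_BPS / 8 * RTT_S
print(f"BDP: {bdp_bytes / 1e6:.0f} MB (~{bdp_bytes / MTU_BYTES:.0f} jumbo frames)")

# Hypothetical shallow per-port buffer on an intermediate switch.
shallow_buffer_bytes = 4e6  # 4 MB, typical of shallow-buffer silicon

# If a sender overshoots in slow start while a congested hop can't drain,
# everything beyond the buffer is tail-dropped at once.
overflow = bdp_bytes - shallow_buffer_bytes
print(f"Worst-case burst overflow: ~{overflow / MTU_BYTES:.0f} jumbo frames dropped")

A drop of several thousand frames per burst would be in the same ballpark
as the initial loss described below.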

On Thu, Aug 31, 2023 at 11:57 AM David Hubbard <
dhubbard at dino.hostasaurus.com> wrote:

> Hi all, curious whether anyone who has used Cogent as a point-to-point
> provider has gone through packet loss issues with them and was able to
> resolve them successfully.  I’ve got a non-rate-limited 10gig circuit
> between two geographic locations with about 52 ms of latency.  Mine is
> set up to support both jumbo frames and VLAN tagging.  I do know Cogent
> packetizes these circuits, so they’re not like waves, and that expected
> single-session TCP performance may be limited to a few Gbit/sec, but I
> should otherwise be able to fully utilize the circuit given enough
> flows.
>
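
For context on the "few Gbit/sec" single-session expectation: one TCP
flow's throughput is bounded by its window divided by the RTT, so at
~52 ms an untuned window caps out well below 10 Gbit/s. A minimal sketch
in Python, with purely illustrative window sizes:

RTT_S = 0.052               # ~52 ms round-trip time

def throughput_gbps(window_bytes: float) -> float:
    """Steady-state ceiling for a single TCP flow: window / RTT."""
    return window_bytes * 8 / RTT_S / 1e9

examples = [
    ("small default window", 4e6),          # ~4 MB
    ("moderately tuned window", 16e6),      # ~16 MB
    ("window needed for line rate", 65e6),  # ~10 Gbit/s x 52 ms / 8
]
for label, window in examples:
    print(f"{label:<28s} {window / 1e6:>3.0f} MB -> "
          f"{throughput_gbps(window):5.2f} Gbit/s")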
>
>
> The circuit went live earlier this year and had zero issues.  Testing
> with common tools like iperf would allow several Gbit/sec of TCP
> traffic using single flows, even without an optimized TCP stack.  Using
> parallel flows or UDP we could easily get close to wire speed.
> Starting about ten weeks ago we saw a significant slowdown, and at
> times complete failure, of bursty data replication tasks between
> equipment using this circuit.  Rounds of testing show that new flows
> often experience significant initial packet loss of several thousand
> packets, followed by lesser ongoing packet loss every five to ten
> seconds.  There are times we can’t do better than 50 Mbit/sec, and it’s
> rare to achieve even a gigabit unless we run a bunch of streams with a
> lot of tuning.  With UDP we also see the loss, but can still push many
> gigabits through with one sender, or wire speed with several nodes.
>
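
The loss pattern described above lines up with the classic Mathis et al.
estimate, in which sustained random loss caps a single TCP flow at
roughly MSS x 1.22 / (RTT x sqrt(p)). A rough sketch with assumed loss
rates, not measurements from this circuit:

from math import sqrt

RTT_S = 0.052
MSS_BYTES = 1448    # assuming a standard 1500-byte MTU TCP path
C = 1.22            # constant from Mathis et al.

def mathis_mbps(loss_rate: float) -> float:
    """Approximate per-flow throughput ceiling under steady random loss."""
    return MSS_BYTES * 8 * C / (RTT_S * sqrt(loss_rate)) / 1e6

for p in (1e-5, 1e-4, 1e-3):
    print(f"loss rate {p:.5f} -> ~{mathis_mbps(p):5.1f} Mbit/s per flow")

Even a small fraction of a percent of steady loss at this RTT pins
individual flows down in the tens of Mbit/s, consistent with the
50 Mbit/sec floor mentioned above.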
>
>
> For equipment that doesn’t have a tunable TCP stack, such as storage
> arrays or VMware, the retransmits completely ruin performance or result
> in ongoing failures we can’t overcome.
>
>
>
> Cogent support has been about as bad as you can get: everything is
> great, clean your fiber, iperf isn’t a good test, install a physical
> loop (oh wait, we don’t want that, so go pull it back off), new
> updates come at three- to seven-day intervals, etc.  If the
> performance had never been good to begin with I’d have just
> attributed this to their circuits, but since it worked until late
> June, I know something has changed.  I’m hoping someone else has run
> into this and maybe knows of some hints I could give them to
> investigate.  To me it sounds like there’s a rate limiter / policer
> defined somewhere in the circuit, or an overloaded interface or
> device we’re forced to traverse, but they assure me this is not the
> case and claim to have destroyed and rebuilt the logical circuit.
>
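
One way to narrow down the rate-limiter hypothesis: a token-bucket
policer drops excess traffic immediately and adds no queueing delay,
whereas an overloaded interface buffers first (latency rises) and only
then tail-drops, so watching whether latency climbs before loss appears
helps tell the two apart. A conceptual toy model, with an assumed
committed rate that is purely hypothetical:

CIR_MBPS = 1000.0   # hypothetical committed information rate

def policer(offered_mbps: float) -> tuple[float, float]:
    """Return (delivered_mbps, loss_pct) for a simple single-rate policer."""
    delivered = min(offered_mbps, CIR_MBPS)
    loss_pct = 100.0 * (offered_mbps - delivered) / offered_mbps
    return delivered, loss_pct

for offered in (500, 2000, 8000):
    delivered, loss = policer(offered)
    print(f"offered {offered:>4} Mbit/s -> delivered {delivered:6.0f} Mbit/s, "
          f"loss {loss:4.1f}%, latency unchanged")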
>
>
> Thanks!
>

