Large RTT or Why doesn't my ping traffic get discarded?
saku at ytti.fi
Thu Dec 22 06:07:03 UTC 2022
There certainly aren't any temporal buffers in SP gear limiting the
buffer to 100ms, nor are there any mechanisms that decrease TTL or
hop-limit based on time. Some devices may expose a temporal
configuration in the UX, but that is just a multiplier for
max_buffer_bytes, and what is programmed is a fixed number of bytes
instead of a temporal limit as a function of the observed traffic rate.
This is important, because HW may support tens or even hundreds of
thousands of queues in order to support a large number of logical
interfaces with HQoS and multiple queues each. If such a device is
run with a single logical interface which is low speed, either
physically or shaped, you may end up with very long temporal queues.
That is not because people intend to queue for long, but because
understanding all of this requires a lot of context and information
about the platform which isn't readily available, and it isn't solved
by 'just remove those buffers from devices physically, it's
bufferbloat'.
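
To make the bytes-versus-time point concrete, here is a rough sketch of
the arithmetic. The function names and all of the numbers (a 100ms
setting, a 10 Gbps port, a 100 Mbps shaper) are made up for
illustration and are not taken from any particular platform.

```python
def buffer_bytes_for(target_delay_s: float, assumed_rate_bps: float) -> int:
    """Fixed byte limit that a temporal setting would translate to,
    assuming the platform multiplies it by one fixed rate."""
    return round(target_delay_s * assumed_rate_bps / 8)


def queue_delay_seconds(buffer_bytes: int, drain_rate_bps: float) -> float:
    """Time the last packet waits behind a full queue drained at drain_rate_bps."""
    return buffer_bytes * 8 / drain_rate_bps


# Hypothetical: "100 ms" configured, converted to bytes against a 10 Gbps port.
limit = buffer_bytes_for(0.100, 10e9)

# The same byte limit, drained at progressively lower (e.g. shaped) rates.
for rate_bps in (10e9, 1e9, 100e6):
    delay = queue_delay_seconds(limit, rate_bps)
    print(f"{rate_bps / 1e6:>8.0f} Mbps -> {delay:6.2f} s of queue")
```

Under these assumptions the byte limit is 125 MB, which is 100ms at
10 Gbps but 10 seconds once the interface is shaped to 100 Mbps.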
Like others have pointed out, there is not much information to go on,
and this could be many things. One of those could be 'buffer bloat',
like Taht pointed out; this might be true because of the cyclical
nature of the ping results, with a buffer getting filled and drained.
I don't really think ARP/ND is a good candidate, like Herring
suggested, because the problem is cyclical instead of exactly a single
event, but it's not impossible.
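
One way to look at the cyclical pattern is to convert each reply in the
ping output quoted below into an approximate arrival time. This is only
a sketch, and it assumes ping was sending at its default one-second
interval, which the original post does not state.

```python
# (icmp_seq, rtt_ms) pairs copied from the ping output quoted below.
burst_a = [(392, 4834.737), (393, 4301.243), (394, 3300.328), (396, 1289.723)]
burst_b = [(401, 4287.048), (403, 2280.466), (404, 1279.348), (405, 276.669)]

INTERVAL_S = 1.0  # assumption: default ping interval of one second


def arrival_times(samples, interval_s=INTERVAL_S):
    """Approximate reply arrival time: send time (seq * interval) plus RTT."""
    return [(seq, seq * interval_s + rtt_ms / 1000.0) for seq, rtt_ms in samples]


for name, burst in (("A", burst_a), ("B", burst_b)):
    for seq, t in arrival_times(burst):
        print(f"burst {name} seq {seq}: reply at ~{t:.3f} s")
```

If the one-second assumption holds, replies whose RTTs differ by about
a second each land within a few tens of milliseconds of one another,
which is what you would see if packets were held somewhere and then
released together.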
We'd really need to see full mtr output, whether or not this affects
other destinations, and whether it affects just ICMP or also DNS,
ideally with a reverse traceroute as well. I can tell that I'm not
observing the issue, nor did I expect to observe it, as I expect the
problem to be close to your network, and therefore affecting a lot of
destinations.
On Thu, 22 Dec 2022 at 07:35, Jerry Cloe <jerry at jtcloe.net> wrote:
> Because there is no standard for discarding "old" traffic, the only discard is for packets that hop too many times. There is, however, a standard for decrementing TTL by 1 if a packet sits on a device for more than 1000ms, and of course we all know what happens when TTL hits zero. Based on that, your packet could have floated around for another 53 seconds. Having said that, I'm not sure many devices actually do this (but it's not likely it would have had a significant impact on this traffic anyway).
> -----Original message-----
> From: Jason Iannone <jason.iannone at gmail.com>
> Sent: Wed 12-21-2022 11:11 am
> Subject: Large RTT or Why doesn't my ping traffic get discarded?
> To: North American Network Operators' Group <nanog at nanog.org>;
> Here's a question I haven't bothered to ask until now. Can someone please help me understand why I receive a ping reply after almost 5 seconds? As I understand it, buffers in SP gear are generally 100ms. According to my math this round trip should have been discarded around the 1 second mark, even in a long path. Maybe I should buy a lottery ticket. I don't get it. What is happening here?
> 64 bytes from 126.96.36.199: icmp_seq=392 ttl=54 time=4834.737 ms
> 64 bytes from 188.8.131.52: icmp_seq=393 ttl=54 time=4301.243 ms
> 64 bytes from 184.108.40.206: icmp_seq=394 ttl=54 time=3300.328 ms
> 64 bytes from 220.127.116.11: icmp_seq=396 ttl=54 time=1289.723 ms
> Request timeout for icmp_seq 400
> Request timeout for icmp_seq 401
> 64 bytes from 18.104.22.168: icmp_seq=398 ttl=54 time=4915.096 ms
> 64 bytes from 22.214.171.124: icmp_seq=399 ttl=54 time=4310.575 ms
> 64 bytes from 126.96.36.199: icmp_seq=400 ttl=54 time=4196.075 ms
> 64 bytes from 188.8.131.52: icmp_seq=401 ttl=54 time=4287.048 ms
> 64 bytes from 184.108.40.206: icmp_seq=403 ttl=54 time=2280.466 ms
> 64 bytes from 220.127.116.11: icmp_seq=404 ttl=54 time=1279.348 ms
> 64 bytes from 18.104.22.168: icmp_seq=405 ttl=54 time=276.669 ms