Large RTT or Why doesn't my ping traffic get discarded?
bill at herrin.us
Thu Dec 22 07:26:03 UTC 2022
On Wed, Dec 21, 2022 at 11:03 PM Saku Ytti <saku at ytti.fi> wrote:
> On Thu, 22 Dec 2022 at 08:41, William Herrin <bill at herrin.us> wrote:
> > Suppose you have a loose network cable between your Linux server and a
> > switch. Layer 1. That RJ45 just isn't quite solid. It's mostly working
> > but not quite right. What does it look like at layer 2? One thing it
> > can look like is a periodic carrier flash where the NIC thinks it has
> > no carrier, then immediately thinks it has enough of a carrier to
> > negotiate speed and duplex. How does layer 3 respond to that?
> Agreed. But then once the resolve happens, and linux floods the queued
> pings out, the responses would come ~immediately. So the delta between
> the RTT would remain at the send interval, in this case 1s. In this
> case, we see the RTT decreasing as if the buffer is being purged,
> until it seems to be filled again, up-until 5s or so.
Not quite. The ping origination time isn't set when layer 3 decides
the packet can be delivered to layer 2; it's set when layer 7 drops
the packet on the stack. In other words: when the ping app "sends" the
packet, not when the NIC actually puts the packet on the wire or even
when the OS sends the packet over to the NIC. The time the packet
spends queued waiting for ARP to supply a next-hop MAC address counts
against the round trip time.
When you see this pattern of descending ping times exactly one second
apart where the responses all arrived at once, it's usually because
something in the path didn't have the next-hop MAC address for a
while, and then it did. And it's usually not something deep in the
network because something deep would exhaust its transmission queue
long before it could queue several seconds' worth of pings.
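
To make the arithmetic concrete, here is a rough sketch in Python. It is
a toy model, not how any real stack is implemented: the function name,
the 1 ms base RTT and the 4.5 s ARP resolution time are all made up for
illustration, and it assumes the host queues every ping until the
next-hop MAC address shows up.

```python
def queued_ping_rtts(send_times, arp_resolved_at, base_rtt=0.001):
    """Return the RTT each ping would report, in seconds.

    Hypothetical model: the timestamp is taken when the app sends the
    packet, and the packet cannot leave the host until ARP has resolved.
    """
    rtts = []
    for sent in send_times:
        # A ping sent before resolution waits in the queue; one sent
        # afterwards goes out right away.
        on_wire = max(sent, arp_resolved_at)
        reply_at = on_wire + base_rtt
        # The queue wait is counted because the clock started at 'sent'.
        rtts.append(reply_at - sent)
    return rtts


# One ping per second; ARP (by assumption) resolves 4.5 s in.
send_times = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
for seq, rtt in enumerate(queued_ping_rtts(send_times, arp_resolved_at=4.5)):
    print(f"icmp_seq={seq} time={rtt * 1000:.0f} ms")
```

The pings sent before the 4.5 s mark all get their replies at the same
moment, so their reported times step down by exactly the 1 s send
interval, and the ones sent afterwards drop back to the base RTT.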
If you want to prove this to yourself, set up a Linux box, install a
filter to drop ARP replies (arptables or nftables), delete the ARP
entry for your default router (arp -d) and then start pinging
something. When you -remove- the arp filter, you'll see the pattern in
the ping responses that Jason posted.
You may get different results in other OSes. For example, Windows will
lose its DHCP address with the carrier flash, so when ping tries to
send the packet the network is unreachable. Because the stack
considers the network unreachable, the ping packet isn't queued and
the error is reported immediately to the application.
For hire. https://bill.herrin.us/resume/