IP addresses on subnet edge (/24)
joe.klein at mischoice.com
Tue Sep 15 14:52:40 UTC 2020
You could have them try the AWS EC2 reachability site to confirm whether this is the case.
Many of their test nodes end with .255 or .0. There are a few ending with 255.255 and several that end with 0.0.
I’m not sure what the website test actually does (ICMP, TCP, or something else), but you can also connect to those IPs (at least the two that I just tested) over port 80 to test the full handshake. You mentioned ClientHello/ServerHello; these nodes don't respond over port 443 (I only saw the SYN). Kinda makes sense given they're IP addresses.
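If it helps, here is a minimal sketch of that port 80 check in Python. It only reports whether the TCP three-way handshake completes from wherever it is run; the function name, default port, and timeout are my own choices, not anything the reachability site provides.

```python
import socket


def tcp_handshake_ok(host, port=80, timeout=3.0):
    """Return True if a TCP three-way handshake with host:port completes."""
    try:
        # create_connection() returns only once connect() has succeeded,
        # i.e. our SYN was answered with a SYN-ACK that actually reached us.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts (SYN or SYN-ACK lost in transit), resets,
        # and unreachable errors.
        return False
```

For example, tcp_handshake_ok("192.0.2.255") would test port 80 on that address; 192.0.2.255 is a documentation placeholder here, so substitute one of the test nodes ending in .255 or .0. A False result from the affected office and a True result from elsewhere would point at something in the path.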
From: NANOG <nanog-bounces+joe.klein=mischoice.com at nanog.org> On Behalf Of Andrey Khomyakov
Sent: Monday, September 14, 2020 16:26
To: Nanog <nanog at nanog.org>
Subject: IP addresses on subnet edge (/24)
TL;DR I suspect there are middle boxes that don't like IPs ending in .255. Anyone seen that?
We are troubleshooting a strange issue where some of our customers cannot establish a successful connection with our HTTP front end. In addition to checking the usual things like routing, interface errors, and security policy configurations, and opening support tickets with the load balancer vendor, all so far to no avail, we did packet captures.
Based on the packet captures, we receive a SYN and reply with a SYN-ACK, but the client never actually receives that SYN-ACK. In a different instance the 3-way handshake completes, followed by a TLS Client Hello to us; we reply with a TLS Server Hello, and that Server Hello never makes it to the client.
And again, this is only affecting a small subset of customers, which suggests it's not the load balancer or the edge routing configuration (in fact we can traceroute fine to the customer's IP).
So far the only remaining theory is that there are middle boxes out there that do not like IPs ending in .255. The service that the clients can't get to is hosted on two IPs ending in .255.
Let's just say they are x.x.121.255 and x.x.125.255. We even stood up a basic "hello world" web server on x.x.124.255 with the same result. Standing up the very same basic webserver on x.x.124.250 allows the client to succeed.
So far we have a friendly customer who has been working with us on troubleshooting the issue and we have some pcaps from the client's side somewhat confirming that it's not the customer's system either.
This friendly customer is in a small five-person office with Spectrum business internet (that's the SYN-ACK case). The same customer tried hopping on his LTE hotspot, which came up as Cellco Partnership DBA Verizon Wireless, with the same result (that's the TLS Server Hello case). That same customer, with the same workstation, drives a town over and can get to the application fine (we are still waiting for the customer to let us know what that source IP is when it does work).
Before you suggest that those .255 addresses are broadcasts on some VLAN, they are not. They are injected as /32s using a routing protocol, while the VLAN addressing is all RFC1918 addressing.
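To illustrate the point about broadcasts, here is a small sketch using Python's standard ipaddress module. It shows that whether an address ending in .255 is a directed broadcast depends on the prefix it lives in, not on the last octet. The 192.0.2.x addresses are documentation placeholders standing in for the real ones, and both function names are made up for this example.

```python
import ipaddress


def last_octet_on_edge(addr):
    """True if an IPv4 address ends in .0 or .255 (what a naive filter might match)."""
    return int(ipaddress.IPv4Address(addr)) & 0xFF in (0, 255)


def is_directed_broadcast(addr, network):
    """True only if addr is the broadcast address of the given IPv4 network."""
    ip = ipaddress.IPv4Address(addr)
    net = ipaddress.IPv4Network(network, strict=True)
    if ip not in net:
        return False
    # /31 (RFC 3021) and /32 prefixes have no broadcast address.
    if net.prefixlen >= 31:
        return False
    return ip == net.broadcast_address


# Placeholder address ending in .255, checked against three different prefixes.
for prefix in ("192.0.2.0/24", "192.0.2.0/23", "192.0.2.255/32"):
    print(prefix, last_octet_on_edge("192.0.2.255"),
          is_directed_broadcast("192.0.2.255", prefix))
```

The address is a broadcast only in the /24 case. As a /32 host route, or in the middle of a /23, it is an ordinary unicast address, so a device that drops it purely because it ends in .255 is making an assumption about the prefix length that it cannot know.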