IPv6 oddness in Comcast land...

Mon Mar 20 15:16:53 UTC 2017

(I first sent this directly to Valdis instead of the list, so my apologies
to Valdis for getting this twice)

Greetings,

I'm afraid I can't hand you the ultimate solution, but I can point you in a
direction.

     Sounds like you probably have an IPv6 neighbor discovery problem.
Most likely (since that's where the change occurred) it's between your WRT
and the Comcast CPE (I assume a cable modem) or the first active piece of
the upstream cable plant.  Either way, it'll be the first Comcast device
actually speaking IPv6 to your WRT.

     I've seen this happen several times on new (or changed) peering links
with other providers (where dissimilar equipment or new ACLs are
involved).  Typically what's happening is that an ACL or firewall rule on
one device isn't allowing that device's interface to speak fully over the
new link, and that's preventing IPv6 neighbor discovery from completing
properly between the two adjacent devices.  (In this case those devices are
likely your WRT and the first upstream Comcast device speaking IPv6.)

     Since it's your device that changed, you likely won't have a lot of
luck convincing Comcast to dig too deep into this issue, especially since
their device "worked" before and these providers have few engineers
on staff who really understand v6.  It's not that there's no one at
Comcast who can fix it; it'll just take you a while to find them.

     So without knowing your equipment, I can only offer a few general
tips.  Look for troubleshooting commands that will show you the IPv6
neighbor discovery status on your device interfaces.  See what the status
is before a traceroute (when things are broken) and after a traceroute
(when things are fixed).  If it appears I'm right, go to that interface and
create ACLs or firewall rules to allow the actual IPv6 address(es) on that
interface to speak (outward) to their local subnet.
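
     As a rough sketch of that before/after comparison: the Python below
assumes the router is Linux-based with the iproute2 "ip" tool, so that
"ip -6 neigh show" prints the neighbor table.  The function names, the
interface name "eth0", and the two sample lines are my own illustrations,
not output from your network.  If there's no Python on the router, paste
the command output captured over ssh into the two sample strings instead.

```python
import subprocess

# Neighbor cache states that iproute2 prints as the last field of each entry.
ND_STATES = {"REACHABLE", "STALE", "DELAY", "PROBE",
             "FAILED", "INCOMPLETE", "PERMANENT", "NOARP"}


def parse_neigh(text):
    """Map (address, device) -> state from `ip -6 neigh show` output."""
    table = {}
    for line in text.splitlines():
        fields = line.split()
        # Expected shape: <addr> dev <ifname> [lladdr <mac>] [router] <STATE>
        if len(fields) < 4 or fields[1] != "dev":
            continue
        state = fields[-1]
        if state in ND_STATES:
            table[(fields[0], fields[2])] = state
    return table


def diff_neigh(before, after):
    """Return entries whose state changed, appeared, or vanished."""
    changes = {}
    for key in sorted(set(before) | set(after)):
        old = before.get(key, "absent")
        new = after.get(key, "absent")
        if old != new:
            changes[key] = (old, new)
    return changes


def snapshot():
    """Capture the live IPv6 neighbor table (only works on the router)."""
    out = subprocess.run(["ip", "-6", "neigh", "show"],
                         capture_output=True, text=True, check=True)
    return parse_neigh(out.stdout)


# Hypothetical samples: the upstream neighbor while broken, then after a
# traceroute.  Replace these with real captures.
SAMPLE_BEFORE = "fe80::201:5cff:fe00:1 dev eth0  FAILED\n"
SAMPLE_AFTER = ("fe80::201:5cff:fe00:1 dev eth0 lladdr 00:01:5c:00:00:01 "
                "router REACHABLE\n")

for (addr, dev), (old, new) in diff_neigh(parse_neigh(SAMPLE_BEFORE),
                                          parse_neigh(SAMPLE_AFTER)).items():
    print(f"{addr} on {dev}: {old} -> {new}")
```

     If the upstream neighbor flips from FAILED or INCOMPLETE to REACHABLE
only after the traceroute, that would be consistent with a neighbor
discovery problem on that link.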

     Be sure to remember you may need to create a rule for the global
(permanent, public) address, and also for the link-local address.  Some
vendors will put the link-local address in the ND solicitation and others
will use the global unicast (if it's already been assigned).  The RFC
(4861) suggests the link-local, but also says that the source and
destination addresses in the messages need be only "An address assigned to
the interface from which the advertisement is sent."
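
     To make the "one rule per address" point concrete, here is a small
Python sketch that sorts an interface's addresses into link-local and
global and lists the neighbor discovery message types (ICMPv6 135 and 136,
per RFC 4861) each one may need to send.  The addresses are placeholders
(the global one is borrowed from the traceroute below purely as an
example), the helper names are mine, and the printed lines are generic
descriptions rather than any vendor's ACL syntax.

```python
import ipaddress

# ICMPv6 message types used for neighbor resolution (RFC 4861).
ND_TYPES = {
    135: "Neighbor Solicitation",
    136: "Neighbor Advertisement",
}


def bare_address(addr):
    """Strip any zone id ("%eth0") or prefix length ("/64")."""
    return addr.split("%")[0].split("/")[0]


def scope(addr):
    """Classify an IPv6 address string as link-local, global, or other."""
    ip = ipaddress.IPv6Address(bare_address(addr))
    if ip.is_link_local:
        return "link-local"
    if ip.is_global:
        return "global"
    return "other"


def rules_needed(addresses):
    """List (source address, scope, ICMPv6 type) tuples worth permitting."""
    rules = []
    for addr in addresses:
        kind = scope(addr)
        if kind == "other":
            continue  # e.g. unique-local; not covered by this sketch
        for icmp_type in sorted(ND_TYPES):
            rules.append((bare_address(addr), kind, icmp_type))
    return rules


# Placeholder addresses for the upstream-facing interface (assumptions).
WAN_ADDRESSES = ["fe80::1%eth0", "2601:5c0:c001:69e2::1/64"]

for src, kind, icmp_type in rules_needed(WAN_ADDRESSES):
    print(f"permit ICMPv6 type {icmp_type} ({ND_TYPES[icmp_type]}) "
          f"from {src} [{kind}] out to the local link")
```

     Whichever source address your device actually picks for its
solicitations, covering both scopes this way means the rule set doesn't
depend on that vendor choice.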

     If that does help, remember to tighten those new ACLs as much as you
can while still having things work.  If it doesn't, you'll likely have to
engage Comcast about the issue, as it may or may not be this at all.

:-)  good luck

Sincerely,
Casey Russell
Network Engineer
KanREN <http://www.kanren.net>
Phone: 785-856-9809
2029 Becker Drive, Suite 282
Lawrence, Kansas 66047
need support? <support at kanren.net>

On Sun, Mar 19, 2017 at 6:16 PM, <valdis.kletnieks at vt.edu> wrote:

> Trying to figure out what the heck is going on here.  Any good
> explanations cheerfully accepted.
>
> Background:  Home internet router is a Linksys WRT1200AC that had been
> running OpenWRT 15.05.01. IPv6 worked just fine - Comcast handed me a /60
> via DHCP-PD and no issues.  I reflashed it to Lede 17.01, and after doing
> all the reconfig, I'm hitting a really strange IPv6 issue.
>
> Symptoms - IPv6 still configures correctly, but IPv6 packets appear to go
> out and disappear into the ether when they leave the Linksys.  Doing a
> traceroute to any IPv6 destination makes things work again - for a while
> (from 15 minutes to an hour or two).
>
> As seen from my laptop (I have the matching tcpdump from the outbound
> interface on the Linksys):
>
> [~] ping -6 -c 3 listserv.vt.edu
> PING listserv.vt.edu(listserv.ipv6.vt.edu (2001:468:c80:2105:211:43ff:feda:d769)) 56 data bytes
>
> --- listserv.vt.edu ping statistics ---
> 3 packets transmitted, 0 received, 100% packet loss, time 2070ms
>
> [~] traceroute -6 listserv.vt.edu
> traceroute to listserv.vt.edu (2001:468:c80:2105:211:43ff:feda:d769), 30 hops max, 80 byte packets
>  1  2601:5c0:c001:69e2::1 (2601:5c0:c001:69e2::1)  2.417 ms  3.077 ms  5.358 ms
>  2  * * *
>  3  * * *
>  4  * * *
>  5  * * *
>  6  * hu-0-10-0-7-pe04.ashburn.va.ibone.comcast.net (2001:558:0:f5c1::2)  31.478 ms  31.975 ms
>  7  2001:559::d16 (2001:559::d16)  32.406 ms  17.102 ms  24.751 ms
>  8  2001:550:2:2f::a (2001:550:2:2f::a)  23.245 ms  23.519 ms  22.185 ms
>  9  2607:b400:f0:2003::f0 (2607:b400:f0:2003::f0)  29.782 ms  28.604 ms  29.891 ms
> 10  2607:b400:90:ff05::f1 (2607:b400:90:ff05::f1)  30.423 ms *  30.680 ms
> 11  * * *
> 12  listserv.ipv6.vt.edu (2001:468:c80:2105:211:43ff:feda:d769)  34.562 ms  39.072 ms  24.633 ms
> [~] ping -6 -c 3 listserv.vt.edu
> PING listserv.vt.edu(listserv.ipv6.vt.edu (2001:468:c80:2105:211:43ff:feda:d769)) 56 data bytes
> 64 bytes from listserv.ipv6.vt.edu (2001:468:c80:2105:211:43ff:feda:d769): icmp_seq=1 ttl=53 time=33.3 ms
> 64 bytes from listserv.ipv6.vt.edu (2001:468:c80:2105:211:43ff:feda:d769): icmp_seq=2 ttl=53 time=24.3 ms
> 64 bytes from listserv.ipv6.vt.edu (2001:468:c80:2105:211:43ff:feda:d769): icmp_seq=3 ttl=53 time=46.0 ms
>
> --- listserv.vt.edu ping statistics ---
> 3 packets transmitted, 3 received, 0% packet loss, time 2003ms
> rtt min/avg/max/mdev = 24.334/34.595/46.093/8.927 ms
>
> So it looks like something times out somewhere and fails to pass packets
> back.
>
> TCP connections don't keep IPv6 alive. I have a browser window that
> auto-updates every 5 minutes, and a SmokePing process on a Raspberry Pi
> uploads to a server at work every few minutes, and those eventually drop
> back to IPv4 when the IPv6 TCP fails to connect. And normal UDP doesn't
> seem to keep it alive - NTP pointing at IPv6 peers loses connectivity as
> well.
>
> But a traceroute wakes it up. It's almost like some router is losing the
> route to me out of the FIB, and fixes it when it has to handle a packet
> on the CPU slow path (like send back a 'time exceeded').  But I'm mystified
> why this started when I reflashed my router.
>
>