Cogent --> Google Public DNS routing issue
dmiller at tiggee.com
Wed Aug 17 11:01:48 CDT 2011
On 8/17/2011 9:13 AM, Patrick W. Gilmore wrote:
> On Aug 17, 2011, at 1:07 AM, Christopher Morrow wrote:
>> On Wed, Aug 17, 2011 at 12:09 AM, Robert Glover<robertg at garlic.com> wrote:
>>> We have noticed that from our Cogent link (as well as from ALL U.S. based
>>> points we tested via the Cogent Looking Glass:
>>> http://www.cogentco.com/en/network/looking-glass), traceroutes to 184.108.40.206
>>> and 220.127.116.11 all seem to go over to Europe:
>> 18.104.22.168 ain't the droids you are looking for...
> In the traceroute appended to the original post, he did trace to 22.214.171.124.
> While it did go all over, I don't see the problem - it got to the destination host.
> Anycast is OK for some things, but it depends on BGP. BGP has zero concept of latency, loss, or geography. Expecting anycast to guarantee an optimal path or location is a grave error.
There are two basic types of anycast:
1. Simple anycast - announce an anycast prefix to whoever/wherever in
more than one location.
2. Global anycast + careful configuration - announce an anycast prefix
to particular providers at specific geographically disparate locations
and use other options to achieve geographic and/or performant inbound
paths.
Perhaps we need a new term for 2.
Google is clearly attempting to implement 2 and not 1 for their
resolving DNS service. Based on Google's claims of speed (and my
testing of their response times), they have either found a way to exceed
the speed of light with packets or they are managing to keep most of
their traffic "local ish" to the requester.
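As a back-of-the-envelope check on the speed-of-light point, here is a
minimal sketch of the best-case round-trip time over fiber. It assumes
light travels through fiber at roughly 2/3 of c (about 200,000 km/s); the
path lengths are hypothetical round numbers, not measurements of Google's
network.

```python
# Lower bound on RTT from propagation delay alone (no queuing, no processing).
# Assumption: signal speed in fiber is ~200,000 km/s, i.e. 200 km per millisecond.
FIBER_KM_PER_MS = 200.0


def min_rtt_ms(path_km: float) -> float:
    """Best-case round-trip time in ms for a one-way fiber path of path_km."""
    return 2.0 * path_km / FIBER_KM_PER_MS


# Hypothetical one-way path lengths, for illustration only.
examples = [("same metro", 50), ("cross-country", 4000), ("transatlantic", 6000)]
for label, km in examples:
    print(f"{label:>14}: >= {min_rtt_ms(km):.1f} ms")
```

Under these assumptions, a measured response time well below the
transatlantic bound means the answering instance cannot be on the other
side of the ocean.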
To say that anycast "relies on BGP" and that expecting an optimal
path is therefore an error is disingenuous (I want a better word, but this one
will do). The internet as a whole "relies on BGP" and yet we expect
mostly optimal paths. While it is true that BGP has no capacity to
account for latency or loss, IGPs which can take into account these
factors end at the borders of networks (where prefixes are passed using
BGP). This is what makes up the "inter net".
If you were tracing from a host in Ashburn to a unicast host in NYC and
your path passed through San Jose, then you would say that was an
issue. The same would be true with an anycast destination address.
As to geography, IGPs don't have a concept of geography either. A
router in NYC doesn't know or care that the router at the other end of a
link is in CHI. All it knows is the prefixes that it gets from that
router and metrics to choose a best path for them. BGP combined with
"proper" (i.e. distributed) peering of networks does provide performant
paths for traffic. In an anycast configuration the "careful
configuration" is selecting providers to announce anycast prefixes to
and communities that you put on the prefixes to control redistribution.
Global anycast + careful configuration can and does provide mostly
performant paths and a very high level of geographic fidelity - though,
granted, not "guaranteed" (at least not guaranteed at a higher level
than unicast prefixes).
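To illustrate why where you announce matters, here is a toy sketch of how
a network picks among the announcements of an anycast prefix that it
actually hears. It is heavily simplified (shortest AS path only, none of
the other BGP tie-breakers), and the site names and AS numbers
(documentation-range ASNs) are hypothetical.

```python
# Toy model: a network chooses among the anycast announcements it hears
# using AS-path length alone. The "site" label is for the reader only;
# nothing in the selection looks at geography.
def pick_site(routes_heard):
    """routes_heard: list of (site, as_path). Shortest AS path wins."""
    site, _as_path = min(routes_heard, key=lambda route: len(route[1]))
    return site


ANYCAST_ORIGIN = 64511  # hypothetical origin AS
TRANSIT = 64500         # hypothetical transit AS with an EU-only peering

# A US network that only hears the prefix via a European peering:
eu_only = [("Frankfurt", [TRANSIT, ANYCAST_ORIGIN])]
print(pick_site(eu_only))

# Same network once the prefix is also announced to it at a US location:
with_us = eu_only + [("Ashburn", [ANYCAST_ORIGIN])]
print(pick_site(with_us))
```

In this model the first case lands in Frankfurt simply because no other
announcement is visible; adding the US announcement is the "careful
configuration" part.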
You can't "guarantee" performant paths ever (regardless of anycast or
unicast) if any path between the source and destination crosses the
border between two networks because some networks will choose a
"primary" upstream (single homed or heavily pref'ed) that only picks up
a prefix in a particular area and sends all of the traffic there. The
originator of the prefix can depref that provider to try to influence
path selection, but some networks will doggedly prefer to send packets
to that network despite the efforts of the originator. The only thing
to do then is to ask why this network selected that particular upstream
and then to explain to them why that might not have been the best
choice, if they want performant paths...
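A minimal sketch of why deprefing from the originator's side can fail:
in BGP best-path selection, local preference is compared before AS-path
length, so prepending cannot override a remote network's local-pref. The
upstream names, local-pref values, and AS numbers below are hypothetical,
and the decision process is reduced to just these two steps.

```python
# Simplified best-path choice: highest local-pref first, then shortest AS path.
def best(routes):
    return max(routes, key=lambda r: (r["local_pref"], -len(r["as_path"])))


ORIGIN = 64511  # hypothetical originating AS

# The originator prepends its own AS three extra times toward the "primary"
# upstream, hoping remote networks will prefer the other path.
via_primary = {
    "upstream": "primary",
    "local_pref": 200,  # remote network heavily prefers this upstream
    "as_path": [64500, ORIGIN, ORIGIN, ORIGIN, ORIGIN],
}
via_backup = {
    "upstream": "backup",
    "local_pref": 100,
    "as_path": [64501, ORIGIN],
}

# Local-pref wins despite the much longer AS path.
print(best([via_primary, via_backup])["upstream"])

# With equal local-pref, the prepending would have had the intended effect.
neutral_primary = dict(via_primary, local_pref=100)
print(best([neutral_primary, via_backup])["upstream"])
```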
> The possible reasons for this are nearly innumerable. Perhaps Cogent <-> Google is congested in the US so one or the other prefers EU? Perhaps there is some IGP metric messed up inside Cogent that prefers the EU? Perhaps more nefarious problems, such as Google de-peering Cogent in the US? Etc., etc.
> You may be able to find out if you look, and you may not (I didn't even try). But even if you do figure out the answer, you can't fix it. Only Cogent and/or Google can.
My traces show all the Cogent locations in the US that I traced from
going to Telia in EU and then to Google.
My traces from Telia locations in the US all (properly) reach Google
destinations in the US.
So, Cogent is only receiving/using/preferring these two prefixes from
their peering(s) with Telia in EU.
As to the root cause of that... only the players in that game can say.
> Moreover, you can see things like this with anycast even when there is no problem!
The OP believes that it is a problem. You *can* see this with anycast,
but I would say that this *is* a "problem" (for my definition of
"problem" which admittedly may be different from others). There are
many potential solutions to the problem; the most obvious is for the OP
to stop preferring to send traffic to these prefixes over Cogent.
To the OP: I have to wonder what factors were used to decide "primary"
vs "backup" provider. If "price", then you should expect issues with
less performant routing. If "quality", then what measures were used to
determine a "quality" ranking? I am also curious as to who the "backup"
is (but that is just morbid curiosity).