DNS pulling BGP routes?
Christopher Morrow
morrowc.lists at gmail.com
Mon Oct 11 15:04:38 UTC 2021
On Sat, Oct 9, 2021 at 11:16 AM Masataka Ohta <mohta at necom830.hpcl.titech.ac.jp> wrote:
> Bill Woodcock wrote:
>
> >> It may be that facebook uses all the four name server IP addresses
> >> in each edge node. But, it effectively kills essential redundancy
> >> of DNS to have two or more name servers (at separate locations)
> >> and the natural consequence is, as you can see, mass disaster.
> >
> > Yep. I think we even had a NANOG talk on exactly that specific topic a
> long time ago.
> >
> >
> https://www.pch.net/resources/Papers/dns-service-architecture/dns-service-architecture-v10.pdf
>
> Yes, having separate sets of anycast addresses by two or more pops
> should be fine.
>
>
To be fair, it looks like FB has 4 /32's (and 4 /128's) for their DNS
authoritatives.
All from different /24's or /48's, so they should have decent routing
diversity.
They could choose to announce half/half from alternate pops, or other games
such as this.
I don't know that that would have solved any of the problems last week, or
any problems in the future.
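As a rough illustration of that prefix diversity (the four IPv4 addresses below are what a/b/c/d.ns.facebook.com were publishing around that time; treat the exact values as an assumption and verify against current DNS), each /32 sits in a distinct covering /24, so each can be announced and routed independently:

```python
import ipaddress

# The four IPv4 authoritatives for facebook.com (assumed values for
# a/b/c/d.ns.facebook.com -- verify against live DNS before relying on them).
nameservers = [
    "129.134.30.12",
    "129.134.31.12",
    "185.89.218.12",
    "185.89.219.12",
]

# Compute the covering /24 for each /32; four distinct /24s means four
# independently announceable prefixes, i.e. decent routing diversity.
covering = {
    ipaddress.ip_network(f"{ip}/24", strict=False) for ip in nameservers
}
assert len(covering) == len(nameservers)
print(sorted(str(n) for n in covering))
```

The half/half-from-alternate-pops idea mentioned above would just mean originating two of those /24s from one set of pops and two from another.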
I think Bill's slide 30 is pretty much what FB has/had deployed:
1) I would think the a/b cloud is really 'as similar a set of paths from
like deployments as possible'
2) redundant pairs of servers in the same transit/network
3) hidden masters (almost certainly these are in the depths of the FB
datacenter network)
(though also this part isn't important for the conversation)
4) control/sync traffic on a different topology than the customer serving
one
> However, if CDN provider has their own transit backbone, which is,
> seemingly, not assumed by your slides, and retail ISPs are tightly
>
I think it is, actually, in slide 30?
"We need a network topology to carry control and synchronization traffic
between the nodes"
> connected to only one pop of the CDN provider, the CDN provider
>
it's also not clear that FB is connecting their CDN to single points in any
provider...
I'd guess there are some cases of that, but for larger networks I would
imagine there are multiple CDN
deployments per network. I can't imagine that it's safe to deploy 1 CDN
node for all of 7018 or 3320...
for instance.
> may be motivated to let users access only one pop killing essential
> redundancy of DNS, which should be overengineering, which is my
> concern of the paragraph quoted by you.
>
>
it seems that the problem FB ran into was really that there wasn't either:
  "a secondary path to communicate 'you are the last one standing, do not
die' (to an edge node)"
or:
  "maintain a very long/less-preferred path to a core location(s) to
maintain service in case the CDN disappears"
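A minimal sketch of that "last one standing" safeguard (hypothetical policy logic, not FB's actual automation): an edge node that fails its back-end health check normally withdraws its anycast routes, but keeps announcing if no other healthy node is left, on the theory that stale DNS answers beat no DNS at all:

```python
def should_withdraw(backend_healthy: bool, healthy_peers: int) -> bool:
    """Decide whether an edge node should withdraw its anycast DNS routes.

    backend_healthy: can this node still reach the hidden masters/backbone?
    healthy_peers:   count of other edge nodes known to still be announcing.

    Hypothetical policy: withdraw on back-end failure, *unless* this node
    is the last one standing -- then keep serving (possibly stale) answers.
    """
    if backend_healthy:
        return False  # all good, keep announcing
    return healthy_peers > 0  # withdraw only if someone else remains up

# Normal partial failure: other nodes are up, so this node steps aside.
assert should_withdraw(backend_healthy=False, healthy_peers=3) is True
# Global failure: "you are the last one standing, do not die."
assert should_withdraw(backend_healthy=False, healthy_peers=0) is False
```

The hard part in practice is the second input: knowing how many peers are still healthy requires exactly the secondary communication path the paragraph above says was missing.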
There are almost certainly more complexities which FB is not discussing in
their design/deployment which
affected their services last week, but it doesn't look like they were very
far off on their deployment, if they
need to maintain back-end connectivity to serve customers from the CDN
locales.
-chris