DNS pulling BGP routes?

J. Hellenthal jhellenthal at dataix.net
Wed Oct 6 17:52:08 UTC 2021


They most likely sent an update to the DNS servers for TLV DNSSEC and in oversight forgot they needed to null something's out of the workbook to not touch the BGP instances.

I'd hardly believe that would be triggered by the dns server itself.

-- 
 J. Hellenthal

The fact that there's a highway to Hell but only a stairway to Heaven says a lot about anticipated traffic volume.

> On Oct 6, 2021, at 12:45, Michael Thomas <mike at mtcc.com> wrote:
> 
> So if I understand their post correctly, their DNS servers have the ability to withdraw routes if they determine are sub-optimal (fsvo). I can certainly understand for the DNS servers to not give answers they think are unreachable but there is always the problem that they may be partitioned and not the routes themselves. At a minimum, I would think they'd need some consensus protocol that says that it's broken across multiple servers.
> 
> But I just don't understand why this is a good idea at all. Network topology is not DNS's bailiwick so using it as a trigger to withdraw routes seems really strange and fraught with unintended consequences. Why is it a good idea to withdraw the route if it doesn't seem reachable from the DNS server? Give answers that are reachable, sure, but to actually make a topology decision? Yikes. And what happens to the cached answers that still point to the supposedly dead route? They're going to fail until the TTL expires anyway so why is it preferable withdraw the route too?
> 
> My guess is that their post while more clear that most doesn't go into enough detail, but is it me or does it seem like this is a really weird thing to do?
> 
> Mike
> 
> 
>> On 10/5/21 11:56 PM, Bjørn Mork wrote:
>> Masataka Ohta <mohta at necom830.hpcl.titech.ac.jp> writes:
>> 
>>> As long as name servers with expired zone data won't serve
>>> request from outside of facebook, whether BGP routes to the
>>> name servers are announced or not is unimportant.
>> I am not convinced this is true.  You'd normally serve some semi-static
>> content, especially wrt stuff you need yourself to manage your network.
>> Removing all DNS servers at the same time is never a good idea, even in
>> the situation where you believe they are all failing.
>> 
>> The problem is of course that you can't let the servers take the
>> decision to withdraw from anycast if you want to prevent this
>> catastrophe.  The servers have no knowledge of the rest of the network.
>> They only know that they've lost contact with it.  So they all make the
>> same stupid decision.
>> 
>> But if the servers can't withdraw, then they will serve stale content if
>> the data center loses backbone access. And with a large enough network
>> then that is probably something which happens on a regular basis.
>> 
>> This is a very hard problem to solve.
>> 
>> Thanks a lot to facebook for making the detailed explanation available
>> to the public.  I'm crossing my fingers hoping they follow up with
>> details about the solutions they come up with.  The problem affects any
>> critical anycast DNS service. And it doesn't have to be as big as
>> facebook to be locally critical to an enterprise, ISP or whatever.
>> 
>> 
>> 
>> Bjørn


More information about the NANOG mailing list