F.ROOT-SERVERS.NET moved to Beijing?

Danny McPherson danny at tcb.net
Mon Oct 3 16:38:25 UTC 2011


On Oct 3, 2011, at 11:20 AM, Leo Bicknell wrote:
> 
> Thus the impact to valid names should be minimal, even in the face
> of longer timeouts.

If you're performing validation on a recursive name server (or a 
similar resolution process) and you expect a signed response, yet the 
response you receive is either unsigned or fails to validate 
(i.e., bogus), you have to decide:

1) ask other authorities?  how many?  how frequently?  impact?
2) consider implications on the _entire_ chain of trust?
3) tell the client something?  
4) cache what (e.g., the zone cut from whom you asked)? how long? 
5) other?

"minimal" is not what I was thinking..

> Network layer integrity and secure routing don't help the majority of
> end users.  At my house I can choose Comcast or AT&T service.  They will
> not run BGP with me, so I could not apply RPKI, secure BGP, or any other
> method to those connections.  They may well do NXDOMAIN remapping on their
> resolvers, or even try to transparently rewrite DNS answers.  Indeed,
> some ISPs have even experimented with injecting data into port 80
> traffic transparently!
> 
> Secure networks only help if the users have a choice, and choose not to
> use "bad" networks.  If you want to be able to connect at Starbucks, or
> the airport, or even the conference room WiFi at a client's site, you need
> to assume it's a rogue network in the middle.
> 
> The only way for a user to know what they are getting is end to end
> crypto.  Period.

I'm not sure how "end to end" crypto helps end users in the advent
of connectivity and *availability* issues resulting from routing 
brokenness in an upstream network which they do not control. 
"crypto", OTOH, depending on what it is and where in the stack it's 
applied, might well align with my "network layer integrity" 
assertion.

> As for the speed of detection, it's either instantaneous (DNSSEC
> validation fails), or it doesn't matter how long it is (minutes,
> hours, days).  The real problem is the time to resolve.  It doesn't
> matter if we can detect in seconds or minutes when it may take hours
> to get the right people on the phone and resolve it.  Consider this
> weekend's activity; it happened on a weekend for both an operator
> based in the US and a provider based in China, so you're dealing
> with weekend staff and a 12-hour time difference.
> 
> If you want to ensure accuracy of data, you need DNSSEC, period.
> If you want to ensure low-latency access to the root, you need
> multiple anycasted instances because at any one point in time a
> particular one may be "bad" (node near you down for maintenance,
> routing issue, who knows), which is part of why there are 13 root
> servers.  Those two things together can make for resilience,
> security, and high performance.

You miss the point here, Leo.  If the operator of a network service 
can't detect issues, whether unintentional or malicious, *when they 
occur* in the current system in some automated manner, they won't be 
alerted, they certainly can't "fix" the problem, and the potential 
exposure window can be significant.

Ideally, the trigger for the alert and detection function is more 
mechanized than "notification by the service consumer", and the network 
service operators or other network operators aware of the issue have 
some ability to institute reactive controls to deal surgically with 
that particular issue, rather than being captive to the [s]lowest 
common denominator of all involved parties and dealing with 
additional non-deterministic failures or exposure in the interim.
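
As a strawman for "mechanized", here's a rough sketch (again assuming 
dnspython; the polling interval and the print-based alert hook are 
placeholders) that polls F-root's CHAOS TXT "hostname.bind" identity 
and flags when the answering anycast node changes:

import time

import dns.exception
import dns.message
import dns.query
import dns.rdataclass
import dns.rdatatype

F_ROOT = "192.5.5.241"     # f.root-servers.net
POLL_SECONDS = 300         # placeholder polling interval

def node_identity():
    # "hostname.bind"/CH TXT returns the identity of whichever anycast
    # instance answered ("id.server" is the server-agnostic equivalent).
    q = dns.message.make_query("hostname.bind",
                               dns.rdatatype.TXT,
                               dns.rdataclass.CH)
    try:
        resp = dns.query.udp(q, F_ROOT, timeout=3)
    except dns.exception.Timeout:
        return None
    for rrset in resp.answer:
        return rrset[0].to_text().strip('"')
    return None

def monitor():
    last = node_identity()
    while True:
        time.sleep(POLL_SECONDS)
        current = node_identity()
        if current != last:
            # Hand off to whatever alerting / reactive-control machinery
            # exists; a real monitor would also watch RTT, traceroute,
            # and routing data rather than this single signal.
            print("F-root node changed: %s -> %s" % (last, current))
            last = current

monitor()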

Back to my earlier point: for *resilience*, network layer integrity 
techniques and secure routing infrastructure are the only preventative 
controls here, and they are necessary to augment DNSSEC's authentication 
and integrity functions at the application layer.  Absent these, rapid 
detection that enables reactive controls to mitigate the issue is 
necessary.

-danny



