Who broke .org?

Joe Maimon jmaimon at ttec.com
Fri Jul 2 03:12:31 UTC 2004




Richard A Steenbergen wrote:

>I guess I'll ask first...

There was a gentleman a while back who posited that having only two 
anycast NS records was broken by design. He suggested that while serving 
the whole TLD from two NS records that were really a little army of 
anycast clusters spread all around was very 'l33t', it would not hurt 
overly much if, say, 2 or up to 11 of these clusters were also identified 
and reachable by good old-fashioned unicast in the NS records for the zone.

Seems to work for "."

Something about "eggs all in one basket". The basket being the anycast 
topology. Even should the topology be bulletproof overall, his point was 
that even a partial failure, if it failed "closed", could leave his 
resolver stuck on non-responsive servers while perfectly good ones were 
still out there.

Come to think of it, there was a thread here a while back about this 
very thing: root server robustness and all that.

How many reported .org hiccups does this make, and over what timeframe?

Is it just this anycast deployment? Has f-root anycast ever reported any 
stray problems that would have caused an outage for somebody somewhere, 
had they been relying on f and only f?

Does nobody else think this?

What about algorithms in recursive resolvers? How about first trying an 
arbitrary "close" or "optimal" NS; if there is no response, try TWO of 
the next best remaining candidates, then FOUR of the remaining... until 
all are gone, and then restart the loop at 1 (to be nice) until the 
request times out or one answers and becomes "optimal", with periodic 
probing or looping across the list for freshness. After all the 
irrelevant junk the roots and near roots get, some enthusiastic legit 
retries may not even be unwelcome.
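
To make the idea concrete, here is a rough Python sketch of that escalating 
retry loop. It is only an illustration under assumptions, not how any real 
resolver behaves: the names escalating_batches, resolve, query and 
deadline_s are made up for this example, the server list is assumed to be 
already sorted best-first, and the caller supplies query(batch), which 
is assumed to ask every server in the batch and return the name of the 
first one that answers, or None. The periodic probing for freshness is 
left out.

```python
import time
from typing import Callable, Iterator, List, Optional, Sequence


def escalating_batches(servers: Sequence[str]) -> Iterator[List[str]]:
    """Yield batches of 1, 2, 4, ... servers until the list is used up."""
    size, start = 1, 0
    while start < len(servers):
        yield list(servers[start:start + size])
        start += size
        size *= 2  # double the fan-out after every silent batch


def resolve(
    servers: Sequence[str],
    query: Callable[[List[str]], Optional[str]],
    deadline_s: float = 5.0,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[str]:
    """Return the first server that answers, or None on timeout.

    `query` is a hypothetical callback supplied by the caller; it is
    expected to apply its own per-batch timeout.
    """
    servers = list(servers)
    if not servers:
        return None
    give_up = clock() + deadline_s
    while clock() < give_up:
        # Each pass starts over at a batch of one ("to be nice").
        for batch in escalating_batches(servers):
            if clock() >= give_up:
                return None
            winner = query(batch)
            if winner is not None:
                return winner  # caller can promote this one to "optimal"
    return None
```

With eight candidates the batches come out as 1, 2, 4 and then the one 
left over, so a resolver stuck behind a dead "best" server reaches every 
alternative within four rounds instead of eight.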

I know I shouldn't hit send but....please be gentle.



