Who broke .org?
Joe Maimon
jmaimon at ttec.com
Fri Jul 2 03:12:31 UTC 2004
Richard A Steenbergen wrote:
>I guess I'll ask first...
>
>
>
A while back a gentleman posited that having only two anycast NS
records was broken by design. He suggested that while serving the whole
TLD from two NS records that are really a little army of anycast
clusters scattered all around is very 'l33t', it would not hurt overly
much if some of these clusters, say 2 or up to 11, were also identified
and reachable by good old-fashioned unicast in the NS records for the
zone. Seems to work for ".".
Something about "eggs all in one basket", the basket being the anycast
topology. Even if the topology is bulletproof overall, his point was
that even a partial failure, if it failed "closed", could leave his
resolver stuck on non-responsive servers while perfectly good ones were
still out there.
Come to think of it, there was a thread here a while back about this
very thing: root server robustness and all that.
How many reported .org hiccups does this make, and over what timeframe?
Is it just this anycast deployment? Has f-root anycast ever had any
reported stray problems that caused an outage for somebody somewhere,
had they been relying on f and only f?
Does nobody else think this?
What about the algorithms in recursive resolvers? How about first trying
an arbitrary "close" or "optimal" NS; if there is no response, try TWO
of the next best remaining candidates, then FOUR of the remaining...
until all are gone, then restart the loop at 1 (to be nice) until the
request times out or one answers and becomes "optimal", with periodic
probing or looping across the list for freshness. After all the
irrelevant junk the roots and near-roots get, some enthusiastic legit
retries may not even be unwelcome.
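To make the retry idea concrete, here is a rough Python sketch of the
1, 2, 4, ... fan-out. It is only an illustration under assumptions, not
how any real resolver behaves: the names escalating_batches, resolve and
max_rounds are made up, query() is a hypothetical stand-in for sending
one DNS query and returning None on timeout, max_rounds stands in for
"until request is timed out", and each batch is tried sequentially where
a real resolver would presumably query it in parallel.

```python
from typing import Callable, Iterator, List, Optional, Sequence, Tuple


def escalating_batches(servers: Sequence[str]) -> Iterator[List[str]]:
    """Yield batches of size 1, 2, 4, ... from a best-first server list."""
    size = 1
    start = 0
    while start < len(servers):
        # The last batch may be shorter than `size` if the list runs out.
        yield list(servers[start:start + size])
        start += size
        size *= 2


def resolve(
    servers: Sequence[str],
    query: Callable[[str], Optional[str]],
    max_rounds: int = 3,
) -> Optional[Tuple[str, str]]:
    """Try servers in escalating batches; restart at batch size 1 each round.

    `query` is a hypothetical probe: it returns an answer string, or None
    when the server does not respond. `max_rounds` is an assumed stand-in
    for the overall request timeout.
    """
    for _ in range(max_rounds):
        for batch in escalating_batches(servers):
            # Sequential here for simplicity; parallel in a real resolver.
            for server in batch:
                answer = query(server)
                if answer is not None:
                    # The caller could promote this server to "optimal".
                    return server, answer
    return None
```

With eight candidate servers the batches come out as 1, 2, 4 and then
the 1 left over, so a dead "optimal" server costs one timeout before the
resolver starts widening its net.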
I know I shouldn't hit send but... please be gentle.