What DNS Is Not

Mon Nov 9 00:59:24 UTC 2009

On Nov 8, 2009, at 7:46 PM, bmanning at vacation.karoshi.com wrote:
>>
>> "The paper also presents the results of trace-driven simulations that
>> explore the effect of varying TTLs and varying degrees of cache
>> sharing on DNS cache hit rates. "
>
> 	I'm not debating the traces - I wonder about the simulation
> 	model.  (and yes, I've read the paper)

I'm happy to chat about this offline if it bores people, but I'm  
curious what you're wondering about.

The method was pretty simple:

  - Record the TCP SYN/FIN packets and the DNS packets
  - For every SYN, figure out what name the computer had resolved to  
open a connection to this IP address
  - From the TTL of the DNS response, figure out whether finding that
binding would have required a new DNS lookup
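
To make those three steps concrete, here is a rough sketch in Python. It is not the simulator from the paper, just an illustration under some assumptions: the trace has already been parsed into two hypothetical record types (DnsAnswer and Syn), a SYN is matched to the most recent name the same client resolved to that IP, and a simulated cache entry lives for exactly the TTL. The ttl_override and shared_cache knobs are made-up parameters that stand in for "varying TTLs" and "varying degrees of cache sharing".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DnsAnswer:
    """Hypothetical parsed DNS response: client learned name -> ip with a TTL."""
    time: float
    client: str
    name: str
    ip: str
    ttl: float


@dataclass(frozen=True)
class Syn:
    """Hypothetical parsed TCP SYN: client opened a connection to dst_ip."""
    time: float
    client: str
    dst_ip: str


def simulate(events, ttl_override=None, shared_cache=False):
    """Replay a trace and count simulated DNS cache hits and misses.

    Returns (hits, misses, unmatched), where unmatched counts SYNs for
    which no earlier DNS answer from the same client maps to that IP.
    """
    last_binding = {}  # (client, ip) -> (name, ttl) most recently observed
    cache = {}         # (cache_id, name) -> expiry time in the simulated cache
    hits = misses = unmatched = 0

    for ev in sorted(events, key=lambda e: e.time):
        if isinstance(ev, DnsAnswer):
            # Step 2 bookkeeping: remember which name led to this IP.
            last_binding[(ev.client, ev.ip)] = (ev.name, ev.ttl)
            continue

        binding = last_binding.get((ev.client, ev.dst_ip))
        if binding is None:
            unmatched += 1
            continue

        name, ttl = binding
        if ttl_override is not None:
            ttl = ttl_override  # assumption: explore a different TTL

        # One cache for everyone, or one cache per client.
        cache_id = "shared" if shared_cache else ev.client
        key = (cache_id, name)

        # Step 3: would the binding still be cached at connection time?
        if cache.get(key, float("-inf")) > ev.time:
            hits += 1
        else:
            misses += 1
            cache[key] = ev.time + ttl

    return hits, misses, unmatched


if __name__ == "__main__":
    trace = [
        DnsAnswer(0.0, "a", "www.example.com", "192.0.2.1", 60.0),
        Syn(1.0, "a", "192.0.2.1"),
        Syn(30.0, "a", "192.0.2.1"),
        Syn(100.0, "a", "192.0.2.1"),
    ]
    print(simulate(trace))
    print(simulate(trace, ttl_override=1000.0))
```

Note that the name-based vhost problem shows up directly in last_binding: two names on the same IP overwrite each other, so a SYN can get attributed to the wrong name.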

There are some obvious potential sources of error - most particularly,  
name-based HTTP virtual hosting may break some of the assumptions in  
this - but I'd guess that with a somewhat smaller trace, not too much  
error is introduced by clients going to different name-based vhosts on  
the same IP address within a small amount of time.  There is
certainly some, but I'd be surprised if it affected more than a small
percentage of the accesses.  Are there other methodological concerns?

I'd also point out for this discussion two studies that looked at how  
accurately one can geo-map clients based on the IP address of their  
chosen DNS resolver.  There are obviously potential pitfalls here  
(e.g., someone who travels and still uses their "home" resolver).  In  
2002:

Z. M. Mao, C. D. Cranor, F. Douglis, and M. Rabinovich. A Precise and  
Efficient Evaluation of the Proximity between Web Clients and their  
Local DNS Servers. In Proc. USENIX Annual Technical Conference,  
Berkeley, CA, June 2002.

Bottom line:  It's ok but not great.

"We conclude that DNS is good for very coarse-grained server  
selection, since 64% of the associations belong to the same Autonomous  
System. DNS is less useful for finer-grained server selection, since  
only 16% of the client and local DNS associations are in the same  
network-aware cluster [13] (based on BGP routing information from a  
wide set of routers)"

We did a wardriving study in Pittsburgh recently where we found that,
of the access points we could connect to, 99% used their ISP-provided
DNS server.  Pretty good if your target is residential users:

http://www.cs.cmu.edu/~dga/papers/han-imc2008-abstract.html

(it's a small part of the paper in section 4.3).

   -Dave