looking for hostname geographic hint validation

Bradley Huffaker bhuffake at caida.org
Wed Aug 28 19:16:14 UTC 2013


On Wed, Aug 28, 2013 at 04:07:05PM +0100, Ben wrote:
> Dear Bradley,
> 
> So basically you're asking others to do your homework for you ?   ;-)

Actually no, I'm asking people to do something which I cannot.

While it is true I could test against a manual inference, I would simply
be checking one inference against another. Agreement would only prove
that the algorithm does what I expect. Only the operators, who actually
know what they are doing, can give me the ground truth I need to test my
inferences against reality.

> For example, picking one example from your list ....
> 
> <iata>([^a-z]+[a-z]+\d*){3}.ic.ac.uk
>
> Far from being IATA codes, the intermediate subdomains actually refer to 
> departments (DepartmentOfComputing and CHemistry in the two I quoted).
> 
> Sorry to rain on your parade, but someone had to say it.  ;-)

You are most likely right, but I am not looking for perfection. I am
hoping for an inference that will get me within 10 km of the actual
city most of the time.
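
To make the 10 km criterion concrete, here is a minimal sketch of how
such a check could look. It is an illustration only, not the code behind
the inference: the function names, the use of the haversine formula, and
the mean Earth radius are all assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed; good enough at city scale)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_threshold(inferred, actual, threshold_km=10.0):
    """True if the inferred (lat, lon) lies within threshold_km of the actual one."""
    return haversine_km(*inferred, *actual) <= threshold_km
```

The threshold is a parameter, so the same check can be rerun with a
looser or tighter bound than 10 km.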

So far, of the 19,611 hostnames for which we infer a location and for
which I have validation data, we infer the city correctly 93% of the
time.
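
For anyone who wants to run the same kind of comparison against their
own hostnames, a rough sketch of the accuracy calculation follows. The
function name and the dictionary layout are assumptions, and the sample
hostnames are made up for illustration; they are not from the validation
set.

```python
def city_accuracy(inferred, ground_truth):
    """Compare inferred cities against operator-supplied ground truth.

    Both arguments map hostname -> city name. Only hostnames present in
    both are scored. Returns (number_scored, fraction_correct), or
    (0, None) when there is no overlap.
    """
    scored = [h for h in inferred if h in ground_truth]
    if not scored:
        return 0, None
    correct = sum(1 for h in scored if inferred[h] == ground_truth[h])
    return len(scored), correct / len(scored)


# Hypothetical sample data, for illustration only.
inferred = {
    "lax1.example.net": "Los Angeles",
    "lhr2.example.net": "London",
    "cdg3.example.net": "Paris",
}
ground_truth = {
    "lax1.example.net": "Los Angeles",
    "lhr2.example.net": "Leeds",
}
print(city_accuracy(inferred, ground_truth))  # (2, 0.5)
```

Hostnames without ground truth are left out of the denominator, which
mirrors counting only hostnames that have both an inferred location and
validation data.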

While there is work left to do, it is far from the lost cause you
present.

-- 
    the value of a world model is not how accurately it captures reality
    but how often it leads us to take appropriate action
