<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<p>I legit guffawed.<br>
</p>
<div class="moz-cite-prefix">On 19-04-29 13 h 13, Eric Kuhnke wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CAB69EHjPvjM3bFJ6J4nJG0ymmsvFzTnzXxSh7LGxDCsf_mjF5A@mail.gmail.com">
<div dir="ltr">
<div>I would caution against putting much faith in geolocation
or site identification based on reverse DNS PTR records. There
are a vast number of unmaintained, stale, or simply wrong PTR
records out there. I can name at least a half dozen ISPs that
have absorbed other ASes, some of which had themselves acquired
other ASes earlier in their history, forming a turducken of
obsolete PTR records that still carry ISP domain names last in
use in 2002.</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Mon, Apr 29, 2019 at 6:15
AM Matthew Luckie &lt;<a href="mailto:mjl@luckie.org.nz"
moz-do-not-send="true">mjl@luckie.org.nz</a>&gt; wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi
NANOG,<br>
<br>
To support Internet topology analysis efforts, I have been
working on<br>
an algorithm to automatically detect router names inside
hostnames<br>
(PTR records) for router interfaces, and build regular
expressions<br>
(regexes) to extract them. By "router name" inside the
hostname, I<br>
mean a substring, or set of non-contiguous substrings, that is
common<br>
among interfaces on a router. For example, suppose we had the<br>
following three routers in the <a href="http://savvis.net"
rel="noreferrer" target="_blank" moz-do-not-send="true">savvis.net</a>
domain suffix, each with two<br>
interfaces:<br>
<br>
<a href="http://das1-v3005.nj2.savvis.net" rel="noreferrer"
target="_blank" moz-do-not-send="true">das1-v3005.nj2.savvis.net</a><br>
<a href="http://das1-v3006.nj2.savvis.net" rel="noreferrer"
target="_blank" moz-do-not-send="true">das1-v3006.nj2.savvis.net</a><br>
<br>
<a href="http://das1-v3005.oc2.savvis.net" rel="noreferrer"
target="_blank" moz-do-not-send="true">das1-v3005.oc2.savvis.net</a><br>
<a href="http://das1-v3007.oc2.savvis.net" rel="noreferrer"
target="_blank" moz-do-not-send="true">das1-v3007.oc2.savvis.net</a><br>
<br>
<a href="http://das2-v3009.nj2.savvis.net" rel="noreferrer"
target="_blank" moz-do-not-send="true">das2-v3009.nj2.savvis.net</a><br>
<a href="http://das2-v3012.nj2.savvis.net" rel="noreferrer"
target="_blank" moz-do-not-send="true">das2-v3012.nj2.savvis.net</a><br>
<br>
We might infer that the router names are das1|nj2, das1|oc2, and
das2|nj2,<br>
respectively, and that they are captured by the regex:<br>
^([a-z]+\d+)-[^\.]+\.([a-z]+\d+)\.savvis\.net$<br>
<br>
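As an illustrative sketch only (not the algorithm's own code), the regex above can be applied in Python to the six example hostnames. The names router_name and group_by_router, and the use of "|" to join the two captured substrings, are assumptions made for this example.<br>
<pre>
```python
import re

# The regex quoted in the message. Its two capture groups are the
# non-contiguous substrings that together form the router name.
ROUTER_NAME_RE = re.compile(r"^([a-z]+\d+)-[^\.]+\.([a-z]+\d+)\.savvis\.net$")

# The six example interface hostnames from the message.
HOSTNAMES = [
    "das1-v3005.nj2.savvis.net",
    "das1-v3006.nj2.savvis.net",
    "das1-v3005.oc2.savvis.net",
    "das1-v3007.oc2.savvis.net",
    "das2-v3009.nj2.savvis.net",
    "das2-v3012.nj2.savvis.net",
]


def router_name(hostname):
    """Return the inferred router name, or None if the regex does not match.

    Hypothetical helper: joining the captures with "|" mirrors the
    notation used in the message, nothing more.
    """
    match = ROUTER_NAME_RE.match(hostname)
    if match is None:
        return None
    return "|".join(match.groups())


def group_by_router(hostnames):
    """Group interface hostnames by their extracted router name."""
    routers = {}
    for hostname in hostnames:
        routers.setdefault(router_name(hostname), []).append(hostname)
    return routers


if __name__ == "__main__":
    for name, interfaces in group_by_router(HOSTNAMES).items():
        print(name, interfaces)
```
</pre>
Grouping the hostnames this way puts two interfaces under each of the three router names listed above. A hostname that does not match the regex falls under the key None.<br>
<br>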
After much refinement based on smaller sets of ground truth,
I'm<br>
asking for broader feedback from operators. I've placed a
webpage at<br>
<a href="https://www.caida.org/~mjl/rnc/" rel="noreferrer"
target="_blank" moz-do-not-send="true">https://www.caida.org/~mjl/rnc/</a>
that shows the inferences my algorithm<br>
made for 2523 domains. If you operate one of the domains in
that<br>
list, I would appreciate it if you could comment (private is
probably<br>
better but public is fine with me) on whether the regex my
algorithm<br>
inferred represents your naming intent. In the first
instance, I am<br>
most interested in feedback for the suffix / date combinations
for<br>
suffixes that are colored green, i.e. appear to be reasonable.<br>
<br>
Each suffix / date combination links to a page that contains
the<br>
naming convention and corresponding inferences. The colored
part of<br>
each hostname is the inferred router name. The green
hostnames appear<br>
to be correct, at least as far as the algorithm determined.
Some<br>
suffixes have errors due to either stale hostnames or
incorrect<br>
training data, and those hostnames are colored red or orange.<br>
<br>
If anyone is interested in the sets of hostnames the algorithm may
have<br>
inferred as 'stale' for their network, I can provide that
information.<br>
For some operators the stale hostnames were an oversight, and they
were<br>
grateful to learn about them.<br>
<br>
Thanks,<br>
<br>
Matthew<br>
</blockquote>
</div>
</blockquote>
</body>
</html>