<html>

  <head>

    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

  </head>

  <body text="#000000" bgcolor="#FFFFFF">

    <p>I legit guffawed.<br>

    </p>

    <div class="moz-cite-prefix">On 19-04-29 13 h 13, Eric Kuhnke wrote:<br>

    </div>

    <blockquote type="cite"

cite="mid:CAB69EHjPvjM3bFJ6J4nJG0ymmsvFzTnzXxSh7LGxDCsf_mjF5A@mail.gmail.com">


      <div dir="ltr">

        <div>I would caution against putting much faith in the validity
          of geolocation or site ID inferred from reverse DNS PTR records.
          There are a vast number of unmaintained, ancient, stale, erroneous
          or wildly wrong PTR records out there. I can name at least a
          half dozen ISPs that have absorbed other ASes, some of which
          had themselves acquired other ASes earlier in their history,
          forming a turducken of obsolete PTR records that includes
          ISP domain names last in use in 2002.</div>

        <div><br>

        </div>

        <div><br>

        </div>

      </div>

      <br>

      <div class="gmail_quote">

        <div dir="ltr" class="gmail_attr">On Mon, Apr 29, 2019 at 6:15

          AM Matthew Luckie &lt;<a href="mailto:mjl@luckie.org.nz"

            moz-do-not-send="true">mjl@luckie.org.nz</a>&gt; wrote:<br>

        </div>

        <blockquote class="gmail_quote" style="margin:0px 0px 0px

          0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi

          NANOG,<br>

          <br>

          To support Internet topology analysis efforts, I have been

          working on<br>

          an algorithm to automatically detect router names inside

          hostnames<br>

          (PTR records) for router interfaces, and build regular

          expressions<br>

          (regexes) to extract them.  By "router name" inside the

          hostname, I<br>

          mean a substring, or set of non-contiguous substrings, that is

          common<br>

          among interfaces on a router.  For example, suppose we had the<br>

          following three routers in the <a href="http://savvis.net"

            rel="noreferrer" target="_blank" moz-do-not-send="true">savvis.net</a>

          domain suffix, each with two<br>

          interfaces:<br>

          <br>

          <a href="http://das1-v3005.nj2.savvis.net" rel="noreferrer"

            target="_blank" moz-do-not-send="true">das1-v3005.nj2.savvis.net</a><br>

          <a href="http://das1-v3006.nj2.savvis.net" rel="noreferrer"

            target="_blank" moz-do-not-send="true">das1-v3006.nj2.savvis.net</a><br>

          <br>

          <a href="http://das1-v3005.oc2.savvis.net" rel="noreferrer"

            target="_blank" moz-do-not-send="true">das1-v3005.oc2.savvis.net</a><br>

          <a href="http://das1-v3007.oc2.savvis.net" rel="noreferrer"

            target="_blank" moz-do-not-send="true">das1-v3007.oc2.savvis.net</a><br>

          <br>

          <a href="http://das2-v3009.nj2.savvis.net" rel="noreferrer"

            target="_blank" moz-do-not-send="true">das2-v3009.nj2.savvis.net</a><br>

          <a href="http://das2-v3012.nj2.savvis.net" rel="noreferrer"

            target="_blank" moz-do-not-send="true">das2-v3012.nj2.savvis.net</a><br>

          <br>

          We might infer that the router names are das1|nj2, das1|oc2, and
          das2|nj2,<br>
          respectively, as captured by the regex:<br>

          ^([a-z]+\d+)-[^\.]+\.([a-z]+\d+)\.savvis\.net$<br>

          <br>
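          As a minimal sketch of how a regex like the one above could be
          applied, the Python below groups the six example hostnames by<br>
          their captured router name.  The function names are hypothetical
          and only illustrate the extraction step; they are not part of<br>
          the inference algorithm itself.<br>
          <br>
<pre>
```python
import re
from collections import defaultdict

# Regex quoted from the message above; its two capture groups are the
# router-name parts (for example "das1" and "nj2").
ROUTER_NAME_RE = re.compile(r"^([a-z]+\d+)-[^\.]+\.([a-z]+\d+)\.savvis\.net$")

# The six example hostnames from the message.
HOSTNAMES = [
    "das1-v3005.nj2.savvis.net",
    "das1-v3006.nj2.savvis.net",
    "das1-v3005.oc2.savvis.net",
    "das1-v3007.oc2.savvis.net",
    "das2-v3009.nj2.savvis.net",
    "das2-v3012.nj2.savvis.net",
]


def extract_router_name(hostname):
    """Return the captured parts joined with '|', or None if no match."""
    match = ROUTER_NAME_RE.match(hostname)
    if match is None:
        return None
    return "|".join(match.groups())


def group_by_router(hostnames):
    """Map each extracted router name to the hostnames that share it."""
    routers = defaultdict(list)
    for hostname in hostnames:
        name = extract_router_name(hostname)
        if name is not None:
            routers[name].append(hostname)
    return dict(routers)


if __name__ == "__main__":
    for name, members in group_by_router(HOSTNAMES).items():
        print(name, members)
```
</pre>
          Hostnames that do not match the convention return None and are
          left out of the grouping rather than being assigned to a router.<br>
          <br>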

          After much refinement based on smaller sets of ground truth,

          I'm<br>

          asking for broader feedback from operators.  I've placed a

          webpage at<br>

          <a href="https://www.caida.org/~mjl/rnc/" rel="noreferrer"

            target="_blank" moz-do-not-send="true">https://www.caida.org/~mjl/rnc/</a>

          that shows the inferences my algorithm<br>

          made for 2523 domains.  If you operate one of the domains in

          that<br>

          list, I would appreciate it if you could comment (private is

          probably<br>

          better but public is fine with me) on whether the regex my

          algorithm<br>

          inferred represents your naming intent.  In the first

          instance, I am<br>

          most interested in feedback for the suffix / date combinations

          for<br>

          suffixes that are colored green, i.e. appear to be reasonable.<br>

          <br>

          Each suffix / date combination links to a page that contains

          the<br>

          naming convention and corresponding inferences.  The colored

          part of<br>

          each hostname is the inferred router name.  The green

          hostnames appear<br>

          to be correct, at least as far as the algorithm determined. 

          Some<br>

          suffixes have errors due to either stale hostnames or

          incorrect<br>

          training data, and those hostnames are colored red or orange.<br>

          <br>

          If anyone is interested in the sets of hostnames the algorithm
          may have<br>
          inferred as 'stale' for their network, I can provide that
          information.<br>
          For some operators the stale hostnames were an oversight, and
          they were<br>
          grateful to learn about them.<br>

          <br>

          Thanks,<br>

          <br>

          Matthew<br>

        </blockquote>

      </div>

    </blockquote>

  </body>

</html>