<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  </head>
  <body>
    <p>Facebook has talked a little about their DNS infrastructure
      before.  Here's a clip, a few years old now, that covers
      Cartographer: <a class="moz-txt-link-freetext" href="https://youtu.be/bxhYNfFeVF4?t=2073">https://youtu.be/bxhYNfFeVF4?t=2073</a> <br>
    </p>
    <p>From their outage report, it sounds like their authoritative DNS
      servers withdraw their anycast announcements when they're
      unhealthy.  The health check on those servers must have relied
      on something upstream.  Maybe they couldn't talk to Cartographer
      for a few minutes, concluded they were isolated from the rest of
      the network, and withdrew their routes rather than serve stale
      data.  That makes sense when a single node does it, not so much
      when the entire fleet thinks it's out on its own.<br>
    </p>
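    <p>To make that failure mode concrete, here is a toy sketch in
      Python.  It is not Facebook's actual logic: the DnsNode class, the
      upstream_ok flag, and both policy functions are names made up for
      illustration, and the "guarded" variant assumes something has a
      fleet-wide view of health, which may not hold in practice.</p>

```python
from dataclasses import dataclass


@dataclass
class DnsNode:
    """Hypothetical anycast DNS node; every name here is illustrative."""
    name: str
    upstream_ok: bool         # outcome of the node's own health check
    announcing: bool = True   # whether it advertises the anycast prefix


def naive_policy(nodes):
    """Each node decides alone: withdraw as soon as its check fails."""
    for node in nodes:
        node.announcing = node.upstream_ok


def guarded_policy(nodes, min_announcing=1):
    """Withdraw unhealthy nodes only while a floor of healthy ones remains.

    If fewer than min_announcing nodes pass their check, the fault is more
    likely in the shared dependency than in the nodes themselves, so every
    node keeps announcing and serves possibly stale data.
    """
    healthy = sum(node.upstream_ok for node in nodes)
    if healthy >= min_announcing:
        naive_policy(nodes)
    else:
        for node in nodes:
            node.announcing = True


def announcing_count(nodes):
    """Number of nodes still advertising the anycast prefix."""
    return sum(node.announcing for node in nodes)


def make_fleet(size, failed):
    """Build a fleet where the first `failed` nodes fail their check."""
    return [DnsNode(f"ns{i}", upstream_ok=(i >= failed)) for i in range(size)]


if __name__ == "__main__":
    for failed in (1, 4):
        naive = make_fleet(4, failed)
        naive_policy(naive)
        guarded = make_fleet(4, failed)
        guarded_policy(guarded)
        print(failed, "failed:",
              "naive keeps", announcing_count(naive),
              "guarded keeps", announcing_count(guarded))
```

    <p>With one failed node both policies pull just that node.  With
      every node failing at once, the naive policy withdraws the whole
      fleet, while the guarded one keeps all of them announcing.</p>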
    <p>A performance issue in Cartographer (or whatever manages this
      fleet these days) could have been the ticking time bomb that set
      the whole thing in motion.<br>
    </p>
    <div class="moz-cite-prefix">On 10/5/21 3:39 PM, Michael Thomas
      wrote:<br>
    </div>
    <blockquote type="cite"
      cite="mid:7adfb521-88e2-75bc-8240-44d61c78b6ff@mtcc.com">
      <p>This bit posted by Randy might get lost in the other thread,
        but it appears that their DNS withdraws BGP routes for prefixes
        that it can't reach or that are flaky. Apparently that goes
        for the prefixes that the name servers are on too. This
        caused internal outages as well, since it seems they use their
        front-facing DNS just like everybody else. <br>
      </p>
      <p>Sounds like they might consider having at least one
        split-horizon server internally. Lots of fodder here.</p>
      <p>Mike<br>
      </p>
      <div class="moz-cite-prefix">On 10/5/21 11:11 AM, Randy Monroe
        wrote:<br>
      </div>
      <blockquote type="cite"
cite="mid:CALOB-K_e0JxWhsk2RC=AvFdNnzFbbv66xfK0d0bVz7Qw=xuPFA@mail.gmail.com">
        <div dir="ltr">Updated: <a
href="https://engineering.fb.com/2021/10/05/networking-traffic/outage-details/"
            moz-do-not-send="true">https://engineering.fb.com/2021/10/05/networking-traffic/outage-details/</a></div>
        <br>
        <div class="gmail_quote">
          <div dir="ltr" class="gmail_attr">On Tue, Oct 5, 2021 at 1:26
            PM Michael Thomas &lt;<a href="mailto:mike@mtcc.com"
              moz-do-not-send="true">mike@mtcc.com</a>&gt; wrote:<br>
          </div>
          <blockquote class="gmail_quote" style="margin:0px 0px 0px
            0.8ex;border-left:1px solid
            rgb(204,204,204);padding-left:1ex"><br>
            On 10/5/21 12:17 AM, Carsten Bormann wrote:<br>
            > On 5. Oct 2021, at 07:42, William Herrin &lt;<a
              href="mailto:bill@herrin.us" target="_blank"
              moz-do-not-send="true">bill@herrin.us</a>&gt; wrote:<br>
            >> On Mon, Oct 4, 2021 at 6:15 PM Michael Thomas &lt;<a
              href="mailto:mike@mtcc.com" target="_blank"
              moz-do-not-send="true">mike@mtcc.com</a>&gt; wrote:<br>
            >>> They have a monkey patch subsystem. Lol.<br>
            >> Yes, actually, they do. They use Chef extensively
            to configure<br>
            >> operating systems. Chef is written in Ruby. Ruby
            has something called<br>
            >> Monkey Patches.<br>
            > While Ruby indeed has a chain-saw (read: powerful,
            dangerous, still the tool of choice in certain cases) in its
            toolkit that is generally called “monkey-patching”, I think
            Michael was actually thinking about the “chaos monkey”,<br>
            > <a
              href="https://en.wikipedia.org/wiki/Chaos_engineering#Chaos_Monkey"
              rel="noreferrer" target="_blank" moz-do-not-send="true">https://en.wikipedia.org/wiki/Chaos_engineering#Chaos_Monkey</a><br>
            > <a href="https://netflix.github.io/chaosmonkey/"
              rel="noreferrer" target="_blank" moz-do-not-send="true">https://netflix.github.io/chaosmonkey/</a><br>
            <br>
            No, chaos monkey is a purposeful thing to induce corner case
            errors so <br>
            they can be fixed. The earlier outage involved a config
            sanitizer that <br>
            screwed up and then pushed it out. I can't get my head
            around why <br>
            anybody thought that was a good idea vs rejecting it and
            making somebody <br>
            fix the config.<br>
            <br>
            Mike<br>
            <br>
            <br>
          </blockquote>
        </div>
        <br clear="all">
        <div><br>
        </div>
        -- <br>
        <div dir="ltr" class="gmail_signature">
          <div dir="ltr">Randy Monroe<br>
            Network Engineering<br>
            <a href="https://uber.com/" rel="nofollow" target="_blank"
              moz-do-not-send="true">Uber</a></div>
        </div>
      </blockquote>
    </blockquote>
  </body>
</html>