Facebook post-mortems...
Hank Nussbacher
hank at interall.co.il
Wed Oct 6 04:51:52 UTC 2021
On 05/10/2021 21:11, Randy Monroe via NANOG wrote:
> Updated:
> https://engineering.fb.com/2021/10/05/networking-traffic/outage-details/
Let's try to break down this "engineering" blog posting:
- "During one of these routine maintenance jobs, a command was issued
with the intention to assess the availability of global backbone
capacity, which unintentionally took down all the connections in our
backbone network"
Can anyone guess which command FB issued that would cause them to
withdraw all those prefixes?
- "it was not possible to access our data centers through our normal
means because their networks were down, and second, the total loss of
DNS broke many of the internal tools we’d normally use to investigate
and resolve outages like this. Our primary and out-of-band network
access was down..."
Does this mean that FB acknowledges that the loss of DNS broke their OOB
access?
-Hank