massive facebook outage presently

Luke Guillory LGuillory at reservetele.com
Mon Oct 4 18:48:12 UTC 2021


The following is from what I believe was a FB employee on Reddit; the account now appears to have been deleted.


As many of you know, DNS for FB services has been affected. This is likely a symptom of the actual issue: BGP peering with Facebook's peering routers has gone down, very likely due to a configuration change that went into effect shortly before the outages began (roughly 1540 UTC).



There are people now trying to gain access to the peering routers to implement fixes, but the people with physical access are separate from the people who know how to authenticate to the systems and from the people who know what to actually do, so there is now a logistical challenge in bringing all that knowledge together.



Part of this is also due to lower staffing in data centers due to pandemic measures.



I believe the original change was 'automatic' (as in configuration done via a web interface). However, now that the connection to the outside world is down, remote access to those tools no longer exists, so the emergency procedure is to gain physical access to the peering routers and do all the configuration locally.



https://twitter.com/jgrahamc/status/1445068309288951820 "About five minutes before Facebook's DNS stopped working we saw a large number of BGP changes (mostly route withdrawals) for Facebook's ASN."
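
To illustrate why withdrawn routes would surface as a DNS failure, here is a minimal sketch of the mechanism described above. It is a toy model only, not Facebook's actual setup: the prefixes and nameserver addresses are hypothetical (taken from documentation ranges), and the helper names `reachable` and `resolve` are made up for this example.

```python
import ipaddress

# Hypothetical prefixes (documentation ranges), NOT Facebook's real announcements.
ANNOUNCED = {
    ipaddress.ip_network("192.0.2.0/24"),     # assumed to cover the authoritative nameservers
    ipaddress.ip_network("198.51.100.0/24"),  # assumed to cover the web front ends
}

# Hypothetical authoritative nameserver addresses inside the first prefix.
NAMESERVERS = [
    ipaddress.ip_address("192.0.2.53"),
    ipaddress.ip_address("192.0.2.54"),
]


def reachable(addr, routes):
    """An address is reachable only if some announced prefix covers it."""
    return any(addr in prefix for prefix in routes)


def resolve(routes):
    """Return 'answer' if any nameserver is reachable, else 'SERVFAIL'."""
    if any(reachable(ns, routes) for ns in NAMESERVERS):
        return "answer"
    return "SERVFAIL"


before = resolve(ANNOUNCED)  # routes still announced
after = resolve(set())       # every route withdrawn
print(before, after)         # prints: answer SERVFAIL
```

In this model the nameservers never stop running; they simply become unreachable once no prefix covering them is announced, which matches the observation that the BGP withdrawals preceded the DNS failures.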




From: NANOG <nanog-bounces+lguillory=reservetele.com at nanog.org> On Behalf Of Baldur Norddahl
Sent: Monday, October 4, 2021 1:41 PM
To: NANOG <nanog at nanog.org>
Subject: Re: massive facebook outage presently

I got a mail that Facebook was leaving NLIX. Maybe someone botched the script so they took down all BGP sessions instead of just NLIX and now they can't access the equipment to put it back... :-)


On Mon, Oct 4, 2021 at 20:31, Billy Croan <BCroan at unrealservers.net<mailto:BCroan at unrealservers.net>> wrote:
I know what this is.....  They forgot to update the credit card on their godaddy account and the domain lapsed.  I guess it will be facebook.info<https://link.edgepilot.com/s/7bad5051/Di9CwLEB1E6iB_KlhyWtZA?u=http://facebook.info/> when they get it back online.  The post mortem should be an interesting read.

On Mon, Oct 4, 2021 at 11:46 AM Jason Kuehl <jason.w.kuehl at gmail.com<mailto:jason.w.kuehl at gmail.com>> wrote:
Looks like they run their own nameservers, and I see that even the SOA records are missing.

On Mon, Oct 4, 2021, 12:23 PM Mel Beckman <mel at beckman.org<mailto:mel at beckman.org>> wrote:
Here’s a screenshot:

 -mel beckman


On Oct 4, 2021, at 9:06 AM, Eric Kuhnke <eric.kuhnke at gmail.com<mailto:eric.kuhnke at gmail.com>> wrote:

https://link.edgepilot.com/s/3926b9ff/bTkszib6zUmYbE_rZxhltQ?u=https://downdetector.com/status/facebook/

Normally not worth mentioning random $service having an outage here, but this will undoubtedly generate a large volume of customer service calls.

Appears to be a failure in DNS resolution.
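
For anyone fielding those calls, a quick way to tell a name-resolution failure apart from other connectivity problems is sketched below. This is a rough sketch under stated assumptions: `classify_resolution` is a hypothetical helper name, and the `resolver` parameter exists only so the function can be exercised without network access; it defaults to the standard library's `socket.getaddrinfo`.

```python
import socket


def classify_resolution(hostname, resolver=socket.getaddrinfo):
    """Return ('ok', addresses) or ('dns_failure', reason) for a hostname.

    `resolver` is an assumption of this sketch: any callable with the
    (host, port) calling convention of socket.getaddrinfo will do.
    """
    try:
        results = resolver(hostname, 443)
    except socket.gaierror as exc:
        # The lookup itself failed, so no connection was ever attempted.
        return ("dns_failure", str(exc))
    # Each result is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address as a string.
    addresses = sorted({entry[4][0] for entry in results})
    return ("ok", addresses)


if __name__ == "__main__":
    print(classify_resolution("facebook.com"))
```

A `dns_failure` result means the client never got as far as connecting, which is what the reports so far describe.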


