Facebook post-mortems...
Warren Kumari
warren at kumari.net
Tue Oct 5 18:07:46 UTC 2021
On Tue, Oct 5, 2021 at 1:47 PM Miles Fidelman <mfidelman at meetinghouse.net>
wrote:
> jcurran at istaff.org wrote:
>
> Fairly abstract - Facebook Engineering -
> https://m.facebook.com/nt/screen/?params=%7B%22note_id%22%3A10158791436142200%7D&path=%2Fnotes%2Fnote%2F&_rdr
>
> Also, Cloudflare’s take on the outage -
> https://blog.cloudflare.com/october-2021-facebook-outage/
>
> FYI,
> /John
>
> This may be a dumb question, but does this suggest that Facebook publishes
> rather short TTLs for their DNS records? Otherwise, why would an internal
> failure make them unreachable so quickly?
>
Looks like 60 seconds:
$ dig +norec star-mini.c10r.facebook.com. @d.ns.c10r.facebook.com.
; <<>> DiG 9.10.6 <<>> +norec star-mini.c10r.facebook.com. @d.ns.c10r.facebook.com.
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 25582
;; flags: qr aa; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;star-mini.c10r.facebook.com. IN A
;; ANSWER SECTION:
star-mini.c10r.facebook.com. 60 IN A 157.240.229.35
;; Query time: 42 msec
;; SERVER: 185.89.219.11#53(185.89.219.11)
;; WHEN: Tue Oct 05 14:01:06 EDT 2021
;; MSG SIZE rcvd: 72
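For anyone scripting this check: the TTL is the second field of the answer record above. A minimal sketch, assuming the usual "name TTL class type rdata" layout that dig prints; parse_answer_line is a made-up helper for illustration, not part of dig or any library:

```python
# Hypothetical helper: split one line of a dig ANSWER SECTION into fields.
# Assumes the whitespace-separated layout "name TTL class type rdata".
def parse_answer_line(line):
    name, ttl, rr_class, rr_type, rdata = line.split(None, 4)
    return {
        "name": name,
        "ttl": int(ttl),  # seconds a resolver may cache this answer
        "class": rr_class,
        "type": rr_type,
        "rdata": rdata.strip(),
    }


# The answer line from the dig output above.
record = parse_answer_line("star-mini.c10r.facebook.com. 60 IN A 157.240.229.35")
print(record["name"], "TTL:", record["ttl"], "seconds")
```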
... and cue the "Bwahahhaha! If *I* ran Facebook I'd make the TTL be
[2sec|30sec|5min|1h|6h+3sec|1day|6months|maxint32]" threads....
Choosing the TTL is a balancing act between stability, agility, load,
politeness, renewal latency, etc -- but I'm sure NANOG can boil it down to
"They did it wrong!..."
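A back-of-the-envelope sketch of two of those knobs (agility vs. load), under loudly simplified assumptions: one caching resolver, a name that is queried constantly, TTLs honored exactly, no serve-stale and no prefetch. The function name and the list of TTLs are illustrative only, not anything Facebook publishes:

```python
SECONDS_PER_DAY = 86_400


def ttl_tradeoff(ttl_seconds):
    """Rough bounds for one resolver caching one constantly-queried record."""
    return {
        # Longest a cached answer can outlive an authoritative-side outage.
        "max_cache_survival_s": ttl_seconds,
        # Upper bound on refetches this resolver sends upstream per day.
        "max_refetches_per_day": SECONDS_PER_DAY // ttl_seconds,
    }


# Illustrative TTLs only; 60 is the value seen in the dig output.
for label, ttl in [("30sec", 30), ("60sec", 60), ("5min", 300),
                   ("1h", 3600), ("1day", 86_400)]:
    t = ttl_tradeoff(ttl)
    print(f"{label:>6}: cached answer gone within {t['max_cache_survival_s']}s, "
          f"up to {t['max_refetches_per_day']} refetches/day per resolver")
```

With a 60 second TTL, a resolver that cannot reach the authoritative servers runs out of cached answers within about a minute, which is consistent with how quickly things went dark.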
W
> Miles Fidelman
>
> --
> In theory, there is no difference between theory and practice.
> In practice, there is. .... Yogi Berra
>
> Theory is when you know everything but nothing works.
> Practice is when everything works but no one knows why.
> In our lab, theory and practice are combined:
> nothing works and no one knows why. ... unknown
>
>
--
The computing scientist’s main challenge is not to get confused by the
complexities of his own making.
-- E. W. Dijkstra