Disaster Recovery Process

Jared Mauch jared at puck.nether.net
Tue Oct 5 12:50:52 UTC 2021

> On Oct 4, 2021, at 4:53 PM, Jorge Amodio <jmamodio at gmail.com> wrote:
> How come such a large operation does not have out-of-band access in case of emergencies?

I mentioned to someone yesterday that most OOB systems _are_ the internet. It doesn’t always seem like you need things like modems or dial backup, or access to these services, but when you do, it’s critical.

A few reminders for people:

1) Program your co-workers into your cell phone
2) Print out an emergency contact sheet
3) Have a backup conference bridge/system that you test
  - If Zoom/Webex/MS are down, where do you go? Slack? Google Meet? Audio bridge?
  - No judgement, but do test the system!
4) Know how to access the office and who is closest.  
  - What happens if they are in the hospital, sick or on vacation?
5) Complacency is dangerous
  - When the tools “just work” you never imagine the tools won’t work. I’m sure the internal list of lessons learned will be long.
  - I hope they share them externally so others can learn.
6) No really, test the backup process.
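
As a rough illustration of items 3 and 6, here is a minimal Python sketch of a check you could run on a schedule. Everything specific in it is an assumption for illustration: the hostnames, ports, and channel labels are hypothetical placeholders, and a successful TCP connection only shows that something answered from wherever you ran the script. It is most useful when run from a path that does not depend on your primary network, and it is no substitute for actually dialing in.

```python
import socket
from typing import Callable, Iterable, List, Tuple

# (label, host, port) for each fallback channel. All entries are
# hypothetical placeholders; substitute your own bridge, chat, and OOB hosts.
BACKUP_CHANNELS: List[Tuple[str, str, int]] = [
    ("backup audio bridge", "bridge.example.net", 443),
    ("backup chat", "chat.example.net", 443),
    ("console server", "oob.example.net", 22),
]


def tcp_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failure, refusal, timeout, and unreachable networks.
        return False


def failed_channels(
    channels: Iterable[Tuple[str, str, int]],
    probe: Callable[[str, int], bool] = tcp_reachable,
) -> List[str]:
    """Return the labels of channels whose probe did not succeed."""
    return [label for label, host, port in channels if not probe(host, port)]
```

Calling `failed_channels(BACKUP_CHANNELS)` returns the labels that did not answer; an empty list means every channel accepted a connection. The `probe` argument is there so the check can be swapped for something closer to real use, such as a login attempt.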

* interlude *

Back in my time at 2914, one reason we all had T1s at home was so we could get into the network should something bad happen. My home IP space was in the router ACLs. Much has changed since those early days as this network became more reliable. We’ve seen large outages of platforms, carriers, etc. in the past 2 years (the Aug 30th 2020 issue is still firmly in my memory).

Plan for the outages and make sure you understand your playbook. It may range from a snow day to all hands on deck. Test it at least once, ideally with someone who will challenge a few assumptions (e.g., that the cell network will be up).

- Jared

More information about the NANOG mailing list