Facebook post-mortems...

Michael Thomas mike at mtcc.com
Tue Oct 5 20:52:50 UTC 2021

Actually for card readers, the offline verification nature of 
certificates is probably a nice property. But client certs pose all 
sorts of other problems like their scalability, ease of making changes 
(roles, etc), and other kinds of considerations that make you want to 
fetch more information online... which completely negates the advantages 
of offline verification. Just the CRL problem would probably sink you 
since when you fire an employee you want access to be cut off immediately.
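
To make the CRL window concrete, here is a toy sketch in Python. It is a model only: there is no signature checking, and the `CachedCrl` structure, the serial number 4711, and the timestamps are all made up for illustration. It assumes a reader that decides purely from data it already holds.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class CachedCrl:
    """Revocation list as last fetched by the reader (hypothetical structure)."""
    fetched_at: datetime
    revoked_serials: set = field(default_factory=set)


def offline_check(serial, not_after, crl, now):
    """Decide access using only data already stored on the reader.

    Models expiry and revocation only; a real reader would also verify
    the certificate signature chain.
    """
    if now > not_after:
        return False  # certificate has expired
    return serial not in crl.revoked_serials


# Assumed timeline: CRL fetched at 09:00, employee fired at 10:00.
fetched = datetime(2021, 10, 5, 9, 0)
fired_at = fetched + timedelta(hours=1)
not_after = datetime(2022, 10, 5)
crl = CachedCrl(fetched_at=fetched)

# At 11:00 the stale CRL does not yet list the badge, so the door opens.
still_allowed = offline_check(4711, not_after, crl, now=fetched + timedelta(hours=2))

# Access is cut off only once a fresh CRL arrives, which needs connectivity.
crl = CachedCrl(fetched_at=fired_at + timedelta(hours=3), revoked_serials={4711})
allowed_after_refresh = offline_check(4711, not_after, crl, now=fired_at + timedelta(hours=4))
```

Shrinking that window means fetching revocation data more often, which is exactly the online dependency that offline verification was supposed to avoid.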

The other thing that would scare me in general about relying on offline 
verification is that the *reason* it's being used (to keep working offline) 
might get forgotten, and back come the online dependencies while nobody is 
looking.

BTW: you don't need to reach the trust anchor, though you almost 
certainly need to run OCSP or something like it if you have client certs.
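
A rough Python sketch of that split, under stated assumptions: the chain check against a locally stored trust anchor is represented by a plain boolean, and `OcspResult`, `grant_access`, and the `hard_fail` flag are hypothetical names for illustration, not any real OCSP client API.

```python
from enum import Enum


class OcspResult(Enum):
    GOOD = "good"
    REVOKED = "revoked"
    UNREACHABLE = "unreachable"  # responder could not be contacted


def grant_access(chain_ok, ocsp, hard_fail):
    """Combine local chain validation with an OCSP-style status.

    chain_ok:  result of verifying the chain against a locally stored
               trust anchor (no network needed for this part).
    hard_fail: if True, an unreachable responder denies access.
    """
    if not chain_ok or ocsp is OcspResult.REVOKED:
        return False
    if ocsp is OcspResult.UNREACHABLE:
        return not hard_fail
    return True


# Bad network day: the chain still validates locally, but the responder is gone.
locked_out = not grant_access(True, OcspResult.UNREACHABLE, hard_fail=True)
let_in_unchecked = grant_access(True, OcspResult.UNREACHABLE, hard_fail=False)
```

Either policy hurts when the responder is unreachable: hard-fail locks everyone out, and soft-fail admits anyone whose revocation you could not check.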


On 10/5/21 1:34 PM, Matthew Petach wrote:
> On Tue, Oct 5, 2021 at 8:57 AM Kain, Becki (.) <bkain1 at ford.com> wrote:
>     Why ever would you have a card reader on your external facing network,
>     if that was really the reason they couldn't get in to fix it?
> Let's hypothesize for a moment.
> Let's suppose you've decided that certificate-based
> authentication is the cat's meow, and so you've got
> dot1x authentication on every network port in your
> corporate environment, all your users are authenticated
> via certificates, all properly signed all the way up the
> chain to the root trust anchor.
> Life is good.
> But then you have a bad network day.  Suddenly,
> you can't talk to upstream registries/registrars,
> you can't reach the trust anchor for your certificates,
> and you discover that all the laptops plugged into
> your network switches are failing to validate their
> authenticity; sure, you're on the network, but you're
> in a guest vlan, with no access.  Your user credentials
> aren't able to be validated, so you're stuck with the
> base level of access, which doesn't let you into the
> OOB network.
> Turns out your card readers were all counting on
> dot1x authentication to get them into the right vlan
> as well, and with the network buggered up, the
> switches can't validate *their* certificates either,
> so the door badge card readers just flash their
> LEDs impotently when you wave your badge at
> them.
> Remember, one attribute of certificates is that they are
> designated as valid for a particular domain, or set of
> subdomains with a wildcard; that is, an authenticator needs
> to know where the certificate is being presented to know if
> it is valid within that scope or not.   You can do that scope
> validation through several different mechanisms,
> such as through a chain of trust to a certificate authority,
> or through DNSSEC with DANE--but fundamentally,
> all certificates have a scope within which they are valid,
> and a means to identify in which scope they are being
> used.  And whether your certificate chain of trust is
> being determined by certificate authorities or DANE,
> they all require that trust to be validated by something
> other than the client and server alone--which generally
> makes them dependent on some level of external
> network connectivity being present in order to properly
> function.   [yes, yes, we can have a side discussion about
> having every authentication server self-sign certificates
> as its own CA, and thus eliminate external network
> connectivity dependencies--but that's an administrative
> nightmare that I don't think any large organization would
> sign up for.]
> So, all of the client certificates and authorization servers
> we're talking about exist on your internal network, but they
> all counted on reachability to your infrastructure
> servers in order to properly authenticate and grant
> access to devices and people.  If your BGP update
> made your infrastructure servers, such as DNS servers,
> become unreachable, then suddenly you might well
> find yourself locked out both physically and logically
> from your own network.
> Again, this is purely hypothetical, but it's one scenario
> in which a routing-level "oooooops" could end up causing
> physical-entry denial, as well as logical network access
> level denial, without actually having those authentication
> systems on external facing networks.
> Certificate-based authentication is scalable and cool, but
> it's really important to think about even generally "that'll
> never happen" failure scenarios when deploying it into
> critical systems.  It's always good to have the "break glass
> in case of emergency" network that doesn't rely on dot1x,
> that works without DNS, without NTP, without RADIUS,
> or any other external system, with a binder with printouts
> of the IP addresses of all your really critical servers and
> routers in it which gets updated a few times a year, so that
> when the SHTF, a person sitting at a laptop plugged into
> that network with the binder next to them can get into the
> emergency-only local account on each router to fix things.
> And yes, you want every command that local emergency-only
> user types into a router to be logged, because someone
> wanting to create mischief in your network is going to aim
> for that account access if they can get it; so watch it like a
> hawk, and the only time it had better be accessed and used
> is when the big red panic button has already been hit, and
> the executives are huddled around speakerphones wanting
> to know just how fast you can get things working again. ^_^;
> I know nothing of the incident in question.  But sitting at home,
> hypothesizing about ways in which things could go wrong, this
> is one of the reasons why I still configure static emergency
> accounts on network devices, even with centrally administered
> account systems, and why there's always a set of "no dot1x"
> ports that work to get into the OOB/management network even
> when everything else has gone toes-up.   :)
> So--that's one way in which an outage like this could have
> locked people out of buildings.   ^_^;
> Thanks!
> Matt
> [ready for the deluge of people pointing out I've overly simplified the
> validation chain for certificates in order to keep the post short and
> high-level.   ^_^; ]