Best practices inquiry: tracking SSH host keys

Wed Jun 28 20:25:35 UTC 2006

We all know that the weakest link of SSH is key management: if
you do not confirm through a secure out-of-band channel that the
public host key of the device you are connecting to is correct,
then SSH's crypto will not help you.

SSH implements neither a CA hierarchy (like X.509 certificates) nor
a web of trust (like PGP) so you are left checking the validity of
host keys yourself. Still, it's not so bad if you only connect to a
small handful of well-known servers. You will either have verified
them all soon enough and not be bothered with it anymore, or system
administrators will maintain a global known_hosts file that lists
all the correct ones.

But it's quite different when you manage a network of hundreds or
thousands of devices. I find myself connecting to devices I've
never connected to before on a regular basis and being prompted
to verify the public host keys they are offering up. This happens
in the course of something else that I am doing, and I don't
necessarily have the time to check a host key. Even if I did have
time, it's hard to check: the device is just one of a huge number
of network elements of no special significance to me. I didn't
install it or generate its key, and I don't know who did.
From time to time I also get warning messages from my SSH client
about a changed host key. It's probably just that someone swapped
out the router's hardware since the last time I connected and a
new key got generated. But I'm not sure.
Worst of all, my problem is repeated for every user because each
user is working with their own private ssh_known_hosts database
into which they accept host keys.

A possible solution is:

- Maintain a global known_hosts file. Make everyone who installs
a new router or turns up SSH on an existing one contribute to it.
Install it as the global (in /etc/) known_hosts file on all the
ssh clients you can.

Pro: The work to accept a new host key is done once, and it's
done by the person who installed the router, who is in the best
position to confirm that no man-in-the-middle attack is taking
place.

Con: You need to make sure updates to this file are authentic
(its benefit is lost if untrusted people are allowed to
contribute), and you need to be sure it gets installed on the
ssh clients people use to connect to the network elements.

Con: If a host key changes for a benign reason (such as the
hardware swap I describe above), users can't do much about it
until the global file is corrected and redeployed. There are
openssh options that bypass the problem, but they are complicated
and users will generally not know them.
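
To make the idea concrete, here is a rough sketch of how contributed
entries might be merged into the global file. It is only an
illustration under some assumptions: entries are plain (unhashed)
protocol 2 lines of the form "host[,host...] keytype base64key", and
the function names parse_known_hosts and merge_contribution are
made up for this example, not part of any SSH tool. A new key for a
host that already has a different key of the same type is reported
as a conflict rather than silently accepted.

```python
from typing import Dict, List, Tuple

# (hostname or address, key type, base64 key blob)
Entry = Tuple[str, str, str]


def parse_known_hosts(text: str) -> List[Entry]:
    """Parse unhashed known_hosts lines into one entry per host name."""
    entries: List[Entry] = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        fields = line.split()
        # Assumption: skip hashed host names ("|1|...") and marker
        # lines ("@..."), which this sketch does not try to handle.
        if len(fields) < 3 or fields[0].startswith(("|", "@")):
            continue
        hosts, keytype, key = fields[0], fields[1], fields[2]
        for host in hosts.split(","):
            entries.append((host, keytype, key))
    return entries


def merge_contribution(
    global_text: str, contributed_text: str
) -> Tuple[List[Entry], List[Entry]]:
    """Return (added, conflicts) for a contribution against the global file.

    added:     entries for a (host, key type) pair not seen before
    conflicts: entries whose key differs from the one already on file
    """
    known: Dict[Tuple[str, str], str] = {
        (host, keytype): key
        for host, keytype, key in parse_known_hosts(global_text)
    }
    added: List[Entry] = []
    conflicts: List[Entry] = []
    for host, keytype, key in parse_known_hosts(contributed_text):
        current = known.get((host, keytype))
        if current is None:
            known[(host, keytype)] = key
            added.append((host, keytype, key))
        elif current != key:
            conflicts.append((host, keytype, key))
        # identical key already on file: nothing to do
    return added, conflicts
```

Anything that lands in the conflicts list would still need a person
to check it out of band (a hardware swap, say) before the global
file is updated and redeployed.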

I'm looking for information on best practices that are in use to
tackle this problem. What solutions are working for you?

Thanks

-Phil