BGP over TLS (was: Re: "Using Cloud Resources to Dramatically Improve Internet Routing")

Mon Oct 21 20:41:35 UTC 2019

> On Oct 21, 2019, at 4:17 PM, Brandon Martin <lists.nanog at monmotha.net> wrote:
> 
> On 10/21/19 3:37 PM, Jeffrey Haas wrote:
>> BGP over ipsec works fine.  But that said, it's mostly done with pre-shared keys.
> 
> Is anybody actually doing it in practice?

Absolutely.  In the SP sector?  Less clear.

>> The ugly issue of ipsec is that the ecosystem really wants IKE to do the good things people associate with long lived sessions.  I don't even vaguely pretend to be an ipsec/ike expert, but the wrangling over this and router bootstrapping issues generated a lot of heat and a small amount of light in IETF a while back.
> 
> Yes.  ipsec (IP layer) itself isn't too bad.  IKE is a complex mess.  A functional mess, perhaps, but a mess nonetheless.  I'd really like to hear from someone actually qualified on the cryptography side of things to chime in on whether long-lived symmetric keys are even really a problem anymore.  If they're not, just generating a decent "session" key and statically defining an SPI is a lot more straightforward.

I'm not someone qualified, but I'll regurgitate what I've distilled from past conversations with those who are. :-)

Presuming your key is strong enough, it may be infeasible to break it in a time that's of interest to the parties involved.  The primary issue is the usual one with trying to keep anything secret: eventually it gets disclosed.  And if you have no procedure for periodically updating your keys, that disclosure becomes a lasting problem.
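For a rough sense of what "infeasible" means here, a back-of-the-envelope sketch in Python follows.  The guess rate of 10^12 keys per second is an assumed, hypothetical attacker capability, not a measured figure; the only point is how the arithmetic scales with key length.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def expected_bruteforce_years(key_bits: int, guesses_per_second: float) -> float:
    """Average time, in years, to find a key by exhaustive search.

    On average the attacker has to try half of the 2**key_bits keyspace.
    """
    expected_guesses = 2 ** (key_bits - 1)
    return expected_guesses / guesses_per_second / SECONDS_PER_YEAR


# Hypothetical attacker rate (assumption); substitute your own estimate.
ASSUMED_RATE = 1e12

for bits in (56, 80, 128):
    years = expected_bruteforce_years(bits, ASSUMED_RATE)
    print(f"{bits}-bit key: about {years:.3g} years on average")
```

None of this arithmetic helps once the key itself has leaked, which is why rotation matters more than raw key length.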

The problem is exacerbated by the fact that inter-provider key sharing is a PITA.  If you're having situations where you have to hit this list as a NOC of last resort, now try to imagine a regular cadence of conversations to update your key.  And then deal with the fact that key rotation for TCP-MD5 can be hit or miss.  In practice, this means that if someone who knew your keys was kicked out of your company, they still have the ability to do bad stuff.

The ability to more easily update your session keys is one of the big wins for TCP-AO.
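The reason rotation gets easier is that several keys, each with its own key ID, can be valid at the same time, so both sides can install the new key before anyone stops using the old one.  The Python sketch below illustrates only that overlap idea.  It is not the TCP-AO wire format or key derivation, and the `KeyChain` class and its method names are hypothetical.

```python
import hashlib
import hmac
from typing import Dict, Optional, Tuple


class KeyChain:
    """Toy keychain: several keys may be valid at once, each with an ID."""

    def __init__(self) -> None:
        self._keys: Dict[int, bytes] = {}
        self.send_id: Optional[int] = None

    def add(self, key_id: int, secret: bytes) -> None:
        self._keys[key_id] = secret

    def use_for_sending(self, key_id: int) -> None:
        if key_id not in self._keys:
            raise KeyError(key_id)
        self.send_id = key_id

    def retire(self, key_id: int) -> None:
        if key_id == self.send_id:
            raise ValueError("switch the sending key before retiring it")
        self._keys.pop(key_id, None)

    def sign(self, message: bytes) -> Tuple[int, bytes]:
        if self.send_id is None:
            raise RuntimeError("no sending key selected")
        secret = self._keys[self.send_id]
        return self.send_id, hmac.new(secret, message, hashlib.sha256).digest()

    def verify(self, key_id: int, message: bytes, mac: bytes) -> bool:
        secret = self._keys.get(key_id)
        if secret is None:
            return False  # unknown or already retired key ID
        expected = hmac.new(secret, message, hashlib.sha256).digest()
        return hmac.compare_digest(expected, mac)


sender, receiver = KeyChain(), KeyChain()
for chain in (sender, receiver):
    chain.add(1, b"old-secret")
sender.use_for_sending(1)
old_id, old_mac = sender.sign(b"KEEPALIVE")

# Rotation: both sides install key 2 first, then the sender switches,
# and only afterwards is key 1 retired.  The session never has to drop.
for chain in (sender, receiver):
    chain.add(2, b"new-secret")
sender.use_for_sending(2)
new_id, new_mac = sender.sign(b"KEEPALIVE")
accepted_during_overlap = receiver.verify(old_id, b"KEEPALIVE", old_mac)
receiver.retire(1)
```

With a single shared key and no key IDs, as with TCP-MD5, both ends have to change the key at effectively the same moment, which is where the "hit or miss" comes from.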

A lot of the issues behind transport security are mitigated - and this is a point I end up raising to various IETF security reviewers on a regular basis when talking about control plane protocols:
- It's possible to reduce the attack surface by using things like GTSM.  You've acquired the key somehow?  Great - now get packets to that link.
- Similarly, protecting the link itself through things like MACsec is a way to reduce the attack surface.
- What are the actual attacks they can do?  For BGP, knocking over the session is often the goal.  An active hijack needs a man in the middle, and if you can insert yourself into the conversation that's absolutely doable... but you're better off just hijacking a router through another compromise and then simply injecting bad routing data.
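To make the GTSM point concrete: the sender transmits with TTL 255 and the receiver drops anything that arrives with a lower TTL than a directly connected (or otherwise close enough) peer could produce.  Since every router in the path decrements the TTL, a remote attacker cannot fake this.  The Python sketch below shows only the acceptance check; the function names and the `max_forwarding_hops` parameter are hypothetical, and real implementations apply this check in the kernel or forwarding plane rather than in application code.

```python
MAX_TTL = 255


def ttl_on_arrival(initial_ttl: int, routers_crossed: int) -> int:
    """TTL left after each router in the path has decremented it by one."""
    return max(initial_ttl - routers_crossed, 0)


def gtsm_accept(received_ttl: int, max_forwarding_hops: int = 0) -> bool:
    """Accept only packets that crossed at most max_forwarding_hops routers.

    With the default of 0 (directly connected peer) the TTL must still be 255.
    """
    if not 0 <= received_ttl <= MAX_TTL:
        raise ValueError("TTL must fit in one octet")
    return received_ttl >= MAX_TTL - max_forwarding_hops


# Directly connected peer sending with TTL 255: no router in between.
peer_ok = gtsm_accept(ttl_on_arrival(MAX_TTL, routers_crossed=0))

# Attacker three routers away, even when sending with the maximum TTL.
attacker_ok = gtsm_accept(ttl_on_arrival(MAX_TTL, routers_crossed=3))

print(peer_ok, attacker_ok)
```

So even with a stolen key, the attacker's packets have to originate on, or very close to, the link itself.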

Where much of this puts us is that iBGP is of far more interest to an active attacker.  Protection of internal routing infrastructure, including properly configured firewalls, again can help here.  And this becomes even more tasty if you're in an environment making use of SDN-ish stuff.

> 
> OSPFv3 hopefully taught people some lessons with its initial lack of built-in authentication.  "Just use ipsec".

This one, IMNSHO, can be blamed on specific IETF religion at the time.  The fallout around this is one of my favorite examples of "this needed operational review".

>
>> And if you have a rather scaled out router, imagine the cpu melting that goes with a cold startup scenario where you have to get all of those IKE sessions up to start up your BGP.  Now think what that does to your restart time. 
> 
> Indeed, though I've seen a trend toward putting rather hefty CPUs on the control planes of "real" routers, nowadays, which I guess is welcome.  It doesn't really contribute much to the overall cost of something that can push 100s of Gbps in hardware, anyway.

Believe me, implementors are happy to have some extra cycles available.  However, too many target platforms are either still under-powered or have operational requirements that push them toward slower CPUs.  

And even for large enough platforms, security computation can eat every spare cycle you have.  Generally, a conversation with crypto experts will eventually devolve into "that key length/cipher is now considered weak, use the next one" - which is shorthand for saying "if you have available cpu, you're not using strong enough crypto". :-)

-- Jeff