Securing the BGP or controlling it?

Marshall Eubanks tme at
Tue May 11 15:08:12 UTC 2010

Dear Danny;

On May 11, 2010, at 10:13 AM, Danny McPherson wrote:

> On May 11, 2010, at 7:32 AM, Nick Hilliard wrote:
>> Risk analysis is ass covering without the theatre.  You collect data, make
>> a judgement based on that data, and if it turns out that the judgement says
>> that signed bgp updates constitute more of a stability risk to network
>> operations than the occasional shock problem
> So apply the risk management analogy here.  We all know that
> pretty much anyone can assert reachability for anyone else's
> address space inter-domain on the Internet, in particular the
> closer you get to 'the core' the easier this gets.  We also
> know that route "leaks" commonly occur that result in outages
> and the potential for intercept or other nefarious activity.
> Additionally, we know that deaggregation, and similar events
> result in wide-scale systemic effects.  We also know that
> topologically localized events occur that can impact our reachability,
> whether we're party to the actual fault or not.  We have a slew of
> empirical data to support all of these things, some more high profile
> than others, with route leaks likely occurring at the highest
> frequency (every single day).
> I would suspect that the probability of fire affecting your
> network availability is very low, as you can fail over to a
> new facility.  OTOH, if you have a route hijack (intentional
> or not) failover to a new facility with that address space
> isn't going to help, and hijacks can be topologically localized
> - the same applies for DDoS.  Yet I suspect your organization
> has invested reasonably in fire suppression systems, but the
> asset that matters most that enables the substrate of some
> applications and services that you care about - the availability
> of your address space within the global routing system, has no
> safeguards whatsoever, and can be impacted from anywhere in the
> world.
> I'd also venture a guess that we've had more routing issues that
> have resulted in network downtime of critical sites than we have had
> fires (if someone disproves that, nice dinner on me!).

But there is also recovery time, which you don't mention in your bet.

If the building I am sitting in right now were to burn down to the
ground, the client I am at would be affected for months and months.
Yes, they have backups and redundancy, but this is their HQ.

If they (say) fat finger their BGP, well, it would be bad, but if they
fix it this afternoon, everything will go back to normal shortly
thereafter.

So, sure, network outages may be more frequent than catastrophic fires,
but that doesn't mean that the aggregated duration of disruption from
network outages is greater than the aggregated duration of disruption
from fires.
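
To put rough numbers on that point, here is a minimal sketch that
multiplies event frequency by recovery time to get expected days of
disruption per year. Every figure in it is a made-up assumption chosen
only to illustrate the arithmetic, not data from anyone's network, and
the Hazard class and rank_by_expected_loss helper are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Hazard:
    name: str
    events_per_year: float  # assumed frequency of the event
    days_to_recover: float  # assumed disruption per event, in days

    def expected_days_lost_per_year(self) -> float:
        # Aggregated disruption = how often it happens * how long each one lasts.
        return self.events_per_year * self.days_to_recover


def rank_by_expected_loss(hazards):
    """Return hazards sorted by expected yearly disruption, worst first."""
    return sorted(
        hazards,
        key=lambda h: h.expected_days_lost_per_year(),
        reverse=True,
    )


# Hypothetical inputs: a routing incident twice a year fixed in about six
# hours, versus a once-a-century HQ fire that takes six months to recover from.
hazards = [
    Hazard("routing incident", events_per_year=2.0, days_to_recover=0.25),
    Hazard("HQ fire", events_per_year=0.01, days_to_recover=180.0),
]

for hazard in rank_by_expected_loss(hazards):
    print(f"{hazard.name}: {hazard.expected_days_lost_per_year():.2f} days/year")
```

With these invented inputs the rare event comes out ahead on aggregated
disruption even though it is 200 times less frequent. Different
assumptions would reverse the ranking, which is why both frequency and
recovery time belong in the comparison.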


> We've got empirical data, we understand the vulnerability and the
> risk (probability of a threat being used).  Put that in your risk
> management equation and consider what assets are most vulnerable
> to your organization - I'd venture it's something to do with network,
> and if routing ain't working, network ain't working...
> -danny

More information about the NANOG mailing list