AS-Path - ORF Draft

Mon Oct 23 06:35:42 UTC 2017

Dear Baldur,

On Mon, Oct 23, 2017 at 12:53:48AM +0200, Baldur Norddahl wrote:
> I do not get why every BGP implementation kills the session at the
> prefix limit. It appears that is making a bad situation worse. Routing
> flaps creating lots of visible disturbance for end users. When the BGP
> session restarts, it will just happen again and again until operator
> intervention.

Maximum prefix limits are used as a naive last resort to attempt to
protect against catastrophic failures such as memory/fib overflow and
full table route leaks. The moment a maximum prefix limit kicks in,
something somewhere went wrong and indeed an operator has to intervene.
That is the beauty and essence of the maxpfx feature. :)

> Instead an implementation could ignore any additional prefixes 

This may work in some specific cases, but can be disastrous in other
cases. In my opinion, in the context of Internet routing, the potential
for disaster outweighs any benefits I can see for "ignoring additional
prefixes" (in an L3VPN context different considerations may apply).

You offered "killing a session may make a bad situation worse", but
there are scenarios where keeping the session up can turn a bad
situation into a disaster.

I'll elaborate on the above with an example to hopefully clarify myself.
Let's take this event and hypothetically assume 'soft maximum prefix
limits' are a commonly deployed thing.
https://bgpmon.net/bgp-leak-causing-internet-outages-in-japan-and-beyond/

According to PeeringDB, AS 15169 recommends configuring 15,000 as the
maximum prefix limit for IPv4. (https://www.peeringdb.com/asn/15169)
Let's assume that Verizon had configured "a maximum of 15,000 but keep
the BGP session up"-style of soft limit. I currently see roughly 419
prefixes via AS15169 in the DFZ. 15000 - 419 = 14581, so this leaves
room for 14581 invalid announcements before the soft limit kicks in.
At that point I'd argue that it is better to just tear down the BGP
session rather than create a situation where 14581 invalid announcements
(which are part of a 160,000 prefix route leak) can continue to exist.
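
To make the arithmetic above concrete, here is a small sketch. The
numbers (15,000, 419, 160,000) come from the example above; the function
name and the assumption that a soft limit simply stops accepting routes
once the limit is reached are hypothetical, not a description of any
real implementation.

```python
# Hypothetical helper: how many extra prefixes fit under the limit
# before it triggers, given how many legitimate prefixes are already there.
def soft_limit_headroom(max_prefix_limit: int, legit_prefixes: int) -> int:
    return max(max_prefix_limit - legit_prefixes, 0)


max_prefix_limit = 15_000   # PeeringDB recommendation cited above
legit_prefixes = 419        # prefixes seen via AS15169 at the time
leak_size = 160_000         # size of the route leak in the example

headroom = soft_limit_headroom(max_prefix_limit, legit_prefixes)

# Assumed soft-limit behaviour: accept until full, then ignore the rest
# while keeping the session up.
accepted_invalid = min(headroom, leak_size)

print(headroom)          # 14581
print(accepted_invalid)  # 14581
```

Under that assumption the session stays up and carries 14581 leaked
routes, whereas a hard limit tears the session down and leaves none.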

We could go back and forth a bit on how high or low that '15,000' number
should be and how things would look if it was closer to 500. But in the
end actual operator intervention was needed, and soft maxprefix limits
would have the potential to hide that.

> or it could compare each additional prefix received to already learned
> prefixes and decide to drop one to make room for the new one. For
> example you could drop the most specific routes before less specific
> routes.

The moment a BGP implementation can do such RIB compression, it may
indeed make sense to offer two types of limits: a 'pre-policy maximum
prefix limit' and a 'post-policy maximum prefix limit'. The former type
of limit would be useful in the context of route leaks, the latter in
the context of protecting against FIB overflow.
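
As a rough illustration of the difference between the two limits, here
is a toy model. Everything in it (the class name, the `accept` policy
callback, raising an exception in place of tearing down the session) is
a hypothetical sketch and not how any particular BGP implementation
works.

```python
from dataclasses import dataclass, field
from typing import Callable, Set


class LimitExceeded(Exception):
    """Stands in for tearing down the session in this toy model."""


@dataclass
class ToySession:
    pre_policy_limit: int           # counts everything the peer sends
    post_policy_limit: int          # counts only what policy lets through
    accept: Callable[[str], bool]   # hypothetical import policy
    received: Set[str] = field(default_factory=set)
    installed: Set[str] = field(default_factory=set)

    def announce(self, prefix: str) -> None:
        # Pre-policy check: guards against route leaks, regardless of
        # whether the import policy would have filtered the route.
        self.received.add(prefix)
        if len(self.received) > self.pre_policy_limit:
            raise LimitExceeded("pre-policy limit exceeded")

        # Post-policy check: guards the FIB, counting only accepted routes.
        if self.accept(prefix):
            self.installed.add(prefix)
            if len(self.installed) > self.post_policy_limit:
                raise LimitExceeded("post-policy limit exceeded")
```

With a policy that drops most of what it receives, the pre-policy limit
can still trip on a leak even though very little reaches the FIB.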

Kind regards,

Job

ps. RPKI Origin Validation and BGPSEC do have the potential to change
the way we look at big hammers like maximum prefix limits, but we're not
there yet.