plea for comcast/sprint handoff debug help

Tim Bruijnzeels tim at nlnetlabs.nl
Mon Nov 2 08:13:16 UTC 2020


Hi Randy, all,

> On 31 Oct 2020, at 04:55, Randy Bush <randy at psg.com> wrote:
> 
>> If there is a covering less specific ROA issued by a parent, this will
>> then result in RPKI invalid routes.
> 
> i.e. the upstream kills the customer.  not a wise business model.

I did not say it was. But this is the problematic case.

For the vast majority of ROAs, the sustained loss of the repository would lead to invalid ROA *objects*, which will no longer be used in Route Origin Validation, leading to the state 'Not Found' for the associated announcements.

This is not the case if there are other ROAs for the same prefixes published by others (most likely the parent). Quick back of the envelope analysis: this affects about 0.05% of ROA prefixes.
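
To make the two outcomes concrete, here is a minimal sketch of origin validation in the style of RFC 6811. The VRP tuple, the validate() helper, and the prefixes and AS numbers (documentation ranges) are all made up for illustration; this is not how any particular validator or router implements it.

```python
import ipaddress
from typing import Iterable, NamedTuple


class VRP(NamedTuple):
    """Validated ROA payload: prefix, max length, authorised origin AS (illustrative type)."""
    prefix: str
    max_length: int
    asn: int


def validate(route: str, origin_asn: int, vrps: Iterable[VRP]) -> str:
    """Return 'valid', 'invalid' or 'not-found' for an announcement."""
    net = ipaddress.ip_network(route)
    covered = False
    for vrp in vrps:
        roa_net = ipaddress.ip_network(vrp.prefix)
        # Only VRPs of the same address family that cover the route count.
        if net.version != roa_net.version or not net.subnet_of(roa_net):
            continue
        covered = True
        if vrp.asn == origin_asn and net.prefixlen <= vrp.max_length:
            return "valid"
    # Covered by some VRP but matched by none: invalid. Covered by none: not found.
    return "invalid" if covered else "not-found"


# Hypothetical parent ROA covering a customer's more specific prefix.
parent = VRP("2001:db8::/32", 32, 64500)
customer = VRP("2001:db8:1::/48", 48, 64501)

print(validate("2001:db8:1::/48", 64501, [parent, customer]))  # customer ROA usable
print(validate("2001:db8:1::/48", 64501, []))                  # no ROA covers the route
print(validate("2001:db8:1::/48", 64501, [parent]))            # only the parent's ROA is left
```

The third call is the problematic case: once the customer's own ROA drops out, the covering ROA from the parent turns the announcement from 'Not Found' into invalid.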

>> The fall-back may help in cases where there is an accidental outage of
>> the RRDP server (for as long as the rsync servers can deal with the
>> load)
> 
> folk try different software, try different configurations, realize that
> having their CA gooey exposed because they wanted to serve rrdp and
> block, ...

We are talking here about the HTTPS server being unavailable, while rsync *is*.

So this means your HTTPS server is down, unreachable, or has an issue with its HTTPS certificate. Repository operators could use a CDN if they don't want to do all this themselves. They could monitor and fix things; there is time.

The thing is, even if HTTPS becomes unavailable, this still leaves hours (8 by default for the Krill CA, configurable) to fix things. Routinator (and the RIPE NCC Validator, and others) will use cached data if they cannot retrieve new data. It's only when manifests and CRLs start to expire that the objects would become invalid.
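
As a rough sketch of that timing, assuming the manifest and CRL both have 8 hours left when the outage starts: the cache_usable() helper and the timestamps below are hypothetical and only illustrate the expiry check, not the actual logic in Routinator or any other validator.

```python
from datetime import datetime, timedelta, timezone


def cache_usable(now: datetime,
                 manifest_next_update: datetime,
                 crl_next_update: datetime) -> bool:
    """Cached objects stay usable until the earlier of the two expiry times."""
    return now < min(manifest_next_update, crl_next_update)


# Assumed scenario: HTTPS goes down with 8 hours of validity remaining.
outage_start = datetime(2020, 11, 2, 8, 0, tzinfo=timezone.utc)
next_update = outage_start + timedelta(hours=8)

for hours in (1, 7, 9):
    now = outage_start + timedelta(hours=hours)
    print(f"{hours}h into outage: usable={cache_usable(now, next_update, next_update)}")
```

Under those assumptions the cached data is still used 1 and 7 hours into the outage, and only becomes unusable once the 8 hours have passed.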

So the fallback helps in case of incidents with HTTPS that were not fixed within 8 hours for 0.05% of prefixes.

On the other hand, the fallback exposes a Malicious-in-the-Middle replay attack surface for 100% of the prefixes published using RRDP, 100% of the time. This allows attackers to prevent changes in ROAs from being seen.

This is a tradeoff. I think that protecting against replay should be considered more important here, given the numbers and the time available to fix HTTPS issues.


> randy, finding the fort rp to be pretty solid!

Unrelated, but sure I like Fort too.

Tim


More information about the NANOG mailing list