Reliability of looking glass sites / rviews
mhuff at ox.com
Wed Sep 13 15:42:24 UTC 2017
Both should have been similar.
In the first case, we lost power to all of our BGP border routers that are peered with the upstream providers.
In the second case, I did an explicit “shut” on the interface connected to the upstream provider that had appeared “stuck” more than an hour after the outage.
From: <christopher.morrow at gmail.com> on behalf of Christopher Morrow <morrowc.lists at gmail.com>
Date: Wednesday, September 13, 2017 at 10:58 AM
To: Matthew Huff <mhuff at ox.com>
Cc: nanog2 <nanog at nanog.org>
Subject: Re: Reliability of looking glass sites / rviews
On Wed, Sep 13, 2017 at 5:30 AM, Matthew Huff <mhuff at ox.com> wrote:
This weekend our uninterruptible power supply became interruptible and we lost all circuits. While I was doing initial debugging of the problem and waiting on site power verification, I noticed that there were still paths being shown in rviews for the circuits that were down. This was over an hour after we went hard down, and it took hours before we were back up.
Could explicit vs. implicit withdrawals be causing different handling of the problem routes?
I worked with our providers last night to verify there weren't any hanging static routes, etc. We shut the upstream circuit down, watched the convergence, and saw that eventually all the paths disappeared. Given what we saw on Saturday, what would cause route-views to cache the paths that long? Some looking glass sites only show what they are peered with, or at most what their peers are peered with; that's why I've always used route-views.
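The kind of check described above (looking for paths that are still visible well after the circuits went down) can be sketched roughly as follows. This is only an illustration, not what was actually run: the snapshot format, the stale_paths helper, the 15-minute grace period, and the prefixes and AS numbers (taken from documentation ranges) are all assumptions, and a real check would need to parse actual collector output.

```python
from datetime import datetime, timedelta, timezone


def stale_paths(snapshot, outage_start, checked_at, grace=timedelta(minutes=15)):
    """Return sorted (prefix, peer) pairs that look stale in a collector snapshot.

    snapshot: iterable of (prefix, peer, last_update) tuples, where last_update
    is when the collector last received an announcement for that path
    (hypothetical format, not a real route-views schema).
    """
    if checked_at - outage_start <= grace:
        # Too early to judge: convergence may still be in progress.
        return []
    return sorted(
        (prefix, peer)
        for prefix, peer, last_update in snapshot
        # Still listed, but not refreshed since before the outage began.
        if last_update < outage_start
    )


# Hypothetical data: outage at 12:00 UTC, snapshot inspected 70 minutes later.
outage = datetime(2017, 9, 9, 12, 0, tzinfo=timezone.utc)
checked = outage + timedelta(minutes=70)
snapshot = [
    ("192.0.2.0/24", "AS64500", outage - timedelta(hours=1)),
    ("192.0.2.0/24", "AS64501", outage + timedelta(minutes=5)),
    ("198.51.100.0/24", "AS64500", outage - timedelta(hours=2, minutes=30)),
]

for prefix, peer in stale_paths(snapshot, outage, checked):
    print(f"{prefix} still visible via {peer}")
```

A path that was re-announced after the outage began (the AS64501 entry here) is not flagged, since it may reflect a legitimate alternate path rather than a missed withdrawal.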
What looking glass sites other than route-views would people recommend?