FW: Reliability of looking glass sites / rviews

Tim Evens tim at snas.io
Fri Sep 15 14:45:12 UTC 2017



You didn't mention which ASN or prefixes you were checking. Are you
referring to ASN 14607, which only advertises two prefixes,
129.77.0.0/16 and 2620:0:2810::/48?

Based on routeviews data from over the weekend, we see:

Event Start Time: 2017-09-09 11:29:23 UTC (2017-09-09 07:29:23 EDT)
Event End Time: 2017-09-09 13:31:30 UTC (2017-09-09 09:31:30 EDT)

Are the above times correct?
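
As a quick sanity check on the UTC/EDT conversion and the length of the
event, here is a minimal Python sketch. It assumes EDT is a fixed UTC-4
offset (no tz database lookup); the variable names are only illustrative.

```python
from datetime import datetime, timedelta, timezone

# Assumption: EDT modelled as a fixed UTC-4 offset.
EDT = timezone(timedelta(hours=-4), "EDT")

# Event start and end as reported above, in UTC.
start_utc = datetime(2017, 9, 9, 11, 29, 23, tzinfo=timezone.utc)
end_utc = datetime(2017, 9, 9, 13, 31, 30, tzinfo=timezone.utc)

fmt = "%Y-%m-%d %H:%M:%S %Z"
start_edt = start_utc.astimezone(EDT).strftime(fmt)
end_edt = end_utc.astimezone(EDT).strftime(fmt)
duration = end_utc - start_utc

print(start_edt)   # 2017-09-09 07:29:23 EDT
print(end_edt)     # 2017-09-09 09:31:30 EDT
print(duration)    # 2:02:07
```

So the withdraw-to-return window in the routeviews data is a little over
two hours.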

We see the routes withdraw and then come back. For example:
http://demo-rv.snas.io:3000/dashboard/db/prefix-history?orgId=2&var-prefix=129.77.0.0&var-prefix_len=16&var-asn_num=All&var-router_name=All&var-peer_name=All&from=1504908000000&to=1505203200000
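
To see what that link is asking for, the query string can be decoded with
the Python standard library. This is only a sketch: it assumes the "from"
and "to" values are Unix epoch milliseconds (the usual dashboard
convention), and ms_to_utc is a helper name made up for illustration.

```python
from datetime import datetime, timezone
from urllib.parse import urlsplit, parse_qs

url = ("http://demo-rv.snas.io:3000/dashboard/db/prefix-history"
       "?orgId=2&var-prefix=129.77.0.0&var-prefix_len=16&var-asn_num=All"
       "&var-router_name=All&var-peer_name=All"
       "&from=1504908000000&to=1505203200000")

# parse_qs returns a dict mapping each parameter name to a list of values.
params = parse_qs(urlsplit(url).query)

def ms_to_utc(ms):
    """Hypothetical helper: epoch milliseconds -> timezone-aware UTC datetime."""
    return datetime.fromtimestamp(int(ms) / 1000, tz=timezone.utc)

prefix = "{}/{}".format(params["var-prefix"][0], params["var-prefix_len"][0])
window_start = ms_to_utc(params["from"][0])
window_end = ms_to_utc(params["to"][0])

print(prefix)        # 129.77.0.0/16
print(window_start)  # 2017-09-08 22:00:00+00:00
print(window_end)    # 2017-09-12 08:00:00+00:00
```

Under that assumption the dashboard window runs from the evening of
2017-09-08 to the morning of 2017-09-12 UTC, which brackets the event
times above.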

When you checked routeviews, which router and peer were you looking at?
When you did a "show ip bgp ...", did you include the prefix length? If
not, the lookup would have fallen back to the longest remaining match
and shown you 0/0 or 128/5, depending on which router you were on.
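
The effect of leaving off the prefix length can be sketched in Python with
the standard ipaddress module. This only illustrates a longest-prefix-match
lookup; it is not how any particular router implements it, and the table
contents and the longest_match name are hypothetical.

```python
import ipaddress

def longest_match(address, table):
    """Return the most specific network in table that covers address, or None."""
    addr = ipaddress.ip_address(address)
    covering = [net for net in table
                if net.version == addr.version and addr in net]
    return max(covering, key=lambda net: net.prefixlen, default=None)

# Hypothetical table contents, for illustration only.
before = [ipaddress.ip_network(p)
          for p in ("0.0.0.0/0", "128.0.0.0/5", "129.77.0.0/16")]
# Same table after the /16 has been withdrawn.
after = [net for net in before if net.prefixlen != 16]
# A router that carries only a default route.
default_only = [ipaddress.ip_network("0.0.0.0/0")]

print(longest_match("129.77.0.0", before))        # 129.77.0.0/16
print(longest_match("129.77.0.0", after))         # 128.0.0.0/5
print(longest_match("129.77.0.0", default_only))  # 0.0.0.0/0
```

In other words, a lookup by address alone can still return paths after the
/16 is gone, because a covering route answers instead.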

--Tim 

On 9/13/17, 8:43 AM, "NANOG on behalf of Matthew Huff"
<nanog-bounces at nanog.org on behalf of mhuff at ox.com> wrote:

Both should have been similar.

In the first case, we lost power to all of our BGP border routers that
are peered with the upstream providers.
In the second case, I did an explicit "shut" on the interface connected
to the upstream provider that still appeared "stuck" an hour after the
outage.

From: <christopher.morrow at gmail.com> on behalf of Christopher Morrow
<morrowc.lists at gmail.com>
Date: Wednesday, September 13, 2017 at 10:58 AM
To: Matthew Huff <mhuff at ox.com>
Cc: nanog2 <nanog at nanog.org>
Subject: Re: Reliability of looking glass sites / rviews

On Wed, Sep 13, 2017 at 5:30 AM, Matthew Huff <mhuff at ox.com> wrote:
This weekend our uninterruptible power supply became interruptible and
we lost all circuits. While I was doing initial debugging of the problem
and waiting on site power verification, I noticed that there were
still paths being shown in rviews for the circuits that were down. This
was over an hour after we went hard down, and it took hours before we
were back up.

explicit vs implicit withdrawals causing different handling of the
problem routes?

I worked with our providers last night to verify there weren't any
hanging static routes, etc. We shut the upstream circuit down,
watched the convergence, and saw that eventually all the paths
disappeared. Given what we saw on Saturday, what would cause route-views
to cache the paths that long? Some looking glass sites only show what
they are peered with, or at most what their peers are peered with;
that's why I've always used route-views.

What looking glass sites other than route-views would people recommend?

ripe ris.



