Global BGP - 2001-06-23 - Vendor X's statement...

lucifer at lightbearer.com lucifer at lightbearer.com
Wed Jun 27 23:11:04 UTC 2001


E.B. Dreger wrote:
> 
> > Date: Wed, 27 Jun 2001 13:21:20 -0400
> > From: Matt Levine <matt at deliver3.com>
> 
> > Agreed, so throw the bad route to the bit bucket and leave the bgp
> > session open, or at the very least (as others have suggested) give me
> > an OPTION to do that.  Bad enough we were only operating at 33%
> > capacity, however, if we only had transit from the 4 that were giving
> > us the bad route, we would have lost connectivity totally.  While it
> 
> <imesho>
> 
> On the surface, this appears to be correct.
> 
> But let's ask ourselves _why_ those upstreams had bad routes.  It's
> because _they_ did not filter at the edge.  If bad routes leak, but are
> filtered before reaching the core, then they never make it to you.
> 
> IOW, your concern is a non-issue if the large providers apply similar
> filtering at the edge.  You wouldn't be cutting yourself off because the
> provider in question would have filtered it long ago.

Correct. However, this means I have to place my complete trust in them
to Do Things Right (well, them, and more importantly in this case, their
vendors). As Saturday has demonstrated, this is not a safe assumption,
in that there appears to be some significant number of boxes in the core
which will propagate bad routing data, even if they are also resetting
the sessions which it came from (note: I'm not saying it's Cisco. It
might be; historically, Ciscos have done this before. But I have no
direct evidence that they did, or didn't; only the inference that it
had to be *something* used on a very widespread basis, given the number
of peers that had the problem simultaneously. Oh, and I *do* know,
from direct observation, that the Ciscos facing us were either causing
this bug themselves (possible, but it doesn't seem terribly likely
given the spread of them), or transiting the route to us when they
should have been ditching it, along with the session).

> Do it at the edge, and the Internet does not become any more brittle.

The same with source-filtering IPs. Do it at the edge, and the problem
goes away. Now, *how* long has it taken to implement this?
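
For readers unfamiliar with source filtering at the edge, the idea can be
sketched in a few lines. This is only an illustration, not any vendor's
implementation: the prefix list and the permit_source() helper are
hypothetical, and the addresses come from the documentation ranges.

```python
import ipaddress

# Hypothetical prefixes assigned to the customer behind one edge port.
CUSTOMER_PREFIXES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/25"),
]

def permit_source(src, allowed=CUSTOMER_PREFIXES):
    """Return True only if the packet's source address lies inside one of
    the prefixes the customer is actually assigned."""
    addr = ipaddress.ip_address(src)
    return any(addr in net for net in allowed)

if __name__ == "__main__":
    for src in ("192.0.2.17", "203.0.113.5"):
        print(src, "permit" if permit_source(src) else "drop")
```

Anything sourced from outside the assigned prefixes is dropped at the
first hop, before it can reach anyone else's network.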

Someone said, a few messages ago, that the purpose of a routing protocol
is to avoid loops. I disagree. The purpose of a routing protocol is to
propagate good, viable routing information. Thus, it MUST have a way
to deal with bad routing information, but it SHOULD (IMO) have a way
to deal with said information that is not necessarily fatal. We have
quite clearly demonstrated that it is a non-trivial possibility that
A) bad routes will manage to become widespread, through various bugs,
and B) it is possible to have one or two bad routes in an otherwise
useful table of 100,000 routes.
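
The difference between the two ways of handling a bad route can be made
concrete with a small sketch. Everything here is an assumption made for
illustration: Session, is_valid() and the discard_bad switch are
hypothetical names, and the validity check is a stand-in for the real
attribute checks a BGP speaker performs.

```python
from dataclasses import dataclass, field

@dataclass
class Session:
    peer: str
    up: bool = True
    table: dict = field(default_factory=dict)  # prefix -> attributes

def is_valid(attrs):
    # Stand-in check (hypothetical): treat an empty or missing AS path as bad.
    return bool(attrs.get("as_path"))

def apply_update(session, routes, discard_bad=False):
    """Apply (prefix, attrs) pairs to the session's table.

    discard_bad=False: the first bad route tears down the session and
    throws away everything learned from that peer.
    discard_bad=True: the bad route alone goes to the bit bucket and the
    session stays up. Returns the list of discarded prefixes.
    """
    discarded = []
    for prefix, attrs in routes:
        if is_valid(attrs):
            session.table[prefix] = attrs
        elif discard_bad:
            session.table.pop(prefix, None)
            discarded.append(prefix)
        else:
            session.up = False
            session.table.clear()
            break
    return discarded

ROUTES = [
    ("192.0.2.0/24", {"as_path": [64500]}),
    ("198.51.100.0/24", {"as_path": []}),      # the one bad route
    ("203.0.113.0/24", {"as_path": [64501]}),
]
```

With the strict setting, one bad entry costs the whole table from that
peer; with discard_bad=True, the two good routes survive and only the
bad prefix is lost.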

When reality says the basis of your design theory is inaccurate, well,
it's time to look at revamping the design to accommodate it, if
that can be done without trashing the whole thing (sometimes even if
it takes that, but I see no call for it in this case, as it's not that
severe, and it is entirely fixable without tossing out everything that
has worked so far).

> As for making money... if the general agreement is that "BGP death
> penalty" is correct, let the violators and bad BGP speakers face the
> consequences of spewing garbage.

When the violators are "Almost every major transit provider", this means
you'll be off in a corner playing Internet by yourself. This isn't very
attractive to most potential customers, no matter how RFC compliant you
are. Again, Saturday showed that this is, in fact, the case. I would love
to see the core problem fixed, and never *need* to invoke anything that
ditches single bad routes because the only breakages occur when a peer
goes completely nuts and spews garbage at me. Unfortunately, this hasn't
been the case for a long time now, and doesn't appear terribly likely to
be fixed tomorrow, given what the press releases have said about various
vendors...
-- 
***************************************************************************
Joel Baker                           System Administrator - lightbearer.com
lucifer at lightbearer.com              http://www.lightbearer.com/~lucifer


