Did your BGP crash today?

Jack Bates jbates at brightok.net
Mon Aug 30 15:55:03 UTC 2010


Florian Weimer wrote:
> This whole thread is quite schizophrenic because the consensus appears
> to be that (a) a *researcher is not to blame* for sending out a BGP
> message which eventually leads to session resets, and (b) an
> *implementor is to blame* for sending out a BGP messages which
> eventually leads to session resets.  You really can't have it both
> ways.
> 

As good a place to break in on the thread as any, I guess. Randy and
others believe more testing should have been done. I'm not completely
sure they didn't test against XR. They could very well have tested over
a one-on-one connection and seen everything look fine.

I don't know the full details, but at what point did the corruption
appear, and was it visible? We know the update was corrupt on output,
which caused the peer resets, but was the corruption necessarily
visible on the router itself?

Do we require a researcher to set up a chain of every vendor's BGP
speaker in every possible configuration and order to verify that a bug
doesn't cause things to break? In this case, one would very likely need
an XR both receiving and transmitting updates to detect the failure, so
no fewer than 3 routers with the XR in the middle.
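
A toy sketch of the topology argument above, in Python. It assumes a
middle router that accepts the update intact but corrupts it when
re-advertising, and a receiver that resets its session on a malformed
update. The Router class, the corrupts_on_output flag, and propagate()
are hypothetical names for illustration, not a model of any real
implementation.

```python
from dataclasses import dataclass


@dataclass
class Router:
    name: str
    # Hypothetical flag standing in for a bug that only shows on output.
    corrupts_on_output: bool = False


def propagate(chain):
    """Pass one update down the chain, hop by hop.

    Returns the name of the first router that receives a malformed
    update (and so would reset its session), or None if no router
    in the chain ever sees a malformed update.
    """
    well_formed = True  # the originator sends a well-formed update
    for sender, receiver in zip(chain, chain[1:]):
        if sender.corrupts_on_output:
            well_formed = False  # damage happens as the sender transmits
        if not well_formed:
            return receiver.name
    return None


# One-on-one test: the buggy router only receives, so nothing breaks.
one_on_one = [Router("origin"), Router("xr", corrupts_on_output=True)]

# Three routers with the buggy one in the middle: the downstream peer resets.
three_hop = [Router("origin"), Router("xr", corrupts_on_output=True),
             Router("peer")]

print(propagate(one_on_one))
print(propagate(three_hop))
```

Under these assumptions the two-router test reports no failure, while
the three-router chain reports a reset at the downstream peer.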

What about individual configurations? Perhaps the update is received
and altered by one vendor due to a specific configuration, then sent to
the next vendor, which accepts and alters it (because of the first
alteration; it would not have altered the original update), and that in
turn causes the next vendor to reset. Add to this that the update may
pass silently through several intermediate routers without problems,
and we realize the scope of such problems and why connecting to the
Internet is so unpredictable.
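
To put a rough number on that scope, here is a back-of-the-envelope
count in Python of how many ordered router chains a tester would have
to build. The vendor and configuration counts are illustrative
assumptions, not measured figures, and chain_count() is a hypothetical
helper. It counts ordered chains in which the same vendor/configuration
pair may appear more than once.

```python
def chain_count(vendors: int, configs_per_vendor: int, length: int) -> int:
    """Number of ordered chains of `length` routers, where each hop is
    any (vendor, configuration) pair and repeats are allowed."""
    distinct_speakers = vendors * configs_per_vendor
    return distinct_speakers ** length


# Assumed figures: 4 vendors, 3 relevant configurations each.
print(chain_count(4, 3, 3))  # 12 ** 3 = 1728 three-router chains
print(chain_count(4, 3, 5))  # 12 ** 5 = 248832 five-router chains
```

Even with these small assumed numbers the count grows exponentially
with chain length, which is the point of the paragraph above.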


Jack



