seeing the trees in the forest of confusion

Doug Junkins junkins at nwnet.net
Sat Apr 26 21:02:49 UTC 1997


On Sat, 26 Apr 1997, John Hawkinson wrote:

> > These cases seem to point to a problem with BGP route withdrawals that will
> > continue to increase the time it takes to recover from network problems.
> > Perhaps the router vendors would like to comment.
> 
> This seems inappropriate to me.
> 
> You have just said: "I sat and watched a provider keep routes around
> long past their being withdrawn, and they didn't know what to do so
> suggested two kludges: 1) advertising more-specifics and 2) rebooting
> routers. Could some vendor comment on this problem?".
>

Perhaps I should have been more clear with what the provider did during
the 5 hours that the routing loop continued in their backbone.  It didn't
take 5 hours for the provider to identify that there was a problem with
the routes in their tables (i.e. a few of their routers in their IBGP mesh 
had more specifics from Provider Y while most did not).  Instead, it took
the provider 5 hours to troubleshoot the problem with the router vendor
before both agreed that it was a software bug and identified the need to
reload some of the routers.  The hack of advertising more specifics was
used to buy time before reloading the routers to minimize the impact.


> This is every vendor's worst nightmare.
> 
> Every vendor necessarily (and rightly so!) provides all users enough
> rope to hang themselves with. It seems inappropriate for someone who
> doesn't know what the full story is to call vendors to account.
> 
> If the provider in question adjusted some knobs and settings so as to
> cause such a problem, what is the vendor to do?
> 
> How could the vendor even come close to trying to explain the problem
> without detailed information about the problems and configurations?
> 
> 
> Pessimistically speaking, it seems that there are two ways that this
> thread could come to a close:
> 
> 	1)	People will keep badgering the vendor and the vendor
> 		will come out looking ugly if they cannot account for
> 		the problem based on insufficient data.
> 
> 	2)	People will all be quiet and stop complaining until
> 		the operator(s) in question and vendor(s) have information
> 		and communicate it.
> 
> 2) seems obviously preferable, but I suspect that the people on this
> list will go for 1) since it will allow everyone to flame and chatter
> incessantly, increasing NANOG mail volume and everyone's productivity.
> 

If I'm the only person that's seen this type of problem, I'll shut up
about it.  But if this type of problem has impacted more providers, I
think it's appropriate in this forum to ask the router vendors to comment
on any known problems with BGP route withdrawals.  If they don't have
enough information to account for the problem, then they should tell us
that so we can get the data to them the next time something like this
happens. 

- Doug 

> If anyone who has seen this problem first hand has detailed technical
> information to provide, that is of course useful and welcome in this
> forum. But complaining without having any of the data? What's the point?
> 
> --jhawk
> 





