Destination Preference Attribute for BGP
Mark Tinka
mark at tinka.africa
Fri Aug 18 21:33:28 UTC 2023
On 8/18/23 22:40, Matthew Petach wrote:
>
> Hi Robert,
>
> Without naming any names, I will note that at some point in the
> not-too-distant past, I was part of a new-years-eve-holiday-escalation
> to $BACKBONE_ROUTER_PROVIDER when the global network I was involved
> with started seeing excessive convergence times (greater than one hour
> from BGP update message received to FIB being updated).
> After tracking down a development engineer from $RTR_PROVIDER on the new
> years eve holiday, it was determined that the problem lay in
> assumptions made about how communities were stored in memory. Think
> hashed buckets, with linked lists within each bucket. If the
> communities all happened to hash to the same bucket, the linked list
> in that bucket became extremely long; and if every prefix coming in,
> say from multiple sessions with a major transit provider, happened to
> be adding one more community to the very long linked list in that one
> hash bucket, well, it ended up slowing down the processing to the
> point where updates to the FIB were still trickling in an hour after
> the BGP neighbor had finished sending updates across.
>
> A new hash function was developed on New Year's day, and a new version
> of code was built for us to deploy under relatively painful
> circumstances.
>
> It's easy to say "Considering that we are talking about control
> plane memory I think the cost/space associated with storing
> communities is less than negligible these days."
> The reality is very different, because it's not just about efficiently
> *storing* communities, it's really about efficiently *parsing and
> updating* communities--and the choices made there absolutely *DO*
> "contribute to longer protocol convergences in any measurable way."
>
> Matt
> (the names have been obscured to increase my chances of being hireable
> in the industry again at some future date. ;)
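
For anyone who has not run into the failure mode described above, here is
a minimal sketch of it. Everything in it is an assumption made for
illustration: the vendor's actual data structure and hash function are not
named in this thread, so the class, both hash functions, the bucket count
and the ASN 64500 are all hypothetical. It only shows how a chained hash
table degrades when every community lands in the same bucket.

```python
class ChainedSet:
    """Toy hash set using separate chaining (one list per bucket)."""

    def __init__(self, num_buckets, hash_fn):
        self.buckets = [[] for _ in range(num_buckets)]
        self.hash_fn = hash_fn
        self.comparisons = 0  # chain entries walked across all inserts

    def add(self, community):
        chain = self.buckets[self.hash_fn(community) % len(self.buckets)]
        for existing in chain:
            self.comparisons += 1
            if existing == community:
                return  # already stored
        chain.append(community)


def weak_hash(community):
    # Hypothetical poor choice: only the upper 16 bits (the ASN half) are
    # used, so every community set by one transit provider collides.
    return community >> 16


def xor_fold_hash(community):
    # Hypothetical alternative: fold both 16-bit halves together so the
    # value half also influences the bucket.
    return (community ^ (community >> 16)) & 0xFFFF


def count_comparisons(hash_fn, communities, num_buckets=1024):
    table = ChainedSet(num_buckets, hash_fn)
    for community in communities:
        table.add(community)
    return table.comparisons


# 2000 distinct 32-bit communities of the form 64500:<value>
# (64500 is from the documentation ASN range).
N = 2000
communities = [(64500 << 16) | value for value in range(N)]

weak_total = count_comparisons(weak_hash, communities)
folded_total = count_comparisons(xor_fold_hash, communities)
print("weak hash comparisons:    ", weak_total)
print("xor-fold hash comparisons:", folded_total)
```

With the weak hash every insert walks the whole existing chain, so the
work grows quadratically with the number of communities; with the folded
hash the chains stay short and the work stays roughly linear.
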

To be fair, you are talking about an arbitrary value of years back, on
boxes you don't name running code you won't mention.

This is really not saying much :-).

Corner cases, while valid, do not speak to the majority. If this was a
major issue, there would have been more noise about it by now.

There has been quite some noise about lengthy AS_PATH updates that bring
some routers down, which has usually been fixed with improved BGP code.
But even those are not too common, if one considers a 365-day period.

Mark.