Destination Preference Attribute for BGP

Jon Lewis jlewis at lewis.org
Sat Aug 19 21:53:53 UTC 2023


On Fri, 18 Aug 2023, Matthew Petach wrote:

> Hi Robert,
> 
> Without naming any names, I will note that at some point in the not-too-distant past, I was part of a new-years-eve-holiday-escalation to $BACKBONE_ROUTER_PROVIDER when
> the global network I was involved with started seeing excessive convergence times (greater than one hour from BGP update message received to FIB being updated).  
> After tracking down a development engineer from $RTR_PROVIDER on the New Year's Eve holiday, it was determined that the problem lay in assumptions made about how communities
> were stored in memory.  Think hashed buckets, with linked lists within each bucket.  If the communities all happened to hash to the same bucket, the linked list in that
> bucket became extremely long; and if every prefix coming in, say from multiple sessions with a major transit provider, happened to be adding one more community to the very
> long linked list in that one hash bucket, well, it ended up slowing down the processing to the point where updates to the FIB were still trickling in an hour after the BGP
> neighbor had finished sending updates across.
> 
> A new hash function was developed on New Year's day, and a new version of code was built for us to deploy under relatively painful circumstances. 
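
To make the failure mode above concrete, here is a toy Python sketch of
hashed buckets with a chain in each bucket. It is not the vendor's code.
The table size, the community values, and both hash functions (weak_hash,
fold_hash) are made up for illustration, and fold_hash is not meant to
represent whatever fix the vendor shipped.

```python
class ChainedSet:
    """Toy hash table: each bucket is a chain that is scanned linearly."""

    def __init__(self, nbuckets, hash_fn):
        self.buckets = [[] for _ in range(nbuckets)]
        self.hash_fn = hash_fn
        self.comparisons = 0  # entries inspected while walking chains

    def add(self, key):
        chain = self.buckets[self.hash_fn(key) % len(self.buckets)]
        for existing in chain:
            self.comparisons += 1
            if existing == key:
                return  # already stored
        chain.append(key)


def weak_hash(community):
    # Assumption for illustration: only the ASN half is hashed, so every
    # community from one ASN lands in the same bucket.
    return community >> 16


def fold_hash(community):
    # XOR the ASN half with the value half so both contribute.
    return (community >> 16) ^ (community & 0xFFFF)


def count_comparisons(hash_fn, n=2000, nbuckets=1024):
    # Hypothetical communities 65000:0 .. 65000:(n-1) as 32-bit integers.
    table = ChainedSet(nbuckets, hash_fn)
    for value in range(n):
        table.add((65000 << 16) | value)
    return table.comparisons


print(count_comparisons(weak_hash))  # 1999000, one chain of 2000 entries
print(count_comparisons(fold_hash))  # 976, chains of at most 2 entries
```

With everything in one bucket, the work per insert grows with the number
of communities already stored, so the total grows quadratically.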

This reminds me of two things.

First, some code I wrote more than 20 years ago to track and bill for 
overlapping dial-up sessions (i.e. dial-up account sharing).  Processing 
the RADIUS accounting data, I built a binary tree of users with each node 
having a linked list of session data.  While testing it, I found that as
the amount of data fed in grew, the program got slower.  I solved it by
converting the session data linked lists to doubly linked lists, allowing
me to add session data to the lists by jumping directly to the end, seeing
if that's where the current session belonged, and walking back the list
if necessary.  Usually it was not necessary, since the input data was
mostly in chronological order.  That made it super fast again.
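
A minimal Python sketch of that tail-first insert, for anyone who wants
to see the shape of it. This is a reconstruction under assumptions, not
the original code: the names (Node, SessionList, steps_back) are
hypothetical, and a session is reduced to just its start time.

```python
class Node:
    def __init__(self, start):
        self.start = start  # session start time (any comparable value)
        self.prev = None
        self.next = None


class SessionList:
    """Doubly linked list kept sorted by start time."""

    def __init__(self):
        self.head = None
        self.tail = None
        self.steps_back = 0  # how far we had to walk back from the tail

    def insert(self, start):
        node = Node(start)
        cur = self.tail
        # Start at the end; walk back only while the entry there is later.
        while cur is not None and cur.start > start:
            cur = cur.prev
            self.steps_back += 1
        # Link the new node in right after cur (cur is None: new head).
        node.prev = cur
        if cur is None:
            node.next = self.head
            self.head = node
        else:
            node.next = cur.next
            cur.next = node
        if node.next is None:
            self.tail = node
        else:
            node.next.prev = node

    def starts(self):
        out = []
        cur = self.head
        while cur is not None:
            out.append(cur.start)
            cur = cur.next
        return out


sessions = SessionList()
for t in [10, 20, 30, 25, 40]:  # mostly chronological input
    sessions.insert(t)
print(sessions.starts())    # [10, 20, 25, 30, 40]
print(sessions.steps_back)  # 1, only the out-of-order 25 walked back
```

For input that is already in order, every insert is a constant-time
append at the tail, and the walk back only costs something for the
occasional out-of-order record.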

Second, some time ago we ran into an issue involving Arista and a peer on
AMS-IX that set a ridiculous number of communities on their routes.
Arista uses (used?) a fixed length buffer for communities in route-map
processing.  When doing "match community" in a route-map, if the set of
communities on the route is longer than the fixed length buffer, and the
communities you're trying to match fall off the end, your route-map match
statement will fail to match, even though a show ip bgp... will show you
that the communities you're trying to match are there.
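
The effect can be imitated in a few lines of Python. This is only a model
of the behavior described above, not Arista's implementation: BUF_SLOTS
is a made-up buffer size, both function names are hypothetical, and the
communities use a documentation ASN.

```python
BUF_SLOTS = 4  # hypothetical buffer size, not a real platform limit


def match_community_truncated(route_communities, wanted, slots=BUF_SLOTS):
    # Only the first `slots` communities fit in the buffer; anything
    # past that is invisible to the match.
    visible = route_communities[:slots]
    return wanted in visible


def match_community_full(route_communities, wanted):
    # What the show command effectively reports: the whole list.
    return wanted in route_communities


route = ["64500:1", "64500:2", "64500:3", "64500:4", "64500:5", "64500:6"]

print(match_community_truncated(route, "64500:2"))  # True
print(match_community_truncated(route, "64500:6"))  # False, fell off the end
print(match_community_full(route, "64500:6"))       # True, it is on the route
```

The mismatch between the last two results is what makes this kind of
problem confusing to debug: the route visibly carries the community, but
the policy never sees it.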

----------------------------------------------------------------------
  Jon Lewis, MCP :)           |  I route
  StackPath, Sr. Neteng       |  therefore you are
_________ http://www.lewis.org/~jlewis/pgp for PGP public key_________

