Destination Preference Attribute for BGP
Jon Lewis
jlewis at lewis.org
Sat Aug 19 21:53:53 UTC 2023
On Fri, 18 Aug 2023, Matthew Petach wrote:
> Hi Robert,
>
> Without naming any names, I will note that at some point in the not-too-distant past, I was part of a new-years-eve-holiday-escalation to $BACKBONE_ROUTER_PROVIDER when
> the global network I was involved with started seeing excessive convergence times (greater than one hour from BGP update message received to FIB being updated).
> After tracking down a development engineer from $RTR_PROVIDER on the new years eve holiday, it was determined that the problem lay in assumptions made about how communities
> were stored in memory. Think hashed buckets, with linked lists within each bucket. If the communities all happened to hash to the same bucket, the linked list in that
> bucket became extremely long; and if every prefix coming in, say from multiple sessions with a major transit provider, happened to be adding one more community to the very
> long linked list in that one hash bucket, well, it ended up slowing down the processing to the point where updates to the FIB were still trickling in an hour after the BGP
> neighbor had finished sending updates across.
>
> A new hash function was developed on New Year's day, and a new version of code was built for us to deploy under relatively painful circumstances.
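For anyone who wants to see that failure mode in miniature, here's a toy
sketch in Python. It is not any vendor's code: the table layout, the
"weak" hash that only looks at the ASN half of a community, and the
replacement hash are all assumptions made up for illustration. It just
counts how many chain entries get examined while interning 2000
communities that share one ASN.

```python
# Toy model of a chained hash table; NOT any router vendor's implementation.

class ChainedTable:
    def __init__(self, nbuckets, hash_fn):
        self.buckets = [[] for _ in range(nbuckets)]
        self.hash_fn = hash_fn
        self.comparisons = 0  # chain entries examined so far

    def intern(self, community):
        """Return the stored copy of community, adding it if it is new."""
        chain = self.buckets[self.hash_fn(community) % len(self.buckets)]
        for entry in chain:  # linear walk, as with a linked list
            self.comparisons += 1
            if entry == community:
                return entry
        chain.append(community)
        return community


def weak_hash(community):
    # Hypothetical weak hash: ignores the value half of (asn, value).
    asn, _value = community
    return asn


def better_hash(community):
    # Hypothetical replacement: mixes both halves into the result.
    asn, value = community
    return asn * 65536 + value


# 2000 communities that all share one ASN, e.g. from one transit provider.
communities = [(64500, value) for value in range(2000)]

weak = ChainedTable(1024, weak_hash)
good = ChainedTable(1024, better_hash)
for c in communities:
    weak.intern(c)
    good.intern(c)

print(weak.comparisons, good.comparisons)
```

With the weak hash every community lands in the same bucket, so each
insert walks everything inserted before it and the work grows
quadratically. With both halves mixed in, the chains stay one or two
entries long.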
This reminds me of two things.
First, some code I wrote more than 20 years ago to track and bill for
overlapping dial-up sessions (i.e. dial-up account sharing). Processing
the RADIUS accounting data, I built a binary tree of users with each node
having a linked list of session data. While testing it, I found that as
the amount of data fed in grew, the program got slower. I solved it by
converting the session data linked lists to doubly linked lists, allowing
me to add session data by jumping directly to the end of a list, checking
whether that's where the current session belonged, and walking back up
the list only if necessary. It usually wasn't necessary, since the input
data was mostly in chronological order. That made it super fast again.
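A minimal sketch of that tail-first insert, in Python. This is not the
original code; the class names, the single "start" timestamp per session,
and the step counter are made up for illustration.

```python
# Sketch of tail-first insertion into a sorted doubly linked list.
# Names and fields are hypothetical, not taken from the original program.

class Node:
    __slots__ = ("start", "prev", "next")

    def __init__(self, start):
        self.start = start  # session start time
        self.prev = None
        self.next = None


class SessionList:
    def __init__(self):
        self.head = None
        self.tail = None
        self.steps = 0  # backward steps taken across all inserts

    def insert(self, start):
        node = Node(start)
        cur = self.tail
        # Start at the end and walk back only while the new session
        # belongs before the node we are looking at.
        while cur is not None and cur.start > start:
            self.steps += 1
            cur = cur.prev
        if cur is None:
            # Empty list, or the new session is the earliest one.
            node.next = self.head
            if self.head is not None:
                self.head.prev = node
            else:
                self.tail = node
            self.head = node
        else:
            # Link the new node in right after cur.
            node.prev = cur
            node.next = cur.next
            if cur.next is not None:
                cur.next.prev = node
            else:
                self.tail = node
            cur.next = node

    def starts(self):
        out = []
        cur = self.head
        while cur is not None:
            out.append(cur.start)
            cur = cur.next
        return out


sessions = SessionList()
for t in [100, 200, 300, 250, 400]:  # mostly chronological input
    sessions.insert(t)
print(sessions.starts(), sessions.steps)
```

Only the one out-of-order record (250) costs a backward step. Everything
else is appended at the tail in constant time, instead of walking the
whole list from the head on every insert.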
Second, we ran into an issue with Arista some time ago and a peer on
AMS-IX that set a ridiculous number of communities on their routes.
Arista uses (used?) a fixed length buffer for communities in route-map
processing and when doing "match community" in a route-map, if the set of
communities on the route is longer than the fixed length buffer, and the
communities you're trying to match fall off the end, your route-map match
statement will fail to match, even though a show ip bgp... will show you
that the communities you're trying to match are there.
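A toy model of that behavior in Python, to show why the match and the
show output can disagree. This is not Arista's code: the buffer size
(BUF_SLOTS), the function names, and the community values are all
hypothetical, and I don't know the real buffer length.

```python
# Toy model of matching against a truncated copy of a route's communities.
# BUF_SLOTS and both functions are hypothetical, for illustration only.

BUF_SLOTS = 4  # assumed size; the real fixed length is not known


def match_truncated(route_communities, wanted):
    # Copy into a fixed-length buffer; anything past the end is dropped.
    buf = route_communities[:BUF_SLOTS]
    return all(c in buf for c in wanted)


def match_full(route_communities, wanted):
    # What "show ip bgp" effectively reflects: the complete list.
    return all(c in route_communities for c in wanted)


route = ["64500:1", "64500:2", "64500:3", "64500:4", "64500:5", "64511:100"]

print(match_truncated(route, ["64511:100"]))  # community fell off the end
print(match_full(route, ["64511:100"]))       # but it is on the route
```

The truncated match returns False while the full check returns True,
which is the confusing part: the community is visibly on the route, yet
the route-map never sees it.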
----------------------------------------------------------------------
Jon Lewis, MCP :) | I route
StackPath, Sr. Neteng | therefore you are
_________ http://www.lewis.org/~jlewis/pgp for PGP public key_________