Scalability issues in the Internet routing system

Thu Oct 27 14:30:41 UTC 2005

>>>
>> Neat!  So you were thinking you would leave the actual route
>> selection process monolithic and create separate processes per
>> peer?  I have seen folks doing something similar with separate
>> MBGP routing instances.  Had not heard of anyone attempting this
>> for a "global" routing table with separate threads per neighbor
>> (as opposed to per table).  What do you do if you have one
>> neighbor who wants to send you all 2M routes though?  I am
>> thinking of route reflectors specifically but also confederation
>> EIBGP sessions.
>>
>> I think you hit the nail on the head regarding record locking.  This
>> is the thing that is going to bite you if anything will.  I have heard
>> none of the usual suspects speak up, so I suspect that either this
>> thread is now being ignored or no one has heard of an implementation
>> like the one you just described.
>
> In BGP there is no 'global' route (actually path) selection.
> Everything is done per prefix+path.  In the RIB you can just lock the
> prefix, insert the new path and recalculate which one wins.  Then issue
> the update to the FIB, if any.  Work done.  Statistically there is very
> little contention on the prefix and the path records.  For contention,
> two updates for the same prefix would have to arrive at the same time
> from two different peers handled by different CPUs.  I'd guess the SMP
> scaling factor for BGP is around 1.98.  The 0.02 is lost to locking
> overhead and negative caching effects.  Real serialization happens only
> at the FIB change queue.  However, serializing queues can be handled
> very efficiently on SMP too.
>
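
[To check my reading of the per-prefix locking described above, here is
a rough Python sketch.  Everything in it is an assumption made for
illustration: the names Rib, PrefixEntry and run_peer are made up, a
single integer "cost" stands in for the real BGP best-path rules, and a
plain dict behind one table lock stands in for whatever prefix lookup
structure a real implementation would use.]

```python
import queue
import threading


class PrefixEntry:
    """State for one prefix: its own lock, candidate paths, current winner."""

    def __init__(self):
        self.lock = threading.Lock()
        self.paths = {}    # peer name -> cost (hypothetical stand-in for path attributes)
        self.best = None   # (peer, cost) currently selected for this prefix


class Rib:
    """Toy RIB with one lock per prefix and a single serialized FIB change queue."""

    def __init__(self):
        self._table_lock = threading.Lock()  # guards entry lookup/creation only
        self._entries = {}                   # prefix -> PrefixEntry
        self.fib_queue = queue.Queue()       # the one place where changes are serialized

    def _entry(self, prefix):
        with self._table_lock:
            entry = self._entries.get(prefix)
            if entry is None:
                entry = self._entries[prefix] = PrefixEntry()
            return entry

    def update(self, peer, prefix, cost):
        entry = self._entry(prefix)
        with entry.lock:  # lock just this prefix
            entry.paths[peer] = cost  # insert the new path
            # Recalculate the winner: lowest cost, peer name as tie-breaker.
            best = min(entry.paths.items(), key=lambda item: (item[1], item[0]))
            if best != entry.best:
                entry.best = best
                self.fib_queue.put((prefix, best[0]))  # issue the FIB update, if any


def run_peer(rib, peer, updates):
    """Per-neighbor worker: apply that neighbor's updates to the shared RIB."""
    for prefix, cost in updates:
        rib.update(peer, prefix, cost)


def demo():
    rib = Rib()
    feeds = {
        "peerA": [("10.0.0.0/8", 20), ("192.0.2.0/24", 5)],
        "peerB": [("10.0.0.0/8", 10), ("198.51.100.0/24", 7)],
    }
    threads = [
        threading.Thread(target=run_peer, args=(rib, peer, updates))
        for peer, updates in feeds.items()
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return {prefix: entry.best[0] for prefix, entry in rib._entries.items()}
```

[In this toy version the two workers only contend on an entry lock when
both touch 10.0.0.0/8 at the same moment, which matches the point about
contention being rare.  The table lock is a simplification and would
itself need finer-grained handling in anything real.]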

Hey Andre,

If you are intending to break the BGP process into per-neighbor
threads, this does not sound like it would have a beneficial impact on a
single neighbor with the majority of the routes (thinking
specifically of EIBGP and/or route reflectors).  Was your idea
specifically related to per-neighbor processing, or were you thinking
you could break the BGP process itself into chunks?
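
To put a rough number on the single-heavy-neighbor concern, here is a
back-of-the-envelope sketch in Python.  It assumes (my assumptions, not
anything measured) that processing cost is proportional to route count,
that each neighbor is handled by exactly one thread, and that locking
costs nothing, so it gives an optimistic upper bound.  The function name
and the 10,000-route peers are made up for illustration; the 2M figure
is the one from earlier in the thread.

```python
def per_neighbor_speedup(route_counts, cpus):
    """Upper bound on speedup when each neighbor is pinned to one thread.

    Wall-clock time cannot be less than the work of the busiest neighbor,
    nor less than the total work spread evenly over all CPUs.
    """
    total = sum(route_counts)
    critical = max(max(route_counts), total / cpus)
    return total / critical


# One route reflector sending 2M routes plus three small peers, on 4 CPUs.
heavy = per_neighbor_speedup([2_000_000, 10_000, 10_000, 10_000], cpus=4)

# Four equally sized neighbors on 4 CPUs, for comparison.
even = per_neighbor_speedup([500_000, 500_000, 500_000, 500_000], cpus=4)
```

Under those assumptions the heavy case comes out at about 1.015 and the
even case at 4.0, i.e. per-neighbor threads buy almost nothing when one
session carries nearly the whole table.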