BGP convergence problem

Kevin Hodle kevin.hodle at gmail.com
Tue Jun 8 16:50:19 UTC 2010


Hi Andy,

  We have had similar problems with s720/3bxl on exchanges with large
numbers of peers. Exact same symptoms, can be triggered by any
significant UPDATE flux, even iBGP originated path-hunts. This problem
is compounded if you are taking full tables on the same device, to the
extent that the bgp scanner and bgp IO processes grind the
control-plane to halt causing ISIS/OSPF adjacencies to drop, SNMP and
SSH unresponsive, etc. Same behavior is seen regardless of IOS train.
As others have pointed out, the sad fact of the matter is that the
s720/3BXL simply does not have the CPU power to cope with hundreds of
neighbor sessions and the growing numbers of paths. Here are some
things that we tried with varied success to remedy bgp deadlock on
this platform:

* lower process-max-time to prevent bgp scanner/bgp io processes from
completely consuming the control-plane
* Take soft-reconfiguration off of neighbors/peer-groups where you
can, this will help tremendously
* Split the load of neighbor sessions between multiple devices, move
full table feeds to other devices

The 'final solution' is to simply replace this platform with a newer
more powerful alternative, and there are numerous candidates :)

Best Regards,
Kevin Hodle

On Tue, Jun 8, 2010 at 4:58 AM, Andy B. <globichen at gmail.com> wrote:
> Hi,
>
> This morning there was an ethernet loop problem on DECIX, causing many
> BGP sessions to flap throughout the entire platform.
> While this can happen, I am myself facing with BGP convergence
> problems on our DECIX router (SUP720-3BXL with IOS SXI3).
>
> De DECIX loop has been solved two hours ago, but my BGP sessions are
> still flapping and not converging at all. This has been flooding our
> logs, and is still going on:
>
> Jun  8 11:47:03 x.x.x.131 239447: Jun  8 11:48:38.364 CEST:
> %BGP-5-ADJCHANGE: neighbor 80.81.194.32 Up
> Jun  8 11:47:03 x.x.x.131 239448: Jun  8 11:48:38.364 CEST:
> %BGP-5-ADJCHANGE: neighbor 80.81.192.231 Up
> Jun  8 11:47:03 x.x.x.131 239449: Jun  8 11:48:38.364 CEST:
> %BGP-5-ADJCHANGE: neighbor 80.81.192.109 Up
> Jun  8 11:47:03 x.x.x.131 239450: Jun  8 11:48:38.364 CEST:
> %BGP-5-ADJCHANGE: neighbor 80.81.194.50 Up
> Jun  8 11:47:03 x.x.x.131 239451: Jun  8 11:48:38.364 CEST:
> %BGP-5-ADJCHANGE: neighbor 80.81.192.81 Up
> Jun  8 11:47:03 x.x.x.131 239452: Jun  8 11:48:38.364 CEST:
> %BGP-5-ADJCHANGE: neighbor 80.81.192.28 Up
> Jun  8 11:47:03 x.x.x.131 239453: Jun  8 11:48:38.364 CEST:
> %BGP-5-ADJCHANGE: neighbor 80.81.193.212 Up
> Jun  8 11:47:03 x.x.x.131 239454: Jun  8 11:48:38.368 CEST:
> %BGP-5-ADJCHANGE: neighbor 80.81.193.147 Up
> Jun  8 11:47:03 x.x.x.131 239455: Jun  8 11:48:38.368 CEST:
> %BGP-5-ADJCHANGE: neighbor 80.81.192.74 Up
> Jun  8 11:47:03 x.x.x.131 239456: Jun  8 11:48:38.368 CEST:
> %BGP-5-ADJCHANGE: neighbor 80.81.192.241 Up
> Jun  8 11:47:03 x.x.x.131 239457: Jun  8 11:48:38.368 CEST:
> %BGP-5-ADJCHANGE: neighbor 80.81.194.5 Up
> Jun  8 11:47:03 x.x.x.131 239458: Jun  8 11:48:38.368 CEST:
> %BGP-5-ADJCHANGE: neighbor 80.81.192.40 Up
> Jun  8 11:47:03 x.x.x.131 239459: Jun  8 11:48:38.368 CEST:
> %BGP-5-ADJCHANGE: neighbor 2001:7F8::1A44:0:1 Up
> Jun  8 11:47:03 x.x.x.131 239460: Jun  8 11:48:38.368 CEST:
> %BGP-5-ADJCHANGE: neighbor 2001:7F8::8605:0:1 Up
> Jun  8 11:47:03 x.x.x.131 239461: Jun  8 11:48:38.368 CEST:
> %BGP-5-ADJCHANGE: neighbor 2001:7F8::1A0B:0:1 Up
> Jun  8 11:47:03 x.x.x.131 239462: Jun  8 11:48:38.368 CEST:
> %BGP-5-ADJCHANGE: neighbor 2001:7F8::3029:0:1 Up
> Jun  8 11:47:03 x.x.x.131 239463: Jun  8 11:48:38.368 CEST:
> %BGP-5-ADJCHANGE: neighbor 2001:7F8::6E4:0:1 Up
> Jun  8 11:47:03 x.x.x.131 239464: Jun  8 11:48:38.372 CEST:
> %BGP-5-ADJCHANGE: neighbor 2001:7F8::CB0:0:1 Up
> Jun  8 11:47:03 x.x.x.131 239465: Jun  8 11:48:38.372 CEST:
> %BGP-5-ADJCHANGE: neighbor 2001:7F8::21C8:0:1 Up
> Jun  8 11:47:03 x.x.x.131 239466: Jun  8 11:48:38.372 CEST:
> %BGP-5-ADJCHANGE: neighbor 2001:7F8::8463:0:2 Up
> Jun  8 11:47:04 x.x.x.131 239467: Jun  8 11:48:38.372 CEST:
> %BGP-5-ADJCHANGE: neighbor 2001:7F8::31AA:0:1 Up
> Jun  8 11:47:04 x.x.x.131 239468: Jun  8 11:48:38.372 CEST:
> %BGP-5-ADJCHANGE: neighbor 80.81.194.29 Up
> Jun  8 11:47:04 x.x.x.131 239469: Jun  8 11:48:38.372 CEST:
> %BGP-5-ADJCHANGE: neighbor 2001:7F8::62BF:0:1 Up
> Jun  8 11:47:04 x.x.x.131 239470: Jun  8 11:48:39.656 CEST:
> %BGP-5-ADJCHANGE: neighbor 80.81.192.101 Down BGP Notification sent
> Jun  8 11:47:04 x.x.x.131 239471: Jun  8 11:48:39.656 CEST:
> %BGP-3-NOTIFICATION: sent to neighbor 80.81.192.101 4/0 (hold time
> expired) 0 bytes
> Jun  8 11:47:07 x.x.x.131 239472: Jun  8 11:48:41.696 CEST:
> %BGP-5-ADJCHANGE: neighbor 80.81.192.104 Up
> Jun  8 11:47:10 x.x.x.131 239473: Jun  8 11:48:44.488 CEST:
> %BGP-3-BGP_NO_REMOTE_READ: 80.81.193.187 connection timed out - has
> not accepted a message from us for 20000ms (hold time), 1 messages
> pending transmition.
> Jun  8 11:47:10 x.x.x.131 239474: Jun  8 11:48:44.488 CEST:
> %BGP-5-ADJCHANGE: neighbor 80.81.193.187 Down BGP Notification sent
> Jun  8 11:47:10 x.x.x.131 239475: Jun  8 11:48:44.488 CEST:
> %BGP-3-NOTIFICATION: sent to neighbor 80.81.193.187 4/0 (hold time
> expired) 0 bytes
> Jun  8 11:47:10 x.x.x.131 239476: Jun  8 11:48:44.900 CEST:
> %BGP-5-ADJCHANGE: neighbor 80.81.194.61 Up
> Jun  8 11:47:10 x.x.x.131 239477: Jun  8 11:48:44.900 CEST:
> %BGP-5-ADJCHANGE: neighbor 80.81.192.149 Up
> Jun  8 11:47:10 x.x.x.131 239478: Jun  8 11:48:44.900 CEST:
> %BGP-5-ADJCHANGE: neighbor 80.81.192.136 Up
> Jun  8 11:47:10 x.x.x.131 239479: Jun  8 11:48:44.904 CEST:
> %BGP-5-ADJCHANGE: neighbor 2001:7F8::8463:0:1 Up
> Jun  8 11:47:10 x.x.x.131 239480: Jun  8 11:48:46.352 CEST:
> %BGP-5-ADJCHANGE: neighbor 2001:7F8::6268:0:1 Up
> Jun  8 11:47:14 x.x.x.131 239481: Jun  8 11:48:48.084 CEST:
> %BGP-5-ADJCHANGE: neighbor 80.81.193.78 Up
> Jun  8 11:47:14 x.x.x.131 239482: Jun  8 11:48:49.172 CEST:
> %BGP-5-ADJCHANGE: neighbor 80.81.193.239 Up
> Jun  8 11:47:14 x.x.x.131 239483: Jun  8 11:48:49.172 CEST:
> %BGP-5-ADJCHANGE: neighbor 80.81.194.24 Up
> Jun  8 11:47:17 x.x.x.131 239484: Jun  8 11:48:52.160 CEST:
> %BGP-5-ADJCHANGE: neighbor 80.81.194.45 Up
> Jun  8 11:47:17 x.x.x.131 239485: Jun  8 11:48:52.160 CEST:
> %BGP-5-ADJCHANGE: neighbor 80.81.192.108 Up
> Jun  8 11:47:17 x.x.x.131 239486: Jun  8 11:48:52.160 CEST:
> %BGP-5-ADJCHANGE: neighbor 80.81.192.164 Up
> Jun  8 11:47:17 x.x.x.131 239487: Jun  8 11:48:52.164 CEST:
> %BGP-5-ADJCHANGE: neighbor 80.81.193.49 Up
> Jun  8 11:47:17 x.x.x.131 239488: Jun  8 11:48:52.164 CEST:
> %BGP-5-ADJCHANGE: neighbor 80.81.192.139 Up
> Jun  8 11:47:17 x.x.x.131 239489: Jun  8 11:48:52.164 CEST:
> %BGP-5-ADJCHANGE: neighbor 2001:7F8::1536:0:1 Up
> Jun  8 11:47:17 x.x.x.131 239490: Jun  8 11:48:52.164 CEST:
> %BGP-5-ADJCHANGE: neighbor 2001:7F8::8601:0:1 Up
> Jun  8 11:47:17 x.x.x.131 239491: Jun  8 11:48:53.788 CEST:
> %BGP-5-ADJCHANGE: neighbor 80.81.192.45 Up
> Jun  8 11:47:17 x.x.x.131 239492: Jun  8 11:48:53.788 CEST:
> %BGP-5-ADJCHANGE: neighbor 2001:7F8::A2DC:0:1 Up
> Jun  8 11:47:21 x.x.x.131 239493: Jun  8 11:48:55.056 CEST:
> %BGP-5-ADJCHANGE: neighbor 80.81.194.91 Down BGP Notification sent
>
>
>
> Jun  8 11:49:04 x.x.x.131 239583: Jun  8 11:50:37.684 CEST:
> %BGP-5-ADJCHANGE: neighbor 80.81.194.14 Down Peer closed the session
> Jun  8 11:49:04 x.x.x.131 239584: Jun  8 11:50:38.656 CEST:
> %BGP-5-ADJCHANGE: neighbor 80.81.192.120 Down BGP Notification sent
> Jun  8 11:49:04 x.x.x.131 239585: Jun  8 11:50:38.656 CEST:
> %BGP-3-NOTIFICATION: sent to neighbor 80.81.192.120 4/0 (hold time
> expired) 0 bytes
> Jun  8 11:49:04 x.x.x.131 239586: Jun  8 11:50:38.656 CEST:
> %BGP-5-ADJCHANGE: neighbor 80.81.192.229 Down BGP Notification sent
> Jun  8 11:49:04 x.x.x.131 239587: Jun  8 11:50:38.656 CEST:
> %BGP-3-NOTIFICATION: sent to neighbor 80.81.192.229 4/0 (hold time
> expired) 0 bytes
> Jun  8 11:49:04 x.x.x.131 239588: Jun  8 11:50:38.656 CEST:
> %BGP-5-ADJCHANGE: neighbor 80.81.192.108 Down BGP Notification sent
> Jun  8 11:49:04 x.x.x.131 239589: Jun  8 11:50:38.656 CEST:
> %BGP-3-NOTIFICATION: sent to neighbor 80.81.192.108 4/0 (hold time
> expired) 0 bytes
> Jun  8 11:49:07 x.x.x.131 239590: Jun  8 11:50:41.944 CEST:
> %BGP-5-ADJCHANGE: neighbor 80.81.194.73 Down BGP Notification sent
> Jun  8 11:49:07 x.x.x.131 239591: Jun  8 11:50:41.944 CEST:
> %BGP-3-NOTIFICATION: sent to neighbor 80.81.194.73 4/0 (hold time
> expired) 0 bytes
> Jun  8 11:49:07 x.x.x.131 239592: Jun  8 11:50:41.944 CEST:
> %BGP-5-ADJCHANGE: neighbor 2001:7F8::20AD:0:2 Down BGP Notification
> sent
> Jun  8 11:49:07 x.x.x.131 239593: Jun  8 11:50:41.944 CEST:
> %BGP-3-NOTIFICATION: sent to neighbor 2001:7F8::20AD:0:2 4/0 (hold
> time expired) 0 bytes
> Jun  8 11:49:07 x.x.x.131 239594: Jun  8 11:50:41.944 CEST:
> %BGP-5-ADJCHANGE: neighbor 80.81.193.115 Down BGP Notification sent
> Jun  8 11:49:07 x.x.x.131 239595: Jun  8 11:50:41.944 CEST:
> %BGP-3-NOTIFICATION: sent to neighbor 80.81.193.115 4/0 (hold time
> expired) 0 bytes
> Jun  8 11:49:07 x.x.x.131 239596: Jun  8 11:50:44.124 CEST:
> %BGP-5-ADJCHANGE: neighbor 80.81.194.3 Down BGP Notification sent
> Jun  8 11:49:11 x.x.x.131 239597: Jun  8 11:50:44.124 CEST:
> %BGP-3-NOTIFICATION: sent to neighbor 80.81.194.3 4/0 (hold time
> expired) 0 bytes
> Jun  8 11:49:11 x.x.x.131 239598: Jun  8 11:50:45.200 CEST:
> %BGP-5-ADJCHANGE: neighbor 80.81.192.215 Down BGP Notification sent
> Jun  8 11:49:11 x.x.x.131 239599: Jun  8 11:50:45.200 CEST:
> %BGP-3-NOTIFICATION: sent to neighbor 80.81.192.215 4/0 (hold time
> expired) 0 bytes
> Jun  8 11:49:11 x.x.x.131 239600: Jun  8 11:50:47.336 CEST:
> %BGP-5-ADJCHANGE: neighbor 80.81.192.141 Down BGP Notification sent
> Jun  8 11:49:11 x.x.x.131 239601: Jun  8 11:50:47.336 CEST:
> %BGP-3-NOTIFICATION: sent to neighbor 80.81.192.141 4/0 (hold time
> expired) 0 bytes
> Jun  8 11:49:14 x.x.x.131 239602: Jun  8 11:50:48.432 CEST:
> %BGP-5-ADJCHANGE: neighbor 2001:7F8::3B41:0:1 Down BGP Notification
> sent
> Jun  8 11:49:14 x.x.x.131 239603: Jun  8 11:50:48.432 CEST:
> %BGP-3-NOTIFICATION: sent to neighbor 2001:7F8::3B41:0:1 4/0 (hold
> time expired) 0 bytes
> Jun  8 11:49:14 x.x.x.131 239604: Jun  8 11:50:49.720 CEST:
> %BGP-5-ADJCHANGE: neighbor 80.81.192.239 Down BGP Notification sent
> Jun  8 11:49:14 x.x.x.131 239605: Jun  8 11:50:49.720 CEST:
> %BGP-3-NOTIFICATION: sent to neighbor 80.81.192.239 4/0 (hold time
> expired) 0 bytes
> Jun  8 11:49:17 x.x.x.131 239606: Jun  8 11:50:50.976 CEST:
> %BGP-5-ADJCHANGE: neighbor 2001:2000:3080:B4::1 Down Peer closed the
> session
> Jun  8 11:49:17 x.x.x.131 239607: Jun  8 11:50:52.976 CEST:
> %BGP-5-ADJCHANGE: neighbor 80.81.194.20 Down BGP Notification sent
> Jun  8 11:49:17 x.x.x.131 239608: Jun  8 11:50:52.976 CEST:
> %BGP-3-NOTIFICATION: sent to neighbor 80.81.194.20 4/0 (hold time
> expired) 0 bytes
> Jun  8 11:49:17 x.x.x.131 239609: Jun  8 11:50:54.044 CEST:
> %BGP-5-ADJCHANGE: neighbor 80.81.193.21 Down BGP Notification sent
> Jun  8 11:49:17 x.x.x.131 239610: Jun  8 11:50:54.044 CEST:
> %BGP-3-NOTIFICATION: sent to neighbor 80.81.193.21 4/0 (hold time
> expired) 0 bytes
> Jun  8 11:49:20 x.x.x.131 239611: Jun  8 11:50:56.204 CEST:
> %BGP-5-ADJCHANGE: neighbor 2001:7F8::1A0B:0:1 Down BGP Notification
> sent
> Jun  8 11:49:20 x.x.x.131 239612: Jun  8 11:50:56.204 CEST:
> %BGP-3-NOTIFICATION: sent to neighbor 2001:7F8::1A0B:0:1 4/0 (hold
> time expired) 0 bytes
> Jun  8 11:49:23 x.x.x.131 239613: Jun  8 11:50:58.400 CEST:
> %BGP-5-ADJCHANGE: neighbor 80.81.194.63 Down BGP Notification sent
> Jun  8 11:49:23 x.x.x.131 239614: Jun  8 11:50:58.400 CEST:
> %BGP-3-NOTIFICATION: sent to neighbor 80.81.194.63 4/0 (hold time
> expired) 0 bytes
> Jun  8 11:49:23 x.x.x.131 239615: Jun  8 11:50:59.448 CEST:
> %BGP-5-ADJCHANGE: neighbor 80.81.192.97 Down BGP Notification sent
> Jun  8 11:49:23 x.x.x.131 239616: Jun  8 11:50:59.448 CEST:
> %BGP-3-NOTIFICATION: sent to neighbor 80.81.192.97 4/0 (hold time
> expired) 0 bytes
> Jun  8 11:49:27 x.x.x.131 239617: Jun  8 11:51:01.664 CEST:
> %BGP-5-ADJCHANGE: neighbor 80.81.192.131 Down BGP Notification sent
> Jun  8 11:49:27 x.x.x.131 239618: Jun  8 11:51:01.664 CEST:
> %BGP-3-NOTIFICATION: sent to neighbor 80.81.192.131 4/0 (hold time
> expired) 0 bytes
> Jun  8 11:49:27 x.x.x.131 239619: Jun  8 11:51:03.872 CEST:
> %BGP-5-ADJCHANGE: neighbor 80.81.194.83 Down BGP Notification sent
> Jun  8 11:49:27 x.x.x.131 239620: Jun  8 11:51:03.872 CEST:
> %BGP-3-NOTIFICATION: sent to neighbor 80.81.194.83 4/0 (hold time
> expired) 0 bytes
> Jun  8 11:49:27 x.x.x.131 239621: Jun  8 11:51:03.872 CEST:
> %BGP-5-ADJCHANGE: neighbor 80.81.192.156 Down BGP Notification sent
> Jun  8 11:49:30 x.x.x.131 239622: Jun  8 11:51:03.872 CEST:
> %BGP-3-NOTIFICATION: sent to neighbor 80.81.192.156 4/0 (hold time
> expired) 0 bytes
> Jun  8 11:49:30 x.x.x.131 239623: Jun  8 11:51:06.056 CEST:
> %BGP-5-ADJCHANGE: neighbor 80.81.194.50 Down BGP Notification sent
> Jun  8 11:49:30 x.x.x.131 239624: Jun  8 11:51:06.056 CEST:
> %BGP-3-NOTIFICATION: sent to neighbor 80.81.194.50 4/0 (hold time
> expired) 0 bytes
>
> CPU load is constantly at 100% doing BGP and more BGP.
>
> We have around 200 BGP sessions on DECIX and I would not want to shut
> them down and bring them up individually.
>
> How can I get out of this deadlock?
>
>
> Andy
>
>



-- 
================================================================
 :: :: Kevin Hodle | http://www.linkedin.com/in/kevinhodle
 :: :: PGP Key ID  | fingerprint
 :: :: 0x803F24BE  | 1094 FB06 837F 2FAB C86B E4BE 4680 3679 803F 24BE

"Elegance is not a dispensable luxury but a factor that decides
between success and failure. "
-Edsgar Dijkstra
================================================================




More information about the NANOG mailing list