Cisco IOS MPLS VPN Bug
jason at lixfeld.ca
Sat Mar 12 10:34:52 CST 2011
On 2011-03-12, at 2:31 AM, Joe Renwick wrote:
> These routers
> are configured as BGP route-reflectors.
> soft nor hard clears on the BGP neighbors worked, only the config removal.
> Once re-applied life was good.
> The bug itself was with the BGP updates sent by the RR. During the outage
> these updates did not include the Route Target Extended Community required
> by the route-reflector clients which identifies which VRF the route belongs
> Notice the mysterious disappearance of the RT community.
> Looking to see if anyone has seen this issue particularly with this version
> of code. TAC is trying to tell me that this was a bug in a previous version
> but is fixed in the code I am running.
Interesting. I recently closed off a TAC case on a similar, but not identical, issue. In my case it was 12.2(52)EY on an ME3600, and in my particular topology an ME3600 wasn't announcing a plain ol' BGP community to one of its two RRs. The extended communities were fine, though. Also, the announcements were being stuffed into two different update groups: the ME sending the 'good' announcement was announcing updates to update-groups 1 and 2, while the ME sending the 'bad' announcement was announcing updates to update-group 1 only.
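For anyone wanting to check for this kind of thing offline, here is a rough Python sketch of the comparison I mean: given the communities one router advertised to each of its two RRs for the same prefix, list what one RR received that the other did not. The prefix, the community values, and the names `advertised` and `missing_communities` are all made up for illustration; none of this comes from router output or from any Cisco tool.

```python
# Hypothetical data: communities one router advertised to each of its two
# route reflectors for the same prefix. All values are invented.
advertised = {
    "rr1": {"10.0.0.0/24": {"standard": {"65000:100"}, "extended": {"RT:65000:1"}}},
    "rr2": {"10.0.0.0/24": {"standard": set(), "extended": {"RT:65000:1"}}},
}


def missing_communities(advertised, reference, other):
    """Return {prefix: {kind: communities sent to `reference` but not to `other`}}."""
    diffs = {}
    for prefix, ref_attrs in advertised[reference].items():
        # A prefix never sent to `other` counts as missing every community.
        other_attrs = advertised[other].get(prefix, {})
        for kind, ref_set in ref_attrs.items():
            lost = ref_set - other_attrs.get(kind, set())
            if lost:
                diffs.setdefault(prefix, {})[kind] = lost
    return diffs


# Communities rr1 received that rr2 did not.
print(missing_communities(advertised, "rr1", "rr2"))
```

With the sample data, only the standard community shows up as missing toward rr2, which mirrors what I saw: standard community gone, extended communities intact.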
We didn't spend as much time troubleshooting the issue as you clearly have, because we caught it before it was customer-affecting. That said, I noticed the same thing at the time: hard clearing the sessions didn't fix it. I didn't try to unconfigure the neighbour, though. I was running EY on this switch, the ME3600s are so new, EY1 was available, and I knew I'd have to reboot anyway to clear the issue, so I decided to upgrade to EY1, and that seemed to clear up the problem.
I haven't seen this resurface since. EY1 was available as soon as we started receiving our ME3600s, so as a policy we upgraded every one before it went into the field, except I had missed this one in particular.
There were no open bugs pointing to my issue that the TAC engineer could find, but if you could pass me the case number, I'd like to give it to my engineer so he can see if your issue is somehow related to mine, just manifested in a slightly different way.