CSR1000v + ASR1000 Code Upgrade Pleasure...

Mark Tinka mark at tinka.africa
Wed Oct 13 04:53:28 UTC 2021


Hi all.

I thought I'd share our recent experiences, per subject, just in case 
others run into the same problems.

So... we finally decided to try 17.3(4a)MD for the CSR1000v, after years 
of happy operation. Good Lord, what a drama!

At first, we couldn't figure out why iBGP sessions to all Cisco boxes 
could not stand up. Then we realized it's because IS-IS to them could 
not stand up. Then we realized it's because BFD sessions could not stand up.

But even after removing BFD, IS-IS remained down.
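
For anyone retracing this, the order of elimination above can be sketched as a bottom-up check. This is only an illustration: the layer names and the up/down states are hypothetical placeholders, not output from any real box, and treating interface MTU as the lowest layer is an assumption based on how this story ended.

```python
# Hypothetical layer names, lowest dependency first. Each layer is
# assumed to need every layer before it in the list.
LAYERS = ["mtu", "bfd", "isis", "ibgp"]


def lowest_failing_layer(state):
    """Return the first layer, bottom-up, that is not up (or None)."""
    for layer in LAYERS:
        # A layer missing from the observations is treated as down.
        if not state.get(layer, False):
            return layer
    return None


if __name__ == "__main__":
    # Illustrative observations only: everything looks down from the
    # top, so start digging at whatever this prints.
    observed = {"mtu": False, "bfd": False, "isis": False, "ibgp": False}
    print(lowest_failing_layer(observed))
```

The point is simply to start at the bottom rather than at iBGP, which is where we started.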

After 3 days of searching, we finally landed on CSCuz58508. In case you 
don't have CCO access, it is the same issue as described here:

https://community.cisco.com/t5/cisco-cloud-service-router-csr/b00ocg4q4e-csr-1000v-16-3-1a-can-t-set-mtu-on-gig-interface/td-p/3054853

This was even more confusing for us, because our interface driver on 
VMware ESXi is vmxnet3.

The bug ID suggests the problem is fixed in 16.3(2) and 16.4(1). So to 
be safe, we tested 16.12(5)MD, which allowed us to enable jumbo frames, 
but that turned out to be purely cosmetic. In the background, the box 
was simply dropping packets, silently. We found this out when we tried 
to copy other files to the node, and the transfer would just hang 
without any feedback. Removing the jumbo frame configuration allowed 
the files to come through.
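
A quicker way to catch this kind of silent drop than waiting for a file copy to hang is a ping with the DF bit set, sized to the MTU you think you configured. Below is a minimal sketch, assuming a Linux host with iputils ping (-M do sets DF, -s is the ICMP payload size) and plain IPv4 with no IP options. The address 192.0.2.1 is a documentation placeholder, and the function names are my own.

```python
import subprocess

IPV4_HEADER = 20  # bytes, assuming no IP options
ICMP_HEADER = 8   # bytes, ICMP echo header


def icmp_payload_for_mtu(mtu):
    """Largest ICMP echo payload that fits in one IPv4 packet of size mtu."""
    return mtu - IPV4_HEADER - ICMP_HEADER


def df_ping_command(host, mtu, count=3):
    """Build a Linux iputils ping command with DF set, sized to fill mtu."""
    return ["ping", "-M", "do", "-c", str(count),
            "-s", str(icmp_payload_for_mtu(mtu)), host]


def path_carries_mtu(host, mtu):
    """Run the probe; True only if ping exits 0 (at least one reply)."""
    result = subprocess.run(df_ping_command(host, mtu),
                            stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL)
    return result.returncode == 0


if __name__ == "__main__":
    # Only print the commands here; call path_carries_mtu() to run them.
    for mtu in (1500, 9000):
        print(mtu, " ".join(df_ping_command("192.0.2.1", mtu)))
```

If the 1500-byte probe gets replies and the 9000-byte one does not, the jumbo frame setting is not doing what the CLI claims.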

We noticed that nodes still running 3.17(0)S did not have any issues 
with IS-IS or BFD, or MTU. However, this code was only ever released as 
an ED train (and to be fair, we've been having dodgy issues with it in 
recent years), so we decided to downgrade to 3.16(9)S (which is actually 
an upgrade from 3.17(0)S, since the 3.16 train is an MD release, with 
the latest release being March 2019, vs. July 2017 for 3.17(4)SED).

With that, no more MTU issues, BFD and IS-IS are happy, iBGP is happy.

We definitely won't be wasting any more time trying to make Denali, 
Gibraltar, Fuji, Everest or Amsterdam work on our CSR1000v complement.

Needless to say, moving the ASR1000 platform to 17.3 has also come with 
its own avenue of pleasure, what with the mess that is the ROMMON, CPLD 
and FPGA upgrades. What the documentation says and what happens in real 
life are two very different things. It has taken us a week to come up 
with our own working procedure to upgrade just one box, and it is worse 
if it's a dual-RP system.

Mark.