What's so difficult about ISSU

Juuso Lehtinen juuso.lehtinen at gmail.com
Fri Nov 9 07:36:22 UTC 2012


In vendor-speak, ISSU usually refers to a 'minimal traffic impact' upgrade.
The definition of minimal varies from vendor to vendor and from upgrade to
upgrade, depending on which parts of the code need to be upgraded. In
general, traffic loss during ISSU is an order of magnitude less than when
reloading the whole box or line card, as in a conventional upgrade.

At a high level, ISSU can be divided into two areas:
* Control plane / controller card software upgrade
* Forwarding plane / line card software upgrade

The controller card software upgrade is the easy part. In a 1+1 controller
design, the standby controller card is upgraded first. Next, a controller
card switchover is performed. Last, the remaining controller card is
upgraded.
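
The three-step order above can be sketched in a few lines of Python. This
is only an illustration of the sequencing, not any vendor's procedure; the
Controller class, the card names and the version strings are all made up
for the example.

```python
from dataclasses import dataclass


@dataclass
class Controller:
    # Hypothetical model of one controller card in a 1+1 pair.
    name: str
    version: str
    active: bool


def upgrade_controllers(a, b, new_version):
    """Upgrade a 1+1 controller pair and return the ordered list of steps."""
    if a.active == b.active:
        raise ValueError("exactly one controller must be active")
    active, standby = (a, b) if a.active else (b, a)
    steps = []

    # Step 1: upgrade the standby card while the active one keeps running.
    standby.version = new_version
    steps.append(f"upgrade {standby.name}")

    # Step 2: switch over so the upgraded card becomes active.
    active.active, standby.active = False, True
    steps.append(f"switchover to {standby.name}")

    # Step 3: upgrade the card that was active before the switchover.
    active.version = new_version
    steps.append(f"upgrade {active.name}")
    return steps
```

At no point in this sequence are both cards down at once, which is why
this half of ISSU is comparatively easy.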

Line card upgrade is the trickier part. At a high level, the line card can
be divided into a forwarding plane and a control plane (yes, there is a CPU
complex on line cards as well). The control plane part of the line card can
be upgraded separately and then restarted. If the line-card CPU is
responsible for generating OSPF hellos, the OSPF adjacency might time out
during the restart. However, for most protocols, graceful restart
extensions help with such issues. While the control plane is rebooting, the
forwarding hardware on the line card continues to forward packets.
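
To make the OSPF timing concern concrete, here is a rough back-of-the-
envelope check in Python. It assumes the commonly used 10 s hello / 40 s
dead timers as defaults, and it assumes that a neighbor acting as a
graceful restart helper holds the adjacency for the whole advertised grace
period. The function name and the grace_s parameter are invented for this
sketch; real behavior depends on the implementation.

```python
def adjacency_survives(restart_s, hello_s=10.0, dead_s=40.0, grace_s=None):
    """Estimate whether an OSPF adjacency outlives a line-card CPU restart.

    restart_s: time during which the line-card CPU sends no hellos.
    grace_s:   grace period if graceful restart is in use, else None.
    """
    if grace_s is not None:
        # Assumption: the helper keeps the adjacency up for the grace period.
        return restart_s < grace_s
    # Worst case: the last hello left almost one hello interval before the
    # restart began, so the neighbor's dead timer is already partly used up.
    return restart_s < dead_s - hello_s
```

For example, adjacency_survives(20) is True with the default timers, while
adjacency_survives(35) is False unless a grace period longer than 35 s is
passed in.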

The forwarding plane upgrade of the line card is the hardest part. This is
the part that causes the 'short outage' during ISSU. If the code upgrade
needs to touch microcode or FPGA code, you will see some traffic loss. It
is just the way these chips are built: you cannot reprogram an FPGA without
taking it out of service first. The same applies to network processors.

In theory, you could duplicate these forwarding plane chips on line cards
and implement a simple switch before the PHY. However, I doubt any vendor
has gone this way, as it would push line card prices much higher.

If your SLAs are built so that no packet loss is acceptable, you need to
work around the ISSU limitations:
* Use line-level protection across adjacent line cards (LAG, APS 1+1,
MSP 1+1): when the primary card goes down, the backup card carries the
traffic
* When upgrading a transit router, reroute traffic via a redundant path
before starting the upgrade
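
Before taking a line card down, it is worth checking that every protection
group touching it really has a member on another card. A minimal Python
sketch of that check follows; the data layout (group name mapped to a list
of (card, port) pairs) and the function name are assumptions made for the
example, not something a router exports in this form.

```python
def unprotected_groups(groups, card):
    """Return names of groups whose members all sit on `card`.

    groups: dict mapping a group name (e.g. a LAG) to a list of
            (card, port) tuples. Hypothetical inventory format.
    card:   the line card about to be upgraded.
    """
    at_risk = []
    for name, members in groups.items():
        cards = {member_card for member_card, _port in members}
        # A group is at risk only if every member lives on the target card.
        if cards == {card}:
            at_risk.append(name)
    return sorted(at_risk)


lags = {
    "lag1": [(1, 0), (2, 0)],   # spread over cards 1 and 2: protected
    "lag2": [(1, 1), (1, 2)],   # both members on card 1: not protected
    "lag3": [(3, 0)],           # does not touch card 1 at all
}
at_risk_on_card_1 = unprotected_groups(lags, 1)   # ["lag2"]
```

Any group this returns will lose traffic when the card is upgraded, so it
needs either a member on another card or a rerouted path first.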

BR,
 Juuso



On Thu, Nov 8, 2012 at 3:22 PM, Kasper Adel <karim.adel at gmail.com> wrote:

> Hello,
>
> We've been hearing about ISSU for so many years and i didnt hear that any
> vendor was able to achieve it yet.
>
> What is the technical reason behind that?
>
> If i understand correctly, the way it will be done would be simply to have
> extra ASICs/HW to be able to build dual circuits accessing the same memory,
> and gracefully switch from one to another. Is that right?
>
> Thanks,
> Kim
>


