Whats so difficult about ISSU

Jonathan Lassoff jof at thejof.com
Fri Nov 9 05:15:04 UTC 2012


On Thu, Nov 8, 2012 at 8:13 PM, Mikael Abrahamsson <swmike at swm.pp.se> wrote:
> On Thu, 8 Nov 2012, Phil wrote:
>
>> The major vendors have figured it out for the most part by moving to
>> stateful synchronization between control plane modules and implementing
>> non-stop routing.
>
>
> NSR isn't ISSU.
>
> ISSU contains the wording "in service". 6 seconds of outage isn't "in
> service". 0.5 seconds of outage isn't "in service". I could accept a few
> microseconds of outage as being "ISSU", but tenths of seconds isn't in
> service.
>
>
>> The main remaining hurdle is updating microcode on linecards, they still
>> need to be rebooted after an upgrade.
>
>
> ... and as long as this is the case, there is no ISSU. There is only
> "shorter outages during upgrade compared to a complete reboot".

This.
There is some wonderfully reconfigurable router hardware out in the
world, and platforms that can dynamically program their forwarding
hardware make this seem possible.

It's possible to build things such that portions of a single box can
be upgraded one at a time. With multiple links or forwarding paths out
to a remote destination, it seems to me that the upgrade process could
just coordinate things: update each piece of forwarding hardware while
letting traffic cut over, then wait for it to come back before moving
on.

I could envision a Juniper M/TX box, where MPLS FRR or an "ae"
interface across FPCs could take backup traffic while a PFE is
upgraded.
Of course, every possible path would need to be able to survive an FPC
being down, and the process would have to have hooks into protocols to
know when everything is switched back.
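
To make the idea concrete, here is a rough sketch of the coordination
loop I have in mind. Everything in it is hypothetical: the Box model,
the upgrade_fn and restored_fn hooks, and the FPC names are made up
for illustration and do not correspond to any vendor's API.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Hypothetical model of one chassis (not a real vendor API)."""
    paths: dict     # path name -> set of FPC names that can carry it
    up: set         # FPCs currently forwarding traffic
    versions: dict  # FPC name -> running software version


def survives_without(box, fpc):
    """True if every path still has at least one forwarding FPC without `fpc`."""
    remaining = box.up - {fpc}
    return all(members & remaining for members in box.paths.values())


def rolling_upgrade(box, new_version, upgrade_fn, restored_fn):
    """Upgrade one FPC at a time, refusing any step that would strand a path.

    upgrade_fn(fpc, version) stands in for reloading the hardware;
    restored_fn(fpc) stands in for the protocol hooks (FRR, LAG state,
    and so on) that report traffic has switched back.
    """
    done = []
    for fpc in sorted(box.versions):
        if not survives_without(box, fpc):
            raise RuntimeError(f"no backup path if {fpc} goes down")
        box.up.discard(fpc)           # traffic cuts over to the backups
        upgrade_fn(fpc, new_version)  # reload this piece of hardware
        box.versions[fpc] = new_version
        box.up.add(fpc)
        if not restored_fn(fpc):      # wait for it to come back
            raise RuntimeError(f"{fpc} did not return to service")
        done.append(fpc)
    return done


box = Box(
    paths={"to-peer-a": {"fpc0", "fpc1"}, "to-peer-b": {"fpc1", "fpc2"}},
    up={"fpc0", "fpc1", "fpc2"},
    versions={"fpc0": "old", "fpc1": "old", "fpc2": "old"},
)
order = rolling_upgrade(box, "new", lambda f, v: None, lambda f: True)
```

The redundancy check runs before each step, so a path that lives on
only one FPC stops the upgrade before any traffic is dropped.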
