Cisco ASR9010 vs Juniper MX960
saku at ytti.fi
Thu Feb 18 14:46:07 UTC 2016
On 18 February 2016 at 15:45, Colton Conor <colton.conor at gmail.com> wrote:
> I would like opinions of the differences between these two platforms if
Summary, I think MX is better HW and SW right now.
Warning, rant incoming.
I liked ASR9k a lot more before I needed to run it. On paper IOS-XR is
superior to JunOS. JunOS is old-fashioned, non-pre-emptive,
run-to-completion. In theory this is the most efficient way to run code,
but in practice it means the programmer needs to be hyper-aware of how
long any bit of code they are writing may execute. If they get it wrong
and don't yield manually, simple things like parsing a community list
during commit may cause an IGP flap.
IOS-XR otoh has multiple processes scheduled either by QNX or Linux,
which means the programmer does not need to be so careful: the OS can
pre-empt the process and run something more important.
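
To make the run-to-completion risk concrete, here is a toy model of a
cooperative scheduler with two tasks: an IGP hello sender and a commit
parser. The tick counts, the function names and the 40-tick dead
interval are made up for illustration, and this is not how either OS is
actually implemented. It only shows how the longest gap between hellos
depends on how often the parser yields.

```python
# Toy cooperative (run-to-completion) scheduler. All numbers are hypothetical.

DEAD_INTERVAL = 40  # assumed: ticks without a hello before the neighbour gives up


def commit_parser(chunks, yield_every):
    """Parse `chunks` units of work, giving the CPU back every `yield_every` units."""
    done = 0
    while done < chunks:
        step = min(yield_every, chunks - done)
        done += step
        yield step  # ticks consumed before yielding


def max_hello_gap(chunks, yield_every):
    """Alternate hello task and parser; return the longest gap between hellos."""
    clock = 0
    hello_times = []
    parser = commit_parser(chunks, yield_every)
    parser_done = False
    while not parser_done:
        hello_times.append(clock)  # hello task runs and costs one tick
        clock += 1
        try:
            clock += next(parser)  # parser holds the CPU until it yields
        except StopIteration:
            parser_done = True
    hello_times.append(clock)
    return max(b - a for a, b in zip(hello_times, hello_times[1:]))


if __name__ == "__main__":
    for yield_every in (1000, 10):
        gap = max_hello_gap(1000, yield_every)
        verdict = "flap" if gap > DEAD_INTERVAL else "ok"
        print(f"yield every {yield_every}: longest hello gap {gap} ticks ({verdict})")
```

With a pre-emptive kernel the parser would simply be interrupted when
the hello task becomes runnable, so the gap would not depend on the
parser author remembering to yield.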
However, with this distribution comes the problem of IPC, sharing state
in a fast and economical manner, and I believe IOS-XR has dropped the
ball here. I don't know if it's even possible to solve today, it is
probably a very hard problem. This is just speculation, but I feel
like Cisco underestimated the problem, and instead of rethinking the
infrastructure, they are duplicating state in an effort to keep
performance acceptable, as IPC cannot be made fast enough. All this
adds complexity, which adds bugs.
So in practice, I believe JunOS to be currently the better system. But
IOS-XR 6 may show some light at the end of the tunnel, unsure yet.
(Isn't this always the case: in two years' time, everything will be
great.)
For hardware, ASR9k has the Trident and Typhoon generations, which are
Israeli EZchip (since acquired) NPUs, and now Tomahawk, which is a
completely different NPU. Juniper MX has DPCE and Trio. From a microcode
POV both vendors have two generations, but you can't buy DPCE anymore,
it's very old, so all MX systems really are Trio only, which means JNPR
only needs to develop features once, for a single NPU generation. Cisco
needs to do it twice, the operator needs to learn two platforms to
troubleshoot, and there is feature disparity in troubleshooting.
I also believe that Trio is a better NPU than EZchip or the one in
Typhoon. Atypically, they have succeeded in doing all lookups (FIB and
ACL) in RLDRAM instead of TCAM, which is easier to pull off but more
expensive. Trio can do more in HW, like fragmentation, and can look
deeper into the packet. A lot of flexibility is exposed to the operator,
like the ability to write arbitrary firewall filters by checking
specific bit-positions.
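
As an illustration of what matching on specific bit-positions means,
here is a small software sketch. It is not Trio microcode or JunOS
configuration syntax; the function names and the sample header bytes are
assumptions chosen for the example.

```python
# Sketch of a "match N bits at bit-offset M" filter term, done in software.

def bits_at(packet: bytes, bit_offset: int, bit_length: int) -> int:
    """Return `bit_length` bits starting `bit_offset` bits into the packet."""
    if bit_offset < 0 or bit_length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    end = bit_offset + bit_length
    if end > len(packet) * 8:
        raise ValueError("field extends past end of packet")
    as_int = int.from_bytes(packet, "big")
    shift = len(packet) * 8 - end  # drop the bits to the right of the field
    return (as_int >> shift) & ((1 << bit_length) - 1)


def matches(packet: bytes, bit_offset: int, bit_length: int, value: int) -> bool:
    """True if the selected bit field equals `value`."""
    return bits_at(packet, bit_offset, bit_length) == value


if __name__ == "__main__":
    # First four bytes of an IPv4 header: version 4, IHL 5, TOS 0xb8 (DSCP 46).
    header = bytes.fromhex("45b80054")
    print(bits_at(header, 0, 4))       # IP version field
    print(matches(header, 8, 6, 46))   # DSCP field is bits 8..13 of the header
```

The same offset and length approach works for any field the platform
does not have a named match condition for, as long as the offset is
fixed relative to a known header start.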
For multicast ASR9k is better, as it can replicate in fabric, whereas
in MX replication is done by the linecard, either binary or unary. But
this really isn't relevant unless you actually have a large volume of
multicast replicated to many ports.
For troubleshooting/instrumentation, for some things MX is better,
like packet-via-dmem capture for all transit packets, which is a
godsend. But ASR9k has far more NPU counters for various
drop/punt/limit conditions, most of which can be captured (at the cost
of stopping forwarding for a moment). Most of the stuff in ASR9k is very
new or just coming, while MX has had sufficient instrumentation for
years. The ASR9k team is focusing on this and a lot of good stuff is in
the pipeline, which may make ASR9k instrumentation better in the long
run.
IOS-XR does not have any guaranteed machine-parseable presentation of
data, whereas in JunOS every command can be output as high quality XML.
In IOS-XR this is rarely possible, and even when it is, there is no
strong relation to the CLI, and often the actual output is just a single
string-blob, so using it is no better than screen-scraping. JunOS
inherently will have this XML, much like TimOS would inherently have
an SNMP presentation of data.
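
To show why structured output matters, here is a minimal sketch of
consuming it. The XML below is a simplified, hand-written fragment in
the general shape of JunOS interface output (real output carries
namespaces and many more elements), and `down_interfaces` is a
hypothetical helper.

```python
# Parsing structured operational output instead of screen-scraping.
import xml.etree.ElementTree as ET

# Assumed sample data, not captured from a real router.
SAMPLE = """\
<interface-information>
  <physical-interface>
    <name>xe-0/0/0</name>
    <oper-status>up</oper-status>
  </physical-interface>
  <physical-interface>
    <name>xe-0/0/1</name>
    <oper-status>down</oper-status>
  </physical-interface>
</interface-information>
"""


def down_interfaces(xml_text: str) -> list:
    """Return names of physical interfaces whose oper-status is 'down'."""
    root = ET.fromstring(xml_text)
    return [
        phy.findtext("name")
        for phy in root.iter("physical-interface")
        if phy.findtext("oper-status") == "down"
    ]


if __name__ == "__main__":
    print(down_interfaces(SAMPLE))
```

No regular expressions and no dependence on column widths: if a field
is added to the output, the query still works. With a single
string-blob you are back to parsing text by hand.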
I don't imagine this being solved any time soon, because it's a very
fundamental infrastructure issue. What is our truth source? The truth
source should be a single presentation, out of which CLI, XML and YANG
are all extracted, so that there simply is no possibility of de-sync.
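
The single truth source idea can be sketched in a few lines. The
record, the field names and both renderers below are hypothetical; the
point is only that both views are generated from the same data, so
neither can carry a field the other lacks.

```python
# One state record, two renderings derived from it. All names are made up.
import xml.etree.ElementTree as ET

STATE = {"neighbor": "192.0.2.1", "state": "Established", "prefixes": 42}


def render_cli(state: dict) -> str:
    """Human-readable view: one 'key: value' line per field."""
    return "\n".join(f"{key}: {value}" for key, value in state.items())


def render_xml(state: dict) -> str:
    """Machine-readable view: one child element per field."""
    root = ET.Element("bgp-peer")
    for key, value in state.items():
        ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(render_cli(STATE))
    print(render_xml(STATE))
```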
A lot of the stuff Cisco wanted to solve from Classic IOS is actually
worse in IOS-XR. Software management is worse: yeah, you have SMUs, but
managing them is a nightmare and most of them need a reload or cause a
routing flap anyhow, so it does not really help you. I actually prefer
managing Classic IOS software to XR. Most of the time we need to
upgrade, we need to do it because HW isn't supported. JunOS has
figured this out correctly as well: by having a hardware abstraction
layer they can add 'JAM', i.e. support for new hardware, in service
without changing the software.
For control-plane protection IOS-XR has a pretty solid idea in 'LPTS':
the platform should know what is to be punted and what not, so why not
automatically program ACLs and policers for that stuff. It works
somewhat well, better than JunOS out-of-the-box. But for an operator who
knows what they are doing, JunOS can be protected much, much better.
'LPTS' only has a single policer for a specific traffic-class, like
'BGP-known'; if this is exceeded, all BGP suffers. Whereas JunOS has
multiple levels of policers: an aggregate policer, which is the same as
in IOS-XR, but also 'subscriber' level (L4 keys), 'ifl' level
and 'ifd' level. So even if a single BGP neighbour floods you with tons
of frames, you can still have all other BGP sessions protected by having
the misbehaving BGP neighbour limited at its IFD, IFL or Sub level.
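
The difference between a single aggregate policer and hierarchical
policers can be shown with a toy simulation. This is a sketch, not
vendor behaviour: the limits are invented, one measurement interval is
modelled with plain packet counters instead of token buckets, and the
'sub' key uses only the source address where a real implementation
would use L4 keys.

```python
# Toy model: aggregate-only policing vs sub/ifl/ifd/aggregate policing.
from collections import defaultdict

ORDER = ("sub", "ifl", "ifd", "aggregate")  # most specific level checked first


class Policer:
    """Accept up to `limit` packets in one interval, drop the rest."""

    def __init__(self, limit):
        self.limit = limit
        self.seen = 0

    def allow(self):
        if self.seen >= self.limit:
            return False
        self.seen += 1
        return True


def simulate(packets, limits):
    """Return accepted packet count per source; `limits` maps level -> limit."""
    policers = {
        level: defaultdict(lambda lim=lim: Policer(lim))
        for level, lim in limits.items()
    }
    accepted = defaultdict(int)
    for pkt in packets:
        keys = {
            "sub": (pkt["ifd"], pkt["ifl"], pkt["src"]),
            "ifl": (pkt["ifd"], pkt["ifl"]),
            "ifd": pkt["ifd"],
            "aggregate": "all",
        }
        # all() short-circuits, so a packet dropped at a specific level
        # never consumes budget at the broader levels.
        if all(policers[lvl][keys[lvl]].allow() for lvl in ORDER if lvl in policers):
            accepted[pkt["src"]] += 1
    return accepted


# One flooding neighbour followed by three well-behaved ones (assumed traffic).
FLOOD = [{"ifd": "xe-0/0/0", "ifl": 0, "src": "198.51.100.1"}] * 1000
GOOD = [
    {"ifd": f"xe-0/0/{n}", "ifl": 0, "src": f"192.0.2.{n}"}
    for n in (1, 2, 3)
    for _ in range(10)
]

if __name__ == "__main__":
    print(dict(simulate(FLOOD + GOOD, {"aggregate": 100})))
    print(dict(simulate(FLOOD + GOOD, {"sub": 20, "aggregate": 100})))
```

In the aggregate-only run the flooder uses up the whole budget and the
other neighbours get nothing through. With a 'sub' level in front, the
flooder is capped early and the remaining aggregate budget is left for
everyone else.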
If I could get classicIOS with commit and RPL, I'd run that rather
than XR right now.
For MX you might want to ping your account team about MX2008, which will
(IMHO) replace MX960 RSN. The main advantage, on top of supporting newer
MPCs, is that you don't have a mid-plane: fabrics are connected to LCs
directly, so you never need to upgrade the chassis to support higher
rate SERDES in future.