Devil's Advocate - Segment Routing, Why?

Saku Ytti saku at ytti.fi
Thu Jun 18 05:25:46 UTC 2020


On Thu, 18 Jun 2020 at 01:17, Mark Tinka <mark.tinka at seacom.mu> wrote:

> IOS XR does not appear to support SR-OSPFv3.
> IOS XE does not appear to support SR-ISISv6.
> IOS XE does not appear to support SR-OSPFv3.
> Junos does not appear to support SR-OSPFv3.

The IGP mess we are in is horrible, but I can't blame SR for it. It's
really unacceptable we spend NRE hours developing 3 identical IGP
(OSPFv2, OSPFv3, ISIS). We all pay a 300-400% premium for a single
IGP.

In a sane world, we'd retire all of them except OSPFv3 and put all NRE
focus on there or move some of the NRE dollars to some other problems
we have, perhaps we would have room to support some different
non-djikstra IGP.

In a half sane world, IGP code, 90% of your code would be identical,
then you'd have adapter/ospfv2 adapter/ospfv3 adapter/isis which
translates internal struct to wire and wire to internal struct. So any
features you code, come for free to all of them. But no one is doing
this, it's 300% effort, and we all pay a premium for that.

In a quarter sane world we'd have some CIC, common-igp-container RFC
and then new features like SR would be specified as CIC-format,
instead of OSPFv2, OSPFv3, ISIS and BGP. Then each OSPFv2, OSPFv3,
ISIS and BGP would have CIC-to-x RFC. So people introducing new IGP
features do not need to write 4 drafts, one is enough.

I would include IPv4+IPv6 my-igp-of-choice SR in my RFP. Luckily ISIS
is supported on platforms I care about for IPV4+IPV6, so I'm already
there.

> MPLS/VPN service signaling in IPv6-only networks also has gaps in SR.

I don't understand this.


> So for networks that run OSPF and don't run Juniper, they'd need to move to IS-IS in order to have SR forward IPv6 traffic in an MPLS encapsulation. Seems like a bit of an ask. Yes, code needs to be written, which is fine by me, as it also does for LDPv6.

And it's really just adding TLV, if it already does IPv4 all the infra
should be in place, only  thing missing is transporting the
information. Adding TLV to IGP is a lot less work than LDPv6.

> I'd be curious to understand what bugs you've suffered with LDP in the last 10 or so years, that likely still have open tickets.

3 within a year.
- PR1436119
- PR1428081
- PR1416032

I don't have IOS-XR LDP bugs within a year, but we had a bunch back
when going from 4 to 5. And none of these are cosmetic, these are
blackholing.

I'm not saying LDP is bad, it's just, of course more code lines you
exercise more bugs you see.

But yes, LDP has a lot of bug surface compared to SR, but in _your
network_ lot of that bug surface and complexity is amortised
complexity. So status quo bias is strong to keep running LDP, it is
simpler _NOW_ as a lot of the tax has been paid and moving to an
objectively simpler solution carries risk, as its complexity is not
amortised yet.


> Yes, we all love less state, I won't argue that. But it's the same question that is being asked less and less with each passing year - what scales better in 2020, OSPF or IS-IS. That is becoming less relevant as control planes keep getting faster and cheaper.

I don't think it ever was relevant.

> I'm not saying that if you are dealing with 100,000 T-LDP sessions you should not consider SR, but if you're not, and SR still requires a bit more development (never mind deployment experience), what's wrong with having LDPv6? If it makes near-as-no-difference to your control plane in 2020 or 2030 as to whether your 10,000-node network is running LDP or SR, why not have the choice?

I can't add anything to the upside of going from LDP to SR that I've
not already said. You get more by spending less, it's win:win. Only
reason to stay in LDP is status quo bias which makes short term sense.

> Routers, in 2020, still ship with RIPv2. If anyone wants to use it (as I am sure there are some that do), who are we to stand in their way, if it makes sense for them?

RIP might make sense in some deployments, because it's essentially
stateless (routes age out, no real 'session') so if you have 100k VM
per router that you need to support and you want dynamic routing, RIP
might be the least resistance solution with the highest scale. Timing
wheels should help it scale and maintain great number of timers.

--
  ++ytti



More information about the NANOG mailing list