Devil's Advocate - Segment Routing, Why?

Mark Tinka mark.tinka at seacom.mu
Thu Jun 18 07:13:09 UTC 2020



On 18/Jun/20 07:25, Saku Ytti wrote:

> The IGP mess we are in is horrible, but I can't blame SR for it. It's
> really unacceptable we spend NRE hours developing 3 identical IGP
> (OSPFv2, OSPFv3, ISIS). We all pay a 300-400% premium for a single
> IGP.
>
> In a sane world, we'd retire all of them except OSPFv3 and put all NRE
> focus on there or move some of the NRE dollars to some other problems
> we have, perhaps we would have room to support some different
> non-Dijkstra IGP.
>
> In a half sane world, IGP code, 90% of your code would be identical,
> then you'd have adapter/ospfv2 adapter/ospfv3 adapter/isis which
> translates internal struct to wire and wire to internal struct. So any
> features you code, come for free to all of them. But no one is doing
> this, it's 300% effort, and we all pay a premium for that.
>
> In a quarter sane world we'd have some CIC, common-igp-container RFC
> and then new features like SR would be specified as CIC-format,
> instead of OSPFv2, OSPFv3, ISIS and BGP. Then each OSPFv2, OSPFv3,
> ISIS and BGP would have CIC-to-x RFC. So people introducing new IGP
> features do not need to write 4 drafts, one is enough.

While I don't have a real opinion on how to fix the IGP mess, the point
is we sit with it now. Getting all these fixed is going to increase the
bug surface area for some time to come as both vendors and operators
work the kinks out, in addition to SR's own kinks.

Yes, it's all par for the course for new features, which is why I'd also
like to have an alternative that has been baked in for many years to
give me an option for stability, as we roll the new kid out.

I probably will deploy SR-MPLS at some point in my lifetime, but I'm not
feeling awfully comfortable to do so right now; and yet I do need MPLSv6
forwarding.


>
> I would include IPv4+IPv6 my-igp-of-choice SR in my RFP. Luckily ISIS
> is supported on platforms I care about for IPv4+IPv6, so I'm already
> there.

Which is great for you, me, and a ton of other folk that run IS-IS on
Juniper. What about folk that don't have Juniper, or run OSPF?

I know, not your or my problem, but the Internet isn't just a few networks.



> I don't understand this.

I mean the same gaps that exist in RFC 7439, for would-be IPv6-only MPLS
networks.



> And it's really just adding TLV, if it already does IPv4 all the infra
> should be in place, only thing missing is transporting the
> information. Adding TLV to IGP is a lot less work than LDPv6.

What we theorize as "should be easy" can turn out to be a whole
discussion with the vendors about it being months or years of work. Not
being inside their meeting rooms, I can't quite challenge how they
present the task.

Fundamentally, LDPv6 already has 5+ years in implementation (and LDPv4
is 20 years old), inter-op issues seem to be mostly fixed, and for what
we need it to do, it's working very well.

There are probably as many networks running SR-MPLS as there are running
LDPv6, likely fewer if your SR deployment doesn't yet support OSPFv3 or
SR-ISISv6. I concede that for some networks looking to go SR-MPLS, label
distribution state reduction is probably higher up on the agenda than
MPLSv6 forwarding. For me, I'd like the option to have both, and decide
whether my network is in a position to handle the additional state
required for LDPv6, if I feel that I'd prefer to deal with a protocol
that has had more exposure to the sun.

Ultimately, boxes with LDPv6 have been shipping for some time, and we
have a ton of them deployed and running for a while now. If it comes
down to kicking out the 20% that won't support it because of an
all-or-nothing vendor approach on a platform without full SR-MPLS
support for all IGPs, it is what it is.



> 3 within a year.
> - PR1436119
> - PR1428081
> - PR1416032
>
> I don't have IOS-XR LDP bugs within a year, but we had a bunch back
> when going from 4 to 5. And none of these are cosmetic, these are
> blackholing.
>
> I'm not saying LDP is bad, it's just, of course more code lines you
> exercise more bugs you see.
>
> But yes, LDP has a lot of bug surface compared to SR, but in _your
> network_ lot of that bug surface and complexity is amortised
> complexity. So status quo bias is strong to keep running LDP, it is
> simpler _NOW_ as a lot of the tax has been paid and moving to an
> objectively simpler solution carries risk, as its complexity is not
> amortised yet.

And FWIW, if some operators are willing to benefit from all the
experience that has gone into developing and maintaining LDP, while we
let SR settle down, I don't see why that choice shouldn't be there.

I'm not saying it should be an SR vs. LDP debate like it was
BGP-signaling vs. LDP-signaling for VPLS 12+ years ago. All I'm saying
is for those who want to go bleeding edge with SR, go for it. For those
who prefer to gracefully transition toward SR over time by settling on
LDP that has been in the field for a minute, go for it too.

I won't claim to know whether LDP or SR have a smaller or larger bug
surface area. What I do know is that there will be plenty of bugs for
SR, as there have been for MPLS and all related protocols in the last
20+ years. From my side, I'd prefer to give SR the time it needs to get
all of its Vitamin D, but don't oppose anyone that prefers to deploy it.


> I can't add anything to the upside of going from LDP to SR that I've
> not already said. You get more by spending less, it's win:win. Only
> reason to stay in LDP is status quo bias which makes short term sense.

I can't argue the usefulness of reducing label distribution state in
MPLS. Heck, that is what got me excited about SR back in 2013, and also
what caused me to pump the brakes on the noise I was making to vendors
about developing LDPv6 (which started in 2008), because I was finally
going to get native MPLSv6 forwarding in SR without all the LDP/RSVP
fluff. But, things took their own turn, and with the IGP mess that it
currently is, we are where we are. Thankfully, some vendors did develop
LDPv6 anyway, so we got MPLSv6 in the end as SR was still in the embryo.

If I'm still in the game half a decade or so from now, I will very
likely dump LDP and move to SR-MPLS. I'm just not too comfortable doing
so now because IGP support is not where it needs to be, and it still has
to go through its own life cycle of bugs and fixes, which will be quite an
effort as global deployment is still far behind LDP and RSVP.


> RIP might make sense in some deployments, because it's essentially
> stateless (routes age out, no real 'session') so if you have 100k VM
> per router that you need to support and you want dynamic routing, RIP
> might be the least resistance solution with the highest scale. Timing
> wheels should help it scale and maintain great number of timers.

I guess my point was the vendors won't be dumping RIP, even if general
consensus is to avoid it whenever possible.

If I'm not concerned about LDP state, and protocol stability is more
important to me in the near-to-medium term, we'd be remiss to start a
culture of taking that choice away.

Because the next time vendors get bored with what they've built and sold,
decide that SR or some other feature has seen enough light of day, and
dream up something else to shout about in the 2030 - 2040 decade,
they'll have had the experience of cornering operators into making rash
decisions, and they'd never let us forget it.

Mark.
