MPLS routing loop

Swinton, Jeff jswinton at conxion.net
Wed Jan 31 22:15:35 UTC 2001



Has anyone out there deployed MPLS TE in a hybrid Cisco and Juniper
environment using IS-IS?  In the lab, we've come across an interoperability
issue, and we're wondering if anyone else has seen it and, more importantly,
determined a workaround.  The issue is a routing loop caused by the
differences in how IOS and JUNOS determine the best IP route when LSPs are
present.

Cisco has two approaches for assigning metrics to LSP tunnels:  absolute and
relative.  Absolute metrics are, for the most part, independent of the
underlying IGP metric.  Relative LSP metrics are based on the underlying
IGP - they change when the IGP metrics change.

When an absolute metric is set on a Cisco MPLS tunnel, the metric applies not
only to the path to the egress router, but to all paths downstream of the
egress router as well.  So given A-B-C-D-E, with A (Cisco) having a tunnel to
B with absolute metric m, A will have routes to B, C, D, and E all with metric
m, no matter what the IGP link metrics are between B, C, D, and E.

JUNOS behaves differently.  Downstream IGP metrics are added to the tunnel,
so route selection by A, if it were a Juniper, would consider B-C-D-E IGP
metrics before installing the route.
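
To make the difference concrete, here is a rough Python sketch of the two
behaviors on the A-B-C-D-E chain.  The tunnel metric and the downstream link
metrics are made-up numbers, and the function names are mine; this models
only the behavior described above, not either vendor's actual SPF code.

```python
# Chain A-B-C-D-E, with A holding a tunnel to B.
# All metric values here are hypothetical, chosen only for illustration.
TUNNEL_METRIC = 3
DOWNSTREAM_LINKS = [("B", "C", 10), ("C", "D", 20), ("D", "E", 30)]


def absolute_metrics(tunnel_metric, links):
    """Absolute behavior as described above: the egress router and every
    router behind it get the tunnel metric unchanged."""
    metrics = {links[0][0]: tunnel_metric}
    for _src, dst, _cost in links:
        metrics[dst] = tunnel_metric
    return metrics


def additive_metrics(tunnel_metric, links):
    """Additive behavior as described above: downstream IGP link metrics
    are added to the tunnel metric hop by hop."""
    metrics = {links[0][0]: tunnel_metric}
    total = tunnel_metric
    for _src, dst, cost in links:
        total += cost
        metrics[dst] = total
    return metrics


print(absolute_metrics(TUNNEL_METRIC, DOWNSTREAM_LINKS))
print(additive_metrics(TUNNEL_METRIC, DOWNSTREAM_LINKS))
```

With these numbers the absolute model gives A a metric of 3 to every router,
while the additive model gives 3, 13, 33, and 63 to B, C, D, and E.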

OK.  So given this behavior, here's how the loop occurs:



      x                 y
      |                 |
      |                 |
      A                 B
      |                 |
      |                 |
      J-----------------C


J = Juniper
C = Cisco
x and y are EBGP peers that advertise prefix z with the same BGP attributes.

J-C has an IS-IS link metric of 1000.
J-A has an IS-IS link metric of 10.
C-B has an IS-IS link metric of 5.

Now build LSP Q (in both directions) between J and C, with LSP metric 3.
This Cisco-originated metric is "absolute".

Router A has an EBGP session to x, and router B has an EBGP session to y.
J and C have BGP sessions to both A and B (full IBGP mesh).

The route from C to A is through J with metric 3.
The route from J to B is through C with metric 3 + 5 = 8.

The Juniper, J, sees two routes to destination z, with the determining factor
being IGP distance.  It has a metric of 10 to A and a metric of 8 to B, so it
forwards the packets to C.

The Cisco, C, sees two routes to destination z, with the determining factor
being IGP distance.  It has a metric of 3 to A (because it doesn't add the
downstream metric) and a metric of 5 to B.  The Cisco forwards the packet to J.

Voila.  Loop.
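
The arithmetic can be checked with a small Python sketch.  The link and LSP
metrics are the ones given above; the table layout and the forward() and
trace() helpers are my own illustration and assume each router simply picks
the BGP next hop (A or B) with the lowest IGP metric.

```python
# IS-IS link metrics and LSP metric from the example above.
LINK_J_C, LINK_J_A, LINK_C_B = 1000, 10, 5
LSP_METRIC = 3

# For each router: BGP next hop -> (IGP metric, neighbor used to reach it).
# The plain IS-IS paths across the J-C link (metric 1000) lose to the LSP,
# so only the winning path to each next hop is listed.
tables = {
    # Juniper adds the downstream C-B metric to the LSP metric.
    "J": {"A": (LINK_J_A, "A"), "B": (LSP_METRIC + LINK_C_B, "C")},
    # Cisco applies the absolute LSP metric to everything behind J.
    "C": {"A": (LSP_METRIC, "J"), "B": (LINK_C_B, "B")},
}


def forward(table):
    """Return the neighbor toward the BGP next hop with the lowest metric."""
    _metric, neighbor = min(table.values())
    return neighbor


def trace(start, tables):
    """Follow forwarding decisions from start; report whether a loop forms."""
    path = [start]
    while path[-1] in tables:
        nxt = forward(tables[path[-1]])
        if nxt in path:
            return path + [nxt], True
        path.append(nxt)
    return path, False


path, looped = trace("J", tables)
print(" -> ".join(path), "(loop)" if looped else "(delivered)")
```

Starting at J, the trace goes J -> C -> J, which is the loop.  If C's entry
for A were 3 + 10 = 13 instead of 3, C would pick B and the packet would be
delivered.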

With a full LSP mesh of routers, you wouldn't see this problem, but there
are reasons why you might not have a full mesh.

 - Some routers on a network might not support MPLS.
 - Link or protocol (e.g. RSVP) failures may cause a node to drop out of the
mesh.
 - And then there are always misconfigurations.
 - Large providers may decide to have a core LSP mesh only, to minimize
scaling complexity.





