Converged Networks Threat (Was: Level3 Outage)
Matthew Crocker
matthew at crocker.com
Wed Feb 25 18:43:14 UTC 2004
> I'm saying that if a network had a FR/ATM/TDM failure in the past
> it would be limited to just the FR/ATM/TDM network. (well, aside from
> any IP circuits that are riding that FR/ATM/TDM network). We're now
> seeing the change from the TDM based network being the underlying
> network to the "IP/MPLS Core" being this underlying network.
>
> What it means is that a failure of the IP portion of the network
> that disrupts the underlying MPLS/GMPLS/whatnot core that is now
> transporting these FR/ATM/TDM services, does pose a risk. Is the risk
> greater than in the past, relying on the TDM/WDM network? I think that
> there could be some more spectacular network failures to come. Overall
> I think people will learn from these to make the resulting networks
> more reliable. (eg: there has been a lot learned as a result of the
> NE power outage last year).
>
Internet traffic should run over an IP/MPLS core in a separate session
(VRF, Virtual context, whatever..) so the MPLS core never sees the full
BGP routing information of the Internet. So long as router vendors can
provide proper protection between routing instances, so that one virtual
router can't consume all memory/CPU, the MPLS core should be pretty
stable. The core MPLS network and control plane should be completely
separate from regular traffic and much less complex for any given
carrier. VoIP, Internet, EoM, AToM, FRoM, TDMoM should all run in
separate sessions all isolated from each other. A router should act
like a unix machine treating each MPLS/VRF session as a separate user,
isolating and protecting users from each other, providing resource
allocation and limits. I'm not sure of the effectiveness of current
generation routers, but it should be coming down the line. That said,
the IP/MPLS core should be more stable than traditional TDM networks.
The Internet itself may not stabilize, but that shouldn't affect the
core. What happened at L3 was an Internet outage, which in theory
shouldn't affect the MPLS core. Think back 10 years, when it was common
for a unix binary to wipe out a machine by consuming all resources
(fork bombs anyone?). Unix machines have come a long way since then.
Routers need to follow the same progression. What is the routing
equivalent of 'while (1) { fork(); };'? Currently it is massive BGP
flapping that chews resources. A good router should be immune to that,
and can be with proper resource management.
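
To make the "each VRF is a separate user" idea concrete, here is a
rough sketch in Python. It is only a toy model, not any vendor's
implementation: the names RoutingInstance, Router, max_routes and
total_routes are made up for illustration, and a real router would
also have to meter memory, CPU and update rates, not just route
counts. The point is that a flood in the "internet" instance hits its
own quota and leaves the "mpls-core" instance untouched.

```python
from dataclasses import dataclass, field


class QuotaExceeded(Exception):
    """Raised when a routing instance tries to exceed its own quota."""


@dataclass
class RoutingInstance:
    # Hypothetical model of one VRF / virtual router (not a vendor API).
    name: str
    max_routes: int
    routes: set = field(default_factory=set)

    def install(self, prefix: str) -> None:
        # Re-announcing a prefix we already hold costs nothing extra.
        if prefix in self.routes:
            return
        if len(self.routes) >= self.max_routes:
            raise QuotaExceeded(f"{self.name}: route limit reached")
        self.routes.add(prefix)

    def withdraw(self, prefix: str) -> None:
        self.routes.discard(prefix)


class Router:
    # Assumed shared resource: one route table of fixed total size.
    def __init__(self, total_routes: int):
        self.total_routes = total_routes
        self.instances = {}

    def add_instance(self, name: str, max_routes: int) -> RoutingInstance:
        # Refuse to hand out more quota than the box actually has, so
        # one instance can never starve another.
        committed = sum(i.max_routes for i in self.instances.values())
        if committed + max_routes > self.total_routes:
            raise ValueError("quotas would oversubscribe the route table")
        inst = RoutingInstance(name, max_routes)
        self.instances[name] = inst
        return inst


def demo():
    router = Router(total_routes=1000)
    core = router.add_instance("mpls-core", max_routes=200)
    internet = router.add_instance("internet", max_routes=800)

    for i in range(150):
        core.install(f"10.0.{i}.0/24")

    # The routing version of a fork bomb: a flood of new prefixes
    # arriving in the Internet instance.
    rejected = 0
    for i in range(5000):
        try:
            internet.install(f"198.51.{i // 256}.{i % 256}/32")
        except QuotaExceeded:
            rejected += 1

    # The core instance still has headroom after the flood.
    core.install("10.1.0.0/24")
    return len(core.routes), len(internet.routes), rejected
```

Calling demo() runs the flood and returns the route counts for the
core and Internet instances plus the number of rejected installs. It
is the same trick as per-user process limits on a unix box: the
runaway user hits its own ceiling and nobody else notices.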
-Matt