Converged Networks Threat (Was: Level3 Outage)

Matthew Crocker matthew at crocker.com
Wed Feb 25 18:43:14 UTC 2004


> 	I'm saying that if a network had a FR/ATM/TDM failure in the past
> it would be limited to just the FR/ATM/TDM network.  (well, aside from
> any IP circuits that are riding that FR/ATM/TDM network).  We're now seeing
> the change from the TDM based network being the underlying network to the
> "IP/MPLS Core" being this underlying network.
>
> 	What it means is that a failure of the IP portion of the network
> that disrupts the underlying MPLS/GMPLS/whatnot core that is now
> transporting these FR/ATM/TDM services, does pose a risk.  Is the risk
> greater than in the past, relying on the TDM/WDM network?  I think that
> there could be some more spectacular network failures to come.  Overall
> I think people will learn from these to make the resulting networks
> more reliable.  (eg: there has been a lot learned as a result of the
> NE power outage last year).
>

Internet traffic should run over an IP/MPLS core in a separate routing 
instance (VRF, virtual context, whatever) so the MPLS core never sees 
the full BGP routing information of the Internet.  So long as router 
vendors can provide proper protection between routing instances, so 
that one virtual router can't consume all memory/CPU, the MPLS core 
should be pretty stable.  The core MPLS network and its control plane 
should be completely separate from regular traffic and much less 
complex for any given carrier.  VoIP, Internet, EoM, AToM, FRoM and 
TDMoM should each run in a separate instance, isolated from the others.  
A router should act like a unix machine, treating each MPLS/VRF 
instance as a separate user: isolating and protecting users from each 
other and enforcing resource allocations and limits.  I'm not sure how 
effective current-generation routers are at this, but it should be 
coming down the line.

That said, the IP/MPLS core should be more stable than traditional TDM 
networks.  The Internet itself may not stabilize, but that shouldn't 
affect the core.  What happened at L3 was an Internet outage, which in 
theory shouldn't affect the MPLS core.

Think back 10 years, when it was common for a unix binary to wipe out 
a machine by consuming all resources (fork bombs, anyone?).  Unix 
machines have come a long way since then, and routers need to follow 
the same progression.  What is the routing equivalent of 
'while (1) { fork(); };'?  Currently it is massive BGP flapping that 
chews up resources.  A good router should be immune to that, and can 
be with proper resource management.
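
To make the unix-user analogy a bit more concrete, here is a minimal 
sketch in Python of a per-instance route quota.  Everything in it (the 
Vrf and Router classes, max_routes, the instance names and route 
labels) is hypothetical and made up for illustration; it is not how any 
particular vendor does it, and it only models a limit on table size, 
not CPU.  The point is that a flood in one instance hits that 
instance's own limit and leaves the others alone.

```python
# Hypothetical model of per-VRF route quotas (illustration only, not vendor code).

class Vrf:
    """One routing instance with its own table and its own hard route limit."""

    def __init__(self, name, max_routes):
        self.name = name
        self.max_routes = max_routes
        self.routes = {}      # prefix -> next hop
        self.rejected = 0     # announcements refused because the quota was full

    def announce(self, prefix, next_hop):
        # Re-announcing a known prefix does not grow the table, so allow it.
        if prefix not in self.routes and len(self.routes) >= self.max_routes:
            self.rejected += 1
            return False
        self.routes[prefix] = next_hop
        return True

    def withdraw(self, prefix):
        self.routes.pop(prefix, None)


class Router:
    """Holds several isolated instances; each one is charged only for its own routes."""

    def __init__(self):
        self.vrfs = {}

    def add_vrf(self, name, max_routes):
        self.vrfs[name] = Vrf(name, max_routes)

    def announce(self, vrf_name, prefix, next_hop):
        return self.vrfs[vrf_name].announce(prefix, next_hop)


if __name__ == "__main__":
    r = Router()
    r.add_vrf("core", max_routes=50)
    r.add_vrf("internet", max_routes=1000)
    r.announce("core", "core-loopback-1", "pe1")
    # Flood the "internet" instance well past its quota.
    for i in range(5000):
        r.announce("internet", "route-%d" % i, "peer1")
    for vrf in r.vrfs.values():
        print(vrf.name, len(vrf.routes), "routes,", vrf.rejected, "rejected")
```

The quota is checked per instance rather than against one shared pool, 
which is the same idea as a per-user process limit stopping a fork 
bomb from taking down the whole machine.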


-Matt



