Curiosity about AS3356 L3/CenturyLink network resiliency (in general)

Mike Hammett nanog at ics-il.net
Thu May 17 13:24:53 UTC 2018


I often question why/how people build networks the way they do. There's some industry hard-on with having a few ginormous routers instead of many smaller ones. I've learned from building Internet exchanges that the number of networks without BGP edge routers in major markets where they have a presence is quite a bit larger than one would expect. I heard a podcast once (I forget if it was Packet Pushers or Network Collective) postulating that the reason everything runs back to a few big-ass routers is that someone decided to spend a crap-load of money on them for bragging rights, so now they have to run everything they can through them A) to "prove" their purchase wasn't foolish and B) because they now can't afford to buy anything else.

There's no reason Tampa shouldn't have a direct L3 adjacency to each of Miami, Atlanta, Houston, and Charlotte, over diverse infrastructure to all four. Obviously there's room to add to or drop from that list, but it gets the point across.
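
To put rough numbers on that, here is a minimal sketch that brute-forces how many simultaneous link failures it takes to cut a market off from every neighboring city. Both topologies are hypothetical illustrations, not 3356's actual fiber map: the two-route case assumes one path toward Miami and one toward Houston, and the four-adjacency case assumes each link rides physically diverse infrastructure. The function names are mine.

```python
from itertools import combinations

def reachable(links, start):
    """Return the set of sites reachable from start over the given links."""
    adj = {}
    for a, b in links:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def min_cuts_to_isolate(links, site, targets):
    """Smallest number of link failures that cuts site off from every target."""
    for k in range(len(links) + 1):
        for failed in combinations(links, k):
            remaining = [link for link in links if link not in failed]
            if not (reachable(remaining, site) & targets):
                return k
    return len(links)

# Hypothetical topologies; each tuple is one physically independent path.
neighbors = {"Miami", "Atlanta", "Houston", "Charlotte"}
two_routes = [("Tampa", "Miami"), ("Tampa", "Houston")]
four_adjacencies = [("Tampa", city) for city in sorted(neighbors)]

print(min_cuts_to_isolate(two_routes, "Tampa", neighbors))        # 2
print(min_cuts_to_isolate(four_adjacencies, "Tampa", neighbors))  # 4
```

The brute force is only practical for a handful of links, and the answer is only as good as the diversity assumption: two adjacencies that share a conduit should be modeled as a single link.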



----- 
Mike Hammett 
Intelligent Computing Solutions 
http://www.ics-il.com 

Midwest-IX 
http://www.midwest-ix.com 

----- Original Message -----

From: "David Hubbard" <dhubbard at dino.hostasaurus.com> 
To: nanog at nanog.org 
Sent: Wednesday, May 16, 2018 11:59:42 AM 
Subject: Curiosity about AS3356 L3/CenturyLink network resiliency (in general) 

I’m curious if anyone who’s used 3356 for transit has found shortcomings in how their peering and redundancy are configured, or what a reasonable expectation would be. The Tampa Bay market has been completely down for 3356 IP services twice so far this year, each time for what I’d consider an unacceptable period (many hours). I’m learning that the entire market is served by just two fiber routes, through cities hundreds of miles away in either direction. So, basically, two fiber cuts, potentially 1000+ miles apart, take the entire region down. The most recent occurrence was a week or so ago, when a Miami-area cut and an Orange, Texas cut (1287 driving miles apart) took IP services down for hours. It did not take down point-to-point circuits to out-of-market locations, which suggests they have the ability to be more redundant and simply choose not to.

I feel like it’s not unreasonable to expect more redundancy, or a much smaller attack surface, given that a disgruntled lineman who knows the routes could take an entire region down with planned cuts four states apart. Maybe other regions are better designed? Or are my expectations unreasonable? I carry three peers in that market, so it hasn’t been outage-causing. But I use 3356 in other markets too, and have plans for more, so it makes me wonder if I just haven't had the pleasure of similar outages elsewhere yet and should factor that expectation into the design. It creates a problem for me in one location where I can only get them and Cogent, since Cogent can't be relied on for IPv6 service, which I need.

Thanks 





