N+? redundancy

Michael.Dillon at radianz.com
Fri Apr 15 16:10:35 UTC 2005


> >  And a very few population centers such as New York,
> >  London, Tokyo, and Cheyenne Mountain should probably
> >  have more than 5 paths.
> 
>    I disagree.  They may need that many spare paths beyond what is 
> required to provide their services, but in my experience pretty much 
> all infrastructure services are overloaded (or at least heavily 
> loaded), and even if they have N+M "redundancy", eliminating just one 
> or two of those "M" links will be enough to overwhelm the others and 
> take everything down in a complete cascade failure.

Now you are talking about path capacity as well as
the separacy issue. Let's consider the situation where
you are designing a network to serve a city of over
1 million population. According to the rule of thumb,
5 paths are enough. What does this mean in practice?

First, it probably means that you need to have 5
PoPs fully meshed within the city, or perhaps a larger
number of PoPs in a partial mesh, so that you can get
traffic from any point in the city to one of your 5
exits. However, the rule of thumb doesn't talk about
this at all.

Let's further simplify and consider the city to be
a node, with 5 paths radiating out from it. One path
could fail and the traffic from that path would have
to be distributed to the other 4. As you have pointed
out, this doesn't work unless all paths are running
at no more than four fifths of capacity. Presumably, since
IP circuits cannot run at 100% capacity, the 5 circuits
are at somewhat less than 80%, but for now, let's
just assume that they are running at no more than
80%. This is an N+1 scenario where one link can fail
and service continues. But if two links fail, then
3 links must carry all the traffic, so all 5 links
must run at no more than 60% of capacity. This
would be an N+2 scenario.
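
The arithmetic above can be sketched in a few lines of Python. This is
only an illustration under simplifying assumptions that are mine, not
part of the rule of thumb: all links have equal capacity, carry equal
load, and the load of a failed link is spread evenly over the
survivors. The function name is made up for this example.

```python
def max_safe_utilization(total_links: int, failures: int) -> float:
    """Highest per-link utilization (as a fraction of capacity) at which
    the surviving links can still carry all traffic after `failures`
    links go down.

    Assumes equal-capacity links, equal load per link, and even
    redistribution of traffic from failed links (a simplification).
    """
    if not 0 <= failures < total_links:
        raise ValueError("failures must be between 0 and total_links - 1")
    # Total load is total_links * u; the survivors offer
    # (total_links - failures) units of capacity, so
    # u <= (total_links - failures) / total_links.
    return (total_links - failures) / total_links


if __name__ == "__main__":
    # The 5-path city from the text: N+0, N+1 and N+2.
    for m in range(3):
        limit = max_safe_utilization(5, m)
        print(f"5 links, survive {m} failure(s): keep each link at or below {limit:.0%}")
```

In practice the limit would be lower still, since IP circuits
cannot usefully run at 100% of capacity.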

In the rule of thumb, I didn't really consider
what failover scenario was right because it isn't
a rule of thumb if it goes into too much detail.
But the numbers, 1, 2, 3, 5, were chosen because
I think that the sites which close up every
evening can live with N+0, and the others
can probably live with N+1, except for
population centers above 1 million, where
connectivity to the rest of the world is important
enough to have N+2 overall.

In the real world, at the city level, there is 
more than one company providing the connectivity
so it becomes tricky to analyze the true connectivity.
Remember, paths are not equal to circuits. Therefore,
the aggregate of all circuits from all companies which
connect Chicago to St. Louis should count as one path.
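
As a rough sketch of that counting rule, one could collapse circuits
into paths by city pair. The carrier names and the circuit list below
are invented purely for illustration, and treating every distinct
neighboring city as one path is my simplification of the idea; real
physical routes can share conduit in ways this ignores.

```python
def count_paths(city, circuits):
    """Count paths out of `city`, treating every circuit between the
    same pair of cities as part of a single path, whoever owns it.

    `circuits` is an iterable of (carrier, endpoint_a, endpoint_b).
    """
    neighbors = set()
    for _carrier, a, b in circuits:
        if a == city:
            neighbors.add(b)
        elif b == city:
            neighbors.add(a)
    return len(neighbors)


# Hypothetical data: four circuits, but only two distinct paths out of Chicago.
example_circuits = [
    ("CarrierA", "Chicago", "St. Louis"),
    ("CarrierA", "Chicago", "St. Louis"),
    ("CarrierB", "St. Louis", "Chicago"),
    ("CarrierB", "Chicago", "Indianapolis"),
]

if __name__ == "__main__":
    print(count_paths("Chicago", example_circuits))
```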

I'd love to see someone do a serious academic 
analysis along these lines to see what kind of 
"rules of thumb" have EMERGED from the past decades
of network building and consolidation. It would
be interesting if this type of research compared 
the network's topology to the topology of villages,
market towns and cities which is remarkably uniform
across continents and civilizations.

--Michael Dillon



