Does anyone multihome anymore?
scg at gibbard.org
Wed Aug 22 19:26:47 UTC 2007
On Wed, 22 Aug 2007, Mike Tancsa wrote:
>> > Multihoming is great for when there is a total outage. In the case of
>> > Cogent on Monday, it wasn't "down"... In this case, there is only so much
>> > you can do to influence how packets come back at you, as BGP doesn't know
>> > anything about "lossy" or slow connections.
>> > ---Mike
>> Take the carrier that is causing you issues out of your eBGP setup and
>> all's well....
> In my case, I have 6453 and 174 for transit. I want to get to 577, which is
> directly connected to 6453 and 174. 577 has a higher local pref on paths via
> 174. Short of shutting my 174 session (or some deaggregation), I don't have a
> way to influence how 577 gets back to me. I can easily exit out 6453, but it
> does nothing for the return packets. I have enough capacity on 6453 to
> handle all my traffic, but it's a Draconian step to take and some traffic via
> 174 is fine and would be worse if I fully shut the session (i.e., peers of 174
> in Toronto).
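The situation quoted above can be sketched in a few lines of Python. This is a deliberately simplified model of BGP best-path selection (only local-pref and AS path length, in that order), not a full implementation. The local-pref values, the customer AS number 64500 (taken from the documentation range), and the names `Route` and `best_path` are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    via: int          # neighbor AS the route was learned from
    local_pref: int   # set by the receiving network's own policy
    as_path: tuple    # AS numbers, nearest first


def best_path(routes):
    # Simplified decision: highest local-pref wins; AS path length
    # is only consulted to break a local-pref tie.
    return max(routes, key=lambda r: (r.local_pref, -len(r.as_path)))


CUSTOMER_AS = 64500  # hypothetical AS number

# AS 577's view of the customer's prefix (local-pref values are made up).
via_174 = Route(via=174, local_pref=200, as_path=(174, CUSTOMER_AS))
via_6453 = Route(via=6453, local_pref=100, as_path=(6453, CUSTOMER_AS))
chosen_normally = best_path([via_174, via_6453])

# Prepending toward 174 lengthens the path but never touches local-pref.
prepended = Route(via=174, local_pref=200,
                  as_path=(174,) + (CUSTOMER_AS,) * 4)
chosen_after_prepend = best_path([prepended, via_6453])

# Shutting the 174 session withdraws that route entirely.
chosen_after_shutdown = best_path([via_6453])

print(chosen_normally.via, chosen_after_prepend.via, chosen_after_shutdown.via)
```

In this model, return traffic from 577 keeps arriving via 174 no matter how much is prepended, and only moves to 6453 once the route via 174 is gone. Deaggregation works differently (a more specific prefix wins before this comparison is ever made) and is not modeled here.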
I'm posting too much this week and should stop, but...
Again, this is a matter of thinking about design goals. What were you
trying to accomplish when you bought redundant connections? It probably
wasn't "redundancy," but rather something that redundancy would give you.
What redundancy gives you is a better statistical probability that not all
of the redundant components will be broken at once.
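As a rough worked example of that statistical point: if failures are independent (a big assumption, since real outages are often correlated), the chance that every redundant component is broken at once is the product of the individual chances. The 1% per-link figure below is made up for illustration, and `prob_all_down` is a hypothetical helper name.

```python
def prob_all_down(per_link_unavailability: float, links: int) -> float:
    """Probability that every link is down at once, assuming
    independent failures with the same per-link unavailability."""
    return per_link_unavailability ** links


HOURS_PER_YEAR = 365 * 24

# Hypothetical figure: each transit link is unusable 1% of the time.
single = prob_all_down(0.01, 1)
dual = prob_all_down(0.01, 2)

print(f"one link:  {single * HOURS_PER_YEAR:.1f} hours/year with no working transit")
print(f"two links: {dual * HOURS_PER_YEAR:.2f} hours/year with no working transit")
```

The second link does not make anything unbreakable; it only shrinks the expected time with nothing working, and only to the extent the failures really are independent.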
It should be noted that multi-homing is just one of many areas of possible
redundancy. Anything else that can break (routers, switches, cables,
etc.) can be set up redundantly. No amount of redundancy in any of
those components guarantees reliability. What it does mean is that your
network can keep functioning if some components break, as long as you
still have enough of whatever component it is to keep running.
So, in a redundant setup, what happens when a component breaks? In an
ideal situation, it breaks cleanly, fail-over happens automatically, and
nobody notices. Then you just have to hope your monitoring system is good
enough that you know there's something to fix. But, in an ideal
situation, things wouldn't break at all, so designing your procedures
around "ideal" failure scenarios doesn't make much sense. What redundancy
really gives you is the ability to have outages not turn into major
disruptions; the ability, when you see that a component is malfunctioning,
to turn it off and go back to sleep. You can then do the real fix later,
when it's more convenient or less disruptive.
Thought about that way, there's nothing "Draconian" about turning off a
connection (or a switch, or a router, or any other redundant component)
that's not doing what you want it to. Instead, you're taking advantage of
a main feature of your design. If your other providers are doing 95th
percentile billing, you even have about a day and a half per month (the
top 5% of samples, which get discarded) that you can leave a connection
down at no extra cost on those providers. The alternative, as you
seem to have noticed, is to spend your day stressing out about your
network not working properly, and complaining about being helpless. You
don't need redundancy for that.
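The "day and a half" figure can be checked with a short sketch. It assumes 5-minute samples, a 30-day month, and the common method of sorting the samples, discarding the top 5%, and billing on the highest one left; actual provider contracts may differ. The flat 100 Mbps baseline and the 150 Mbps of shifted traffic are made-up numbers, and the function names are hypothetical.

```python
SAMPLE_MINUTES = 5
SAMPLES_PER_MONTH = 30 * 24 * 60 // SAMPLE_MINUTES  # 30-day month


def billable_95th(samples):
    """Sort the samples, discard the top 5%, bill on the highest one left."""
    ordered = sorted(samples)
    discarded = len(ordered) * 5 // 100
    return ordered[len(ordered) - discarded - 1]


def month_with_failover(baseline_mbps, failover_mbps, failover_samples):
    """Flat baseline, plus shifted-over traffic during the first N samples."""
    return [baseline_mbps + (failover_mbps if i < failover_samples else 0)
            for i in range(SAMPLES_PER_MONTH)]


# Number of samples that fall in the discarded top 5%, and the time they cover.
free_samples = SAMPLES_PER_MONTH * 5 // 100
free_hours = free_samples * SAMPLE_MINUTES / 60

# Extra traffic that fits inside the discarded samples does not move the bill...
within = billable_95th(month_with_failover(100, 150, free_samples))
# ...but a single sample more than that does.
beyond = billable_95th(month_with_failover(100, 150, free_samples + 1))

print(free_hours, within, beyond)
```

With a flat baseline the free window comes out to 36 hours. Real traffic has peaks of its own that compete for the same discarded 5%, so in practice the window is somewhat smaller.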