shim6 @ NANOG (forwarded note from John Payne)
toasty at dragondata.com
Wed Mar 1 01:16:02 UTC 2006
On Feb 28, 2006, at 1:22 PM, Iljitsch van Beijnum wrote:
> On 28-feb-2006, at 17:09, Kevin Day wrote:
>> 4) Being able to do 1-3 in realtime, in one place, without waiting
>> for DNS caching or connections to expire
> How fast is real time?
> And are we just talking about changing preferences here, or about
> what happens when there are outages?
5-30 seconds? Including already established connections.
"Oh, crap. We're going over our commit on provider C because of a
traffic surge on one of our sites. We need to rebalance this before
we get dinged for 95th percentile overage."
"Packet loss to AS1234 through provider A suddenly skyrocketed. We
need to bypass A to that ASN until it's fixed."
"1 of the 2 lines in our trunk to provider B went down, we're at half
bandwidth. We need to shed some load immediately."
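To make the 95th percentile concern in the first scenario concrete, here is a minimal sketch of how a burstable-billing figure is commonly computed. It assumes 5-minute rate samples and the nearest-rank method; actual billing rules vary by provider, and the function names and sample values are made up for illustration.

```python
def percentile_95(samples_mbps):
    """Return the 95th percentile of rate samples using nearest-rank.

    Assumption: one sample per 5-minute interval, as is common for
    burstable billing. Providers differ in the exact method.
    """
    if not samples_mbps:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_mbps)
    n = len(ordered)
    # Integer ceiling of 0.95 * n, avoiding floating-point rounding.
    rank = (95 * n + 99) // 100
    return ordered[rank - 1]


def over_commit(samples_mbps, commit_mbps):
    """True if the billable rate exceeds the committed rate."""
    return percentile_95(samples_mbps) > commit_mbps


# Hypothetical month scaled down to 100 samples, 100 Mbps commit.
quiet = [80] * 95 + [150] * 5   # bursts fit inside the discarded top 5%
surge = [80] * 94 + [150] * 6   # one more bursty sample and we are billed for it

print(over_commit(quiet, 100))
print(over_commit(surge, 100))
```

The point of the sketch is that a surge only has to push a few more samples over the commit to change the bill, which is why the rebalancing has to happen within seconds or minutes rather than hours.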
We also have incredibly long TCP sessions for some of our services
(streaming video/audio). We need to be able to make routing changes
while those are active, without relying on a keepalive failing to
make the hosts re-evaluate their path decision. If I'm a VoIP
provider, I can't wait for someone to hang up a phone call for new
routing policy to take effect. A VPN provider could have sessions
open for days/weeks.
We make extensive use of near-immediate routing changes on both
inbound and outbound, relying on the fact that they take effect
immediately. No matter where we put the routing information, how are
the end nodes that are now making the routing decisions going to see
the changes quickly? And how do they see changes for already
established connections?
Anything done in DNS is just too slow. As an example, take a busy/
popular website. Put a 5 minute TTL on the records weeks in advance.
Change the IP and watch how long it takes for 100% of the traffic to
stop reaching the old IP. 90% within 1-3 hours, 99% within 24 hours.
You'll still get hits to the old IP days later. Too many people
blatantly disregard DNS caching, or just get it wrong.
>> 5) Being able to make routing/policy changes without having to
>> rely on the owners/administrators of the machines/sites/domains
>> themselves to do the right thing. (i.e. untrusted/not-maintained-
>> by-us systems/networks on our network)
> If you're a multihomed hosting company you would want to do TE for
> your entire POP, but you wouldn't necessarily be able to change
> information in the DNS for all the hosts/services that your
> customers run. Is that what you mean?
Exactly. More detail in my followup message.
>> 6) Anycast?
> I don't think shim6 applies to interdomain anycast. (Which is a
> hack anyway.)
Well, it's a hack that many people are using. If we can't do anycast
after we migrate to IPv6, that again raises the bar of transitioning.
>> 7) During what will be a very lengthy dual-stack transitional
>> period, having to do TE in two entirely different ways. BGP
>> +Prepending+Selective-announcements alongside Shim6 doesn't
>> really sound like fun to me. We can't treat bits as bits, we have
>> to consider if they're IPv4 bits or IPv6 bits, and engineer them
>> differently, even though they're sharing the same lines and are
>> probably going to have a 1:1 addressing relationship between IPv4
>> and IPv6 services.
> This is a result of the transition to IPv6, regardless of shim6.
It is, but it's one more thing in the list of "We have to do things
differently, and it's questionable if it's better - if not flat out
worse" things about moving to IPv6. From a hosting company's standpoint:
Pros:
1) Virtually unlimited IP space
Cons:
1) Even if you qualified for PI space in IPv4, unless you're huge,
you're not getting PI space in IPv6. Want to change providers? You're
renumbering all of your customers.
2) If you do need to move, your new provider can't temporarily
announce your space from your old provider, which is possible now.
3) No matter how easily configurable IPv6 makes renumbering, you are
going to have customers leave rather than deal with readdressing.
Some just won't respond/do anything at all no matter how much you
harass them that they need to take an action. "Big" hosting companies
who do enough connectivity sales to justify PI space get the upper hand.
4) Once you publish AAAA records, every user who has broken their
IPv6 stack on their desktop (even if they don't have IPv6
connectivity at all) suddenly can't reach you.
5) The only proposal that looks like it has any traction at all to
multihome (shim6) requires trust in customers to administer their
boxes to our instructions a lot more closely, and/or requires control
over DNS for each site we host.
6) If you do get PI space, the mantra of "Announce only/exactly what
you were allocated. No more specifics. No deaggregation." requires a
complete redesign of how a lot of us do things.
And now adding shim6 to the mix:
7) You can't run BGP or traffic engineer your network the way you're
doing with IPv4. You now have two places you have to make routing
policy decisions, and they're done in completely different ways.
8) If you're using shim6, public/private peering is probably not
possible either. (And yes, there are those who participate in peering
arrangements who don't provide transit to others, and wouldn't
qualify for PI space)
The "migrate to IPv6" pain vs. benefit ratio for those actually
running the content side of the internet is pretty poor at the
moment. I don't think you'll be finding many doing it willingly at
this stage, or in the foreseeable future.
And don't confuse this with laziness or some dislike of IPv6. I went
into our transition attempt really wanting to make this work, and
eventually dropped it because it would require too many
business-model-changing transitions to do so.
>> On top of those, even if shim6 accomplishes the failover and
>> reliability goals, I can't see how shim6 is going to make path
>> decisions as optimal as IPv4/BGP/etc.
> Really??? The way I see it, BGP decisions today are mediocre at
> best. If anything, I would expect things to get better with shim6.
BGP has the benefit of each network in the middle being able to add
their say into things. Each transit network can prepend/localpref/med/
etc to produce an end-to-end decision. Shim6 presents both ends with
multiple choices, but little in the way of information as to which
one to prefer. It's also moving the decision making into LOTS of
equipment, instead of the borders. Any fancy ideas we come up with to
make better decisions have to be deployed everywhere, and possibly on
equipment we don't control.
BGP allows information to be added to the routing decision making
process that isn't visible from each end. We're making use of that now.
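As a rough illustration of what I mean by intermediate networks adding their say, here is a heavily simplified sketch of BGP best-path selection. It only models three of the real tie-breakers (local preference, AS path length, MED), and it compares MED across all routes, which real implementations by default only do between routes from the same neighboring AS. The Route class, provider labels, and AS numbers are hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class Route:
    provider: str                # hypothetical label for the upstream
    as_path: Tuple[int, ...]     # prepending by any AS lengthens this
    local_pref: int = 100        # set by our own policy; higher wins
    med: int = 0                 # hint from the neighbor; lower wins


def best_route(routes):
    """Pick a route: highest local-pref, then shortest AS path, then lowest MED.

    Simplification: the real decision process has more steps and
    restricts when MED is compared.
    """
    return min(routes, key=lambda r: (-r.local_pref, len(r.as_path), r.med))


# Same destination (AS 64510) learned through two providers.
via_a = Route("A", as_path=(64500, 64510))
# A network along B's path prepended its AS twice to look less attractive.
via_b = Route("B", as_path=(64501, 64501, 64501, 64510))

print(best_route([via_a, via_b]).provider)

# Packet loss through A: override at our border with local-pref, in one place.
via_b_preferred = replace(via_b, local_pref=200)
print(best_route([via_a, via_b_preferred]).provider)
```

The prepend in the first case is information a transit network contributed; the local-pref in the second is a change made once at the border. Neither is something an end host choosing between locator pairs would see on its own.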