shim6 @ NANOG (forwarded note from John Payne)
jabley at isc.org
Wed Mar 1 15:07:39 UTC 2006
On 1-Mar-2006, at 02:56, Kevin Day wrote:
> On Mar 1, 2006, at 12:47 AM, Joe Abley wrote:
>>> o a small to medium multi-homed tier-n isp
>> A small-to-medium, multi-homed, tier-n ISP can get PI space from
>> their RIR, and doesn't need to worry about shim6 at all. Ditto
>> larger ISPs, up to and including the largest.
> If you include "Web hosting company" in your definition of ISP,
> that's not true.
Right. I wasn't; I listed them separately.
It's important to note that even if you are a hosting company who
*does* qualify for PI v6 space, you still need shim6-capable servers,
if you want to make them optimally available to multi-homed, shim6-
capable hosts. The difference PI makes is in the distribution of
addresses to servers (the servers only need a single set).
> You don't get PI space, and Shim6 is looking like your only
> alternative for multihoming.
Right. For a hosting company with multiple PA netblocks, shim6 is the
option on the table.
> Many content providers set up multiple non-interconnected POPs in
> different geographical locations. The only way this can be
> accomplished is by making separate announcements in each POP for
> each space. This means either being able to deaggregate, or to get
> a block for each POP. I don't know of *ANY* that are deploying 5000
> + servers per POP.
Right. With shim6, getting a block per POP is trivial, since they are
all PA assignments from transit providers.
> I'm just one guy, one ASN, and one content/hosting network. But I
> can tell you that to switch to using shim6 instead of BGP speaking
> would be a complete overhaul of how we do things.
You are not alone in fearing change.
> Putting routing decisions in the control of servers we don't
> operate scares me. I wouldn't rely on 90% of our customers to get
> this right unless it was completely idiot proof. Even if it was, I
> don't see how we can trust that users aren't messing with things to
> "game the system" somehow.
This is the kind of feedback that the shim6 architects need. There is
talk at present of whether the protocol needs to accommodate a
middlebox function that enforces site policy in cases where host
behaviour needs to be controlled. The scope of that policy mediation
function depends strongly on people like you saying "at a high level,
this is the kind of decision I am not happy with the hosts making".
> We deal with long lived TCP sessions (hours/days). I don't see how
> routing updates can happen that won't result in a disconnect/
> reconnect, which isn't acceptable.
One of the primary objectives of shim6 is to provide session
survivability over re-homing events. Since routing protocols are not
used to manage re-homing, the speed at which a session can recover
from a topological event depends on the operation of the shim6
protocol between client and server.
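To make the idea concrete, here is a toy model of what "session survivability" means here: the transport keeps talking to one fixed identifier while the shim swaps the locator underneath. This is only a sketch under my own assumptions; the class name ShimSession, the rehome() method and the reachable callback are hypothetical and not taken from any shim6 specification or implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ShimSession:
    """Toy model: the transport sees a fixed identifier, the shim swaps locators."""
    ulid: str                      # address the transport layer keeps using
    locators: List[str]            # peer addresses known for this session
    current: str = field(init=False)

    def __post_init__(self) -> None:
        # Start out using the identifier itself as the locator.
        self.current = self.ulid

    def rehome(self, reachable: Callable[[str], bool]) -> bool:
        """Switch to the first reachable locator; report whether one was found."""
        if reachable(self.current):
            return True            # current path still works, nothing to do
        for candidate in self.locators:
            if candidate != self.current and reachable(candidate):
                self.current = candidate
                return True
        return False               # no working locator: the session is stuck
```

The point of the sketch is that ulid never changes across a rehome() call, which is why a long-held TCP session would not need to disconnect and reconnect.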
It seems reasonable to say that in some cases shim6 re-homing
transitions will be faster than the equivalent routing transition in
v4; in other cases they will be slower. Depends on the network, and
how enthusiastically you flap, perhaps.
The experience of people who provide services involving long-held TCP
sessions is exactly the kind of thing that the shim6 architects need
to hear about.
> We have peering arrangements with about 120 ASNs. How do we mix BGP
> IPv6 peering and Shim6 for transit?
You advertise all your PA netblocks to all your peers.
> So far it looks like Shim6 is going to rely on DNS. The DNS caching
> issue is a real problem. We need changes to happen faster than DNS
> caching will allow.
Well, not quite.
If you change a transit provider, then you need to remove a set of
AAAA records from the servers you operate, and substitute a new set.
The time taken for this change to propagate in the DNS is non-zero,
assuming you use reasonable TTLs. This is your point above, I think.
With shim6-capable clients and servers, the dark period during which
the changes propagate is handled by an address selection/retry
algorithm in the client (for new sessions) and by the shim6 protocol
doing failure detection and selecting a new locator (for established
sessions).
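For the new-session half of that, the retry behaviour can be sketched roughly as follows. This is a minimal illustration, assuming the client already holds the list of addresses it got from the DNS; the function name connect_first_working and the injectable connector parameter are my own inventions for the example, and the addresses in any test would be documentation prefixes, not real ones.

```python
import socket
from typing import Callable, Iterable, Optional


def connect_first_working(
    addresses: Iterable[str],
    port: int,
    timeout: float = 3.0,
    connector: Optional[Callable[..., socket.socket]] = None,
) -> socket.socket:
    """Try each candidate address in turn and return the first open socket."""
    # socket.create_connection is the stdlib default; tests can pass a fake.
    connect = connector or socket.create_connection
    last_error: Optional[OSError] = None
    for address in addresses:
        try:
            return connect((address, port), timeout)
        except OSError as exc:
            last_error = exc       # stale or unreachable address: try the next
    raise last_error or OSError("no candidate addresses supplied")
```

A stale AAAA record left over from the old transit provider just costs one timeout before the next address is tried. Python's own socket.create_connection already loops over getaddrinfo results in much the same way when given a hostname.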
Once the DNS change has propagated, the address selection and shim6
band-aids are no longer required, and clients have an accurate set of
Renumbering for hosting providers can be a monstrous pain in the
neck, especially for hosting providers who rely on third parties (or,
horrors, their customers) to maintain the zone files within which
services are named.
Some hosting providers of my acquaintance insist on customer zones
being redelegated to the hosting providers' nameservers, so that any
renumbering that needs to happen can be coordinated by the hosting
provider directly. Hosting providers who don't do this, and who use
PA addresses with shim6 to multi-home, are definitely going to face
> Our network is complicated. We have a /21 that's split into 4 /23s.
> One for each non-interconnected POP. We only advertise the /23 for
> each POP out to transit, but we give peers access to our entire
> network wherever they peer with us and we pay to haul/tunnel it
> around. How do we even do this without PI space, let alone through
You avoid it completely, and use PA space in every POP. You can still
announce PA space from other POPs to peers, if you want to retain
> For quite the foreseeable future, we'd be running IPv4 and IPv6 at
> the same time, over the same transit connections. We'd have to TE
> our IPv6 bits completely differently than our IPv4 bits, even
> though we'd be billed for the aggregate usage of both. Automated
> tools for tweaking total usage per transit port is hard enough in
> BGP. Having to tweak both BGP and some external shim6 method of TE
> when the goal is a common aggregate number is going to be a very
> difficult issue.
Yep. Difficult and expensive.
> Some of our applications are extremely sensitive to jitter/latency.
> We've spent ages tweaking route-maps manually (and through
> automated continual tweaking) to make sure we avoid any congested
> links. [...]
The site-policy middleware that I alluded to earlier seems like the
analogous place to specify this policy. Such a facility might
actually give you more control than you have now -- tweaking BGP
attributes to accomplish this kind of thing is often like a game of
whack-a-mole; if you were able to control the route taken by traffic
in both directions by influencing the locator selection for each and
every session, you'd have far greater, and more fine-grained, control
over your external traffic than BGP/swamp-abuse gives you currently.
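As a rough picture of what such a policy could look like, here is a sketch that picks a (source, destination) locator pair for a session from a site-supplied cost table. Everything in it is hypothetical: pick_locator_pair, the site_cost table keyed on the local locator, and the default cost are assumptions for illustration, since no such policy interface has been specified.

```python
from itertools import product
from typing import Dict, List, Tuple

LocatorPair = Tuple[str, str]


def pick_locator_pair(
    local: List[str],
    remote: List[str],
    site_cost: Dict[str, int],
    default_cost: int = 100,
) -> LocatorPair:
    """Pick the (source, destination) pair whose source locator costs least."""
    pairs = product(local, remote)
    # min() returns the first minimal item, so ties keep the listed order.
    return min(pairs, key=lambda pair: site_cost.get(pair[0], default_cost))
```

Since each local locator comes out of one transit provider's PA block, lowering the cost of a locator steers both directions of that session through that provider, which is the per-session control I mean.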
Your specific requirements in this regard (the high-level objectives
that you currently meet using BGP) would no doubt be gratefully
received on the shim6 list.
> We'd still be relying on PA space. No matter how great dhcp6 is,
> there will be significant renumbering pain when providers are
> changed. Static ACLs, firewall rules, etc. If you're including
> customer machines in the renumbering, many simply won't do it.
Agreed, renumbering is a pain. Dhcp6 sounds like a scary thing to use
with servers. Customers suck. Change in operational practices will be
Lest I sound too much like a foam-at-the-mouth shim6 advocate, I
think it would be perfectly fine if, in the final analysis, the
conclusion was that shim6 and PA/renumbering was not an option for
hosting providers. A reasoned technical argument which came to that
conclusion would provide a solid basis for the RIRs to modify their
allocation policies such that hosting providers could use PI space
instead. As perhaps the recent attempt to change the v6 PI policy
indicates, the chances of making changes without such a reasoned
argument are slim.
However, I think it's possible that shim6, incorporating some
facility for a site to manage the locator selection of the hosts,
could actually make some things easier for hosting providers. There
might even be reasons to like it :-)