shim6 @ NANOG (forwarded note from John Payne)

Joe Abley jabley at isc.org
Wed Mar 1 15:07:39 UTC 2006



On 1-Mar-2006, at 02:56, Kevin Day wrote:

> On Mar 1, 2006, at 12:47 AM, Joe Abley wrote:
>>
>>>   o a small to medium multi-homed tier-n isp
>>
>> A small-to-medium, multi-homed, tier-n ISP can get PI space from
>> its RIR, and doesn't need to worry about shim6 at all. Ditto
>> larger ISPs, up to and including the largest.
>
> If you include "Web hosting company" in your definition of ISP,  
> that's not true.

Right. I wasn't; I listed them separately.

It's important to note that even if you are a hosting company that
*does* qualify for PI v6 space, you still need shim6-capable servers
if you want to make them optimally available to multi-homed,
shim6-capable hosts. The difference PI makes is in the distribution of
addresses to servers (the servers only need a single set).

> You don't get PI space, and Shim6 is looking like your only  
> alternative for multihoming.

Right. For a hosting company with multiple PA netblocks, shim6 is the  
option on the table.

> Many content providers set up multiple non-interconnected POPs in  
> different geographical locations. The only way this can be  
> accomplished is by making separate announcements in each POP for  
> each space. This means either being able to deaggregate, or to get  
> a block for each POP. I don't know of *ANY* that are deploying
> 5000+ servers per POP.

Right. With shim6, getting a block per POP is trivial, since they are  
all PA assignments from transit providers.

> I'm just one guy, one ASN, and one content/hosting network. But I  
> can tell you that to switch to using shim6 instead of BGP speaking  
> would be a complete overhaul of how we do things.

You are not alone in fearing change.

> Putting routing decisions in the control of servers we don't  
> operate scares me. I wouldn't rely on 90% of our customers to get  
> this right unless it was completely idiot proof. Even if it was, I  
> don't see how we can trust that users aren't messing with things to  
> "game the system" somehow.

This is the kind of feedback that the shim6 architects need. There is
talk at present of whether the protocol needs to accommodate a
site-policy middlebox function, so that a site can enforce its policy
where host behaviour needs to be controlled. The scope of that policy
mediation function depends strongly on people like you saying "at a
high level, this is the kind of decision I am not happy with the
hosts making".

> We deal with long lived TCP sessions (hours/days). I don't see how  
> routing updates can happen that won't result in a
> disconnect/reconnect, which isn't acceptable.

One of the primary objectives of shim6 is to provide session  
survivability over re-homing events. Since routing protocols are not  
used to manage re-homing, the speed at which a session can recover  
from a topological event depends on the operation of the shim6  
protocol between client and server.
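
A minimal sketch of the session-survivability idea in Python, under
loudly stated assumptions: the Session class, its field names and the
is_working probe are hypothetical illustrations, not part of any shim6
specification or implementation. It only shows that the identifiers
seen by the application stay fixed while the locator pair used on the
wire changes after a failure.

```python
from dataclasses import dataclass


@dataclass
class Session:
    """Toy model: identifiers stay fixed, the locator pair may change."""
    local_id: str            # identifier presented to the application
    remote_id: str
    local_locators: list     # one address per upstream PA block (assumed)
    remote_locators: list
    current: tuple = None    # (source locator, destination locator) in use

    def __post_init__(self):
        # Start on the first locator pair.
        self.current = (self.local_locators[0], self.remote_locators[0])

    def rehome(self, is_working):
        """Switch to the first locator pair the hypothetical probe accepts."""
        for src in self.local_locators:
            for dst in self.remote_locators:
                if is_working(src, dst):
                    self.current = (src, dst)
                    return self.current
        raise ConnectionError("no working locator pair")
```

How quickly a real session recovers depends on how fast the failure
is detected and the alternative pairs are probed, which this sketch
does not model.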

It seems reasonable to say that in some cases shim6 re-homing
transitions will be faster than the equivalent routing transition in
v4; in other cases they will be slower. It depends on the network, and
on how enthusiastically you flap, perhaps.

The experience of people who provide services involving long-held TCP  
sessions is exactly the kind of thing that the shim6 architects need  
to hear about.

> We have peering arrangements with about 120 ASNs. How do we mix BGP  
> IPv6 peering and Shim6 for transit?

You advertise all your PA netblocks to all your peers.

> So far it looks like Shim6 is going to rely on DNS. The DNS caching  
> issue is a real problem. We need changes to happen faster than DNS  
> caching will allow.

Well, not quite.

If you change a transit provider, then you need to remove a set of  
AAAA records from the servers you operate, and substitute a new set.  
The time taken for this change to propagate in the DNS is non-zero,  
assuming you use reasonable TTLs. This is your point above, I think.

With shim6-capable clients and servers, the dark period during which  
the changes propagate is handled by an address selection/retry  
algorithm in the client (for new sessions) and by the shim6 protocol  
doing failure detection and selecting a new locator (for established  
sessions).
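
The address selection/retry behaviour for new sessions can be
sketched roughly as follows. This is an illustration only: the
function name connect_with_retry and the injected try_connect
callable are assumptions made for the example, and a real client
would also apply an address ordering policy and timeouts.

```python
def connect_with_retry(addresses, try_connect):
    """Try each candidate address in order; return the first that works.

    addresses   -- the AAAA records currently held for the server,
                   possibly including stale ones during a DNS change
    try_connect -- hypothetical callable that returns a connection or
                   raises OSError if the address is unreachable
    """
    errors = []
    for addr in addresses:
        try:
            return addr, try_connect(addr)
        except OSError as exc:
            # Remember the failure and fall through to the next address.
            errors.append((addr, exc))
    raise OSError(f"all {len(addresses)} addresses failed: {errors}")
```

A stale address left over from the old transit provider costs one
failed attempt per new session until the DNS change has propagated.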

Once the DNS change has propagated, the address selection and shim6  
band-aids are no longer required, and clients have an accurate set of  
information.

Renumbering for hosting providers can be a monstrous pain in the  
neck, especially for hosting providers who rely on third parties (or,  
horrors, their customers) to maintain the zone files within which  
services are named.

Some hosting providers of my acquaintance insist on customer zones  
being redelegated to the hosting providers' nameservers, so that any  
renumbering that needs to happen can be coordinated by the hosting  
provider directly. Hosting providers who don't do this, and who use  
PA addresses with shim6 to multi-home, are definitely going to face  
some challenges.

> Our network is complicated. We have a /21 that's split into 4 /23s.  
> One for each non-interconnected POP. We only advertise the /23 for  
> each POP out to transit, but we give peers access to our entire  
> network wherever they peer with us and we pay to haul/tunnel it  
> around. How do we even do this without PI space, let alone through  
> shim6?

You avoid it completely, and use PA space in every POP. You can still  
announce PA space from other POPs to peers, if you want to retain  
your tunnels.
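
For reference, the v4 addressing plan described in the quoted text (a
/21 split into four /23s, one per POP) looks like this with Python's
standard ipaddress module. The prefix 10.0.0.0/21 is a placeholder
chosen for the example; the real block is not given in the thread.

```python
import ipaddress

# Placeholder aggregate; the actual /21 is not stated in the thread.
block = ipaddress.ip_network("10.0.0.0/21")

# One /23 per non-interconnected POP.
per_pop = list(block.subnets(new_prefix=23))

for pop, net in enumerate(per_pop, start=1):
    print(f"POP {pop}: announce {net} to transit")
```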

> For quite the foreseeable future, we'd be running IPv4 and IPv6 at  
> the same time, over the same transit connections. We'd have to TE  
> our IPv6 bits completely differently than our IPv4 bits, even  
> though we'd be billed for the aggregate usage of both. Automated  
> tools for tweaking total usage per transit port is hard enough in  
> BGP. Having to tweak both BGP and some external shim6 method of TE  
> when the goal is a common aggregate number is going to be a very  
> difficult issue.

Yep. Difficult and expensive.

> Some of our applications are extremely sensitive to jitter/latency.  
> We've spent ages tweaking route-maps manually (and through  
> automated continual tweaking) to make sure we avoid any congested  
> links. [...]

The site-policy middlebox function that I alluded to earlier seems
like the analogous place to specify this policy. Such a facility might
actually give you more control than you have now -- tweaking BGP  
attributes to accomplish this kind of thing is often like a game of  
whack-a-mole; if you were able to control the route taken by traffic  
in both directions by influencing the locator selection for each and  
every session, you'd have far greater, and more fine-grained, control  
over your external traffic than BGP/swamp-abuse gives you currently.

Your specific requirements in this regard (the high-level objectives  
that you currently meet using BGP) would no doubt be gratefully  
received on the shim6 list.

> We'd still be relying on PA space. No matter how great dhcp6 is,  
> there will be significant renumbering pain when providers are  
> changed. Static ACLs, firewall rules, etc. If you're including  
> customer machines in the renumbering, many simply won't do it.

Agreed, renumbering is a pain. Dhcp6 sounds like a scary thing to use  
with servers. Customers suck. Change in operational practices will be  
required.

Lest I sound too much like a foam-at-the-mouth shim6 advocate, I  
think it would be perfectly fine if, in the final analysis, the  
conclusion was that shim6 and PA/renumbering were not an option for
hosting providers. A reasoned technical argument which came to that  
conclusion would provide a solid basis for the RIRs to modify their  
allocation policies such that hosting providers could use PI space  
instead. As perhaps the recent attempt to change the v6 PI policy  
indicates, the chances of making changes without such a reasoned  
argument are slim.

However, I think it's possible that shim6, incorporating some  
facility for a site to manage the locator selection of the hosts,  
could actually make some things easier for hosting providers. There  
might even be reasons to like it :-)


Joe


