shim6 @ NANOG (forwarded note from John Payne)

Iljitsch van Beijnum iljitsch at muada.com
Tue Feb 28 19:04:02 UTC 2006


On 28-feb-2006, at 16:34, Todd Vierling wrote:

>> A B Y
>> C C C D Y

>> All else being equal, X will choose the path over A to reach Y.

> There's plenty of route mangler technologies out there that provide
> overriding BGP information to borders that trumps path length.
> "All else" is often not as equal as you seem to expect.

> It's time to wake up and smell the intelligent routing trend.  The
> usefulness of prepending is rapidly dwindling.  Don't try to push it
> as a future-compatible solution; it is not.  Prepending is not a
> tool; it is a hack that has outlived its usefulness.

In my experience, if anything, AS path prepending is TOO effective:
just one prepend can turn a 60/40 split that you're trying to get to
50/50 into 25/75 instead. So I agree that it's not as useful as it
used to be, but I had blamed this on the flattening of the AS
interconnection hierarchy. Maybe it's the routing/TE boxes that are
responsible instead.
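
To illustrate why a prepend is such a blunt instrument, here is a
minimal Python sketch. It assumes path length is the only selection
criterion (real BGP checks local preference and other attributes
first), and the helper names best_path and prepend are made up for
this example. Every network that saw the two paths as tied moves over
wholesale as soon as one path gets longer:

```python
# Hypothetical toy model: choose a route on AS-path length alone.
# Real BGP evaluates local preference and other attributes before length.

def best_path(paths):
    """Return the candidate with the fewest AS hops; ties go to the first listed."""
    return min(paths, key=len)

def prepend(path, count):
    """Repeat the first AS in the path `count` extra times, as prepending does."""
    return [path[0]] * count + path

via_a = ["A", "B", "Y"]
via_c = ["C", "D", "Y"]

# Equal lengths: this toy model simply keeps the first candidate.
tied_choice = best_path([via_a, via_c])

# Two prepends give the quoted path C C C D Y, which now loses on length.
prepended_c = prepend(via_c, 2)
prepended_choice = best_path([via_a, prepended_c])

print(tied_choice)
print(prepended_c)
print(prepended_choice)
```

There is no setting between "tied" and "longer" in this model, which
is consistent with a small nudge being hard to achieve with prepends.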

>> Another capability that would be hard to replicate with shim6 is
>> selective announcement.

> Now, selective announcement is something completely different -- but
> it's still a historical hack for lack of better mechanisms in BGP[34].
> If the route isn't there at all, it won't be selected in today's world.

Right. That would be hard to accomplish with shim6.

> But also consider this:

> - C does not advertise the prefix for Y, but it does have the next
>   superprefix for Y (and C is "transit", so the superprefix must be
>   considered valid);

> - X's link to A dies.

> So X will still try to push packets over C to reach Y, and per the
> existence of the superprefix on C, that route should[!] be valid.
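
The scenario quoted above can be sketched as a longest-prefix-match
lookup in Python. This is only an illustration: the prefixes are
hypothetical ones taken from the 2001:db8::/32 documentation range,
the lookup function is invented for this example, and the table is a
plain dict rather than anything a real router uses.

```python
import ipaddress

def lookup(table, destination):
    """Longest-prefix match: return (prefix, neighbor), or None if nothing covers it."""
    addr = ipaddress.ip_address(destination)
    matches = [(prefix, nbr) for prefix, nbr in table.items() if addr in prefix]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)

# Hypothetical routes as seen by X.
table = {
    ipaddress.ip_network("2001:db8:1234::/48"): "A",  # Y's prefix, heard via A only
    ipaddress.ip_network("2001:db8::/32"): "C",       # superprefix, heard via C
}

host_in_y = "2001:db8:1234::1"
before = lookup(table, host_in_y)

# X's link to A dies: every route learned from A is withdrawn.
after_failure = {p: nbr for p, nbr in table.items() if nbr != "A"}
after = lookup(after_failure, host_in_y)

print(before)  # the /48 via A
print(after)   # falls back to the /32 via C
```

Whether packets sent via C then actually arrive depends on C having
working reachability to Y, which the lookup alone cannot tell.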

This kind of thing is, as far as I can see, pretty much impossible to
replicate in shim6. Mind you, even if we end up with PI in IPv6, it's
unlikely that you'd get to do this with IPv6, because the address
space and the provider aggregates are so large that deaggregating
becomes a hazard rather than a nuisance. Deaggregating a /32 into /48s
makes for up to 65536 additional routes, which is a third of the
current IPv4 routing table (and several dozen times the current IPv6
routing table). So I think most people will use strict prefix length
filters to avoid this. At least, after it has happened for the first
time.
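
The arithmetic behind that number, plus a sketch of what a strict
prefix length filter amounts to, in Python. The aggregate is a
documentation prefix used purely for illustration, the function names
are invented here, and the /32 cutoff in the filter is an assumed
policy, not something anyone has standardized:

```python
import ipaddress

def more_specifics(aggregate, new_len):
    """Number of distinct prefixes of length new_len inside the aggregate."""
    return 2 ** (new_len - aggregate.prefixlen)

def accept(prefix, max_len=32):
    """Assumed filter policy: drop anything longer than max_len bits."""
    return ipaddress.ip_network(prefix).prefixlen <= max_len

aggregate = ipaddress.ip_network("2001:db8::/32")  # illustrative aggregate
count = more_specifics(aggregate, 48)
print(count)  # 2 ** 16 = 65536

print(accept("2001:db8::/32"))       # True: the aggregate passes
print(accept("2001:db8:1234::/48"))  # False: the deaggregate is filtered
```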

> Don't think this will forever be a rare circumstance, either.  The
> route mangling technologies I mentioned above are now starting to
> offer the ability for traffic to go out a "transit" neighbor so long
> as some containing prefix is advertised (even if it's not the most
> specific).

> Traffic engineering is happening on both ends of the BGP mesh *today*,
> so you should present any proposed solution in that context.

I'm not too worried about what happens on both ends: since both ends  
implement the shim protocol and the two ends communicate with each  
other, we can build in whatever is required. The challenges are:

- getting site-wide policies into the individual hosts, or applying
  site-wide policies in middleboxes, in a secure way
- coming up with a reasonable way to have information "in the middle"
  taken into account

And we have to figure out which capabilities must be present as a
mandatory part of the specification on day one, and which can be
optional and/or added later. (Ideally, all TE is kept outside of the
base spec because modularity makes everything easier, but some stuff
is only useful if it's everywhere, so it either has to be mandatory or
we forget about it, and other stuff is so important that we need it
from day one.)


