multi-homing fixes

Iljitsch van Beijnum iljitsch at muada.com
Fri Aug 31 14:42:35 UTC 2001


On Thu, 30 Aug 2001, Sean M. Doran wrote:

> I probably still don't get it, but let me see if I understand
> the mechanism.

> First, assign a prefix to a particular non-topological "locus",
> such as a metropolitan area, or a continent.

How this is done is important, because it influences the number of
customers an ISP will have per bitmap. Assigning a prefix to a continent
wouldn't be a good idea, because that way every regional ISP has to
announce the very large bitmap for the entire continent, while most of it
contains just zeros. Per metro area would be better. But two ISPs that
have many multi-homing customers in common could use a prefix for just the
two of them, regardless of geography.
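
To put rough numbers on the size argument, here is a minimal sketch in Python. It assumes one bit per fixed-size block inside the shared prefix. The /24 granularity and the /8 and /16 prefix lengths are illustrative assumptions only, and `bitmap_size_bytes` is a hypothetical helper, not part of the proposal.

```python
def bitmap_size_bytes(aggregate_len: int, block_len: int) -> int:
    """Bytes needed to carry one bit per block of length block_len
    inside an aggregate of length aggregate_len (IPv4 prefix lengths)."""
    if not 0 <= aggregate_len <= block_len <= 32:
        raise ValueError("need 0 <= aggregate_len <= block_len <= 32")
    bits = 1 << (block_len - aggregate_len)  # number of blocks in the aggregate
    return (bits + 7) // 8                   # round up to whole bytes


# Assumed example sizes: a continent-sized /8 versus a metro-sized /16,
# both tracked at /24 granularity.
continent = bitmap_size_bytes(8, 24)   # 65536 bits
metro = bitmap_size_bytes(16, 24)      # 256 bits
print(continent, metro)
```

Under these assumptions every ISP announcing the larger prefix carries the full bitmap, even if only a handful of its bits are set.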

>  Second, networks
> inside that locus will announce only the prefix, but with these
> exception bits.  [Implied, but not stated: third, all these
> networks will exchange full information so as to be able to
> generate these exception bits].

The bitmaps are generated inside the source AS (presumably, iBGP will
still carry regular routes) and the bitmaps are transmitted from one
network to another, so there is no requirement for full interconnection at
the routing level.

>  Fourth, receivers of these
> prefixes, with the exception bits, will expand the longest-match trie
> (a Patricia tree is a compact representation of a trie, in common
> use when you have data with many nodes with just one child) so
> that lookups will only match in the case where there is no exception.

Yes.
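
As a rough illustration of what a receiver would do, the sketch below expands a prefix plus bitmap into the more specifics that would go into the forwarding trie. It assumes that bit i (most significant bit first) being set means the announcing network reaches block i of the aggregate. That bit convention, the function name `expand_bitmap`, and the 10.0.0.0/22 example prefix are assumptions made for illustration, not a defined encoding.

```python
import ipaddress


def expand_bitmap(aggregate: str, block_len: int, bitmap: bytes):
    """Return the blocks of `aggregate` whose bit is set in `bitmap`.

    Assumed convention: bit i, counting from the most significant bit of
    the first byte, corresponds to the i-th block of length `block_len`.
    """
    net = ipaddress.ip_network(aggregate)
    blocks = list(net.subnets(new_prefix=block_len))
    if len(bitmap) * 8 < len(blocks):
        raise ValueError("bitmap too short for this aggregate")
    reachable = []
    for i, block in enumerate(blocks):
        if bitmap[i // 8] & (0x80 >> (i % 8)):
            reachable.append(block)
    return reachable


# Example: a /22 split into four /24 blocks, with blocks 0 and 2 set.
routes = expand_bitmap("10.0.0.0/22", 24, bytes([0b10100000]))
print([str(r) for r in routes])
```

The result is the same set of entries that separate more specific announcements would have produced, which is the point Sean raises next.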

> If I understand you, what you are trying to do is to reduce the
> requirement for EVERY network operating within the aggregate to
> carry traffic to the ENTIRE aggregate at all times.

Yes.

> This ordinarily
> would require announcing more specifics.  So you propose a scheme
> where you use an attribute instead of the more specifics.  Unfortunately,
> your attribute will cause the same behaviour in a receiver as
> would the list of more specifics, and therefore is merely a compression
> of the representation on the line that is somewhat better than, say, gzip.

> IOW, I think you are solving the wrong problem.

I'm mostly trying to solve the memory problem, but it should also help
with (but certainly not completely solve) the processing problem.

Since an updated bitmap is always the same size and it updates many routes
at a time, it should take less CPU power to process the updates. Also, you
could make a certain group of routers responsible for the more specifics
(this would work well if the prefixes are assigned geographically) and let
the others delay processing of the bitmaps or even drop the bitmaps
completely.
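
One way to picture the "many routes at a time" part: because the old and new bitmaps have the same length, a receiver can XOR them and touch only the blocks that changed. This is a minimal sketch under the same assumed bit convention (bit set means reachable, most significant bit first); `diff_bitmaps` is a hypothetical helper, and a real implementation might do this quite differently.

```python
def diff_bitmaps(old: bytes, new: bytes):
    """Compare two equal-length bitmaps and return (added, withdrawn)
    lists of block indices, using most-significant-bit-first numbering."""
    if len(old) != len(new):
        raise ValueError("bitmaps must be the same size")
    added, withdrawn = [], []
    for byte_index, (o, n) in enumerate(zip(old, new)):
        changed = o ^ n              # bits that differ between old and new
        if not changed:
            continue                 # whole byte unchanged: 8 routes skipped at once
        for bit in range(8):
            mask = 0x80 >> bit
            if changed & mask:
                index = byte_index * 8 + bit
                if n & mask:
                    added.append(index)
                else:
                    withdrawn.append(index)
    return added, withdrawn


old = bytes([0b10100000, 0b00000000])
new = bytes([0b10010000, 0b00000001])
added, withdrawn = diff_bitmaps(old, new)
print(added, withdrawn)
```

A router that is not responsible for the more specifics could skip this step entirely and forward on the aggregate alone.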

> Your scheme does let one warn of black holes in this eventuality,
> takes a bit less bandwidth on the line, probably allows for the
> "slosh" to happen all at once rather than in dribs and drabs,
> and so forth, but it represents the same amount of work for the
> routers processing the attribute.   That is, those routers are
> effectively brought inside the abstraction boundary of the "locus",
> and as a result the goal of hiding information from those routers
> is not met.

I think the only way to really know the processing benefits of all of
this is to implement it or to run detailed simulations, but those pretty
much require an implementation as well.

Note that bandwidth on the line is not an issue: BGP already encodes the
routing information efficiently enough.

> My gut feeling is that for any sizable "locus", almost all of
> what we consider the core of the global routing system would be contained
> within the new abstraction boundary, so we're no better off than
> not aggregating in the first place.

> That is, we are MUCH better off with PA addressing.

Suppose that every "P" announced only a single "A". (I know, the
other 300 are important too, but just for the sake of argument.) Would
that solve the problem? Only if there is a limit on the number of ISPs. I
don't think there is such a limit. I have my own web and mail servers at
home, along with a router that can do BGP and handle incoming modem
connections. So basically, I'm my own ISP. I recently helped a
medium-sized business with their BGP, and they became an "ISP" so they
could get a /20.

The only way we're ever going back to an 8k routing table in IPv6 is if
multihoming at the host level becomes a decent alternative. There is SCTP,
a transport protocol that can handle multiple source and destination IP
addresses, so when one path goes down, it will use another. (SCTP is
useless as a TCP replacement, though.) And there have been successful
experiments with adding this kind of functionality to TCP.

But the problem is that you can't just update a billion or so running TCP
stacks overnight. Multihoming will be here for a while. Filtering is
coming back in style now, but it will go away when customers start to
notice they can't reach certain destinations through certain networks:
that's bad business. (It will also make multihoming even more attractive.)
So we either start to build better EGPs now, even if we don't have a new
algorithm that will magically make everything right, or start buying Cisco
and Juniper stock while it's low.



