Outbound Route Optimization
Sean Finn
seanf at routescience.com
Mon Jan 26 18:58:49 UTC 2004
Richard,
you have made some good points in this thread.
One general observation, and then specific responses
... I don't assert that current route optimization
technology solves ALL routing problems, but do think
that there are some specific problems that automation
can effectively and gracefully solve.
> * The inability to receive FULL bgp routes from every bgp peer to your
> optimization box without requiring your transit providers to set up a host
> of eBGP Multihop sessions (which most refuse to do). This means you will
> always be stuck assuming that every egress path is a transit and can reach
> any destination on the Internet until your active or passive probing says
> otherwise.
The issue that you describe does indeed offer some
constraints to the application of route optimization
technology. Within the scope of this issue, though,
I think that you would agree that a network which is
ALL transit would face no challenge here -- and more
specifically, if there is a routing optimization
decision among local transit links, that problem
could be solved independently of the existence of
"non-transit" links.
Applying this technology in the presence of "non-transit"
routes requires constraining measurements to
only the prefixes appropriate for a given link. It
is true that knowing all BGP routes ("BGP Losers")
would be a nice way to get this information ...
but it's not necessarily the only approach towards
the goal. Some solutions may have topological
dependencies, but it can be feasible to simply drop
all measurement towards "illegal" destinations.
In other cases, it may be possible to define the
set of destinations that are legal over a given
link, and constrain measurements for that link.
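
As a rough illustration of the second approach (define the legal
destinations per link, then constrain measurements), here is a minimal
Python sketch. The link names, the LEGAL_PREFIXES table and the
legal_targets() helper are all hypothetical, and the addresses are
documentation ranges; a real system would presumably build this table
from whatever routes it does learn on each link.

```python
import ipaddress

# Hypothetical per-link policy (an assumption for illustration):
# None means "transit link, any destination is legal"; otherwise only
# destinations inside the listed prefixes may be measured over the link.
LEGAL_PREFIXES = {
    "transit-a": None,
    "peer-b": [
        ipaddress.ip_network("192.0.2.0/24"),
        ipaddress.ip_network("198.51.100.0/24"),
    ],
}


def legal_targets(link, candidates):
    """Return the candidate destinations that may be measured over `link`."""
    allowed = LEGAL_PREFIXES[link]
    if allowed is None:
        # Transit: nothing to filter.
        return list(candidates)
    result = []
    for dest in candidates:
        addr = ipaddress.ip_address(dest)
        # Keep the destination only if some legal prefix covers it;
        # everything else is simply dropped from measurement.
        if any(addr in net for net in allowed):
            result.append(dest)
    return result


candidates = ["192.0.2.10", "203.0.113.5", "198.51.100.77"]
peer_targets = legal_targets("peer-b", candidates)
transit_targets = legal_targets("transit-a", candidates)
```

Here 203.0.113.5 is never probed over "peer-b", while the transit link
keeps the full candidate list.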
> * The requirement of deaggregation in order to make best path decisions
> effective. For example, someone's T3 to genuithree gets congested and the
> best path to their little /24 of the Internet is through another provider.
> Do you move 4.0.0.0/8?
Perhaps. Yes, it's a /8. But if measurements to the /8 show
better collective performance over another link, why NOT
move it? Yes, it could be carrying a lot of traffic, and
could result in congesting the next link ... so it is
necessary to be able to:
- know when links are at/near capacity,
and so avoid their use; and
- react quickly in case of congestion
Note that these problems are not specific to /8s,
and that traffic loads are dynamic - even if it
does look like there is "room" for a prefix on a
link, once the route gets changed, conditions
could very well change also. Any route optimization
system needs to deal with these issues for ALL
prefixes.
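
To make the two requirements above concrete, here is a minimal Python
sketch of a capacity-aware move decision. Everything in it is an
assumption for illustration: the Link record, the 80% HEADROOM
threshold, the use of latency as the only performance metric, and the
choose_link() helper. It is not a description of any particular
product.

```python
from dataclasses import dataclass


@dataclass
class Link:
    name: str
    capacity_mbps: float
    load_mbps: float


# Assumed policy: never fill a link beyond 80% of its capacity.
HEADROOM = 0.8


def can_accept(link, prefix_load_mbps, headroom=HEADROOM):
    """True if the link still has room for the prefix's current load."""
    return link.load_mbps + prefix_load_mbps <= link.capacity_mbps * headroom


def choose_link(links, latency_ms, prefix_load_mbps, current):
    """Pick the best-measured link that has room; otherwise stay put."""
    best = current
    for link in links:
        if link.name == current.name:
            continue
        if not can_accept(link, prefix_load_mbps):
            # Link is at or near capacity, so avoid it regardless of
            # how good its measurements look.
            continue
        if latency_ms[link.name] < latency_ms[best.name]:
            best = link
    return best


a = Link("a", capacity_mbps=45, load_mbps=40)
b = Link("b", capacity_mbps=100, load_mbps=70)
c = Link("c", capacity_mbps=100, load_mbps=20)
latency = {"a": 80.0, "b": 30.0, "c": 50.0}

big_move = choose_link([a, b, c], latency, prefix_load_mbps=15, current=a)
small_move = choose_link([a, b, c], latency, prefix_load_mbps=5, current=a)
```

With a 15 Mbps prefix, link "b" is skipped for lack of room and "c" is
chosen; with a 5 Mbps prefix, "b" fits and wins on latency. Since loads
are dynamic, a decision like this would have to be re-run on every
measurement interval, not made once.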
There are multiple levels of optimization possible
on top of this:
a) If there is a general belief that /8s are
simply "too big" to move, they can be manually
deaggregated. Our experience shows that by
breaking up a /8 into as few as (10) or (15)
carefully designed "chunks", the resultant
load per (deaggregated) prefix becomes equivalent
to hundreds of other prefixes.
b) If manually configuring deaggregates is not
desirable, automated approaches to deaggregation
are possible: "If I see traffic in this range,
and a /xx does not exist for the observed traffic,
then create the /xx".
c) Dynamically measure all of the possible
deaggregations of all active space, and dynamically
determine which prefixes need to be deaggregated
to what level.
Note that in any of the above cases, the de-aggregated
routes should be marked NO_EXPORT.
I know of solid commercial implementations of (a) and
(b). (c) is a more interesting project ... :)
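
For what it's worth, the rule in (b) can be sketched in a few lines of
Python. This is only an illustration: the /12 target length, the
deaggregate_on_traffic() helper and the dictionary used to represent a
route are assumptions of mine, and the "NO_EXPORT" string is a tag in
that dictionary, not router configuration.

```python
import ipaddress

AGGREGATE = ipaddress.ip_network("4.0.0.0/8")
DEAGG_LEN = 12  # hypothetical "/xx" to create inside the aggregate


def deaggregate_on_traffic(observed_dests, existing,
                           aggregate=AGGREGATE, length=DEAGG_LEN):
    """Create a /length route for observed traffic that lacks one."""
    known = set(existing)
    created = []
    for dest in observed_dests:
        addr = ipaddress.ip_address(dest)
        if addr not in aggregate:
            # Traffic outside the range we are deaggregating.
            continue
        # strict=False masks off the host bits, giving the covering /length.
        candidate = ipaddress.ip_network(f"{dest}/{length}", strict=False)
        if candidate not in known:
            known.add(candidate)
            # Tag every deaggregate so it is never announced onward.
            created.append({"prefix": str(candidate),
                            "communities": ["NO_EXPORT"]})
    return created


observed = ["4.17.3.9", "4.18.0.1", "8.8.8.8", "4.200.1.1"]
new_routes = deaggregate_on_traffic(observed, existing=[])
```

The two destinations in 4.16.0.0/12 produce a single new route,
8.8.8.8 is ignored, and 4.200.1.1 produces 4.192.0.0/12.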
> * The constant noise of stupid scripts pinging everything
> on the Internet.
Pinging the Internet is clearly a wasteful approach. Essentially
no one needs optimization to the ENTIRE Internet. Granted, major
backbones probably actually use a great deal of the routing
table ...
(Quiz for the list readers:
What percentage of the Internet routing table does
your network actually use?)
... but for many ISPs, hosting facilities, and major multihomed
enterprises, our experience shows that only a very small
fraction of traffic is seen beyond about (20,000-30,000)
routes in a given day.
There is no reason to measure destinations unless they
are involved with traffic to your network. Basing
measurements on observed traffic, or having applications
instrumented to automatically generate their own measurements
are both "clean" options here.
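
A minimal Python sketch of the first option, basing measurements on
observed traffic: given per-prefix byte counts (say, from flow export),
measure only the busiest prefixes that together cover most of the
traffic. The prefixes_to_measure() helper, the 90% coverage figure and
the sample counts are assumptions for illustration.

```python
def prefixes_to_measure(bytes_by_prefix, coverage=0.9):
    """Return the busiest prefixes that together carry `coverage` of traffic."""
    total = sum(bytes_by_prefix.values())
    if total == 0:
        # No observed traffic, so nothing is worth measuring.
        return []
    selected = []
    covered = 0
    # Walk prefixes from heaviest to lightest and stop once the chosen
    # set already accounts for the requested share of traffic.
    ranked = sorted(bytes_by_prefix.items(), key=lambda kv: kv[1], reverse=True)
    for prefix, nbytes in ranked:
        if covered >= coverage * total:
            break
        selected.append(prefix)
        covered += nbytes
    return selected


observed_bytes = {
    "192.0.2.0/24": 600,
    "198.51.100.0/24": 350,
    "203.0.113.0/24": 40,
    "203.0.113.128/25": 10,
}
targets = prefixes_to_measure(observed_bytes)
```

In this sample, two of the four prefixes carry 95% of the bytes, so
only those two are measured; the long tail is never probed.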
Companies and ISPs today spend time(=money) managing their
connectivity to the Internet. Loop-free connectivity is a
basic first step; but in many cases real connectivity goals
include:
- Capacity management (especially in the presence
of asymmetrical bandwidth)
- Load management (in the case of usage-based billing)
- Performance management (realizing 'best possible'
performance)
- Maximizing application availability (fastest possible
reroute, in the case of congestive failure)
Manually tweaking routing policies to achieve these goals
is a time-honored craft (especially with this crowd :) ...
but I suspect that even the most experienced in this area
will acknowledge that there is a tier of this problem that
may be best automated. (Note that I said "a tier" -- there
are clearly additional problems that current route optimization
technology DOESN'T solve. :)
cheers -- Sean