Networks ignoring prepends?

Steve Gibbard scg at gibbard.org
Mon Jan 22 18:17:54 UTC 2024


To expand on what others have said here, I find it helpful to think of BGP as a policy enforcement protocol, rather than as a distance vector routing protocol.  

To that end, there’s a generally expected hierarchy of routes, and then a lot of individuality between networks.  Having done traffic engineering for some global CDNs, I’ve found that you can do a good deal of inbound traffic control by letting an understanding of how most other providers think about this guide your transit and peering policies.  The remaining portion generally needs to be solved through discussions, negotiations, or commercial arrangements with the sending party or their upstreams.

For the general rules, local-preference trumps everything else.  The number of AS path hops comes after local-preference.  Other things being equal, networks usually like to hand off traffic to a short AS path, and at the closest point to its origination (there are valid performance reasons for this), but local-preference policies will override both of those.
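To make that ordering concrete, here is a minimal Python sketch of the comparison. It is a deliberate simplification: real BGP implementations have more steps (origin type, MED, eBGP versus iBGP, router ID, and so on), and the Route class, the local-preference values, the exit costs, and the AS numbers (taken from the documentation range) are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    via: str          # hypothetical label for the neighbor the route came from
    local_pref: int   # higher is preferred
    as_path: tuple    # AS numbers as received, prepends included
    exit_cost: int    # stand-in for distance to the exit point; lower is nearer


def best_route(routes):
    # Simplified ordering: highest local-pref, then shortest AS path,
    # then nearest exit. Real routers apply further tie-breakers.
    return min(routes, key=lambda r: (-r.local_pref, len(r.as_path), r.exit_cost))


ORIGIN = 64496  # documentation ASN, not a real network

near = Route("nearby peer", 200, (ORIGIN,), 10)
second_exit = Route("second exit", 200, (ORIGIN,), 3)
far = Route("distant peer", 200, (ORIGIN,) * 6, 5)  # origin AS plus five prepends
far_customer = Route("distant customer", 300, (64497,) + (ORIGIN,) * 6, 50)

# Same local-pref: the prepends work and the short path wins.
print(best_route([near, far]).via)
# Same local-pref and path length: the nearer exit wins.
print(best_route([near, second_exit]).via)
# Higher local-pref: AS path length is never consulted.
print(best_route([near, far, far_customer]).via)
```

In the last case the prepended route wins purely on local-preference, which is the situation where prepending has no effect.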

Local-preferences usually have three default tiers: customer, peering, and transit.  In other words, get paid, hand off for free, and pay.  There are often some additional peers that can be selected for traffic engineering reasons, either internally or by customers using BGP communities.  BUT, those BGP communities don’t transit to other ASes, so even if you manage to signal one hop upstream, you may still find your upstream provider announcing your routes to those who have different ideas.
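As a rough illustration of the three tiers, the sketch below ranks the neighbors a network might learn the same prefix from. The numeric local-preference values, the neighbor descriptions, and the path lengths are hypothetical; every network picks its own numbers, and many add further tiers.

```python
# Hypothetical local-preference values for the three default tiers.
LOCAL_PREF = {"customer": 300, "peer": 200, "transit": 100}


def rank_neighbors(learned):
    """Order (description, relationship, as_path_length) tuples by preference.

    Relationship tier is compared first; AS path length only breaks ties
    within the same tier.
    """
    return sorted(learned, key=lambda n: (-LOCAL_PREF[n[1]], n[2]))


learned = [
    ("short path across town", "peer", 1),
    # Origin AS with five prepends (6 entries) plus one intermediate AS.
    ("five-prepend path via a distant customer", "customer", 7),
    ("path via transit", "transit", 2),
]

for description, relationship, path_len in rank_neighbors(learned):
    print(f"{relationship:8s} len={path_len}  {description}")
```

With these assumed values the customer-learned route sorts first despite its much longer AS path, then the peer route, then transit.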

One example of this from the early days of anycasted DNS root servers involved k.root-servers.net installing a node in Delhi, which pulled 60% of its traffic from North America.  This was clearly non-optimal.  They had attempted to get routing diversity by getting transit from different providers in different parts of the world, but their Delhi node was, if I recall correctly, a customer of a customer of a customer of Level3.  Oops.

So, what do you do about this?

If you’re a global network operator, you probably attempt to maintain consistent peering/transit relationships across sites.  That way, AS paths and local-preferences should be fairly even, and you can let nearest exit routing do its thing.

If you have a smaller network, but have multiple interconnection locations that are far enough apart to make a performance difference, make the same transit and peering relationships at each one.  Make exceptions only for peers (not transit providers) whose customers or services only exist in one of the areas, and make sure they don’t announce your routes to their upstreams.  That way you won’t trombone traffic.

If you’ve done all that, and traffic is still coming in at the wrong place, then you start talking to people.  “Hey, I’m buying transit from you in both Asia and the Western US, and all my traffic from asian-country-x is coming into San Jose.  Why?”  “Well, they only have a 100 Mb/s interconnection to us in Asia.  We have to traffic engineer around it.”  And then you have to figure out how to convince some national telco to want to talk to you more than they want to talk to your transit provider.

I think in your case, I would be asking why you have a 5,000 mile, five-prepend loop to get to a provider ten miles away.  It suggests that your network is doing things 5,000 miles away that are inconsistent with what you're doing locally, or that you have upstreams who aren’t interconnecting locally or aren’t maintaining sufficient capacity or sufficient political relationships on those paths.  All of those would predictably have this result.  The solution is likely to take a look at your transit relationships, ask your transit providers about their transit relationships, and either supplement or switch to a set of transit providers who can provide the routing you want.

-Steve



> On Jan 22, 2024, at 4:49 AM, William Herrin <bill at herrin.us> wrote:
> 
> Howdy,
> 
> Does anyone have suggestions for dealing with networks who ignore my
> BGP route prepends?
> 
> I have a primary ingress with no prepends and then several distant
> backups with multiple prepends of my own AS number. My intention, of
> course, is that folks take the short path to me whenever it's
> reachable.
> 
> A few years ago, Comcast decided it would prefer the 5000 mile,
> five-prepend loop to the short 10 mile path. I was able to cure that
> with a community telling my ISP along that path to not advertise my
> route to Comcast. Today it's Centurylink. Same story; they'd rather
> send the packets 5000 miles to the other coast and back than 10 miles
> across town. I know they have the correct route because when I
> withdraw the distant ones entirely, they see and use it. But this time
> it's not just one path; they prefer any other path except the one I
> want them to use. And Centurylink is not a peer of those ISPs, so
> there doesn't appear to be any community I can use to tell them not to
> use the route.
> 
> I hate to litter the table with a batch of more-specifics that only
> originate from the short, preferred link but I'm at a loss as to what
> else to do.
> 
> Advice would be most welcome.
> 
> Regards,
> Bill Herrin
> 
> -- 
> William Herrin
> bill at herrin.us
> https://bill.herrin.us/


