best practice for advertising peering fabric routes

Patrick W. Gilmore patrick at ianai.net
Wed Jan 15 03:35:31 UTC 2014


On Jan 14, 2014, at 22:20 , Leo Bicknell <bicknell at ufp.org> wrote:
> On Jan 14, 2014, at 7:55 PM, Eric A Louie <elouie at yahoo.com> wrote:
> 
>> I have a connection to a peering fabric and I'm not distributing the peering fabric routes into my network.
> 
> There's a two-part problem lurking.
> 
> Problem #1 is how you handle your internal routing.  Most of the "big boys" will next-hop-self in iBGP all external routes.  However, depending on the size and configuration of your network, there may be advantages to not using next-hop-self, or to just putting the exchange LAN in your IGP.  Basically, you should be doing the same thing you do for a /30 from a peer or transit provider in your network.  There is one thing special about an exchange point, though: for security reasons you probably want to add it to your "never accept" routing filter from peers/customers/transit providers.  You don't need someone injecting a couple of more specifics to mess with your routing.
> 
> Problem #2 is your customers.  If you have customers that may operate default free, and they use one of the traceroute tools that not only finds the route, but then continues to probe it (like MTR, or Visual Traceroute) there can be an issue.  The initial traceroute probe may return an IP on the exchange of your peer's router, but then when they subsequently source ICMP Ping to that IP there will be no route in their network, and it will simply never respond.  Some call this a feature, some call this a problem.  There is also an extremely rare problem where the far end of the peering exchange steps down MTU, and thus PMTU discovery is invoked, but your customers use Unicast RPF.  Since the exchange LAN isn't in their table, Unicast RPF may drop the PMTU packet-too-big message, causing a timeout.
> 
> If your customers have a default to you, all is well.  However, if they have a default to someone else and take a table from you to selectively override it, the same problem can occur for any routes they select through you that also traverse the exchange.
> 
> IMHO the best fix for #2 is that the exchange have an ASN, and announce the exchange LAN from that ASN, typically via the route server.  You should then peer with the route server to pick up that network.  That makes the announcement consistent, and makes it clear who operates that network, and your customers can then access it.  Many exchanges do not do this, and then the next best solution might be to originate it from your ASN and announce it to your customers only, with no-export set on the way out.
> 
> Various people will no doubt chime in and tell you the last two suggestions are either excellent and wonderful or the worst idea ever.  Safe to say I know of networks doing both and the world has not ended.  YMMV, some assembly required, batteries not included, actual conditions may affect product performance, do not taunt the happy fun ball, and consult a doctor if your network is up for more than four hours.

I've known Leo for .. well, let's just say a long time. And I have great respect for his networking abilities. But I fall into the second camp. As someone who owns & operates an IXP, and is on the board of a couple more, and helped start even more, I'm going to stick to my guns here.

As for knowing networks that do both, blah, blah, blah. I know lots of networks that allow spam, don't configure BCP38, have abusable name or NTP servers, etc. and the world has not come to an end. Doesn't mean you should. Lame excuse, Leo, and beneath you to even go there.

NEVER EVER EVER put an IX prefix into BGP, your IGP, or even a static route. An IXP LAN should not be reachable from any device not directly attached to that LAN. Period.
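
For anyone who wants to check their own tables against that rule, here is a minimal sketch using only Python's standard ipaddress module. The IX LAN prefixes are documentation ranges standing in for a real exchange, and the names IX_LANS, is_ix_prefix and audit are made up for illustration. It flags any route that is the IX LAN itself or a more specific of it; a covering aggregate such as a default route is deliberately not flagged.

```python
import ipaddress

# Hypothetical IX peering LANs (documentation prefixes, not a real exchange).
IX_LANS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("2001:db8:100::/64"),
]

def is_ix_prefix(prefix, ix_lans=IX_LANS):
    """True if prefix is an IX LAN or a more specific inside one."""
    net = ipaddress.ip_network(prefix, strict=False)
    # subnet_of() raises TypeError on mixed IPv4/IPv6, so compare versions first.
    return any(net.version == lan.version and net.subnet_of(lan) for lan in ix_lans)

def audit(routes, ix_lans=IX_LANS):
    """Return the routes that should not appear in BGP, the IGP, or static config."""
    return [r for r in routes if is_ix_prefix(r, ix_lans)]

if __name__ == "__main__":
    table = ["192.0.2.0/24", "192.0.2.128/25", "198.51.100.0/24", "0.0.0.0/0"]
    for route in audit(table):
        print("IX prefix found in routing table:", route)
```

The same membership test could back an inbound "never accept" filter, but that is an assumption about how you build your filters, not something any particular router does for you.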

If for no better reason, how about because it is not your prefix, and chances are the IXP does not want you to use the prefix. In fact, I challenge you to find a major IXP route server which is announcing the IXP block.

But because this is a teaching list, let's go through the problems Leo mentions. Anyone who steps down MTU on an IXP is far too broken to worry about your customer having RPF and not getting PMTU. Again, I challenge you to find someone doing this today; their network would be close to unusable. As for traceroute .... Seriously? You want to increase breakage on the Internet because it might cause 3 stars in a traceroute? Puh-LEEEZE. Sorry, neither of those pass the sniff test, IMHO.

So Just Don't Do It. Setting next-hop-self is not just for "big guys", the crappiest, tiniest router that can do peering at an IXP has the same ability. Use it. Stop putting me and every one of your peers in danger because you are lazy.
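
To make the next-hop-self point concrete, here is a toy model in Python, not router configuration. All addresses are hypothetical: the peer sits on a made-up IX LAN 192.0.2.0/24 and the border router's loopback is in 10.255.0.0/24. The Route class and both helper functions are invented for this sketch. It shows that once the border router rewrites the next hop to its own loopback, the rest of the network can resolve the route without the IX LAN appearing in the IGP at all.

```python
import ipaddress
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Route:
    prefix: str
    next_hop: str

def advertise_ibgp(route, loopback, next_hop_self):
    """Return the route as the border router would pass it to its iBGP peers."""
    return replace(route, next_hop=loopback) if next_hop_self else route

def resolvable(next_hop, igp_prefixes):
    """True if some IGP prefix covers the BGP next hop."""
    addr = ipaddress.ip_address(next_hop)
    return any(addr in ipaddress.ip_network(p) for p in igp_prefixes)

# Route learned from a peer across the exchange; next hop is on the IX LAN.
learned = Route("203.0.113.0/24", next_hop="192.0.2.10")
igp = ["10.255.0.0/24"]  # loopbacks only; the IX LAN is deliberately absent

with_nhs = advertise_ibgp(learned, "10.255.0.1", next_hop_self=True)
without_nhs = advertise_ibgp(learned, "10.255.0.1", next_hop_self=False)

if __name__ == "__main__":
    print("next-hop-self:", resolvable(with_nhs.next_hop, igp))
    print("unchanged next hop:", resolvable(without_nhs.next_hop, igp))
```

Without the rewrite, the only way to make the next hop resolve is to carry the IX LAN internally, which is exactly what I am telling you not to do.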

-- 
TTFN,
patrick
