best practice for advertising peering fabric routes

Wed Jan 15 04:41:42 UTC 2014

On Jan 14, 2014, at 23:03 , Leo Bicknell <bicknell at ufp.org> wrote:
> On Jan 14, 2014, at 9:35 PM, Patrick W. Gilmore <patrick at ianai.net> wrote:
> 
>> So Just Don't Do It. Setting next-hop-self is not just for "big guys", the crappiest, tiniest router that can do peering at an IXP has the same ability. Use it. Stop putting me and every one of your peers in danger because you are lazy.
> 
> I'm going to have to disagree here with Patrick, because this is security through obscurity, and that doesn't work well.

Leo, each of your points below is incorrect. I'm happy to discuss off-list if you'd like.

> For some history about why people like Patrick take the position he did, read: http://blog.cloudflare.com/the-ddos-that-almost-broke-the-internet
> 
> Exchange points got attacked, so people yanked them from the routing table hoping to prevent attacks.  If you're on this list it should take you all of about 3 seconds to realize the attackers could do a traceroute, and attack the IP one hop on the far side of the exchange for a few dozen providers and still cause all sorts of havoc, or do any of another half dozen things I won't mention to cause problems.  The effect would be nearly, if not perfectly identical, since that traffic still has to cross the exchange.

Let's take just the incident mentioned in the blog post above (which is pretty broken itself, but hey, who said the CEO of CDN had to know anything about networking... ? :).

To where would the attacker traceroute -to-? Somewhere inside Cloudflare? Other LINX members? Remember, most of the attack was sourced from networks which were not attached to the LINX. If the source network or the source network's upstreams are not LINX members, there is probably _no_ path that goes through LINX. Even if they are members, lots of networks have alternative paths (other IXPs, private interconnections, etc.). For instance, sources in Germany may well flow over DE-CIX even if there is a peering session at LINX. Etc. There is no single or set of IP addresses that will guarantee even a majority of packets traverse a specific IXP except the IXP LAN.

Also, the attack was reflected DNS, so the attacker couldn't actually perform the traceroutes you suggest from each source as he did not control the sources. He _might_ be able to find _some_ of the paths with a lot of sleuthing through route & traceroute servers, but that would make things massively more difficult, as well as massively cut the number of servers he can abuse to the same effect. Both of which are huge wins for the good guys.

Pulling the IXP prefix has a enormous benefits and essentially no downside. I know literally hundreds of ISPs large & small who do not carry the IXP prefix, and none have seen any significant issues (most have seen zero, a few get asked about 3 stars, but as I said before, puh-leeeeze).

I'm a bit surprised you even tried to bring this up. I know you well enough to know you would have realized all of the above if you had though about it for a while (or just asked).

> I'll point out the MTU step-down issue is real, and it's part of why we can't have 9K MTU exchanges be the default on the Internet, which would really make things better for a significant number of users.  I think Patrick is a bit quick to dismiss some of the potential issues.

MTU step-down is a real issue, and it's real enough whether IXP LANs are in the DFZ or not. Let's solve the overarching problem before doing something which has real, proven harm and leaves the root cause in place.

Besides, the two VLAN method already exists in multiple places and it hasn't helped adoption of 9K packets. Unless you are talking about letting some people attach with 1500 MTU and others with 9000 MTU? 'Cause if that's what you meant, then I'm just going to call you loony and ask what you're smoking.

> Every link on every router is subject to attack.  Exchange point LAN's really aren't special in that regard.  If anything the only thing that makes them slightly special is that they may in fact be more oversubscribed than most links.  Where a backbone might have a router with 20x10GE, so attackers could try and drive 190GE out a 10GE in theory; an exchange point may have 100 people with 20x10GE coming in.  An alternate view that mega-exchange points are massively oversubscribed potential single points of failure, and perhaps network operators should consider that.  While a DDOS taking an exchange down for half a day is bad, imagine if there was a more sinister attack, taking out the physical infrastructure of an exchange.  That can't be "fixed" with a routing advertisement.

IXPs are more special because they are shared. Other links are between you & one other network not hundreds of other networks, some of whom have no relationship with you.

If you don't like the rules of IXPs, don't join one. But hooking up to one and deciding "I'm going to carry this prefix" even when told not to is .. well, let's call it bad manners.

As for the rest, nothing is a silver bullet. Claiming "this doesn't solve every possible problem so we shouldn't do it" is even more lame than your first excuse that the world hasn't ended. This solves lots of real, provable problems. It is trivial to implement. There is no network which peers at an IXP and cannot implement it. It _has_ been implemented 1000s of times without the harm you mention. In short, it should be done.

I repeat: NEVER EVER EVER put an IX prefix into BGP, IGP, or even static route. An IXP LAN should not be reachable from any device except those directly attached to that LAN. Period.

If you join one of my IXPs and I find you are carrying a prefix I own, did not advertise to you, and specifically told you not to carry, I'm going to ask you to stop immediately or face possible disconnection. The other members of my IXP should not be endangered because you don't like to follow the rules.

What's more, I get a lot more people thanking me for doing that than complaining about it.

-- 
TTFN,
patrick

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 535 bytes
Desc: Message signed with OpenPGP using GPGMail
URL: <http://mailman.nanog.org/pipermail/nanog/attachments/20140114/25dfad41/attachment.sig>