worse than IPv6 Pain Experiment

Thu Oct 10 02:35:19 UTC 2019

> On Oct 9, 2019, at 18:43 , Matt Harris <matt at netfire.net> wrote:
> 
> On Wed, Oct 9, 2019 at 5:28 PM Owen DeLong <owen at delong.com> wrote:
> 
> > URLs are an obvious candidate to consider because they're in use, seem
> > to basically work to identify routing endpoints, and are far from a
> > random, out of thin air, choice.
> 
> In reality, you’re not really talking about URLs here, even. You’re talking
> about DNS host names. (The part before the // isn’t really part of what
> you want to consider in your network routing scenario, neither is anything
> that comes after the first /).
> 
> It’s not that we couldn’t use some form of hierarchically structured human-
> readable name for this purpose… It’s that using DNS host names _REALLY_
> wouldn’t work well.
> 
> Except what if we used basic textual representations for addresses that kind of looked like DNS names, but didn't actually try to use DNS names? Let's even assume we keep DNS largely unchanged, but introduced "B records" to handle the new addressing scheme, similar to how we introduced AAAA records to handle translating between names and IPv6 addresses. Perhaps we also add a special-case "TLD-alike" called .address to indicate when we want to connect to the specified address and not do a DNS lookup of the name we've requested? 
> 
> For example, let's say my internet domain is nanog.org. I might have DNS setup for nanog.org, but I may also claim addressing space under nanog.org. Since my ASN is 64500, I will use it to advertise "nanog.org" to my peers: so when you check a looking glass for nanog.org, you'll see that it's routing to AS 64500 just like any IPv4 or IPv6 announcement.

Except that’s not how it works for IPv4 or IPv6 announcements. You don’t route to an ASN (for better or worse, worse IMHO, but hard to fix at this point). You route a prefix (the NLRI) to a next-hop, and ASNs are primarily for loop detection/prevention and demarcation of boundaries of administrative control.

Other than what you do in policy based on the AS PATH or other AS-related attributes, they really have zero significance in forwarding decisions.
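To make that concrete, here is a minimal sketch of the split between the two roles. It is an illustration only, not any router’s actual implementation: the prefixes, next-hop addresses, AS numbers other than 64500, and the names RIB, LOCAL_AS, accept and lookup are all made up for the example. The AS path is consulted when deciding whether to accept a route; the forwarding lookup is a longest-prefix match that returns a next-hop and never looks at an ASN.

```python
import ipaddress

# Hypothetical routes: prefix (NLRI), next-hop, and AS_PATH.
# All values are illustrative documentation addresses and ASNs.
RIB = [
    {"prefix": "192.0.2.0/24", "next_hop": "198.51.100.1", "as_path": [64496, 64500]},
    {"prefix": "192.0.2.0/25", "next_hop": "198.51.100.9", "as_path": [64497, 64500]},
]

LOCAL_AS = 64510  # assumed local AS number for the sketch


def accept(route, local_as=LOCAL_AS):
    """Loop prevention: reject a route whose AS_PATH already contains our AS."""
    return local_as not in route["as_path"]


def lookup(dst, rib=RIB):
    """Longest-prefix match on the destination; the AS_PATH is not consulted."""
    addr = ipaddress.ip_address(dst)
    best = None
    for route in rib:
        net = ipaddress.ip_network(route["prefix"])
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, route["next_hop"])
    return best[1] if best else None


print(lookup("192.0.2.5"))    # matched by the more specific /25
print(lookup("192.0.2.200"))  # only the /24 covers this address
```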

> Now if you want to visit a website called www.nanog.org, and you punch that into your web browser, it's going to do a DNS lookup. Assuming this addressing scheme is preferred over IPv4 or IPv6, the first thing your browser will do is a DNS lookup for a B record for "www.nanog.org" - and in my nanog.org zone, I'll have one or more B records pointing to the address hosting the site, for example:
> www.nanog.org. IN B webserver1.nanog.org
> Upon receiving this, your browser will then initiate a port 443 TCP connection (or UDP for QUIC or whatever its protocol of choice is, in this, the year 3305) to webserver1.nanog.org, which is what will be in the packet headers. Its upstream router will see this and route it until it reaches a member of the DFZ at which point that DFZ router will then determine that "webserver1.nanog.org" is part of "nanog.org" and that "nanog.org" is announced by AS64500 which is available from two transit providers, prefer one of them based on whatever traffic engineering rules are the norm in 3305, and send the packet on to the next hop for that route.

Well, I’m not wild about the addressing scheme, and I think it creates tremendous potential for confusion (and some serious implementation challenges for fast packet switching), but I do like the idea of DFZ routing being based on destination ASN and candidate routes rather than on specific IPv4/IPv6 next-hop addresses.
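For reference, the lookup a DFZ router would have to perform under the quoted scheme is roughly a longest-suffix match on name labels. The sketch below is only an assumption about how that might look; the function name and the table are hypothetical, and only the nanog.org to AS64500 mapping comes from the example above. Matching variable-length strings label by label is a very different job from matching fixed-width address bits, which is part of the fast-switching concern.

```python
def longest_suffix_match(dest, table):
    """Return (announced_name, origin_as) for the most specific announced
    suffix of dest, or None if no suffix is announced.

    table is a hypothetical mapping of announced names to origin ASNs.
    """
    labels = dest.rstrip(".").lower().split(".")
    # Try the full name first, then drop the leftmost label each time.
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in table:
            return candidate, table[candidate]
    return None


announced = {"nanog.org": 64500}  # from the quoted example
print(longest_suffix_match("webserver1.nanog.org", announced))
```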

Also, you state two transit providers as if that is a simple router-implementable concept. That’s very hand-wavy in today’s world. Consider the following…

Let’s say we have transit providers A and B. You are ISP Q. You have border routers numbered Q01 to Q50 scattered around the world. Let’s say that each of these routers peers with at least one of {A,B} and that some of them peer with both.

For a packet arriving on any Qn router, the answer is relatively simple (pass it to a local peering session and you’re done).

Now, consider the scenario where 25% of your routers peer with A and 25% peer with B, but the two groups have a 50% overlap (12.5% of your total routers peer with A and B).

The packet arrives at one of your routers, Qn, that is not among the 37.5% of routers (12.5% A only, 12.5% B only, 12.5% both) that peer with either A or B.
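The percentages follow from simple inclusion-exclusion, which can be checked with a few lines. The function name and the dictionary keys are just labels chosen for this sketch.

```python
from fractions import Fraction


def peering_breakdown(frac_a, frac_b, frac_both):
    """Split routers into A-only, B-only, both, either and neither,
    given the fraction peering with A, with B, and with both."""
    a_only = frac_a - frac_both
    b_only = frac_b - frac_both
    either = a_only + b_only + frac_both
    return {
        "a_only": a_only,
        "b_only": b_only,
        "both": frac_both,
        "either": either,
        "neither": 1 - either,
    }


# 25% peer with A, 25% with B, 12.5% with both.
for group, share in peering_breakdown(Fraction(1, 4), Fraction(1, 4), Fraction(1, 8)).items():
    print(f"{group}: {float(share):.1%}")
```

The remaining 62.5% of routers have no direct session to either provider, and that is the group Qn belongs to.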

However, you have an indirect path on Qn via another transit AS Z that is announcing both A and B as reachable via its network.

Do you forward the packet internally to a router that can reach A or B? Do you pass the packet directly to Z? What existing or new knobs in BGP are used to control this? What are the default behaviors?

Today, the answer is simplified because you have a “Next-hop IP address” for each route and you forward the packet to the next hop of the best path received for that destination.

In your proposal, there are multiple “paths” to the same AS, but you haven’t allowed for the tracking/management of the multiple next-hops.
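A rough sketch of the difference, using the Qn scenario above. Everything here is assumed for illustration: the Path class, the labels, the addresses, the AS numbers standing in for A (64496), B (64497) and Z (64499), and a tie-break reduced to AS path length followed by IGP cost. Real BGP best-path selection has more steps than this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    exit_label: str   # human-readable description of the exit (hypothetical)
    next_hop: str     # next-hop address carried with the route
    as_path: tuple    # AS_PATH, origin AS last
    igp_cost: int     # assumed internal cost to reach the next-hop


def best_path(paths):
    """Greatly simplified selection: shortest AS path, then lowest IGP cost."""
    return min(paths, key=lambda p: (len(p.as_path), p.igp_cost))


def exits_for_as(paths, dest_as):
    """With only 'the destination is AS N', every exit remains a candidate."""
    return [p.exit_label for p in paths if p.as_path[-1] == dest_as]


PATHS_AT_QN = [
    Path("hand to Z directly", "203.0.113.1", (64499, 64496, 64500), 0),
    Path("internal to an A-peered router", "10.0.0.7", (64496, 64500), 40),
    Path("internal to a B-peered router", "10.0.0.9", (64497, 64500), 55),
]

print(best_path(PATHS_AT_QN).next_hop)   # one concrete next-hop to forward to
print(exits_for_as(PATHS_AT_QN, 64500))  # three candidates, no rule to choose
```

With a next-hop per route, the router ends up with exactly one place to send the packet. With only a destination AS, all three exits are equally valid until some new attribute or default behavior says otherwise.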

Of course, there’s also the issue of what happens when you forward a packet internally past a BGP-unaware router that doesn’t know about destinations {A,B} and thinks its best path is back towards where it came from, but that’s something we grapple with in today’s environment when faced with non-adjacent (indirect) next-hop information, so I would presume the same topological solutions would apply there.

> On the other hand, if you wish to simply load whatever comes up when you make an HTTPS request to port 443 on webserver1.nanog.org, you might enter "https://webserver1.nanog.org.address/" which will then skip the DNS check and simply try connecting to webserver1.nanog.org.
> 
> When I think about it, this actually seems shockingly reasonable, potentially massive RAM requirements for routers aside (we're getting there anyway!). Am I missing something? 

I think you’re missing several things, but the most glaring one is the one I outlined above.

Owen
