Google's peering, GGC, and congestion management
Patrick W. Gilmore
patrick at ianai.net
Thu Oct 15 21:46:31 UTC 2015
On Oct 15, 2015, at 5:13 PM, Baldur Norddahl <baldur.norddahl at gmail.com> wrote:
> On 15 October 2015 at 22:00, Patrick W. Gilmore <patrick at ianai.net> wrote:
>> The reason routers do not do that is what you suggest would not work.
> Of course it will work and it is in fact exactly the same as your own
> suggestion, just implemented in the network. Besides it _is already_ a
> standard feature, it is called equal cost multipath routing. The only
> difference is dynamically changing the weights between the multipaths.
You are confused. But I think I see the source of your confusion.
Perhaps you are only considering a single port on a multi-port router with many paths to the same destination. Sure, if you want to say when Port X gets full (FSVO “full”), move some flows to the second best path. Yes, that is physically possible.
However, that is a tiny fraction of CDN Mapping. Plus you have a vast number of assumptions - not the least of which is that there _is_ another port to move traffic to. How many CDN nodes have you seen? You think most of them have a ton of ports to a slew of different networks? Or do they plonk a bunch of servers behind a single router (or switch!) connected to a single network (since most of them are _inside_ that network)?
My original point is the CDN can control how much traffic is sent to each destination. Routers cannot do this.
BTW: What you suggest breaks a lot of other things, which may or may not be a good trade-off for avoiding congestion on individual ports. But the idea of making identical IP path decisions inside a single router non-deterministic is ... let’s call it questionable.
>> First, you make the incorrect assumption that inbound will never exceed
>> outbound. Almost all CDN nodes have far more capacity between the servers
>> and the router than the router has to the rest of the world. And CDN nodes
>> are probably the least complicated example in large networks. The only way
>> to ensure A < B is to control A or B - and usually A.
> I make absolutely no assumptions about ingress (towards the ASN) as we have
> no control of that. There is no requirement that routing is symmetric and
> it is the responsibility of whoever controls the ingress to do something if
> the port is overloaded in that direction. In the case of a CDN however, the
> ingress will be very little. Netflix does not take much data in from their
> customers, it is all egress traffic towards the customers and the CDN is in
> control of that. The same goes for Google.
> Two non CDN peers could use the system, but if the traffic level is
> symmetric then they better both do it.
You are still confused.
I have 48 servers connected @ GigE to a router with 4 x 10G outbound. When all 48 get nailed, where in the hell does the extra 8 Gbps go?
While if I own the CDN, I can easily ensure those 48 servers never push more than 40 Gbps. Or even 20 Gbps to any single destination. Or even 10 Mbps to any single destination.
The CDN can ensure the router is -never- congested. The router itself cannot do that.
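To put numbers on the example above, here is a minimal sketch of the arithmetic and of one way a CDN control plane could keep offered load under the router's egress. The function name `plan_rates`, the destination labels, and the demand figures are hypothetical illustrations, not anything a real CDN is known to use; only the 48 x GigE and 4 x 10G figures come from the example.

```python
# Illustrative only: plan_rates and the destination names are assumptions.

def plan_rates(demand_by_dest, egress_capacity, per_dest_cap):
    """Return per-destination send rates (Gbps) that respect both limits.

    Each destination is first clamped to per_dest_cap; if the clamped
    total still exceeds egress_capacity, every rate is scaled down
    proportionally so the sum fits the router's outbound capacity.
    """
    capped = {d: min(v, per_dest_cap) for d, v in demand_by_dest.items()}
    total = sum(capped.values())
    if total > egress_capacity:
        scale = egress_capacity / total
        capped = {d: v * scale for d, v in capped.items()}
    return capped

server_capacity = 48 * 1.0   # 48 servers, each connected at GigE
uplink_capacity = 4 * 10.0   # 4 x 10G outbound
excess = server_capacity - uplink_capacity  # what the router cannot forward

# Hypothetical demand adding up to everything the servers can push.
demand = {"isp-a": 30.0, "isp-b": 12.0, "isp-c": 6.0}

# Cap any single destination at 20 Gbps: the total already fits in 40 Gbps.
plan_20 = plan_rates(demand, uplink_capacity, per_dest_cap=20.0)

# With no effective per-destination cap, rates are scaled to fit the uplink.
plan_scaled = plan_rates(demand, uplink_capacity, per_dest_cap=48.0)
```

The point of the sketch is where the decision is made: the limit is applied before the traffic is sent, so the router never sees more than it can forward.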
>> Second, the router has no idea how much traffic is coming in at any
>> particular moment. Unless you are willing to move streams mid-flow, you
>> can’t guarantee this will work even if sum(in) < sum(out). Your idea would
>> put Flow N on Port X when the SYN (or SYN/ACK) hits. How do you know how
>> many Mbps that flow will be? You do not, therefore you cannot do it right.
>> And do not say you’ll wait for the first few packets and move then. Flows
>> are not static.
> Flows can move at any time in a BGP network. As we are talking about CDNs
> we can assume that we have many many small flows (compared to port size).
> We can be fairly sure that traffic will not make huge jumps from one second
> to the next - you will have a nice curve here. You know exactly how much
> traffic you had the last time period, both out through the contested port
> and through the alternative paths. Recalculating the weights is just a
> matter of assuming that the next time period will be the same or that the
> delta will be the same. It is a classic control loop problem. TCP is trying
> to do much the same btw.
> You can adjust how close to 100% you want the algorithm to hit. If it
> performs badly, give it a little bit more space.
> If the time period is one second, flows can move once a second at maximum
> and very few flows would be likely to move. You could get a few out of
> order packets on your flow, which is not such a big issue in a rare event.
This makes me lean towards my original idea that you have a total of one port on one router being considered.
Perhaps that is what the OP meant. If so, sure, have at it.
If they are interested in how CDN Mapping works, that is not even close.
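For reference, the single-port control loop described in the quoted text can be sketched roughly as follows. This is an illustration under stated assumptions, not a router feature: `primary_share`, the 90% target, and the traffic figures are all hypothetical, and the sketch takes the quoted text's assumption that the next period looks like the last one.

```python
# Illustrative only: primary_share and target_util are assumptions.

def primary_share(primary_mbps, alt_mbps, port_capacity_mbps, target_util=0.9):
    """Fraction of total traffic to keep on the preferred port next period.

    Assumes next period's total equals last period's measured total
    (primary + alternative paths), and aims to hold the preferred port
    at target_util of its capacity. The remainder goes to other paths.
    """
    total = primary_mbps + alt_mbps
    if total <= 0:
        return 1.0
    budget = port_capacity_mbps * target_util
    return min(1.0, budget / total)

# Last period: 9.5 Gbps on a 10G port, 1.5 Gbps already on alternative paths.
share_busy = primary_share(9500.0, 1500.0, 10000.0)

# Lightly loaded port: everything stays on the preferred path.
share_idle = primary_share(3000.0, 0.0, 10000.0)
```

Note what the sketch takes for granted: an alternative path with spare capacity exists, and total inbound demand is something the router can absorb. With a single uplink there is nowhere to move the other share of the traffic.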
>> Third…. Actually, since 1 & 2 are each sufficient to show why it doesn’t
>> work, not sure I need to go through the next N reasons. But there are
>> plenty more.
> There are more reasons why this problem is hard to do on the servers :-).
The problem is VERY hard on the servers. Or, more precisely, on the control plane (which is frequently not on the servers themselves).
But the difference between “it's hard” and “it's un-possible” is kinda important.