CDN Overload?

Mike Hammett nanog at ics-il.net
Tue Sep 20 19:18:14 UTC 2016


This is what I'm asking of them: 


===== 
Have you seen a CDN overloading a customer? Help me gather information on the issue. 

What CDN? 
What have you identified the traffic to be? 
What is the access network? 
Where is the rate limiting done? 
How is the rate limiting done (policing vs. queueing, SFQ, PFIFO, etc,, etc.)? 
What is doing the rate limiting? 
What is the rate-limit set to? 
Upstream of the rate-limiter, what are you seeing for inbound traffic? 
One connection or many? 
How much traffic? 
How does other traffic behave when exceeding the rate limit? 
Where is NAT performed? 
What is doing NAT? 
Shared NAT or isolated to that customer? 
Have you done a packet capture before and after the rate limiter? The NAT device? 
Would you be willing to send a filtered packet capture (only the frames that relate to this CDN) to the CDN if they want it? 



There have been reports of CDNs sending more traffic than the customer can handle and ignores TCP convention to slow down. Trying to investigate this thoroughly so we can get the CDN to fix their system. Multiple CDNs have been shown to do this. 
===== 




----- 
Mike Hammett 
Intelligent Computing Solutions 

Midwest Internet Exchange 

The Brothers WISP 

----- Original Message -----

From: "Mike Hammett" <nanog at ics-il.net> 
To: "NANOG" <nanog at nanog.org> 
Sent: Monday, September 19, 2016 12:34:48 PM 
Subject: CDN Overload? 

I participate on a few other mailing lists focused on eyeball networks. For a couple years I've been hearing complaints from this CDN or that CDN was behaving badly. It's been severely ramping up the past few months. There have been some wild allegations, but I would like to develop a bit more standardized evidence collection. Initially LimeLight was the only culprit, but recently it has been Microsoft as well. I'm not sure if there have been any others. 

The principal complaint is that upstream of whatever is doing the rate limiting for a given customer there is significantly more capacity being utilized than the customer has purchased. This could happen briefly as TCP adjusts to the capacity limitation, but in some situations this has persisted for days at a time. I'll list out a few situations as best as I can recall them. Some of these may even be merges of a couple situations. The point is to show the general issue and develop a better process for collecting what exactly is happening at the time and how to address it. 

One situation had approximately 45 megabit/s of capacity being used up by a customer that had a 1.5 megabit/s plan. All other traffic normally held itself within the 1.5 megabit/s, but this particular CDN sent excessively more for extended periods of time. 

An often occurrence has someone with a single digit megabit/s limitation consuming 2x - 3x more than their plan on the other side of the rate limiter. 

Last month on my own network I saw someone with 2x - 3x being consumed upstream and they had *190* connections downloading said data from Microsoft. 

The past week or two I've been hearing of people only having a single connection downloading at more than their plan rate. 


These situations effectively shut out all other Internet traffic to that customer or even portion of the network for low capacity NLOS areas. It's a DoS caused by downloads. What happened to the days of MS BITS and you didn't even notice the download happening? A lot of these guys think that the CDNs are just a pile of dicks looking to ruin everyone's day and I'm certain that there are at least a couple people at each CDN that aren't that way. ;-) 




Lots of rambling, sure. What do I need to have these guys collect as evidence of a problem and who should they send it to? 




----- 
Mike Hammett 
Intelligent Computing Solutions 

Midwest Internet Exchange 

The Brothers WISP 





More information about the NANOG mailing list