CDN Overload?

Bruce Curtis bruce.curtis at ndsu.edu
Thu Sep 22 21:28:17 UTC 2016


  I have seen traffic from Microsoft in Europe to single hosts on our campus that seemed unusually high (in bps) and unusually long-lived.

  I don’t recall whether the few hosts I noticed this on over time were all on our campus wifi.

  If not, perhaps the common factor is longer latency?  Both connections over wireless and connections from Europe to the US would have longer latency.

  Perhaps this longer latency, combined with some other factor, is triggering a bug in modern TCP Congestion Control algorithms?



The presentation linked below mentions that there have been bugs in TCP Congestion Control algorithm implementations.   Perhaps there could be other bugs that result in the described issue?

https://www.microsoft.com/en-us/research/wp-content/uploads/2016/08/ms_feb07_eval.ppt.pdf


I have seen cases on our campus where buffers that were too small on an Ethernet switch caused a Linux TCP Congestion Control algorithm to act badly, resulting in slower downloads than a simple algorithm that depended on dropped packets rather than trying to determine window sizes etc.  The fix in that case was to increase the buffer size.  Of course bufferbloat is also known to play havoc with TCP Congestion Control algorithms.  Just wondering if some combination of higher latency and another unknown variable, or just a bug, might cause a TCP Congestion Control algorithm to think it can safely try to increase the transmit rate?
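
For a rough sense of why latency and buffer size interact here, the bandwidth-delay product (link rate times round-trip time) gives the amount of data a sender has to keep in flight to fill a path. The sketch below is only illustrative: the function name, the 100 Mbit/s rate, and the RTT figures are assumptions picked for the example, not measurements from our campus or from any CDN.

```python
def bdp_bytes(bandwidth_bps: float, rtt_seconds: float) -> float:
    """Bandwidth-delay product: bytes that must be in flight to fill the path."""
    return bandwidth_bps * rtt_seconds / 8  # bits -> bytes


# Hypothetical paths, all at an assumed 100 Mbit/s; RTTs are guesses for illustration.
example_paths = {
    "wired, same region": (100e6, 0.020),
    "campus wifi": (100e6, 0.040),
    "Europe to US": (100e6, 0.120),
}

for name, (rate_bps, rtt_s) in example_paths.items():
    kib = bdp_bytes(rate_bps, rtt_s) / 1024
    print(f"{name}: about {kib:.0f} KiB in flight to fill the path")
```

The longer the RTT, the more data is already in flight before the sender sees any loss or delay signal, so a switch buffer that is adequate for a short path can be too small for a long one.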


> On Sep 21, 2016, at 8:29 PM, Mike Hammett <nanog at ics-il.net> wrote:
> 
> Thanks Marty. I have only experienced this on my network once and it was directly with Microsoft, so I haven't done much until a couple days ago when I started this campaign. I don't know if anyone else has brought this to anyone's attention. I just sent an e-mail to Owen when I saw yours. 
> 
> 
> 
> 
> ----- 
> Mike Hammett 
> Intelligent Computing Solutions 
> 
> Midwest Internet Exchange 
> 
> The Brothers WISP 
> 
> ----- Original Message -----
> 
> From: "Martin Hannigan" <hannigan at gmail.com> 
> To: "Mike Hammett" <nanog at ics-il.net> 
> Cc: "NANOG" <nanog at nanog.org> 
> Sent: Wednesday, September 21, 2016 8:19:35 PM 
> Subject: Re: CDN Overload? 
> 
> 
> 
> 
> 
> Mike, 
> 
> 
> I will forward to the requisite group for a look. Have you brought this to our attention previously? I don't see anything. If you did, please forward me the ticket numbers or message(s) ([email protected] is best) so we can track it down and see if someone already has it in queue. 
> 
> 
> Jared alluded to fasttcp a few emails ago. Astute man. 
> 
> 
> Best, 
> 
> 
> Martin Hannigan 
> AS 20940 // AS 32787 
> 
> 
> 
> 
> 
> On Sep 21, 2016, at 14:30, Mike Hammett < nanog at ics-il.net > wrote: 
> 
> 
> 
> 
> https://docs.google.com/spreadsheets/d/1Jdm0dOBf81kSnXEvVfI6ZJbWFNt5AbYUV8CDxGwLSm8/edit?usp=sharing 
> 
> I have made the anonymized answers public. This will obviously have some bias to it given that I mostly know fixed wireless operators, but I'm hoping this gets some good distribution to catch more platforms. 
> 
> 
> 
> 
> ----- 
> Mike Hammett 
> Intelligent Computing Solutions 
> 
> Midwest Internet Exchange 
> 
> The Brothers WISP 
> 
> ----- Original Message ----- 
> 
> From: "Mike Hammett" < nanog at ics-il.net > 
> To: "NANOG" < nanog at nanog.org > 
> Sent: Wednesday, September 21, 2016 9:08:55 AM 
> Subject: Re: CDN Overload? 
> 
> https://goo.gl/forms/LvgFRsMdNdI8E9HF3 
> 
> I have made this into a Google Form to make it easier to track compared to randomly formatted responses on multiple mailing lists, Facebook Groups, etc. 
> 
> 
> 
> 
> ----- 
> Mike Hammett 
> Intelligent Computing Solutions 
> 
> Midwest Internet Exchange 
> 
> The Brothers WISP 
> 
> ----- Original Message ----- 
> 
> From: "Mike Hammett" < nanog at ics-il.net > 
> To: "NANOG" < nanog at nanog.org > 
> Sent: Monday, September 19, 2016 12:34:48 PM 
> Subject: CDN Overload? 
> 
> 
> I participate on a few other mailing lists focused on eyeball networks. For a couple years I've been hearing complaints that this CDN or that CDN was behaving badly. It's been severely ramping up the past few months. There have been some wild allegations, but I would like to develop a bit more standardized evidence collection. Initially LimeLight was the only culprit, but recently it has been Microsoft as well. I'm not sure if there have been any others. 
> 
> The principal complaint is that upstream of whatever is doing the rate limiting for a given customer there is significantly more capacity being utilized than the customer has purchased. This could happen briefly as TCP adjusts to the capacity limitation, but in some situations this has persisted for days at a time. I'll list out a few situations as best as I can recall them. Some of these may even be merges of a couple situations. The point is to show the general issue and develop a better process for collecting what exactly is happening at the time and how to address it. 
> 
> One situation had approximately 45 megabit/s of capacity being used up by a customer that had a 1.5 megabit/s plan. All other traffic normally held itself within the 1.5 megabit/s, but this particular CDN sent excessively more for extended periods of time. 
> 
> A frequent occurrence has someone with a single-digit megabit/s limitation consuming 2x - 3x more than their plan on the other side of the rate limiter. 
> 
> Last month on my own network I saw someone with 2x - 3x being consumed upstream and they had *190* connections downloading said data from Microsoft. 
> 
> The past week or two I've been hearing of people only having a single connection downloading at more than their plan rate. 
> 
> 
> These situations effectively shut out all other Internet traffic to that customer or even portion of the network for low capacity NLOS areas. It's a DoS caused by downloads. What happened to the days of MS BITS and you didn't even notice the download happening? A lot of these guys think that the CDNs are just a pile of dicks looking to ruin everyone's day and I'm certain that there are at least a couple people at each CDN that aren't that way. ;-) 
> 
> 
> 
> 
> Lots of rambling, sure. What do I need to have these guys collect as evidence of a problem and who should they send it to? 
> 
> 
> 
> 
> ----- 
> Mike Hammett 
> Intelligent Computing Solutions 
> 
> Midwest Internet Exchange 
> 
> The Brothers WISP 
> 
> 
> 
> 
> 
> 

---
Bruce Curtis                         bruce.curtis at ndsu.edu
Certified NetAnalyst II                701-231-8527
North Dakota State University        




