CDN Overload?

George Skorup george at cbcast.com
Tue Sep 20 06:14:26 UTC 2016


I have witnessed this issue first hand for several years. Four for sure, 
maybe five or six. The very first one I remember is a customer doing 
Usenet downloads and using what he called an "internet download manager" 
which I assumed was screwing with TCP ACKs. I believe he was a 4Mbps 
user at the time and this download manager thing was causing 2 to maybe 
2.5x his subscribed rate, as Mike says, on the upstream facing router 
interface. He shut down or uninstalled the software and it stopped. Yes, 
this customer is on PTMP fixed wireless. Traffic policing was taking 
place via a MikroTik simple queue at the site router. I could cut his 
downstream rate in half and it would follow with double still hitting 
the backhaul. I could also move his queue all the way to the border 
router and it was still there at double rate.

BTW, we still have this guy as a customer on fixed wireless. He's been 
on 25/5Mbps for over a year. And we're about to upgrade him to 50/10Mbps 
with new gear. 25/5 and 50/10 is a far cry from this claimed "slow" WISP 
service. This shit ain't cheap to get to bumfsck Illinois so farmer Joe 
can watch porn and his kids can watch Netflix at the same time. Yup, we 
have slow NLOS service too, because customers decide they want the rural 
life buried in a mile of trees while "needing" the city benefits. If you 
want the gigabits, then move outta the sticks. Running a hundred 
combined miles of fiber to get to 20 customers that want to pay less 
than $50/mo is not feasible. /rant off

Another time, maybe three years ago, we had a customer on Canopy 5.7 FSK 
at 4/1Mbps using the built-in QoS. He was watching Netflix and I saw 
8Mbps hitting the AP's ethernet interface. I thought the Canopy 
scheduler was broken. Until I looked deeper and saw that it was working 
exactly as designed... with a 50% discard rate on his VC. I want to say 
this was from LLNW at the time. I could be totally wrong about that, I 
really don't remember.
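
For what it's worth, the 50% figure is just what the arithmetic gives when a sender keeps offering 8Mbps into a 4Mbps limit and never backs off. A minimal sketch of that calculation (the function name is made up for illustration, not anything from Canopy or MikroTik):

```python
def discard_fraction(offered_mbps: float, limit_mbps: float) -> float:
    """Fraction of offered traffic a rate limiter has to drop
    when the sender does not slow down to the limit."""
    if offered_mbps <= 0:
        raise ValueError("offered_mbps must be positive")
    if offered_mbps <= limit_mbps:
        return 0.0  # under the limit, nothing needs to be dropped
    return 1.0 - limit_mbps / offered_mbps


# 8Mbps arriving at the AP for a 4Mbps subscriber
print(discard_fraction(8.0, 4.0))
```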

Now let's move on to the Windows 10 updates. A 'buried in the sticks' customer 
on Canopy 900 FSK. 1.5Mbps/384k. Multiple streams from Microsoft and 
LLNW at the same time. LLNW alone had maybe 10 streams going and was 
sending at over 15Mbps on average and at worst about 25Mbps... to a 
1.5Mbps subscriber. I could throw in a MikroTik queue upstream which 
only moved the problem as that 15-25Mbps was still hitting backhaul 
links. And when I have a 100Mbps link going into the site, 25Mbps is a lot.
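
To put rough numbers on that, here is a small sketch that works out how far over the plan rate the offered traffic was and how much of the backhaul it ate. The rates are the ones quoted above; the function and constant names are only illustrative.

```python
def overload_factor(offered_mbps: float, plan_mbps: float) -> float:
    """How many times the subscribed rate is being sent toward the customer."""
    return offered_mbps / plan_mbps


def backhaul_share(offered_mbps: float, link_mbps: float) -> float:
    """Fraction of the backhaul link consumed by the offered traffic."""
    return offered_mbps / link_mbps


PLAN_MBPS = 1.5    # subscriber's downstream rate
LINK_MBPS = 100.0  # link going into the site

for offered in (15.0, 25.0):  # average and worst case seen from LLNW
    factor = overload_factor(offered, PLAN_MBPS)
    share = backhaul_share(offered, LINK_MBPS)
    print(f"{offered:.0f}Mbps offered: {factor:.1f}x plan, {share:.0%} of link")
```

So one 1.5Mbps subscriber works out to roughly 10x to 17x his plan rate, and up to a quarter of the link into the site.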

We've had numerous customers call in for the last month or two with 'teh 
innernets is down, my phoen wyfy don't work either'. No, your Windows 10 
updates are overloading your service. Shut off your PC to use your 
internet service. Telling a customer those exact words is ridiculous, 
but we have to do it.

We had a known issue with a particular licensed microwave vendor's 
radios that we have in use. It was the ethernet buffer becoming 
saturated at nowhere near the RF link capacity. They put out a new 
software release and that was resolved. And that was well before this 
Windows 10 update overload stuff started.

Normal TCP congestion control behavior works perfectly fine. It's not 
the network. It's the sender not doing normal TCP stuffs. I don't know 
why the CDNs and/or Microsoft thinks this is a good idea, but to me, it 
looks like a DDoS. I'm on some of the same lists as Mike and we know of 
many others reporting similar issues. A couple to the tune of 50-100Mbps 
overload destined for 5 or 10Mbps tier subscribers. So thanks to Mike 
for trying to get a conversation going on this topic. And it's not just 
us red headed step children WISPs.
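
On the evidence question, one thing that might help is separating the brief overshoot you expect while TCP adjusts from overload that just keeps going. A minimal sketch, assuming you already have per-interval rate samples (in Mbps) measured upstream of the rate limiter for one customer. The sample values, the 1.2x tolerance and the function name are all made up for illustration.

```python
from typing import List, Tuple


def sustained_overload(
    samples_mbps: List[float],
    plan_mbps: float,
    min_consecutive: int,
    tolerance: float = 1.2,
) -> List[Tuple[int, int]]:
    """Return (start_index, length) for every run of at least
    min_consecutive samples above tolerance * plan_mbps."""
    threshold = plan_mbps * tolerance
    runs = []
    start = None
    for i, rate in enumerate(samples_mbps):
        if rate > threshold:
            if start is None:
                start = i  # a new run of overload begins here
        else:
            if start is not None and i - start >= min_consecutive:
                runs.append((start, i - start))
            start = None
    # close out a run that lasts until the final sample
    if start is not None and len(samples_mbps) - start >= min_consecutive:
        runs.append((start, len(samples_mbps) - start))
    return runs


# Hypothetical one-minute averages for a 1.5Mbps subscriber
samples = [1.4, 6.0, 1.5, 15.2, 18.9, 24.7, 16.1, 1.3]
print(sustained_overload(samples, plan_mbps=1.5, min_consecutive=3))
```

With that made-up data the single spike at index 1 is ignored and the four-sample run starting at index 3 is reported. Real thresholds and sample intervals would need tuning per network.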

On 9/19/2016 10:05 PM, Mike Hammett wrote:
> http://www.theregister.co.uk/2016/06/08/is_win_10_ignoring_sysadmins_qos_settings/
>
> This explains the recent situations (well, not really an explanation, but a bit more information from other people). Not so much for the ones going back a year or two.
>
>
>
>
> -----
> Mike Hammett
> Intelligent Computing Solutions
>
> Midwest Internet Exchange
>
> The Brothers WISP
>
> ----- Original Message -----
>
> From: "Mike Hammett" <nanog at ics-il.net>
> To: "NANOG" <nanog at nanog.org>
> Sent: Monday, September 19, 2016 12:34:48 PM
> Subject: CDN Overload?
>
> I participate on a few other mailing lists focused on eyeball networks. For a couple of years I've been hearing complaints that this CDN or that CDN was behaving badly. It's been severely ramping up the past few months. There have been some wild allegations, but I would like to develop a bit more standardized evidence collection. Initially LimeLight was the only culprit, but recently it has been Microsoft as well. I'm not sure if there have been any others.
>
> The principal complaint is that upstream of whatever is doing the rate limiting for a given customer there is significantly more capacity being utilized than the customer has purchased. This could happen briefly as TCP adjusts to the capacity limitation, but in some situations this has persisted for days at a time. I'll list out a few situations as best as I can recall them. Some of these may even be merges of a couple situations. The point is to show the general issue and develop a better process for collecting what exactly is happening at the time and how to address it.
>
> One situation had approximately 45 megabit/s of capacity being used up by a customer that had a 1.5 megabit/s plan. All other traffic normally held itself within the 1.5 megabit/s, but this particular CDN sent excessively more for extended periods of time.
>
> A frequent occurrence has someone with a single digit megabit/s limitation consuming 2x - 3x more than their plan on the other side of the rate limiter.
>
> Last month on my own network I saw someone with 2x - 3x being consumed upstream and they had *190* connections downloading said data from Microsoft.
>
> The past week or two I've been hearing of people only having a single connection downloading at more than their plan rate.
>
>
> These situations effectively shut out all other Internet traffic to that customer or even portion of the network for low capacity NLOS areas. It's a DoS caused by downloads. What happened to the days of MS BITS and you didn't even notice the download happening? A lot of these guys think that the CDNs are just a pile of dicks looking to ruin everyone's day and I'm certain that there are at least a couple people at each CDN that aren't that way. ;-)
>
>
>
>
> Lots of rambling, sure. What do I need to have these guys collect as evidence of a problem and who should they send it to?
>
>
>
>
> -----
> Mike Hammett
> Intelligent Computing Solutions
>
> Midwest Internet Exchange
>
> The Brothers WISP
>
>



