"Does TCP Need an Overhaul?" (internetevolution, via slashdot)

Kevin Day toasty at dragondata.com
Sat Apr 5 13:40:54 UTC 2008



On Apr 5, 2008, at 7:49 AM, Paul Vixie wrote:

>> You've also got fast retransmit, New Reno, BIC/CUBIC, as well as host
>> parameter caching to limit the effect of packet loss on recovery time.
>> I don't doubt that someone else could do a better job than I did in
>> this field, but I'd be really curious to know how much of an effect an
>> intermediary router can have on a TCP flow with SACK that doesn't
>> cause more packet loss than anyone would put up with for interactive
>> sessions.
>
> my takeaway from the web site was that one of the ways p2p is bad is
> that it tends to start several parallel tcp sessions from the same
> client (i guess think of bittorrent where you're getting parts of the
> file from several folks at once).  since each one has its own state
> machine, each will try to sense the end to end bandwidth-delay
> product.  thus, on headroom-free links, each will get 1/Nth of that
> link's bandwidth, which could be (M>1)/Nth aggregate, and apparently
> this is unfair to the other users depending on that link.

This is true. But it's not just bittorrent that does this. IE8 opens  
up to 6 parallel TCP sockets to a single server, Firefox can be  
tweaked to open an arbitrary number (and a lot of "Download  
Optimizers" do exactly that), etc. Unless you're keeping a lot of  
state on the history of what each client is doing, it's going to be  
hard to tell the difference between 6 IE sockets downloading cnn.com  
rapidly and bittorrent masquerading as HTTP.
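
To make that arithmetic concrete, here's a toy Python model of the
fairness math - the link speed and client counts are invented, and it
assumes the idealized case where long-lived TCP flows converge to equal
per-flow shares of a bottleneck:

# One client opens M parallel flows; N other flows share the bottleneck.
# Idealized assumption: every flow converges to an equal share.
def aggregate_share(link_mbps, m_flows, n_other_flows):
    per_flow = link_mbps / (m_flows + n_other_flows)
    return m_flows * per_flow        # the M-flow client's aggregate

link = 100.0    # Mbps, hypothetical bottleneck
for m in (1, 6, 30):
    got = aggregate_share(link, m, n_other_flows=9)
    print("%2d flows -> %5.1f Mbps (%3.0f%% of the link)"
          % (m, got, 100 * got / link))

With nine single-flow neighbors, one flow gets 10% of the link, six get
40%, and thirty get 77% - each individual flow played "fair", but the
user as a whole didn't.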

>
> i guess i can see the point, if i squint just right.  nobody wants to
> get blown off the channel because someone else gamed the fairness
> mechanisms.  (on the other hand some tcp stacks are deliberately
> overaggressive in ways that don't require M>1 connections to get
> (M>1)/Nth of a link's bandwidth.  on the internet, generally speaking,
> if someone else says fairness be damned, then fairness will be
> damned.)
>

Exactly. I'm nervously waiting for the first bittorrent client to ship
with its own TCP engine built in that plays even more unfairly. I seem
to remember a paper describing a client that sent ACKs faster than it
was actually receiving the data from several well-connected servers,
and ended up pulling in enough traffic to completely swamp its
university's pipes.
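
The trick being described is usually called optimistic ACKing: ACK data
before it arrives and the sender never sees loss. A toy model of why
it's so effective (not a real TCP stack; the loss rate and round count
are invented):

LOSS_RATE = 0.05    # hypothetical: 5% of segments lost per round

def final_window(rounds, optimistic):
    cwnd = 1.0
    for _ in range(rounds):
        # An honest receiver's ACKs reveal the loss; an optimistic
        # receiver ACKs everything, so the sender never backs off.
        if optimistic or cwnd * LOSS_RATE < 1:
            cwnd *= 2    # no loss signal seen: slow-start doubling
        else:
            cwnd /= 2    # loss detected: multiplicative decrease
    return cwnd

print("honest receiver:     cwnd ~", final_window(10, optimistic=False))
print("optimistic receiver: cwnd ~", final_window(10, optimistic=True))

After ten rounds the honest receiver's sender is stuck oscillating
around 16-32 segments; the optimistic receiver's sender is at 1024 and
still doubling.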

As soon as P2P authors realize they can get around caps by not playing
by the rules, you'll be back to putting hard limits on each subscriber
- which is where we are now. I'm not saying some fancier magic couldn't
be layered on top of that, but it all depends on everyone playing by
the rules to begin with.
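
Those hard limits don't assume any cooperation from the endpoints -
they're typically just a token bucket per subscriber, enforced below
TCP. A minimal sketch (the rate, burst size, and keying are invented
for illustration):

import time
from collections import defaultdict

RATE_BPS = 125_000    # hypothetical cap: ~1 Mbit/s per subscriber
BURST    = 32_000     # bytes of burst allowance

class Bucket:
    def __init__(self):
        self.tokens = BURST
        self.last = time.monotonic()

    def allow(self, nbytes):
        now = time.monotonic()
        self.tokens = min(BURST,
                          self.tokens + (now - self.last) * RATE_BPS)
        self.last = now
        if self.tokens >= nbytes:
            self.tokens -= nbytes
            return True     # under the cap: forward the packet
        return False        # over the cap: drop (or queue) it

buckets = defaultdict(Bucket)

def forward(subscriber_id, packet_len):
    # Every flow from one subscriber drains the same bucket, so opening
    # more connections or running a cheating TCP stack buys nothing.
    return buckets[subscriber_id].allow(packet_len)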

> however, i'm not sure that all TCP sessions having one endpoint in
> common or even all those having both endpoints in common ought to
> share fate.  one of those endpoints might be a NAT box with M>1 users
> behind it, for example.
>
> in answer to your question about SACK, it looks like they simulate a
> slower link speed for all TCP sessions that they guess are in the same
> flow-bundle.  thus, all sessions in that flow-bundle see a single
> shared contributed bandwidth-delay product from any link served by one
> of their boxes.

Yeah, I guess the point I was trying to make is that once you throw
SACK into the equation, you lose the assumption that if you drop TCP
packets, TCP slows down. Before New Reno, fast retransmit, and SACK,
this was true and very easy to model. Now you can drop a considerable
number of packets and TCP doesn't slow down very much, if at all. If
you're worried about data your clients are downloading, you're either
throwing away data from the server (which has already wasted bandwidth
getting all the way to you) or throwing away your clients' ACKs. And
because ACKs are cumulative - each one covers everything before it -
lost ACKs do almost nothing to slow down TCP unless you've thrown them
*all* away.
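
A quick toy run makes the point - any ACK that survives acknowledges
everything sent before it (the 90% loss figure is deliberately
extreme):

import random

random.seed(1)
SEGMENTS = 1000
ACK_LOSS = 0.90    # drop 90% of the client's ACKs

highest_acked = 0
for seq in range(1, SEGMENTS + 1):
    # The receiver emits a cumulative ACK covering everything through
    # seq; most are dropped by our hypothetical rate limiter.
    if random.random() > ACK_LOSS:
        highest_acked = seq

print("acknowledged up to segment", highest_acked, "of", SEGMENTS)

Even throwing away 90% of the ACKs, the sender's acknowledged pointer
ends up within a few segments of the end. The transfer only stalls when
*no* ACKs get through.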

I'm not saying all of this is completely useless, but it relies heavily
on the people you're trying to rate limit playing by the same rules you
intended. This makes me really wish that something like ECN had taken
off - any router between the two endpoints can say "slow this
connection down" and (if both ends are playing by the rules) they do so
without wasting time on retransmits.
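
For reference, the whole ECN mechanism (RFC 3168) fits in two IP-header
bits. A sketch of the router's half of it, with the codepoints spelled
out:

NOT_ECT = 0b00    # endpoints not ECN-capable
ECT_1   = 0b01    # ECN-capable transport
ECT_0   = 0b10    # ECN-capable transport
CE      = 0b11    # congestion experienced, set by a router

def router_forward(ecn_bits, queue_congested):
    # Returns (new_ecn_bits, dropped) for one packet at this queue.
    if not queue_congested:
        return ecn_bits, False
    if ecn_bits in (ECT_0, ECT_1):
        # Mark instead of drop; the receiver echoes ECE to the sender,
        # which slows down as if the packet had been lost - no
        # retransmit required.
        return CE, False
    return ecn_bits, True    # legacy traffic still gets dropped

print(router_forward(ECT_0, queue_congested=True))     # marked, kept
print(router_forward(NOT_ECT, queue_congested=True))   # dropped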

-- Kevin





