fixing TCP buffers (Re: packet reordering at exchange points)

Richard A Steenbergen ras at e-gerbil.net
Tue Apr 9 23:17:44 UTC 2002


On Tue, Apr 09, 2002 at 10:51:27PM +0000, E.B. Dreger wrote:
> 
> But how many simultaneous connections?  Until TCP stacks start
> using window autotuning (of which I know you're well aware), we
> must either use suboptimal windows or chew up ridiculous amounts
> of memory.  Yes, bad software, but still a limit...

That's precisely what I meant by bad software, along with the server code 
that pushes the data out in the first place. And for that matter, the 
receiver side is just as important.

> It would be nice to allocate a 32MB chunk of RAM for buffers,
> then dynamically split it between streams.  Fragmentation makes
> that pretty much impossible.
> 
> OTOH... perhaps that's a reasonable start:
> 
> 1. Alloc buffer of size X
> 2. Let it be used for Y streams
> 3. When we have Y streams, split each stream "sub-buffer" into Y
>    parts, giving capacity for Y^2 streams.

You don't actually allocate the buffers until you have something to put in
them; you're just fixing a limit on the maximum you're willing to 
allocate. The problem comes from the fact that you're fixing the limits on 
a "per-socket" basis, not on a "total system" basis.
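
To put rough numbers on the per-socket versus total-system point, here is a
toy model (not code from any real stack). Socket, enqueue, worst_case and
in_use are hypothetical names, and the 64KB limit and 1000 sockets are
made-up figures chosen only for illustration.

```python
class Socket:
    """Toy socket buffer: the limit is only a ceiling, not an allocation."""

    def __init__(self, limit):
        self.limit = limit   # most this socket is ever allowed to hold
        self.queued = 0      # bytes actually sitting in memory right now

    def enqueue(self, nbytes):
        """Queue up to nbytes, return how many fit under the limit."""
        accepted = min(nbytes, self.limit - self.queued)
        self.queued += accepted
        return accepted


def worst_case(sockets):
    """Memory the system has promised if every socket fills up."""
    return sum(s.limit for s in sockets)


def in_use(sockets):
    """Memory actually consumed at this moment."""
    return sum(s.queued for s in sockets)


if __name__ == "__main__":
    socks = [Socket(64 * 1024) for _ in range(1000)]
    socks[0].enqueue(100000)  # one busy flow, capped by its own limit
    print("in use:    ", in_use(socks))
    print("worst case:", worst_case(socks))
```

The busy flow is capped at its own small limit while the other sockets sit
idle, yet the worst case the system has signed up for still grows with the
number of sockets. Nothing in the per-socket scheme bounds the total.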

> Aggregate transmission can't exceed line rate.  So instead of
> fixed-size buffers for each stream, perhaps our TOTAL buffer size
> should remain constant.
> 
> Use PSC-style autotuning to eke out more capacity/performance,
> instead of using fixed value of "Y" or splitting each and every
> last buffer.  (Actually, I need to reread/reexamine the PSC code
> in case it actually _does_ use a fixed total buffer size.)
> 
> This shouldn't be terribly hard to hack into an IP stack...

Actually, here's an even simpler one. Define a global limit for this;
something like 32MB would be more than reasonable. Then instead of
advertising the space "remaining" in individual socket buffers, advertise
the total space remaining in this virtual memory pool. If you overrun your
buffer, you might have the other side send you a few unnecessary bytes
that you just have to drop, but the situation should correct itself very
quickly. I don't think this would be "unfair" to any particular flow, 
since you've eliminated the concept of one flow "hogging" the socket 
buffer and left it to TCP to work out the sharing of the link. Second 
opinions?
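
A rough sketch of the shared-pool idea above, as a toy model rather than
anything from a real IP stack. SharedPool and its methods are hypothetical
names. The max_window cap is an added assumption: it stands in for the
16-bit TCP window field, which window scaling would raise.

```python
class SharedPool:
    """Toy receive-buffer pool shared by every flow on the host."""

    def __init__(self, total):
        self.total = total   # global limit, e.g. 32 * 1024 * 1024
        self.used = 0        # bytes currently buffered across all flows

    def advertised_window(self, max_window=65535):
        """Window any flow advertises: space left in the whole pool,
        capped by what the window field can express (assumption)."""
        return min(self.total - self.used, max_window)

    def receive(self, nbytes):
        """Accept what fits and return that count. The rest is dropped
        and left for the sender to retransmit."""
        accepted = min(nbytes, self.total - self.used)
        self.used += accepted
        return accepted

    def consume(self, nbytes):
        """Application reads data, returning space to the pool."""
        released = min(nbytes, self.used)
        self.used -= released
        return released


if __name__ == "__main__":
    pool = SharedPool(150)
    # Two flows are both told the same window, so together they overrun.
    window = pool.advertised_window(max_window=100)
    first = pool.receive(window)
    second = pool.receive(window)
    print("accepted:", first, second, "next window:", pool.advertised_window())
```

Because every flow is told about the same free space, two senders can
overshoot together. The second one loses the excess bytes, the next
advertisement drops to zero, and the window reopens as the application
drains the pool.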

-- 
Richard A Steenbergen <ras at e-gerbil.net>       http://www.e-gerbil.net/ras
PGP Key ID: 0x138EA177  (67 29 D7 BC E8 18 3E DA  B2 46 B3 D8 14 36 FE B6)


