TCP time_wait and port exhaustion for servers

Ray Soucy rps at maine.edu
Thu Dec 6 13:58:10 UTC 2012


> net.ipv4.tcp_keepalive_intvl = 15
> net.ipv4.tcp_keepalive_probes = 3
> net.ipv4.tcp_keepalive_time = 90
> net.ipv4.tcp_fin_timeout = 30

As discussed, those do not affect TCP_TIMEWAIT_LEN.

There is a lot of misinformation out there on this subject so please
don't just Google for 5 min. and chime in with a "solution" that you
haven't verified yourself.

We can expand the ephemeral port range to a full 60K (and we have,
as a band-aid), but that only delays the issue as use grows.  I can
verify that changing it via:

echo 1025 65535 > /proc/sys/net/ipv4/ip_local_port_range

does work for the full range, as a spot check shows ports as low as
2000 and as high as 64000 being used.
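
A spot check like this can be scripted. The following is a minimal
sketch, assuming the Linux /proc/net/tcp text format (hex
address:port in the second column, hex state in the fourth, with 06
meaning TIME_WAIT). The function name summarize_local_ports is made up
for illustration, and it does not distinguish outgoing from incoming
connections, so sockets on the server's listening port are counted too.

```python
def summarize_local_ports(proc_net_tcp_text):
    """Return (lowest_port, highest_port, time_wait_count) for IPv4 TCP sockets.

    Expects the text of /proc/net/tcp (assumed format: header row, then
    one socket per line with the local address in column 2 and the state
    in column 4).
    """
    TIME_WAIT = "06"  # state code /proc/net/tcp uses for TIME_WAIT
    ports = []
    time_wait = 0
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue  # ignore blank or truncated lines
        # Local address looks like "0100007F:07D0"; the port is hex.
        local_port = int(fields[1].rsplit(":", 1)[1], 16)
        ports.append(local_port)
        if fields[3] == TIME_WAIT:
            time_wait += 1
    if not ports:
        return None, None, 0
    return min(ports), max(ports), time_wait


# Typical use on a Linux host:
#   with open("/proc/net/tcp") as f:
#       print(summarize_local_ports(f.read()))
```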

While this works fine for the majority of our sites, as they average
well below that, a handful can spike above 1000 connections per second
during peak hours. We would really like to be able to sustain something
closer to 2000 or 2500 connections per second for the amount of
bandwidth being delivered through the unit (full gigabit).
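
The arithmetic behind those numbers, as a rough sketch. It assumes
each outgoing connection ties up one local port for the whole TIME_WAIT
period, that TIME_WAIT lasts 60 seconds (the value TCP_TIMEWAIT_LEN is
defined as in the Linux kernel source), and a single source address. It
ignores that the kernel may reuse a local port toward a different
destination, so treat the result as a worst case. The function names
are illustrative.

```python
import math

TIME_WAIT_SECONDS = 60  # assumed: unmodified TCP_TIMEWAIT_LEN on Linux


def max_sustained_rate(low_port, high_port, hold_seconds=TIME_WAIT_SECONDS):
    """Connections per second one source address can sustain (worst case)."""
    ports = high_port - low_port + 1  # the range is inclusive
    return ports / hold_seconds


def source_addresses_needed(target_rate, low_port, high_port,
                            hold_seconds=TIME_WAIT_SECONDS):
    """Source addresses required to reach target_rate connections per second."""
    per_address = max_sustained_rate(low_port, high_port, hold_seconds)
    return math.ceil(target_rate / per_address)


# With the expanded range 1025-65535 there are 64511 ports, which works
# out to roughly 1075 new outgoing connections per second.
rate = max_sustained_rate(1025, 65535)
```

Under the same assumptions, source_addresses_needed(2500, 1025, 65535)
comes out to 3.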

But ideally we would find a way to significantly reduce the number of
ports being chewed up for outgoing connections.

On the incoming side everything just makes use of the server port
locally so it's not an issue.

We are trying to avoid using multiple source addresses for this, as it
would involve a fairly large configuration change to 100+ units, each
requiring coordination with the end user, but it is a last-resort
option.
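
For reference, the multiple-source-address approach works because a TCP
connection is identified by source address, source port, destination
address, and destination port, so each extra source address brings its
own ephemeral port space. The following is a minimal Python sketch of
the idea, not how squid does it (squid has its own tcp_outgoing_address
directive for this). The addresses are placeholders from the
documentation range and the helper names are hypothetical.

```python
import itertools
import socket


def make_source_picker(source_addresses):
    """Return a function that hands out source addresses round-robin."""
    cycle = itertools.cycle(source_addresses)
    return lambda: next(cycle)


def connect_from(picker, dest_host, dest_port, timeout=5.0):
    """Open an outgoing connection bound to the next source address.

    Port 0 lets the kernel choose an ephemeral port on that address.
    """
    return socket.create_connection(
        (dest_host, dest_port),
        timeout=timeout,
        source_address=(picker(), 0),
    )


# Placeholder addresses; real ones must be configured on the host.
picker = make_source_picker(["192.0.2.10", "192.0.2.11"])
```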

The other issue is that this is all essentially squid, so a drastic
re-design of how it handles networking is not ideal either.




On Thu, Dec 6, 2012 at 8:25 AM, Kyrian <kyrian at ore.org> wrote:
> On  5 Dec 2012, rps at maine.edu wrote:
>
>> > Where there is no way to change this though /proc
>
>
> ...
>
>> Those netfilter connection tracking tunables have nothing to do with the
>> kernel's TCP socket handling.
>>
> No, but these do...
>
> net.ipv4.tcp_keepalive_intvl = 15
> net.ipv4.tcp_keepalive_probes = 3
> net.ipv4.tcp_keepalive_time = 90
> net.ipv4.tcp_fin_timeout = 30
>
> I think the OP was wrong, and missed something.
>
> I'm no TCP/IP expert, but IME connections go into TIME_WAIT for a period
> pertaining to the above tuneables (X number of probes at Y interval until
> the remote end is declared likely dead and gone), and then go into FIN_WAIT
> and then IIRC FIN_WAIT2 or some other state like that before they are
> finally killed off. Those tunables certainly seem to have actually worked in
> the real world for me, whether they are right "in theory" or not is possibly
> another matter.
>
> Broadly speaking I agree with the other posters who've suggested adding
> other IP addresses and opening up the local port range available.
>
> I'm assuming the talk of 30k connections is because the OP's proxy has a
> 'one in one out' situation going on with connections, and that's why your
> ~65k pool for connections is halved.
>
> K.
>
>



-- 
Ray Patrick Soucy
Network Engineer
University of Maine System

T: 207-561-3526
F: 207-561-3531

MaineREN, Maine's Research and Education Network
www.maineren.net


