Our server is overloaded with TCP/IP sessions: we have 1200 to 1500 of them, and most are hanging in the TIME_OUT state. It turns out that a connection in the TIME_OUT state occupies a socket until a 60-second timeout has elapsed.
The problem is that the server gets unresponsive and many clients are not getting served.
I ran a simple test: downloading an XML file from the server with Internet Explorer 8.0. The download finishes in a fraction of a second, but then I see that the TCP/IP connection hangs in the TIME_OUT state for 60 seconds.
Is there any way to get rid of the TIME_OUT wait, or to shorten it, so that the socket is freed for new connections?
I understand why a TCP/IP connection enters the TIME_OUT state, but I don't understand why Internet Explorer does not close the connection once the XML file download is over.
The details.
Our server runs a web service written in Perl (mod_perl). The service provides weather data to clients. The client is a Flash application (actually a Flash ActiveX control embedded in a Windows application).
OS: Ubuntu
Apache "Keep Alive" option is set to 0
This is a setting in your TCP stack. Since we don't know what platform you're on, we can't say exactly what it's called or how to change it.
UPDATE
So you're using Ubuntu. You can use sysctl to reduce the net.inet.tcp.msl value to half the desired TIME_WAIT duration (in milliseconds; see man -S 4 tcp), e.g. sysctl net.inet.tcp.msl=2500. Beware of the implications of doing so with respect to wandering packets that may arrive after the TIME_WAIT period elapses.
I assume you mean TIME_WAIT. The peer that initiates the active close is the one that enters TIME_WAIT (see state transition diagram here), so if you can have your client close the connection then you'll move the TIME_WAIT off to the client. See this answer for more details and a link to a good article about TIME_WAIT problems and how to solve them.
Another alternative, if you can't have the client issue the active close, is to reset the connection by enabling linger with a zero timeout before closing it. This causes an RST to be sent rather than FIN.
The server being unresponsive likely has nothing to do with the number of connections in TIME_WAIT state. It's not clear what you mean by "occupies a socket"; the server should long since have closed the socket at that point. The system should be able to handle tens of thousands of connections in TIME_WAIT state.
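To make the reset-on-close alternative mentioned above more concrete, here is a minimal Python sketch. It assumes a Unix-like system where struct linger is two C ints; the helper names set_abortive_close, close_with_reset and demo are hypothetical and exist only for illustration. Turning SO_LINGER on with a zero timeout makes close() send an RST instead of a FIN, so the closing side skips TIME_WAIT.

```python
import socket
import struct


def set_abortive_close(sock: socket.socket) -> None:
    """Turn SO_LINGER on with a zero timeout (l_onoff=1, l_linger=0)."""
    # Assumption: struct linger is laid out as two C ints (Linux, BSD, macOS).
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))


def close_with_reset(sock: socket.socket) -> None:
    """Close abortively: the peer receives an RST rather than a FIN."""
    set_abortive_close(sock)
    sock.close()


def demo() -> str:
    """Loopback demo: the server side resets, the client observes the result."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    server_side, _ = listener.accept()

    close_with_reset(server_side)

    client.settimeout(5)
    try:
        data = client.recv(1)
        # An empty read would mean the peer closed gracefully with a FIN.
        return "graceful close" if data == b"" else "data"
    except ConnectionResetError:
        return "reset by peer"
    finally:
        client.close()
        listener.close()


if __name__ == "__main__":
    print(demo())
```

Note that an abortive close discards any data still waiting in the send buffer, so it is only appropriate once the response has been fully delivered or when losing it is acceptable.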