I know there are quite a few SE questions on this, and I believe I read all of the relevant ones before coming to this point.
By "server-side `TIME_WAIT`" I mean the state of a server-side socket pair that had its close() initiated on the server side.
I often see these statements that sound contradictory to me:
- Server-side `TIME_WAIT` is harmless
- You should design your network apps to have clients initiate close(), so that the client bears the `TIME_WAIT`
The reason I find this contradictory is that `TIME_WAIT` on the client can be a problem -- the client can run out of available ports -- so in essence the above recommends moving the burden of `TIME_WAIT` from the server side, where it's not a problem, to the client side, where it can be one.
Client-side `TIME_WAIT` is of course only a problem for a limited number of use cases. Most client-server solutions involve one server and many clients; clients usually don't deal with a high enough volume of connections for it to be a problem; and even if they do, there are a number of recommendations to combat client-side `TIME_WAIT` "sanely" (as opposed to `SO_LINGER` with a 0 timeout, or meddling with the tcp_tw sysctls) by avoiding creating too many connections too quickly. But that's not always feasible, for example for classes of applications like:
- monitoring systems
- load generators
- proxies
On the other hand, I don't even understand how server-side `TIME_WAIT` is helpful at all. The reason `TIME_WAIT` exists in the first place is that it prevents stale TCP segments from being injected into streams they no longer belong to. For client-side `TIME_WAIT`, this is accomplished by simply making it impossible to create a connection with the same ip:port pair that the stale connection could have had (the used pairs are locked out by `TIME_WAIT`). But for the server side this can't be prevented, since the local address will always be the same -- the accepting port -- and the server can't (AFAIK; I only have the empirical proof) deny a connection simply because an incoming peer would create the same address pair that already exists in the socket table.
I did write a program that shows that server-side TIME-WAIT sockets are ignored. Moreover, because the test was done on 127.0.0.1, the kernel must keep a special bit that even tells it whether a socket is a server-side or a client-side one (since otherwise the tuple would be the same).
Source: http://pastebin.com/5PWjkjEf, tested on Fedora 22, default net config.
```
$ gcc -o rtest rtest.c -lpthread
$ ./rtest 44400 s # will do server-side close
Will initiate server close
... iterates ~20 times successfully
^C
$ ss -a|grep 44400
tcp    TIME-WAIT  0  0  127.0.0.1:44400  127.0.0.1:44401
$ ./rtest 44500 c # will do client-side close
Will initiate client close
... runs once and then
connecting...
connect: Cannot assign requested address
```
So, for server-side `TIME_WAIT`, connections on the exact same port pair could be re-established immediately and successfully, while for client-side `TIME_WAIT`, on the second iteration connect() rightly failed.
To summarize, the question is twofold:

- Does server-side `TIME_WAIT` really not do anything, and is it just left that way because the RFC requires it?
- Is the recommendation for the client to initiate close() because server-side `TIME_WAIT` is useless?
In TCP terms server side here means the host that has the socket in LISTEN state.
RFC 1122 allows a socket in TIME-WAIT state to accept a new connection under certain conditions (chiefly, that the new SYN's initial sequence number is larger than the highest sequence number used by the previous incarnation of the connection). For the exact details, please see RFC 1122, section 4.2.2.13. I'd also expect there must be a matching passive OPEN on the socket (a socket in LISTEN state).
An active OPEN (the client-side connect call) has no such exception and must give an error when the socket is in TIME-WAIT, as per RFC 793.
My guess about the recommendation for client-initiated close (the client, in TCP terms, being the host performing the active OPEN, i.e. connect) is much the same as yours: in the common case it spreads the TIME-WAIT sockets across many hosts, where there is an abundance of resources for them. Also, in the common case, clients do not send SYNs that would reuse TIME-WAIT sockets on the server. I agree that whether to apply such a recommendation still depends on the use case.
This is probably the clearest example of what TIME-WAIT actually does and, more importantly, why it matters. It also explains why to avoid some of the 'expert' tips for 'reducing' TIME-WAITs on Linux machines.
A TCP session is identified by the tuple (sourceIP, sourcePort, destIP, destPort). Hence TIME_WAIT does work on every TCP connection.
Regarding the closing side: in some scenarios, closing from the client side can reduce the number of TIME_WAIT sockets on the server, thus slightly reducing memory use. In cases where the socket space can be exhausted (due to ephemeral port depletion), e.g. greedy clients making many connections to the same server, the problem has to be solved on whichever side it occurs.
With an unreliable network you can never be sure that you have received the last message from your peer, so it is dangerous to assume that the peer has hung up the phone rather suddenly. It is a major disadvantage of the TCP protocol that only about 65,000 ports can be open simultaneously toward a given destination. But the way to overcome this is to move to a server farm, which scales better with load than recycling port numbers quickly does. At the client end, you are highly unlikely to run out of ports on a basic workstation.