A new (NAT) firewall appliance was recently installed at $WORK. Since then, I've been getting many network timeouts and interruptions, especially for operations that require the server to work for a while without sending a response (svn update, rsync, etc.). Inbound SSH sessions over VPN also time out frequently.
That clearly suggests I need to adjust the TCP (and ssh) keepalive time on the servers in question in order to reduce these errors.
But what is the appropriate value I should use?
Assuming I have machines on both sides of the firewall between which I can make a connection, is there a way to measure what the time limit on TCP connections might be for this firewall?
In theory, I would send a packet with gradually increasing intervals until the connection is lost. Any tools that might help (free or open source would be best, but I'm open to other suggestions)?
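For what it's worth, the probing idea can be sketched in a few lines of Python. This is only a rough illustration, not a finished tool: the function names (run_echo_server, probe_idle_timeout) and the reply_timeout parameter are made up for this example. It assumes an echo server on one side of the firewall and a client on the other that stays silent for progressively longer periods before sending a single byte.

```python
import socket
import threading
import time


def run_echo_server(listener):
    """Accept one connection on an already-listening socket and echo bytes back."""
    conn, _ = listener.accept()
    with conn:
        while True:
            data = conn.recv(1024)
            if not data:
                break
            conn.sendall(data)


def probe_idle_timeout(host, port, intervals, reply_timeout=10.0):
    """Return (longest idle period survived, first idle period that failed).

    Both values are in seconds; either may be None.
    """
    survived = None
    with socket.create_connection((host, port)) as sock:
        sock.settimeout(reply_timeout)
        for idle in intervals:
            time.sleep(idle)  # stay completely silent for this long
            try:
                sock.sendall(b"x")
                if sock.recv(1) != b"x":
                    return survived, idle  # peer closed the connection
            except OSError:
                # Connection reset, or no echo arrived within reply_timeout.
                return survived, idle
            survived = idle
    return survived, None
```

Called with something like probe_idle_timeout(server, port, [60, 120, 180, 240, 300, 360]), the result brackets the timeout between the last interval that worked and the first that did not. Each successful probe resets the firewall's idle timer, so every interval is measured from a fresh start. TCP keepalive has to stay disabled on the probing socket (the default), otherwise the keepalive packets would reset the timer as well.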
The appliance is not under my control, so I can't just get the value, though I am attempting to ask what it currently is and if I can get it increased.
I'm thinking that you just need to connect from one machine to the other while running a packet capture on one of the machines. Make an FTP, HTTP, SSH, etc. session and just let it sit there until it times out.
I'm not sure what you mean when you say "In theory, I would send a packet with gradually increasing intervals until the connection is lost", but I don't think you need to do anything other than make a connection, capture the traffic, and let it sit until it times out. Timeouts occur on idle sessions. If you send data to the other end, it will probably reset the timer, because the session is no longer idle.
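When it does time out, look at the timestamps in the capture, from the first packet (the beginning of the three-way handshake) until the connection is terminated (you may or may not see a RST). If the session exchanged any data after the handshake, measure from the last packet before the idle period instead, since that is when the idle timer restarted.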
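If you export the packet timestamps from the capture (for example as epoch seconds), finding the idle period is simple arithmetic. Here is a small sketch; the function name longest_silence is just a label for this example, and it assumes the timestamps all belong to the one connection under test.

```python
def longest_silence(timestamps):
    """Return (gap_seconds, gap_start) for the longest quiet period in a capture."""
    if len(timestamps) < 2:
        raise ValueError("need at least two packet timestamps")
    ordered = sorted(timestamps)
    # Pair each packet with the one that follows it and keep the widest gap.
    gap, start = max(
        (later - earlier, earlier)
        for earlier, later in zip(ordered, ordered[1:])
    )
    return gap, start
```

If the firewall drops the session silently, the packet that ends the gap is the first one that failed, so the gap is an upper bound on the timeout rather than the exact value.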
Barring any application-layer timeouts (depending on what type of connection you make), this should give you an idea of what the timeout is set to.
Maybe the simplest way to find the correct timeout value is to ask the network administrator what settings are configured on the new NAT appliance?
I tried making an outbound SSH connection, but I had to do more than just let it sit there. Without interaction, the session appears valid indefinitely, but it stops accepting input after a certain idle time.
So I tried running:
It then sat idle for more than 5 minutes. At this point, I pressed a key and got:
I probably should have used likely intervals plus a few seconds, but I'm pretty sure the timeout is between 240 and 300 seconds.
Network admins reported that the timeout is set to 60 minutes, but this is clearly not the case. The remote side sees the connection closed much sooner, while outbound connections just hang on my side. This is very frustrating for outbound connections where the remote side has to think for a bit before responding (svn update, FTP with large remote directories, etc.).
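Given the 240 to 300 second estimate above, any keepalive interval comfortably below 240 seconds should keep the firewall's state entry alive. For programs whose sockets I control, a minimal sketch of enabling TCP keepalive looks like the following. The function name and the default values (120 s idle, 30 s between probes, 4 probes) are my own assumptions, and the three TCP_KEEP* constants are platform-specific (present on Linux), so the code skips any that are missing.

```python
import socket


def enable_keepalive(sock, idle=120, interval=30, count=4):
    """Turn on TCP keepalive and, where supported, tune its timing (seconds)."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # These constants exist only on some platforms; skip any that are absent.
    for name, value in (
        ("TCP_KEEPIDLE", idle),       # idle time before the first probe
        ("TCP_KEEPINTVL", interval),  # time between unanswered probes
        ("TCP_KEEPCNT", count),       # unanswered probes before giving up
    ):
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
    return sock
```

For SSH itself, the equivalent knobs are ServerAliveInterval in the client's ssh_config and ClientAliveInterval in the server's sshd_config.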