My Ubuntu 11.04 server on the internet has been behaving strangely for a few days. It runs some Java web applications perfectly well, then suddenly stops accepting connections. When I try to connect over SSH or HTTP, I get no response until the connection times out. Ping still works perfectly, and nmap also works:
Starting Nmap 5.21 ( http://nmap.org ) at 2011-08-29 10:52 CEST
Nmap scan report for ...
Host is up (0.020s latency).
Not shown: 994 closed ports
PORT     STATE SERVICE
22/tcp   open  ssh
25/tcp   open  smtp
53/tcp   open  domain
443/tcp  open  https
3000/tcp open  ppp
3128/tcp open  squid-http
After a reboot, everything works again for a few hours.
What could this be? And how can I analyse the problem?
This really does look like you are running out of memory, with no swap on the system. If a Linux system runs out of memory, it can no longer accept TCP connections, because establishing a connection requires memory. ICMP might not need any, since there is no state to maintain.
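One quick way to test this theory is to look at how much memory is left and whether any swap is configured. The sketch below reads /proc/meminfo; the function name mem_summary and the optional file argument are only for illustration.

```shell
# Print free memory (including buffers and page cache) and total swap, in MB.
# The optional argument lets you point it at a saved copy of /proc/meminfo.
mem_summary() {
    awk '
        $1 == "MemFree:"   { free  = $2 }
        $1 == "Buffers:"   { buf   = $2 }
        $1 == "Cached:"    { cache = $2 }
        $1 == "SwapTotal:" { swap  = $2 }
        END {
            printf "free_plus_cache_mb=%d swap_total_mb=%d\n",
                   (free + buf + cache) / 1024, swap / 1024
        }
    ' "${1:-/proc/meminfo}"
}
```

Run mem_summary while the server is still healthy and again when it starts to misbehave. A swap total of 0 together with a shrinking first number would fit this explanation. After a reboot, dmesg | grep -i "out of memory" may also show whether the kernel killed processes.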
Check your memory settings everywhere, and make sure you do not allocate more than 70% of the total memory to the JVM (-Xms and -Xmx options).
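As a rough illustration of the 70% rule, the function below derives a heap ceiling from the machine's physical RAM. The name suggest_xmx is made up, and treating the 70% as the -Xmx value is an assumption; the JVM also uses memory outside the heap, so the real footprint will be higher.

```shell
# Print an -Xmx flag equal to 70% of MemTotal, rounded down to whole MB.
# The optional argument lets you point it at a saved copy of /proc/meminfo.
suggest_xmx() {
    awk '$1 == "MemTotal:" { printf "-Xmx%dm\n", int($2 * 70 / 100 / 1024) }' \
        "${1:-/proc/meminfo}"
}
```

For example, java $(suggest_xmx) -jar app.jar would start a (hypothetical) app.jar with that ceiling. If several JVMs run on the box, the 70% has to be shared between them.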
Activate swap if you have not already done so. You can create a basic swap file somewhere on the disk.
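A typical way to set up a swap file, sketched as a small function. The function name, the path /swapfile and the 1 GB size in the usage comment are assumptions; pick whatever fits your disk.

```shell
# Create a zero-filled file of the given size and format it as swap.
# Arguments: target path, size in MB. Needs root for a system path.
make_swapfile() {
    file=$1
    size_mb=$2
    dd if=/dev/zero of="$file" bs=1M count="$size_mb" &&
    chmod 600 "$file" &&   # swap must not be readable by other users
    mkswap "$file"
}

# As root (hypothetical path, 1 GB):
#   make_swapfile /swapfile 1024 && swapon /swapfile
#   swapon -s    # confirm the swap file is active
```

To keep the swap file across reboots, add a line such as "/swapfile none swap sw 0 0" to /etc/fstab.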
If your system hangs again after that, it's time for some low-level monitoring.
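Low-level monitoring can start very simply: record the load average and the number of TCP sockets per state at regular intervals, so that there is something to read after the next hang. This is only a sketch; the function names and the log path in the usage note are made up, and it assumes netstat is installed.

```shell
# Count TCP sockets per state from `netstat -ant` output on stdin.
tcp_states() {
    awk '$1 ~ /^tcp/ { n[$6]++ } END { for (s in n) print s "=" n[s] }' |
        sort | tr '\n' ' '
}

# Print one timestamped line with load average and socket counts (Linux only).
snapshot() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') load=$(cut -d' ' -f1-3 /proc/loadavg) $(netstat -ant | tcp_states)"
}
```

Calling snapshot >> /var/log/conn-snapshot.log from cron every minute leaves a trail that survives the reboot. A steadily growing CLOSE_WAIT or ESTABLISHED count points at the applications, while a flood of SYN_RECV entries would look more like an attack.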
You should look at your Fail2ban service. I've faced the same problem with a hosted Linux box, and it came from the preinstalled fail2ban config file.
Or it could be a DoS, as mailq said.
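To see whether fail2ban is involved, one option is to list the firewall chains it has created and then inspect them. The helper name f2b_chains is made up, and the chain prefixes (fail2ban- on older releases, f2b- on newer ones) are an assumption about how your version names them.

```shell
# Print the names of fail2ban-created chains found in `iptables -L -n`
# output supplied on stdin.
f2b_chains() {
    awk '$1 == "Chain" && $2 ~ /^(fail2ban|f2b)-/ { print $2 }'
}

# As root:
#   iptables -L -n | f2b_chains     # which jails have chains
#   iptables -L <chain> -n          # which addresses are blocked in one chain
#   fail2ban-client status          # which jails are configured
```

If your own address shows up in one of those chains, fail2ban is what is dropping your connections.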
-Xmx (max heap size) is not all of the memory allocated to the JVM; another sizable amount is allocated for the permanent generation (-XX:MaxPermSize), plus some more for internal use. Use top or ps to find out how much your JVM is actually using, and leave room for the OS and buffers (1 GB plus 150k per concurrent connection is a good start).
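Both numbers can be worked out with a couple of small helpers. This is a sketch: the function names are made up, "150k" is read as 150 kB, and only processes whose command name is exactly java are counted.

```shell
# Sum the resident memory (MB) of all java processes.
# Reads `ps -eo rss,comm` output on stdin (RSS in kB, then command name).
java_rss_mb() {
    awk '$2 == "java" { kb += $1 } END { printf "%d\n", kb / 1024 }'
}

# Memory (MB) to leave for the OS and buffers:
# 1 GB plus 150 kB per concurrent connection (argument 1).
os_reserve_mb() {
    echo $(( 1024 + $1 * 150 / 1024 ))
}
```

Run ps -eo rss,comm | java_rss_mb to see the JVM's real footprint, and compare it with total RAM minus os_reserve_mb for your expected connection count. For 1000 concurrent connections the reserve comes to 1170 MB.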