I have a problem with Apache 2.2 on my Ubuntu 6.06 LTS server: some old Rails sites are producing segfaults and all sorts of madness, which seems to eventually drag Apache down. I am migrating them to an 8.04 installation with nginx and Passenger, where the bug has been squashed, but that takes time. Until then, I have tried to set up monit to rescue Apache whenever it stops responding:
if failed host www.site.com port 80 protocol http
and request "/" with timeout 5 seconds for 2 cycles
then restart
About 50% of the time, that restarts Apache successfully and saves the day. The other 50% of the time, Apache dies and monit does nothing. When I check monit status, it shows -1 for the response time here:
port response time 0.061s to www.site.com:80/ [HTTP via TCP]
In the failing case, the 0.061s shown above is replaced by -1. I can't find any documentation explaining the -1, or why it seems to slip past the "if failed" statement.
Is there anything I can do to make sure monit catches 100% of failures? Or can anyone shed light on the -1 and how to deal with it?
What happens if you reduce the number of cycles required for a failure? Possibly your site is flapping, and you never get two consecutive failures.
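As a rough sketch of that change, the test from the question could be written without the "for 2 cycles" clause, so that a single failed check triggers the restart. The surrounding "check process" block, the pidfile path, and the init script paths are assumptions for a stock Ubuntu apache2 install and are not taken from the question, so adjust them to match your setup.

```
# Sketch only: the pidfile and init script paths are assumptions
# for a stock Ubuntu apache2 package; adjust them to your system.
check process apache with pidfile /var/run/apache2.pid
    start program = "/etc/init.d/apache2 start"
    stop program = "/etc/init.d/apache2 stop"
    # Same test as in the question, minus "for 2 cycles",
    # so one failed check is enough to trigger the restart.
    if failed host www.site.com port 80 protocol http
        and request "/" with timeout 5 seconds
        then restart
```

If this catches the failures that were slipping through, flapping is the likely cause. The trade-off is that a single slow response will now also restart Apache, so you may want to raise the timeout a little.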