I just started to get a Nagios warning from our build server, stating that the number of processes has exceeded the limit. Looking at our Munin graphs, I can see that the number of processes has increased steadily from 280 in December to the current value of 430.
I'm wondering how I can go about identifying the causes of the increased number of processes, so that I can restart services or adjust their configuration as necessary.
Server details: CentOS 5.1, the main things running are our Hudson build server which runs under Tomcat, and an Apache httpd server which is mainly just a proxy for Hudson. I've tried restarting httpd and Tomcat, but the number of processes stayed the same. "top" says that only one of the processes is active; the rest are sleeping.
Run this at regular intervals to see how the count for each named process goes up and down over time. It ignores the PID and looks only at the end of each line of the process listing, after the CPU time column.
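Something along these lines would do it. This is a minimal sketch rather than a definitive command: it assumes the procps `ps` that ships with RHEL/CentOS, and `count_procs` is just a helper name made up for this example.

```shell
#!/bin/sh
# count_procs (hypothetical helper name): reads one command line per input
# line on stdin and prints "<count> <command>", most frequent first.
count_procs() {
    sort | uniq -c | sort -rn
}

# "args=" prints only the command line of each process, with no header,
# so PID, TTY and CPU time never enter the comparison.
ps -eo args= | count_procs | head -n 20
```

Whatever sits at the top of the list with an unexpectedly high count is the first candidate to look at. Swapping `args=` for `comm=` groups by executable name only, which helps when the same program runs with many different arguments.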
This works on a RHEL box, so it should behave the same on CentOS. Once you have a baseline of what the starting process list looks like, you might put it in cron.
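For the baseline-plus-cron idea, here is one possible sketch. The function names, the snapshot directory and the file naming are all assumptions made for illustration, and it again assumes procps `ps`. The first run saves a baseline; later runs print a `diff` against it.

```shell
#!/bin/sh
# summarize (hypothetical name): one command name per line on stdin,
# prints "<count> <name>" sorted by name so that snapshots diff cleanly.
summarize() {
    sort | uniq -c | sed 's/^ *//'
}

# check_against_baseline DIR (hypothetical name): stores a timestamped
# snapshot of stdin in DIR. The first snapshot becomes baseline.txt;
# later snapshots are diffed against it.
check_against_baseline() {
    dir=$1
    mkdir -p "$dir" || return 1
    current="$dir/$(date +%Y%m%d-%H%M%S).txt"
    summarize > "$current"
    if [ ! -f "$dir/baseline.txt" ]; then
        cp "$current" "$dir/baseline.txt"
    else
        diff "$dir/baseline.txt" "$current"
    fi
}

# Only runs when a snapshot directory is passed as the first argument.
if [ -n "${1:-}" ]; then
    ps -eo comm= | check_against_baseline "$1"
fi
```

Saved as a script and called from cron with a directory argument (for example `/var/tmp/proc-counts`, a made-up path), any output is the set of process names whose counts have changed since the baseline, and cron will mail it to you.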