I have processes running on Windows and Linux. I need a watchdog that will email me if a process is down for more than N seconds/minutes, and that will also try to restart it after N time, up to N tries. Is there such a thing?
If it is a service in Windows (sounds like it should be), you can use the Recovery tab in the service's properties to restart it and to run a script that emails you.
http://thommck.files.wordpress.com/2011/03/image1.png
For Linux:
Here is a simple bash script to check whether a process is running: http://www.savelono.com/linux/bash-a-simple-script-to-check-if-a-process-is-running.html
Nagios is a solution that will do it for both environments, but it takes a bit of setup.
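As a rough sketch of the same idea in Python (not the linked script), the check below asks `pgrep -x` whether any process with an exact name exists. The function name `is_running` and its `run` parameter are made up for illustration, and this assumes a Linux host where `pgrep` is installed.

```python
import subprocess


def is_running(name, run=subprocess.run):
    """Return True if at least one process named exactly `name` exists.

    Relies on `pgrep -x`, which exits with 0 when something matches
    and with 1 when nothing does. `run` is injectable (an assumption
    of this sketch) so the check can be exercised without pgrep.
    """
    result = run(
        ["pgrep", "-x", name],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0
```

Calling `is_running("sshd")` from cron or from a small loop would give you the basic up/down signal to act on.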
You may be able to use Nagios for this. Nagios will definitely give you the up/down notifications that you are interested in. You can specify how long it waits before notifying you by modifying the recheck and notification intervals. There are also addons you can download that can trigger a script if an application is down. I have not used the scripting portion personally, but I have read about it in a few places.
Example: Process X stops running on Linux. Once Nagios determines that the service has been stopped for Y minutes, it executes a predefined script such as "/sbin/service service_X restart".
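If you would rather roll a small watchdog yourself, the logic described above (wait Y before acting, notify once, retry a limited number of times) could be sketched roughly as follows. This is not how Nagios implements it. The function `watch` and all its parameters are hypothetical. `is_up`, `restart` and `notify` are callables you supply, for example a process check, a call to "/sbin/service service_X restart", and something that sends an email.

```python
import time


def watch(is_up, restart, notify, grace_seconds, max_restarts,
          poll_seconds=5, clock=time.monotonic, sleep=time.sleep,
          max_polls=None):
    """Poll is_up(); after grace_seconds of downtime notify once,
    then attempt at most max_restarts restarts (one per poll).

    max_polls=None means run forever; a number bounds the loop.
    """
    down_since = None   # time at which the process was first seen down
    restarts = 0        # restart attempts during the current outage
    notified = False    # whether a notification was sent for this outage
    polls = 0
    while max_polls is None or polls < max_polls:
        polls += 1
        if is_up():
            # Process is healthy again: reset the outage state.
            down_since, restarts, notified = None, 0, False
        else:
            now = clock()
            if down_since is None:
                down_since = now
            if now - down_since >= grace_seconds:
                if not notified:
                    notify(f"process down for {now - down_since:.0f}s")
                    notified = True
                if restarts < max_restarts:
                    restarts += 1
                    restart()
        sleep(poll_seconds)
```

The clock and sleep functions are passed in only so the loop can be tried out without real waiting. In normal use you would leave them at their defaults.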
While Nagios is good for monitoring processes and sending you notifications, it lacks the ability to execute actions on failure (without addons), and it's a bit complex to set up.
Monit can perform some actions in error situations and it's much easier to set up. So you can configure it to restart processes if they crash or use too many resources.
It doesn't provide a central interface to manage multiple hosts like Nagios does, though. M/Monit does, but it isn't free.
For Linux there is snmpd. It is also available for Windows. At least on Linux, if it is compiled with the "right" extensions, you can define triggered actions for watching processes.