I use Solaris SMF to monitor, report, and automatically restart processes after a crash on Solaris systems. Is there anything similar either as portable open source or in the Linux kernel? For those not familiar with SMF, this is the functionality I'm interested in:
The system runs a script to start the service and then keeps track of every process it creates, even if those processes create their own process groups. If they all die, it runs a stop script and then the start script again.
Do a stop/start cycle on command, waiting for all processes to stop before initiating the start.
A service dependency tree with crash handling rules. For example, service "A" must be running before service "B" may start, and if "A" goes down then "B" must be stopped.
Get a list of services that are currently not running due to their start script failing.
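To make the dependency rule concrete, here is a minimal Python sketch of the behaviour I mean. It is only an illustration, not how SMF implements it: the `REQUIRES` map, the service names, and both helper functions are made up for this example, and it assumes Python 3.9+ for `graphlib`.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical dependency map: each service lists the services it requires.
REQUIRES = {"A": [], "B": ["A"], "C": ["B"], "D": []}


def start_order(requires):
    """Return the services ordered so that every dependency comes first."""
    return list(TopologicalSorter(requires).static_order())


def dependents_to_stop(requires, failed):
    """Return every service that directly or transitively requires `failed`."""
    to_stop = []
    frontier = [failed]
    while frontier:
        current = frontier.pop()
        for service, deps in requires.items():
            if current in deps and service not in to_stop:
                to_stop.append(service)
                frontier.append(service)
    return to_stop


if __name__ == "__main__":
    print("start order:", start_order(REQUIRES))
    print("stop if A fails:", dependents_to_stop(REQUIRES, "A"))
```

With this map, "A" always starts before "B" and "B" before "C", and a failure of "A" means both "B" and "C" have to be stopped, while "D" is unaffected.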
You are looking for Monit. There are others too, but I've only used Monit. Good stuff.
Fedora 15 will come with systemd. Its author even mentions SMF heavily.