We have a fairly complex systemd service with Type=notify. Very occasionally, the service hangs on startup or restart and enters the failed state after systemd fails to receive an sd_notify call. We would like to restart the service in these cases, since chances are it will start correctly the second time.
However, the systemd.service(5) manual page says:
When the death of the process is a result of systemd operation (e.g. service stop or restart), the service will not be restarted.
Is there a way to overcome this restriction in the systemd configuration? Otherwise we will have to monitor the daemon status and manually restart it every time it gets stuck, or develop some kind of supervisor script, which may introduce more points of failure into the system.
No, this is not possible.
It is a safety measure that keeps the process from getting stuck in an endless restart loop, for example when a manual intervention has left a corrupt config file behind.
There is the Restart= option, but for it to work the process has to have started up correctly once before.
So the better approach would be to find out why your service sometimes hangs on startup or restart, and fix that problem.
If you are unable to do so, you could write a simple shell wrapper with an endless loop to start the service or, which might be better, set up a local instance of a service monitoring program such as Monit.
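If you do go the wrapper route, here is a rough sketch of what such a supervisor could look like. It is written in Python rather than shell and uses a bounded number of attempts instead of an endless loop, so that a genuinely broken service does not get restarted forever. It relies on `systemctl is-failed` exiting with status 0 when the unit is in the failed state. The function names, the attempt limit and the delay are my own assumptions, not anything systemd prescribes.

```python
import subprocess
import time


def is_failed(unit, run=subprocess.run):
    # `systemctl is-failed` exits with status 0 when the unit is in the
    # failed state and with a non-zero status otherwise.
    return run(["systemctl", "is-failed", "--quiet", unit]).returncode == 0


def restart_if_failed(unit, max_attempts=3, delay=5.0,
                      run=subprocess.run, sleep=time.sleep):
    """Restart `unit` for as long as it is failed, at most max_attempts times.

    Returns the number of restarts that were issued. The `run` and `sleep`
    parameters exist only so the logic can be exercised without systemd.
    """
    attempts = 0
    while attempts < max_attempts and is_failed(unit, run):
        # Ask systemd to restart the unit, then wait before checking again.
        run(["systemctl", "restart", unit])
        attempts += 1
        sleep(delay)
    return attempts
```

You would call something like `restart_if_failed("myservice.service")` (a placeholder unit name) from a cron job or a timer unit. Keep in mind that this is exactly the kind of extra moving part you were worried about, so fixing the underlying hang remains preferable.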