I have a Pacemaker cluster that controls several resources of type ocf:heartbeat:IPaddr2 and one of type ocf:heartbeat:nginx. Since an upgrade to Debian 12 it can no longer start Nginx. What happens is that Pacemaker tries to bring up Nginx on one side, gives up after 40 seconds, then tries on the other side, and again gives up after 40 seconds. During both 40 second intervals Nginx appears to work normally. After these attempts the Nginx resource remains stopped and the error message reads "nginx start on ... could not be executed (Timed Out: Resource agent did not complete within 40s)".
I've gone through various log files but have not found a cause yet. How can I further diagnose and solve this issue? (As a workaround, I can start Nginx outside Pacemaker's control with systemctl start nginx.service. This is suboptimal compared to the cluster's normal operation, but it does show that the Nginx configuration is intact.)
Let me suggest another workaround: have Pacemaker itself run systemctl start nginx.service (or... stop ...). I have used this successfully in production (it wasn't a "workaround" for me).
Pacemaker can natively control services through systemd, using its units, as explained in the documentation.
To use that, instead of an ocf:heartbeat:nginx resource, create a systemd:nginx resource. In the CIB it looks like this:
<primitive id="nginx" class="systemd" type="nginx" />
Note that Pacemaker itself will request the start or stop of the systemd unit, so you need to disable the service's autostart in systemd using systemctl disable nginx.service.
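For what it's worth, the steps above could be scripted roughly as follows. This is only a sketch under a few assumptions that are not part of the answer itself: that the cluster is managed with pcs (with crmsh the equivalent would be a crm configure primitive command), that the new resource keeps the id nginx, and that a 30-second monitor interval is acceptable. The run helper is a hypothetical dry-run wrapper; by default it only prints the commands, and APPLY=1 makes it execute them. If the old ocf:heartbeat:nginx resource uses the same id, it has to be removed first.

```shell
#!/bin/sh
# Sketch: switch Nginx from the OCF agent to Pacemaker's systemd resource class.
# Assumptions (not from the original answer): pcs is the cluster shell,
# the resource id is "nginx", and the monitor interval is illustrative.

# Hypothetical helper: print each command unless APPLY=1 is set.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

# Pacemaker should be the only one starting the unit, so disable autostart.
# Run this on every cluster node.
run systemctl disable nginx.service

# Create the systemd-class resource (equivalent to the <primitive> above).
# Run this once, on any one node.
run pcs resource create nginx systemd:nginx op monitor interval=30s

# Show the cluster status once to check where the resource ended up.
run crm_mon -1
```

Running it without APPLY=1 first lets you check the exact commands before anything touches the cluster.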