This probably has a simple explanation, but I certainly can't think of it.
I've got corosync installed (via yum), with its default init script. Something is strange on this particular CentOS installation as I often need to manually link /etc/rc.d/init.d/ to /etc/init.d.
The issue is that it fails when run via its symbolic link, yet it runs fine through /etc/rc.d/init.d.
What's even weirder is that it fails when run using the full path, and only works when actually run from inside the /etc/rc.d/init.d directory.
Example:
[~]# /etc/rc.d/init.d/corosync start
Starting Corosync Cluster Engine (corosync): [FAILED]
[~]# service corosync status
corosync is stopped
[~]# cd /etc/rc.d/init.d/
[init.d]# /etc/rc.d/init.d/corosync start
Starting Corosync Cluster Engine (corosync): [FAILED]
[init.d]# corosync start
[init.d]# service corosync status
corosync (pid 1985) is running...
Any explanation?
Edit:
I'm not sure exactly what I've changed, but it now works when started from /rc.d/init.d, though not with service corosync start.
[root@server2 mirror]# /etc/rc.d/init.d/corosync start
Starting Corosync Cluster Engine (corosync): [ OK ]
[root@server2 mirror]# /etc/init.d/corosync start
Starting Corosync Cluster Engine (corosync): [FAILED]
[root@server2 mirror]# service corosync start
Starting Corosync Cluster Engine (corosync): [FAILED]
Edit 2:
I made a symbolic link from /etc/rc.d/init.d to /etc/init.d, and now it works when run via service corosync start, yet it doesn't start on boot. Argh.
Edit 3:
It's working with every command except on boot.
I've changed the start priority to 99 (S99), which it still fails on, and I've changed the path inside the script to the absolute path: /usr/sbin/corosync
I've also done diffs of the environment variables:
On service corosync start:
_=/bin/env
LANG=en_US.UTF-8
PATH=/sbin:/usr/sbin:/bin:/usr/bin
PWD=/
SHLVL=1
TERM=xterm
On boot:
_=/bin/env
LANG=en_US.UTF-8
PATH=/sbin:/usr/sbin:/bin:/usr/bin
PWD=/
SHLVL=2
TERM=linux
CONSOLETYPE=vt
LANGSH_SOURCED=1
previous=N
PREVLEVEL=N
runlevel=3
RUNLEVEL=3
UPSTART_EVENTS=runlevel
UPSTART_INSTANCE=
UPSTART_JOB=rc
Boot log:
Starting Corosync Cluster Engine (corosync): [FAILED]
So now the script works when the system is already up but not during boot.
Is there perhaps a third version of the corosync script? Is the version in /etc/rc2.d/ linked to the one in /etc/init.d/ or is it different?
Incidentally, we have got this far without even considering your cluster setup. If this is part of a cluster, there may be clues in the log files of the other nodes.
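One rough way to check whether the rc2.d entry points at the same script is to list every start/kill link for the service and print the file each one finally resolves to. This is only a sketch: list_rc_links is a made-up helper name, it assumes the usual SysV layout (rc?.d directories holding S##name and K##name links), and it relies on GNU readlink -f.

```shell
# list_rc_links ROOT NAME (hypothetical helper)
# Print every rc?.d start/kill link for NAME under ROOT, together with
# the file the link resolves to once all symlinks have been followed.
list_rc_links() {
    root=$1
    name=$2
    for link in "$root"/rc?.d/[SK][0-9][0-9]"$name"; do
        # Skip the literal pattern (no match) and dangling links.
        [ -e "$link" ] || continue
        printf '%s -> %s\n' "$link" "$(readlink -f -- "$link")"
    done
}

# On CentOS /etc/rc2.d is normally itself a link into /etc/rc.d.
list_rc_links /etc corosync
```

If every line ends in the same path, there is only one script in play. A line ending somewhere else would point to a second copy.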
Try to debug the init script with -x (for example, bash -x /etc/init.d/corosync start). Also try to use service corosync start|stop|status.
If corosync start works in any directory but /etc/rc.d/init.d/corosync start fails, then you are probably running two different scripts. Run: which corosync
If not, check inside the corosync script for relative paths that should be absolute paths.
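A minimal sketch of the -x suggestion, in case it helps: run the script under bash -x and keep the trace in a file, so a failing run can be read afterwards. The helper name trace_script, the TRACE_LOG variable and the default log path are all made up for this example.

```shell
# trace_script SCRIPT [ARGS...] (hypothetical helper)
# Run SCRIPT under "bash -x", save stdout and the trace to a log file,
# and return the script's own exit status.
trace_script() {
    script=$1
    shift
    log=${TRACE_LOG:-/tmp/init-trace.$$}   # assumed default location
    status=0
    bash -x "$script" "$@" >"$log" 2>&1 || status=$?
    echo "exit status $status, trace in $log"
    return "$status"
}

# Usage, with the paths from the question:
#   trace_script /etc/rc.d/init.d/corosync start
#   trace_script /etc/init.d/corosync start
# In bash, "type -a corosync" lists every corosync the shell could run,
# in lookup order, which is a bit more thorough than "which".
```

The last traced command before the failure is usually the interesting one. Typing plain corosync start at the prompt most likely runs whatever is found on PATH rather than the init script, which is why it is worth checking both.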
So the problem now is that /etc/init.d/corosync start works with bash -x but not without, and not on boot. Is that right?
Does bash /etc/init.d/corosync start (without the -x) work?
Probably there is an environment variable set in your profile which is not there when the system runs the script during the boot process. Add a line like
env | sort > /tmp/env.$$
to /etc/init.d/corosync and then diff the resulting files.
One other possibility is a hidden dependence on another service which starts later in the boot process. Try changing to
S99...
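The environment comparison could look roughly like this. dump_env and env_only_in are names invented for the sketch. It assumes values contain no embedded newlines, since comm compares line by line.

```shell
# dump_env FILE (hypothetical helper)
# Write the current environment, sorted, to FILE.
dump_env() {
    env | sort >"$1"
}

# env_only_in A B (hypothetical helper)
# Print the NAME=value lines that appear in dump A but not in dump B.
# Both files must be sorted, which dump_env takes care of.
env_only_in() {
    comm -23 "$1" "$2"
}

# Usage: call "dump_env /tmp/env.$$" near the top of the init script,
# start the service once by hand and once by rebooting, then compare
# the two resulting files (the names below are placeholders):
#   env_only_in /tmp/env.BOOT /tmp/env.MANUAL
#   env_only_in /tmp/env.MANUAL /tmp/env.BOOT
```

Running it in both directions shows what is set only at boot and what is set only in the interactive case.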
I had the same issue...
check:
# getenforce
If it returns "Enforcing", then you have to disable SELinux in the file /etc/selinux/config
and dynamically:
# setenforce 0
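Before turning SELinux off for good, it may be worth confirming that it is really what blocks the daemon. A rough sketch, assuming auditd is running and writes to the usual /var/log/audit/audit.log. avc_denials is a made-up helper name.

```shell
# avc_denials LOG NAME (hypothetical helper)
# Print the AVC denial records in LOG whose comm field is NAME.
avc_denials() {
    grep 'avc:' "$1" | grep 'denied' | grep "comm=\"$2\""
}

# Usage (as root, right after a failed boot-time start):
#   avc_denials /var/log/audit/audit.log corosync
```

If that prints nothing, SELinux is probably not the culprit. Note also that setenforce 0 only lasts until the next reboot, so the change in /etc/selinux/config is what matters for the boot-time failure.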