I want to check whether a process is running and start it if it is not. My script below is buggy and always says that the process is running. What is wrong?
$ ./check_n_run thisisnotrunning
./check_n_run: thisisnotrunning is already running
Here is the script:
$ cat check_n_run
#!/bin/sh
USAGE="usage: $0 processname"
if [ $# -ne 1 ] ; then
echo "$USAGE" >&2
exit 1
fi
ps ax | grep -v grep | grep $1> /dev/null
if [ $? -eq 1 ]
then
echo "$1 not running"
# start here
else
echo "$0: $1 is already running" >&2
fi
exit 0
The problem in your script is that (with the shell you're using) in a pipeline, each command runs in a separate subshell, and none of their statuses is propagated to the parent process. So after command1 | command2, $? is always 0.
Even if you fixed it, your script is highly unreliable: it will match any process whose name contains your process name as a substring. Linux provides the pidof command, which does exactly what you're trying to do.
However, this is still not ideal, because there could be another process with the same name. It would be better to use a proper service supervisor, such as Debian/Ubuntu's start-stop-daemon, or an upstart service.
lockfile (from procmail), mentioned by Wrikken, is also a possibility.
As long as ps / grep doesn't error out, $? would be 0.
edit: according to Gilles (I have neither shell available a.t.m.):
I usually tend to use lockfile in the startup of those processes...
Although your script could be altered to:
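One possible alteration, as a rough sketch rather than the answerer's original code: it swaps the ps/grep pipeline for the pidof command mentioned earlier and tests its exit status directly in the if. The function name check_and_report is made up for illustration, and pidof is assumed to be installed (it is Linux-specific).

```sh
#!/bin/sh
# check_and_report NAME (hypothetical helper): report whether a process
# named NAME is running. Returns 0 if it is running, 1 otherwise.
# Assumption: pidof exists; if it is missing, the call fails and the
# process is reported as not running.
check_and_report() {
    if pidof "$1" >/dev/null 2>&1; then
        echo "$1 is already running" >&2
        return 0
    fi
    echo "$1 not running"
    # start the process here
    return 1
}
```

Called as check_and_report "$1" from the script, this matches on the executable name, so the checker's own command line is not picked up. If the program being checked is itself a script, pidof -x may be needed on systems whose pidof supports it.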
Problem: if your system is anything like mine, the output of ps ax includes not only the process name, but the entire command line used to run it. So when you run ./check_n_run thisisnotrunning, the output of ps ax will literally include that line, so grep will always find a result for thisisnotrunning. That explains why your script always reports that the program is running.
To get around that, there are a few options. As Gilles mentioned, the best one is to use start-stop-daemon to start and stop your script. If that's not a possibility, you can use pidof to detect whether the given executable is running. And if for some reason that were not available, you could use a command that prints only processes with the given command name.
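To make the self-match concrete, here is a small sketch that runs the same grep against canned listings instead of live ps output. The PIDs, the listing contents, and the function names matches_cmdline and matches_name are all made up for illustration; the only point is the difference between a substring match on full command lines and a whole-line match on command names.

```sh
#!/bin/sh
# Canned listings standing in for real ps output (all values are made up).
# full_listing mimics `ps ax`: the last column is the whole command line.
full_listing='    1 ?        Ss     0:01 /sbin/init
 2042 pts/0    S+     0:00 /bin/sh ./check_n_run thisisnotrunning
 2043 pts/0    R+     0:00 ps ax'
# names_only mimics a listing that shows command names without arguments.
names_only='init
check_n_run
ps'

# Substring match on full command lines: also finds the checker's own
# invocation, because its argument list contains the name.
matches_cmdline() {
    printf '%s\n' "$full_listing" | grep -v grep | grep -- "$1" >/dev/null
}

# Whole-line match (-x) on command names only: arguments are never searched.
matches_name() {
    printf '%s\n' "$names_only" | grep -x -- "$1" >/dev/null
}

matches_cmdline thisisnotrunning && echo "command-line match: reported as running"
matches_name thisisnotrunning || echo "name match: not running"
```

On a live system, a names-only listing could probably be obtained with something like ps -e -o comm=, though the exact columns and name truncation depend on the local ps, so treat that as an assumption to verify.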
I think you tried the hard way. I don't know what distribution you use, but take a look at the path /var/run. It's a directory where many daemons keep a file containing their PID.
Just try ls *something* there. If something is returned, then your process is most likely running. Otherwise it's probably not.
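A PID file can be left behind after a crash, so it may help to confirm that the PID it holds still belongs to a live process. The sketch below is one way to do that; the function name pidfile_alive is made up, and it assumes the file contains a single numeric PID, which is common but not guaranteed.

```sh
#!/bin/sh
# pidfile_alive FILE (hypothetical helper): succeed if FILE holds the PID
# of a process that currently exists and that we are allowed to signal.
pidfile_alive() {
    [ -r "$1" ] || return 1            # no readable PID file
    pid=$(cat "$1") || return 1
    case "$pid" in
        ''|0|*[!0-9]*) return 1 ;;     # empty, zero, or not a plain number
    esac
    # Signal 0 sends nothing; it only checks that the process can be signalled.
    # A process owned by another user may fail this check even though it exists.
    kill -0 "$pid" 2>/dev/null
}
```

For example, pidfile_alive /var/run/something.pid && echo "running", where something.pid stands for whatever file your daemon writes.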