Question

kenn

Asked: 2011-05-03 09:36:27 +0800 CST

ntpd crashes without removing pid file

We have occasionally seen time differences on our servers, and confirmed the following:

ntpd crashed without leaving any traceable logs
The ntpd process was dead, but its pid file still existed at /var/run/ntpd.pid
Running /etc/init.d/ntp restart and then ntpq -p solved the problem

At first, ntpq -p returned ntpq: read: Connection refused. I then ran ps aux | grep ntp, which showed no ntp process, while other working hosts showed something like /usr/sbin/ntpd -p /var/run/ntpd.pid -u 101:103 -g. It seems that ntpd had crashed. There was nothing about it in /var/log/messages, but it's possible that the crash happened long enough ago that the relevant part of the log had already been rotated.

So I ran /etc/init.d/ntp restart, which warned about the stale pid:

Stopping NTP server: ntpdstart-stop-daemon: warning: failed to kill 2124: No such process
Starting NTP server: ntpd.

After that, everything was back to normal.

We're on Debian 6 Squeeze, but the problem has been around since Debian 5 Lenny. We installed ntp using aptitude install ntp. The servers are Linode VPSes (i.e. Xen virtualization), so we asked Linode about it, but they said they had not seen anything like this.

Also, though I don't know if it's just a coincidence or not, it seems that it happens only on application servers (Ruby on Rails) so far.

The thing is, since the pid file remains when ntpd crashes, it's pretty hard to detect the crash and restart ntpd with monit or the like. Should I call /etc/init.d/ntp restart every once in a while from cron?

Any experiences, solutions, thoughts?

1 Answer

DerfK · Answer 1 · 2011-05-03T10:19:55+08:00

If you're using monit, their FAQ says that monit checks to make sure that the pid in the pid file is valid in order to detect situations where the program crashes and leaves its pid file behind.
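If you want to run the same kind of check yourself, here is a minimal sketch in Python. It is not monit's actual implementation, just the usual POSIX trick of sending signal 0 to the pid from the file. The function name pid_file_is_stale is made up for illustration, and the default path is the one from the question.

```python
import os


def pid_file_is_stale(pid_file="/var/run/ntpd.pid"):
    """Return True if pid_file exists but names a PID that is no longer running.

    Hypothetical helper: the name and the default path are assumptions.
    """
    try:
        with open(pid_file) as fh:
            pid = int(fh.read().strip())
    except (OSError, ValueError):
        # Missing, unreadable, or garbled pid file: nothing to call stale here.
        return False

    try:
        # Signal 0 delivers nothing; the kernel only checks that the PID exists.
        os.kill(pid, 0)
    except ProcessLookupError:
        # No such process: the daemon died and left its pid file behind.
        return True
    except PermissionError:
        # The process exists but belongs to another user.
        return False
    return False


if __name__ == "__main__":
    if pid_file_is_stale():
        print("stale pid file: ntpd is probably dead")
```

One caveat: this only proves that some process has that pid. If the pid has been reused by an unrelated program, the check will report the daemon as alive, so it is worth pairing with a check that actually talks to ntpd.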

If you're not using monit, then perhaps you can find a monitoring script that communicates with ntpd directly (nagios has several ntp plugins that you might be able to use/reuse)? If you can't communicate with it, then it has probably crashed.
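As a rough idea of what "communicating with ntpd directly" could look like, here is a small Python sketch that sends a bare client-mode NTP packet over UDP and waits for a server-mode reply. It is not one of the Nagios plugins; the function name ntpd_responds and the defaults are assumptions, and it presumes ntpd's restrict rules allow time queries from the host running the check.

```python
import socket


def ntpd_responds(host="127.0.0.1", port=123, timeout=2.0):
    """Return True if something at host:port answers an NTP client request.

    Hypothetical helper: the name and defaults are assumptions.
    """
    # 48-byte NTP header; first byte 0x23 = leap 0, version 4, mode 3 (client).
    request = bytes([0x23]) + bytes(47)
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.settimeout(timeout)
        # connect() on UDP lets the kernel report "connection refused" to recv().
        sock.connect((host, port))
        sock.send(request)
        reply = sock.recv(512)
    except OSError:
        # Covers timeouts and refused connections: no daemon is answering.
        return False
    finally:
        sock.close()
    # A valid reply is at least 48 bytes with mode 4 (server) in the low 3 bits.
    return len(reply) >= 48 and (reply[0] & 0x07) == 4


if __name__ == "__main__":
    if not ntpd_responds():
        print("ntpd is not answering; it has probably crashed")
```

This only tells you that the daemon is alive and answering, not that its clock is synchronized. If the script reports no answer, that would be the point to trigger /etc/init.d/ntp restart.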
