Ping a Specific Port

Question

Josh Nankin

Asked: 2011-08-23 08:19:06 +0800 CST

monitoring nfs with monit

772

I'd like to monitor NFS mounts and the NFS server process using Monit.

On the server, I'd need a PID file, but I can't find a way to have one created using the existing configuration files. Is there a way to do this, or has anyone monitored the server in a different way (checking whether port 2049 is active, etc.)?

On clients, I was thinking of making Monit simply look for a specific file in an NFS mount, and if it's accessible, all is well. Problem is, if the NFS server does go down, file requests usually hang (perhaps even indefinitely, not sure). How would one get around this issue with monit?

Any configuration examples would be greatly appreciated!

6 Answers

Christian · Answer 1 · 2012-03-14T01:06:09+08:00

As for the "hanging" of the Monit process during NFS server faults, this can be circumvented by two methods.

Change the NFS mount option from hard to soft, which causes the NFS layer to return an I/O error to the accessing application after retrans retries. This can introduce data integrity problems: your writing applications must be able to cope with I/O errors, or at least exit cleanly without corrupting the file being written.
Decouple the check from Monit. Define a cron job that regularly checks a file on the NFS mount and then writes a separate "NFS state file", e.g. to /tmp. That way only the cron job hangs (and not your Monit client) if the NFS server goes away. Your Monit check then only tests this second-stage state file and whether it is much older than the cron job's interval, which would indicate that NFS is hanging.
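
The second method can be sketched roughly as follows. This is only an illustration, not a tested setup: the function names, file paths and age threshold are made up for the example, and `stat -c %Y` assumes GNU coreutils (Linux).

```shell
#!/bin/bash
# Sketch of the two-stage check. All names and paths here are hypothetical.

# Stage 1 (run from cron): refresh the state file only if the probe file on
# the NFS mount is readable. On a hard mount with a dead server this call
# blocks, so the state file simply stops being refreshed.
update_nfs_state() {
    local probe_file="$1" state_file="$2"
    if [ -r "$probe_file" ]; then
        touch "$state_file"
    fi
}

# Stage 2 (run by Monit): succeed only if the state file exists and was
# modified no more than max_age seconds ago.
nfs_state_is_fresh() {
    local state_file="$1" max_age="$2"
    local now mtime
    [ -e "$state_file" ] || return 1
    now=$(date +%s)
    mtime=$(stat -c %Y "$state_file") || return 1
    [ $(( now - mtime )) -le "$max_age" ]
}
```

A cron entry would call `update_nfs_state` every minute or so, and a small wrapper script run by Monit would call `nfs_state_is_fresh` with a limit of a few cron intervals. Monit's own file timestamp test may be able to replace stage 2, depending on your Monit version.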

Hope this helps!

2

claasz · Answer 2 · 2012-06-07T01:15:01+08:00

The general approach would be (assuming none of the Monit built-in rules are applicable)

Find out how you would do the checks manually
Write shell scripts performing these checks, returning 0 for 'success' and 1 for 'failure'

Let Monit test those scripts (example is from official documentation):

check program myscript with path "/usr/local/bin/myscript.sh"
   if status != 0 then alert

For the specific problem, this could mean

Server: It probably depends on your OS, Linux distro, NFS version (3 or 4), etc., but it should be easy to figure out. E.g. on Ubuntu 12.04, I would test whether the NFS server is running via
```
$ service portmap status
$ service nfs-kernel-server status
```
Create a shell script returning 0 if both commands return 'running'.
Client: To check whether a certain NFS share is currently mounted, I mostly use df -h. So the corresponding shell script would look like
```
#! /bin/bash
df -h | grep -q thesharename
```
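
For the server side, a script that returns 0 only if both commands report 'running' might look like the sketch below. The function name is made up, and the matched strings are an assumption: status output differs between init systems and distros, so check what your own `service ... status` prints before relying on it.

```shell
#!/bin/bash
# Sketch: succeed only if every named service reports "running".
# The output patterns are assumptions; adjust them to your system.
all_services_running() {
    local svc out
    for svc in "$@"; do
        # Only the text is inspected, because exit codes of "status"
        # are not consistent across init systems.
        out=$(service "$svc" status 2>/dev/null) || true
        case "$out" in
            *"not running"*) return 1 ;;  # e.g. "nfsd not running"
            *running*) ;;                 # reported as running
            *) return 1 ;;                # anything else counts as failure
        esac
    done
    return 0
}
```

Saved as a script ending in `all_services_running portmap nfs-kernel-server`, it can be used with the same kind of check program rule as shown above.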

2

devicenull · Answer 3 · 2011-10-09T15:38:29+08:00

Did you check the init scripts for nfs already? I'd suspect that they are creating a pid file and sticking it somewhere for future restart or stop operations. If not, it should be pretty simple to modify them to do so.

As far as checking the mount goes, take a look at section 4.3.1 at http://nfs.sourceforge.net/nfs-howto/ar01s04.html#mounting_remote_dirs . If you mount it with the 'soft' option you will get behavior that lets you monitor it, but this should not be used for the actual mount. Perhaps you want a second mount just for monitoring?

1

Alarig · Answer 4 · 2020-04-11T01:26:49+08:00

I’m directly using the df test without a specific script:

check program nfs-var with path "/bin/df -t nfs4 /var"
        if status != 0 then alert
        if status = 1 then exec "/bin/mount /var"
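
One caveat: df has to query the filesystem, so it can block on a hard mount whose server is unreachable. If you only need to know whether the share is mounted at all, a sketch that reads the kernel mount table instead could look like this. The function name is made up, and /proc/mounts is assumed (Linux). Note that it says nothing about whether the server still responds.

```shell
#!/bin/bash
# Sketch: succeed if mount_point is listed as an nfs/nfs4 mount.
# The second argument exists mainly so the function can be tried out
# against a sample file; it defaults to /proc/mounts.
is_nfs_mounted() {
    local mount_point="$1" mounts_file="${2:-/proc/mounts}"
    # Fields in /proc/mounts: device, mount point, fs type, options, ...
    awk -v mp="$mount_point" \
        '$2 == mp && $3 ~ /^nfs/ { found = 1 } END { exit !found }' \
        "$mounts_file"
}
```

A wrapper script ending in `is_nfs_mounted /var` could then be called from a check program rule like the one above.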

1

Jacques · Answer 5 · 2018-01-12T15:48:54+08:00

I wanted to reply to claasz, but I do not have enough reputation points. The idea of using an external script is very good because it provides flexibility, and the suggestion to use portmap or rpcinfo to check NFS server availability is quite smart.

I have found a script on GitHub from Thibaut Madelaine that should be interesting to many who face the same problem. He uses rpcinfo like this: rpcinfo -u 123.456.789.12 nfs 3, where 123.456.789.12 is the IP address of your NFS server.

If all is good, the response will instantly be something like program 100003 version 3 ready and waiting, and if it failed, 123.456.789.12: RPC: Program not registered. Of course, the response may vary depending on your system flavour.
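
This is not the script mentioned above, just a minimal sketch of how the rpcinfo call could be wrapped for Monit. The function name and the 10 second default are made up, `timeout` assumes GNU coreutils, and the matched text is the success message quoted above, which may differ on your system.

```shell
#!/bin/bash
# Sketch: succeed if the NFS program on the given host answers over UDP.
# The timeout wrapper keeps the check from blocking if the host is down.
nfs_server_ready() {
    local host="$1" version="${2:-3}" limit="${3:-10}"
    timeout "$limit" rpcinfo -u "$host" nfs "$version" 2>/dev/null \
        | grep -q 'ready and waiting'
}
```

A script ending in `nfs_server_ready <your-server-ip>` returns 0 on success and non-zero otherwise, so it fits a check program rule with `if status != 0 then alert`.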

0

Firze · Answer 6 · 2019-05-23T22:02:09+08:00

Create a script called test-mount.sh to test the mount. I create and delete a file because I find that just reading a file is unreliable.

#!/bin/bash
set -e
/bin/touch /my-mounted-dir/test/mount.test
/bin/rm /my-mounted-dir/test/mount.test
exit 0

set -e tells the script to stop execution and return error if any command fails.
Use touch to create a file.
Remove the file.
exit 0 will tell monit script succeeded.

Create the test in the Monit config. This runs test-mount.sh and, if it fails, runs remount-data.sh. You can replace the latter with anything you want to do when the mount check fails.

    check program test-mount with path /root/test-mount.sh timeout 5 seconds 
      if status != 0 then exec "/root/remount-data.sh"

0
