On a box recently upgraded from SLES 9.3 to 10.2, I'm seeing the following issue:
Prior to the upgrade, an NFS mount (defined via yast, i.e., it appeared in /etc/fstab) worked correctly. Following the upgrade, however, it is failing. A network trace shows that it is making the initial connection to the NFS server over TCP (for the portmapper RPC), but then it switches to UDP for the subsequent MOUNT call; since the NFS server doesn't allow UDP (with good reason, due to the possible data-corruption issues described in nfs(5)), the connection will not go through.
Adding the TCP option (whether in fstab, or at the command line, etc.) has no effect.
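For example, forms along these lines (placeholder export and mount point; the tcp option is the one described in nfs(5)):

mount -o tcp x.x.x.x:/volume /localdir

or, in /etc/fstab:

x.x.x.x:/volume /localdir nfs tcp 0 0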
In the course of troubleshooting this, I've found that /var/adm/messages is reporting the following as occurring during boot:
Failed services in runlevel 3: network
(I should note that despite this error message, apparently at least some network services are started, since the box is accessible via SSH.)
My questions, then:
1. What should I be looking at to determine the cause of the service startup failure?
2. Would this indeed be likely to cause the problem with NFS described above?
3. If the answer to (2) is no, then any suggestions on what to look for?
Editing to add some information relating to the answers below.
It turns out that the network service is failing on bootup because one of the interfaces (there are two on this box) uses DHCP, and DHCP isn't available yet at that point in the boot sequence. So I've disabled that interface for now, then stopped and restarted the network service and the NFS client services, but I still get the same results.
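Roughly the sequence used, assuming the standard SLES init scripts (network for the interfaces, nfs for the NFS client side):

/etc/init.d/network restart
/etc/init.d/nfs restart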
There's no firewall on the client side. Also, iptables -L on the client side shows that everything is accepted; and there are no entries in /etc/hosts.allow or /etc/hosts.deny.
On the NFS server side, nothing has changed. The remote NFS server is indeed advertising that it allows both TCP and UDP for all of the NFS services (though there is an iptables rule blocking UDP).
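(For illustration only, the UDP block on the server is of roughly this shape, covering the NFS and mountd ports; the actual rule isn't reproduced here:)

iptables -A INPUT -p udp --dport 2049 -j REJECT
iptables -A INPUT -p udp --dport 4046 -j REJECT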
The /etc/fstab entry is pretty basic, just what you'd get from setting it up in yast:
x.x.x.x:/volume /localdir nfs defaults 0 0
rpcinfo -p for the client box shows only portmapper v2 running, advertising both TCP and UDP. For the server, it shows all of the usual services:
program vers proto port
100000 2 tcp 111 portmapper
100000 2 udp 111 portmapper
100024 1 udp 4047 status
100024 1 tcp 4047 status
100011 1 udp 4049 rquotad
100021 1 udp 4045 nlockmgr
100021 3 udp 4045 nlockmgr
100021 4 udp 4045 nlockmgr
100021 1 tcp 4045 nlockmgr
100021 3 tcp 4045 nlockmgr
100021 4 tcp 4045 nlockmgr
100005 1 udp 4046 mountd
100005 1 tcp 4046 mountd
100005 2 udp 4046 mountd
100005 2 tcp 4046 mountd
100005 3 udp 4046 mountd
100005 3 tcp 4046 mountd
100003 2 udp 2049 nfs
100003 3 udp 2049 nfs
100003 2 tcp 2049 nfs
100003 3 tcp 2049 nfs
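For reference, those two rpcinfo calls are simply the following, with x.x.x.x being the server's IP address as in /etc/fstab:

rpcinfo -p
rpcinfo -p x.x.x.x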
The mount call, with the /etc/fstab entry above, is simply:
mount /localdir
although I've also tried it with various options such as tcp, v3, etc.
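e.g., invocations along these lines (option spellings per nfs(5)):

mount -o tcp x.x.x.x:/volume /localdir
mount -o nfsvers=3,proto=tcp x.x.x.x:/volume /localdir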
Both the /etc/fstab entry (hence the mount) and the rpcinfo -p call are using the IP address, so there are no DNS resolution issues involved.
Check to make sure /etc/hosts.deny does not contain an entry for mountd, and check hosts.allow for similar reasons. For what it's worth, I usually clear out hosts.deny and use iptables to control access.

Use rpcinfo -p nfsserver to ensure that mountd is indeed advertising TCP; there's a -n option to disable TCP listening, which (IIRC on SuSE) would likely be set in /etc/sysconfig/nfs or thereabouts.
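A quick way to check both files and to confirm what mountd advertises (nfsserver here stands for the server's name or address, as above):

grep mountd /etc/hosts.allow /etc/hosts.deny
rpcinfo -p nfsserver | grep mountd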
As I understand your question, you can run rpcinfo from the client against the server, but you can't mount a filesystem from the NFS server on the NFS client, and you do not get any error message. What is the difference between your rpcinfo and mount calls? Do you use an IP address in one and an FQDN in the other? Could you please post both commands with their output and return codes?

A couple of things. First off, you state at the start that the NFS server doesn't allow UDP, and then in your edit mention that the remote NFS server is indeed advertising that it allows both TCP and UDP for all of the NFS services. This seems a little odd: why does the server advertise something that it doesn't allow?

Secondly, are you attempting to use NFS version 2 or version 3? Version 2 is typically UDP-only, whereas version 3 supports TCP. Perhaps manually specifying version 3 in the mount options (vers=3) will help? If it's defaulting to 2, then even specifying TCP won't do you any good.
I've also had issues with newer clients attempting to use version 4, when the server didn't quite support it. Your SLES upgrade may have resulted in a different default version. All the more reason to specify it explicitly.
Why don't you post the entry in /etc/fstab as well?
Run service network restart and see what messages you get; there should be some information there.

Also, try setting things explicitly and see where that gets you. For instance, in /etc/fstab:
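A sketch of what such an entry might look like, reusing the export and mount point from the question and the TCP ports from the rpcinfo output above (option names per nfs(5)):

x.x.x.x:/volume /localdir nfs nfsvers=3,proto=tcp,port=2049,mountport=4046 0 0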
This should at least bypass the portmapper, explicitly try connecting to the TCP ports that you list above, and make it easier to trace each channel with tcpdump during your debugging.
For reference, in case anyone else comes across this question and wants an answer:
I finally opened a ticket with Novell on this. It turns out that this is a known bug in SLES 10.2 (491140: mount ignores "proto=" for "nfs"), and there is a patch for it (util-linux-2.12r-35.35.2.x86_64.rpm). With that installed, the mount works as expected, and all requests are made over TCP. (Novell support also informed me that this is merged in SLES 10.3.)
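To double-check which util-linux build is installed after applying the patch, a simple query is enough:

rpm -q util-linux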