This problem is basically driving me insane at this point. I have an Ubuntu 16.04 NFS server that was working fine with this configuration:
/etc/fstab:
UUID=b6bd34a3-f5af-4463-a515-be0b0b583f98 /data2 xfs rw,relatime 0 0
/data2 /srv/nfs/cryodata none defaults,bind 0 0
/usr/local /srv/nfs/local none defaults,bind 0 0
and
/etc/exports
/srv/nfs 192.168.159.31(rw,sync,fsid=0,crossmnt,no_subtree_check)
/srv/nfs/cryodata 192.168.159.31(rw,sync,no_subtree_check)
/srv/nfs/local 192.168.159.31(rw,sync,no_subtree_check)
This has been working fine for months for the one NFS client set up so far, using these client-side /etc/fstab entries:
kraken.bio.univ.edu:/local /usr/local nfs4 _netdev,auto 0 0
kraken.bio.univ.edu:/cryodata /cryodata nfs4 _netdev,auto 0 0
However, since this is a very large storage server, it was decided that it needs to accommodate several labs. So, I moved all the stuff that had been scattered across the /data2 partition into a /data2/cryodata subdirectory, and updated /etc/fstab on the server and /etc/exports as follows:
/etc/fstab:
...
/data2/cryodata /srv/nfs/cryodata none defaults,bind 0 0
/data2/xray /srv/nfs/xray none defaults,bind 0 0
/data2/EM /srv/nfs/EM none defaults,bind 0 0
/usr/local /srv/nfs/local none defaults,bind 0 0
and
/etc/exports
/srv/nfs 192.168.159.31(rw,sync,fsid=0,crossmnt,no_subtree_check)
/srv/nfs/cryodata 192.168.159.31(rw,sync,no_subtree_check)
/srv/nfs/EM 192.168.159.31(rw,sync,no_subtree_check)
/srv/nfs/xray 192.168.159.31(rw,sync,no_subtree_check)
/srv/nfs/local 192.168.159.31(rw,sync,no_subtree_check)
This simply does not work! This is what happens when I try to mount the new export on the client using the same client /etc/fstab entry:
{nfs client} /etc/fstab:
...
kraken.bio.univ.edu:/local /usr/local nfs4 _netdev,auto 0 0
kraken.bio.univ.edu:/cryodata /cryodata nfs4 _netdev,auto 0 0
.
# mount -v /cryodata
mount.nfs4: timeout set for Sat Feb 24 09:24:38 2018
mount.nfs4: trying text-based options 'addr=192.168.41.171,clientaddr=192.168.159.31'
mount.nfs4: mount(2): Stale file handle
mount.nfs4: trying text-based options 'addr=192.168.41.171,clientaddr=192.168.159.31'
mount.nfs4: mount(2): Stale file handle
mount.nfs4: trying text-based options 'addr=128.83.41.171,clientaddr=129.116.159.31'
...
The /usr/local export continues to mount without problems. The first time I tried this I forgot to unexport/export the filesystems using exportfs -var before making changes, but since then I've switched back and forth, being careful to unexport and umount everything, with several server reboots in between. The original bind mount of the entire partition always works, and the bind mount of a subdirectory fails with the "Stale file handle" message every time. I've also tried enabling other NFS clients that have never mounted these partitions and get exactly the same error message, so this is definitely a server-side problem. I've checked /var/lib/nfs/etab to make sure it's cleared out between mount attempts, etc.
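For reference, the unexport/remount/re-export cycle described above can be sketched roughly as follows. This is only an illustration of the sequence, not a fix: the `run` helper and the `DRY_RUN` variable are hypothetical names added here so the steps can be printed without touching the system, and the script assumes it is run as root on the server with the bind mount already listed in /etc/fstab.

```shell
#!/bin/sh
# Sketch of a clean re-export cycle on the NFS server.
# run and DRY_RUN are illustrative helpers, not part of nfs-utils.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"          # only print the command
    else
        "$@"                 # actually execute it (needs root)
    fi
}

reexport_cycle() {
    run exportfs -ua               # unexport everything
    run umount /srv/nfs/cryodata   # drop the old bind mount
    run mount /srv/nfs/cryodata    # remount using the updated /etc/fstab entry
    run exportfs -ra               # re-export according to /etc/exports
    run exportfs -v                # list what is actually exported
}
```

Calling `DRY_RUN=1 reexport_cycle` prints the five commands in order; without `DRY_RUN` they are executed for real.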
I thought the technique of bind mounting into an nfs server root directory resolved all these kinds of issues, but apparently not? The odd thing is /usr/local is a subdirectory of another partition, and it always mounts fine. It is on an ext3 md raid 1, although I can't imagine this matters.
I've spent hours on this and have almost broken Google looking for a solution, to no avail.
Notice that I am only exporting bind mounted filesystems. This section from the exports man page is relevant:
My faulty assumption was that bind mounted filesystems have some kind of UUID that NFS can use automatically, an assumption reinforced by the fact that both these bind-mounted exports worked fine without an fsid:
However, this results in inconsistent behavior. When I added a bind mounted /opt export, I could change the export IP and mount on one machine, but got permission denied on another. The solution was to add an fsid to that export.
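As a sketch of what that looks like, here is the same fix applied to the exports from the question. The fsid numbers 1 through 4 are arbitrary assumptions; as far as I understand, the only requirements are that each export gets a distinct value and that fsid=0 stays reserved for the NFSv4 root.

```
# /etc/exports on the server, with an explicit fsid on every bind-mounted export
/srv/nfs 192.168.159.31(rw,sync,fsid=0,crossmnt,no_subtree_check)
/srv/nfs/cryodata 192.168.159.31(rw,sync,fsid=1,no_subtree_check)
/srv/nfs/EM 192.168.159.31(rw,sync,fsid=2,no_subtree_check)
/srv/nfs/xray 192.168.159.31(rw,sync,fsid=3,no_subtree_check)
/srv/nfs/local 192.168.159.31(rw,sync,fsid=4,no_subtree_check)
```

After editing the file, re-export with exportfs -ra and retry the mount on the client.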
So the solution is to always add an fsid to export bind mounted filesystems.
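To catch exports that still lack an fsid, a small check along these lines may help. It is a minimal sketch: the function name `exports_missing_fsid` is made up for this example, and it assumes one export per line with no backslash line continuations.

```shell
# Print the path of every export line that has no fsid= option.
# Assumes one export per line; comment and blank lines are skipped.
exports_missing_fsid() {
    awk '!/^[ \t]*(#|$)/ && !/fsid=/ { print $1 }' "$1"
}
```

Running `exports_missing_fsid /etc/exports` against the exports from the question would list /srv/nfs/cryodata, /srv/nfs/EM, /srv/nfs/xray and /srv/nfs/local, since only the /srv/nfs root carries an fsid there.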