I have a server that holds ZFS snapshots, which I export via NFS to the servers they back up so that files can be restored through a custom application written in-house. The issue is as follows:
NOTE: I am not using ZFS built-in NFS for a reason, so please don't tell me to use that!
This is all NFS v4
The host is running CentOS 6.2
The client is running CentOS 5.7
The host starts 8 NFS server (nfsd) threads by default.
On the backup server that holds the NFS shares, I can traverse the directory structure as deep as needed and see all expected files.
On the client, I can traverse the filesystem, but sometimes, and it really seems random, when I go 2 or more directories deep, I end up seeing the files from another server.
Here is an example:
[NFSSERVER /nfs/share]# ls -l
total 60
drwx--x--x 30 root root 4096 Feb 25 00:15 20120225
drwx--x--x 30 root root 4096 Feb 26 00:05 20120226
drwx--x--x 30 root root 4096 Feb 27 00:06 20120227
.....
so on
[NFSCLIENT /app/backups]# ls -l
total 60
drwx--x--x 30 nobody nobody 4096 Mar 2 00:25 20120225/
drwx--x--x 30 nobody nobody 4096 Mar 2 00:25 20120226/
drwx--x--x 30 nobody nobody 4096 Mar 2 00:25 20120227/
......
so on
You can see those are identical, as they should be.
This is where the problem starts. If I go into:
[NFSCLIENT /app/backups/20120225/home] # ls -l
When I run this ls -l on the client, sometimes I see the proper files and sometimes I see the home dir of another server.
If I go to [NFSSERVER /nfs/share/20120225/home]# ls -l
When I run this ls -l on the server, I always see the proper files. If I delete a folder in /nfs/share/, I can see the result on the client immediately. It is only when I go deeper that I see these "cross-mounted" filesystems.
Here is a portion of my /etc/exports (hostnames changed)
/nfs *.domain.com(fsid=0,ro,nohide,no_root_squash)
/nfs/server1/20120308 *.domain.com(ro,nohide,no_root_squash)
/nfs/server1/20120309 *.domain.com(ro,nohide,no_root_squash)
/nfs/server1/20120310 *.domain.com(ro,nohide,no_root_squash)
/nfs/server1/20120311 *.domain.com(ro,nohide,no_root_squash)
/nfs/server2/20120308 *.domain.com(ro,nohide,no_root_squash)
/nfs/server2/20120309 *.domain.com(ro,nohide,no_root_squash)
/nfs/server2/20120310 *.domain.com(ro,nohide,no_root_squash)
/nfs/server2/20120311 *.domain.com(ro,nohide,no_root_squash)
/nfs/server3/20120204 *.domain.com(ro,nohide,no_root_squash)
/nfs/server3/20120205 *.domain.com(ro,nohide,no_root_squash)
/nfs/server3/20120206 *.domain.com(ro,nohide,no_root_squash)
/nfs/server3/20120207 *.domain.com(ro,nohide,no_root_squash)
If I remove every line from /etc/exports EXCEPT the one that is cross-mounting (i.e., leaving only one entry in the file) and then reload the exports, the client shows all of the proper directories.
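One check that may be worth running, given that the problem disappears when only one export is left: exports(5) says that, unless fsid= is set, the NFS server identifies an exported filesystem by its UUID or by the device number. Snapshots of the same zvol are block-level copies, so their ext4 filesystems could carry identical UUIDs. Whether that is what is happening here is an unverified assumption. The sketch below uses a hypothetical helper, duplicate_uuids, and made-up UUID values to show how shared UUIDs could be spotted from a list of "mountpoint uuid" pairs.

```shell
# Hypothetical helper: reads "mountpoint uuid" pairs on stdin and prints
# every UUID that is shared by more than one mountpoint.
duplicate_uuids() {
    awk '{ count[$2]++ } END { for (u in count) if (count[u] > 1) print u }' | sort
}

# Example with made-up values. On the real host the UUID for each mounted
# snapshot could be read with something like: blkid -s UUID -o value <device>
printf '%s\n' \
    '/nfs/server1/20120308 aaaa-1111' \
    '/nfs/server1/20120309 aaaa-1111' \
    '/nfs/server2/20120308 bbbb-2222' | duplicate_uuids
```

If any UUID is printed, the exports that share it would be the first candidates to look at.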
So, stale NFS handles? More NFS servers running by default? Something else? Any ideas? I've been banging my head for a couple weeks now on this one.
UPDATE
This is the line my script runs to set up the directories that are being exported:
mount -t ext4 -o noload,ro /dev/zvol/backups/$HOST@$DATE"-00" /nfs/$HOST/$DATE
The /nfs/$HOST/$DATE folders are the ones being exported (as you can see in the exports file above)
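For context, here is a minimal sketch of how that mount line and its matching /etc/exports entry pair up for one host and date. The function names snapshot_mount_cmd and export_line are hypothetical and are not part of the actual script. They only print the command and the exports entry, they do not run anything.

```shell
# Hypothetical helpers; the names are illustrative, not from the real script.

# Print the mount command used for one host/date snapshot.
snapshot_mount_cmd() {
    local host="$1" date="$2"
    printf 'mount -t ext4 -o noload,ro %s %s\n' \
        "/dev/zvol/backups/${host}@${date}-00" "/nfs/${host}/${date}"
}

# Print the matching /etc/exports entry, in the same form as the lines above.
export_line() {
    local host="$1" date="$2"
    printf '%s %s\n' "/nfs/${host}/${date}" '*.domain.com(ro,nohide,no_root_squash)'
}

snapshot_mount_cmd server1 20120308
export_line server1 20120308
```

Each snapshot mounted this way gets exactly one exports entry, with the same path as the mount target.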