I have a Debian squeeze machine (2.6.32-5-amd64) that is both an NFS4 server and an NFS4 client (it mounts itself through NFS4). The local directory that leads directly to disk is /nfs4exports/mydir, whereas /nfs4mounts/mydir is the same thing mounted through NFS, using the machine's external IP address. Here is the line from fstab:
192.168.1.75:/mydir /nfs4mounts/mydir nfs4 soft 0 0
I have an application that writes many small files. If I write directly to /nfs4exports/mydir, it writes thousands of files per second; but if I write to /nfs4mounts/mydir, it writes 4 files per second or so. I can greatly increase speed if I add async to /etc/exports. (Writing a single large file to the NFS-mounted directory goes at more than 100 MB/s.)
I examine the server statistics and I see that whenever a file is written, it is "committed" (this also happens with NFSv3):
root@debianvboxtest:~# mount -t nfs4 192.168.1.75:/mydir /mnt
root@debianvboxtest:~# nfsstat|grep -A 2 'nfs v4 operations'
Server nfs v4 operations:
op0-unused op1-unused op2-future access close commit
0 0% 0 0% 0 0% 10 4% 1 0% 1 0%
root@debianvboxtest:~# echo 'hello' >/mnt/test1056
root@debianvboxtest:~# nfsstat|grep -A 2 'nfs v4 operations'
Server nfs v4 operations:
op0-unused op1-unused op2-future access close commit
0 0% 0 0% 0 0% 11 4% 2 0% 2 0%
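The two snapshots above can also be compared mechanically. The following is a minimal sketch, assuming the output has exactly the shape shown here (a title line, one line of operation names, one line of "count percent%" pairs); the function name parse_nfsstat_block is hypothetical, and real nfsstat output spans more rows than this handles.

```python
def parse_nfsstat_block(text):
    """Map operation names to counts from one header line and one values line."""
    lines = [ln for ln in text.strip().splitlines() if ln.strip()]
    # Skip a title line such as "Server nfs v4 operations:" if present.
    if lines[0].rstrip().endswith(":"):
        lines = lines[1:]
    names = lines[0].split()
    fields = lines[1].split()
    # Values come as "count percent%" pairs; keep only the counts.
    counts = [int(f) for f in fields if not f.endswith("%")]
    return dict(zip(names, counts))


# The two snapshots copied from the transcript above.
before = """Server nfs v4 operations:
op0-unused op1-unused op2-future access close commit
0 0% 0 0% 0 0% 10 4% 1 0% 1 0%"""

after = """Server nfs v4 operations:
op0-unused op1-unused op2-future access close commit
0 0% 0 0% 0 0% 11 4% 2 0% 2 0%"""

b = parse_nfsstat_block(before)
a = parse_nfsstat_block(after)
# Keep only the operations whose counter changed between the snapshots.
delta = {op: a[op] - b[op] for op in a if a[op] != b[op]}
print(delta)
```

For these two snapshots the counters that moved are access, close and commit, each by one, i.e. a single echo produced one COMMIT.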
Now in the RFC, I read this:
The COMMIT operation is similar in operation and semantics to the POSIX fsync(2) system call that synchronizes a file's state with the disk (file data and metadata is flushed to disk or stable storage). COMMIT performs the same operation for a client, flushing any unsynchronized data and metadata on the server to the server's disk or stable storage for the specified file.
I don't understand why the client commits. I don't think that the "echo" shell built-in command runs fsync; if echo wrote to a local file and then the machine went down, the file might be lost. In contrast, the NFS client appears to be sending a COMMIT upon completion of the echo. Why?
I am reluctant to use the async NFS server option, because it would apparently ignore COMMIT. I feel as if I had a local filesystem and I had to choose between syncing every file upon close and ignoring fsync altogether. What have I understood wrong?
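The comparison described in this question could be reproduced with a small script along these lines. It is only a sketch: write_small_files is a hypothetical helper, the file count and payload are arbitrary, and a temporary directory stands in for /nfs4exports/mydir and /nfs4mounts/mydir. The fsync flag is there to approximate, on a local filesystem, the "sync every file upon close" behaviour discussed above.

```python
import os
import tempfile
import time


def write_small_files(directory, count=200, payload=b"hello\n", fsync=False):
    """Write `count` small files into `directory`; return files per second."""
    start = time.perf_counter()
    for i in range(count):
        path = os.path.join(directory, "bench_%06d" % i)
        with open(path, "wb") as f:
            f.write(payload)
            if fsync:
                # Push Python's buffer to the kernel, then ask the kernel
                # to flush the file to stable storage before closing.
                f.flush()
                os.fsync(f.fileno())
    elapsed = time.perf_counter() - start
    return count / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    # Assumption: replace the temporary directory with the directory
    # to be measured, e.g. the local export or the NFS mount point.
    with tempfile.TemporaryDirectory() as d:
        print("plain close:   %.0f files/s" % write_small_files(d))
        print("fsync + close: %.0f files/s" % write_small_files(d, fsync=True))
```

Running it once against the exported directory and once against the NFS mount, with and without fsync=True, would show whether the NFS rate is closer to the local "plain close" figure or to the local "fsync + close" figure.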
Because this is how NFS works, and it is exactly how it should work, since it is a synchronous protocol: the client flushes its dirty data and sends COMMIT when the file is closed, so that another client opening the file afterwards sees the data (close-to-open consistency). What you need to make sure is that the exported file system is backed by LUNs that have NVRAM/BBWC protection and that handle fsync() accordingly, i.e. ignore it and mask SCSI FUA flags and SCSI SYNCHRONIZE CACHE commands. Also make sure that the file system has barriers disabled if it is backed by BBWC/NVRAM.
This way NFS keeps its synchronous semantics, equivalent to running fsync() after every write, but you get the performance of running asynchronously.