I've got a process that:
- writes a new '.tmp' file.
- uses a rename() syscall to replace an existing file.
- This file is being accessed from a remote NFS client.
We do this because we want atomic file updates, and the rename() spec says:
If newpath already exists, it will be atomically replaced, so that there is no point at which another process attempting to access newpath will find it missing. However, there will probably be a window in which both oldpath and newpath refer to the file being renamed.
We rely on this behaviour.
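For illustration, the write-then-rename pattern described above can be sketched roughly as follows. This is a minimal sketch, not the actual process: the function name `atomic_write` and the use of `tempfile.mkstemp` to pick the '.tmp' name are assumptions, and on POSIX systems `os.replace` ends up calling rename().

```python
import os
import tempfile


def atomic_write(path, data):
    """Write data to a '.tmp' file beside path, then rename it over path.

    Hypothetical helper for illustration; the real process may differ.
    """
    directory = os.path.dirname(path) or "."
    # Create the temp file in the same directory so the rename never
    # crosses a filesystem boundary.
    fd, tmp_path = tempfile.mkstemp(suffix=".tmp", dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # flush the data out before renaming
        # On POSIX this is rename(): an existing path is replaced.
        os.replace(tmp_path, path)
    except BaseException:
        # Remove the temp file if we failed before the rename completed.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

On a local filesystem a reader opening `path` sees either the old or the new contents; whether a remote NFS client gets the same guarantee is exactly the open question here.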
But here's the gotcha: just recently, since migrating to a new NetApp (Cluster mode, from 7 mode), we've had a process that very occasionally falls over with ENOENT (no such file or directory).
By 'very occasionally' I mean 4 or 5 times in the last few weeks, on a process that runs every 5 minutes or so.
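Purely as a sketch of how a reader could tolerate such a transient failure while the root cause is investigated, the open can be retried a few times on ENOENT. This is an assumption for illustration and not something the process above currently does; the name `read_with_retry` and the `attempts` and `delay` values are hypothetical.

```python
import time


def read_with_retry(path, attempts=3, delay=0.05):
    """Read path, retrying briefly if it is momentarily reported missing.

    Hypothetical helper; the attempt count and delay are arbitrary.
    """
    for attempt in range(attempts):
        try:
            with open(path, "rb") as f:
                return f.read()
        except FileNotFoundError:  # raised when open() fails with ENOENT
            if attempt == attempts - 1:
                raise  # still missing after the last attempt: give up
            time.sleep(delay)
```

This only papers over the symptom; it does not answer whether the server is honouring the atomicity guarantee.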
I'm investigating with the vendor whether this might be a bug in their NFS server.
But what I'm actually trying to figure out is whether that atomicity guarantee is actually applicable to NFS. Is anyone able to clarify for me whether rename()'s atomicity guarantee applies to multi-client NFS scenarios? I'm not sure whether this is a feature that has happened to work so far but was never guaranteed in the first place.
From: RFC1813
Procedure RENAME renames the file identified by from.name in the directory, from.dir, to to.name in the directory, to.dir. The operation is required to be atomic to the client.
In case it's relevant, we've got SL 6.5 clients hitting NFS datastores on ONTAP-CDOT 8.3.
Avoiding race conditions in NFS
This is always a fun challenge, and the only workaround I know of without rewriting applications is to mount the share with the sync option and change the NFS server to use no_wdelay. I don't recall how to set no_wdelay on a NetApp. The downside of this method is that if you have a lot of simultaneous writes to this share, they will get exponentially slower. You may want to ask NetApp how to set no_wdelay on that share, or just describe the problem to them. They may have better ideas. I have not touched a NetApp in at least 8 years.