An NFS server (it's some sort of NAS appliance, I do not know what brand, nor anything about how it is configured) has a single filesystem exported and mounted on a Linux client. Server and client are both on the same local network. The remote filesystem is laid out like this:
/nfs/server/mount/
    data1/
        client_uuid_1/
            data_20170101.gz
            data_20170102.gz
            ...
        client_uuid_2/
            data_20170101.gz
            data_20170102.gz
            ...
        ...
    data2/
        client_uuid_3/
            data_20170101.gz
            data_20170102.gz
            ...
        ...
In total, there are on the order of 40,000 client_uuid_N directories and 300,000 data files. I noticed that a script that has to scan all of the directories for new data files was very slow just doing a glob operation, so I investigated further and found this bizarre phenomenon:
The first time I run
    find /nfs/server/mount/data[12] -name data_*.gz > /dev/null
after remounting the file system, it completes in roughly five minutes. That's already painfully slow and suggests some sort of problem, but it gets weirder.
The second time I run that command, it takes four times longer -- nearly twenty minutes. This is exactly backward from what I would expect.
Why might this be happening?
The find command scans each directory for entries and also fetches file attributes, so that it can discover subdirectories and scan them recursively.
The first time you run find, the NFS client issues READDIR operations which, in addition to returning the directory listing, request the file attributes as well. This is quite efficient, because only a few NFS requests go over the network. On the second run, the directories have not changed, so the cached listing is used. However, the client then checks each file individually for attribute changes, and those per-file requests increase the total execution time. Technically, a filesystem object's type can't change (a file will never become a directory), but the NFS client does not know which attributes the application actually needs, because the stat call used by find queries all file attributes.
It sounds counterintuitive, but if you drop the file system cache (echo 3 > /proc/sys/vm/drop_caches on Linux), you will get better performance on the second run as well.
There was a discussion about this on the Linux NFS mailing list, if you want to dive deeper into the technical details of the Linux implementation.
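Since the cost described above comes from per-file attribute lookups, a scanning script can try to avoid asking for attributes it does not need. Below is a minimal sketch in Python, assuming the scanning script can be written that way; the function name find_data_files and the default pattern are illustrative, not taken from the original script. It relies on os.scandir, whose is_dir() uses the entry type returned with the directory listing when the filesystem supplies one. Whether an NFS mount supplies it depends on the client and server, so this may still fall back to a stat call per entry and is not guaranteed to help.

```python
import fnmatch
import os


def find_data_files(root, pattern="data_*.gz"):
    """Yield paths under root whose file name matches pattern.

    Illustrative helper: it never calls stat() explicitly, so the only
    attribute lookups are the ones os.scandir needs internally.
    """
    stack = [root]
    while stack:
        current = stack.pop()
        with os.scandir(current) as entries:
            for entry in entries:
                # is_dir() uses the entry type from the directory listing
                # when available and falls back to a stat call otherwise.
                if entry.is_dir(follow_symlinks=False):
                    stack.append(entry.path)
                elif fnmatch.fnmatch(entry.name, pattern):
                    yield entry.path
```

Calling list(find_data_files("/nfs/server/mount/data1")) twice and timing both calls would show whether the second-run slowdown still appears on a given mount.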