I have a compute and storage node. The home directory is shared via NFS.
I'm having a problem with IO when logged in on the compute node: it runs incredibly slowly. For example, a "$wget" or "$svn co" takes several orders of magnitude longer from the compute node than from the storage node, and a "$make" runs slowly because of the files it creates and installs. Presumably this is due to the IO being written from the compute node to the storage node. I've run several tests to convince myself that it is indeed the IO.
Both nodes are connected through a gigabit switch and it's a straightforward install.
I'm at a loss as to how to troubleshoot this because I'm very new to NFS setup. Any advice on how to even start troubleshooting would be greatly appreciated.
Thanks!
Anything with a bunch of small files will be slow via NFS vs. a local filesystem.
First off you could start looking at tuning your NFS options. I found some of these resources:
The IBM article is particularly important: as you tune the available settings on the clients, you can use nfsstat on the server to see what's going on.
Other options to look into:
One thing you can consider is caching your NFS mounts: How can I cache NFS shares on a local disk?
Since that requires changes on the client side (you don't have to touch the server), you might want to grab a node, mount it with caching enabled, and then do a few runs of some builds to see if it improves the situation.
One thing I've done in this situation when deploying large NFS /home systems is to provide some space on local disk on the client machine(s) for users to use for IO-intensive tasks, like /var/$username, but depending on your space requirements that might not be an option for you.
Caveats
It sounds like you're pulling and compiling in NFS home directories. When I did this at a university, where the situation was "Okay, everyone in this 50-person class, pull this source and type make!", it could totally crush a machine's responsiveness, depending on what people were building. You'll want to experiment with your NFS settings and measure performance to squeeze the most you can out of it.
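To make "measure performance" concrete, here is a rough sketch of a small-file write test you could run once against the NFS home and once against a local directory, before and after each settings change. The function name, file count and file size are my own illustrative choices, not a standard benchmark, and the fsync on every file is a deliberately harsh assumption about the workload.

```python
import os
import sys
import tempfile
import time


def time_small_file_writes(parent_dir, n_files=500, size_bytes=1024):
    """Create n_files small files under parent_dir; return elapsed seconds."""
    payload = b"x" * size_bytes
    # A temporary subdirectory keeps the test files together and removes
    # them automatically when the block exits.
    with tempfile.TemporaryDirectory(dir=parent_dir) as workdir:
        start = time.perf_counter()
        for i in range(n_files):
            path = os.path.join(workdir, f"f{i:05d}")
            with open(path, "wb") as fh:
                fh.write(payload)
                fh.flush()
                os.fsync(fh.fileno())  # push the data out instead of leaving it cached
        elapsed = time.perf_counter() - start
    return elapsed


if __name__ == "__main__":
    # Pass one or more directories, e.g. an NFS home and a local scratch dir.
    for directory in sys.argv[1:]:
        seconds = time_small_file_writes(directory)
        print(f"{directory}: {seconds:.2f} s")
```

Run it from the compute node with both directories as arguments and compare the two timings; a large gap that shrinks after a mount-option change suggests the change helped.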
Step 1 is to make sure you're using NFSv4; that sends small writes/syncs in batches, making it much more efficient for operations that involve lots of small files.
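If you're not sure which version the client actually negotiated, one way to check on a Linux client is to look at the mount options in /proc/mounts. The sketch below is only that, a sketch: the helper name is made up, and it assumes the usual Linux layout where NFS entries have type nfs or nfs4 and carry a vers= option.

```python
import os


def nfs_mount_versions(mounts_text):
    """Return {mount_point: version string or None} for NFS entries."""
    versions = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Expected fields: device, mount point, fs type, options, dump, pass
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        version = None
        for option in fields[3].split(","):
            if option.startswith("vers=") or option.startswith("nfsvers="):
                version = option.split("=", 1)[1]
        versions[fields[1]] = version
    return versions


if __name__ == "__main__":
    # /proc/mounts is Linux-specific; do nothing on systems without it.
    if os.path.exists("/proc/mounts"):
        with open("/proc/mounts") as fh:
            for mount_point, version in nfs_mount_versions(fh.read()).items():
                print(f"{mount_point}: NFS version {version}")
```

If /home shows up with a version starting with 3, the client is not using NFSv4 for that mount.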