The cluster resource manager Torque typically allocates compute nodes on an exclusive basis. However, when you have a lot of small jobs (like we do) running against multi-core compute nodes, this can result in a lot of wasted resources. Is there any way to configure Torque to allow non-exclusive allocation of the cores on a compute node?
(These jobs are all embarrassingly parallel, so we aren't concerned about contention for the shared network resource. We can't switch schedulers as our customer's job-scripts are all in PBS/Torque.)
OK, this actually turned out to be an issue with Maui. I'm throwing an answer here so others don't have to waste a day. :)
First: make sure your Torque nodes file lists each node with an np argument, e.g. nodename np=8. This makes sure the resource manager is aware of the correct number of processors on each node.
Second, for Maui: make sure your maui.cfg file includes the line NODEACCESSPOLICY SHARED. Then non-exclusive scheduling should work.
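For anyone who wants to see the two changes side by side, here is a rough sketch of what the files might look like. The hostnames node01 and node02 are made up for illustration, and the file locations depend on how Torque and Maui were installed (the nodes file usually lives under server_priv in the Torque home directory), so treat the paths as assumptions.

Torque nodes file:

```
node01 np=8
node02 np=8
```

maui.cfg (added alongside whatever settings are already there):

```
NODEACCESSPOLICY SHARED
```

In my experience both pbs_server and maui need to be restarted before they pick up the changes, but check that against your own setup.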