My lab is currently working on a 72-core machine running an Ubuntu 20.04-based distribution. We'd like to set up a job scheduler so that we can submit computing jobs to the different cores. I've gone through the Slurm quickstart but have not managed to get it to work: I can't get the daemon to start. Previously my lab used the TORQUE job scheduler, installing an older version as described here. I have yet to try installing TORQUE on this machine because the packages aren't showing up in apt-get and installing all the dependencies manually seems difficult. Has anyone managed to get a job scheduler to work?
Ubuntu has ionice, but as far as I can tell, it does absolutely nothing.
I suspect this is because Ubuntu replaced cfq with deadline and deadline doesn't support priorities.
Is there any possible way to have prioritized I/O on Ubuntu anymore?
EDIT: The context is that I have a database restore that easily consumes all my I/O and renders my system unusable until it has finished. I'd like it to remain usable for other tasks.
You cannot choose the CFQ scheduler in Kubuntu 19.04, since it has been removed from the 5.0 kernel. In my case I need CFQ because it gives the best performance on my rotating hard drive when running a virtual machine with Windows 10 as the guest OS; the other schedulers make the system unusable. By default, a Kubuntu 19.04 installation offers only two elevators, mq-deadline and none, and in my case both perform much worse than CFQ.
sudo cat /sys/block/sda/queue/scheduler
mq-deadline none
That leaves two other schedulers to try that are not enabled in the default installation: BFQ and Kyber.
Next I will describe how to enable the BFQ and Kyber modules.
1) First verify that the modules exist in the system with the following commands:
sudo modprobe bfq
sudo modprobe kyber-iosched
If there was no error, you can verify that the modules are loaded with the command
sudo cat /sys/block/sda/queue/scheduler
which should return
mq-deadline [bfq] kyber none
2) Make these modules load at system start:
sudo -i
echo kyber-iosched > /etc/modules-load.d/kyber-iosched.conf
echo bfq > /etc/modules-load.d/bfq.conf
3) Next, tell the system which scheduler to use. Create the following file if it does not exist:
/etc/udev/rules.d/60-scheduler.rules
with the following lines:
# set bfq scheduler for rotational devices
ACTION=="add|change",KERNEL=="sd[a-z]",ATTR{queue/rotational}=="1",ATTR{queue/scheduler}="bfq"
ACTION=="add|change",KERNEL=="sr[0-9]",ATTR{queue/rotational}=="1",ATTR{queue/scheduler}="bfq"
If you want to try Kyber instead of BFQ, replace "bfq" at the end of each rule with "kyber".
4) Make the system recognize the changes:
sudo udevadm control --reload; sudo udevadm trigger
and verify the changes with
sudo cat /sys/block/sda/queue/scheduler
mq-deadline kyber [bfq] none
Finished.
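To check the result from a script rather than by eye, the verification step can be sketched as below. The kernel marks the active scheduler with square brackets in the scheduler file, so the helper just extracts the bracketed name. The function name active_scheduler and the device name sda are assumptions for illustration only; adjust them for your system.

```shell
# Hypothetical helper: print the active I/O scheduler from the text
# found in /sys/block/<dev>/queue/scheduler. The active entry is the
# one in square brackets, e.g. "mq-deadline kyber [bfq] none".
active_scheduler() {
    # $1: contents of the scheduler file
    # Prints nothing if no entry is bracketed.
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# Usage on a live system (assumes the disk is sda).
dev=sda
sched_file="/sys/block/$dev/queue/scheduler"
if [ -r "$sched_file" ]; then
    active_scheduler "$(cat "$sched_file")"
fi
```

If this prints bfq after step 4, the udev rule has been applied to that disk.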
Sources:
https://community.chakralinux.org/t/how-to-enable-the-bfq-i-o-scheduler-on-kernel-4-12/6418
https://unix.stackexchange.com/questions/375600/how-to-enable-and-use-the-bfq-scheduler#376136
Being unable to SSH into a machine, I connected it to a monitor and found the following:
The machine runs Ubuntu Server 18.04 LTS on a first-generation 8-core Ryzen 1700. I've since restarted the machine and it works fine, but I'm not sure what caused this in the first place and want to avoid it happening again.