With LVM, there is a scheduler entry in /sys/block not only for your physical volumes, but also for each individual logical volume and for the raw device.
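For example, on such a setup the three levels can be compared side by side like this (the device names are placeholders; match them to your own /sys/block layout):
# sdb = raw disk, dm-0 = the physical volume (a device-mapper device here), dm-2 = one of the logical volumes
for dev in sdb dm-0 dm-2; do
    echo -n "$dev: "; cat /sys/block/$dev/queue/scheduler
done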
We have a Debian 6 LTS x64 system (kernel 2.6.32) running Xen hypervisor 4.0, on a 3Ware 9650 SE hardware RAID1. When running a virtual machine on each logical volume, which of these do you need to set the scheduler on if you want to influence how the OS schedules their IO? If you set the logical volume to deadline, will that even do anything when the physical volume is set to cfq? And if you do set deadline on the logical volume, will those deadlines be honoured even when the disk is slowing down because of IO on other LVs that are set to cfq?
Question relates to IO on VMs slowing down other VMs too much. All guests use noop as scheduler internally.
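For reference, this is how the scheduler is checked from inside a guest (xvda is just an example device name for a Xen PV guest):
cat /sys/block/xvda/queue/scheduler
# the scheduler shown in brackets is the active one, e.g. [noop] anticipatory deadline cfq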
Edit: according to this, in a multipath environment, only the DM's scheduler takes effect. So if I want to handle IO between virtual machines in a deadline manner, I have to set the DM path of the physical volume (dm-1 in my case) to deadline. Is that right? There is also a scheduler for sdc, which is the original block device of my dm-1. Why shouldn't it be done on that instead?
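If that is right, the runtime change would be something like this (untested sketch, using the device name from my setup):
echo deadline > /sys/block/dm-1/queue/scheduler
cat /sys/block/dm-1/queue/scheduler    # the active scheduler is shown in brackets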
Edit 2: but then someone says in the comments that dm-0/dm-1 don't have a scheduler in newer kernels:
famzah@VBox:~$ cat /sys/block/dm-0/queue/scheduler
none
On my system (Debian 6, kernel 2.6.32), I have:
cat /sys/block/dm-1/queue/scheduler
noop anticipatory [deadline] cfq
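A quick way to see which block devices expose a configurable scheduler at all (devices reporting only 'none' have nothing to tune):
grep . /sys/block/*/queue/scheduler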
Another question: do I have a multipath setup at all? pvs shows:
# pvs
  PV         VG                 Fmt  Attr PSize PFree
  /dev/dm-0  universe           lvm2 a-   5,41t 3,98t
  /dev/dm-1  alternate-universe lvm2 a-   1,82t 1,18t
But they were created with /dev/sd[bc]. Does that mean I have multipath, even though it's a standard LVM setup?
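Two ways to check what dm-0 and dm-1 actually are (both are standard device-mapper / multipath-tools commands):
dmsetup table    # a "multipath" target means multipath; plain LVM LVs show "linear" or "striped"
multipath -ll    # prints the multipath topology, or nothing if multipath-tools isn't managing the devices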
The main question, I guess, is do I have to set the scheduler on sdc or dm-1? If I do iostat, I see a lot of access on both:
Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sdc               0,00     0,00   13,02   25,36   902,71   735,56    42,68     0,08    2,17   0,73   2,79
dm-1             82,25    57,26   12,97   25,36   902,31   735,56    42,72     0,18    4,73   0,84   3,23
So, what is what and who is the boss? If it's sdc, I can tell you that setting it to deadline doesn't do a thing for the performance of my VMs. Looking at the difference in the 'requests merged' columns (first two), I'd say it's dm-1 that controls the scheduling.
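To double-check the stacking, i.e. that dm-1 really sits on top of sdc, the major:minor pairs reported by dmsetup can be matched against the device node:
dmsetup deps /dev/dm-1    # prints the (major, minor) pairs of the devices underneath dm-1
ls -l /dev/sdc            # compare against the major/minor numbers shown here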
So, the answer turned out to be simply: the underlying device. Newer kernels only have 'none' in /sys/block/*/queue/scheduler when there is no scheduler to configure. However, for a reason unknown to me, the devices on this server are created as multipath devices, therefore my fiddling with the scheduler on /dev/sd[bc] never did anything in the past. Now I have set dm-1 and dm-0 to deadline with read_expire=100 and write_expire=1500 (much more stringent than normal), and the results seem very good.
This graph shows the effect on disk latency in a virtual machine, caused by another virtual machine with an hourly task:
You can clearly see the moment where I changed the scheduler parameters.
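For the record, the change itself boils down to a few sysfs writes (a sketch; these settings do not survive a reboot, so they need to go into an init script or similar):
for dev in dm-0 dm-1; do
    echo deadline > /sys/block/$dev/queue/scheduler
    echo 100  > /sys/block/$dev/queue/iosched/read_expire     # deadline default is 500 ms
    echo 1500 > /sys/block/$dev/queue/iosched/write_expire    # deadline default is 5000 ms
done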
Hmm, Debian...
Well, I can share how Red Hat approaches this with its tuned framework. There are profiles for "virtual-host" and "virtual-guest". The profile descriptions are explained in detail here; the relevant part is that both the "dm-*" and "sdX" devices have their schedulers changed.
Also see:
CentOS Tuned Equivalent For Debian and Understanding RedHat's recommended tuned profiles
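For completeness, on a RHEL/CentOS box with tuned installed, switching profiles comes down to:
tuned-adm profile virtual-host    # on the hypervisor
tuned-adm profile virtual-guest   # inside a VM
tuned-adm active                  # show which profile is currently applied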
As VMware recommends, it is better to use the noop scheduler if your guests use a file as their virtual disk; that way the guest passes the IO straight through to the host, without the IO being reordered twice, once in the guest and once in the physical host.
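A common way to make that the default inside a guest is the elevator= kernel parameter, or a sysfs write at runtime (xvda is just an example device name):
# at runtime, per device:
echo noop > /sys/block/xvda/queue/scheduler
# permanently: append elevator=noop to the guest's kernel command line in its boot loader config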