I'm analyzing a problem where the performance of CPU-bound workloads inside virtual machines is often (though not always) far below what we would expect from the underlying hardware.
We're using Hyper-V on Windows Server 2012 R2. The server has dual Intel Xeon E5-2643 v2 @ 3.50 GHz.
Here are some figures that seem relevant; a sketch for sampling them yourself follows the list:
- Hyper-V Hypervisor Logical Processor, % Total Run Time, Instance _Total: Average 20%
- Hyper-V Hypervisor Virtual Processor, CPU Wait Time Per Dispatch, Instance _Total: Average 20,000 (this number seems to be well within the safe range, so it doesn't look like the hypervisor has to "steal" time from one VM's virtual CPUs to schedule logical CPU time for another VM; it seems to translate into an overhead of about 2%)
- Hyper-V Hypervisor Logical Processor, % of Max Frequency, Instance _Total: Average 34%
- The CPU-Z tool shows around 1200 MHz most of the time for Core #0 of both processors, which pretty much matches the % of Max Frequency reported by Performance Monitor (1200 / 3500 ≈ 34%)
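For anyone who wants to reproduce these figures, here is a minimal PowerShell sketch using Get-Counter. The counter paths are assumptions based on the English counter names above; adjust them to whatever Performance Monitor displays on your host:

```powershell
# Sample the three Hyper-V counters discussed above: 10 samples, 1 s apart.
$counters = @(
    '\Hyper-V Hypervisor Logical Processor(_Total)\% Total Run Time',
    '\Hyper-V Hypervisor Virtual Processor(_Total)\CPU Wait Time Per Dispatch',
    '\Hyper-V Hypervisor Logical Processor(_Total)\% of Max Frequency'
)

Get-Counter -Counter $counters -SampleInterval 1 -MaxSamples 10 |
    ForEach-Object {
        $_.CounterSamples |
            Select-Object Path,
                          @{ n = 'Value'; e = { [math]::Round($_.CookedValue, 1) } }
    }

# Rough cross-check of the current core clock without CPU-Z
# (note: WMI's CurrentClockSpeed is not refreshed on every query on all systems)
Get-WmiObject Win32_Processor |
    Select-Object Name, CurrentClockSpeed, MaxClockSpeed
```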
On a desktop with only a few cores, the core clock ramps up immediately as soon as a CPU-bound activity starts.
On our Hyper-V hosts, however, core speed only seems to go up once overall system load has been high for a few seconds. For example, if a VM has 4 virtual CPUs out of a total of 24 logical processors (12 physical cores with Hyper-Threading enabled), and that VM needs CPU power and Task Manager inside the VM shows nearly 100% CPU usage, most of the time the clock speed of the physical CPUs won't increase and performance is poor.
Obviously this is unwanted behavior. Think of a database server that needs 3x as long to answer a query because the host doesn't have "enough" load to step up the CPU frequency. That doesn't make any sense.
I found a blog post from 2011 describing the exact same behavior for VMware and Cisco blades. I didn't find information on this anywhere else.
I was actually able to get rid of this behavior by switching to the Windows "High performance" power plan in powercfg.cpl, at the cost of around 30% higher power usage. With it, I get better and more consistent performance, and Performance Monitor shows lower load figures.
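For reference, the same change can be scripted from an elevated prompt with powercfg.exe instead of powercfg.cpl; a minimal sketch, relying on the stock scheme alias SCHEME_MIN (the "High performance" plan, i.e. minimum power savings):

```powershell
# Show the currently active power scheme
powercfg /getactivescheme

# Switch to the built-in "High performance" plan
# (SCHEME_MIN = "minimum power savings")
powercfg /setactive SCHEME_MIN

# Confirm the switch
powercfg /getactivescheme
```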
(On an older server, I found an additional setting, "Processor power management | Minimum processor state", which could be set to 100% without disabling all other power-saving options. The newer servers only show "System cooling policy", which is set to "Active" even for the "Balanced" plan, so my only option was to choose "High performance".)
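Even when the GUI hides "Minimum processor state", it may still be reachable from the command line. A minimal sketch, assuming the stock SUB_PROCESSOR and PROCTHROTTLEMIN aliases are present on the host (verify with `powercfg /aliases`):

```powershell
# List the GUID aliases defined on this host
powercfg /aliases

# Set "Minimum processor state" to 100% (on AC power) for the active
# scheme, then re-apply the scheme so the change takes effect
powercfg /setacvalueindex SCHEME_CURRENT SUB_PROCESSOR PROCTHROTTLEMIN 100
powercfg /setactive SCHEME_CURRENT

# Inspect the resulting processor power-management values
powercfg /query SCHEME_CURRENT SUB_PROCESSOR
```

This would, in principle, pin the clocks at maximum while leaving the rest of the "Balanced" plan's power-saving options in place, much like the older server's GUI allowed.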
Is this really best practice for Hyper-V hosts, or is there another workaround? If SpeedStep is really a problem, why does Intel even build it into server CPUs and enable it by default, and why have I never seen this setting mentioned in a Hyper-V configuration guide?