I've noticed that the CPU usage on our 8-CPU database server, running SQL Server 2008, is not balanced at all.
Here are the one-day averages for a random day a while back, which is typical; the pattern is consistently asymmetric:
9, 15, 10, 21, 18, 21, 14, 9
(Per-core CPU usage graphs omitted here; the full image is really tall.)
Compared with our 4-CPU web servers, which are almost perfectly balanced all the time, that struck me as odd.
Now, this is a dedicated server, so the only thing running on it is SQL Server 2008 (and the built-in full text indexing, which we use pretty heavily), so I'm not sure why the CPU usage would be so asymmetric. Thoughts?
The scales are different on each graph; aside from a spike on four of them, your averages would all be about 10-25%.
How are your files / filegroups set up?
I'll plagiarize myself:
One more thought on IO: we were careful to put our biggest, most often used tables on filegroups with multiple files in them. One of the performance benefits is that SQL Server will thread requests to each file in the filegroup. So if BigOverUsedTable is on FileGroup1, FileGroup1 has four files in it, and your DB server has 8 cores, SQL Server will actually use four cores to run "select big number crunching nasty query from BigOverUsedTable", whereas otherwise it would only use one CPU. We got this idea from this MSDN article:
http://msdn.microsoft.com/en-us/library/ms944351.aspx
From TFA:
"Filegroups use parallel threads to improve data access. When a table is accessed sequentially, the system creates a separate thread for each file in parallel. When the system performs a table scan for a table in a filegroup with four files, it uses four separate threads to read the data in parallel. In general, using multiple files on separate disks improves performance. Too many files in a filegroup can cause too many parallel threads and create bottlenecks."
We have four files in our filegroup on an 8 core machine due to this advice. It's working out well.
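For anyone wanting to try this, a minimal sketch of the setup described above. All database, file, and path names here are hypothetical placeholders, not our actual configuration:

```sql
-- Add a filegroup with four files (hypothetical names and paths).
ALTER DATABASE MyDb ADD FILEGROUP FG_BigTables;

ALTER DATABASE MyDb ADD FILE
    (NAME = 'BigTables1', FILENAME = 'D:\Data\BigTables1.ndf'),
    (NAME = 'BigTables2', FILENAME = 'D:\Data\BigTables2.ndf'),
    (NAME = 'BigTables3', FILENAME = 'D:\Data\BigTables3.ndf'),
    (NAME = 'BigTables4', FILENAME = 'D:\Data\BigTables4.ndf')
TO FILEGROUP FG_BigTables;

-- Place the heavily used table on the new filegroup.
CREATE TABLE dbo.BigOverUsedTable
(
    Id INT IDENTITY PRIMARY KEY,
    Payload NVARCHAR(MAX)
) ON FG_BigTables;
```

Ideally the files would also sit on separate physical disks, per the MSDN advice quoted above.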
Edit: this has another (possibly) better answer now. The graph scales were misleading; if you look closely, each processor is actually only about 20% loaded, as uzbones points out.
Edit: We can actually tell that multi-file filegroups help, because we didn't put all our tables in the filegroup with four files. Big queries against tables in the single-file filegroup only use one CPU, but queries against the table in the four-file filegroup hit four CPUs.
Check this out:
http://blogs.technet.com/mat_stephen/archive/2005/02/02/365325.aspx
SQL Server might only be writing to a handful of files, with each file being serviced by a particular processor.
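You can check how many files each filegroup actually contains with a quick DMV query; this is a generic sketch, run in the context of the database in question:

```sql
-- List each data file, its filegroup, and size (size is in 8 KB pages).
SELECT fg.name          AS filegroup_name,
       df.name          AS logical_file_name,
       df.physical_name,
       df.size * 8 / 1024 AS size_mb
FROM sys.database_files AS df
LEFT JOIN sys.filegroups AS fg
    ON df.data_space_id = fg.data_space_id
ORDER BY fg.name, df.name;
```

If everything lands in one filegroup with one file, that would be consistent with the lopsided CPU picture.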
The first thing I check for stuff like that is drivers. I've had a lot of problems with network teaming and iSCSI MPIO drivers sticking on specific cores. I bet that's not the issue here though since it looks like it's happening across 4 cores - I usually only see that with 2 cores. I'll ask around to see if anybody's seen it this wide.
I've also seen it with NUMA boxes where there's a memory mismatch - say half the cores are hooked up to 16 GB of RAM and the others are hooked up to 8. Google for IBM x460 NUMA if you want to see some funny info about that. The x460 and related models let you daisy-chain several servers together to create big iron - kinda relevant with the scale up vs. out blog entry. They're awesome machines.
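If you want to rule out a NUMA imbalance from inside SQL Server, the OS-level DMVs give a rough picture. This is just a sketch; column availability can vary by build, so treat it as a starting point:

```sql
-- Rough NUMA layout check: which schedulers sit on which node.
SELECT node_id,
       memory_node_id,
       online_scheduler_count,
       node_state_desc
FROM sys.dm_os_nodes
ORDER BY node_id;
```

A node with far fewer schedulers or a different state than its peers would be worth digging into.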
Because flushing CPU caches is so incredibly expensive that the kernel tries to avoid it at all costs.
(Note: At least Linux does; I'd be surprised if Windows didn't have the same behaviour)
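If you actually wanted to override that behaviour and pin SQL Server's schedulers to specific cores yourself, the classic knob on SQL Server 2008 is the affinity mask. The bitmask value below is purely illustrative (0x0F = CPUs 0-3); the default of 0 lets Windows schedule freely, which is usually what you want:

```sql
-- Illustrative only: pin SQL Server schedulers to CPUs 0-3.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'affinity mask', 0x0F;
RECONFIGURE;
```

In this case, though, the graphs suggest nothing is wrong, so leaving affinity at the default is the right call.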