I have a Solaris 10 x86_64 system running MySQL 5.5. Under heavy usage times, we're getting very slow responses from the database: slow queries running into minutes that normally return in sub-second times. CPU utilization is in the 60-70% range. Load average regularly gets well into the 20s, infrequently into the 40s, and I've seen it up to the 50s. (Two four-core CPUs with HyperThreading enabled.) It acts like an I/O problem, as if it's waiting on disks to write, but I'm not seeing any indications that there is any actual I/O problem. Average disk wait times are consistently 0, average wait queues are in the 0.2-0.3 range, and disk busy percentages occasionally creep into the 15% area. (All of this as according to sar.)
The storage is a zfs zpool of 5 zdev mirrors of two SAS drives. I do not have an intent log device, but I don't see that as being an issue with this workload.
What am I missing?
Wanted to give you a more Solaris-like answer:
On a multiprocessor/multicore box you cannot really use CPU load for much. Initially use
mpstat
rather thanprstat
/top
if you want to see if one of your cores occasionally gets into full utilization.If
mpstat
has 8 lines of output it means you have 8 CPU cores and then any process that you see inprstat
consuming more than 12.5% CPU resources (100/8) is likely to be CPU bound. To test if this is really so you can useprstat -L -p <pid>
to see if any individual thread of that process hits 12.5% because then you know for sure that the process is CPU bound. You have quite a few available CPU cores on your box yet you should always remember that any single processing thread can only live on one CPU core. In order for MySQL to utilize your box it becomes a matter of how good it is to divide the work on multiple threads. If there's a single hot thread in MySQL then your powerful machine doesn't help much.Also you can find quite a few people that will advise to turn Intel HyperThreading off on Linux/Solaris server-workload because it actually performs better without it. YMMM. From what I understand Intel HyperThreading is excellent for the desktop type of workload but for a server that is only supposed to do one thing and do that fast it may work detrimental to performance.
There are at least a dozen routes you can go from here so it is somewhat difficult to advise you before you have more facts.
You mention that the load average is regularly into the 20s and sometimes into the 40s or 50s. Then you mention you have 16 available processors ((2cpus x 4 cores) x 2) so a load average in the 20s means your processes are fighting for CPU time and in the 40s or 50s you have a large amount of wait.
CPU utilization and CPU load don't map together all that well, but I would expect to see a bit higher utilization unless things are not threading correctly.
There is another post that explains load average well using a scenario of traffic on a bridge:
Understanding Linux CPU Load
The short is - in a perfect world your load average would never go higher than your count of processors (16 in your case).
I this most likely the system just being overtaxed rather than a ZFS or MySQL issue.
** Edit, I see there was a comment added that said you are seeing some 100% CPU utilization. This would also fall in line with processes just waiting around for CPU time.