When running iotop from the DTraceToolkit (DTT) on a write-heavy Solaris 10 server, which runs multiple zones with MySQL daemons installed, I get the following output:
    UID    PID   PPID CMD          DEVICE  MAJ MIN D     BYTES
     70  26636      1 mysqld       sd1      10  64 R    360448
     70  25940      1 mysqld       sd1      10  64 R    530432
      0      5      0 zpool-rpool  sd1      10  64 W  17250816
What bothers me here is the fact that zpool-rpool takes up most of the IO. What can I do to see which of the MySQL or other processes really takes up the IO - a more elaborate breakdown? If zpool-rpool represents "writes to ZFS", then iotop is really not helping me here... :)
Thanks!
You might find Brendan Gregg's recent blog series on filesystem latency useful. He shows a couple of scripts for investigating filesystem usage with the syscall provider (which should be more reliable for identifying the responsible processes than the io provider used by iotop).
For example, the syscall-read-zfs.d script shown in Part 4 could easily be modified to probe on writes and aggregate on pid rather than execname. The output of this script may also be more useful than iotop because it shows the number of IOs and the distribution of IO latency per process. For a database, the latencies of reads and synchronous writes are direct measures of performance pain - much easier to interpret than bytes per second.
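A minimal sketch of that modification might look like the following. It is not the script from the blog, just an approximation of the idea: it assumes the `fds[]` array is available on your Solaris 10 update, that the MySQL data files live on ZFS, and that the daemons write through `write(2)` and `pwrite(2)` (32-bit processes may use `pwrite64` instead). The aggregation names `@writes` and `@latency` are made up for illustration.

```
#!/usr/sbin/dtrace -s

#pragma D option quiet

/* Timestamp each write(2)/pwrite(2) issued on a ZFS-backed file descriptor. */
syscall::write:entry,
syscall::pwrite:entry
/fds[arg0].fi_fs == "zfs"/
{
	self->start = timestamp;
}

/* On return, count the write and record its latency per process. */
syscall::write:return,
syscall::pwrite:return
/self->start/
{
	@writes[pid, execname] = count();
	@latency[pid, execname] = quantize(timestamp - self->start);
	self->start = 0;
}

/* Print both aggregations when the script is stopped with Ctrl-C. */
dtrace:::END
{
	printf("%-8s %-16s %s\n", "PID", "EXECNAME", "WRITES");
	printa("%-8d %-16s %@d\n", @writes);
	printf("\nWrite latency (ns) by PID and execname:\n");
	printa(@latency);
}
```

Run it as root from the global zone for a representative interval, then press Ctrl-C to get the per-process write counts and latency histograms.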
If you have time, I also highly recommend watching his presentation at BayLISA for a hands-on demonstration of how he goes about investigating MySQL query performance issues.
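If you want to measure which applications are reading/writing the most, you want to measure at the syscall level. At the device level you only see kernel threads doing their work: ZFS buffers writes and flushes them asynchronously in transaction groups, so the disk IO is attributed to zpool-rpool rather than to the process that issued the write.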
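As a rough sketch of measuring at the syscall level, the script below sums the bytes each process asks to write, broken down by zone. It assumes the same things as any syscall-based approach on this setup: `fds[]` is available, the files are on ZFS, and the writers use `write(2)` or `pwrite(2)`. The 10-second interval and the aggregation name `@bytes` are arbitrary choices for illustration, and the totals are requested bytes, not what eventually reaches the disk.

```
#!/usr/sbin/dtrace -s

#pragma D option quiet

/* arg2 is the requested byte count for both write(2) and pwrite(2). */
syscall::write:entry,
syscall::pwrite:entry
/fds[arg0].fi_fs == "zfs"/
{
	@bytes[zonename, pid, execname] = sum(arg2);
}

/* Every 10 seconds, print the totals and reset them. */
profile:::tick-10s
{
	printf("%-16s %-8s %-16s %s\n", "ZONE", "PID", "EXECNAME", "BYTES");
	printa("%-16s %-8d %-16s %@d\n", @bytes);
	printf("\n");
	trunc(@bytes);
}
```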