I had downtime today on my server because of high IOWait. I could do almost nothing on the server. I only managed to run top to see the IOWait, but I did not have iotop installed at that time, so I couldn't see which process was causing it. Is there any way to monitor the iowait live and, in case of high load, dump information about the process which is causing it?
I would take the approach of understanding why you have I/O wait. It's probably not a process you'd want to kill indiscriminately, but a result of your system configuration and resources.
Do you have enough storage resources? Is your server physical or virtual? Does your application write a lot of data? These are all factors that could impact the I/O wait levels and performance.
When you were able to check top, did you see a high system load as well? If so, you may want to alert on that. A simple way to check and notify for such conditions is to use a system monitor like Monit.
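If you would rather capture the evidence yourself, here is a minimal sketch of the "watch iowait and dump process info" idea from the question, assuming a Linux server with /proc mounted. It samples the aggregate cpu line of /proc/stat, computes the iowait share between two samples, and when that crosses a threshold it lists processes in uninterruptible sleep (state D), which are usually the ones blocked on I/O. The function names, the 30% threshold and the 5 second interval are my own illustrative choices, not something from your setup, and a process in state D is only a suspect, not proof of the cause.

```python
import os
import time


def read_cpu_times(path="/proc/stat"):
    """Return the aggregate CPU counters from the first line of /proc/stat."""
    with open(path) as f:
        fields = f.readline().split()
    # fields[0] is the label "cpu"; then user, nice, system, idle, iowait, ...
    return [int(x) for x in fields[1:]]


def iowait_percent(before, after):
    """Percentage of elapsed CPU time spent in iowait between two samples."""
    # Only the first 8 counters are summed (user .. steal); guest time is
    # already included in user/nice, so adding it would count it twice.
    deltas = [b - a for a, b in zip(before[:8], after[:8])]
    total = sum(deltas)
    return 100.0 * deltas[4] / total if total > 0 else 0.0


def blocked_processes(proc="/proc"):
    """Return (pid, name) for processes in uninterruptible sleep (state D)."""
    result = []
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "stat")) as f:
                data = f.read()
        except OSError:
            continue  # the process exited while we were scanning
        # The name is wrapped in parentheses and may itself contain spaces
        # or parentheses, so split on the last ")" in the line.
        head, _, tail = data.rpartition(")")
        rest = tail.split()
        if rest and rest[0] == "D":
            result.append((int(entry), head.partition("(")[2]))
    return result


def monitor(threshold=30.0, interval=5.0):
    """Print blocked processes whenever iowait reaches the threshold (assumed values)."""
    previous = read_cpu_times()
    while True:
        time.sleep(interval)
        current = read_cpu_times()
        pct = iowait_percent(previous, current)
        previous = current
        if pct >= threshold:
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            print(f"{stamp} iowait {pct:.1f}%", flush=True)
            for pid, name in blocked_processes():
                print(f"  pid {pid} ({name}) is in state D", flush=True)
```

Calling monitor() from a small script started at boot, with its output redirected to a log file, would leave a record you can read after the server recovers. It complements rather than replaces Monit or iotop: it tells you who was blocked, not how many bytes each process was reading or writing.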