I have a CentOS-5 server that has very little load. A typical output of "sar" command shows like this:
02:20:01 PM CPU %user %nice %system %iowait %steal %idle
02:30:01 PM all 0.63 0.00 1.23 0.31 0.00 97.84
02:40:01 PM all 0.92 0.32 1.34 0.45 0.00 96.97
02:50:01 PM all 0.76 0.00 0.93 0.77 0.00 97.54
03:00:01 PM all 0.41 0.00 0.72 0.15 0.00 98.72
03:10:01 PM all 0.78 0.32 1.49 1.83 0.00 95.58
03:20:01 PM all 0.27 0.00 0.50 0.62 0.00 98.61
However, I am observing that every night at 4:10AM, the CPU usage is very high. This happens almost every night except few days in a month. The "sar" command shows something like this:
03:00:01 AM all 0.10 0.00 0.10 0.11 0.00 99.68
03:10:01 AM all 0.08 0.32 0.18 0.21 0.00 99.21
03:20:01 AM all 0.03 0.00 0.03 0.19 0.00 99.76
03:30:01 AM all 0.01 0.00 0.02 0.06 0.00 99.91
03:40:02 AM all 0.75 0.32 3.21 15.43 0.00 80.28
03:50:01 AM all 0.95 0.00 3.43 9.25 0.00 86.36
04:00:01 AM all 0.69 0.00 3.79 4.86 0.00 90.66
04:10:01 AM all 22.38 0.54 4.47 30.92 0.00 41.69
04:20:01 AM all 2.99 0.02 4.58 4.95 0.00 87.46
04:30:01 AM all 0.34 0.00 0.61 13.22 0.00 85.83
04:40:01 AM all 0.05 0.32 0.17 0.52 0.00 98.95
04:50:01 AM all 0.22 0.00 0.36 0.11 0.00 99.32
05:00:01 AM all 0.23 0.00 0.36 0.12 0.00 99.28
How can I determine what is causing this peak in CPU usage and iowait etc. I examined /var/log/messages but could not determine. There is no other cron job or other things set to run at that time. Since this is showing under %user, I can only make out that it is some user process. How can I exactly determine the cause?
This is when the
cron.daily
scripts run... By default, it runs daily just after 4:00am.From
/etc/crontabs
And in
/etc/cron.daily
.The culprit is probably your
mlocate.cron
, which runsupdatedb
and can be quite intensive on some systems:As stated in /etc/crontab, there is an automatic cron run scheduled at 04:02 AM, each day:
Running a ls /etc/cron.daily/ will show you the running commands scheduled for 04:02AM (they are executed sequentially). You could easily modify them to output load 'before' and after they are executed.
However, if this is a standard CentOS install, there is a very high probability that the script responsible for the high load is either mlocate.cron, 0logwatch (if it is installed) or logrotate.
mlocate will execute a full filesystem scan (which would explain the increased i/o wait on your system), 0logwatch will parse your log file (which can be quite big, resulting in i/o operations) and logrotate can also create significant i/o if it moves file across file systems.
Since you are saying that this happens at 4:10 almost everyday a crude way would be to run
ps aux
in cron for every minute between 4:05 to 4:15 and get the output written in a file. You can then check the %CPU usage of the command that is eating up your cpu.Likely to be a cron job - possibly "locatedb" being updated?
Adding your own cron'd output of "ps" for a night might help spot the culprit.