On a multi-user system, I want to measure each user's CPU usage in seconds of CPU time. For the purpose of this measurement, I assume that if a PID belongs to a user, this user is causing the CPU time; that is, I'm ignoring daemons and the kernel.
Currently I'm doing this, every five seconds:
- Get each user and the PIDs they are running via ps aux
- For each PID, get x, the sum of utime, cutime, stime and cstime from /proc/[pid]/stat
- Calculate t = x / interval (interval isn't always exactly 5 seconds when there's high load)
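The steps above can be sketched roughly as follows. This is only an illustration under some assumptions: it reads /proc directly instead of parsing ps aux, it takes the owner of /proc/[pid] as the process owner, and the function names (cpu_ticks_from_stat, snapshot, cpu_seconds_used) are made up for this example. Because the counters in /proc/[pid]/stat are cumulative since the process started, the sketch works with the difference between two snapshots. It cannot see a process that starts and exits between two snapshots.

```python
import os

# Clock ticks per second, used to convert the counters to seconds.
CLK_TCK = os.sysconf("SC_CLK_TCK")


def cpu_ticks_from_stat(stat_line):
    """Return utime + stime + cutime + cstime (in ticks) from one stat line."""
    # The command name (field 2) may contain spaces or parentheses,
    # so only parse what follows the last ')'.
    rest = stat_line[stat_line.rindex(")") + 1:].split()
    # rest[0] is field 3 (state), so field n is rest[n - 3];
    # utime, stime, cutime, cstime are fields 14..17.
    return sum(int(v) for v in rest[11:15])


def snapshot():
    """Map uid -> total CPU ticks over all processes visible right now."""
    totals = {}
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            uid = os.stat(f"/proc/{entry}").st_uid
            with open(f"/proc/{entry}/stat") as f:
                ticks = cpu_ticks_from_stat(f.read())
        except (FileNotFoundError, ProcessLookupError, PermissionError):
            continue  # process went away (or is unreadable) while scanning
        totals[uid] = totals.get(uid, 0) + ticks
    return totals


def cpu_seconds_used(before, after):
    """Per-uid CPU seconds consumed between two snapshots.

    Assumption: a negative difference (processes that exited) is clamped to 0.
    """
    return {
        uid: max(0, ticks - before.get(uid, 0)) / CLK_TCK
        for uid, ticks in after.items()
    }
```

Taking one snapshot(), sleeping for the interval, taking another and passing both to cpu_seconds_used gives CPU seconds per uid for that interval; dividing by the measured interval gives the per-second figure.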
If I run this, I get sensible-looking values. For instance: a user on this system was spinning in Python (while True: pass), and the system was showing roughly 750 milliseconds of CPU time per second. When the system hung for a bit, it reported 1600 ms for one 1-second interval. That seems about right, but I understand that these values can be deceptive, especially given that I don't really understand them.
So my question is this:
What is a fair and correct way to measure CPU load on a per-user basis?
The method has to be rather accurate. There might be many hundreds of users on this system, so extracting percentages from ps aux will not be accurate enough, especially for short-lived threads which many pieces of software like to spawn.
While this might be complicated, I absolutely know it's possible. This was my starting point:
The kernel keeps track of a processes creation time as well as the CPU time that it consumes during its lifetime. Each clock tick, the kernel updates the amount of time in jiffies that the current process has spent in system and in user mode. — (from the Linux Documentation Project)
The value I'm after is the number of seconds (or jiffies) that a user has spent on the CPU, not a percentage of system load or CPU usage.
It's important that we measure CPU time while the processes are still running. Some processes will only last for half a second, some will last for many months - and we need to catch both sorts, so that we can account for users' CPU time with fine granularity.
Sounds like you need process accounting.
http://www.faqs.org/docs/Linux-mini/Process-Accounting.html
On Ubuntu, the process accounting tools are in the acct package.
To get a per-user report, run
This will give a line for each user showing the username and their total cpu time:
One of the more obvious answers is to just extend what you're currently doing.
I came across this monitoring process, which uses bash scripting and MySQL to track users' CPU time, but it covers a much longer time frame than the one you were talking about.
Hopefully this can give you some more ideas about the direction you're looking to head in.
http://www.dba-oracle.com/t_oracle_unix_linux_vmstat_capture.htm
This will also handle processes that have been running for days. I'm not sure how to extend it to weeks, months or years.