According to http://net-snmp.sourceforge.net/docs/mibs/ucdavis.html#scalar_notcurrent, ssCpuUser, ssCpuSystem, ssCpuIdle, etc. are deprecated in favor of the raw variants (ssCpuRawUser, etc.).
The former values (which don't cover things like nice, wait, kernel, interrupt, etc) returned a percentage value:
The percentage of CPU time spent processing user-level code, calculated over the last minute.
This object has been deprecated in favour of 'ssCpuRawUser(50)', which can be used to calculate the same metric, but over any desired time period.
The raw values return the "raw" number of ticks the CPU spent:
The number of 'ticks' (typically 1/100s) spent processing user-level code.
On a multi-processor system, the 'ssCpuRaw*' counters are cumulative over all CPUs, so their sum will typically be N*100 (for N processors).
My question is: how do you turn the number of ticks into a percentage?
That is, how do you know how many ticks there are per second? The description says "typically 1/100s", which implies not always, and which could mean either one tick every 100 seconds or that a tick represents 1/100th of a second.
I imagine you also need to know how many CPUs there are, or you need to fetch all the CPU values and add them together. I can't seem to find a MIB that gives an integer value for the number of CPUs, which makes the former route awkward. The latter route seems unreliable because some of the numbers (sometimes) overlap. For example, ssCpuRawWait has the following warning:
This object will not be implemented on hosts where the underlying operating system does not measure this particular CPU metric. This time may also be included within the 'ssCpuRawSystem(52)' counter.
Some help would be appreciated. Everywhere seems to just say that % is deprecated because it can be derived, but I haven't found anywhere that shows the official standard way to perform this derivation.
The second component is that these "ticks" seem to be cumulative instead of over some time period. How do I sample values over some time period?
The ultimate information I want is: % of user, system, idle, nice (and ideally steal, though there doesn't seem to be a standard MIB for this) "currently" (over the last 1-60s would probably be sufficient, with a preference for smaller time spans).
Since these are absolute counters, you have to retrieve these metrics regularly and do the calculation yourself. So, if you want the figure for the next minute, you get the numbers, wait a minute, and get the numbers again. The SNMP agent does not update those counters very often, so you may not be able to sample them every second anyway.
Once you have the raw user, nice, system, idle, interrupts counters you can get the total number of ticks by summing these up. Even the MIB description says that adding them up is expected.
Then, regardless of how long it has been since you took the measurements, the total number of ticks over that period is total1 - total0, and the idle percentage would be 100 * (idle1 - idle0) / (total1 - total0).
You are asking "how do you know how many ticks per second it is typically" but, as you can see, you don't need to know that.
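The delta calculation described above can be sketched in Python. This is only an illustration, not an official derivation: the dictionary keys ("user", "nice", and so on) and the sample numbers are made up, and fetching the counters over SNMP is left out. The modulo step assumes the counters are Counter32 values that may wrap around once between two samples.

```python
COUNTER32_MODULUS = 2 ** 32  # assumption: the raw counters are Counter32 and wrap


def counter_delta(old, new):
    """Difference between two Counter32 readings, tolerating one wraparound."""
    return (new - old) % COUNTER32_MODULUS


def cpu_percentages(sample0, sample1):
    """Return each counter's share of the ticks elapsed between two samples.

    Both arguments are dicts mapping a counter name (hypothetical keys such
    as "user" or "idle") to the raw tick count read at that moment.
    """
    deltas = {name: counter_delta(sample0[name], sample1[name]) for name in sample0}
    total = sum(deltas.values())
    if total == 0:
        # The agent has not refreshed its counters yet; sample again later.
        return {name: 0.0 for name in deltas}
    return {name: 100.0 * ticks / total for name, ticks in deltas.items()}


# Made-up readings taken some time apart.
before = {"user": 1000, "nice": 0, "system": 500, "idle": 8500}
after = {"user": 1150, "nice": 0, "system": 550, "idle": 9300}
print(cpu_percentages(before, after))
# {'user': 15.0, 'nice': 0.0, 'system': 5.0, 'idle': 80.0}
```

Because every percentage is a share of the summed deltas, neither the tick length nor the number of CPUs appears anywhere. Only include counters that do not overlap on your platform, since the MIB warns that wait time may already be counted in the system counter.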
Since most of the Linux distros have the 1/100 ticks, a very simple way to do it is via bash:
On RH/CentOS and Ubuntu, it works well and precisely with a 5-second interval. With anything shorter, SNMP does not increment the Counter32 and you get zeroes all the time.
I have run it in loops and compared the results with iostat -c 5 100, while also generating I/O with dd, and it worked well.
You can use any of the ssCpuRaw OIDs (1.3.6.1.4.1.2021.11.5x, from 50 to 57 if I am not wrong; in my example I have used ssCpuRawWait, 54), and the 1.3.6.1.2.1.25.3.3.1.2 | wc -l is there to get the number of cores.
You need to divide the "delta" of the counter by the interval (5 in my case) and by the number of cores; this is basically what the script does!
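The arithmetic for a single counter can be written out as a small Python sketch. The function name and parameters are hypothetical, and the default of 100 ticks per second is the assumption stated above, not something the agent reports.

```python
def counter_percent(delta_ticks, interval_seconds, cpu_count, ticks_per_second=100):
    """Percentage of total CPU capacity represented by one counter's delta.

    Capacity is the number of ticks all CPUs together can accumulate
    during the interval.
    """
    capacity = interval_seconds * cpu_count * ticks_per_second
    return 100.0 * delta_ticks / capacity


# Made-up numbers: the counter grew by 500 ticks in 5 seconds on a 4-core host.
print(counter_percent(500, 5, 4))  # 25.0
```

With 100 ticks per second this reduces to delta / interval / cores, which is the division described above.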
In addition to what has already been written by chutz, the reference to the duration of a tick can be found in man 2 times, which says the number of clock ticks per second is given by sysconf(_SC_CLK_TCK). That is a system function to be called from C, but the value can also be obtained by simply running getconf CLK_TCK in your shell. This number is a compile-time constant and could be changed by anyone touching the source files, but this would be a rather rare event: common Linux distros all come with the value 100. For example:
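As a sketch, the same value that getconf CLK_TCK prints can also be read from Python on a Unix-like system, using the standard library's os.sysconf. Note that this reports the tick rate of the machine running the script, which is only meaningful here if that is the host being monitored.

```python
import os

# Ask the C library for the number of clock ticks per second.
ticks_per_second = os.sysconf("SC_CLK_TCK")
print(ticks_per_second)
```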