I have a server with dual Intel Xeon E5-2667 CPUs (6 cores + HT each) running OEL (RHEL) 6.4.
For some reason, an SNMP query to it shows the cores of only one CPU.
Here's the output of the sensors command:
[root@host log]# sensors
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0: +56.0°C (high = +96.0°C, crit = +102.0°C)
Core 0: +55.0°C (high = +96.0°C, crit = +102.0°C)
Core 1: +50.0°C (high = +96.0°C, crit = +102.0°C)
Core 2: +52.0°C (high = +96.0°C, crit = +102.0°C)
Core 3: +55.0°C (high = +96.0°C, crit = +102.0°C)
Core 4: +52.0°C (high = +96.0°C, crit = +102.0°C)
Core 5: +56.0°C (high = +96.0°C, crit = +102.0°C)
coretemp-isa-0001
Adapter: ISA adapter
Physical id 1: +43.0°C (high = +96.0°C, crit = +102.0°C)
Core 0: +43.0°C (high = +96.0°C, crit = +102.0°C)
Core 1: +41.0°C (high = +96.0°C, crit = +102.0°C)
Core 2: +42.0°C (high = +96.0°C, crit = +102.0°C)
Core 3: +41.0°C (high = +96.0°C, crit = +102.0°C)
Core 4: +40.0°C (high = +96.0°C, crit = +102.0°C)
Core 5: +41.0°C (high = +96.0°C, crit = +102.0°C)
My /etc/snmp/snmpd.conf has the following line to allow full access:
view all included .1 80
Yet here's what happens when I snmpwalk this server:
[root@host log]# snmpwalk -c public -v 2c localhost sensor
LM-SENSORS-MIB::lmTempSensorsIndex.1 = INTEGER: 1
LM-SENSORS-MIB::lmTempSensorsIndex.2 = INTEGER: 2
LM-SENSORS-MIB::lmTempSensorsIndex.3 = INTEGER: 3
LM-SENSORS-MIB::lmTempSensorsIndex.4 = INTEGER: 4
LM-SENSORS-MIB::lmTempSensorsIndex.5 = INTEGER: 5
LM-SENSORS-MIB::lmTempSensorsIndex.6 = INTEGER: 6
LM-SENSORS-MIB::lmTempSensorsIndex.7 = INTEGER: 7
LM-SENSORS-MIB::lmTempSensorsIndex.8 = INTEGER: 8
LM-SENSORS-MIB::lmTempSensorsDevice.1 = STRING: Physical id 0
LM-SENSORS-MIB::lmTempSensorsDevice.2 = STRING: Core 0
LM-SENSORS-MIB::lmTempSensorsDevice.3 = STRING: Core 1
LM-SENSORS-MIB::lmTempSensorsDevice.4 = STRING: Core 2
LM-SENSORS-MIB::lmTempSensorsDevice.5 = STRING: Core 3
LM-SENSORS-MIB::lmTempSensorsDevice.6 = STRING: Core 4
LM-SENSORS-MIB::lmTempSensorsDevice.7 = STRING: Core 5
LM-SENSORS-MIB::lmTempSensorsDevice.8 = STRING: Physical id 1
LM-SENSORS-MIB::lmTempSensorsValue.1 = Gauge32: 60000
LM-SENSORS-MIB::lmTempSensorsValue.2 = Gauge32: 44000
LM-SENSORS-MIB::lmTempSensorsValue.3 = Gauge32: 42000
LM-SENSORS-MIB::lmTempSensorsValue.4 = Gauge32: 42000
LM-SENSORS-MIB::lmTempSensorsValue.5 = Gauge32: 42000
LM-SENSORS-MIB::lmTempSensorsValue.6 = Gauge32: 41000
LM-SENSORS-MIB::lmTempSensorsValue.7 = Gauge32: 41000
LM-SENSORS-MIB::lmTempSensorsValue.8 = Gauge32: 44000
How can I make SNMP report temperatures for the cores on all CPUs?
Something seems to be off, because you have this line:
LM-SENSORS-MIB::lmTempSensorsDevice.8 = STRING: Physical id 1
but nothing afterwards, as if there were only 8 slots for sensors. There is a bug report with dual Intel Xeon E5-2670 (8 cores), where the last Device line is this:
So there are 10 slots there, again only one processor.
There are some Ubuntu instructions that successfully show 20 slots (with no "Physical id" lines there), although with a completely different processor and using the miscSensors category. They say that "according to the lm-sensors installation page, Net-SNMP 5.5 or higher is required", which is the version on RedHat 6.4.
In any case, you may try upgrading Net-SNMP to see if that solves the issue. But maybe it really is a problem with the MIB and that particular family of processors, in which case that bug needs to be fixed.
This was causing me grief as well. I found the answer here: https://bbs.archlinux.org/viewtopic.php?id=127017. It has something to do with the sensor names in the sensors output being duplicated for the second CPU. Once you add the aliases as described in the article, the second CPU's sensors show up in an snmpwalk.
Per https://sourceforge.net/p/net-snmp/mailman/net-snmp-coders/thread/54DA2206.9080007%40redhat.com/#msg33390512, duplicate names for sensors were not supported before https://github.com/net-snmp/net-snmp/commit/e886f5eb9701851ad6948583156bfd59fcb6110f. That means that if two sensors are named temp1 (one for the GPU, the other for the CPU), only the last one listed by the sensors command will show up in SNMP.
This patch is in net-snmp starting from 5.8. Your net-snmp may be 5.7 or older, as mine was (Debian buster). The workaround in this case, per @Kevin's answer, is to set up labels for the sensor names. That is, in /etc/sensors.conf on the agent, define:
net-snmp takes them into account after an snmpd restart on the agent.
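For the dual-Xeon box in the question, such label definitions might look roughly like the sketch below. It assumes the chip names from the sensors output above (coretemp-isa-0000 and coretemp-isa-0001) and that temp1 is the "Physical id" sensor with temp2 through temp7 being Core 0 through Core 5; check the real feature names with `sensors -u` before copying. The label strings themselves are made up for illustration; the only requirement is that no two sensors end up with the same name.

```
# Sketch only: feature names (temp1..temp7) are an assumption, verify with `sensors -u`.
# Label text is hypothetical; it just has to be unique across both chips.
chip "coretemp-isa-0000"
    label temp1 "CPU0 Package"
    label temp2 "CPU0 Core 0"
    label temp3 "CPU0 Core 1"
    label temp4 "CPU0 Core 2"
    label temp5 "CPU0 Core 3"
    label temp6 "CPU0 Core 4"
    label temp7 "CPU0 Core 5"

chip "coretemp-isa-0001"
    label temp1 "CPU1 Package"
    label temp2 "CPU1 Core 0"
    label temp3 "CPU1 Core 1"
    label temp4 "CPU1 Core 2"
    label temp5 "CPU1 Core 3"
    label temp6 "CPU1 Core 4"
    label temp7 "CPU1 Core 5"
```

If the labels are picked up, the plain sensors command should print the new names, and repeating the snmpwalk from the question after the snmpd restart should then list lmTempSensorsDevice entries for both CPUs.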