Wanted to post this here in case anyone else has this issue. We have a Dell PowerEdge R710 running VMWare ESXi 4.1, we got an SNMP trap message via Nagios monitoring:
SERVICE CHECK: theServerName;SnmpTrap;2;chassis marginal Type: chassis Status: marginal Name: 'VMware Rollup Health State' value: 0 233:2:10:09.44
We found a post on VMware Communities about it:
VMware Communities post about SNMP message: http://communities.vmware.com/thread/345217
vmwESXEnvHardwareEvent [1] vmwSubsystemType.2 (Integer): chassis [2] vmwHardwareStatus.2 (Integer): marginal [3] vmwEventDescription.2 (DisplayString): Type: chassis Status: marginal Name: 'VMwareRollup Health State' value: 0 [4] vmwEnvHardwareTime.2 (Time Ticks): 350 days 09:16:01 [5] snmpTrapEnterprise.0 (Object ID): vmwESX
Then the error did go away after a bit (another SNMP trap received):
SERVICE ALERT: theServerNameSnmpTrap;CRITICAL;SOFT;1;chassis normal Type: chassis Status: normal Name: 'VMware Rollup Health State' value: 0 233:9:46:00.04
What does "chassis marginal" mean in this case?
In our case it was the battery in the Perc h700. It was equivalent to "Cache Battery 0 in controller 0 is Charging (Ready) [probably harmless]." message going from an SNMP get.
The cache battery charges from time to time, and as long as that charging completes within a few hours everything is OK.