On a Dell C6220 server with 4 nodes, all 4 nodes occasionally power off at the same time. It seems to happen when there is a slight power fluctuation, even though each of the 2 PSUs is connected to its own UPS (2 different UPSes in total).
There are 5 other servers on the same UPSes (other models, not C6220), none of which turn off at that moment.
The message in the system log is:
2018/01/31 12:21:10 System ACPI Power State Sys Pwr Monitor S5/G2: soft-off
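To line these shutdowns up against the UPS event logs, the timestamps could be pulled out of an exported copy of the system log. This is only a minimal sketch: it assumes every entry follows the same layout as the single line quoted above, and the function name `find_soft_off_events` is made up for illustration.

```python
import re
from datetime import datetime

# Pattern assumes the layout of the one entry quoted above:
# "YYYY/MM/DD HH:MM:SS <sensor description> S5/G2: soft-off"
SOFT_OFF = re.compile(
    r"^(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})\s+(.*?)\s+S5/G2: soft-off\s*$"
)

def find_soft_off_events(lines):
    """Return (timestamp, sensor) tuples for every S5/G2 soft-off entry."""
    events = []
    for line in lines:
        match = SOFT_OFF.match(line.strip())
        if match:
            stamp = datetime.strptime(match.group(1), "%Y/%m/%d %H:%M:%S")
            events.append((stamp, match.group(2)))
    return events

# The only real data here is the log line from this post.
sample = [
    "2018/01/31 12:21:10 System ACPI Power State Sys Pwr Monitor S5/G2: soft-off",
]
events = find_soft_off_events(sample)
for stamp, sensor in events:
    print(stamp.isoformat(), sensor)
```

The resulting timestamps can then be compared with the event history of each UPS to see whether both reported a disturbance at the same moment.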
It is impossible to turn on any of the nodes with the power button or via software. The only way to turn them back on is to remove the power cables from BOTH power supply units and plug them back in.
This is the same behaviour as described in the post here on the Dell forum; however, there is no answer or solution on that post.
Is there any way to avoid it? What is the reason for this behaviour? None of my other servers turned off. Admittedly, there was a slight power problem (possibly a 0.5 second outage), but with 2 separate UPSes I would expect at least one of them not to pass the power drop on to the server, even if the other one were faulty.
There are a number of power config options on the C6220. Here is how they are set:
Power Management <NodeManager>
Chassis Power Management > Chassis PSU Configuration
    Required Power Supplies: 1
    Redundant Power Supplies: 1
Power Capping
    Chassis Level Capping: Enabled
Emergency Throttling
    Sled Level Policy: Chassis Level
    Chassis Level Policy: Throttling
These settings are the same on all 4 nodes.
BIOS version 2.5.3
BMC version 2.59