We use FreeNAS (which is built on FreeBSD) for our data storage systems. The server is powered through an APC Smart-UPS 750VA X.
On a couple of occasions, our monitoring systems have alerted me that the server is down. After a few minutes the server is back up and running with no problems.
When I run last I can see that the server has just booted, and checking /var/log/messages I can see it has run through all of the boot process; however, I can't see any panics or any reason for it shutting down. It literally goes from being fine to outputting boot information.
This has led me to wonder whether a power outage is causing this, but how can I determine for sure that this is the case? I guess getting a network management card for the APC UPS and hooking it up to the network would be one way. Is there any other way of finding out right now why this has happened?
I think there are a few obvious solutions to finding out more:
Your machine can't really tell what happened in a power outage: those electrons just stop showing up. The UPS might know (if you're losing power, as opposed to a flaky power supply or something) but I don't think you have much hope of the server being able to tell you.
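What the logs can still show is whether a boot was preceded by an orderly shutdown; a boot with no shutdown messages before it is at least consistent with power being cut. Below is a minimal sketch of that check. The marker strings are assumptions about what a FreeBSD /var/log/messages typically contains (the kernel copyright banner at boot, syslogd or shutdown messages on an orderly stop), and classify_boots is a hypothetical helper name, so compare the patterns against your own logs before trusting the result.

```python
import re

# Assumed marker strings; verify them against your own /var/log/messages.
BOOT_MARKER = re.compile(r"Copyright \(c\) 1992-\d{4} The FreeBSD Project")
SHUTDOWN_MARKER = re.compile(r"syslogd: exiting on signal|shutdown:|reboot:")


def classify_boots(lines):
    """Return one label per boot banner found in the log lines.

    'clean'   - a shutdown marker was logged since the previous boot
    'abrupt'  - no shutdown marker was logged before this boot
    'unknown' - the boot banner is the first line, so there is no context
    """
    results = []
    saw_shutdown = False
    seen_any_line = False
    for line in lines:
        if BOOT_MARKER.search(line):
            if not seen_any_line:
                results.append("unknown")
            else:
                results.append("clean" if saw_shutdown else "abrupt")
            # Start tracking afresh for the next boot cycle.
            saw_shutdown = False
        elif SHUTDOWN_MARKER.search(line):
            saw_shutdown = True
        seen_any_line = True
    return results
```

To try it on a real log, open /var/log/messages and pass the file object to classify_boots. An "abrupt" result does not prove a power outage (a hard hang or a failing power supply looks the same); it only rules out an orderly shutdown.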
FreeBSD has a great port named sysutils/apcupsd, intended to interact with APC Smart-UPSes. Connect your UPS to the host with a USB cable. Edit /usr/local/etc/apcupsd/apcupsd.conf:
That config results in the following behaviour:
when power is lost for less than ANNOYDELAY seconds, the UPS just goes on battery with no signal;
after ANNOYDELAY seconds, the UPS begins to beep;
when (the battery level drops below BATTERYLEVEL percent) OR (the estimated time on battery is less than MINUTES), apcupsd will wait for KILLDELAY seconds and begin the shutdown -h now process;
after that, the UPS will power off the load and go into hibernation;
when power is back, the UPS powers on the load and, if the host is configured to boot after power loss, it will start normally and the cycle is complete.
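The shutdown condition in the list above can be written down as a small predicate. This is only a sketch of the rule as described here, not apcupsd's actual code: the Thresholds class and should_shut_down function are hypothetical names, the threshold values in the usage note are made up for illustration, and the strict "less than" comparison follows the wording above (how apcupsd treats the exact boundary value is an assumption).

```python
from dataclasses import dataclass


@dataclass
class Thresholds:
    battery_level: float  # BATTERYLEVEL, in percent
    minutes: float        # MINUTES, estimated runtime left on battery


def should_shut_down(on_battery, charge_percent, runtime_minutes, limits):
    """Return True when either threshold is crossed while on battery."""
    if not on_battery:
        # On mains power the thresholds are irrelevant.
        return False
    return (charge_percent < limits.battery_level
            or runtime_minutes < limits.minutes)
```

For example, with hypothetical limits of Thresholds(battery_level=10, minutes=5), a UPS on battery at 50% charge with 3 minutes of runtime left would trigger a shutdown, because the runtime condition alone is enough.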
I don't feel like you've done the bare minimum of troubleshooting here. This has become a bad question because of the scarce details presented.
Obvious thing to do...
(easy)
(also easy)
Save your logs to disk. You can change the log path to a location on disk with this util, or change the path manually. After the next reboot you can then look for the reason.