My goal is to automate a backup routine on a small OpenSolaris-based NAS (running OmniOS + napp-it on an HP MicroServer N54L) using SATA disks.
Background:
I have installed one of those 5.25" -> 3.5" carrier-less HDD trays that contain a simple SATA or SAS/SATA backplane with one port, a power button and some LEDs (power and HDD activity). To back up multiple HDDs (one each week in rotation, stored offsite), I have written a script that uses zfs send/recv to dump the complete main pool including all snapshots (updating only new blocks). This script works fine when I start it manually.
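For context, here is a minimal sketch of the kind of incremental send/recv logic such a script performs; the pool names tank (main pool) and backup (pool on the removable disk) and the snapshot naming scheme are assumptions for illustration, not the actual script:

    #!/bin/sh
    # Sketch only: incremental replication from pool "tank" to pool "backup".
    # Pool names and the snapshot naming scheme are assumptions.
    NEW=tank@backup-$(date +%Y%m%d)
    # Find the most recent previous backup snapshot of the top-level dataset, if any.
    LAST=$(zfs list -H -t snapshot -o name -s creation -d 1 tank | grep '@backup-' | tail -1)
    zfs snapshot -r "$NEW"
    if [ -n "$LAST" ]; then
        # Incremental: send only the blocks changed since the last backup snapshot.
        zfs send -R -i "$LAST" "$NEW" | zfs receive -Fdu backup
    else
        # First run: send the full pool including all snapshots.
        zfs send -R "$NEW" | zfs receive -Fdu backup
    fi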
I'd like to automate that process further, because the NAS has no VGA or serial console attached, and it is tedious to insert the disk, go back to the desktop system, log onto the web interface or SSH in, and start the script manually. A timed start via cron job is not an option, because the backup days may vary slightly (forgot the disk, holidays, etc.). So the backup should start right after the disk is inserted.
Problem:
In the script I use cfgadm to connect + configure and later unconfigure + disconnect the disks. If I only insert the disk and it spins up, I have no way of knowing that the disk is there. Possible solutions I've considered already:
- Probing for a new disk and zpool every x minutes continuously by using cfgadm -f -c connect and checking for error results (see the sketch after this list). Not very elegant.
- Checking /var/adm/messages every x minutes and grepping for the device path or AHCI. Not possible, because messages are only written if the device is connected manually.
- Using iostat -En. Displays the disks, but I have to grep for the exact serial numbers, because it does not list port information. Also needs to be done every x minutes.
- Using cfgadm with SELECT syntax to filter for receptacle status. Does not work, because the insertion does not trigger anything (maybe the backplane is too cheap for that).
- Recognizing the power on/off of the enclosure. Would be okay, but I couldn't figure out how to accomplish this.
- Remapping the power button or adding another button to the machine. Could work, but I also don't know how to do this.
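For completeness, the polling variant from the first item would look roughly like the sketch below; the SATA attachment point sata1/5 and the script path are assumptions, not the actual setup:

    #!/bin/sh
    # Sketch of the rejected polling approach. The SATA attachment point (sata1/5)
    # and the script path are assumptions; adjust to the actual backplane port.
    while true; do
        if cfgadm -f -c connect sata1/5 >/dev/null 2>&1; then
            /path/to/backup-script.sh
            break
        fi
        sleep 300   # probe again in 5 minutes
    done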
I think I would need two things:
- a reliable way to identify disk and port status in combination (so only the correct disk in the correct slot is detected)
- a way to register this detection and trigger an event (start a shell script)
Is this possible? If not, what would you suggest as alternatives?
Final solution (updated 2015-01-26):
For anyone with similar problems in the future:
- Enable AHCI hotswap in OmniOS as detailed in the accepted answer by gea.
- Use syseventadm as detailed in my own answer to trigger the backup script when the disk comes online.
- Make sure your cables, controller and disks are fault-free and play well together (I had problems with WD SE 4TB disks and the onboard AHCI SATA controller, which resulted in random "WARNING: ahci0: ahci_port_reset port 5 the device hardware has been initialized and the power-up diagnostics failed" messages in the system logs).
Onboard SATA/AHCI is hotplug capable, but this is disabled in OmniOS by default. To enable it, add the following line to /etc/system:
set sata:sata_auto_online=1
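Since /etc/system is only read at boot, the setting takes effect after a reboot; for example:

    echo "set sata:sata_auto_online=1" >> /etc/system   # /etc/system is only read at boot
    init 6                                              # reboot so the setting takes effect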
Interesting question... a bit of a science experiment, as I'd probably just use USB or send remotely or have this on a schedule...
But in your case, I wouldn't try to "look" for the disk at all via cfgadm or log parsing. That's not scalable. I'd simply give the removable disk a unique ZFS pool name and build script logic around a periodic zpool import. In ZFS on Linux, the pool import process is a system service/daemon, but there's no cost to running it periodically; it will detect the drive and the associated pool. I hope you're exporting the pool when you're done with the backup as well. That would cover situations where the drive remains in the server for multiple backup cycles, like leaving a backup tape in its drive.
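A cron-driven sketch of that approach; the pool name backup and the backup script path are assumptions:

    #!/bin/sh
    # Sketch: run from cron every few minutes. The pool name "backup" and the
    # backup script path are assumptions.
    POOL=backup
    # If the pool is already imported, a backup is running or the disk was left in; do nothing.
    zpool list "$POOL" >/dev/null 2>&1 && exit 0
    # "zpool import <pool>" succeeds only if the removable disk is actually present.
    if zpool import "$POOL" >/dev/null 2>&1; then
        /path/to/backup-script.sh "$POOL"
        zpool export "$POOL"
    fi

An entry like the following in the root crontab would run the check every five minutes; the explicit minute list avoids relying on */5 support in the system cron:

    0,5,10,15,20,25,30,35,40,45,50,55 * * * * /path/to/check-backup-disk.sh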
I'll add this answer to document what I found out about monitoring events (may also be useful in other cases):
While trying to ask the question on Unix/Linux SE, I noticed a useful thread about using udev on Linux to monitor for kernel events. As an equivalent tool for Solaris, I stumbled upon the suggestion to use syseventadm, which watches for sysevents and triggers defined actions/scripts. At first I did not find much except copies of the man page and some discussions about a problem with the Xen hypervisor, but the supported events are listed in /usr/include/sys/sysevent/eventdefs.h (or online at /usr/src/uts/common/sys/sysevent/eventdefs.h in various repos) and other files in that directory. Using the first example from the man page:
syseventadm add -c EC_zfs -s ESC_ZFS_scrub_start /path/to/script.sh \$pool_name
I successfully tested a sample event that fires every time a scrub is initiated and returns the pool name as the first argument. After some trial and error, I found the correct way to monitor for newly added disks:
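The exact command is not reproduced here; based on the EC_dev_add class and the disk subclass from eventdefs.h, and the device attributes declared in /usr/include/sys/sysevent/dev.h, it has roughly this form (the attribute macro names are an assumption, so check dev.h on your system):

    # Register a handler for "device added" events of subclass "disk".
    # The attribute macros ($phys_path etc.) are assumptions taken from sysevent/dev.h.
    syseventadm add -c EC_dev_add -s disk /path/to/script.sh \$phys_path \$dev_name \$driver_name \$instance \$version
    # Reload the syseventd configuration so the new handler becomes active.
    syseventadm restart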
Everything after disk is optional and is passed directly to the script as arguments $1 to $5. Now, as soon as the newly added disk comes online, the script is triggered; it can check whether the device ID is correct (optional) and then import the pool by name.
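A minimal sketch of such a handler script; the pool name backup, the expected path fragment and the log location are assumptions:

    #!/bin/sh
    # Sysevent handler sketch: $1..$5 hold the attribute values passed by syseventadm.
    # The pool name "backup", the expected path fragment and the log file are assumptions.
    PHYS_PATH="$1"
    EXPECTED="disk@5"                  # fragment identifying the hot-swap bay (assumption)
    LOG=/var/log/backup-hotplug.log
    echo "$(date): device added: $*" >> "$LOG"
    case "$PHYS_PATH" in
        *"$EXPECTED"*)
            # Correct bay: import the backup pool and start the backup run.
            zpool import backup >> "$LOG" 2>&1
            /path/to/backup-script.sh backup >> "$LOG" 2>&1
            ;;
    esac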