We're monitoring several (~40 so far) servers using Nagios 3, and after some massive headaches trying to check event logs and text logs and so on with active checks, I've got NSCA installed on our Nagios server. The next step is obviously to have backup software report successful runs using send_nsca
, and I've got this working on Windows too (from Nagios Exchange) — BackupExec handily supports running commands only after a backup has been verified, and we're after something similar for NTBackup and Windows Server Backup.
I'm very happy to use a batch file to do this, as NTBackup doesn't seem to have this built-in, but I've found conflicting information on whether NTBackup populates %errorcode%
properly (i.e. only if the backup ran without errors).
Does anyone have experience or ideas for getting NTBackup to report this information correctly, or is there some other solution we "should" be using?
Regards,
Carl
I've used NSClient++ on Windows servers to allow Nagios to run all sorts of useful checks. I highly recommend it, and it might work for what you need.
For example, in one instance I used NSClient++ to check and make sure that the directory backups were written to always had a file that was modified within the last 24 hours. That was a good, albeit simple, way to make sure that a backup ran.
It also has features that allow you to search for events in the Windows event log. Then Nagios can raise an error based on the results. That might be able to provide a more exact check.
I ran into this same issue. I hate that ntbackup has not notification options. I just install ruby on the box and threw together this script. If you set this to run after the backup, in a batch file or what not, you should always get the most recent log file. I have this dropped into an mbox on my nagios server and then parsed by additional scripts.
You could also write a simple script that just does a regex on the most recent log file to determine if the backup was successful. ^/NTBackup finished the backup with no errors./ if that fails to match you could consider it a failure.
In my case i wanted to keep as much data as possible so i just emailed the log and parsed it.
http://pastie.org/1510940
I don't know the answer to your question, however, I might suggest just trying it.
Is it possible for you to set up server to fail it's back up and then test the value of %errorcode% ?
script the backup to run normally. ignore the exit code of ntbackup. scrub logfile for interesting parts. send status to nagios via send_ncsa. profit.
the following from MS technet :
Windows 2000 Backup (Ntbackup.exe) does not have a command-line parameter to specify the location to which reports are saved after a backup operation is finished. The backup report is saved in the profiles folder of the user who performed the backup operation. You can view the reports by clicking Report on the Tools menu in Backup.
Backup keeps only the last 10 backup reports. The corresponding Backup##.log files are located in the "Documents and Settings\User_Name\Local Settings\Application Data\Microsoft\Windows NT\NTbackup\Data" folder.
It would be nice if a command can be run / created / backup data logs access where a backup status could be ascertained like:
Just with that limited amount of info, a system check can be scheduled to execute after the backup should have finished and a Windows 2008 Server Backup status could be determined and communicated by Nagios.
Anyone have an idea where to find the data or how to write the program to fetch it?