The backup performance of a BackupExec installation suddenly dropped by 50-70% for no apparent reason. There was no user intervention, no reconfiguration, and no updates, and all tapes were affected at once. The system runs on Windows 2003 SBS (32-bit) with no remote agents involved except the local one, so no networking is involved.
I cannot find any clues about the cause. The backup is now cancelled automatically after 6 hours, whereas a complete run used to take 4 hours, and by then it has only walked about 50% of the files and 20% of the data volume of a usual complete run. The tape capacity is also no longer used up (about 90% before, now only a fraction of it).
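As a rough check on these numbers, here is a minimal sketch that compares the current rate with the earlier one. It assumes the old run covered 100% of the files and data in 4 hours; the function name `relative_rate` is made up for illustration.

```python
def relative_rate(fraction_done, hours, baseline_hours=4.0):
    """Current rate divided by the rate of a full run taking baseline_hours."""
    baseline_rate = 1.0 / baseline_hours   # whole job per hour, before
    current_rate = fraction_done / hours   # share of the job per hour, now
    return current_rate / baseline_rate


# Figures from the description: 50% of files, 20% of data, after 6 hours.
file_rate = relative_rate(0.5, 6.0)
data_rate = relative_rate(0.2, 6.0)
print(f"file rate: {file_rate:.0%} of before")
print(f"data rate: {data_rate:.0%} of before")
```

Under these assumptions the file rate is about a third of what it was and the data rate is lower still, which suggests the slowdown is worse on large files than on small ones.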
I tried turning off single instance backups and also tried turning off the snapshot providers, to no avail.
There is no real error message, because the backup job times out before it can finish (so in effect the error is "backup job did not complete within time" or similar).
Update: The problem persists with or without AOFO. We also ran the cleaning tape. Four tapes have been in use for about 2 years, and one tape is fairly new. Both generations of tapes show the same issue, so it does not seem to be related to the tapes. However, we are going to try again with a brand-new one.
Any ideas on how to debug this?
You can debug Backup Exec using the SGMon utility, which is in the program directory. Be aware that its output is quite extensive.
You can also create smaller jobs and run them sequentially, or back up to a "folder" first and then run a "duplicate" backup job to tape. If it fails on the folder job, it's a network/source issue; if it fails on the tape job, it's a drive[r]/tape issue.
One of our servers started doing something like this; we had the drive itself replaced ASAP and the problem was solved.
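To check the source side without involving Backup Exec at all, you could time a plain read of the same files. The following is only a sketch: `measure_read_throughput` is a hypothetical helper, not part of any backup product, and the numbers will look better than reality if the files are already in the OS file cache.

```python
import os
import time


def measure_read_throughput(root, chunk_size=1024 * 1024):
    """Read every file under root; return (total_bytes, seconds, skipped)."""
    total_bytes = 0
    skipped = []  # files that could not be opened or read (e.g. locked)
    start = time.perf_counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as handle:
                    while True:
                        chunk = handle.read(chunk_size)
                        if not chunk:
                            break
                        total_bytes += len(chunk)
            except OSError:
                skipped.append(path)
    seconds = time.perf_counter() - start
    return total_bytes, seconds, skipped


# Example call (the path is a placeholder, adjust to your data volume):
# total, secs, skipped = measure_read_throughput(r"D:\Data")
# print(total / (1024 * 1024) / secs, "MB/s;", len(skipped), "files skipped")
```

If the plain read is already slow, or many files end up in `skipped`, that points at the source rather than the drive or tape.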
Check your job logs. This kind of thing is generally caused by BE having a screaming fit over a single file somewhere (possibly an Access database, PST or similar on a file share which a user has left a file lock on), and it should be immediately possible to identify precisely the point during the job at which things slow down.
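If you note the time and the byte count at a few points during the job (from the job log or the job monitor), a small script can point at the slowest stretch. This is a sketch only: the sample format is an assumption, not Backup Exec's actual log format, and the numbers below are invented for illustration.

```python
from datetime import datetime


def slowest_interval(samples):
    """samples: list of (datetime, cumulative_bytes), sorted by time.

    Returns (start, end, bytes_per_second) for the slowest interval,
    or None if there are fewer than two usable samples.
    """
    slowest = None
    for (t0, b0), (t1, b1) in zip(samples, samples[1:]):
        seconds = (t1 - t0).total_seconds()
        if seconds <= 0:
            continue  # ignore duplicate or out-of-order timestamps
        rate = (b1 - b0) / seconds
        if slowest is None or rate < slowest[2]:
            slowest = (t0, t1, rate)
    return slowest


# Invented sample data: time of reading, bytes backed up so far.
samples = [
    (datetime(2010, 1, 1, 22, 0), 0),
    (datetime(2010, 1, 1, 22, 30), 30_000_000_000),
    (datetime(2010, 1, 1, 23, 0), 31_000_000_000),
    (datetime(2010, 1, 1, 23, 30), 60_000_000_000),
]
start, end, rate = slowest_interval(samples)
print(f"slowest between {start:%H:%M} and {end:%H:%M}: {rate / 1e6:.2f} MB/s")
```

Whatever the job was processing during that interval is the first place to look for a locked or problematic file.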
"The capacity of the tape is also not used (90% before, now only a fraction of it)."
I've seen behaviour like this. The problem was that the tape set was X uncompressed and roughly 2X compressed. Once I got past X, the backup slowed WAY down (because of the compression overhead) and suddenly I had all this extra space.
There has to be at least one remote agent involved: the one on the server you're backing up, even if it's the backup server itself. Have you checked the tape drive for any errors or alerts? Does the tape drive need cleaning? Are you using AOFO?