This may be somewhat subjective, so if you feel the topic should be closed - go ahead.
We make heavy use of NetBackup and a large, multi-drive LTO-4 library. We regularly struggle to get everything done within the appropriate backup window. One of the things we've avoided is using the multiplex function to drive multiple backup jobs to the same tape. We've heard various reasons for this, such as that the benefit doesn't justify the risk, and so have not done it.
As we talk through various options towards solving our throughput issues, this one invariably crops up. I'm looking for opinions and approaches to this issue.
I don't know about the risk in terms of hard numbers, but the benefit of multiplexing N jobs to 1 tape drive is that you can start all N jobs at the same time (so you're not waiting on the first job to finish with the tape before the next one can start backing up).
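A back-of-the-envelope sketch of that benefit, under made-up assumptions: the drive speed is roughly LTO-4's native rate, but the per-client feed rate, job size, and client count here are purely illustrative, not measurements from any real environment.

```python
# Illustrative only: why multiplexing N slow clients to one drive can
# shrink the backup window. All numbers are assumptions.

DRIVE_SPEED = 120          # MB/s the drive can write (approx. LTO-4 native)
CLIENT_FEED = 30           # MB/s a single client can deliver (assumed)
JOB_SIZE = 500 * 1024      # MB per client (~500 GB, assumed)
N_CLIENTS = 4

# Serial: jobs run one at a time, drive throttled to the client's feed rate.
serial_hours = N_CLIENTS * JOB_SIZE / CLIENT_FEED / 3600

# Multiplexed: all N jobs stream together, capped at the drive's speed.
aggregate = min(DRIVE_SPEED, N_CLIENTS * CLIENT_FEED)
mpx_hours = N_CLIENTS * JOB_SIZE / aggregate / 3600

print(f"serial: {serial_hours:.1f} h, multiplexed: {mpx_hours:.1f} h")
```

With these invented numbers the window shrinks roughly fourfold, because four 30 MB/s clients together can finally keep a 120 MB/s drive streaming instead of shoe-shining.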
The big downside I see in doing this is that it winds up interleaving your backups. Where right now you may have a tape that has

AAAAAAABBBBBBBBCCCCCCCCCC

on it, multiplexing will give you an interleaved tape that has something like

ABCABCABCABCABCABCABCABCA

on it. When you go to restore "A" from that tape, your tape drive will have to skip over all the Bs and Cs in the way. This slows down the restore, and adds some wear and tear on the tape and drive as they fast-forward (in terms of risk, there's an increased chance of snapping a tape as a result).
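That restore-side cost can be sketched with a toy model: count how many blocks the drive must pass (read or skip) before it is finished with one client's data. The layouts and block counts below are invented to mirror the AAA…/ABC… picture above, not drawn from any real tape format.

```python
# Toy model: blocks the drive must traverse to restore one client's data.

def blocks_traversed(layout, client):
    """Blocks passed (read or skipped) up to and including
    the given client's last block on the tape."""
    last = max(i for i, block in enumerate(layout) if block == client)
    return last + 1

contiguous = ["A"] * 8 + ["B"] * 8 + ["C"] * 8   # AAAAAAAABBBBBBBBCCCCCCCC
interleaved = ["A", "B", "C"] * 8                # ABCABCABC...

print(blocks_traversed(contiguous, "A"))   # only A's own 8 blocks
print(blocks_traversed(interleaved, "A"))  # nearly the whole tape
```

With three-way interleaving the drive covers almost three times as many blocks to restore "A", which is exactly the extra seek time and tape wear described above.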
joeqwerty and ErikA both pointed out the solution I use and recommend if you have the disk space for it: Stage everything to disk first, then write it out to tape contiguously. This lets the backups on your machines "finish" (the data is backed up in the disk staging area) and lets the backup system put that data on tape in a logical, contiguous fashion at its relative leisure: You don't care if the tape keeps spinning for 6 hours or 16 hours as long as it's done by the time you start your next backup.
If you don't have the disk to stage everything, you can still minimize the break-up by staging as much data as you can. Ideally you would stage up to the size of a tape per backup client (so if server A has a whole tape's worth of data it might be contiguous on one tape, or at least only spread over two of them), but half-tape or quarter-tape staging areas can still help with performance and with minimizing fast-forward operations.

I don't know what the current best-practice recommendation for NetBackup is, but for BackupExec it's to perform your main backups to disk first and then to tape. A backup job is going to run much faster to disk than it is to tape. This may allow you to get your backups completed within your backup window.
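The full-tape/half-tape/quarter-tape idea above translates into simple disk-sizing arithmetic. The 800 GB figure is LTO-4's native (uncompressed) capacity; the per-client data sizes are invented examples, so treat this as a sizing sketch rather than a recommendation.

```python
# Rough disk-staging sizing sketch. LTO-4 native capacity is 800 GB;
# client backup sizes below are hypothetical.

LTO4_NATIVE_GB = 800

clients = {"serverA": 750, "serverB": 420, "serverC": 180}  # GB per backup

for fraction, label in [(1.0, "full-tape"), (0.5, "half-tape"),
                        (0.25, "quarter-tape")]:
    cap = LTO4_NATIVE_GB * fraction        # staging cap per client
    total = sum(min(size, cap) for size in clients.values())
    print(f"{label:12s} staging needs about {total:.0f} GB of disk")
```

Even the quarter-tape option keeps large runs of each client contiguous on tape while needing well under half the disk of full-tape staging, which is why partial staging is still worth doing.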
Do you use a disk staging area and then relocate those backup images to tape? If so, keeping multiple jobs on each tape should be no problem. This is how we do it, and have never had any issues to speak of.
However, if you're backing up directly to tape, it would seem ideal to not multiplex if possible.
One of the cool things about NetBackup is the granularity of tuning it allows you. Depending on the SLA of the data you're backing up you can tune up or down the multiplexing settings.
Take the policies that back up data with your shortest SLA and keep the multiplexing settings on those policies low (if you must multiplex them at all; a test will tell you how much restore performance you're actually losing as you turn the multiplexing settings up, and exactly how high you can set them while still meeting the restores you require). Conversely, take the backup jobs with a long RTO or a lenient SLA and turn their multiplexing settings up as high as you can without ultimately degrading performance.
Two additional points:

1. Remember buffer tuning. Properly configure (and test, probably the most important step) both device buffer tuning and communication buffer tuning if you haven't already done so.
2. Use synthetic backups. Synthetics can consistently buy you backup window (and ultimately resources).
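On point 1, one thing worth checking before you raise the buffer settings is the shared-memory footprint on the media server: NetBackup sizes its tape I/O buffers per drive as SIZE_DATA_BUFFERS × NUMBER_DATA_BUFFERS (the names of the standard tuning touch files). The values and drive count below are common illustrative choices, not recommendations; test against your own hardware.

```python
# Shared-memory footprint of NetBackup tape I/O buffers on a media server.
# SIZE_DATA_BUFFERS and NUMBER_DATA_BUFFERS mirror the touch-file names;
# the values and drive count are assumptions for illustration.

SIZE_DATA_BUFFERS = 262144    # bytes per buffer (256 KB, a common choice)
NUMBER_DATA_BUFFERS = 256     # buffers per drive (assumed)
DRIVES = 6                    # drives writing concurrently (assumed)

per_drive_mb = SIZE_DATA_BUFFERS * NUMBER_DATA_BUFFERS / 2**20
print(f"per drive: {per_drive_mb:.0f} MB, "
      f"all drives: {per_drive_mb * DRIVES:.0f} MB")
```

The point of the arithmetic is that generous buffer settings multiplied across many concurrent drives can add up to hundreds of MB of shared memory, so verify the media server can actually back the values you test.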