I am about to start a tape backup regimen and want to keep data flowing to the tape drive at a sufficient rate (target: 120+ MB/s sustained), but I cannot figure out how to do so without a dedicated source drive/array that sits idle when not writing tapes. The documentation for our specific drive does not mention a minimum required throughput.
Environment:
- Debian Linux, writing to tape with mt and tar; the data is RAR archives with recovery records, each ~1 GB to 300 GB in size
- LTO-4 tapes in a Quantum TC-42BN tape drive, attached via SAS over an external SFF cable
- Server is used for file backups only, no network services or fileserving.
- MD RAID arrays with data intermittently being read/written in spurts throughout the day/night.
If the source array sees significant reads/writes (from scheduled backups) during a tape write, throughput to the tape drops dramatically, even if only temporarily. So, some questions about source array/tape write throughput:
- I am assuming a sustained drop in source throughput to 10-20 MB/s or less during a tape write would be a problem?
- Do I need to have a source guaranteed to have no backups scheduled to it? Essentially 2 arrays minimum; one for backups and one for archives and tape writing?
- Is there a QoS mechanism for drives/arrays that could prioritize the tape writing over all else?
- LTO-4 tape drives throttle, so is there a common lower throughput limit to maintain for LTO-4, or does it vary widely per drive? Again, the documentation mentions the maximum designed speed and "variable speed transfers", but not how variable.
- Am I missing something in this source-throughput equation, or have unfounded worries?
Update:
I decided to tax things minimally with a single competing I/O stream: a 600 GB archive job reading from the array at about 30 MB/s sustained while tar was writing to the tape from a 4-drive RAID 6 of consumer SATA disks. Judging by the sound of the drive, the tape definitely slowed to a crawl but did NOT seem to run out of data or shoe-shine. This tells me NOT to expect the array to keep up during a full scheduled backup on our hardware configuration, but it can handle a less taxing I/O job while writing to tape.
Of note, LTO-4 tapes need 56 end-to-end passes, so the drive effectively writes in ~14 GB chunks before it stops for a few seconds to slow down and then go in the other direction. I think this helped keep the drive fed with data at lower throughput, as I have read-ahead and async writes set in stinit.def.
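As a rough sanity check on the ~14 GB figure, the per-pass arithmetic can be sketched as below. It assumes the LTO-4 native capacity of 800 GB (decimal), spread evenly over the 56 passes, and uses the 107 MB/s read rate I measured as the streaming speed. The function names are only illustrative.

```python
# Rough per-pass arithmetic for an LTO-4 tape.
# Assumptions: 800 GB native capacity (decimal), data spread evenly over
# 56 end-to-end passes, and a streaming rate of 107 MB/s.

LTO4_NATIVE_BYTES = 800e9
LTO4_PASSES = 56


def bytes_per_pass(capacity_bytes=LTO4_NATIVE_BYTES, passes=LTO4_PASSES):
    """Data written in one end-to-end pass before the tape reverses."""
    return capacity_bytes / passes


def seconds_per_pass(rate_bytes_per_s, capacity_bytes=LTO4_NATIVE_BYTES,
                     passes=LTO4_PASSES):
    """Time between direction changes at a given sustained rate."""
    return bytes_per_pass(capacity_bytes, passes) / rate_bytes_per_s


if __name__ == "__main__":
    print(f"per pass: {bytes_per_pass() / 1e9:.2f} GB")
    print(f"time per pass at 107 MB/s: {seconds_per_pass(107e6):.0f} s")
```

At lower sustained rates the time between turnarounds grows proportionally, so the pauses at each end of the tape become a smaller share of the total run.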
Another note: reading the tape with "dd if=/dev/st0 of=/dev/null" only yielded 107 MB/s. I assume this, and NOT 120 MB/s, is the real-world maximum effective throughput of this drive. The drive is currently on a dedicated SAS PCIe HBA with no other PCIe cards installed.
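For anyone wanting to repeat that kind of measurement without dd, here is a minimal sketch that reads a stream to the end and reports the average rate. The function name and the 256 KiB block size are my own choices, not anything from the drive documentation. On a real tape device the read size has to be at least as large as the block size the tape was written with, so treat the default as an assumption to adjust.

```python
import io
import time


def measure_read_throughput(stream, block_size=256 * 1024):
    """Read a binary stream until EOF; return (total_bytes, MB_per_s).

    A zero-length read is treated as end of data. MB here means 10**6 bytes.
    """
    total = 0
    start = time.perf_counter()
    while True:
        chunk = stream.read(block_size)
        if not chunk:
            break
        total += len(chunk)
    elapsed = time.perf_counter() - start
    rate = total / elapsed / 1e6 if elapsed > 0 else float("inf")
    return total, rate


if __name__ == "__main__":
    # Demo on an in-memory stream. For a tape, one would open the device
    # unbuffered instead, e.g. open("/dev/st0", "rb", buffering=0).
    total, rate = measure_read_throughput(io.BytesIO(b"\0" * 50_000_000))
    print(f"read {total} bytes at {rate:.0f} MB/s")
```

The same function can be pointed at a large file on the source array to compare what the array delivers under load against what the tape drive can absorb.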
In the meantime, I set up a 1 TB RAID 0 as a disk2tape buffer and had to add another disk to the server to make this feasible.
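To plan what gets staged on that buffer, a small check like the following can help. It is only a sketch: it assumes the RAR archives do not compress further (so the 800 GB native LTO-4 capacity is the limit), ignores tar and filesystem overhead, and uses my measured 107 MB/s as the tape rate. The batch sizes in the demo are hypothetical.

```python
# Staging-buffer check: does a batch of archives fit on the buffer and on
# one tape, and roughly how long would the tape write take?
# Assumptions: 1 TB buffer, 800 GB native LTO-4 capacity, 107 MB/s tape
# rate, no further compression, tar/filesystem overhead ignored.

BUFFER_BYTES = 1e12
TAPE_BYTES = 800e9
TAPE_RATE_BYTES_PER_S = 107e6


def plan_batch(archive_sizes_bytes, buffer_bytes=BUFFER_BYTES,
               tape_bytes=TAPE_BYTES, rate=TAPE_RATE_BYTES_PER_S):
    """Summarize whether a batch fits and how long writing it would take."""
    total = sum(archive_sizes_bytes)
    return {
        "total_gb": total / 1e9,
        "fits_buffer": total <= buffer_bytes,
        "fits_tape": total <= tape_bytes,
        "write_hours": total / rate / 3600,
    }


if __name__ == "__main__":
    # Hypothetical batch of three archives: 300 GB, 250 GB and 200 GB.
    print(plan_batch([300e9, 250e9, 200e9]))
```

With these numbers, one tape's worth of archives fits on the buffer with room to spare, and the buffer stays tied up for roughly two hours per tape.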
I would still love to find a way to do some sort of QoS for the tape drive and make writing to tape the top priority, so we can simplify our arrays and reduce parasitic hardware costs. In the meantime, though, I'm not seeing a way to get around having a dedicated disk2tape buffer if I want to ensure continuous writes no matter what scheduled jobs hit the array.