How can this be?
Running a cp or rsync (with -W --inplace) takes two hours for 93 GB; a tape restore over the dedicated backup network takes 41 minutes. Tape restore runs at 50 MB/s; disk-to-disk was measured and calculated to be 16 MB/s tops, and as low as 2 MB/s if the CPU is busy.
The restore software is Veritas NetBackup; the disks are on an EMC Symmetrix array over fiber. The box is an HP rx6600 (Itanium) with 16 GB of RAM running HP-UX 11i v2. All the disks are on one fiber card, listed as:
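(For scale: 93 GB in two hours works out to roughly 13 MB/s, and 93 GB in 41 minutes to roughly 39 MB/s.)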
HP AD194-60001 PCI/PCI-X Fibre Channel 2-port 4Gb FC/2-port 1000B-T Combo Adapter (FC Port 1)
The disks are also all using Veritas Volume Manager (instead of HP LVM).
Update: It occurs to me that this is not just a straight disk-to-disk copy; in reality, it is a snapshot-to-disk copy. Could reading the snapshot be slowing things down that much? The snapshot is an HP VxFS snapshot (not a vxsnap); perhaps the interaction between the snapshot and VxVM is causing the speed degradation?
Update: Using fstyp -v, it appears that the block size (f_bsize) is 8192; the default UNIX block size is 512 (or 8192/16). When testing with dd, I used a block size of 1024k (or 1048576, or 8192*128).
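For anyone who wants to reproduce the check, it was along these lines (the device path and file name here are placeholders, not the real ones):
$ fstyp -v /dev/vx/dsk/datadg/datavol | grep f_bsize    # report the file system block size
$ timex dd if=/data/bigfile of=/dev/null bs=1024k       # timed read with a 1 MB block size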
I really wonder if it is the block size. I read over at PerlMonks that the Perl module File::Copy is faster than cp; that is intriguing, and I wonder if it would help here.
If NetBackup is using tar, then it is not using cp: that might explain the speed increase as well.
Update: It appears that reading from the snapshot is almost twice as slow as reading from the actual device. Running cp is slow, as is tar writing to standard output. Using tar is slightly better (when writing to a file) but is limited to 8 GB files (the file in question is 96 GB or so). Using Perl's File::Copy with a non-snapshot volume seems to be one of the fastest ways to go.
I'm going to try that and will report here what I get.
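If anyone wants to try the same thing, a one-liner along these lines should do it (the paths are placeholders):
$ timex perl -MFile::Copy -e 'copy("/data/bigfile", "/newdata/bigfile") or die "copy failed: $!"'
The speed difference is presumably down to File::Copy using a larger copy buffer than cp's default, but I haven't verified that.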
Another question is whether you're IO bound inside the FC network. Ask the SAN guys to demonstrate the actual spare bandwidth available (graphs are good), and, if the FC switches are the Cisco ones, how they're ensuring they avoid the bandwidth issues inside the switch.
Are you limited by reading from, and writing to, the same disk in the array?
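One way to check: compare a pure read against a read-plus-write on the same array (paths below are placeholders):
$ timex dd if=/source/bigfile of=/dev/null bs=1024k
$ timex dd if=/source/bigfile of=/target/bigfile.tmp bs=1024k
If the second run is much slower than the first, the write side (or sharing the same spindles) is the bottleneck.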
If your tape is also on the SAN, then it's possible that the transfer is being handed off and going straight from tape to disk, while a plain copy has to be passed through the host doing the copying, and is therefore slower.
To ensure your test is like for like, try doing the disk copy via tar (NetBackup uses tar to read from tape):
$ tar cf - oldstuff | (cd newdir; tar xf -)
If all of your disks are on the same fibre card, you could theoretically be IO bound on that one card, but I doubt it.
The VxFS snapshot could be adding overhead, especially if the original source file system is busy with writes at the time. VxFS snapshots are copy-on-write, so if the original disk is receiving writes, the snapshot storage will be busy receiving copies of the original blocks before they are overwritten.
If the original file system is idle, you can rule out the VxFS being a factor.
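A quick way to check is to watch the per-device activity while the copy is not running, so you can see what else is touching those disks, for example:
$ sar -d 5 12    # per-device activity, twelve 5-second samples
If the devices behind the original file system are quiet, copy-on-write overhead is unlikely to be a factor.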
If the disks are connected to different buses on your motherboard, the data may be copied across three or more internal buses, and the latency is killing your IO for the disk-to-disk copy. In this case it is entirely possible that the networked tape drive has an inherently higher-bandwidth path to the target disk than the source disk does.