I have two Dell R515 servers running CentOS 6.5, with one of the Broadcom NICs in each directly attached to the other. I use the direct link to push backups from the main server in the pair to the secondary every night, using rsync over ssh. Monitoring the traffic, I see throughput of ~2 MB/s, which is far below what I'd expect from a gigabit port (roughly 125 MB/s theoretical). I've set the MTU to 9000 on both sides, but that didn't seem to change anything.
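For reference, the MTU was set along these lines on both hosts (eth1 here is an assumed interface name, not necessarily the real one):

ip link set dev eth1 mtu 9000
ip link show eth1    # verify the new MTU is in effect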
Is there a recommended set of settings and optimizations that would get me to the maximum available throughput? Also, since I am using rsync over ssh (or potentially just NFS) to copy millions of files (~6 TB of small files; a huge Zimbra mailstore), the optimizations I'm looking for may need to be specific to my use case.
I am using ext4 on both sides, if that matters.
Thanks
EDIT: I've used the following rsync options, with pretty much the same results:
rsync -rtvu --delete source_folder/ destination_folder/
rsync -avHK --delete --backup --backup-dir=$BACKUPDIR source_folder/ destination_folder/
Currently, I'm looking at the same level of bad performance when using cp to an NFS export, over the same direct cable link.
EDIT2: After finishing the sync, I was able to run iperf and found performance was around 990 Mbits/sec; the slowness was due to the actual dataset in use.
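For reference, the iperf check was along these lines (192.168.100.2 stands in for the secondary's direct-link address; adjust to yours):

iperf -s                        # on the secondary (receiving) host
iperf -c 192.168.100.2 -t 30    # on the main host, 30-second test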
The file count and SSH encryption overhead are likely the biggest barriers. You're not going to see wire-speed on a transfer like this.

Options to improve include (a combined sketch follows the list):

- Using a less costly SSH encryption algorithm (e.g. -e "ssh -c arcfour")
- Block-level alternatives: dd, ZFS snapshot send/receive, etc.
- One-shot transfer tools: tar, netcat (nc), mbuffer, or some combination.
- Your tuned-adm settings.
- Your rsync command. Would -W, the whole-files option, make sense here? Is compression enabled?
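As one hedged example pulling a few of these together (arcfour is cryptographically weak, but that may be acceptable on a dedicated point-to-point cable; folder names are taken from the question):

# -W sends whole files (skips the delta algorithm); -z compression deliberately omitted on a fast link
rsync -rtuW --delete -e "ssh -c arcfour" source_folder/ destination_folder/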
As you probably know, copying a lot of little files (e.g. mailboxes using the Maildir format or similar) is definitely not the best way to take advantage of high-bandwidth interfaces, and SSH is probably not the best transport protocol for it either. I would try using tar to create a tarball on the source host before sending it to the secondary host.
If you need incremental backups, you may want to try the -g (--listed-incremental) option of tar. If you still need to maximize throughput, try using netcat instead of ssh, as sketched below.
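A minimal sketch of that tar-over-netcat idea, assuming 192.168.100.2 is the secondary's direct-link address, port 7000 is free, and the /path/to/... directories are placeholders (some nc builds want nc -l -p 7000 rather than nc -l 7000):

# on the secondary (destination) host, listen and unpack:
nc -l 7000 | tar -xf - -C /path/to/destination
# on the main (source) host, pack and stream; add -g /path/to/backup.snar for incrementals:
tar -cf - -C /path/to/source . | nc 192.168.100.2 7000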
Try teasing apart the contributing factors and testing them independently.
I've had some bad experiences with Broadcom drivers, so my first suggestion is to test the usable network bandwidth with:
dd if=/dev/zero bs=1M count=10k | rsh backup_host 'cat > /dev/null'
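If that shows full wire speed, a hedged follow-up is the same pipe over ssh, to see how much the encryption alone costs (assumes key-based login to backup_host):

dd if=/dev/zero bs=1M count=10k | ssh backup_host 'cat > /dev/null'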