I'm trying to optimize the daily backup of an LVM snapshot of a large MySQL database. It works reasonably well when I just cp the files (local RAID to another local RAID), with an average speed of ~100 MB/s. But since the database files (600 GB, most of it in two files of 350 GB and 250 GB) do not change very much over the course of one day, I thought it would be more efficient to copy only the changed blocks.
I'm using
rsync --safe-links --inplace -crptogx -B 8388608 /source/ /destination/
It worked, but it was slower than the simple copy, and I did not see any read activity on the target disk. My thought was that rsync would read (8 MB) blocks from the source and the destination, compare their checksums, and copy the source block into the target file only if it had changed. Am I mistaken here? Why am I not seeing rsync read from the target in order to determine whether the blocks have changed?
Here are some graphs:
Disk usage: you see that rsync --inplace (only done for the bigger file on the last day) reduced the "dent" in the disk usage of /mnt/backup, meaning that it did indeed update the existing file in place.
IO stats: the backup is made from sda to sdb. Somehow there is a huge peak in reads from the source, followed by the "normal" read(source)+write(target) activity. I was expecting simultaneous reads from both devices with little write activity on the target.
What you are probably seeing is due to the way your files are changed and the way rsync calculates checksums. The rsync man page regarding --inplace has a basic explanation:
So you should probably either not use --inplace or use --backup to preserve the old copy of the file. That said, rsync seems to handle large files rather inefficiently, so it may not be the best tool for the job.
If you are using LVM and really want to transfer snapshot data, you might not want to run rsync, which is quite computation- and I/O-intensive on both sides. Instead, you could copy the snapshot's CoW data over to the destination machine using lvmsync. This would spare you the I/O and the CPU cycles at the price of a presumably larger transfer size.
Another approach to the problem would be to compute "dumb" block device checksums (e.g. with MD5) and transfer only the differing blocks, as in this answer here on ServerFault or in the blocksync.py script (I've linked the most recently active fork of it). It would not depend on snapshots at all, but obviously you would want to create one for the duration of the copy to ensure that your data stays consistent.
If you are concerned about your database's write performance with active snapshots, you could also take a look at ddsnap, which contains several optimizations for snapshotting and volume replication, effectively working around your concerns.
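To make the "dumb" block checksum idea concrete, here is a minimal local sketch in Python. It is not the blocksync.py code; the function name sync_changed_blocks and its parameters are made up for illustration, and it assumes the destination file already exists. It reads both files block by block, compares MD5 digests, and rewrites only the blocks that differ.

```python
import hashlib


def sync_changed_blocks(src_path, dst_path, block_size=8 * 1024 * 1024):
    """Rewrite only the blocks of dst_path whose MD5 differs from src_path.

    Hypothetical helper for illustration. dst_path must already exist.
    Returns the number of blocks that were rewritten.
    """
    changed = 0
    with open(src_path, "rb") as src, open(dst_path, "r+b") as dst:
        offset = 0
        while True:
            src_block = src.read(block_size)
            if not src_block:
                break
            # Read the block at the same offset from the destination.
            dst.seek(offset)
            dst_block = dst.read(len(src_block))
            # Locally, comparing the raw bytes would do; digests are used
            # here because that is what you would exchange between hosts.
            if hashlib.md5(src_block).digest() != hashlib.md5(dst_block).digest():
                dst.seek(offset)
                dst.write(src_block)
                changed += 1
            offset += len(src_block)
        # Cut off any leftover tail if the destination was longer.
        dst.truncate(offset)
    return changed
```

Every block is still read on both sides, so this saves writes (and, over a network, transfer volume), not reads.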
I believe you want --inplace --no-whole-file. Notice that for local filesystems, --whole-file is assumed (see the rsync man page). See a nice little test on unix.SE. Note the comments.
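As a sketch of what that looks like when combined with the command from the question, the snippet below only assembles and prints the argument list. The helper name build_rsync_command is made up, and the paths and the 8 MB block size are simply taken from the question; I have not benchmarked this on files of that size.

```python
import shlex


def build_rsync_command(source, destination, block_size=8388608):
    """Return the question's rsync call with --no-whole-file added.

    Hypothetical helper for illustration; it does not run rsync itself.
    """
    return [
        "rsync",
        "--safe-links",
        "--inplace",
        "--no-whole-file",  # force the delta algorithm for local copies too
        "-crptogx",
        "-B", str(block_size),
        source,
        destination,
    ]


cmd = build_rsync_command("/source/", "/destination/")
# Print the command for inspection; to execute it you could pass the list
# to subprocess.run(cmd, check=True).
print(shlex.join(cmd))
```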