I want to clone one very large directory (many terabytes of multiple-gigabyte files) into another on another drive. I have been using this command:
ionice -c 3 rsync -avz /path/to/sourcedir/ /path/to/destdir/
The process takes over a day and more often than not gets interrupted, hence the use of rsync to be able to resume without restarting from zero. The theory should be that the above command is idempotent, so when anything fails I should just be able to reissue the same command to let it work out where it was interrupted and continue from there.
Now, because the point of the operation is to retire and recycle the source drive, before doing that I wanted to be super-sure that all files had been properly copied. So I used the approach in this question to compare each file byte by byte. Sure enough, there were a number of files that had a different hash.
So the theory question: does rsync, unlike what I thought I understood, work merely on file names, rather than content, or at least length?
And the (more important) practice question: are there other options I could be using instead, to force rsync to produce an exact clone of the source directory? In particular, in the case in which rsync is launched when the dest directory already has a file with the same name as one in the source directory, but with different content, I want the command to ensure it is replaced (or "completed") with the actual original file from the source directory.
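Yes, you can make rsync look into the files to check that everything matches. From man rsync, the relevant option is -c (--checksum), which decides what to transfer by comparing file checksums instead of modification time and size.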
Of course it will be slow, but rsync should find differences that the normal check would not find.
But rsyncing is not cloning. If you want a cloned copy, use Clonezilla.