I routinely use Unison to sync users' home directories between the workstations where each user is expected to work. Unfortunately, as the firm grows, Unison becomes slower and slower at determining which files have changed. The time taken by the actual transfer is negligible in comparison.
The synchronization is done in a star topology, with a RAID-6 Unison server at the center. Some workstations run Windows (with NTFS), some run Linux with either ext4 or Btrfs(!).
At the time of writing, there is one user whose home directory is 45 GB with 100K files, and a full synchronization takes about 30 minutes for him. Note that a simple directory traversal with find >null takes about 2 minutes.
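To make that baseline comparable across the Windows and Linux workstations, the metadata-only traversal can be timed with a small script. This is a minimal sketch, not part of Unison: the function name walk_stats and the command-line handling are my own, and it only reads file metadata, as find does, never file contents.

```python
import os
import sys
import time


def walk_stats(root):
    """Walk root and return (file_count, total_bytes) using metadata only."""
    count = 0
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # metadata only; contents are not read
            except OSError:
                continue  # file vanished or is unreadable; skip it
            count += 1
            total += st.st_size
    return count, total


if __name__ == "__main__" and len(sys.argv) == 2:
    # Usage (hypothetical script name): python walk_stats.py /home/someuser
    start = time.perf_counter()
    files, size = walk_stats(sys.argv[1])
    elapsed = time.perf_counter() - start
    print(f"{files} files, {size / 1e9:.2f} GB, walk took {elapsed:.1f} s")
```

If this walk takes a couple of minutes while Unison's update detection takes half an hour, the difference is most likely spent reading file contents rather than scanning metadata.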
What strategies are there for speeding up the process further (other than reducing the number of files to sync)? I believe that in theory Unison can be sped up, but the fastcheck option isn't enough.
OK, I've found the culprit: Unison ignores the fastcheck option for xls and mpp files and always performs the full comparison for them. It does so because Excel is in the habit of modifying xls files without changing the last-modified date.
Unfortunately for us, xls files make up about 20% of the whole volume of the documents.
Editing /usr/bin/unison in a hex editor and replacing xls with something not likely to be found (like xxx) did the trick.
On Unix filesystems (Btrfs, ext4) this procedure should be safe, as any change to a file should change the inode number, and Unison is supposed to use this information if available. As for the NTFS-based clients, I think we will have to put up with the slow scans... or maybe there is some alternative (abandoning Excel or changing the file system).
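For anyone who prefers a script to a hex editor, the edit can be sketched roughly as follows. This is only an illustration under assumptions: I have not checked how the xls literal is stored in any particular Unison build, the function names are my own, and the script writes a patched copy instead of touching the original binary. It first lists every offset where the bytes occur, so you can inspect the context before patching a single offset with a same-length replacement.

```python
import shutil
import sys


def find_offsets(data, needle):
    """Return every byte offset at which needle occurs in data."""
    offsets = []
    pos = data.find(needle)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(needle, pos + 1)
    return offsets


def patch_at(data, offset, old, new):
    """Return a copy of data with old replaced by new at exactly one offset."""
    if len(old) != len(new):
        # A different length would shift everything after it in the binary.
        raise ValueError("replacement must be the same length as the original")
    if data[offset:offset + len(old)] != old:
        raise ValueError("expected bytes not found at that offset")
    return data[:offset] + new + data[offset + len(old):]


if __name__ == "__main__" and len(sys.argv) in (2, 3):
    # Usage (hypothetical script name): python patch.py <binary> [offset]
    path = sys.argv[1]
    old, new = b"xls", b"xxx"  # the values mentioned in the post
    with open(path, "rb") as f:
        data = f.read()
    if len(sys.argv) == 2:
        # No offset given: only list candidates with a little context.
        for off in find_offsets(data, old):
            print(off, data[max(0, off - 8):off + 11])
    else:
        out = path + ".patched"
        with open(out, "wb") as f:
            f.write(patch_at(data, int(sys.argv[2]), old, new))
        shutil.copymode(path, out)  # keep the executable bit
        print("wrote", out)
```

Patching one inspected offset at a time is deliberate: the same three bytes may appear elsewhere in the binary for unrelated reasons.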
After this hack, Unison became more than ten times faster!
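Whether a similar speedup can be expected elsewhere depends on how much of a replica consists of xls and mpp files. A rough way to check is to sum file sizes per extension; the sketch below assumes nothing about Unison itself, and the function names are my own.

```python
import os
from collections import Counter


def bytes_by_extension(root):
    """Return a Counter mapping lower-cased file extension to total bytes."""
    sizes = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                size = os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                continue  # skip files that vanished or cannot be read
            ext = os.path.splitext(name)[1].lower()
            sizes[ext] += size
    return sizes


def share(sizes, extensions):
    """Fraction of all bytes that belongs to the given extensions."""
    total = sum(sizes.values())
    if total == 0:
        return 0.0
    return sum(sizes[ext] for ext in extensions) / total
```

For example, share(bytes_by_extension(home), [".xls", ".mpp"]) gives the fraction of the volume that is always compared in full, where home is the path of the directory being synchronized.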