Although similar questions have been asked on this site and its siblings before, I haven't managed to glean an answer to this persistent problem. Any help is much appreciated.
The situation: I've got two laptops; both contain a ton of music. Sometimes I move these music files to different locations, or change the metadata in them, or convert them to a different format. I might do any of these things on either machine. I rarely do all of them at once, i.e. it's unlikely that I'll convert a file's format and move it to a different location all in one go. I'd like to be able to synchronize these changes without having to sift through everything that was renamed or moved.
I'm familiar with rsync but I find it inadequate, because
- although it can compute checksums, it doesn't have any way to store them. So if a file differs, it can't figure out which side it changed on. This also means that it can't attempt to match a missing file to a new one with the same checksum (i.e. a move).
- if the filesize and date are the same, it , so it takes an epoch to do a sync on a large repository. I would like to only check the checksum if the files
- even if you turn on checksumming, it still doesn't use it intelligently: i.e. it checksums files even if the sizes differ. IIRC.
- it's not able to use file metadata as a means of file comparison. This is sort of a wishlist item but it seems doable.
I've also looked into rsnapshot, but its requirement to create a full backup is impractical in this situation. I don't need a backup, I just need a record of which file, with which hash, was where and when. Unison seems like it might be able to do something vaguely along these lines, but I'm loath to spend hours wading through its details only to discover that it's sadly lacking. Plus, it's fun asking questions on here.
What I'd like is a tool that does something along these lines:
- keeps track of file checksums or of actual renames, possibly using inotify to greatly reduce resource consumption/latency
- stores a database containing this info, along with other pertinencies like the file format and metadata, the actual inode, the filename history, etc.
- uses this info to provide more-intelligent synchronization with a counterpart on the other side.
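To make the checksum-tracking part of this concrete, here is a minimal Python sketch. It is not an existing tool: the function names (`file_hash`, `build_index`, `detect_moves`) and the choice of SHA-256 are assumptions made for illustration. It only rehashes a file when its size or modification time differs from the stored entry, and it treats a vanished path and a new path with the same hash as a move.

```python
import hashlib
from pathlib import Path


def file_hash(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_index(root, previous=None):
    """Map relative path -> (size, mtime_ns, sha256) for every file under root.

    If `previous` holds an entry for the same path with the same size and
    mtime, the stored hash is reused instead of reading the file again.
    """
    previous = previous or {}
    root = Path(root)
    index = {}
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(root).as_posix()
        stat = path.stat()
        old = previous.get(rel)
        if old and old[0] == stat.st_size and old[1] == stat.st_mtime_ns:
            checksum = old[2]  # unchanged as far as size/mtime can tell
        else:
            checksum = file_hash(path)
        index[rel] = (stat.st_size, stat.st_mtime_ns, checksum)
    return index


def detect_moves(old_index, new_index):
    """Return (moves, deleted, added) between two indexes.

    `moves` maps an old path to a new path carrying the same hash.
    """
    gone = {p: e for p, e in old_index.items() if p not in new_index}
    new = {p: e for p, e in new_index.items() if p not in old_index}
    by_hash = {}
    for path, entry in new.items():
        by_hash.setdefault(entry[2], []).append(path)
    moves = {}
    for path, entry in sorted(gone.items()):
        candidates = by_hash.get(entry[2])
        if candidates:
            moves[path] = candidates.pop(0)
    moved_targets = set(moves.values())
    deleted = sorted(p for p in gone if p not in moves)
    added = sorted(p for p in new if p not in moved_targets)
    return moves, deleted, added
```

In a real tool the index would be persisted between runs (for example as JSON or in SQLite) and exchanged with the other machine, so each side could replay the other's moves as local renames instead of transferring the data again.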
So for example: if a file has been converted from flac to ogg, but kept the same base filename, or the same metadata, it should be able to send the new version over, and the other side should delete the original. Probably it should actually sequester it somewhere in case they or you screwed up, but that's a detail. And then when the transaction is done, the state is logged so that the next time the two interact they can work out their differences.
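The flac-to-ogg case could be approximated without reading any tags, just by pairing a removed file with a newly added file that has the same directory and base filename but a different extension. The sketch below is only that heuristic; the function name `detect_conversions` and the `AUDIO_EXTENSIONS` set are made up for illustration, and matching on embedded metadata would need a tag-reading library on top of this.

```python
from pathlib import PurePosixPath

# Assumption: the set of extensions treated as interchangeable audio formats.
AUDIO_EXTENSIONS = {".flac", ".ogg", ".mp3", ".wav"}


def detect_conversions(removed_paths, added_paths):
    """Pair removed and added paths that look like a format conversion.

    Two paths are paired when they share a directory and base filename but
    have different audio extensions. Ambiguous cases (zero or several
    candidates) are left unpaired rather than guessed.
    """
    added_by_stem = {}
    for path in added_paths:
        p = PurePosixPath(path)
        if p.suffix.lower() in AUDIO_EXTENSIONS:
            added_by_stem.setdefault((p.parent, p.stem), []).append(path)

    conversions = {}
    for path in sorted(removed_paths):
        p = PurePosixPath(path)
        if p.suffix.lower() not in AUDIO_EXTENSIONS:
            continue
        candidates = [
            c for c in added_by_stem.get((p.parent, p.stem), [])
            if PurePosixPath(c).suffix.lower() != p.suffix.lower()
        ]
        if len(candidates) == 1:
            conversions[path] = candidates[0]
    return conversions
```

For example, given a removed `albums/x/01 intro.flac` and an added `albums/x/01 intro.ogg`, the function pairs the two, so the receiving side would know to fetch the ogg and sequester the flac rather than treat them as unrelated files.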
Maybe all this metadata stuff is a fancy pipe dream. I would actually be pretty happy if there were something out there that could just use checksums in an intelligent way. This would be sort of like having the intelligence of something like git, minus the need to duplicate data in an index/backup/etc. (and minus branching, checkouts, and all the other great stuff that revision control systems do). Basically, fast-forward commit pushes are all I want, with maybe the option to roll back.
So is there something out there that can do this? If not, can someone suggest a good way to start making it?
Sounds like you are talking about Windows Live Sync...
I have used Unison which works well. And it is free.
Have you had a look at Microsoft's free SyncToy? I use it for my media files, and it sounds like you're looking for functionality similar to what I use in this tool. It doesn't use a database of the file info, but it seems to manage okay with my format changes. Perhaps I don't have the same demands for efficiency, though!
I use Beyond Compare to sync my data. It works great and has many options.
Check it out
Scooter Software
Good Luck