I have a file server that files get dumped on throughout the day. They can be any size, any type, etc. I then want to sync those files down to a remote backup server, but the trick is that once they're on the backup server they get renamed and moved, so a simple rsync won't work.
Currently I keep track of which files I've downloaded by running an ls on the directory and saving the output locally after I sync them. Then, when I run the job again, I rsync files while excluding the files in that list. This works for the most part, but sometimes a file will have an odd character in its name and then gets re-downloaded. Also, if for some reason the network flakes out and the "ls" fails, the next run will try to re-download everything because the exclude list is empty.
Is there a better way to do this?
If you really need to have the original filesystem and another one, which starts out as a copy but is then renamed, then I'd maintain (not rewrite) a list of downloaded files. If a backup run finds anything new, it is added to the list. If not, nothing is added to the list. That way a failed listing can never wipe out the record of what has already been transferred.
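As a rough illustration of the "maintain, not rewrite" idea, here is a minimal Python sketch. The function names (load_manifest, select_new_files, record_downloaded) and the one-name-per-line manifest format are assumptions made for this example, not anything from your current setup, and it does not handle file names that contain newlines.

```python
from pathlib import Path


def load_manifest(manifest_path):
    """Return the set of file names already recorded as downloaded."""
    path = Path(manifest_path)
    if not path.exists():
        return set()
    with path.open(encoding="utf-8") as handle:
        return {line.rstrip("\n") for line in handle if line.strip()}


def select_new_files(remote_names, seen):
    """Return names from the remote listing that are not yet in the manifest."""
    fresh = []
    for name in remote_names:
        if name not in seen and name not in fresh:
            fresh.append(name)
    return fresh


def record_downloaded(manifest_path, names):
    """Append transferred names; existing entries are never rewritten."""
    if not names:
        # An empty or failed listing leaves the manifest untouched.
        return
    with Path(manifest_path).open("a", encoding="utf-8") as handle:
        for name in names:
            handle.write(name + "\n")
```

The output of select_new_files could be written to a temporary file and passed to rsync with --files-from, and record_downloaded would only be called after the transfer reports success. Because the manifest is opened in append mode, a run where the listing comes back empty adds nothing and loses nothing.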
If you are having problems with odd characters, try to fix it wherever the problem is: convert (with iconv) the list of downloaded files, or connect to the server using the correct charset, etc. You'll need to do it by hand and double check everything is correct before adding it to the scripts you're using.
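If the list ends up being handled from a script rather than with iconv directly, the same kind of conversion can be sketched in Python. This assumes the listing arrives as raw bytes in a known encoding (latin-1 here is only a placeholder, as is the function name normalize_name); the NFC step is an extra assumption to make composed and decomposed accents compare equal.

```python
import unicodedata


def normalize_name(raw, source_encoding="latin-1"):
    """Decode a raw file name and return it in a single canonical Unicode form."""
    text = raw.decode(source_encoding)
    # NFC folds "e" + combining accent into the single precomposed character,
    # so the same name always compares equal in the downloaded-files list.
    return unicodedata.normalize("NFC", text)
```

Running every name through one function like this before it is compared or stored means both sides of the comparison use the same representation. It is still worth checking a few of the problem names by hand first, since the right source encoding depends on how the server produced the listing.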