I'm using rsync for the first time to create daily backups of my websites, and I was wondering whether I should overwrite the previous copy or keep multiple copies and overwrite only the oldest one. (I might not have enough space for that, though.)
I also have a related question. Suppose most of the files are accidentally erased: does rsync then delete all of those files from the backup space because they no longer exist at the source? How exactly does it work in this case?
thanks
If you are on a system that supports hard links (e.g. Linux, Unix, or OS X), something like rsnapshot might be a good solution. It automates daily, weekly, etc. snapshots and their rotation. The disk space used stays small over time because files that have not changed are shared between snapshots through file system hard links instead of being stored again.
rsnapshot http://rsnapshot.org/
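To make the hard-link point concrete, here is a minimal Python sketch of the idea. It is not rsnapshot's actual code, and the function name `hardlink_snapshot` and the directory names in the usage note are made up for illustration. It assumes a simple tree of regular files on a single file system that supports hard links.

```python
import os


def hardlink_snapshot(previous: str, new: str) -> None:
    """Recreate the directory tree of `previous` under `new`,
    hard-linking every file instead of copying it."""
    for root, _dirs, files in os.walk(previous):
        rel = os.path.relpath(root, previous)
        target_dir = os.path.normpath(os.path.join(new, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in files:
            # os.link adds a second name for the same data on disk;
            # no file content is copied.
            os.link(os.path.join(root, name), os.path.join(target_dir, name))
```

Calling `hardlink_snapshot("daily.1", "daily.0")` would give you a second full-looking tree that takes almost no extra space. Running rsync into the new tree afterwards only costs space for files that changed, because by default rsync writes a changed file as a new file and renames it into place, which leaves the older snapshot's link untouched.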
You definitely need to have a backup strategy that includes keeping a number of older backups. If you keep only one mirror of your live system, then, should the files on your live system become corrupted, your backup process will dutifully mirror the corrupted files, and your backup will become useless.
A good strategy is to keep a number of timestamped (or at least datestamped) backups available, so that if you find that files in your latest backups are copies of corrupted files, you can refer back to older versions that are known to be good.
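As a rough sketch of that rotation, the following Python function picks which datestamped backups to delete so that only the newest few remain. It assumes, purely for illustration, that each backup lives in a directory named with an ISO date such as `2024-01-05`; the function name `backups_to_delete` is hypothetical.

```python
from datetime import date
from typing import Iterable, List


def backups_to_delete(names: Iterable[str], keep: int) -> List[str]:
    """Return the datestamped backup names to remove, oldest first,
    so that only the `keep` newest backups remain."""
    if keep < 1:
        raise ValueError("keep must be at least 1")
    dated = []
    for name in names:
        try:
            dated.append((date.fromisoformat(name), name))
        except ValueError:
            # Not a datestamped backup directory; leave it alone.
            continue
    dated.sort()
    # Everything except the last `keep` entries is old enough to drop.
    return [name for _, name in dated[:-keep]]
```

With backups named `2024-01-01`, `2024-01-02` and `2024-01-03` and `keep=2`, only `2024-01-01` is selected for deletion. Names that do not parse as dates are never selected, so stray directories are not removed by accident.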
rsnapshot, mentioned by Alastair, is a very nice tool, based on rsync, that creates and keeps a configurable number of versioned mirrors (or "snapshots") of your sources on a configurable schedule. You can tell rsnapshot how many hourly, daily, weekly, and/or monthly snapshots to keep, and it will automatically take care of deleting the oldest snapshots when your chosen maximum number of snapshots to retain is reached. As also mentioned, rsnapshot is quite efficient with disk space, since it uses hard links to represent files that don't change between snapshots.
Unless you're doing something fairly odd, you won't overwrite in the traditional sense. rsync is designed to bring a file system tree on a remote server into sync with a file system tree locally (or vice-versa). The first day you run it, it'll take a while and copy the whole tree; the following day, you'll copy only the files that have changed.
If you want historical information, but are short of space, then instead of keeping multiple copies of the tree, you may find it more satisfactory to run a cron job that tars up the directory on the remote server, compressing as hard as it can, and keep as many of those tarfiles as you can afford.
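A cron job along those lines could call something like the Python sketch below, which packs a directory into a datestamped, xz-compressed tar file using the standard `tarfile` module. The function name `archive_tree` and the `backup-<date>.tar.xz` naming are illustrative assumptions, not anything prescribed by rsync or cron.

```python
import os
import tarfile
from datetime import date
from typing import Optional


def archive_tree(source_dir: str, dest_dir: str,
                 today: Optional[date] = None, preset: int = 9) -> str:
    """Pack `source_dir` into dest_dir/backup-YYYY-MM-DD.tar.xz
    and return the path of the archive."""
    today = today or date.today()
    os.makedirs(dest_dir, exist_ok=True)
    archive = os.path.join(dest_dir, f"backup-{today.isoformat()}.tar.xz")
    # preset=9 is the strongest standard xz level; it is slow and needs
    # more memory, so lower it if the server is small.
    with tarfile.open(archive, "w:xz", preset=preset) as tar:
        top = os.path.basename(os.path.normpath(source_dir))
        tar.add(source_dir, arcname=top)
    return archive
```

Running this once a day leaves one archive per date in `dest_dir`, and you can then delete the oldest ones whenever space runs short.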
As derfk has noted, you get to choose, for example, whether rsync deletes files on the remote side that have vanished locally, as well as much other behaviour. But if someone has changed all your local files to say "I HATE PIE", it's your job to catch that before rsync syncs the tree.
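To illustrate that choice, here is a small Python sketch that builds an rsync command line. The flags `-a`, `--delete`, `--max-delete`, `--dry-run` and `--itemize-changes` are standard rsync options; the function name `build_rsync_command` and the paths and host in the usage note are made up for the example.

```python
from typing import List, Optional


def build_rsync_command(source: str, dest: str, delete: bool = False,
                        dry_run: bool = False,
                        max_delete: Optional[int] = None) -> List[str]:
    """Build an rsync argument list; nothing is executed here."""
    cmd = ["rsync", "-a"]  # -a: archive mode (recursive, keeps times, permissions, ...)
    if delete:
        # Remove files on the destination that no longer exist at the source.
        cmd.append("--delete")
    if max_delete is not None:
        # Safety net: stop deleting after this many files.
        cmd.append(f"--max-delete={max_delete}")
    if dry_run:
        # Only report what would change, without touching the destination.
        cmd.extend(["--dry-run", "--itemize-changes"])
    cmd.extend([source, dest])
    return cmd
```

For instance, `build_rsync_command("/var/www/", "backup@example.org:/backups/www/", delete=True, dry_run=True)` produces a command you could hand to `subprocess.run` to preview the deletions first. Without `--delete`, files that disappear locally simply stay in the backup.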
That's one of the reasons why I don't think of rsync as a proper backup solution. But if all you have is another server to do the backups to, and you don't have much space on that server, then rsync + aggressive tar is probably about as good as you can do without getting into funky union FS tricks, and it's definitely better than no backups at all.