I make a lot of backups. I keep them on different disks, which are stored in different places. I am looking for a particular type of backup, but I can't find its name (which I need in order to figure out whether rsync or another Ubuntu tool can help me with it).
Here is what I am trying to achieve.
- Always keep an identical copy of the current state of the home folder (in other words, the backup and the home folder are identical after each backup).
- On each backup, every file that was changed or deleted in the home folder since the previous backup is taken out of the backup and stored in a folder that contains all files changed or deleted as of that day's backup.
For instance,
Day 1
/home/joey/1.txt
/home/joey/2.txt
/home/joey/3.txt
Regular old-school Backup:
/media/backup/joey/1.txt
/media/backup/joey/2.txt
/media/backup/joey/3.txt
Day 2
/home/joey/1.txt
/home/joey/3.txt
# D /home/joey/2.txt is deleted
Backup with an exact copy of joey, but with a new diff folder:
/media/backup/joey/1.txt
/media/backup/joey/3.txt
/media/backup/day2-diff/joey/2.txt
Day 3
/home/joey/1.txt
/home/joey/3.txt # A /home/joey/3.txt was changed
/home/joey/4.txt
Backup with again an exact copy of joey, and with a diff folder for a changed file:
/media/backup/joey/1.txt
/media/backup/joey/3.txt # the new version
/media/backup/joey/4.txt
/media/backup/day2-diff/joey/2.txt
/media/backup/day3-diff/joey/3.txt # the old version from the backup is moved here
The logic is the following: I currently have so many backups that I will need to delete some of them at some point. That is unfortunate, because I want to keep at least the files that I deleted or changed. This type of backup would allow me to do so.
So I was thinking of
- a dry run rsync TARGET -> SOURCE to get a list of changed files from the perspective of the TARGET
- a script to copy these files to specific folders with the time and date in the folder name
- a regular rsync SOURCE -> TARGET
I know this is altering the backup, but I think that given the number of backups I have, this should not be a problem.
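The three steps above might collapse into a single rsync call, since rsync's `--backup` and `--backup-dir` options move files that are about to be overwritten or deleted on the target into a separate directory. What follows is only a minimal sketch under that assumption; the function name `backup_with_diff` and the `<label>-diff` folder naming are made up for illustration, and the paths should be absolute.

```shell
#!/usr/bin/env bash
# Sketch: mirror SRC into DEST/<name> and move every file that rsync would
# overwrite or delete into DEST/<label>-diff/<name> instead of discarding it.
# backup_with_diff is a hypothetical helper, not an existing tool.
backup_with_diff() {
    local src="${1%/}"                 # e.g. /home/joey
    local dest="${2%/}"                # e.g. /media/backup (use an absolute path)
    local label="${3:-$(date +%F)}"    # e.g. day2, defaults to today's date
    local name
    name="$(basename "$src")"

    mkdir -p "$dest/$name"
    # -a           preserve permissions, times, symlinks, and recurse
    # --delete     remove files from the mirror that no longer exist in SRC
    # --backup     keep the old copy of anything replaced or deleted ...
    # --backup-dir ... and put it in the per-run diff folder
    rsync -a --delete --backup \
        --backup-dir="$dest/$label-diff/$name" \
        "$src/" "$dest/$name/"
}

# Example call (paths taken from the listing above):
# backup_with_diff /home/joey /media/backup day2
```

With this layout each diff folder only holds what was displaced by that particular run, so a deleted file shows up in one diff folder rather than in every later one.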
Is there a name for this type of backup? (This is my main question.) And if possible, how would one achieve it on Ubuntu?
I am not sure whether a file that is deleted should be part of every single diff afterwards. It comes down to a choice between one diff per backup and one big diff that grows incrementally. Again, I am not sure about the terminology.
It sounds to me like what you want to achieve is exactly the way BackupPC works. See http://backuppc.sourceforge.net/
You can install it in Ubuntu using: sudo apt-get install backuppc
Note however that the default install doesn't do anything on its own. You will have to create configuration files for each machine and/or directory you want backed up.
The way BackupPC works is by transferring a copy of the files on the first backup; on subsequent backups, it creates hard links to the unchanged files and copies only the changed ones. So, as far as your filesystem is concerned, when you navigate to the BackupPC backup directory you have a snapshot of the way the files looked at a particular time.
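To make the hard-link idea concrete, here is a small sketch that uses rsync's `--link-dest` option rather than BackupPC itself. It is not how BackupPC is implemented internally; it only illustrates the same principle of linking unchanged files and copying changed ones. The function name `take_snapshot` and the snapshot directory names are assumptions for illustration.

```shell
#!/usr/bin/env bash
# Sketch: create snapshot NEW of SRC. If PREV (an earlier snapshot, given as
# an absolute path) is supplied, files unchanged since PREV become hard links
# to the copies in PREV, so they take no extra disk space.
# take_snapshot is a hypothetical helper, not part of BackupPC.
take_snapshot() {
    local src="${1%/}"
    local new="${2%/}"
    local prev="${3:-}"

    if [ -n "$prev" ]; then
        # Unchanged files are hard-linked to PREV; changed files are copied.
        rsync -a --link-dest="$prev" "$src/" "$new/"
    else
        # First snapshot: plain full copy.
        rsync -a "$src/" "$new/"
    fi
}

# Example calls (hypothetical paths):
# take_snapshot /home/joey /media/backup/snap-day1
# take_snapshot /home/joey /media/backup/snap-day2 /media/backup/snap-day1
```

Each snapshot directory then looks like a complete copy, and deleting an old snapshot only frees the files that no other snapshot still links to.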
See http://backuppc.sourceforge.net/info.html or install backuppc on your system and read the docs there.
Here's how to install it (the process may vary in newer versions): How to configure Backuppc in ubuntu 12.04?
Watch for issues with differing filesystems on different systems (Windows / Linux / Macintosh). Furthermore, differences in the way the backup volume is connected between the host and the backup clients (network drive vs. local drives, for example) will have a huge effect on the time it takes to finish a backup.
I would just use git. You can revert changes as necessary and host your "remote" repo on a hard drive, another computer, or just about anything else you want to. You can check the status of changes and push to your repo as needed, meaning that a daily backup isn't necessary. That being said, you might want the daily backups, in which case you could write a bash script and run it as a cron job if 'git status' reveals changes have occurred.
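A rough sketch of such a script follows. It assumes the folder is already a git repository; the function name `commit_if_changed` and the remote name `backup` in the commented-out push are placeholders, not anything git defines.

```shell
#!/usr/bin/env bash
# Sketch: commit everything in REPO, but only if `git status` reports changes.
# commit_if_changed is a hypothetical helper name.
commit_if_changed() {
    local repo="$1"

    # `git status --porcelain` prints one line per changed or untracked path
    # and prints nothing at all when the working tree is clean.
    if [ -n "$(git -C "$repo" status --porcelain)" ]; then
        git -C "$repo" add -A
        git -C "$repo" commit -q -m "Backup $(date '+%F %T')"
        # Optionally push to wherever the "remote" repo lives, e.g.:
        # git -C "$repo" push -q backup HEAD   # "backup" is an assumed remote name
    fi
}

# Example call:
# commit_if_changed /home/joey
```

To run it daily, a crontab entry along the lines of `0 20 * * * /home/joey/bin/git-backup.sh` could call a script containing the function and the example call; that script path is made up for illustration.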