I have a website which works with lots and lots of text files; they currently take up about 40 GB of data and the amount keeps growing over time. I need to make a full daily backup. My current strategy is to make a password-protected archive and store it in Dropbox with this command:
tar cfz - /var/www/mysite | openssl enc -aes-256-cbc -e -k "b@ckupPassword" > /home/user/Dropbox/server_backups/sources/2013_01_04_0500_mysite_source_encrypted.tgz
It works, but making the archive takes about 14 hours and consumes a lot of I/O, and it will only get worse as the amount of data increases.
What is the proper strategy for backing up such a large amount of files?
I would use rsync, provided I have enough space. The sketch below makes a full backup and keeps a week of incremental ones.
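This is only a minimal sketch of such a script, assuming local backup paths and weekday-based directory names of my own choosing (SRC, DST and the naming are not from the original setup):

#!/bin/bash
# Hypothetical sketch: keep a current full mirror plus one directory of
# changed/deleted files per weekday. Paths and naming are assumptions.
SRC=/var/www/mysite
DST=/home/user/backups/mysite
DAY=$(date +%A)                      # weekday name, e.g. "Monday"

# Start the per-day incremental directory fresh so only one week is kept.
rm -rf "$DST/incr/$DAY"
mkdir -p "$DST/full" "$DST/incr/$DAY"

# --backup/--backup-dir move files that changed or were deleted since the
# last run into today's incremental directory; "full" always holds the
# latest complete copy.
rsync -a --delete \
      --backup --backup-dir="$DST/incr/$DAY" \
      "$SRC/" "$DST/full/"

Because rsync only transfers files that changed, the daily run should touch far less data than re-archiving all 40 GB.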
The script can be extended to store the backups offsite, in Dropbox folders and so on.
If you really need to use tar, you can keep track of the modified files with an incremental snapshot file, as in the sketch below. If you want a full backup, delete /var/log/mysite.tarlog first.
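A hedged sketch of what that could look like, reusing the encryption pipeline from the question (the snapshot file is the one mentioned above; the archive name is my assumption):

# GNU tar records file state in /var/log/mysite.tarlog; the first run is a
# full backup, later runs only archive files changed since the previous one.
tar czf - --listed-incremental=/var/log/mysite.tarlog /var/www/mysite \
  | openssl enc -aes-256-cbc -e -k "b@ckupPassword" \
  > /home/user/Dropbox/server_backups/sources/$(date +%Y_%m_%d_%H%M)_mysite_incr.tgz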
As suggested above, rsync seems to be the best way to back up the whole site. Still, I would suggest implementing some sort of replicated filesystem, something like a simple GlusterFS volume with replication.
Replication is not a backup, but it can help you reduce the I/O impact of backups and eventually give you a solid base to expand your website into a cluster later.
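For illustration only, a rough sketch of creating such a volume; the hostnames (web1, web2), volume name and brick paths are assumptions, not part of the original setup:

# Two-node replicated GlusterFS volume; writes to the mounted volume are
# mirrored to both nodes.
gluster peer probe web2
gluster volume create mysite-vol replica 2 \
    web1:/data/glusterfs/mysite web2:/data/glusterfs/mysite
gluster volume start mysite-vol

# Mount the volume where the site lives; backups can then be taken from the
# replica node so they do not compete with the web server for I/O.
mount -t glusterfs web1:/mysite-vol /var/www/mysite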
It is better to use an incremental backup mechanism in this case. With rsync you can take incremental backups, for example hard-linked snapshots as sketched below. See the rsync documentation and further reading on incremental backups for more details.
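One common incremental pattern, sketched here with assumed backup paths (not taken from this answer), is daily snapshots via rsync --link-dest:

# Each day gets its own snapshot directory; files unchanged since yesterday
# are hard-linked instead of copied, so they take no extra space.
TODAY=$(date +%Y-%m-%d)
YESTERDAY=$(date -d yesterday +%Y-%m-%d)

rsync -a --delete \
      --link-dest="/backup/mysite/$YESTERDAY" \
      /var/www/mysite/ "/backup/mysite/$TODAY/"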