I have a Linux VPS hosted in a datacenter.
I need to set up backups for it, and I plan to order The Planet's Stored Cloud for this: http://www.theplanet.com/cloud-storage/
My question is about the backup strategy.
On my other servers I currently run a weekly rsync backup: I take a full backup at the beginning of the week and update it incrementally during the week.
On the backup server I end up with something like this:
200902_week06
200902_week07
200902_week08
....
Within each week I use rsync --delete. This has worked for my purposes until now.
But the new server has a lot of files, and copying all of them again every week would waste bandwidth and storage.
With the old approach, if something goes wrong I can only roll files back by week. On the new server I need to be able to roll back by day.
I'm thinking of something like Time Machine on the Mac: I send only what is new, as rsync does, but I can roll back to (and browse) each committed day.
To do this, I'm thinking of using a VCS, like Bazaar, to manage the commit entries. What do you think about this?
My second question is about using a second backup storage: a backup of the backup. I know datacenters like The Planet have RAID. But what happens if someone gains access to my VPS and gets the username and password for the backup service from my cron backup script?
In my current setup I make two backups, and nothing on my public server mentions the second storage. Again: what do you think about this? Are there other ways?
Thank you, Daniel Koch
For your first question, I'd recommend taking a look at rsnapshot. It's basically a wrapper around rsync and a few other standard tools. It will manage your versioned/incremental backups and, on your backup server, provide a browsable tree for each "snapshot". It uses filesystem hard links to provide a full "view" of each snapshot even though only a few files might have changed: unchanged files are hard-linked to the copy in the previous snapshot, so only new or changed files take up additional space and bandwidth.
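To make the hard-link idea concrete, here is a minimal Python sketch of the mechanism. It is not rsnapshot's actual code; the function name `take_snapshot` and the directory layout are made up for illustration, and it assumes the snapshots all live on one filesystem that supports hard links and that the new snapshot directory does not exist yet.

```python
import filecmp
import os
import shutil


def take_snapshot(source, snapshot_root, name, previous=None):
    """Create snapshot_root/name as a complete tree mirroring source.

    Files whose content is identical to the copy in the `previous`
    snapshot are hard-linked instead of copied, so they use no extra space.
    (Hypothetical helper, for illustration only.)
    """
    target = os.path.join(snapshot_root, name)
    for dirpath, _dirnames, filenames in os.walk(source):
        rel = os.path.relpath(dirpath, source)
        os.makedirs(os.path.normpath(os.path.join(target, rel)), exist_ok=True)
        for filename in filenames:
            src = os.path.join(dirpath, filename)
            dst = os.path.normpath(os.path.join(target, rel, filename))
            old = None
            if previous is not None:
                old = os.path.normpath(
                    os.path.join(snapshot_root, previous, rel, filename))
            if old and os.path.isfile(old) and filecmp.cmp(src, old, shallow=False):
                os.link(old, dst)       # unchanged: share the existing data
            else:
                shutil.copy2(src, dst)  # new or changed: store a fresh copy
    return target
```

Calling `take_snapshot("/var/www", "/srv/backups", "day2", previous="day1")` (paths are examples) would leave a `day2` directory that looks like a full copy, while unchanged files are the same inodes as in `day1`. Deleting an old snapshot later only frees the files that no other snapshot still links to.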
For your second question: you should be using pull backups, not push backups like you're currently using. On your backup server, generate an ssh keypair, install the public key in an account on your production server, and then have the backup server log in to that account over ssh to fetch the data. That way only your public key is on the production server, and it holds no credentials for the backup server, so someone who compromises it gains no access to your backups.
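As a rough sketch of what the pull side could look like when run from cron on the backup server, the Python below only assembles the rsync command line. The function name, the host `vps.example.com`, the `backup` account and all paths are placeholders I made up; `-a`, `--delete` and `-e` are standard rsync options.

```python
import shlex


def build_pull_command(remote_user, remote_host, remote_path, local_dir, key_file):
    """Return the rsync argv that pulls remote_path into local_dir over ssh.

    Meant to run on the backup server, so the private key never leaves it.
    (Hypothetical helper, for illustration only.)
    """
    ssh = "ssh -i " + shlex.quote(key_file)
    return [
        "rsync",
        "-a",        # archive mode: recurse, keep permissions, times, symlinks
        "--delete",  # remove files locally that no longer exist on the remote
        "-e", ssh,   # run the transfer over ssh using the dedicated key
        f"{remote_user}@{remote_host}:{remote_path}",
        local_dir,
    ]
```

For example, `build_pull_command("backup", "vps.example.com", "/var/www/", "/srv/backups/vps/", "/root/.ssh/backup_key")` returns a list that can be handed to `subprocess.run(cmd, check=True)`. Building the list separately makes it easy to print and check the command before letting cron run it.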
Also, regarding your proposal to use a VCS to manage backups: I'd strongly recommend against that. Sure, it would probably work, but performance would likely be very poor. There are much better purpose-built backup tools that you'll be much happier with in the long run.
I recently stumbled upon a good blog post that shows how easy it is to create a Time Machine-style scheme using rsync. Check it out. I've also added it to the rsync Wikipedia page as a longer-lasting reference.
http://en.wikipedia.org/wiki/Rsync#Examples
http://blog.interlinked.org/tutorials/rsync_time_machine.html