I am looking for recommendations for backing up my current 6 VMs (soon to grow to up to 20). Currently I am running a two-node Proxmox cluster (a Debian base using KVM for virtualization, with a custom web front end for administration). I have two nearly identical boxes with AMD Phenom II X4 CPUs and Asus motherboards. Each has four 500 GB SATA II HDDs: one for the OS and other data for the Proxmox install, and three using mdadm + DRBD + LVM to share 1.5 TB of storage between the two machines. I mount LVM images to KVM for all of the virtual machines. I currently have the ability to do a live transfer from one machine to the other, typically within seconds (it takes about 2 minutes on the largest VM, running Win2008 with MS SQL Server). I am using Proxmox's built-in vzdump utility to take snapshots of the VMs and store those on an external hard drive on the network. I then have the JungleDisk service (using Rackspace) sync the vzdump folder for remote offsite backup.
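For reference, the nightly job is just vzdump driven from cron, something like this (the exact flags may differ slightly between Proxmox versions, so treat it as a sketch):

    # /etc/cron.d entry: snapshot every VM and write compressed dumps
    # to the network-mounted external drive that JungleDisk then syncs offsite
    0 1 * * * root vzdump --all --mode snapshot --compress gzip --dumpdir /mnt/external-backup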
This is all fine and dandy, but it's not very scalable. For one, the backups themselves can take up to a few hours every night. With JungleDisk's block-level incremental transfers, the sync only moves a small portion of the data offsite, but even that takes at least half an hour.
The much better solution would of course be something that lets me instantly take the difference between two points in time (say, what was written from 6am to 7am), zip it, then send that difference file to the backup server, which would in turn transfer it to the remote storage on Rackspace. I have looked a little into ZFS and its ability to do send/receive. That, coupled with piping the data through bzip2 or something, would seem perfect. However, it seems that implementing a Nexenta server with ZFS would essentially require at least one or two more dedicated storage servers to serve iSCSI block volumes (via zvols?) to the Proxmox servers. I would prefer to keep the setup as minimal as possible (i.e. NOT having separate storage servers) if at all possible.
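To make the idea concrete, the pipeline I have in mind would look something like this (hypothetical pool/dataset names, and assuming ZFS were actually available on both ends):

    # take hourly snapshots of the VM store
    zfs snapshot tank/vmstore@0600
    zfs snapshot tank/vmstore@0700
    # send only what was written between 6am and 7am, compressed, to the backup box
    zfs send -i tank/vmstore@0600 tank/vmstore@0700 | bzip2 -c | \
        ssh backup-host 'cat > /backups/vmstore-0600-0700.zfs.bz2'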
I have also briefly read about Zumastor. It looks like it could also do what I want, but development appears to have halted in 2008.
So: ZFS, Zumastor, or something else?
This might not be possible in your situation, so I hope I don't get down-voted in that case, but it might be more efficient to change your backup strategy. If you back up specific data instead of VM snapshots, your backups would run much quicker, and it would be easier to capture changes.
Depending on your VMs and what they're used for, you could just have them back up their data daily (or whatever schedule is appropriate) to where you store the snapshots now, and then JungleDisk can back up just that data. That way only changed files are transferred, and both the space required for backups and the time needed would be reduced. In addition, you could still take snapshots to retain, just much less often (weekly, for example).
In this case, you could always just bring up a new VM and restore data, or use an older snapshot to restore the VM, and then use the data backup to restore to the most recent point.
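As a minimal sketch of what such an in-guest job could look like (paths and host are hypothetical; the target is the same network store the snapshots go to now):

    # /etc/cron.d entry inside each VM: nightly sync of just the application
    # data to the snapshot store, so JungleDisk only ever sees changed files
    0 2 * * * root rsync -az --delete /var/lib/app-data/ backupstore:/backups/$(hostname)/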
If I were doing offsite backups I would choose one of the following options:
(a) A shell script that does an SCP copy to the remote server. This way you could add a cron job that automatically runs the script to create the backup. Additionally, you can make it create a temporary archive file before actually transferring the files, thereby saving bandwidth, since you aren't transferring while still gzipping.
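A minimal sketch of that script, assuming SSH keys are set up (hostnames and paths are placeholders):

    #!/bin/sh
    # archive first, transfer second: the gzip finishes before any bandwidth is used
    DATE=$(date +%F)
    TMP=/tmp/backup-$DATE.tar.gz
    tar -czf "$TMP" /var/backups/vzdump             # create the temporary compressed archive
    scp "$TMP" backupuser@remote-server:/backups/   # SCP it to the remote server
    rm -f "$TMP"                                    # clean up the temp archive

Drop it in /etc/cron.daily/ and the backup runs automatically every night.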
or
(b) Install a server management tool like Webmin and get that to do automated backups. I am currently using this on my production servers without any problems; it just works flawlessly. I would also recommend Cloudmin (paid) for managing many VMs, as it provides an all-in-one solution.
Some extra links:
http://www.debianhelp.co.uk/backup.htm
http://ubuntuforums.org/showthread.php?t=35087
Hope that helps, RayQuang
You might want to take a look at BackupPC.
BackupPC can work on top of rsync, which does incremental copies.
Furthermore, you can easily write a blacklist of folders that don't have to be backed up, for instance: temp/, /tmp, .garbages/, etc. (see the sketch below).
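BackupPC has its own exclude settings for this, but underneath it amounts to rsync mechanics along these lines (folder names taken from the examples above, paths hypothetical):

    # incremental copy that skips the blacklisted folders
    rsync -az --exclude 'temp/' --exclude '/tmp' --exclude '.garbages/' \
        /source/ backupserver:/backups/host1/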
http://backuppc.sourceforge.net/
BackupPC has a clean web interface that lets you download parts of a backup directly as a zip file. It can be monitored by Nagios using check_backuppc.
I'm not sure how much architectural change you were planning to make to increase your scalability. However, if you're open to switching VM platforms, you could look at VMware.
There are lots of good VMware backup solutions; I've personally used Vizioncore. You can then do some slick stuff with snapshots and point-in-time recovery. There is even the ability to fail over to a remote site.
ZFS does this very well; you already mentioned knowing that, along with the downside of it not working well at the two-server scale. It also isn't going to give you DRBD failover, i.e. Nexenta will be a single point of failure.
You could consider trying to get VirtualBox on OpenSolaris or NexentaCore so you can re-use your existing machines, but it's not as simple as Proxmox + DRBD.
If you measure your changes and find them low enough, you could try DRBD with a third mirror offsite; it's only going to work if the number of writes on your VMs is extremely low.
Steve Radich - Windows Hosting & SQL Performance Since 1995 - http://www.BitShop.com/Blogs.aspx
I think I may have found the ultimate answer to my question:
BUP https://github.com/bup/bup
Features:
It uses a rolling checksum algorithm (similar to rsync) to split large files into chunks. The most useful result of this is that you can back up huge virtual machine (VM) disk images, databases, and XML files incrementally, even though they're typically all in one huge file, and not use tons of disk space for multiple versions.
It uses the packfile format from git (the open source version control system), so you can access the stored data even if you don't like bup's user interface.
Unlike git, it writes packfiles directly (instead of having a separate garbage collection / repacking stage) so it's fast even with gratuitously huge amounts of data. bup's improved index formats also allow you to track far more filenames than git (millions) and keep track of far more objects (hundreds or thousands of gigabytes).
Data is "automagically" shared between incremental backups without having to know which backup is based on which other one - even if the backups are made from two different computers that don't even know about each other. You just tell bup to back stuff up, and it saves only the minimum amount of data needed.
You can back up directly to a remote bup server, without needing tons of temporary disk space on the computer being backed up. And if your backup is interrupted halfway through, the next run will pick up where you left off. And it's easy to set up a bup server: just install bup on any machine where you have ssh access.
Bup can use "par2" redundancy to recover corrupted backups even if your disk has undetected bad sectors.
Even when a backup is incremental, you don't have to worry about restoring the full backup, then each of the incrementals in turn; an incremental backup acts as if it's a full backup, it just takes less disk space.
You can mount your bup repository as a FUSE filesystem and access the content that way, and even export it over Samba.
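The basic workflow is short; per the bup docs it goes roughly like this (remote host name is a placeholder):

    bup init                                  # create the repository (~/.bup by default)
    bup index /var/lib/vz/images              # index the VM disk images
    bup save -n vm-images /var/lib/vz/images  # save a deduplicated backup set locally
    bup save -r backuphost: -n vm-images /var/lib/vz/images  # or push straight to a remote bup server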
Edit: (Aug 19, 2015) And yet another great solution has come out that is even better: https://github.com/datto/dattobd
It allows live snapshotting, essentially giving COW-like features to any regular old filesystem in Linux.
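If I'm reading the project README correctly, usage is along these lines (device, paths, and minor number are examples, so double-check against the docs):

    # track a mounted block device; writes get COW'd into the file and the
    # snapshot shows up as /dev/datto0 for your backup tool to read
    dbdctl setup-snapshot /dev/sda1 /var/backup/datto-cow 0
    # ... back up /dev/datto0 ...
    dbdctl transition-to-incremental 0   # keep tracking changed blocks for the next run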
Edit: (Jul 15, 2016) And yet another great solution, one that blows bup out of the water: https://github.com/borgbackup/borg
It's particularly better than bup at pruning, and it seems to have great support for compression, encryption, and efficient deduplication. dattobd + borg ftw!!!
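For anyone landing here later, the borg side is roughly this (repo path is a placeholder):

    borg init --encryption=repokey /mnt/backup/borg-repo    # one-time encrypted repo setup
    borg create --stats --compression lz4 \
        /mnt/backup/borg-repo::vm-{now} /var/lib/vz/images  # deduplicated, compressed archive
    borg prune --keep-daily 7 --keep-weekly 4 /mnt/backup/borg-repo  # the pruning bup struggles with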
I run a large Proxmox cluster and have to suggest you change your backup strategy away from the built-in vzdump snapshot-style backups, which take ages, are always full (and therefore large), and make restoring individual files extremely long-winded.
Consider an 'in-guest' file backup solution, of which there are many: BackupPC, UrBackup, Bacula, Amanda, etc.
It will be much faster, consume far less space, and make it much easier to restore specific files.