I'm looking to replace an aging and decidedly suboptimal mysqldump-based database backup strategy with Percona's XtraBackup. This all looks pretty straightforward, except that I'm wondering: how many incremental backups should I keep between full backups?
I realize that restoring from a long set of incrementals would be somewhat tedious, but it looks like it'd be pretty easy to script.
I also imagine that if I somehow lose an incremental that's part of a backup set, I would lose everything from that point on. That seems like it'd be unlikely (these guys will be headed to S3), but still a bad day.
Are there rules of thumb about this sort of thing? Would hourly incrementals with one full backup a week (so, 168 files per backup set) be insane, or normal for some workloads?
FWIW, we're looking at a ~10M row database, growing at ~20k rows a day, with very few changed rows (i.e., append-mostly). So the incrementals would be pretty small.
It's hard to say what the best backup strategy is: it all depends on the value of your data and the penalty you would pay for losing the last X hours of it.
You have correctly identified a problem with incrementals: if you lose one, you lose everything after it. Another problem is that if the full backup gets corrupted, you have lost the entire backup set, period.
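To make the "lose one, lose everything after it" point concrete, here is a minimal sketch. It assumes incrementals are numbered 1..N in the order they were taken; the function name and numbering scheme are hypothetical, not anything XtraBackup provides.

```python
from typing import Iterable


def last_restorable_increment(present: Iterable[int], full_ok: bool = True) -> int:
    """Return the number of the last incremental that can still be applied.

    Incrementals are assumed to be numbered 1..N in the order they were taken.
    0 means only the full backup is usable; -1 means nothing can be restored.
    """
    if not full_ok:
        # Every incremental depends on the full backup, so nothing survives.
        return -1
    have = set(present)
    n = 0
    # Walk the chain from the full backup and stop at the first gap.
    while n + 1 in have:
        n += 1
    return n
```

With a weekly full and 168 hourly incrementals, losing incremental 50 leaves you restorable only up to hour 49, even though incrementals 51 to 168 are still sitting in S3.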
With that in mind, backup planning is a balancing act between the cost of taking backups and the cost of losing data.
If you rely on a single master database and want hourly incrementals, I would advise taking full backups as often as your link can upload them to S3. For example, if you can upload a full backup to S3 in 4 hours, take one every single day.
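A rough way to estimate the upload window is sketched below. The backup size and uplink speed in the usage note are made-up illustrations, and the calculation ignores compression, protocol overhead, and S3 request limits, so treat the result as a ballpark only.

```python
def upload_hours(backup_gb: float, uplink_mbit_s: float) -> float:
    """Hours needed to push backup_gb gigabytes through an uplink_mbit_s link.

    Uses decimal units: 1 GB = 8000 megabits.
    """
    megabits = backup_gb * 8 * 1000
    seconds = megabits / uplink_mbit_s
    return seconds / 3600
```

For instance, a hypothetical 90 GB full backup over a sustained 50 Mbit/s uplink takes about 4 hours, which fits comfortably inside a daily schedule.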
You would be much better off, though, setting up master-slave replication and backing up the binary logs too. That way, you can let a chain of incrementals run for as long as your binary logs cover it. If an incremental fails, you can restore the full backup and then replay the logs. An added benefit is that you can take the backups on the slave, so the master (and your users) pay no performance penalty.
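The restore sequence can be sketched as a small helper that only builds the shell commands and does not run them. This is a sketch under assumptions: the paths, the `restore_plan` name, and the bare `mysql` invocation are hypothetical, and the binlog start position is assumed to be the one recorded in the `xtrabackup_binlog_info` file of the last backup you applied. Check the flags against your XtraBackup version before relying on it.

```python
import shlex
from typing import List


def restore_plan(base_dir: str, incremental_dirs: List[str],
                 binlog_files: List[str], start_position: int) -> List[str]:
    """Build (but do not run) the commands for a full + incrementals + binlog restore."""
    q = shlex.quote
    plan = []
    # Prepare the full backup. --apply-log-only is used only when incrementals
    # will be applied on top of it afterwards.
    only = "--apply-log-only " if incremental_dirs else ""
    plan.append(f"xtrabackup --prepare {only}--target-dir={q(base_dir)}")
    for i, inc in enumerate(incremental_dirs):
        # Every incremental except the last one is applied with --apply-log-only.
        is_last = i == len(incremental_dirs) - 1
        only = "" if is_last else "--apply-log-only "
        plan.append(
            f"xtrabackup --prepare {only}--target-dir={q(base_dir)} "
            f"--incremental-dir={q(inc)}"
        )
    # Copy the prepared files into an empty MySQL data directory.
    plan.append(f"xtrabackup --copy-back --target-dir={q(base_dir)}")
    if binlog_files:
        # After starting mysqld, replay the binary logs from the recorded
        # position; --start-position applies to the first file listed.
        files = " ".join(q(f) for f in binlog_files)
        plan.append(f"mysqlbinlog --start-position={start_position} {files} | mysql")
    return plan
```

If an incremental in the middle of the set is missing, pass only the incrementals before the gap, then let the binary logs cover the rest.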