I have implemented a HA solution for mysql based on master-master replication. There is a mechanism on the front end part which guarantees that only one db will be read/written to at a given time (i.e. we only use replication for HA).
I have confirmed replication works as expected, but I am wondering about failure scenario and recovery. In particular, I worry about what happens when one master fails in an unrecoverable state, and needs to be recreated from the other master:
- As the other master is live and most likely used, I cannot lock it and create dumps from
mysqldump
(our databases are moderately large, andmysqldump
can easily take hours after a few months of usage). - Even assuming I have a dump, it is crucial that the binlog position as shown by SHOW MASTER STATUS correspond to the dump being done after the database has been locked.
The simple solution to the first problem is to use a third database acting as a backup, from which I can do the mysqldump
. But then how do I make sure the recreated master can start replication from the running master in a consistent way?
There are two basic approaches to this problem that I am aware of. First, if you are running InnoDB instead of Myisam, then you can do the backup in a transaction (--single-transaction --lock-tables=FALSE), which combined with --flush-logs (not required but nice) and --master-data will give you a consistent backup with replication position information. Flush logs will reset the logs before the dump is created, which means the position will always be 106, and --master-data will put the logfile name and position right in the dump file. Of course, you have to run this on the master for --master-data to work.
The second way, which you mentioned, is to use a third host to create the backups. In this case, you need to stop replication, make sure the DB is read_only (although really, all your replicas should be read only since this does not effect writes from replication) and then create your backup AND record the replication position. You can not use --master-data in this case. Instead, you might do something like this:
If you ever need to restore from this backup, you would run the restore and then setup replication where the two parameters master_log_file and master_log_pos come from the DB.replication file:
Note: you can AND SHOULD test this from another replica.
Additional note: if you have a pool of replicas (for example if you have separated reads from writes for a web app) it is possible for the replicas to be out of sync with the new master; this can happen if the failover occurs during a period of heavy write I/O since the replicas stream asynchronously and there is no guarantee that your standby is at the same position as the other replicas when you failover. However, this hasn't happened to me yet...