We are in the process of moving our MySQL DB from AWS RDS to EC2.
RDS provides us with with a solid HA solution, which we want to replicate in EC2. We've looked at MySQL Utilities and MHA, both of which present problems for us. As a result, we're thing about a DIY solution, which will work as follows:
We will have a Master and a Slave replica (Slave 1). We will have a Slave replica of the Slave (Slave 2).
Master and Slave 1 will be behind a HA Proxy cluster.
We will monitor the Master very closely.
If the Master monitor detects a failure, we will run a script that shuts down the Master EC2 instance and stops the Slave process on the replica.
Slave 1 will now become the Master, and Slave 2 will become Slave 1. The new Slave 1 will be added to the HA Proxy cluster.
We will then rebuild Slave 2 from a backup of Master (backup every 3 hours).
We obviously need to be careful to ensure that all steps in the process exit cleanly, and that we have rock-solid error handling.
Can anyone offer an opinion as to the viability of our proposed DIY solution?
I am assuming you are talking about asynchronous replication here....
While slave 1 needs to be configured to accept updates from the original master, configuring it as a master itself eliminates a step from your failover process.
Slave 2 should never be more than a few seconds behind the master - so why rebuild it from a backup which is hours behind?
Do you actually need the capacity of a 2 active + 1 passive node solution? Its adding a lot of complexity here? Does slave 2 currently do anything which can't be throttled in the event of the loss of master?
Or you could just go with multi-master replication.