A little theoretical question. I've finished setting up a bunch of 2-node MySQL 5.1 clusters under the control of several MMM instances.
We started some testing, joyously kill -9'ing the writer nodes, and all went fine, with the app chugging along, oblivious to the DBMS turmoil.
Then I thought: what if, in production, server A goes down first, server B takes over and more work is done, and finally B goes down as well?
What happens if the sysadmin then restarts the cluster from A first and only later rejoins B, while in the meantime work is done on A's outdated data?
Does MySQL have a quorum mechanism that keeps A (or even B) in Recovery mode until it has decided what is the most recent transaction to continue from?
Thanks and apologies if it's an FAQ...
This is probably more of a starter answer, to get the discussion rolling. I expect some MySQL guru to come along and give you the right answer... ;-)
For this scenario I have configured my clusters to use offset auto-increment sequences, so that when you bring the primary master back online, the data that was written to the slave will not conflict, like so:
So on the primary master
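Something along these lines in `my.cnf` (a sketch only; `auto_increment_increment` and `auto_increment_offset` are the standard MySQL server variables, but the values 10 and 1 are illustrative assumptions, and the increment should be at least as large as the number of servers):

```ini
[mysqld]
# Assumed value: step by 10 so that up to 10 servers can each own a distinct slot
auto_increment_increment = 10
# Assumed value: this server generates IDs 1, 11, 21, ...
auto_increment_offset = 1
```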
on the slave machines (each slave with a different offset)
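A sketch for one slave's `my.cnf`, again with assumed values: the increment is the same on every server and only the offset differs (2 here, 3 on the next slave, and so on):

```ini
[mysqld]
# Assumed value: must match the increment used on every other server
auto_increment_increment = 10
# Assumed value: this server generates IDs 2, 12, 22, ...
auto_increment_offset = 2
```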
This means that you can replicate the binary log from the slave back into the master to recover the data that was written there during the master outage.
If you are using MySQL 5.1+ (a version without the server-id bug), then you can configure master-master replication and have the master replicate the "lost" queries back from the slave automatically.
(Unfortunately, I think this is beyond the capabilities of MMM, but then again I haven't looked at MMM for a few years.)