I'm curious to hear strategies and methods people use for performing alters on very large tables in MySQL. Large could be any number of rows or size that would be impacting to alter. For the sake of conversation, let's say 2 million+ rows to which any alter is going to be impacting to normal application performance.
The two primary strategies I see are to either perform the alter on a slave and then promote it to be the master after it has finished, or to create a copy of the table with the intended alters already done then copy and catch up the data and do a rename to swap them out before dropping the old.
I am ideally seeking a means to do the latter. A large concern of mine here is triggers on the table being altered, and of course ensuring the data in the two tables is kept in sync before they are swapped. I am thinking the potential for error or missing data can be mitigated to some extent by making use of the read_only variable at key points in the process to make sure data is not changing when fiddling with triggers and after you have caught up all the data. I understand that would have an impact on the application using the database, but it's better than risking corrupted data.
I have been looking at utilities and strategies for doing this and there are several out there. A notable one is this one, which Facebook used as the basis for their online alters. : openark kit documentation. The process is then elaborated upon here: Thoughts and ideas for Online Schema Change
What are your experiences with either method? What pitfalls and gotcha's did you come across? Which do you prefer/suggest and why?
Percona/Maatkit also have their own: pt-online-schema-change
I have done online schema changes of tables in excess of 100 million rows using the trigger method and it works really well.
Add trigger to the old table to copy inserted data to the new table. It would look something like:
Take note of the first auto-increment id moved to the new table with the trigger.
RENAME original_table TO original_table_backup, new_table TO original_table
The Continuent folks include this ability in their Tungsten Enterprise product. For example see: https://s3.amazonaws.com/releases.continuent.com/doc/tungsten-1.3.3/html/Tungsten-Concepts-And-Administration-Guide/content/ch03s19.html