I periodically need to make changes to tables in MySQL 5.1, mostly adding columns. Very simple with the ALTER TABLE command. But my tables have up to 40 million rows now and they are growing fast, so those ALTER TABLE commands take several hours. In a couple of months they'll take days, I'm guessing.
Since I'm using Amazon RDS, I can't have slave servers to play with and then promote to master. So my question is whether there's a way to do this with minimal downtime. I don't mind an operation taking hours or even days if users can still use the db, of course... Can they at least read while columns are being added? What happens if my app tries to write? Insert or update? If it fails immediately, that's actually not so bad; if it just hangs and causes problems for the db server, that's a big problem.
This must be a fairly common scaling issue; everyone needs to add columns. What's typically done to a production db? Slave -> master migration?
Update: I forgot to mention that I'm using the InnoDB storage engine.
I just had to do this recently. What Amazon recommended was using the Percona Toolkit. I downloaded it and was able to run its pt-online-schema-change tool with something like the following (the database, table, and column names here are placeholders):
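    pt-online-schema-change --alter "ADD COLUMN new_col INT" \
      D=mydb,t=mytable --execute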
and it works great. It tells you how much time remains in the process.
It actually creates a new table with the new column and then copies the existing data over. Further, it creates triggers so that new data is also pushed over to the new table. It then renames the tables automagically, drops the old table, and you're up and running with the new column and no downtime while you wait for the updates.
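For a sense of what the tool is doing under the hood, here is a rough hand-rolled sketch of the same steps, assuming a hypothetical table mytable(id INT PRIMARY KEY, val VARCHAR(32)). The real tool also copies in small chunks and handles UPDATE and DELETE with additional triggers, which this omits:

    -- build a copy of the table with the new column
    CREATE TABLE mytable_new LIKE mytable;
    ALTER TABLE mytable_new ADD COLUMN new_col INT NULL;

    -- push writes made during the copy into the new table as well
    CREATE TRIGGER mytable_after_ins AFTER INSERT ON mytable
    FOR EACH ROW REPLACE INTO mytable_new (id, val) VALUES (NEW.id, NEW.val);

    -- backfill existing rows (the tool does this in chunks);
    -- IGNORE skips rows the trigger already copied
    INSERT IGNORE INTO mytable_new (id, val) SELECT id, val FROM mytable;

    -- atomic swap, then clean up
    RENAME TABLE mytable TO mytable_old, mytable_new TO mytable;
    DROP TRIGGER mytable_after_ins;  -- it moved with the renamed table
    DROP TABLE mytable_old;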
Don't. No really. Just don't. It should be a very rare occasion when this is ever necessary.
Assuming your data really is normalized to start with, the right way to solve the problem is to add a new table with a 1:1 relationship to the base table (non-obligatory on the new table).
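As a rough sketch (table and column names are hypothetical), such an extension table shares the base table's primary key:

    -- the base table stays untouched; new attributes live in a satellite table
    CREATE TABLE users_extra (
        user_id INT NOT NULL PRIMARY KEY,   -- same key as the base table
        new_col INT NULL,
        FOREIGN KEY (user_id) REFERENCES users (id)
    ) ENGINE=InnoDB;

Rows exist here only for users that have the new attribute, which is what makes the relationship non-obligatory on the new table.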
Having to add columns regularly is usually an indicator of a database which is not normalized - if your schema is not normalized then that's the problem you need to fix.
Finally, if your schema really, really is normalized and you really, really must keep adding columns then:
symcbean provides some solid recommendations.
To answer your question, the easiest and best way to mitigate impact is to have multiple replicating databases: dual master with an appropriate failover procedure. You stop replication on the active master, which allows you to run the alteration on the inactive one without impacting the active.
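A hedged sketch of that procedure (table and column names are hypothetical, and this assumes replication privileges that plain RDS does not give you):

    -- on the ACTIVE master: stop applying the inactive master's binlog
    STOP SLAVE;

    -- on the INACTIVE master: run the long alteration; the active is unaffected
    ALTER TABLE mytable ADD COLUMN new_col INT NULL;

    -- once the inactive finishes and catches up on replication, fail the
    -- application over to it, then START SLAVE on the old active so the
    -- ALTER replicates there while it is out of service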
You could potentially do this on a single live database and minimize impact by using a procedure similar to the one I detailed in this answer. Admittedly, this is similar to what symcbean described, but includes technical details. You could use an auto_increment field as well, not just a timestamp.
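The core of such a procedure is copying rows into the new table in small key ranges so that no single statement holds locks for long. A sketch using a hypothetical auto_increment key id (you would advance the bounds in a loop until the copy catches up with the live table):

    -- copy one chunk of rows; repeat with the next id range until caught up
    INSERT IGNORE INTO mytable_new (id, val)
    SELECT id, val FROM mytable
    WHERE id > 0 AND id <= 10000;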
Ultimately, if your data set is growing so large, you also need to consider archival between OLTP and OLAP databases. Your transaction dataset should not need to be so large if you design appropriately.
From the manual: http://dev.mysql.com/doc/refman/5.1/en/alter-table.html
In most cases, ALTER TABLE works by making a temporary copy of the original table; the original stays readable by other sessions while the alteration runs, but updates and writes to it are stalled until the new table is ready, and are then redirected automatically.
So, reading will work fine. Writes will be stalled, but executed afterwards. If you want to prevent this, you'll have to modify your software.
I am in a similar situation where I have to alter one of my transaction tables, which is almost 65 GB. I have heard of two solutions: