I'm running a web application hosting service (basically hosting forums for free), and I have two remote servers at my disposal. The application code is stored on both servers and isn't a problem, but I'm wondering how to deal with the databases.
When someone visits a site at *.example-host.com, they are sent to one of the two servers, and both must be capable of loading the forums from a database. The database must also be writable, for when new members register or post topics, etc.
The main requirement is speed, but uptime is also important (if a server goes down, the site should still work).
I have a few options, but I'm inexperienced and not sure which to go with:
1) [PHP] Split the forum records 50:50 between the two servers. If a server does not have the record for a requested forum, it can fetch it from the other server over a remote MySQL connection and load it. This idea sounded okay, until I realised that 50% of the time users would be waiting significantly longer for pages to load. I also realised that if one of the servers went down, half the forums would be inaccessible and registrations would have to be disabled.
2) [MySQL] Dual master replication. This would attempt to mirror the two databases and sounds perfect, but I've heard that it can be very problematic. I don't know how fast this is.
3) [MySQL] Use standard replication: distribute read-only queries across both nodes and send read/write queries to the master. This sounds like a good option, but again, I'm not sure about speed. I also don't know what would happen if the master server went down.
If you have any other suggestions, please post them :)
Solution 1 is close to sharding, but the entire architecture needs to be considered and designed to accomplish that well. Sharding typically comes up in large-scale installations that are pushing the limits of their technology platforms.
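To make that concrete, here is a minimal sketch of what the routing in solution 1 might look like. The hostnames, credentials, and the forum-to-shard rule are all placeholder assumptions, not a recommendation:

```php
<?php
// Hypothetical shard map: even forum IDs live on server A, odd on server B.
// Hostnames and credentials are placeholders.
$shards = [
    0 => ['host' => 'db-a.example-host.com', 'user' => 'forums', 'pass' => 'secret'],
    1 => ['host' => 'db-b.example-host.com', 'user' => 'forums', 'pass' => 'secret'],
];

function shard_for_forum($forumId) {
    // Simple modulo split; a real scheme needs a stable mapping
    // so records never move if servers are added later.
    return $forumId % 2;
}

$forumId = 42;
$cfg = $shards[shard_for_forum($forumId)];

// If $cfg points at the remote server, this connection crosses the link
// between the two sites -- exactly the latency the asker is worried about.
$db = new mysqli($cfg['host'], $cfg['user'], $cfg['pass'], 'forums');
$stmt = $db->prepare('SELECT title FROM forums WHERE id = ?');
$stmt->bind_param('i', $forumId);
$stmt->execute();
```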
Solution 2, dual-master replication, would be applicable, but as your links are physically separate, it would be risky to have the application dynamically point at either database. You would want to pick one database and, if it failed, manually repoint the application to the other. Automatic failover of the application introduces the risk of split brain. You could take nightly snapshots of the secondary database for backups.
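As a hedged sketch of what "manually repoint" can mean in practice: keep the write master's address in a single config value and change it by hand during a failover, after a human has confirmed the old master is really down (hostnames again are placeholders):

```php
<?php
// config.php -- hypothetical single point of truth for the write master.
// During a failover, an operator edits this value by hand; the application
// itself never decides to switch masters, which avoids split brain.
define('DB_PRIMARY_HOST', 'db-a.example-host.com'); // change by hand on failover
define('DB_USER', 'forums');
define('DB_PASS', 'secret');
define('DB_NAME', 'forums');

$db = new mysqli(DB_PRIMARY_HOST, DB_USER, DB_PASS, DB_NAME);
if ($db->connect_errno) {
    // Fail loudly rather than silently writing to the other master.
    die('Primary database unreachable: ' . $db->connect_error);
}
```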
As described in solution 3, replication is often used to distribute the read-only load across different database servers. It also lets you use different engines and configurations for the read queries; for example, MyISAM can be quicker for read-only queries.
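For illustration, a minimal read/write split in PHP. The hostnames are assumptions, and a real deployment would also handle reconnects and read-after-write consistency:

```php
<?php
// Hypothetical hosts: writes go to the master, reads to the slave that is
// local to whichever web server handled the request.
$master = new mysqli('db-master.example-host.com', 'forums', 'secret', 'forums');
$slave  = new mysqli('localhost', 'forums', 'secret', 'forums');

// Writes (registrations, new posts) must always hit the master.
$forumId = 42;
$body    = 'Hello!';
$stmt = $master->prepare('INSERT INTO posts (forum_id, body) VALUES (?, ?)');
$stmt->bind_param('is', $forumId, $body);
$stmt->execute();

// Reads can be served by the local slave, avoiding the cross-site link.
$res = $slave->query('SELECT body FROM posts WHERE forum_id = 42 ORDER BY id DESC LIMIT 20');
```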
Replication is typically subject only to physical hardware limitations, be it network or system resources. Unless you are storing binary data in the database on a large scale, I would not worry about replication delays under normal load.
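If you do want to verify that, replication lag is visible via `SHOW SLAVE STATUS`. A sketch, assuming an account with the REPLICATION CLIENT privilege; the 30-second threshold is an arbitrary example:

```php
<?php
// Check how far the local slave is behind the master before trusting it
// for reads. Requires the REPLICATION CLIENT privilege.
$slave = new mysqli('localhost', 'monitor', 'secret');
$res = $slave->query('SHOW SLAVE STATUS');
$row = $res ? $res->fetch_assoc() : null;

$lag = isset($row['Seconds_Behind_Master']) ? $row['Seconds_Behind_Master'] : null;
if ($lag === null || (int)$lag > 30) { // NULL means replication is stopped
    // Fall back to the master for reads, or alert an operator.
    error_log('Slave lagging or stopped (lag: ' . var_export($lag, true) . ')');
}
```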
Since your main requirement is speed, I would first focus on the local system configuration and resources. Chances are substantial optimizations can be made there.
Automatic high-availability solutions are typically best localized to a single physical environment; in case of extreme failure, manual procedures can be applied to bring the physically separate site online.
I'm generalizing based on a LAMP stack and focusing on web applications. Different applications, protocols, and technologies change things a bit, but in regards to web servers and databases, what I describe is more generally applicable.