I'm new to handling the infrastructure for production service deployments. My intuition tells me that if I want my service to be "up" as much as possible, yet can only afford, say, 2 dedicated servers (startup time!), I should make one server a redundant copy of the other and then set up failover, replication, etc.
However, after reading some case studies and even hearing that Stack Overflow and OK Cupid only have a single database server, perhaps I'm overthinking things?
I kind of hate having to spend say $250/mo. on a leased server that acts as a backup just in case.
I know this all depends on the service you provide, but come on, Stack Overflow must be important enough to require a redundant database.
OK, enough rambling. What am I missing? Help! Thanks.
Try to estimate the chance of your server failing. Also figure out how long it would take you to get a replacement and restore your backups; that is how long you will be down. The price of the second server plus the time spent setting up redundancy is what you pay to reduce that risk. Is that price worth it to your company and service, or would the money be better spent elsewhere?
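To make that trade-off concrete, here is a rough sketch of the arithmetic in Python. Every number in it is a made-up assumption (failure rate, restore time, failover time, cost of an hour of downtime), and the function names are hypothetical; plug in your own estimates.

```python
def expected_downtime_cost(failures_per_year, hours_down_per_failure, cost_per_hour_down):
    """Expected yearly cost of outages: frequency x duration x hourly cost."""
    return failures_per_year * hours_down_per_failure * cost_per_hour_down


def redundancy_net_benefit(failures_per_year, restore_hours, failover_hours,
                           cost_per_hour_down, redundancy_cost_per_year):
    """Yearly downtime cost avoided by a standby, minus what the standby costs.

    A positive result means the standby pays for itself under these estimates.
    """
    without_standby = expected_downtime_cost(
        failures_per_year, restore_hours, cost_per_hour_down)
    with_standby = expected_downtime_cost(
        failures_per_year, failover_hours, cost_per_hour_down)
    return (without_standby - with_standby) - redundancy_cost_per_year


if __name__ == "__main__":
    # Assumed inputs: one failure every two years, 8 hours to rebuild from
    # backups, 15 minutes to fail over, $500 lost per hour of downtime,
    # and a $250/mo standby server.
    net = redundancy_net_benefit(
        failures_per_year=0.5,
        restore_hours=8,
        failover_hours=0.25,
        cost_per_hour_down=500,
        redundancy_cost_per_year=250 * 12,
    )
    print(f"Net yearly benefit of the standby: ${net:,.2f}")
```

With these particular guesses the standby loses money (about $1,062 a year), but the answer flips quickly if an hour of downtime costs you more or a restore takes longer. It also ignores the time you spend setting up and maintaining replication.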
Remember, if both servers are in the same place, on the same power, network equipment, etc., they might still both go down. Problems with the database itself (corruption, a bad query) can also be replicated to the second server, so it can still go down. So the real question is how much you are willing to pay for device-level redundancy.
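A quick way to see why shared power and network limit what a second server buys you is the sketch below. It assumes the two servers fail independently of each other and that everything they share (power, switch, datacenter) is a single point of failure with its own availability; the figures and function names are illustrative only.

```python
def single_availability(server_availability, shared_availability):
    """One server behind the shared infrastructure: both must be up."""
    return shared_availability * server_availability


def pair_availability(server_availability, shared_availability):
    """Two redundant servers behind the same shared infrastructure.

    The service is up when the shared part is up and at least one server is up.
    Assumes the two servers fail independently of each other.
    """
    both_servers_down = (1 - server_availability) ** 2
    return shared_availability * (1 - both_servers_down)


if __name__ == "__main__":
    server = 0.99    # assumed availability of each server
    shared = 0.999   # assumed availability of shared power/network
    print(f"one server:  {single_availability(server, shared):.5f}")
    print(f"two servers: {pair_availability(server, shared):.5f}")
```

However many servers you add, the result can never exceed the availability of whatever they share, which is why device-level redundancy on its own only goes so far.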
A lot of newer companies use clusters of cheaper servers instead of just one or two "big" servers to keep costs down. If your application supports clustering, this also gives you an easy way to double or triple your capacity by just spinning up more instances. Many people use Amazon in exactly this manner because it is really easy to start another instance when you need one (and, of course, to shut one down when it isn't needed, if your volume is highly dynamic). If you have 2 "cheap ones" running in parallel at all times, a failure on one will only impact you until you can start another.
SO has multiple database servers. They have a backup slave as far as I know.
http://blog.stackoverflow.com/2010/02/thermal-event-at-datacenter/
I would be incredibly surprised if OK Cupid didn't have at least one redundant database server.