I am trying to implement automatic failover for an ASP.NET MVC website running on Windows 2008 R2 Web Edition / SQL Server 2008 R2 Web Edition. I know that this is not supported out of the box, but I have to make do with the software licenses which are currently at my disposal.
By automatic failover I mean that in the event of a disaster the secondary server should kick in with all data from the primary server's database. If that's not an option, I'm also happy with "all data less whatever has been written to the DB during the last 5 minutes".
I'm happy to do the switch back to the primary server if necessary.
Here's what I've come up with so far:
Set up log shipping from the primary server's DB to the secondary server with a very small time interval (5 minutes). The DB is not too big (<50 MB) and there are few writes, so hopefully this won't have a big performance impact. I'd prefer mirroring, but that is not supported by SQL Server Web Edition.
Always deploy the web site to both servers
Use DnsMadeEasy (dnsmadeeasy.com) to monitor the primary server and dynamically switch to secondary if primary goes down
Run a service task on secondary that checks the DNS entry every minute. If it points to secondary, then try to trigger one more log shipping run (in case the primary web server is down, but the primary DB is still running). Last but not least, deactivate log shipping.
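To make the last step concrete, here is a rough Python sketch of one pass of that monitoring task. The hostname, the IP address and the on_failover callback are placeholders I made up for illustration; the callback would hold the "one more log shipping run, then deactivate log shipping" logic. It also assumes the secondary's resolver sees the DNS change promptly, which depends on the record's TTL and any local caching.

```python
import socket
from typing import Callable


def points_to_secondary(hostname: str, secondary_ip: str,
                        resolve: Callable[[str], str] = socket.gethostbyname) -> bool:
    """Return True if hostname currently resolves to the secondary server."""
    try:
        return resolve(hostname) == secondary_ip
    except OSError:
        # Resolution failed: report "no change" so a DNS hiccup alone
        # does not trigger a failover.
        return False


def check_once(hostname: str, secondary_ip: str, on_failover: Callable[[], None],
               failed_over: bool = False,
               resolve: Callable[[str], str] = socket.gethostbyname) -> bool:
    """Run one monitoring pass and return the new failed_over state."""
    if failed_over:
        return True  # failover actions run only once
    if points_to_secondary(hostname, secondary_ip, resolve):
        on_failover()  # hypothetical hook: final log restore, stop log shipping
        return True
    return False
```

A scheduled task or a small service would call check_once every minute and keep the returned flag (in a file, for example) so the failover actions are not repeated.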
What do you think? Is there any easier way to do this (short of using a different DB or a cloud solution)?
Thanks,
Adrian
According to the information on high availability located here, the only option for the Web edition is log shipping. (Replication isn't an option either, though I would only suggest that in limited cases because it is very intrusive with respect to the database schema for the app.)
That being said, log shipping can be perfectly great if you can tolerate a few minutes' worth of data loss.
"Automatic", however, is dicey. Some things to keep in mind: Log shipping failover isn't automatic. The "Monitoring Server" just monitors, it does not take action when there is a problem. Something or someone will have to decide that the original server is down and that the databases on the failover server should be brought out of recovery. You could write a program to do this, with the risk that that brings, or rely on humans if you have 24x7 monitoring, which makes a five minute downtime window very uncomfortable.
Whatever jobs you have (be they 'admin' stuff like backups and reindexing or something specific to your app) should exist on both servers, but they should not run on the failover server until you need it. If you have a large number of jobs, you will want some kind of automated way to enable/disable them. There was an article on this just recently; I can't find it right now, but some googling should turn up something to give you ideas.
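As a rough illustration of the two failover actions described so far (bringing the database out of recovery and enabling the jobs), here is a minimal Python sketch that only builds the T-SQL. RESTORE DATABASE ... WITH RECOVERY and msdb.dbo.sp_update_job are standard SQL Server commands; the function names, the database name and the job names are made up for the example. How the statements get executed (sqlcmd, a database driver, a SQL Agent job) is left open.

```python
from typing import Iterable, List


def quote_name(name: str) -> str:
    """Bracket-quote a SQL Server identifier, doubling any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def quote_literal(value: str) -> str:
    """Quote a Unicode string literal, doubling any single quote."""
    return "N'" + value.replace("'", "''") + "'"


def recovery_statement(database: str) -> str:
    """T-SQL that brings a log-shipped database out of the restoring state."""
    return f"RESTORE DATABASE {quote_name(database)} WITH RECOVERY;"


def job_toggle_statements(job_names: Iterable[str], enabled: bool) -> List[str]:
    """One sp_update_job call per SQL Agent job to enable or disable it."""
    flag = 1 if enabled else 0
    return [
        f"EXEC msdb.dbo.sp_update_job @job_name = {quote_literal(name)}, @enabled = {flag};"
        for name in job_names
    ]


def failover_statements(database: str, job_names: Iterable[str]) -> List[str]:
    """Statements to run on the failover server, in order."""
    return [recovery_statement(database)] + job_toggle_statements(job_names, True)


if __name__ == "__main__":
    # Hypothetical database and job names.
    for statement in failover_statements("Shop", ["Nightly Backup", "Reindex"]):
        print(statement)
```

Keeping the list of job names in one place means the same helper can disable them again (enabled=False) once you fail back.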
Ditto for packages, linked servers and any other server-level stuff. These objects should always be on both servers, and you can't "log ship" changes to these objects from the primary server to the failover server. With complicated server configurations, keeping things identical on two servers can be a task requiring significant time, or some software.
You will want to be sure the login names, passwords and SIDs match on both servers. If not, your web app may not be able to access the database on the failover server when it is brought up. This is easier if you are using domain security for your app's login because you don't need to worry about SIDs. If you are using SQL Server security, you need to be more careful.
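One way to catch SID drift before a failover is to compare the logins on both servers. The sketch below assumes you have already pulled name/SID pairs from each server, for example with SELECT name, sid FROM sys.sql_logins, into plain dictionaries; the function name and the sample data are made up, and names are compared exactly as given.

```python
from typing import Dict, List, Tuple


def compare_logins(primary: Dict[str, bytes],
                   secondary: Dict[str, bytes]) -> Tuple[List[str], List[str]]:
    """Compare SQL logins by SID.

    Returns (missing, mismatched): logins absent from the secondary, and
    logins present on both servers whose SIDs differ.
    """
    missing = sorted(name for name in primary if name not in secondary)
    mismatched = sorted(
        name for name in primary
        if name in secondary and primary[name] != secondary[name]
    )
    return missing, mismatched


if __name__ == "__main__":
    # Hypothetical sample data; real SIDs are longer binary values.
    primary = {"webapp": b"\x01\x02", "report": b"\x0a"}
    secondary = {"webapp": b"\x01\x03"}
    missing, mismatched = compare_logins(primary, secondary)
    print("missing on secondary:", missing)
    print("SID mismatch:", mismatched)
```

Anything that shows up in either list needs fixing on the failover server before the web app can rely on it.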
The primary server and the failover server will have different host names. You will need a way to change the data source(s) for your web application from SQLServer A to SQLServer B. We use DNS CNAMEs, so we can update an entry in the DNS rather than updating files on the web server(s).
Naturally, you want to test this before you really need it.
If you have a scheduled outage for the once-a-month Windows patches, you should integrate a manual fail over into the process for patching the servers. This will minimize downtime for the app, test the fail over process and provide a realistic drill for whoever does the work.
You will probably fail over more times for patching Windows and similar whatnot than you fail over due to an actual emergency.
Manually failing over tests your process, so you know it will work during a real emergency. It also forces you to go through setting the log shipping back up in the other direction, which can take a significant amount of time if you have a large database.