This is a rehash of a question I asked a while back. A consultant has come in firing ideas into other teams in the department, the whole issue has been raised again, and I'm now looking for more detailed answers.
We're intending to set up a multi-instance SQL cluster across a number of physical blades, which will run a variety of different systems across the SQL instances. In normal operation there will be one virtual SQL instance running on each VM host, and each VM host will run on a dedicated underlying blade. The set-up should give us lots of flexibility for maintenance of any individual VM or underlying blade, with all the SQL instances able to fail over as required.
My original plan had been to do the following:
- Install Windows Server 2008 R2 on each blade
- Add the Hyper-V role to each blade
- Install a 2008 R2 VM on each blade
- Within the VMs, create a failover cluster and then install the clustered SQL Server instances.
The consultant has suggested that we instead do the following:
- Install Windows Server 2008 R2 on each blade
- Add the Hyper-V role to each blade
- Install a 2008 R2 VM on each blade
- Create a failover cluster on the HOST machines, which will host all the VMs.
- Within the VMs, create a failover cluster and then install the clustered SQL Server instances.
The big difference is the addition of step 4, whereby the hosts are clustered as well, so the guest VMs themselves become highly available cluster resources. The argument is that it improves maintenance further, since there are no ties at all between the SQL cluster and the physical hardware. In theory we can live migrate the guest VMs around the hosts without affecting the SQL cluster at all, so for routine maintenance of the physical blades we can move the SQL cluster around without interruption and without needing to fail over.
It sounds like a nice idea but I've not come across anything on the internet where people say they've done this and it works OK. Can I actually do the live migrations of the guests without the SQL Cluster hosted within them getting upset?
Does anyone have any experience of this set up, good or bad? Are there some pros and cons that I've not considered?
I appreciate that mirroring is also a valuable option to consider - in this case we're favouring clustering since it covers the whole of each instance and we have a good number of databases. Some DBs belong to lumbering 3rd-party systems that may not take kindly to mirroring (and my understanding of clustering is that failovers are completely transparent to the clients).
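When we get as far as trialling this, my plan is to measure the client-visible gap directly rather than take "transparent" on faith. Something like the rough Python sketch below, run against the clustered instance's network name while we force a failover, should show how long clients actually lose connectivity (the server name and port are placeholders for our environment):

```python
# Rough availability probe (placeholder names): poll the clustered SQL instance's
# network name on TCP 1433 once a second and report how long any outage lasts.
import socket
import time
from datetime import datetime

HOST = "SQLCLUSTER01"   # placeholder: SQL network name of the clustered instance
PORT = 1433             # default instance port; a named instance may differ

outage_started = None
while True:
    try:
        with socket.create_connection((HOST, PORT), timeout=2):
            if outage_started is not None:
                print(datetime.now(), "back online after",
                      round(time.monotonic() - outage_started, 1), "seconds")
                outage_started = None
    except OSError:
        if outage_started is None:
            outage_started = time.monotonic()
            print(datetime.now(), "connection refused or timed out")
    time.sleep(1)
```

A TCP connect only proves the listener is up, not that the databases have recovered, but it gives a worst-case feel for how "transparent" a failover really is to clients.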
Thanks.
If those blades will be fully dedicated to running SQL Server only, why are you even bothering with virtualization?
Why don't you simply install Windows Server and SQL Server on each of them and set up your cluster accordingly, without the additional virtualization overhead?
Sounds complicated.
I would weigh the "complexity" of your solution against the reliability and relative simplicity of a bog-standard physical clustered SQL Server implementation.
Are ALL the databases mission critical? In my experience, usually not, so I tend to host the most important databases on servers that are set up with full resilience and the rest (usually a large proportion) on simple SQL servers.
This allows you to concentrate on keeping the most important systems up and running, and not trying to keep "all the balls in the air at the same time".
All servers will need regular routine maintenance. Given the regularity, severity and importance of security patching, we've moved away from trying to maintain a theoretical five nines of uptime (roughly five minutes of downtime a year, which the users liked but didn't really NEED) to a more realistic "we will keep the servers safe and secure - but there WILL be short MANDATORY maintenance windows to allow us to keep servers properly patched."
Of the listed options I'd pick #2, but I wouldn't cluster SQL (step 5) because it adds a layer of complexity you don't gain much from. The Hyper-V clustering will already allow you to run the VM on either host, so you're covered for hardware failures.
I assume you're planning to use fixed-size VHDs for the SQL log and database volumes.
I completely understand the comments from others about skipping Hyper-V altogether and just using the 2 blades as a normal SQL cluster - that would certainly be the traditional approach. However, the flexibility advantage of virtualizing workloads is huge for maintenance, upgrades, and hardware failures. The portability of VMs is very appealing.
Note though that the value of this solution also depends on your environment. If you have no other Hyper-V servers and your staff aren't too experienced with Hyper-V, virtualizing one of your most critical workloads might not be a good idea. However, if you're like many IT shops that started by virtualizing less critical servers, have built up a few hosts, and have the skills and procedures to reliably run Hyper-V, then expanding that focus to more critical workloads is completely reasonable. Personally, I'd rather manage clustering at the host level than at the SQL level, and I think we'll see this done more and more, though it isn't yet common.
Finally, on your questions about running SQL on Hyper-V: yes, live migration works fine with SQL and SQL won't notice it - and SQL database mirroring is great, but you're right that it isn't universally supported, so it doesn't fit every situation.
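If you want to prove that to yourself before trusting production to it, a quick probe along the lines of the rough sketch below (all connection details are placeholders, and it assumes pyodbc on a test client) will show whether any queries fail during a live migration. SERVERPROPERTY('ComputerNamePhysicalNetBIOS') should keep returning the same node name for the whole migration; it only changes if the guest cluster actually fails over:

```python
# Rough sketch (placeholder connection details): run a trivial query every second
# during a live migration, recording failures and which cluster node answered.
# The node name should stay constant through a live migration and only change on
# a genuine SQL failover.
import time
import pyodbc

CONN_STR = ("DRIVER={SQL Server Native Client 10.0};"  # placeholder driver
            "SERVER=SQLCLUSTER01;DATABASE=master;"     # placeholder SQL network name
            "Trusted_Connection=yes;")

while True:
    try:
        conn = pyodbc.connect(CONN_STR, timeout=2)
        node = conn.cursor().execute(
            "SELECT SERVERPROPERTY('ComputerNamePhysicalNetBIOS')").fetchone()[0]
        print(time.strftime("%H:%M:%S"), "OK, served by node", node)
        conn.close()
    except pyodbc.Error as exc:
        print(time.strftime("%H:%M:%S"), "query failed:", exc)
    time.sleep(1)
```

Opening a fresh connection each iteration is deliberate: it exercises the login path, which is the part clients actually feel when something does go wrong.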
The advantage of your consultant's setup is that hardware-level failover on its own won't protect you against a faulty service or the like; it's always good to have redundancy in both the hardware and the software. So if you ask me, I'd go for option 2 (assuming you have the budget and the knowledge for it, of course).
For the NIC issue, you should be able to solve that with teamed NICs. I don't know what your hardware is, but there have been a lot of issues like that with the Dell PowerEdge 1950s, 2950s, R710s, etc., where the NIC becomes receive-only and doesn't properly send traffic under Hyper-V. The latest drivers and a correct teaming configuration will not only provide additional redundancy but will increase performance as well.
Really, Option 1 and Option 2 are very nearly the same thing. The biggest questions are how you want your data to be accessed and what your SAN is designed for. SQL 2008 can actually see performance increases when run virtualized across Cluster Shared Volumes with SQL clustered on top, the reasoning being that SQL is smart enough to offload its work across multiple SQL nodes. This not only gives you a big boost in performance (think of when Intel first came out with true hyper-threading) but also improves your infrastructure performance on high-overhead networks by being able to split up the traffic and use packet redirection.
I think your consultant has it right. I have the exact same idea as he does, which I wish to implement in my current environment: two physical pieces of hardware, each running Hyper-V with 2x W2K8 installations.
This gives you a complete level of failover clustering and complete HA in your SQL environment.
Or maybe my idea is stupid?