I've been reading up on Hyper-V and how it manages virtual networking, and I think I've spotted a problem with a small cluster that our networks team has put together for me to host a SharePoint farm.
The servers in question have two NICs: one for standard LAN traffic and the other for communication with the SAN storage.
On a default installation, when the Hyper-V role is enabled and a Virtual Network is created for each NIC, my understanding is that the following should happen:
- A virtual switch is created.
- A vNIC with the same name as the vSwitch is created in the parent partition.
- The host unbinds its protocols from the physical NIC and binds them to the new vNIC.
The result is that the host OS uses the same virtual networking path as the child VMs, and any new guest VMs can attach their own vNICs to the appropriate vSwitch.
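For anyone wanting to check the same thing, this is a rough way to see where the parent partition is bound. It's only a sketch and assumes hosts recent enough to have the Hyper-V and NetAdapter PowerShell modules (Server 2012 or later); on older hosts the same information is in the adapter bindings under Network Connections or via the nvspbind tool.

# External vSwitches and the physical NICs they are bound to
Get-VMSwitch | Select-Object Name, SwitchType, NetAdapterInterfaceDescription, AllowManagementOS

# Host (management OS) vNICs - with a default external switch there should be
# one per switch, and the host's TCP/IP should ride on it, not on the physical NIC
Get-VMNetworkAdapter -ManagementOS | Select-Object Name, SwitchName, MacAddress

# Adapters that still have TCP/IP bound - a physical NIC attached to an external
# vSwitch would normally show ms_tcpip / ms_tcpip6 disabled here
Get-NetAdapterBinding -ComponentID ms_tcpip, ms_tcpip6 |
    Where-Object Enabled |
    Select-Object Name, DisplayName, ComponentID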
In the case of the servers I'm looking at, on each host machine the vNICs have been unbound and the host has been bound back to the physical NICs.
Everything works as far as I can tell. The set-up has been replicated across all blades in the cluster, so Live Migration etc. works fine.
Am I correct in my understanding of Hyper-V networking as I've briefly summarised above? (If I'm wrong you can ignore the rest of the post...!)
If so, given the changes that have been made in this set-up, are there any potential problems that may arise with the host machines bound to the physical NICs?
We have an intermittent issue with network activity freezing on random VMs at random times; the fix is either to restart the VM or to live migrate it to another host. Another intermittent issue is that hosts occasionally BSOD and restart, and the crash dumps seem to point to a problem in a network driver. So far I've been told that the issue is probably just drivers or BIOS settings, and various things have been tried without success to stop the problems. I'm wondering whether our problems could be related to the hosts binding to the physical NICs when they should be bound to the vNICs instead.
As far as the BSODs are concerned, the following hotfixes (the ones relevant to your situation) will need to be applied:
http://technet.microsoft.com/en-us/library/ff394763(WS.10).aspx
Are you really running with just two NICs, and are they 10 Gb? If not, your cluster isn't running in a recommended configuration. True, it will pass the cluster validation test with warnings, but it will be less than optimal.
If you truly have one NIC for Live Migration, CSV, heartbeat, VM traffic and management, then it's configured the only way possible and the host must share the NICs with the VMs.
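To illustrate the difference, here is a sketch using the Hyper-V cmdlets from Server 2012 or later (on 2008 R2 the same choice is the "Allow management operating system to share this network adapter" checkbox in Virtual Network Manager, and the adapter names below are only examples):

# With only two NICs the parent partition has no choice but to share the LAN NIC with the VMs
New-VMSwitch -Name "External" -NetAdapterName "LAN NIC" -AllowManagementOS $true

# With enough NICs you dedicate one purely to VM traffic and keep the parent partition off it
New-VMSwitch -Name "VM Traffic" -NetAdapterName "VM NIC" -AllowManagementOS $false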
Another issue: was this cluster built with Hyper-V installed and then sysprepped? If so, then you have MAC collisions.
http://blogs.technet.com/b/jhoward/archive/2008/07/15/hyper-v-mac-address-allocation-and-apparent-network-issues-mac-collisions-can-cause.aspx
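A quick way to compare the dynamic MAC pools is sketched below; it needs the Hyper-V PowerShell module (Server 2012 or later) and the node names are made up. On 2008 R2 the same range lives in the registry under HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Virtualization, as John Howard's post describes. If the nodes were sysprepped with the role already installed you will typically see identical ranges on every host.

# Compare the dynamic MAC address range across the nodes
"HV-NODE1", "HV-NODE2", "HV-NODE3" | ForEach-Object {
    Get-VMHost -ComputerName $_ |
        Select-Object ComputerName, MacAddressMinimum, MacAddressMaximum
}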
No, you are doing it OK. In a Hyper-V cluster you can bind a physical network adapter to a virtual network with the same name on all the nodes. You should do this, and it is what Microsoft recommends.
However, you are using the NICs for a lot of things, and that could be generating your problems. Keep in mind that using only two network adapters for this scenario is not the best option and could lead to unexpected behaviour.
Based on the Microsoft recommendations, you should have the following networks on all the Hyper-V nodes:
- Management
- Live Migration
- CSV / cluster heartbeat
- VM traffic
- Storage (iSCSI), if applicable
Try to have them all. Remember to name them exactly the same on all your nodes.
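If it helps, a sketch of the renaming using the NetAdapter cmdlets from Server 2012 or later (on 2008 R2 just rename the connections in Network Connections; the names below are only examples). Run the same commands on every node:

# Give the adapters identical names on every node so the cluster networks match up
Rename-NetAdapter -Name "Local Area Connection"   -NewName "Management"
Rename-NetAdapter -Name "Local Area Connection 2" -NewName "iSCSI"
Rename-NetAdapter -Name "Local Area Connection 3" -NewName "Live Migration"
Rename-NetAdapter -Name "Local Area Connection 4" -NewName "CSV"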
Remember that NIC teaming on Hyper-V clusters is not supported by Microsoft, and it could generate failures. There is more information in Microsoft KB 968703.
Read this article about network best practices to learn which protocols you should enable on each NIC and how to set the metrics. It's highly recommended to change the metric on each interface.
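A sketch of the metric change with the NetTCPIP cmdlets from Server 2012 or later (on 2008 R2 use "netsh interface ipv4 set interface" instead). The interface aliases and values are only examples; the idea is that the management/LAN interface keeps the lowest metric:

# Force the interface order instead of leaving it automatic
Set-NetIPInterface -InterfaceAlias "Management"     -InterfaceMetric 10
Set-NetIPInterface -InterfaceAlias "Live Migration" -InterfaceMetric 20
Set-NetIPInterface -InterfaceAlias "CSV"            -InterfaceMetric 30

# Check the result
Get-NetIPInterface -AddressFamily IPv4 | Sort-Object InterfaceMetric | Select-Object InterfaceAlias, InterfaceMetric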