I have a pair of Dell PowerEdge R710 servers (2x Xeon X5690, 64GB RAM, 8x Broadcom BCM5709C NICs) running Windows Server 2008 R2 Datacenter. They're backed by a Dell PowerVault MD3200 with 3.6TB in RAID 10 (effectively 1.8TB), attached via SAS HBAs (not iSCSI!). These were set up in a Hyper-V Failover Cluster.
I'm rebuilding my cluster because of some issues that occurred. As part of the rebuild I've also upgraded all firmware on the servers (not that it matters, but I'm putting it out there).
Since I'm setting up a "fresh" cluster, I'd like to try to fine-tune the network connections a little better this time around. I've spent countless days researching best practices for my hardware, but I'm still left with questions.
Before I get to the question(s), some background: the NICs are all running firmware 7.4.8 and driver 7.4.14.0. They are split into three teams (Public, Private, and Virtual Machines), all of type Smart Load Balancing and Failover.
- Public Team (2x NICs - used for host network access)
- Private Team (2x NICs - used for failover cluster, live migration, etc.)
- Virtual Machines Team (3x NICs - used for VM network access)
There is also one physical NIC bound to its own vSwitch in Hyper-V; it's allocated only to our web server because it carries that server's dedicated connection.
All ports are split between two Dell PowerConnect 6224 switches for redundancy. Anyway, on to the question(s):
First, what settings do I need to configure on the VMs Team for the best VM network performance? From what I've read, I should disable TOE (TCP Connection Offload in Broadcom's terminology?), Wake-on-LAN, Jumbo Frames, Flow Control, RSS and QoS on all NICs that will be part of the VMs Team. The only things I should leave enabled are LSO and CSO. Is that correct?
EDIT: I've also read that I should preset the link speed rather than leaving it on auto if I know my hardware's capabilities. Is that also a good thing to do?
With the new firmware and drivers I have the ability to enable VMQs, and after reading a Dell whitepaper on them they seem like a good thing to use. However, I'm also reading about registry settings that go along with them, and I'm confused. I was following a guide from a forum post on Broadcom's site, but I'm not sure it's a good idea to apply it since it's for a different NIC model. What is the proper way of configuring VMQs for my hardware?
Going back to disabling Jumbo Frames from above: I have them enabled right now for all teams. Should I just disable them entirely? I've read (after I enabled them) that they're mainly useful in iSCSI setups, which mine isn't, so I'm not sure...
On the dedicated connection for the web server, do I need VMQ enabled on it if it's only used by that one VM?
Lastly, any other recommendations for any of the connections would be appreciated. Thanks for reading this and thanks in advance for any help!
My shop currently uses teamed Broadcom NICs in PowerEdge R710s for a Hyper-V cluster. A lot of the options depend on what you're doing with the particular link, and in many cases whether a given feature is enabled makes no measurable difference to the NIC's performance, so most can be left at their defaults with no ill effects. Since you're using direct-attached storage, this is what I would recommend and why (based on my own experience):
TOE (TCP Connection Offload for Broadcom) - This offloads TCP connection processing from the CPU to the NIC, and in practice it's mostly used to hand iSCSI session handling to the adapter acting as an HBA. Since it doesn't appear that you're using iSCSI SANs, it can be turned off. If left on, nothing will happen, since the feature must also be configured elsewhere before it does anything.
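If you want a quick sanity check on the OS side (as far as I know, Broadcom's TOE rides on Windows' TCP Chimney Offload), you can look at and disable the global setting from an elevated prompt. A minimal sketch; the per-NIC TOE option itself still lives in BACS / the adapter's Advanced properties:

```
rem Show the global TCP settings, including the Chimney Offload State
netsh int tcp show global

rem Disable TCP Chimney Offload at the OS level
netsh int tcp set global chimney=disabled
```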
Wake-on-LAN - Can be safely turned off. It has no real effect, assuming your servers are always on (personally, I don't see much sense in having a server go to sleep). There are some security implications to leaving it on (e.g. rogue magic packets), but again, if the server is always on, they're not really an issue.
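If you're curious which devices are currently allowed to wake the host, powercfg will tell you. Note this reflects the OS power settings; the WoL option in the Broadcom driver is separate, and the adapter name below is just a placeholder (use whatever the query returns):

```
rem List devices currently armed to wake the system
powercfg -devicequery wake_armed

rem Stop a specific adapter from waking the system (placeholder name)
powercfg -devicedisablewake "Broadcom BCM5709C NetXtreme II GigE (NDIS VBD Client)"
```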
Jumbo Frames - Depends on your network configuration and intended use. Jumbo frames improve performance when large data payloads are sent across the network, because fewer frames (and fewer associated headers) are needed. ALL of the network hardware along the data path, in addition to your NICs, must support jumbo frames and have the feature enabled before you use them. We have this turned on for our iSCSI networks and off for all other traffic, since we don't control the core router infrastructure. If in doubt, leave it off; it can give you network troubleshooting nightmares if it's turned on at the NIC but not everywhere else along the path.
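If you ever do enable them end to end, verify the path actually passes big frames before relying on it. A quick check, assuming a 9000-byte MTU and another jumbo-enabled host on the same network (the IP is just an example):

```
rem Show the MTU currently in effect on each interface
netsh interface ipv4 show subinterfaces

rem Send an 8972-byte payload with the Don't Fragment bit set
rem (8972 data + 20-byte IP header + 8-byte ICMP header = 9000)
ping -f -l 8972 192.168.10.20
```

If the reply is "Packet needs to be fragmented but DF set", something along the path isn't doing jumbo frames.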
Flow Control - Again dependent on network configuration. Setting this to auto is usually fine; the NIC will negotiate it automatically if the link partner supports it. We only turn it off if a vendor specifically recommends against using it with their hardware.
RSS - Receive-Side Scaling allows receive processing for a network adapter to be spread across multiple processors, so packet handling scales with the number of available cores and the Windows networking subsystem can take advantage of multi-core and many-core architectures. I would leave this on unless you're sure it is causing degraded performance. Additional information here.
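The global Windows-side switch for RSS can also be toggled with netsh (the per-adapter RSS setting is still in the Broadcom Advanced properties); just a quick check, not a tuning guide:

```
rem Enable RSS globally (confirm afterwards under "Receive-Side Scaling State"
rem in the output of: netsh int tcp show global)
netsh int tcp set global rss=enabled
```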
QoS - Quality of Service. This tags traffic by type and allows it to be prioritized. It's only useful if the rest of your network supports and honors the tags. If you're not familiar with QoS configuration, either turn it off or get up to speed on it before turning it on; there's more to setting it up than just enabling it at the NIC.
LSO/CSO - Large Send Offload and Checksum Offload; leave these on unless you have a compelling reason to turn them off. The conventional wisdom is that it's better to have the NIC do whatever work it can so that CPU utilization is kept to a minimum.
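For a read-only view of what the stack believes each interface can offload (checksum, large send), netsh can report it; the actual enable/disable toggles are still in the driver's Advanced properties:

```
rem List the offload capabilities reported for each interface
netsh interface ipv4 show offload
```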
Preset the link speed rather than leaving it on auto - that used to be the conventional wisdom, but with 1Gb and 10Gb Ethernet links it is now considered best practice to leave this set to auto. Forcing the speed turns off auto-negotiation (which gigabit links depend on), and there are cases where a hard-set speed will actually cause the link to go offline.
VMQs - Microsoft has guidance on when to enable VMQs here. Not all Broadcom NICs support VMQ, so if the option is not available in BACS, your model likely doesn't support it. There should be no need to hand-edit registry settings just to enable the feature.
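For what it's worth, the registry tweak those forum posts usually mean is the Hyper-V switch value that allows VMQ on adapters slower than 10GbE. I've mostly seen it documented for newer Windows builds, so treat this as an assumption and verify it against Microsoft's guidance for 2008 R2 before changing anything; querying it is harmless either way:

```
rem Check whether VMQ has been allowed on sub-10GbE adapters
rem (the value normally doesn't exist; reg query will just say so)
reg query "HKLM\SYSTEM\CurrentControlSet\Services\VMSMP\Parameters" /v BelowTenGigVmqEnabled

rem Only if you've decided you actually want VMQ on your 1GbE links:
reg add "HKLM\SYSTEM\CurrentControlSet\Services\VMSMP\Parameters" /v BelowTenGigVmqEnabled /t REG_DWORD /d 1
```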
Oh, and as a final note: it is VERY, VERY important that your teamed NICs are configured identically, not only between the NICs in the same server, but also across the nodes that will be clustered together. Ideally the hardware should be identical; if it isn't, at least ensure that only the capabilities they have in common are enabled.