I'm having an issue with a client of ours.
We've set up round-robin multipathing from their ESX cluster to their SAN. Two NICs on each host are added to a port group; one NIC is connected to one switch and the other NIC to a second switch. Each switch is then patched to one of the two Gigabit interfaces on the EqualLogic. I then configured dynamic discovery and enabled round-robin on each datastore for each host.
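To make the cabling concrete, here is a rough Python sketch of the layout as I've described it. The NIC and array port names (vmnic1, vmnic2, eth0, eth1) are placeholders rather than the real device names, and it assumes there is no link between the two switches, which may not match every setup. It only lists which paths I would expect to survive each switch failure; it does not model how the array or ESX actually negotiates sessions.

```python
# Hypothetical model of the cabling described above; all names are placeholders.
# Assumption: the two switches are NOT linked to each other.

# Each host NIC is cabled to exactly one switch.
NIC_TO_SWITCH = {"vmnic1": "switch1", "vmnic2": "switch2"}

# Each switch is patched to exactly one interface on the array.
SWITCH_TO_ARRAY_PORT = {"switch1": "eth0", "switch2": "eth1"}


def surviving_paths(failed_switches):
    """Return the (nic, switch, array_port) paths usable while the given switches are down."""
    paths = []
    for nic, switch in NIC_TO_SWITCH.items():
        if switch in failed_switches:
            continue  # this NIC has lost its only uplink
        paths.append((nic, switch, SWITCH_TO_ARRAY_PORT[switch]))
    return paths


if __name__ == "__main__":
    # Walk through the same pull-a-switch tests described in the post.
    for failed in (set(), {"switch1"}, {"switch2"}, {"switch1", "switch2"}):
        print(sorted(failed), "->", surviving_paths(failed))
```

With both switches up I would expect two paths per host, and one path with either switch pulled, which is why losing all LUNs with both switches powered on is so confusing.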
Our client also has a standalone ESXi server as part of their environment and, with this configuration, it works fine. Only the cluster now seems to be having an issue. When we originally reconfigured it earlier in the day, I tested it by pulling a switch, and it worked for both the cluster and the standalone ESXi host. I then went on to put a basic configuration on the switches: one port untagged on VLAN 1 for management and the rest untagged on VLAN 500 for data. I put the first reconfigured switch in, then after a short while pulled the second switch, saw it fail over, and applied the same configuration to that switch. After reconfiguring it, I put the second switch back into the environment and it was working fine.
I then realised I'd run the power cables for the switches through the side of the cabinet and had to reroute them. I pulled the power on switch 2: OK. Plugged it back in: OK. Pulled the power on switch 1: OK. Plugged it back in... and all access to the datastores was lost. I looked at the datastores in vSphere and noticed all LUNs were disconnected; even after a refresh nothing would come up. So I pulled the power on switch 1 again and access was restored.
The weird thing is that this behaviour is only observed on the ESX cluster; the standalone ESXi host works fine and has, as far as I can tell, an identical configuration.
I'll admit I'm no storage genius. Would anybody care to shed some light on where I'm going wrong?