Consider the following setup:
Windows 2008 R2, MPIO feature installed. Two iSCSI NICs (not bonded), 1Gb each.
Storage: Compellent, 2x 1Gb iSCSI ports, single controller.
In my tests I have confirmed that, using Round Robin MPIO, both iSCSI NICs on the host are active during a single-worker IOMETER test, and both iSCSI ports on the storage are active as well. I am seeing about 50% to 60% utilization on each host NIC, and I would expect more. I am using a crappy D-Link switch at the moment, which certainly isn't helping, so I'm not overly concerned about that part yet.
My question is this: rather than "how can I make this particular setup perform", I would like to know, more generally, whether round robin (active/active) MPIO allows me to get more than 1Gb of bandwidth from the host to the storage with a single I/O stream (like copying a file, or running a single-worker IOMETER test).
If yes, why? If no, why not?
MPIO has various load-balancing policies available. As Coding Gorilla points out, most of those policies balance load across multiple connections to aggregate bandwidth; however, both your initiator and target have to have multiple connections for it to actually exceed single-link speed. Round Robin is a poor choice of policy here; you should be using either Weighted Paths or Least Queue Depth.
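For what it's worth, on Windows 2008 R2 the policy can be changed per MPIO disk with mpclaim from an elevated command prompt. This is just a sketch: the disk number (0) is a placeholder for whatever mpclaim reports for your Compellent volume, and the policy numbers are worth double-checking against mpclaim's help output:

    rem List MPIO-managed disks and their current load-balance policy
    mpclaim -s -d

    rem Set disk 0 to Least Queue Depth (4); Weighted Paths is 5, Round Robin is 2
    mpclaim -l -d 0 4

    rem Optionally set the default policy that newly claimed MSDSM disks will pick up
    mpclaim -l -m 4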
The iSCSI SAN and Server I have here have 4 ports each and I can actually get ~3.2Gbps under fairly ideal circumstances. If you need something faster than that, you'd be looking at FC or IB.
Also, do not use trunking/link aggregation/etc. on iSCSI links. When one link fails, the connection will fail. You must use MPIO to accomplish link redundancy.
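If the paths aren't already being claimed by MPIO, the usual step on 2008 R2 is to add MPIO support for iSCSI devices, either via the "Discover Multi-Paths" tab in the MPIO control panel or with mpclaim. Rough sketch; note that it wants a reboot:

    rem Claim all iSCSI-attached devices for MPIO (-n suppresses the automatic reboot; reboot manually afterwards)
    mpclaim -n -i -a ""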
I'm not an expert on the MPIO features and iSCSI, but based on the TechNet documentation (http://technet.microsoft.com/en-us/library/dd851699.aspx), my reading is that Round Robin simply distributes the traffic across the two paths; it's not going to try to push either one to its limits in order to increase performance.
Also, from a purely networking perspective, if you have both NICs connected to the same switch, you're not going to get more than 1Gb. Most "consumer" switches only handle about 1Gb of traffic in total, not per port. There are higher-end switches with a better backplane that can handle more traffic, but I still doubt you'd get much more out of them. You would be better off putting each NIC on a separate segment (i.e. switch) to eliminate that potential bottleneck. Like I said, I'm not an expert on the subject; these are just my initial reactions. Feel free to correct me where I'm mistaken.
MPIO with Equallogic basically picks the best iSCSI HBA interface on the host to send from, and the best interface on the SAN, based on evaluated load. To my knowledge, you'll only get one stream per LUN, meaning the traffic won't be split in half across two Ethernet links, so you'll never get more than 1Gbps per connection to that LUN per host. Now, if you have multiple LUNs, you can hit other interfaces on the SAN to balance out the throughput. This is based on my understanding of MPIO, however. Also, as mentioned, there's no need for link aggregation, and the switch is probably not your problem (unless it has a throughput limit that you're hitting, i.e. overcommit).
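One quick way to sanity-check how many sessions and connections you actually have per target (assuming the Microsoft software initiator) is from the command line; a rough sketch:

    rem List active iSCSI sessions and the connections within each session
    iscsicli SessionList

    rem Cross-reference against the MPIO view of disks and paths
    mpclaim -s -d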
Here's a good doc on getting it set up that goes over the various options:
http://www.dellstorage.com/WorkArea/DownloadAsset.aspx?id=2140