I'm working on setting up a cluster of servers with an iSCSI MD3200i SAN for shared storage. Everything is working well but I have one small detail I can't seem to get working. Multipath seems to only want to do failover with the iSCSI connections to the SAN. I'd like to get this working in load balancing mode so that it uses each path and not just one or the other.
One path always shows as ghost here, meaning it's not being used.
[root@kvm-01]~# multipath -ll
mpath2 (36842b2b0006b9d87000004383bf558d9) dm-5 DELL,MD32xxi
[size=2.2T][features=3 queue_if_no_path pg_init_retries 50][hwhandler=1 rdac][rw]
\_ round-robin 0 [prio=100][active]
\_ 8:0:0:0 sdb 8:16 [active][ready]
\_ 7:0:0:0 sdc 8:32 [active][ghost]
My multipathd conf:
[root@kvm-01]~# egrep -v '(#|^$)' /etc/multipath.conf
blacklist {
device {
vendor "*"
product "Universal Xport"
}
device {
vendor "*"
product "MD3000"
}
device {
vendor "*"
product "MD3000i"
}
device {
vendor "*"
product "Virtual Disk"
}
devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
devnode "^hd[a-z][[0-9]*]"
devnode "^cciss!c[0-9]d[0-9]*[p[0-9]*]"
devnode "^sda$"
}
defaults {
user_friendly_names yes
polling_interval 5
selector "round-robin 0"
path_checker rdac
path_grouping_policy multibus
rr_weight uniform
no_path_retry 30
failback immediate
rr_min_io 100
prio_callout "/sbin/mpath_prio_rdac /dev/%n"
max_fds 8192
}
devices {
device {
vendor "DELL"
product "MD32xxi"
hardware_handler "1 rdac"
features "2 pg_init_retries 50"
}
device {
vendor "DELL"
product "MD32xx"
hardware_handler "1 rdac"
features "2 pg_init_retries 50"
}
device {
vendor "DELL"
product "MD36xxi"
hardware_handler "1 rdac"
features "2 pg_init_retries 50"
}
}
I've tried a variety of group_by and rr_weight settings, all with the same result.
[root@kvm-01]~# lsmod | grep rdac
dm_rdac 41673 1
dm_multipath 58457 3 dm_round_robin,dm_rdac
scsi_mod 199001 14 dm_rdac,be2iscsi,ib_iser,iscsi_tcp,bnx2i,cxgb3i,libiscsi2,scsi_transport_iscsi2,scsi_dh,sr_mod,sg,libata,megaraid_sas,sd_mod
I've also tried loading scsi_dh_rdac, but that didn't make a difference either.
[root@kvm-01]~# egrep -v '(#|^$)' /etc/iscsi/iscsid.conf
node.startup = automatic
node.session.timeo.replacement_timeout = 30
node.conn[0].timeo.login_timeout = 15
node.conn[0].timeo.logout_timeout = 15
node.conn[0].timeo.noop_out_interval = 5
node.conn[0].timeo.noop_out_timeout = 15
node.session.err_timeo.abort_timeout = 15
node.session.err_timeo.lu_reset_timeout = 20
node.session.initial_login_retry_max = 8
node.session.cmds_max = 128
node.session.queue_depth = 32
node.session.iscsi.InitialR2T = No
node.session.iscsi.ImmediateData = Yes
node.session.iscsi.FirstBurstLength = 262144
node.session.iscsi.MaxBurstLength = 16776192
node.conn[0].iscsi.MaxRecvDataSegmentLength = 262144
discovery.sendtargets.iscsi.MaxRecvDataSegmentLength = 32768
node.conn[0].iscsi.HeaderDigest = None
node.session.iscsi.FastAbort = No
node.session.xmit_thread_priority = -20
node.conn[0].iscsi.MaxXmitDataSegmentLength = 0
I've been researching this for a while now, and I've found plenty of people getting this setup to work with an MD3000i, but no confirmation either way for the MD3200i. I found one person saying it doesn't support it because the secondary controller is passive by design, but I've been unable to confirm that in Dell's documentation.
[root@kvm-01]~# uname -a
Linux kvm-01 2.6.18-238.9.1.el5 #1 SMP Tue Apr 12 18:10:13 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux
Summary
The only load balancing you can do is to spread the LUNs amongst the controllers. While the array advertises itself as active-active, it is actually a dual-active SAN: a LUN can be owned by at most one storage processor at any one time, but both controllers can be active, each driving the LUNs assigned to it. That is what Active/Active means in this case: the SAN as a whole can be fully utilized, not that a single LUN can be load balanced across both controllers at the same time.
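One rough way to see whether LUNs are spread out is to count ready paths per SCSI host across all multipath maps. The sketch below assumes one iSCSI session (and so one SCSI host number) per controller; the function name ready_paths_per_host is made up for illustration, and the second map (mpath3, with a placeholder WWID) is hypothetical sample data, not output from the system in the question.

```shell
#!/bin/sh
# Count [ready] paths per SCSI host from `multipath -ll` output on stdin.
# Assumption: each SCSI host number corresponds to one controller.
ready_paths_per_host() {
    awk '/\[ready\]/ && $2 ~ /^[0-9]+:[0-9]+:[0-9]+:[0-9]+$/ {
        split($2, hctl, ":")      # H:C:T:L, hctl[1] is the host number
        count[hctl[1]]++
    }
    END { for (h in count) print "host " h ": " count[h] " ready path(s)" }' | sort
}

# Hypothetical two-LUN sample: mpath2 is owned via host 8, mpath3 via host 7.
ready_paths_per_host <<'EOF'
mpath2 (36842b2b0006b9d87000004383bf558d9) dm-5 DELL,MD32xxi
\_ round-robin 0 [prio=100][active]
 \_ 8:0:0:0 sdb 8:16 [active][ready]
 \_ 7:0:0:0 sdc 8:32 [active][ghost]
mpath3 (hypothetical-wwid) dm-6 DELL,MD32xxi
\_ round-robin 0 [prio=100][active]
 \_ 8:0:0:1 sdd 8:48 [active][ghost]
 \_ 7:0:0:1 sde 8:64 [active][ready]
EOF
```

On a live system you would pipe in the real output instead (multipath -ll | ready_paths_per_host). If every ready path sits on the same host, all LUNs are likely owned by one controller and the other is idle.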
Details
Your path status for sdc says it all: ghost == passive, so failover is all your multipath configuration can do. Your configuration is Active/Passive by definition.
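To make the ready/ghost distinction explicit, here is a minimal sketch that labels each path line in multipath -ll output. The function name path_states and the wording of the labels are illustrative assumptions; the sample input is the output quoted in the question, in the older [active][ready] format shown there.

```shell
#!/bin/sh
# Label each path in `multipath -ll` output (read from stdin).
# A path line looks like:  \_ 8:0:0:0 sdb 8:16 [active][ready]
path_states() {
    awk '$2 ~ /^[0-9]+:[0-9]+:[0-9]+:[0-9]+$/ {
        state = "other"
        if ($0 ~ /\[ready\]/) state = "ready (usable for I/O)"
        if ($0 ~ /\[ghost\]/) state = "ghost (passive, failover only)"
        print $3 ": " state      # $3 is the sd device name
    }'
}

# Sample taken from the question.
path_states <<'EOF'
mpath2 (36842b2b0006b9d87000004383bf558d9) dm-5 DELL,MD32xxi
[size=2.2T][features=3 queue_if_no_path pg_init_retries 50][hwhandler=1 rdac][rw]
\_ round-robin 0 [prio=100][active]
 \_ 8:0:0:0 sdb 8:16 [active][ready]
 \_ 7:0:0:0 sdc 8:32 [active][ghost]
EOF
```

For the sample above this reports sdb as ready and sdc as ghost, which matches the failover-only behaviour described in the question.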
http://sourceware.org/lvm2/wiki/MultipathUsageGuide
That standby storage controller needs to be configured for Active/Active mode to accomplish what you're after; it may be a limitation of the SAN.
Verification
While answering a different question about the same SAN, I found its documentation on the web and verified that this make and model is in fact dual-active. See:
Dell PowerVault MD3200i dm-multipath configuration and performance snags in Debian 6.0 (squeeze)
The MD3200i (like every other LSI rebrand) uses RDAC, which is an active/passive (a/p) scheme.