I've set up a pair of identical servers with RAID arrays (8 cores, 16 GB RAM, 12x2 TB in RAID 6) and three 10GigE interfaces, to host some highly available services.
The systems are currently running Debian 7.9 Wheezy oldstable (because corosync/pacemaker are not available on 8.x stable or testing).
- Local disk performance is about 900 MB/s write, 1600 MB/s read.
- Network throughput between the machines is over 700 MB/s.
- Through iSCSI, each machine can write to the other's storage at more than 700 MB/s.
However, no matter how I configure DRBD, throughput is limited to 100 MB/s. It really looks like some hardcoded limit. I can reliably lower performance by tweaking the settings, but it never goes over 1 Gbit/s (122 MB/s is reached for a couple of seconds at a time). I'm really pulling my hair out on this one.
- plain vanilla kernel 3.18.24 amd64
- drbd 8.9.2~rc1-1~bpo70+1
The configuration is split into two files. global-common.conf:
global {
    usage-count no;
}
common {
    handlers {
    }
    startup {
    }
    disk {
        on-io-error detach;
        # no-disk-flushes ;
    }
    net {
        max-epoch-size 8192;
        max-buffers 8192;
        sndbuf-size 2097152;
    }
    syncer {
        rate 4194304k;
        al-extents 6433;
    }
}
and cluster.res:
resource rd0 {
    protocol C;
    on cl1 {
        device /dev/drbd0;
        disk /dev/sda4;
        address 192.168.42.1:7788;
        meta-disk internal;
    }
    on cl2 {
        device /dev/drbd0;
        disk /dev/sda4;
        address 192.168.42.2:7788;
        meta-disk internal;
    }
}
Output from cat /proc/drbd on the slave:
version: 8.4.5 (api:1/proto:86-101)
srcversion: EDE19BAA3D4D4A0BEFD8CDE
0: cs:SyncTarget ro:Secondary/Secondary ds:Inconsistent/UpToDate C r-----
ns:0 nr:4462592 dw:4462592 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:16489499884
[>....................] sync'ed: 0.1% (16103024/16107384)M
finish: 49:20:03 speed: 92,828 (92,968) want: 102,400 K/sec
Output from vmstat 2 on the master (both machines are almost completely idle):
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
0 0 0 14952768 108712 446108 0 0 213 254 16 9 0 0 100 0
0 0 0 14952484 108712 446136 0 0 0 4 10063 1361 0 0 99 0
0 0 0 14952608 108712 446136 0 0 0 4 10057 1356 0 0 99 0
0 0 0 14952608 108720 446128 0 0 0 10 10063 1352 0 1 99 0
0 0 0 14951616 108720 446136 0 0 0 6 10175 1417 0 1 99 0
0 0 0 14951748 108720 446136 0 0 0 4 10172 1426 0 1 99 0
Output from iperf between the two servers:
------------------------------------------------------------
Client connecting to cl2, TCP port 5001
TCP window size: 325 KByte (default)
------------------------------------------------------------
[ 3] local 192.168.42.1 port 47900 connected with 192.168.42.2 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-10.0 sec 6.87 GBytes 5.90 Gbits/sec
Apparently the initial synchronisation is supposed to be somewhat slow, but not this slow... Furthermore, it doesn't really react to any attempt to change the sync rate, such as drbdadm disk-options --resync-rate=800M all.
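For reference, what the running resource has actually applied can be checked like this (resource rd0, as in cluster.res above); drbdadm dump prints the configuration as parsed, and drbdsetup show prints the in-kernel options:

drbdadm dump rd0       # configuration as DRBD parsed it
drbdsetup show rd0     # options currently applied to the running resource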
In newer versions of DRBD (8.3.9 and newer) there is a dynamic resync controller that needs tuning. In older versions of DRBD, setting syncer { rate; } was enough; now it's used more as a lightly suggested starting place for the dynamic resync speed. The dynamic sync controller is tuned with the "c-settings" in the disk section of DRBD's configuration (see man drbd.conf for details on each of these settings). With 10GbE between these nodes, and assuming low latency since protocol C is used, the following config should get things moving quicker:
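A sketch (the option names are the actual c-settings from drbd.conf; the values are assumed starting points to tune for this hardware, not exact prescriptions):

disk {
    c-plan-ahead    7;      # controller agility, in units of 0.1 s
    c-fill-target   10M;    # amount of in-flight resync data to aim for
    c-max-rate      700M;   # upper bound on resync traffic
    c-min-rate      4M;     # floor when application I/O is competing
}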
If you're still not happy, try turning max-buffers up to 12k. If you're still not happy, you can try turning up c-fill-target in 2M increments.

Someone elsewhere suggested that I use these settings, and the performance is excellent.
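For what it's worth, such changes can be applied to a running resource without a restart, either by editing the config files and re-adjusting, or as one-off runtime options (resource rd0, values from the sketch above):

drbdadm adjust rd0                                               # re-read the config files and apply the differences
drbdadm disk-options --c-fill-target=10M --c-max-rate=700M rd0   # or set individual options on the fly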
Edit: As per @Matt Kereczman's and others' suggestions, I've finally changed my configuration (a sketch of the resulting disk and net sections follows below). Resync speed is high. Write speed is excellent during resync with these settings (80% of local write speed, full wire speed). Read speed is OK.
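Roughly speaking (c-plan-ahead 0 is the decisive change; the other numbers are ballpark values for this hardware, not exact ones):

disk {
    on-io-error     detach;
    c-plan-ahead    0;      # disables the dynamic resync controller
    c-fill-target   24M;
    c-max-rate      720M;
    c-min-rate      20M;
}
net {
    max-buffers     36k;
    sndbuf-size     1024k;
    rcvbuf-size     2048k;
}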
Later edit:
After a full resync, the performance is very good (wire-speed writing, local-speed reading). Resync is quick (5-6 hours) and doesn't hurt performance too much (wire-speed reading, wire-speed writing). I'll definitely stay with c-plan-ahead at zero. With non-zero values, resync is way too long.
c-plan-ahead has to be set to a positive value to enable the dynamic sync rate controller. For example:

disk {
    c-plan-ahead    15;     # 5 * RTT / 0.1 s unit; in my case 15
    c-fill-target   24;
    c-max-rate      720M;
}