I have a great working version of DRBD across two Debian Stretch servers which I created by following this awesome guide: https://www.howtoforge.com/setting-up-network-raid1-with-drbd-on-debian-squeeze-p2/
But after each reboot I have to redo a number of things to get it into a working state again.
Here is what I see when it's working, before reboot:
root@server1:~# cat /proc/drbd
version: 8.4.7 (api:1/proto:86-101)
srcversion: AC50E9301653907249B740E
0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
ns:8 nr:0 dw:4 dr:1209 al:1 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
On server 2:
root@server2:~# cat /proc/drbd
version: 8.4.7 (api:1/proto:86-101)
srcversion: AC50E9301653907249B740E
0: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
ns:0 nr:8 dw:8 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
On server 1:
root@server1:~# mount
...
/dev/drbd0 on /var/www type ext3 (rw,relatime,data=ordered)
And here is what I see after a reboot: the servers come back up, but the DRBD service isn't started, nothing is mounted, and the primary/secondary roles are lost. I've tried to add drbd to the boot sequence on both servers with:
update-rc.d drbd defaults
but that doesn't seem to work; DRBD simply doesn't start. Running /etc/init.d/drbd start manually works fine on both servers.
I'm also unsure whether I can just add the DRBD volumes to fstab, because surely that won't work if the DRBD service isn't even started? I've read about using _netdev in fstab, but various combinations of fstab entries didn't work out.
Finally I also have to set the primary and secondary status of DRBD every time I restart and then remount the volume manually.
So this is how I am getting it working after reboot:
On server 1:
root@server1:/etc# /etc/init.d/drbd status
● drbd.service - LSB: Control DRBD resources.
Loaded: loaded (/etc/init.d/drbd; generated; vendor preset: enabled)
Active: inactive (dead)
Docs: man:systemd-sysv-generator(8)
root@server1:/etc# /etc/init.d/drbd start
[ ok ] Starting drbd (via systemctl): drbd.service.
root@jmtest1:/etc# cat /proc/drbd
version: 8.4.7 (api:1/proto:86-101)
srcversion: AC50E9301653907249B740E
0: cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate C r-----
ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
root@server1:/etc# drbdadm primary r0
root@server1:/etc# cat /proc/drbd
version: 8.4.7 (api:1/proto:86-101)
srcversion: AC50E9301653907249B740E
0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
On server 2:
root@server2:~# /etc/init.d/drbd status
● drbd.service - LSB: Control DRBD resources.
Loaded: loaded (/etc/init.d/drbd; generated; vendor preset: enabled)
Active: inactive (dead)
Docs: man:systemd-sysv-generator(8)
root@server2:~# /etc/init.d/drbd start
[ ok ] Starting drbd (via systemctl): drbd.service.
root@server2:~# cat /proc/drbd
version: 8.4.7 (api:1/proto:86-101)
srcversion: AC50E9301653907249B740E
0: cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate C r-----
ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
root@server2:~# drbdadm secondary r0
root@server2:~# cat /proc/drbd
version: 8.4.7 (api:1/proto:86-101)
srcversion: AC50E9301653907249B740E
0: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
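In short, the manual recovery after every reboot boils down to this (the mount target matches the working state shown above):
# on server1
/etc/init.d/drbd start
drbdadm primary r0
mount /dev/drbd0 /var/www
# on server2
/etc/init.d/drbd start
drbdadm secondary r0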
Some fstab entries that I tried:
/dev/drbd0 /var/www ext3 _netdev 0 2
UUID=042cc2e395b2b32 /var/www ext3 none,noauto 0 0
I'm not sure if you're supposed to use the UUID or just /dev/drbd0.
I have these questions about why it's not starting:
- What fstab entries are supposed to be there?
- Why does update-rc.d drbd defaults not work?
- Why do I have to reset primary and secondary on both servers after each restart?
You should definitely consider using a product that is made for dealing with such problems.
I described in this post [ Nagios/Icinga: Don't show CRITICAL for DRBD partitions on standby node ] how I manage to do exactly what you expect using opensvc, and it has worked fine for years.
- no need for fstab entries, as mounts are described in the opensvc service configuration file, which is automatically synchronized between opensvc cluster nodes (see the sketch after this list)
- no need to set up update-rc.d drbd defaults, because the opensvc stack modprobes the drbd module when it sees that you have drbd resources in some services, and then brings drbd up in the primary/secondary states
- to reach primary/secondary at boot, just set the nodes parameter in the DEFAULT section of the opensvc service configuration file. If you want server1 as primary and server2 as secondary, set nodes=server1 server2 using the command svcmgr -s mydrbdsvc set --kw DEFAULT.nodes="server1 server2"
- if you only want the service to be started at boot on server1, set the orchestrate=start parameter using the command svcmgr -s mydrbdsvc set --kw DEFAULT.orchestrate=start
- if you want the service to be orchestrated in high availability mode (meaning automatic failover between nodes), set the orchestrate=ha parameter using the command svcmgr -s mydrbdsvc set --kw DEFAULT.orchestrate=ha
- to relocate the service from one node to the other, you can use the svcmgr -s mydrbdsvc switch command
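For illustration, a minimal opensvc service configuration for this setup could look roughly like the following. The service name mydrbdsvc, the section ids, and the exact keyword spellings are assumptions based on the commands above, so check them against the opensvc documentation for your version:
[DEFAULT]
nodes = server1 server2
orchestrate = ha

[disk#1]
type = drbd
res = r0

[fs#1]
type = ext3
dev = /dev/drbd0
mnt = /var/www
With something like this in place, opensvc loads the drbd module, promotes r0 on the node where the service starts, and mounts /dev/drbd0 on /var/www, so neither fstab entries nor manual drbdadm calls are needed.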
There's a lot to unpack here, but I will start by saying: that DRBD version is old! You should upgrade it to either 8.4.11 (as of this writing, December 2018) or move to the 9 branch. But the questions you're asking can be solved with the version you're using.
Let's take a look at the three concise questions you posed at the bottom of your post:
None, ideally. DRBD devices must first be promoted before they can be used. The fstab is not the best option even for DRBD-9 block devices that can autopromote, as in most cases it will trip up the boot process if things aren't all fine. But it can technically work, with many caveats.
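If you do want a reference point despite those caveats, a noauto entry is about the only shape that won't block the boot; treat this as a sketch of the workaround, not a recommendation, since it still requires promoting the resource and mounting by hand (or from a script):
/dev/drbd0 /var/www ext3 noauto 0 0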
Don't do it that way. Debian uses systemd, and DRBD has a systemd unit file. You should be using that UNLESS you're using a cluster manager, which I highly recommend. So in the absence of the Pacemaker cluster resource manager, you should be issuing things like
# systemctl enable drbd
to make it start at boot, or
# systemctl start drbd
to start the service after it has been stopped. The most common commands are start, stop, restart, enable, and disable.
to start the service after it's stopped. The most common commands are start, stop, restart, enable and disable.Because DRBD has no concept of leader elections. An external system must promote a DRBD resource to Primary, and there can only be one primary at a time. There are a lot of ways to do this.
You can "manually" promote a resource with
# drbdadm primary <resource>
and then mount things up from there -- but that's what you want to avoid. DRBD can "autopromote" in version 9, which will automatically promote a resource upon attempting to open its block device for access (like with a filesystem mount or a volume group activation) -- but the version you're running can't do that (upgrade? This might be enough for you). Or you could use a finite state leader election system to control promotion actions and ensure the state of both DRBD and the application stack it supports. That's Pacemaker.
You want Pacemaker, I promise. Pacemaker is not overly difficult, but it is extremely large. You won't have to learn much of it in order to accomplish what you want. You can spend a lot of time and energy making a "perfect" cluster that will resist any failure under the sun, and that effort would be rewarded. It's a good system.
A combination of Corosync/Pacemaker, DRBD-9.x or DRBD-8.4.x latest, and whatever you have on top of that should accomplish what you want automatically. There are lots of docs out there which detail how to do this. Here's one that's up to date:
linbit users-guide-9.0
I would recommend reading that entire guide if you have the time. DRBD has undergone some serious evolution in the past few years.
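To give a feel for where that guide leads, a minimal Pacemaker configuration for this scenario looks roughly like the following (crm shell syntax; the resource names are made up here, and the exact timings and metadata should come from the guide rather than from this sketch):
primitive p_drbd_r0 ocf:linbit:drbd \
    params drbd_resource=r0 \
    op monitor interval=29s role=Master \
    op monitor interval=31s role=Slave
ms ms_drbd_r0 p_drbd_r0 \
    meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
primitive p_fs_www ocf:heartbeat:Filesystem \
    params device=/dev/drbd0 directory=/var/www fstype=ext3
colocation c_fs_with_drbd_master inf: p_fs_www ms_drbd_r0:Master
order o_drbd_before_fs inf: ms_drbd_r0:promote p_fs_www:start
With that in place, Pacemaker decides which node is Primary, mounts /dev/drbd0 on /var/www there, and fails everything over automatically, so there is no fstab entry and no manual drbdadm primary step.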
Extending the answer of @Spooler: see chapter 5.8.3 of the DRBD 8 manual for the become-primary-on directive. While that chapter is about dual-primary mode, become-primary-on is also valid for active-passive setups.
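Assuming the resource is called r0 as in the question, the relevant snippet in the DRBD 8.4 resource configuration would look something like this (the hostname is an example; everything else in the resource stays as it is):
resource r0 {
    startup {
        become-primary-on server1;   # node that should be promoted automatically at startup
    }
    # the existing disk, net, and per-host sections remain unchanged
}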