A Consul server node in another DC keeps rejoining some time after I remove it.
The goal:
A cluster of 5 Consul servers in DC alpha0, whose KV store an alpha0 Vault cluster uses as a storage backend:
alpha0consulserver1.alpha0
alpha0consulserver2.alpha0
alpha0consulserver3.alpha0
alpha0consulserver4.alpha0
alpha0consulserver5.alpha0
A cluster of 5 Consul servers in DC prd0, whose KV store a prd0 Vault cluster uses as a storage backend:
prd0consulserver1.prd0
prd0consulserver2.prd0
prd0consulserver3.prd0
prd0consulserver4.prd0
prd0consulserver5.prd0
The WAN connection is fine. But I am concerned that if the two clusters sync their KV stores, this may affect the two separate HashiCorp Vault clusters that each use one of them as a backend.
The problem:
A poorly tested Puppet script I wrote resulted in one Consul node, prd0consulserver5, connecting to another in a different DC, alpha0consulserver1.
I have completely purged and re-installed Consul on prd0consulserver5, but alpha0consulserver1 keeps connecting to it.
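While digging, I noticed that the agent seems to persist previously seen LAN and WAN members under its data_dir (serf/local.snapshot and serf/remote.snapshot), which would explain why alpha0consulserver1 remembers prd0consulserver5 across a purge and restart on the other end. Here is a sketch of what clearing that state would look like, using a throwaway directory in place of /opt/consul (the file names are my reading of the data_dir layout, so treat them as an assumption):

```shell
# Stand-in for the real data_dir (/opt/consul) so this is safe to run anywhere.
DATA_DIR=$(mktemp -d)
mkdir -p "$DATA_DIR/serf"
touch "$DATA_DIR/serf/local.snapshot"   # remembered LAN (same-DC) members
touch "$DATA_DIR/serf/remote.snapshot"  # remembered WAN (cross-DC) members

# With the agent stopped, removing remote.snapshot should make it forget the
# WAN peers while keeping its own datacenter's member list intact.
rm "$DATA_DIR/serf/remote.snapshot"

ls "$DATA_DIR/serf"
```

If this is right, running the equivalent against /opt/consul on the alpha0 servers (with their agents stopped) might stop them from re-initiating the WAN connection.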
Here is an example of one of the configuration files, specifically, the one for alpha0consulserver1.alpha0:
nathan-basanese-zsh8 % sudo cat /etc/consul/config.json
{
  "bind_addr": "192.176.100.1",
  "client_addr": "0.0.0.0",
  "data_dir": "/opt/consul",
  "domain": "consul.basanese.com",
  "bootstrap_expect": 5,
  "enable_syslog": true,
  "log_level": "DEBUG",
  "datacenter": "bts0",
  "node_name": "alpha0consulserver1",
  "ports": {
    "http": 8500,
    "https": 8501
  },
  "recursors": ["192.176.176.240", "192.176.176.241"],
  "server": true,
  "retry_join": ["192.176.100.3", "192.176.100.2", "192.176.100.1"]
}
Here are some relevant logs from prd0consulserver5 (I can post more on request):
2017/05/26 23:38:00 [DEBUG] memberlist: Stream connection from=192.176.100.1:47239
2017/05/26 23:38:00 [INFO] serf: EventMemberJoin: alpha0consulserver2.alpha0 192.176.100.2
2017/05/26 23:38:00 [INFO] serf: EventMemberJoin: alpha0consulserver1.alpha0 10.240.112.3
2017/05/26 23:38:00 [INFO] consul: Handled member-join event for server "alpha0consulserver2.bts0" in area "wan"
2017/05/26 23:38:00 [INFO] serf: EventMemberJoin: alpha0consulserver3.alpha0 192.176.100.3
2017/05/26 23:38:00 [INFO] consul: Handled member-join event for server "alpha0consulserver1.bts0" in area "wan"
2017/05/26 23:38:00 [INFO] consul: Handled member-join event for server "alpha0consulserver3.bts0" in area "wan"
Eventually, I get to this:
2017/05/26 23:39:02 [DEBUG] memberlist: Initiating push/pull sync with: 192.176.100.2
I shut the node down, as I don't want keys I write to the KV store on alpha0 nodes to appear on prd0 nodes.
What I have tried so far:
A graceful leave and shutdown, per https://www.consul.io/api/agent.html#graceful-leave-and-shutdown
I didn't try force-leave, since it doesn't work on nodes outside of the configured DC.
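Concretely, the graceful leave I issued, and the force-leave I decided against, look like this (the `||` guard just keeps the snippet safe to paste on a box with no agent running, and the force-leave node name is my assumption of how it would appear in `consul members -wan`):

```shell
# Graceful leave via the local agent's HTTP API (what I actually ran):
LEAVE_URL=http://127.0.0.1:8500/v1/agent/leave
curl -sf -X PUT "$LEAVE_URL" || echo "leave request failed: no agent reachable"

# force-leave, which I did NOT try, would target the WAN member name, e.g.:
FORCE_LEAVE_CMD="consul force-leave prd0consulserver5.prd0"
echo "$FORCE_LEAVE_CMD"
```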
I've also tried deregistering ALL of the prd0 hosts from the alpha0 hosts:
https://www.consul.io/api/catalog.html#deregister-entity
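The deregister calls were of this shape (the hostname and payload values are from my setup; the guard again just makes the snippet safe to paste somewhere that can't reach the server):

```shell
# Remove the stray prd0 node from the alpha0 catalog, run against an
# alpha0 server:
PAYLOAD='{"Datacenter": "alpha0", "Node": "prd0consulserver5"}'
curl -sf -X PUT -d "$PAYLOAD" \
  http://alpha0consulserver1.alpha0:8500/v1/catalog/deregister \
  || echo "deregister request failed: server not reachable from here"
```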
I'm at my wit's end here, and can't seem to find a way forward.
I've searched it on search engines, using this query and many similar queries: https://duckduckgo.com/?q=totally+deregister+consul+node&t=hc&ia=software
The following two results seemed to have a slightly similar problem, but nothing as simple as keeping a cluster of 5 Consul servers separate from another cluster of 5 Consul Servers.
https://github.com/hashicorp/consul/issues/1188 https://groups.google.com/forum/#!msg/consul-tool/bvJeP1c3Ujs/EvSZoYiZFgAJ
I think this may be handled by the "join_wan" configuration setting, but there doesn't seem to be a way to explicitly turn it off. Plus, that seems like a hacky way to fix this problem.
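One thing I ran across but have not verified on my version: the docs for newer Consul releases say the Serf WAN port can be disabled outright by setting it to -1 in the ports block, which would presumably prevent WAN federation entirely:

```json
{
  "ports": {
    "serf_wan": -1
  }
}
```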
I've also considered blocking the traffic with iptables.
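If I do go that route: Serf WAN gossip runs on port 8302 (TCP and UDP), so the rules would be along these lines (printed here rather than applied; drop the `echo` to actually install them):

```shell
# Block WAN gossip (port 8302) from the alpha0 subnet on the prd0 servers.
for proto in tcp udp; do
  RULE="iptables -A INPUT -s 192.176.100.0/24 -p $proto --dport 8302 -j DROP"
  echo "$RULE"
done
```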
Anyway, I feel like there's something missing. I've started digging into the Raft protocol, but I feel like maybe I've gone off on a tangent in my search. Any guidance is appreciated, be it a comment or an answer.
More precisely: how do I keep the prd0 Consul server nodes, with their own separate KV store and Consul leader, isolated from the alpha0 Consul server nodes?