I'm setting up my first Gluster 3.4 install and all is good up until I want to create a distributed replicated volume.
I have 4 servers: 192.168.0.11, 192.168.0.12, 192.168.0.13 and 192.168.0.14.
From 192.168.0.11 I ran:
gluster peer probe 192.168.0.12
gluster peer probe 192.168.0.13
gluster peer probe 192.168.0.14
On each server I have a mounted storage volume at /export/brick1.
I then ran on 192.168.0.11:
gluster volume create gv0 replica 2 192.168.0.11:/export/brick1 192.168.0.12:/export/brick1 192.168.0.13:/export/brick1 192.168.0.14:/export/brick1
But I get the error:
volume create: gv0: failed: Host 192.168.0.11 is not in 'Peer in Cluster' state
Sure enough, if you run gluster peer status it shows 3 peers with the other connected hosts, i.e.
Number of Peers: 3
Hostname: 192.168.0.12
Port: 24007
Uuid: bcea6044-f841-4465-88e4-f76a0c8d5198
State: Peer in Cluster (Connected)
Hostname: 192.168.0.13
Port: 24007
Uuid: 3b5c188e-9be8-4d0f-a7bd-b738a88f2199
State: Peer in Cluster (Connected)
Hostname: 192.168.0.14
Port: 24007
Uuid: f6f326eb-0181-4f99-8072-f27652dab064
State: Peer in Cluster (Connected)
But from 192.168.0.12 the same command also shows 3 peers, and 192.168.0.11 is one of them, i.e.
Number of Peers: 3
Hostname: 192.168.0.11
Port: 24007
Uuid: 09a3bacb-558d-4257-8a85-ca8b56e219f2
State: Peer in Cluster (Connected)
Hostname: 192.168.0.13
Uuid: 3b5c188e-9be8-4d0f-a7bd-b738a88f2199
State: Peer in Cluster (Connected)
Hostname: 192.168.0.14
Uuid: f6f326eb-0181-4f99-8072-f27652dab064
State: Peer in Cluster (Connected)
So 192.168.0.11 is definitely part of the cluster.
The question is, why am I not able to create the volume when running the gluster command on the first gluster server? Is this normal behaviour or some sort of bug?
I was seeing an obscure error message about an unconnected socket with peer 127.0.0.1.
It turns out the problem I was having was due to NAT. I was trying to create gluster servers that were behind a NAT device and use the public IP to resolve the names. This is just not going to work properly for the local machine.
What I had on each node was a hosts file that resolved the server names to their public IPs.
The fix was to remove the trusted peers first, then change the hosts file on each machine so that the local machine's name is no longer resolved through the public IP, and so on for every node.
Then peer probe again, and finally create the volume, which was then successful.
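For anyone following along, here is a rough sketch of that sequence. The hostnames gluster1 to gluster4, the private 192.168.0.x addresses and the choice of mapping a node's own name to loopback are my assumptions for illustration, not values taken from the answer above. The script only prints the gluster commands unless you set GLUSTER=gluster.

```shell
#!/bin/sh
# Sketch only. Hostnames gluster1..gluster4, the private 192.168.0.x addresses
# and the loopback mapping for the local node are assumptions for illustration.

# Dry run by default: commands are printed. Set GLUSTER=gluster to execute them.
GLUSTER="${GLUSTER:-echo gluster}"

# Print the /etc/hosts lines for the node named in $1.
# The node's own name points at loopback so it never resolves through the NAT.
hosts_for() {
    self="$1"
    i=1
    while [ "$i" -le 4 ]; do
        if [ "gluster$i" = "$self" ]; then
            echo "127.0.0.1 gluster$i"
        else
            echo "192.168.0.1$i gluster$i"
        fi
        i=$((i + 1))
    done
}

# 1. On gluster1, remove the existing trusted peers.
for peer in gluster2 gluster3 gluster4; do
    $GLUSTER peer detach "$peer"
done

# 2. Print the hosts entries for this node; install them by hand on each node.
hosts_for gluster1

# 3. Probe the peers again by name, then create the volume.
for peer in gluster2 gluster3 gluster4; do
    $GLUSTER peer probe "$peer"
done
$GLUSTER volume create gv0 replica 2 \
    gluster1:/export/brick1 gluster2:/export/brick1 \
    gluster3:/export/brick1 gluster4:/export/brick1
```

Run hosts_for with each node's own name to see what that node's file would look like under these assumptions.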
I doubt that using IP addresses (the public ones) will work in this case. It should work if you use the private addresses behind your NAT. In my case, each server was behind a NAT in the AWS cloud.
Try explicitly defining the replica count as four nodes. I assume this is a pure replica with no stripe?
Try this from 192.168.0.11: detach every peer first, then probe them again and create the volume.
In the create command, explicitly define a four-node replica set, and explicitly define the transport as tcp.
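As a rough sketch of that sequence, using the addresses and brick path from the question: the function name rebuild_replica4 is just something I made up for the example, and the script only prints the commands unless you set GLUSTER=gluster.

```shell
#!/bin/sh
# Sketch only, meant to be run from 192.168.0.11.
# Dry run by default: commands are printed. Set GLUSTER=gluster to execute them.
GLUSTER="${GLUSTER:-echo gluster}"
PEERS="192.168.0.12 192.168.0.13 192.168.0.14"

# rebuild_replica4 is a hypothetical helper name, not a gluster command.
rebuild_replica4() {
    # Detach every peer first.
    for peer in $PEERS; do
        $GLUSTER peer detach "$peer"
    done
    # Probe them again.
    for peer in $PEERS; do
        $GLUSTER peer probe "$peer"
    done
    # Four bricks with "replica 4": every file is stored on all four servers.
    $GLUSTER volume create gv0 replica 4 transport tcp \
        192.168.0.11:/export/brick1 192.168.0.12:/export/brick1 \
        192.168.0.13:/export/brick1 192.168.0.14:/export/brick1
}

rebuild_replica4
```

With the same four bricks, "replica 2" instead of "replica 4" gives two replica pairs with files distributed between them, which is the distributed replicated layout the question asked about.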
Should you wish to stripe across two devices in a replica set, you would add a stripe count of 2 to the create command.
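A striped variant might look something like the sketch below, again with the addresses from the question. I am assuming stripe 2 with replica 2 so that the four bricks divide evenly; the function name striped_create is made up for the example, and the command is only printed unless you set GLUSTER=gluster.

```shell
#!/bin/sh
# Sketch only. Dry run by default: the command is printed.
# Set GLUSTER=gluster to execute it for real.
GLUSTER="${GLUSTER:-echo gluster}"

# striped_create is a hypothetical helper name, not a gluster command.
# stripe 2 x replica 2 needs a multiple of 4 bricks, which these servers provide.
striped_create() {
    $GLUSTER volume create gv0 stripe 2 replica 2 transport tcp \
        192.168.0.11:/export/brick1 192.168.0.12:/export/brick1 \
        192.168.0.13:/export/brick1 192.168.0.14:/export/brick1
}

striped_create
```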
Keep with it. I discovered gluster recently and I am in love with this ideology for distributed filesystems, a real piece of art.
I use gluster to provide HA redundancy for KVM virtual datastores. Magic stuff.