After digging around to understand how to set up replication using Gluster, I came across this question: Can Apache Read The GlusterFS Brick Directly But Write To The GlusterFS Mount?
I've also found a how-to that seems to explain the same thing. I thought I understood it, but now I think I don't.
So in order to get this kind of replication, do I need both machines to function as servers and clients at once? I don't understand how the relationship works: isn't B, for example, a client of A?
Is there more than one level of client-server relationship involved? Is A a client of A and B a client of B, each mounting a volume from the same machine into a folder, with those two volumes somehow kept in sync (from A to B) by a third layer of relationships?
Why is the question above asking about writing to the file system or to the mounted volume? When I make B a client of A, with A exporting a folder and B mounting it as a remote volume in a folder, I never asked myself what I was writing to: I wrote into the original folder on A and into the mounted volume on B. Isn't this how it's supposed to work?
Let's say you have two machines, A and B. On each machine, you export /opt/files as a Gluster brick, and set up client-side replication. We then mount the resulting directory as /mnt/gluster-files on both machines. This is important!
Using that mount point, we now have a highly available file system across the two machines.
When you write a file, let's say /mnt/gluster-files/example, on machine A, it will cause two things to happen:
1. The file is written to /opt/files on machine A.
2. The file is written to /opt/files on machine B.
This is good, because we want to have redundancy, which means we have to have more than one copy of the data.
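The write path described above can be illustrated with a toy model. This is only a sketch of the idea, not how GlusterFS is implemented: the class `ToyReplicatedMount` and the function `demo` are hypothetical names, and two temporary directories stand in for /opt/files on machines A and B.

```python
import os
import tempfile


class ToyReplicatedMount:
    """Toy stand-in for the mount point: one write lands on every brick."""

    def __init__(self, brick_dirs):
        # Each entry plays the role of /opt/files on one machine (assumption).
        self.brick_dirs = list(brick_dirs)

    def write(self, name, data):
        # A single write through the "mount" is copied to all bricks.
        for brick in self.brick_dirs:
            with open(os.path.join(brick, name), "wb") as handle:
                handle.write(data)


def demo():
    # Temporary directories stand in for the bricks on machines A and B.
    brick_a = tempfile.mkdtemp(prefix="brick-a-")
    brick_b = tempfile.mkdtemp(prefix="brick-b-")
    mount = ToyReplicatedMount([brick_a, brick_b])
    mount.write("example", b"hello")
    # Return the contents of each brick after the single write.
    return sorted(os.listdir(brick_a)), sorted(os.listdir(brick_b))
```

Calling `demo()` performs one write through the toy mount and returns the listing of both brick directories, so you can see that each holds its own copy.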
Next up, let's say we want to read the same file. Again on machine A:
/mnt/gluster-files/example
(§ There is a read-subvolume client option, and it is sensible to set it to the local volume on any machine that is both a Gluster client and server, as in this case. Otherwise, step 5, where the file is sent to you, could be 'you are sent the file from a random node'.)
Behind the scenes, GlusterFS keeps /opt/files on both machines in sync. Checking every node, especially for a large number of small files, adds a not-insignificant performance penalty.
The question is therefore raised: if I am running a process on one of these two machines, and I know the files are in sync, why can't I just read the files from the local share?
It's not recommended, but you can do this. Read the files from /opt/files. Manually keep track of whether you get out of sync, and if you do, run something like ls -laR in /mnt/gluster-files, which will trigger a synchronization.
So, what happens if you write to /opt/files on machine A?
The file sits there, unnoticed by GlusterFS. Gluster doesn't work that way. It doesn't get onto machine B unless you happen to do something which makes Gluster notice it on machine A.
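One way to "do something which makes Gluster notice" is the recursive listing of the mount mentioned earlier. Below is a rough Python equivalent of running ls -laR in the mount point. It assumes that a stat of each entry through the mount is what prompts Gluster to check that entry. The function name `walk_and_stat` is hypothetical.

```python
import os


def walk_and_stat(mount_point):
    """Stat every file and directory under mount_point, like `ls -laR`.

    Returns the number of entries that were stat'ed successfully.
    """
    count = 0
    for dirpath, dirnames, filenames in os.walk(mount_point):
        for name in dirnames + filenames:
            try:
                # lstat does not follow symlinks, matching what `ls -l` reports.
                os.lstat(os.path.join(dirpath, name))
                count += 1
            except OSError:
                # The entry vanished or is unreadable; skip it and carry on.
                pass
    return count


# Example call, assuming the mount point used in this answer:
# walk_and_stat("/mnt/gluster-files")
```

Note that the walk has to go through the mount point, not through /opt/files, since anything done directly on the brick bypasses Gluster.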
Therefore, you can't just tell Apache to read and write to /opt/files. What seems like a good compromise is telling it to read from /opt/files but write to /mnt/gluster-files. This is only possible if your application lets you specify a different path for reading and writing files, which not many do.
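For an application you control, the compromise could look roughly like the sketch below: reads go to the local brick, writes go through the Gluster mount. The helper names `read_file` and `write_file` are hypothetical. The two paths are the ones used in this answer and would need adjusting to your setup.

```python
from pathlib import Path

# Paths taken from the example above; change them to match your machines.
BRICK_DIR = Path("/opt/files")          # local brick: read only, never write
MOUNT_DIR = Path("/mnt/gluster-files")  # Gluster mount: all writes go here


def read_file(name, brick_dir=BRICK_DIR):
    """Read straight from the local brick, skipping the per-node check."""
    return (Path(brick_dir) / name).read_bytes()


def write_file(name, data, mount_dir=MOUNT_DIR):
    """Write through the mount so that Gluster replicates the file."""
    target = Path(mount_dir) / name
    # Create parent directories via the mount as well, not on the brick.
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(data)
```

The caveat from earlier still applies: a read from the brick can return stale data if the two machines have drifted out of sync, so this only makes sense when you are monitoring for that.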