Newbie question. I need to build this:
- A /shared folder holding ~500GB of files, ~1MB each.
- Two boxes (server1 and server2) connected by a 1Gbps LAN.
- Each box needs r/w access to the files, so they are both clients.
- I want the files replicated on both boxes: every time a file is written on one server, the same file should be present on the other one.
My questions regarding GlusterFS:
- Will it duplicate the files on the same box? For example, the files are in /shared and the mount is at /mnt/shared. Will it take 1TB of space on every server?
- Or should I instead use the filesystem directly, writing locally to /shared? Does replication work that way, without mounting a client?
Also, if anyone knows any other way to accomplish this setup, I'd be very grateful. Thanks in advance.
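For a rough sense of scale, the figures in the question can be turned into a few back-of-the-envelope numbers. This is only a sketch: it assumes decimal units and a LAN that sustains its full nominal rate, which real transfers will not reach, so treat the times as lower bounds.

```python
# Back-of-the-envelope numbers for the setup described in the question.
# Assumptions (not from the original post): decimal units, and the LAN
# sustaining its full nominal rate with no protocol or disk overhead.

TOTAL_BYTES = 500 * 10**9      # ~500GB of data
FILE_BYTES = 1 * 10**6         # ~1MB per file
LINK_BITS_PER_SEC = 1 * 10**9  # 1Gbps LAN

file_count = TOTAL_BYTES // FILE_BYTES
full_copy_seconds = TOTAL_BYTES * 8 / LINK_BITS_PER_SEC
per_file_ms = FILE_BYTES * 8 / LINK_BITS_PER_SEC * 1000

print(f"files: {file_count:,}")
print(f"full copy at line rate: {full_copy_seconds / 60:.0f} min")
print(f"one file at line rate: {per_file_ms:.0f} ms")
```

So the initial replication of roughly half a million files takes over an hour at best, while replicating a single newly written file costs on the order of milliseconds on the wire.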
I finally got this solved using GlusterFS on both boxes. Some things learned in the process:
- option read-subvolume. Of course, to keep the RAID1 integrity GlusterFS always checks the other volumes as well, but the actual file is retrieved directly from disk.
Modified client configuration:
Answering both of my questions:
- No, the fs is mounted using FUSE. Current /etc/fstab line:
/etc/glusterfs/client.vol /mnt/shared glusterfs defaults 0 0
- No, always use the mounted volume for reads and writes; using the filesystem directly may lead to inconsistencies.
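Since writing to the backing directory by mistake is easy, an application could check that a path really sits on the GlusterFS mount before touching it. Below is a minimal sketch, assuming Linux and assuming GlusterFS FUSE mounts are listed with type fuse.glusterfs in /proc/mounts. The helper names and the sample mount table are made up for illustration.

```python
import os

def mount_fstype(path, mounts_text):
    """Return the filesystem type of the mount that contains `path`.

    `mounts_text` is the content of /proc/mounts (Linux). The longest
    matching mount point wins, so nested mounts resolve correctly.
    """
    path = os.path.normpath(path)
    best_point, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        point, fstype = fields[1], fields[2]
        inside = path == point or path.startswith(point.rstrip("/") + "/")
        if inside and len(point) >= len(best_point):
            best_point, best_type = point, fstype
    return best_type

def is_gluster_mount(path, mounts_text):
    # Assumption: GlusterFS FUSE mounts are listed as "fuse.glusterfs".
    return mount_fstype(path, mounts_text) == "fuse.glusterfs"

# Sample /proc/mounts content, made up for illustration.
SAMPLE = """\
/dev/sda1 / ext4 rw,relatime 0 0
/etc/glusterfs/client.vol /mnt/shared fuse.glusterfs rw,relatime 0 0
"""

print(is_gluster_mount("/mnt/shared/a.jpg", SAMPLE))  # True
print(is_gluster_mount("/shared/a.jpg", SAMPLE))      # False
```

On a real box you would pass open("/proc/mounts").read() instead of the sample text, and refuse to write when the check fails.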
Actually Gluster is perfect for this scenario. You get bi-directional replication and the ability to mount the filesystem from either machine, giving you (theoretically) twice the effective I/O capacity of NFS and active failover should one of the boxes fail.
The problem with doing active rsync this way is blocking I/O due to file locks. Depending on your application and how fast the data changes, this could be irrelevant or disastrous! Distributed filesystems have very specific locking semantics that prevent this from happening. Even if inotify has better locking these days (when I last tried it, it didn't), your file accesses may still block, depending on whether your network can keep up with the changes. These are all theoretical caveats, but they are worth looking into depending on what your app does.
It'd be much easier to set up rsync to do active mirroring, or to just set up an NFS share and have both boxes pull from the same actual drive.