By large file tree I mean about 200k files, and growing all the time. A relatively small number of files are being changed in any given hour though.
By bidirectional I mean that changes may occur on either server and need to be pushed to the other, so rsync doesn't seem appropriate.
By distant I mean that the servers are both in data centers, but geographically remote from each other. Currently there are only 2 servers, but that may expand over time.
By real-time, I mean it's ok for there to be a little latency between syncs, but running a cron job every 1-2 minutes doesn't seem right, since only a very small fraction of files may change in any given hour, let alone any given minute.
EDIT: This is running on VPSes, so I might be limited in the kinds of kernel-level stuff I can do. Also, the VPSes are not resource-rich, so I'd shy away from solutions that require lots of RAM (like Gluster?).
What's the best / most "accepted" approach to get this done? This seems like it would be a common need, but I haven't been able to find a generally accepted approach yet, which was surprising. (I'm seeking the safety of the masses. :)
I've come across lsyncd, which triggers a sync at the filesystem-change level. That seems clever, though not super common, and I'm a bit confused by the various lsyncd approaches. There's just using lsyncd with rsync, but that could be fragile for bidirectionality, since rsync has no notion of memory (e.g., no way to know whether a file deleted on A should be deleted on B, or whether it's a new file on B that should be copied to A). lipsync appears to be just an lsyncd+rsync implementation, right?
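To make the "memory" problem concrete, here's a minimal sketch of how a stateful syncer (the kind of thing csync2 or Unison keeps a database for) can tell a deletion on one side apart from a creation on the other, by comparing each side against a snapshot of the last synced state. This is illustrative logic only, not any tool's actual implementation:

```python
# Hypothetical sketch: classify changes by comparing each side's current
# file list against the last synced snapshot -- the state that a plain
# stateless rsync run doesn't have.

def classify(snapshot, side_a, side_b):
    """snapshot, side_a, side_b are sets of relative paths."""
    actions = []
    for path in sorted(snapshot | side_a | side_b):
        in_snap, in_a, in_b = path in snapshot, path in side_a, path in side_b
        if in_snap and not in_a and in_b:
            actions.append(("delete_on_b", path))   # removed on A since last sync
        elif in_snap and in_a and not in_b:
            actions.append(("delete_on_a", path))   # removed on B since last sync
        elif not in_snap and in_a and not in_b:
            actions.append(("copy_a_to_b", path))   # new on A
        elif not in_snap and in_b and not in_a:
            actions.append(("copy_b_to_a", path))   # new on B
    return actions

# The two ambiguous-looking cases become unambiguous with the snapshot:
print(classify({"x"}, set(), {"x"}))   # file existed before, gone on A: delete on B
print(classify(set(), set(), {"x"}))   # file never synced, present on B: copy to A
```

Without the snapshot argument, both calls present rsync with the identical picture (missing on A, present on B), which is exactly why lsyncd+rsync alone is shaky for bidirectional use.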
Then there's using lsyncd with csync2, like this: https://icicimov.github.io/blog/devops/File-system-sync-with-Csync2-and-Lsyncd/ ... I'm leaning towards this approach, but csync2 is a little quirky, though I did do a successful test of it. I'm mostly concerned that I haven't been able to find a lot of community confirmation of this method.
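For intuition, the lsyncd+csync2 pattern boils down to "watch for change events, debounce them, then run one `csync2 -x` per quiet batch." Here's a rough Python sketch of just that batching logic (the event feed and the sync command are injectable stand-ins, not lsyncd's real API):

```python
def watch_and_sync(events, run_sync, delay=5.0):
    """Collect change events and fire one sync per quiet batch.

    events   -- iterable of (timestamp, path) change notifications
    run_sync -- callable invoked once per batch; a stand-in for shelling
                out to `csync2 -x` (hypothetical wiring, not lsyncd's API)
    delay    -- seconds to wait so rapid bursts collapse into one sync
    """
    batch, deadline = [], None
    for ts, path in events:
        if deadline is not None and ts >= deadline and batch:
            run_sync(batch)      # quiet period elapsed: flush the batch
            batch = []
        batch.append(path)
        deadline = ts + delay    # each new event pushes the deadline out
    if batch:
        run_sync(batch)          # flush whatever is left at the end

# A burst of three quick changes becomes a single sync run; the change
# at t=60 lands in its own later batch.
calls = []
watch_and_sync([(0.0, "a"), (1.0, "b"), (2.0, "c"), (60.0, "d")],
               run_sync=calls.append)
print(calls)  # [['a', 'b', 'c'], ['d']]
```

The point of the debounce is exactly your 200k-files situation: a flurry of writes in one directory should cost one csync2 pass, not one per file.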
People on here seem to like Unison a lot, but it seems it's no longer under active development, and it's not clear whether it has an automatic trigger like lsyncd.
I've seen Gluster mentioned, but maybe overkill for what I need?
UPDATE: fyi- I ended up going with the original solution I mentioned: lsyncd+csync2. It seems to work quite well, and I like the architectural approach of having the servers be very loosely joined, so that each server can operate indefinitely on its own regardless of the link quality between them.
DRBD in dual-primary mode with DRBD Proxy is an option.
In your case I would recommend a combination of DRBD in dual-primary mode and a cluster filesystem such as GFS2 or OCFS2.
The drawback of DRBD in dual-primary is that it will be running in synchronous mode. But write speed does not seem to be important here, right?
An alternative to DRBD might be a software RAID 1 over several (2+) iSCSI targets, but I would prefer DRBD with two nodes.
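For reference, dual-primary is only a couple of lines in the resource configuration. A rough drbd.conf-style sketch (DRBD 8.x syntax; hostnames, devices and addresses are placeholders):

```
resource r0 {
  protocol C;                  # synchronous replication; dual-primary requires it
  net {
    allow-two-primaries;       # let both nodes promote and mount the device
  }
  on nodeA {
    device    /dev/drbd0;
    disk      /dev/sdb1;       # placeholder backing device
    address   10.0.0.1:7788;
    meta-disk internal;
  }
  on nodeB {
    device    /dev/drbd0;
    disk      /dev/sdb1;
    address   10.0.0.2:7788;
    meta-disk internal;
  }
}
```

Note you still need the cluster filesystem (GFS2/OCFS2) on top: mounting an ordinary filesystem like ext4 from both primaries at once will corrupt it.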
Rather than syncing, why not share the same filesystem over NFS?
Implementing a distributed filesystem is probably better than hacking this together with tools and scripts, especially if the cluster of servers will grow. You'll also be able to handle a downed node better.
I don't think Gluster (or AFS) is overkill at all.
As demonstrated above, many solutions are available, each with its advantages and drawbacks.
I think I would consider placing the whole tree under version control (Subversion, for instance) and periodically checking in/updating from both servers in cron jobs.
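As a concrete sketch of that idea, each server could run something like the following from cron. The paths and commit message are illustrative, and the command runner is injectable just to make the flow visible; note that svn will still leave conflicts to resolve by hand if both servers change the same file between runs:

```python
import subprocess

def svn_sync(working_copy, run=None):
    """One cron-driven sync pass over a Subversion working copy.

    run -- callable taking an argv list; defaults to subprocess.run.
           Injectable so the command sequence can be shown without svn.
    """
    if run is None:
        run = lambda argv: subprocess.run(argv, check=True)
    # Schedule any brand-new files for addition (--force skips files
    # that are already under version control).
    run(["svn", "add", "--force", working_copy])
    # Push this server's changes to the repository...
    run(["svn", "commit", "-m", "automated sync", working_copy])
    # ...then pull in whatever the other server committed.
    run(["svn", "update", working_copy])

# Dry illustration of the command sequence, without touching svn:
cmds = []
svn_sync("/var/www", run=cmds.append)
print([c[:2] for c in cmds])  # [['svn', 'add'], ['svn', 'commit'], ['svn', 'update']]
```

The catch relative to the lsyncd approaches is that this is polling, not event-driven, and the `.svn` bookkeeping adds overhead on a 200k-file tree; the upside is that the repository gives you the "memory" rsync lacks, plus history for free.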
Having just ended somewhat of a quest regarding the same thing, I'm going with Gluster. However, I haven't done or found any performance tests.