I've got two webservers, with the chance of having to add more along the way. Right now I keep these servers in sync using lsyncd + csync2. It works well performance-wise because all files exist on both servers (no network access is needed to open files locally), but not so well in other cases.
One example: if I delete a file on server 1 and immediately upload a new file with the same name to server 1, the old file gets deleted from server 2 in the meantime, and when server 2 propagates that delete event back to server 1 to complete the "update circle", the newly uploaded file on server 1 is deleted as well.
I can't help thinking that there must be a better way to keep servers in sync. I've been looking at GlusterFS, and I see that a setup where all files are replicated to all servers is discouraged. However, I'm running CMS systems like Drupal on these servers. Such CMSes often open quite a few files per request, and I'm worried that the network traffic needed to fetch these files will slow down requests.
Would it be worth replacing lsyncd + csync2 with GlusterFS set up to replicate all files to all nodes, or is that a bad idea?
BitTorrent Sync may do the job for you. I'm using it to keep files in sync between a few internal servers at my house and it's working wonderfully. The other thing you'll need to think about is the backend database, since your app is a CMS. Make sure there's MySQL replication going on, or something of that sort.
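As a rough sketch of what that could look like (server IDs, hostnames, credentials and the database name below are just placeholders), a basic master-master setup in my.cnf might be along these lines:

    # /etc/mysql/my.cnf on server 1 (use server-id = 2 and offset 2 on server 2)
    [mysqld]
    server-id                = 1
    log_bin                  = /var/log/mysql/mysql-bin.log
    binlog_do_db             = drupal    # placeholder: the CMS database
    auto_increment_increment = 2         # avoid key collisions when both masters take writes
    auto_increment_offset    = 1

    -- then on server 2, point replication at server 1 (coordinates are placeholders)
    CHANGE MASTER TO MASTER_HOST='server1', MASTER_USER='repl',
      MASTER_PASSWORD='secret', MASTER_LOG_FILE='mysql-bin.000001', MASTER_LOG_POS=0;
    START SLAVE;

You'd do the mirror-image CHANGE MASTER on server 1 so writes flow both ways.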
GlusterFS is hard to deploy. For web data, a file-level sync tool like Unison is much easier to deploy and maintain.
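For example (paths and hostname are placeholders), a two-way sync of the document root run from server 1, perhaps out of cron, could look roughly like this:

    # two-way sync of the web root with server2 over SSH, non-interactive
    unison /var/www ssh://server2//var/www -batch -times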
DRBD is a good solution for keeping data in sync at the block level, but since both servers need to mount the device at the same time, you have to format it with a cluster filesystem like OCFS2 or something similar.
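A minimal resource definition for a dual-primary setup might look roughly like this (hostnames, backing disks and IPs are placeholders, and the exact syntax depends on your DRBD version, so check the manual):

    # /etc/drbd.d/web.res
    resource web {
      net {
        protocol C;                 # synchronous replication
        allow-two-primaries yes;    # both nodes primary, required for OCFS2 on top
      }
      on web1 {
        device    /dev/drbd0;
        disk      /dev/sdb1;        # backing partition (placeholder)
        address   10.0.0.1:7789;
        meta-disk internal;
      }
      on web2 {
        device    /dev/drbd0;
        disk      /dev/sdb1;
        address   10.0.0.2:7789;
        meta-disk internal;
      }
    }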
Gluster would solve the problem you have because it can hold locks and propagate changes (deleting the file on all other nodes), but it may add latency that can be a problem for a webserver. The next alternative is DRBD+OCFS2 or GFS, but that's probably more complex: Gluster works on top of the underlying filesystem rather than at the block level, so if the servers get out of sync it's not too hard to fix, files can't get corrupted so easily by split brain, and so on.
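A replicated two-node volume is simple to set up; something like this (volume name, hostnames and brick paths are placeholders) gives every webserver a local brick plus a mount through the native FUSE client:

    # on one node: create and start a 2-way replicated volume
    gluster volume create webdata replica 2 web1:/export/webdata web2:/export/webdata
    gluster volume start webdata

    # on each webserver: mount it where the CMS expects its files
    mount -t glusterfs localhost:/webdata /var/www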
We are using Gluster for a mail server and it is quite slow for directories with a lot of files. You should definitely test everything before deploying. I'm currently testing the NFS mount because it works better for small files.
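If you want to try that, the same (hypothetical) volume from above can be mounted over NFS instead of the FUSE client; Gluster's built-in NFS server speaks NFSv3 only, so force the version:

    # mount the gluster volume via its built-in NFS server (NFSv3)
    mount -t nfs -o vers=3,nolock web1:/webdata /var/www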
Why don't you use a tool like Puppet? Write the content once in a source tree and, once it's ready, deploy it to the targets using "puppet kick" or MCollective. It's well documented, and you can easily add servers later if needed.
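A sketch of the idea (module name, path and ownership are placeholders): a single file resource recursively pushes the site from the puppetmaster to every node that has the class applied.

    # hypothetical manifest in a "mysite" module
    file { '/var/www/mysite':
      ensure  => directory,
      recurse => true,
      source  => 'puppet:///modules/mysite/www',   # files served from the module's files/ dir
      owner   => 'www-data',
      group   => 'www-data',
    }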
You can also rely on tools using inotify, like lsyncd, which hooks into the kernel's file-change notifications: it watches for changes in a folder and triggers a sync. But if a tool dedicated to synchronizing files on a cluster, like csync2, is not enough, I don't know what will be.
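For reference, a bare-bones lsyncd configuration pushing changes to the other node with rsync over SSH could look something like this (paths and hostname are placeholders; adjust to your lsyncd version):

    -- /etc/lsyncd/lsyncd.conf.lua
    settings {
      logfile = "/var/log/lsyncd.log",
    }
    sync {
      default.rsyncssh,            -- push changes with rsync over ssh
      source    = "/var/www",
      host      = "web2",
      targetdir = "/var/www",
      delay     = 1,               -- batch events for at most 1 second
    }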
Just to be sure, do the modifications happen only on server 1, or also on server 2?