Looking for some input, and to hear whether anyone has conquered this problem with a solution they feel confident about.
I'm looking to set up a fault-tolerant web environment. The setup is a few nodes behind a load balancer, and the web devs can ssh into one server to edit the code and such.
I'm thinking of GlusterFS, but putting a GlusterFS filesystem as the doc root led to around a 20-30% decrease in the number of pages the webserver could serve. I expected this, since I'm only running over Ethernet and not InfiniBand or the like.
So I was thinking about using GlusterFS + inotify: I'd have an inotify script running that monitors the docroot and the Gluster mount for changes and rsyncs whichever file/dir changed. That way Apache can serve from the local disk rather than from Gluster, but it still gives the effect of being served from a clustered filesystem.
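Roughly, each watcher would be something along these lines (a minimal sketch using Python's watchdog package, which uses inotify on Linux, rather than a raw inotify script; the paths and rsync options are placeholders):

```python
#!/usr/bin/env python
# Sketch only: watch the local docroot and mirror changes to the Gluster mount
# with rsync. The second script would do the same in the opposite direction.
import subprocess
from watchdog.observers import Observer
from watchdog.events import FileSystemEventHandler

DOCROOT = "/var/www/html"           # local disk Apache serves from (placeholder)
GLUSTER_MOUNT = "/mnt/gluster/www"  # replicated Gluster volume (placeholder)

class SyncHandler(FileSystemEventHandler):
    def on_any_event(self, event):
        # Naive version: re-sync the whole tree on any change. The real script
        # would rsync only the file/dir that changed to keep things fast.
        subprocess.call(
            ["rsync", "-a", "--delete", DOCROOT + "/", GLUSTER_MOUNT + "/"]
        )

if __name__ == "__main__":
    observer = Observer()
    observer.schedule(SyncHandler(), DOCROOT, recursive=True)
    observer.start()
    observer.join()
```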
My only issue with that is that I'd need to run two of the inotify scripts, and for the file count we're running, adding all the inotify watches would use around 700 MB of RAM for the two of them.
So, does anyone have any advice or pointers?
Thanks!
EDIT
Think of it like a web host: clients ssh into one server, but the files they create/edit/delete need to be on all the other nodes.
The reverse also needs to be true: if the webserver creates files, they need to be on all the other nodes too.
So that throws a straight rsync out since it's too slow.
Read @Zypher's comment. Read it over and over until you understand the wisdom of those words, see the light, and chase your developers off your production servers and into an appropriate sandbox.
You can borrow my pointy stick. :-)
Reframing your question in that light: "How do I keep the code on my webservers consistent?" Answer: Puppet (or Chef), radmind, or any of the many wonderful configuration/deployment systems out there.
These tools give you a much simpler way of achieving your goal, take up a lot less RAM/CPU, and can be set up to guarantee consistency across all your nodes.
(This portion of the answer has been withdrawn based on the edit to the original question.)
There's really only one solution for you that I can think of, and that's a SAN (or NAS device serving up files over NFS).
The reason I'd suggest this route is that you need to have files created by each of the servers available to all the others. Doing massive N-way synchronization will become unwieldy and slow. Centralizing onto a SAN will give better performance, good redundancy (SANs are pretty bullet-proof if you don't cheap out), and the ability to scale up easily as your needs increase.
It isn't without downsides: unless you run a pair of mirrored, redundant SANs with redundant fabric, you will be introducing a single point of failure. SANs are also not cheap, and the redundancy just adds more expense.
Note that none of this obviates the need to keep the developers off the production box, unless you're guaranteed that they'll never call you when they break something. At the very least you should be strongly suggesting that they rent a dev environment from you (at a reasonable profit obviously - something to help pay for the cost of the SAN...)
Oh wow, I'm having flashbacks to a past job, where GFS was used for the exact rationale you're describing. The scenario: in excess of 2000 customers running their apps on a number of large-scale clouds.
Basically, you can't do what you think you want to do. You cannot get a clustered or network filesystem that will work at anywhere near the speed of a local filesystem. Let me emphasise that for a second: CANNOT. If you think you can, you're deluding yourself. If someone else says they can, they're lying. It's simple mathematics: disk speed + controller IO + network latency + cluster fu must be greater than disk speed + controller IO.
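To spell that arithmetic out (the terms are just per-operation latencies, so this is illustrative rather than a benchmark):

$$
t_{\text{clustered}} = t_{\text{disk}} + t_{\text{controller}} + t_{\text{network}} + t_{\text{cluster}} \;>\; t_{\text{disk}} + t_{\text{controller}} = t_{\text{local}}
$$

Every extra term is strictly positive, so the clustered path can never beat the local one.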
Now, down to the reasons you might be building this, and why what you want to do is useless:
Now that I've been a negative nelly for a screen or so, what can you do? Well, it basically comes down to helping your customers on an individualised level.
You can't build a cookie-cutter, one-size-fits-all hosting infrastructure for infinite scale. That's what my GFS-loving past employer tried to do, and by gum it didn't work then, and I'm confident it cannot be done with currently available development and operational technologies.
Instead, take a bit of time to assess your customers' requirements and help them towards a solution that meets their requirements. You don't necessarily have to do a full analysis of every customer; after the first few, you'll (hopefully) start to see patterns emerge, which will guide you towards a range of "standard" solutions. It becomes "OK, you've got requirements F, P, and Aleph-1, so you'd be best off with solution ZZ-23-plural-Z-alpha -- and here's our comprehensive set of documentation on deploying this solution, and our prices for custom consulting on this solution if you can't implement it yourself are at the bottom".
As far as specifics go, there are too many to list, but I'll give you a few hints: