I've looked into tools like DRBD (which scales only up to 2 active-active nodes) and GlusterFS (which is great, but not for the small files that a web server usually serves).
I'm looking for a filesystem/cluster of storage servers:
- That is POSIX compliant (as many web apps require this)
- That works great with a 90% reads / 10% writes workload of small files (think of index.php pages or small images)
- That is scalable, both in size and performance
- That has minimal access time
I have thought a lot about this and I think that there is nothing out there that could help me.
and welcome to this eternal problem.
Sometimes a separate file server, with X number of web servers mounting the web root over NFS, works fine. For read-oriented workloads, very likely. You didn't give us any traffic numbers, so I can't be sure.
Sometimes an approach along the lines of "Puppet (or just good old rsync and some scripting) spreads the files to every web server node" works fine.
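To make the "rsync and some scripting" idea a bit more concrete, here is a rough sketch in Python. The host names, the `/var/www` paths and the helper names `build_rsync_commands` and `push` are all placeholders I made up for illustration; the rsync flags used (`-a`, `-z`, `--delete`) are standard ones, but check them against your own setup before running anything with `dry_run=False`, since `--delete` removes files on the target that are missing from the source.

```python
import subprocess
from typing import List


def build_rsync_commands(src: str, nodes: List[str], dest: str) -> List[List[str]]:
    """Build one rsync command (as an argv list) per web server node.

    `nodes` and `dest` are hypothetical; replace them with your own hosts and paths.
    """
    # A trailing slash on the source makes rsync copy the directory's
    # contents rather than the directory itself.
    src = src.rstrip("/") + "/"
    return [["rsync", "-az", "--delete", src, f"{node}:{dest}"] for node in nodes]


def push(commands: List[List[str]], dry_run: bool = True) -> List[str]:
    """Run each command in turn; return the targets that failed.

    With dry_run=True the commands are only printed, nothing is executed.
    """
    failed = []
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
            continue
        result = subprocess.run(cmd)
        if result.returncode != 0:
            # The last argv element is the "host:path" target.
            failed.append(cmd[-1])
    return failed


if __name__ == "__main__":
    cmds = build_rsync_commands(
        "/var/www",
        ["web1.example.com", "web2.example.com"],  # placeholder hosts
        "/var/www/",
    )
    push(cmds, dry_run=True)
```

Every node ends up with a full local copy, so reads never touch the network. The price is that writes only become visible on the other nodes after the next sync, which is usually acceptable for a mostly-read workload but not for files the application itself writes at runtime.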
For some, GFS/GPFS works fine, but I would not recommend it with lots of small files.
There is also clustered LVM and a clustered version of XFS called CXFS. With those you could have a single SAN LUN mounted by every web server node, similar to GFS/GPFS. I have no idea if this works any better with lots of small files, though; I suspect it does not.
Personally I would avoid everything that would act as a single point of failure. If someone has good suggestions for resolving this problem, I'm also VERY interested. +1 to your question, sir!