We have a number of Linux web servers behind a hardware load balancer, which serve a single website.
The need has arisen for visitors to be able to upload files. The files are typically 300-700KB in size and we expect in the region of 1 million of them. However, this poses an obvious problem: if a user uploads a file to the server handling their request, how do we keep all of the servers in sync?
The delay should be minimal, so using something like rsync on a set schedule isn't really an option. There should also not be a single point of failure, so NFS wouldn't be suitable, unless it was combined with DRBD to create a failover NFS server.
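For comparison, the scheduled-rsync approach being ruled out would look something like this (hostnames and paths are hypothetical) — a crontab fragment on the server receiving uploads, pushing to each peer once a minute, which is exactly the sync delay you want to avoid:

```
# Hypothetical crontab on the upload-receiving server; web2/web3 and the
# paths are examples. Worst-case delay is the full cron interval.
* * * * * rsync -a --partial /var/www/uploads/ web2:/var/www/uploads/
* * * * * rsync -a --partial /var/www/uploads/ web3:/var/www/uploads/
```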
I have looked into shared/clustered filesystems (GlusterFS, MogileFS, OCFS2 and GFS), but have no experience of these, so am unsure how they perform in a production environment in terms of performance and reliability.
I welcome any advice as to how this problem is best solved.
Many thanks
GFS2/OCFS2 via DRBD allow a pair of servers to run dual-primary as clustered storage; your web frontends would pull from that shared pair. You could also have multiple heads sharing a single FC-attached LUN using either filesystem, or use NFS to present a single shared filesystem to each of the web frontends. If you use NFS with DRBD, remember that you can only run it in primary/secondary mode due to the lack of cluster locking, which could cut your potential throughput in half.
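The dual-primary piece of that setup lives in the DRBD resource config. A minimal sketch for DRBD 8.x follows — the resource name, hostnames, devices and addresses are all placeholders; the key lines are `allow-two-primaries` and the split-brain policies, without which a cluster filesystem on top will not work safely:

```
resource r0 {
  protocol C;                             # synchronous replication
  net {
    allow-two-primaries;                  # required for GFS2/OCFS2 dual-primary
    after-sb-0pri discard-zero-changes;   # split-brain recovery policies
    after-sb-1pri discard-secondary;
  }
  startup {
    become-primary-on both;
  }
  on node1 {
    device    /dev/drbd0;
    disk      /dev/sdb1;
    address   10.0.0.1:7788;
    meta-disk internal;
  }
  on node2 {
    device    /dev/drbd0;
    disk      /dev/sdb1;
    address   10.0.0.2:7788;
    meta-disk internal;
  }
}
```

You then format `/dev/drbd0` with GFS2 or OCFS2 and mount it on both nodes; a working cluster lock manager (e.g. o2cb for OCFS2) is also needed.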
GlusterFS sounds more like what you're looking for. It has some unique quirks, e.g. when a file is requested on a node that doesn't yet hold a copy, the metadata lookup finds it, it is transferred from one of the replicated nodes, and then served. The first request for a given file on a node will therefore lag a little, depending on the file size.
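For a sense of the setup effort, creating a two-way replicated Gluster volume is only a couple of commands. This is a sketch with hypothetical node names (`gfs1`, `gfs2`), brick paths and volume name:

```shell
# On one of the storage nodes, after peering gfs1 and gfs2:
gluster volume create uploads replica 2 gfs1:/export/brick1 gfs2:/export/brick1
gluster volume start uploads

# On each web frontend, mount the volume via the FUSE client:
mount -t glusterfs gfs1:/uploads /var/www/uploads
```

Every frontend then sees the same namespace, and writes are replicated to both bricks synchronously.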
OpenAFS is another possibility. You have shared storage, and each client keeps a local cache of recently used files; if the storage server goes down, those local caches can still serve what they already hold.
Hadoop's HDFS is another alternative that just 'works'. It's a bit complicated to set up, but it would also meet your requirements. Bear in mind that with any replicated distributed filesystem you're going to store multiple copies of your content.
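Once a cluster is running, day-to-day use of HDFS from the shell is straightforward — a sketch, with example paths and a replication factor of 3:

```shell
# Create a directory in HDFS and store an upload with 3-way replication.
hdfs dfs -mkdir -p /uploads
hdfs dfs -put local-file.jpg /uploads/

# Adjust (and wait for) the replication factor of an existing file:
hdfs dfs -setrep -w 3 /uploads/local-file.jpg
```

Note that serving files back to web clients means going through an HDFS client library or gateway rather than a plain filesystem mount.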
Another quick-and-dirty method is to keep a single copy of the static/uploaded content on one machine and run Varnish on each of the frontends to maintain a RAM-cached version of it. If the single machine fails, Varnish keeps serving the items it has cached until their grace period expires; new items would be lost.
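The grace behaviour is a few lines of VCL. A sketch for Varnish 2/3-era syntax — the backend address and the 6h window are examples, not recommendations:

```
backend uploads {
  .host = "10.0.0.10";   # the single machine holding the master copy
  .port = "80";
}

sub vcl_recv {
  # Be willing to serve objects up to 6h past their TTL if the backend is down.
  set req.grace = 6h;
}

sub vcl_fetch {
  # Keep expired objects around for 6h so they're available to serve in grace.
  set beresp.grace = 6h;
}
```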
Much of this comes down to how reliable a backend you need. Distributed filesystems where your local machines are themselves replicating nodes will probably have the edge on speed, since they don't need a network round trip to fetch the data, but with GigE and 10G cards being cheap, you probably won't experience significant latency either way.
All clustered filesystems share one central weakness: if a certain percentage of systems go offline, the whole filesystem becomes unusable, and the nodes that are still up may not deal with it gracefully.
For example, assume that you have 30 servers in a rack and want to share their local space. You build a cluster filesystem, and you even build it so that if one node goes down, the data is replicated on enough other nodes that there is no problem. So far, so good. Then a card in the Ethernet switch that interconnects all of the nodes in your cluster dies, cutting off communication to 15 of your 30 nodes. The question you need to ask yourself is: what does each half of the cluster do now?
Think forward a couple of steps and what the system will do under failure of any major component, including network or power cables.
Now you are ready for a clustered file system and all the million new jargon words you didn't know before.
We use either NFS from NetApp boxes or OCFS2 volumes on FC LUNs. I don't know if they're the best options, but they've worked for us for years. Both perform perfectly adequately, although I personally prefer the OCFS2-over-FC-LUNs option, as I'm more of a storage guy really. I guess it comes down to what shared-storage infrastructure you already have and are comfortable with.
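For what it's worth, once the o2cb cluster stack is configured, mounting an OCFS2 volume on each frontend is just an fstab line — the device path and mount point here are examples:

```
# Example /etc/fstab entry; _netdev delays the mount until the
# network and cluster stack are up.
/dev/mapper/fc-lun1  /var/www/uploads  ocfs2  _netdev,defaults  0 0
```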