I'm building my web service on EC2 right now and have a single instance behind a load balancer. I will of course need to cater for multiple instances.
My initial idea was to run all the instances as dumb slaves, and use S3 as local storage. For this, I've begun using S3FS, but from what I've seen it's not really ready for production use in a web serving environment. Log writes seem to appear very late, if at all. There are numerous issues with odd caching, even with no cache flags etc. It's just generally a nightmare to develop on.
But the alternatives look few. One is obviously EBS volumes, which can only be attached to a single instance. Some solutions to sharing this:
- SMB sharing to other instances, with one master and the rest slaves. Obviously this needs redundancy built in, perhaps with multiple EBS volumes?
- Rsync sharing to other boxes. This seems painful, considering it's not persistent and will only be updated periodically. Potentially OK if there are scripts to force an update when major changes have happened.
The question is... what DO people DO? It seems an entirely common use case, but the variety of answers found in forums, and even here on SF, suggests there's no concise answer... help wanted!
An EBS volume that pushes to S3/CloudFront seems like the best move here, especially if you're worried about images, CSS, JavaScript, that type of stuff.
EBS will be easier to snapshot/backup than S3, especially for the server's filesystem.
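To make the "EBS volume that pushes to S3" idea a bit more concrete, here's a rough sketch of the push step: walk the asset directory on the EBS volume and upload only the files whose contents changed since the last run. Everything named here (`push_changed`, the `manifest` dict, the `upload` callable) is made up for illustration. The actual upload is left as a callable you supply; with boto3 that would probably be a thin wrapper around the S3 client's `upload_file`, but that wiring is an assumption and isn't shown.

```python
import hashlib
from pathlib import Path
from typing import Callable, Dict, List


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file (used only for change detection)."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def push_changed(root: Path, manifest: Dict[str, str],
                 upload: Callable[[Path, str], None]) -> List[str]:
    """Upload every file under root whose digest differs from the manifest.

    manifest maps object key -> digest from the previous run and is updated
    in place. upload(path, key) is a hypothetical hook that does the real
    transfer. Returns the keys that were uploaded, in path order.
    """
    pushed = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        key = path.relative_to(root).as_posix()
        digest = file_digest(path)
        if manifest.get(key) != digest:
            upload(path, key)
            manifest[key] = digest
            pushed.append(key)
    return pushed
```

You'd persist `manifest` between runs (a JSON file on the same volume would do) and call `push_changed` from a deploy script or cron. Note this sketch never deletes objects that were removed locally.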
You could also designate one server as the "master" and another as the "slave", and only make changes on the "master".
As for logging, have a look at some of the cloud logging services like http://loggly.com/ or https://papertrailapp.com/.
HTH
That's because there are a lot of answers. It depends on your familiarity with the options and what you're comfortable with; what you're sharing, how often it's going to be synced and how "synced" it needs to be; how the instances will be utilized (spare for heartbeat? Read-only vs. a write instance? Balanced instances?); how complicated you want the setup to be; and what application you're using (databases that can sync themselves? Applications that are built for shared storage?).
You could use rsync on a schedule, shares from a file server, an NFS server, DRBD ("software RAID 1"), etc. It depends on your particular use case and how you're backing up the data.
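As one example of the rsync route, here's a minimal sketch of a script the "master" could run from cron (or on demand after a big change) to mirror a directory to each of the other instances. The function names, hosts and paths are all hypothetical, and it assumes rsync is installed and SSH keys are already set up between the boxes. The only rsync options used are `-a` (archive), `-z` (compress) and `--delete` (remove files on the target that no longer exist on the source).

```python
import subprocess
from typing import Callable, List, Sequence


def rsync_command(src_dir: str, host: str, dest_dir: str,
                  delete: bool = True) -> List[str]:
    """Build the argv for mirroring src_dir to host:dest_dir."""
    cmd = ["rsync", "-az"]
    if delete:
        cmd.append("--delete")
    # Trailing slash on the source means "copy the contents of this
    # directory", not "copy the directory itself into dest_dir".
    cmd += [src_dir.rstrip("/") + "/", f"{host}:{dest_dir}"]
    return cmd


def sync_all(src_dir: str, hosts: Sequence[str], dest_dir: str,
             run: Callable = subprocess.run) -> List[str]:
    """Run rsync against every host; return the hosts where it failed."""
    failed = []
    for host in hosts:
        result = run(rsync_command(src_dir, host, dest_dir))
        if result.returncode != 0:
            failed.append(host)
    return failed
```

Something like `sync_all("/var/www", ["10.0.0.12", "10.0.0.13"], "/var/www")` would then push to two slaves and tell you which ones, if any, need a retry. The `run` parameter is only there so the loop can be tested without actually invoking rsync.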
The short answer is that there is no single answer to your question, because it depends on your use case.