I currently run a production environment with 4 dedicated memcached servers, each with 48 GB of RAM (42 GB dedicated to memcached). They are doing fine right now, but traffic and content are growing and will almost certainly keep growing next year.
What are your thoughts on strategies for scaling memcached further? How have you handled it so far:
Do you add more RAM to the existing boxes up to their full capacity, effectively doubling the cache pool on the same number of boxes? Or do you scale horizontally by adding more of the same boxes, with the same amount of RAM?
The current boxes can certainly handle more RAM: their CPU load is quite low, and the only bottleneck is memory. But I wonder whether it would be a better strategy to distribute the cache across more boxes, making things more redundant and minimizing the impact of losing one box (losing 48 GB of cache versus losing 96 GB). How would you handle (or how have you handled) this decision?
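
To put rough numbers on the trade-off, here is a minimal sketch in Python. It assumes keys are spread evenly across boxes and, for simplicity, treats the total RAM per box as the cache size. The function name and the 8-box scale-out option are only illustrative, not something I have deployed.

```python
def loss_on_one_failure(num_boxes: int, gb_per_box: float) -> tuple[float, float]:
    """Return (GB lost, fraction of pool lost) when a single box goes down.

    Assumes equal-sized boxes and an even key distribution (a simplification;
    real client-side hashing is only approximately even).
    """
    if num_boxes < 1:
        raise ValueError("need at least one box")
    pool_gb = num_boxes * gb_per_box
    return gb_per_box, gb_per_box / pool_gb


# Hypothetical options: double the RAM per box, or double the number of boxes.
options = {
    "scale up (4 x 96 GB)": (4, 96),
    "scale out (8 x 48 GB)": (8, 48),
}

for name, (boxes, gb) in options.items():
    lost_gb, lost_frac = loss_on_one_failure(boxes, gb)
    print(f"{name}: pool {boxes * gb} GB, one failure loses {lost_gb} GB ({lost_frac:.1%})")
```

Under these assumptions both options give the same 384 GB pool, but a single failure costs 25% of the cache when scaling up and 12.5% when scaling out.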