I am having a really odd problem and I am a bit lost as to what to try next.
We are running 4 dedicated memcached boxes in production. Each box has 48 GB of RAM and runs nothing but memcached, with the daemon's memory limit set to 42 GB.
The problem is that, regardless of the traffic and the number of gets/sets the boxes receive, the cache fills up to about 38 GB on all 4 of them, and then the free RAM available to the operating system slowly drops over the course of several days until the boxes start swapping, fill up the swap and thrash! This is really strange, as nothing else is running on the boxes that could fill up the rest of the RAM, and memcached is using 38 GB and not growing (at least that is what the graphs and the stats show).
I have tried setting the swappiness to 0 but it didn't help. I have tried lowering the cache limit even more but I get the same behavior.
I am running CentOS 5.6, kernel 2.6.18-238, memcached 1.4.4 and libevent-1.4.13-1.
Have any of you run into a similar problem before? Could memcached possibly be leaking memory without it showing up in the graphs or the usual Linux tools?
Thanks! Dan
First things first: is it really necessary for you to have such a high memory limit for memcached? Would less than 42 GB be enough in practice?
Memcached could leak memory (these things happen), but it would show up in the memory accounting. Absent a fairly unlikely kernel bug, memory accounting will be accurate. The long and the short of it is that you're missing something in your diagnostic activities. Collect more data and keep staring at it.
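As one way to collect that data, here is a rough sketch in Python that compares the memcached process's resident set size with the kernel's own counters. It assumes Python 3 is available on the box (the stock Python on CentOS 5 is older) and that `/proc/meminfo` and `/proc/<pid>/status` expose the usual `MemTotal`, `MemFree`, `Buffers`, `Cached`, `Slab` and `VmRSS` fields. The function names are made up for illustration, and the "unaccounted" figure is only approximate, since some pages can be counted in more than one bucket.

```python
import re


def parse_kb_fields(text):
    """Parse 'Name:   12345 kB' lines into a {name: kB} dict.

    Works for the kB-valued lines of /proc/meminfo and /proc/<pid>/status;
    lines without a kB value are skipped.
    """
    fields = {}
    for line in text.splitlines():
        m = re.match(r"^(\S+):\s+(\d+)\s+kB\s*$", line)
        if m:
            fields[m.group(1)] = int(m.group(2))
    return fields


def unaccounted_kb(meminfo, rss_kb):
    """Used memory not explained by the process RSS, buffers, page cache or slab."""
    used = meminfo["MemTotal"] - meminfo["MemFree"]
    explained = (
        rss_kb
        + meminfo.get("Buffers", 0)
        + meminfo.get("Cached", 0)
        + meminfo.get("Slab", 0)
    )
    return used - explained


def snapshot(pid):
    """Read the current counters for the whole box and for one process."""
    with open("/proc/meminfo") as f:
        meminfo = parse_kb_fields(f.read())
    with open("/proc/%d/status" % pid) as f:
        status = parse_kb_fields(f.read())
    return meminfo, status.get("VmRSS", 0)
```

Calling `snapshot(pid)` with the memcached PID every few minutes and logging `unaccounted_kb(meminfo, rss_kb)` alongside the raw fields should show which number is actually growing over those several days. If the process RSS stays flat while `Slab` or the unaccounted remainder climbs, the memory is going to the kernel rather than to memcached itself.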