From time to time "my" server stalls because it runs out of both memory and swap space. (It keeps responding to ping, but nothing more than that — not even ssh.)
I'm told Linux does memory overcommitment, which as far as I understand is the same thing banks do with money: it grants processes more memory than is actually available, assuming that most processes won't actually use all the memory they ask for, at least not all at the same time.
Please assume this is actually the cause of my system's occasional hangs; let's not discuss here whether or not this is the case (see What can cause ALL services on a server to go down, yet still responding to ping? and how to figure out).
So,
how do I disable, or drastically reduce, memory overcommitment in CentOS? I've read there are two settings called vm.overcommit_memory (values 0, 1, or 2) and vm.overcommit_ratio, but I have no idea where to find and change them (some configuration file, hopefully), what values I should try, and whether I need to reboot the server to make the changes effective.
and is it safe? What side effects could I expect? When googling for overcommit_memory I find scary things, like people saying their server can't boot anymore...
Since what causes the sudden increase in memory usage is MySQL, because of queries made by PHP, which in turn is called while serving HTTP requests, I would expect just some PHP script to fail to complete, and hence an occasional 500 response when the server is too busy. That's a risk I can take (certainly better than having the whole server become inaccessible and having to hard-reboot it).
Or can it really cause my server to be unable to reboot if I choose the wrong settings?
Memory overcommit can be disabled by setting
vm.overcommit_memory=2
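On CentOS these settings live under /proc/sys/vm and can be read or changed through sysctl. A quick way to inspect them is the sketch below; the sysctl -w lines (shown commented out, since they need root) would apply a change at runtime, and the ratio value 80 there is just an illustrative choice, not a recommendation:

```shell
# Inspect the current settings (0 is the kernel's default mode):
cat /proc/sys/vm/overcommit_memory
cat /proc/sys/vm/overcommit_ratio

# As root, the runtime change would look like this (values are examples):
# sysctl -w vm.overcommit_memory=2
# sysctl -w vm.overcommit_ratio=80
```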
0 is the default mode, where the kernel heuristically decides whether to grant an allocation by comparing the free memory against the size of the request. Setting it to 1 enables the wizardry mode, where the kernel always advertises that it has enough free memory for any allocation. Setting it to 2 means that processes can only allocate up to a configurable amount (overcommit_ratio) of RAM, and will start getting allocation failures or OOM messages when they go beyond that amount.

Is it safe to do so? No. I haven't seen any proper use case where disabling memory overcommit actually helped, unless you are 100% certain of the workload and hardware capacity. In case you are interested, install the kernel-docs package and go to /Documentation/sysctl/vm.txt to read more, or read it online.

If you set vm.overcommit_memory=2, then it will overcommit only up to the percentage of physical RAM configured in vm.overcommit_ratio (the default is 50%).

This will not survive a reboot. For persistence, put this in the /etc/sysctl.conf file:

vm.overcommit_memory=2

and run sysctl -p. No need to reboot.

Totally unqualified statement: Disabling memory overcommit is definitely "safer" than enabling it.
$customer has it set on a few hundred web servers and it helped with stability issues a lot. There's even a Nagios check calling out fire real loud if it's ever NOT disabled.
On the other hand, people might not consider it "safe" to have their processes run out of memory when they'd just like to overcommit a little RAM they would never really use (SAP would be a very good example).
So, you're back to seeing if it improves things for you. Since you're already looking into it to get rid of related issues, I think it might help you.
(I know I'll risk a downvote by some grumpy person)
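For what it's worth, a check like the one I mentioned takes only a few lines of shell. This is a hypothetical sketch, not $customer's actual plugin — the function name is made up, but the exit codes follow the usual Nagios convention (0 = OK, 2 = CRITICAL):

```shell
#!/bin/sh
# Hypothetical Nagios-style check: CRITICAL unless overcommit is disabled (mode 2).
check_overcommit() {
    mode=$(cat /proc/sys/vm/overcommit_memory)
    if [ "$mode" = "2" ]; then
        echo "OK - vm.overcommit_memory=2 (overcommit disabled)"
        return 0
    else
        echo "CRITICAL - vm.overcommit_memory=$mode (overcommit enabled)"
        return 2
    fi
}

check_overcommit || :
```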
I agree that disabling overcommit is safer than enabling it in some circumstances. If the server runs only a few large memory jobs (like circuit simulations, in my case), it is much safer to deny the application the memory request upfront than to wait for an OOM event (which is sure to follow shortly). Quite often we see servers having issues after the OOM killer has done its work.
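One way to watch the headroom mode 2 gives you is the kernel's commit accounting in /proc/meminfo: with vm.overcommit_memory=2, new allocations start failing once Committed_AS approaches CommitLimit (swap plus overcommit_ratio percent of RAM), so graphing these two fields shows how close you are to that point:

```shell
# Show the commit limit and how much address space is currently committed.
# Under vm.overcommit_memory=2 the kernel refuses allocations that would
# push Committed_AS past CommitLimit, instead of invoking the OOM killer.
grep -E 'CommitLimit|Committed_AS' /proc/meminfo
```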