Official Oracle docs say that for a machine with more than 16GiB of RAM we need to allocate 16GiB of swap.
Our servers are RHEL 7 and have 256GiB of RAM.
The DBAs do not want to see the system swapping, so they want us to monitor the 16GiB of swap very aggressively.
I suggested that we double the RAM to 512GiB (the expense is approved) and disable swap entirely. However, this goes against Oracle's recommendation of having 16GiB of swap, even though we doubled the RAM.
Honestly, I don't see how having swap equal to 3% of RAM makes any sense, or why, if we are adding more RAM than we currently have swap, we have to keep the swap at all.
So, are there any good arguments I can use to justify running Oracle without swap?
P.S. The only reason I mention doubling the RAM is to demonstrate the ridiculousness of the argument I am having a hard time countering. What I am really looking for are arguments to justify disabling swap.
Disabling swap is a good idea if the machine is dedicated to a single memory-hungry application such as a database.
This sort of thing often happens with databases. I see it more with NoSQL databases, but relational databases can suffer the same challenge.
There is nothing in the OS that requires swap to exist. When memory truly runs out, Linux deals with it by invoking the OOM killer, which kills a process chosen by heuristics (typically a large memory consumer), not necessarily the last one that asked for memory. You don't want to get to that point, so tune Oracle to use only ~90% of memory, leaving the rest for system daemons and as a margin for error. "Free" memory also gets used for buffering disk I/O, which is a huge performance win, so letting the database consume ever more memory will eventually slow overall system performance enough to be counter-productive.
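To make the "~90% of memory" rule concrete, here is a minimal sketch that derives a budget from `/proc/meminfo`. The 90% figure and variable names are my own illustration, not an Oracle-documented formula:

```shell
#!/bin/sh
# Sketch: compute a ~90% memory budget for the database from /proc/meminfo,
# leaving ~10% headroom for system daemons and the page cache.
mem_total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
budget_kb=$(( mem_total_kb * 90 / 100 ))
echo "Total RAM:       ${mem_total_kb} kB"
echo "Database budget: ${budget_kb} kB (90%)"
```

Feed `budget_kb` into whatever sizes your SGA/PGA targets; the point is to set the cap deliberately rather than let the database grow into the page cache.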
Even on systems with a fraction of the memory described in the question, if the application is a database, a cache, or a similar system, I'd default to no swap at this point.
Authorities
So that you're not just relying on my word for it:
Cassandra
Datastax explains for Cassandra:
Riak
Basho explains for Riak that you should:
MySQL
Percona is sitting on the fence and provides useful caveats for both sides of the question. MariaDB disagrees with disabling swap:
ServerFault
A well received answer here includes:
That answer has 22 upvotes today and is 4 years old. You can also see some other answers there extolling the value of swap, but there's no indication they're running databases. They don't have as many upvotes either. :)
Squid
While they don't overtly recommend disabling swap, the Squid developers say:
That's what you don't want to happen to your database.
Redis
While Redis officially recommends swap, the users don't buy it:
Hadoop
As seen in the most-voted answer on the Hortonworks community:
I like this because it is talking about a Java app, but it reaches many of the same conclusions mentioned above about databases. It also mentions monitoring, which is very helpful in tuning high-performance applications. If you don't have numbers to compare, everything is based on feelings, which are much harder to compare. Make graphs for every measurable metric: application-level latency and throughput, down to CPU, disk, memory, and network. Those provide the bulk of the real data you have to base decisions on.
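In that spirit, here is a minimal sketch of a swap-usage probe you could graph or alert on. It reads `/proc/meminfo` directly and guards against the zero-swap case; the variable names and output format are my own:

```shell
#!/bin/sh
# Sketch: report swap usage as a percentage, suitable for feeding a graph
# or an alert threshold. Handles the zero-swap case to avoid dividing by 0.
swap_total_kb=$(awk '/^SwapTotal:/ {print $2}' /proc/meminfo)
swap_free_kb=$(awk '/^SwapFree:/ {print $2}' /proc/meminfo)
if [ "$swap_total_kb" -eq 0 ]; then
    swap_used_pct=0
    echo "No swap configured"
else
    swap_used_pct=$(( (swap_total_kb - swap_free_kb) * 100 / swap_total_kb ))
    echo "Swap used: ${swap_used_pct}%"
fi
```

A cron job or your monitoring agent can run this every minute; the interesting signal for a database box is usually swap *activity* (swap-ins) rather than occupancy, so pair it with `vmstat` if you can.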
Alex on Linux has an interesting read on this subject: "Swap vs. no swap" http://www.alexonlinux.com/swap-vs-no-swap
Bottom line is that, without swap:
Why not keep a reasonable amount of swap for unused pages, and lower vm.swappiness (and tune vm.vfs_cache_pressure) so the system only swaps when it is close to an OOM condition? You could also pin the Oracle processes' memory into RAM (for example, by backing the SGA with huge pages, which are never swapped) to ensure they never touch disk. This keeps the database from being impacted by slow I/O, while still allowing rarely-used garbage to be evicted from main memory so it can serve the database, buffers, and cache. It's the best of both worlds.
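As a sketch, that tuning might look like the following sysctl fragment. The values are illustrative defaults for this approach, not Oracle-blessed numbers; tune them for your workload:

```
# /etc/sysctl.d/99-db-swap.conf -- illustrative values, tune per workload
vm.swappiness = 1             # only swap under severe memory pressure
vm.vfs_cache_pressure = 50    # hold on to dentry/inode caches a bit longer
# Oracle's SGA can be backed by huge pages, which the kernel never swaps:
# vm.nr_hugepages = <SGA size / huge page size, typically 2 MiB>
```

Apply with `sysctl --system` (or reboot); huge pages additionally require matching Oracle-side configuration.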
This topic pops up frequently. Swap is just an extension of RAM, so let's buy more RAM, right? Wrong. A setup with 16 GiB of swap and 512 GiB of RAM makes perfect economic sense. Let me explain.
If you know the main software well, you know quite precisely how much "stupid" memory it takes. What "stupid" memory? Various code and data that initially shows up in RAM but will never be critically needed again. That is, the performance as seen by the user will never suffer because this stuff is not readily available in memory.
Instead of fixing the software, you can just give it that amount of swap, but no more than that amount. Yes, let it use 100% of the swap; that's the point. Don't increase the swap, or you risk that some critical pages accidentally end up there. Document it, so people don't freak out seeing 100% swap usage. In the case of Oracle that amount is 16 GiB, and I can say from experience that it will get used, even on a 700 GiB box, and you will not see swap-ins impacting performance.
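If you document "16 GiB of swap, expected to be 100% used", it helps to have a check that alarms on the wrong swap *size* rather than on usage. A minimal sketch, with the 16 GiB expectation hard-coded for illustration:

```shell
#!/bin/sh
# Sketch: verify swap is sized to the documented 16 GiB, so monitoring can
# alarm on a misconfigured size rather than on 100% usage (which is expected).
expected_kb=$(( 16 * 1024 * 1024 ))   # 16 GiB expressed in kB
swap_total_kb=$(awk '/^SwapTotal:/ {print $2}' /proc/meminfo)
if [ "$swap_total_kb" -ge "$expected_kb" ]; then
    echo "OK: swap is ${swap_total_kb} kB"
else
    echo "WARN: swap is ${swap_total_kb} kB, expected ${expected_kb} kB"
fi
```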
In effect you get 16 GiB of RAM back to do real work and benefit your users. As of 2017, that reduces your organization's cost by approximately $50, the price of 16 GiB of RAM. If your server has 256 GiB of RAM, you configure swap and save $50. If your server has 10 TiB of RAM, you configure swap and save... $50. See? Still the same.
These days, it's always safe to have zero swap. It simply costs you that small $50, that's all.
If your organization is not capable of dealing with 100% used swap (for example, there is a separate monitoring team that will page on it), don't do it. If you make anyone spend time thinking this issue over, you've already wasted more than the $50 of their time.
Some vendors really have zero wasted memory. And some vendors don't feel confident enough to estimate the amount of "stupid" allocations, so they say "zero swap" to avoid unknown problems rather than to save you a bunch of dollars. That's OK too! I would just trust the vendor on this; they support the installations, they know their stuff.