Question: Would it ever be a good idea to disable the page cache inside guest VMs and instead rely on the ZFS ARC (and SSD-based L2ARC) of the host?
Context: I'm asking since I'm running a Proxmox cluster which always shows around 90% RAM use for all VMs, regardless of how much they actually need. This is to be expected, given the guest kernels' use of the page cache. Since I've been hearing a lot of good things about ZFS's ARC, it got me thinking that perhaps I could increase the reliance on the ARC and reduce the reliance on the guests' page caches. In essence, the ARC would act as a kind of shared page cache for all VMs.
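For reference, the cache-inflated "used" figure can be checked from inside any guest with standard tools (`free` and `/proc/meminfo`; nothing here is Proxmox-specific):

```sh
# Inside a guest: "used" includes reclaimable page cache, while
# "available" estimates what is really free for new workloads.
free -m

# MemAvailable (kernels >= 3.14) is a better indicator of what the
# guest actually needs than the raw "used" number:
grep -E 'MemTotal|MemAvailable|^Cached' /proc/meminfo
```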
By doing this I would get the additional benefit of more accurate Proxmox statistics and graphs, giving me a better picture of how much memory each VM actually needs. This in turn would give me the information I need to better tune each VM's RAM size, and would let me grow the host's ARC by the same amount.
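Growing (or capping) the host's ARC is done via the `zfs_arc_max` module parameter. A minimal sketch, assuming ZFS on Linux as shipped with Proxmox; the 8 GiB value is purely illustrative:

```sh
# Persist the limit across reboots (value in bytes; 8 GiB here):
echo "options zfs zfs_arc_max=8589934592" >> /etc/modprobe.d/zfs.conf
update-initramfs -u   # Proxmox loads ZFS from the initramfs

# Apply at runtime without rebooting (the ARC may take a while
# to shrink down to a lower limit):
echo 8589934592 > /sys/module/zfs/parameters/zfs_arc_max
```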
I haven't actually tried any of this; I thought I would run it by you guys first. So, am I stupid for thinking this way?
Follow-up question: how would I go about disabling (or limiting) the page cache in a Linux VM? One method would be to use a cron job that regularly writes "3" to /proc/sys/vm/drop_caches, say once every minute. But that feels kind of hacky; isn't there a better way?
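For reference, the hacky variant described above would look something like this (the once-a-minute schedule is the questioner's own suggestion, not a recommendation):

```sh
# /etc/cron.d/drop-caches -- the heavy-handed approach: every minute,
# flush dirty data to disk, then evict clean page cache, dentries
# and inodes. Note that drop_caches itself never touches dirty
# pages, hence the sync first.
* * * * * root sync; echo 3 > /proc/sys/vm/drop_caches
```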
P.S. Yes, I realize that I'm only talking about the read cache, not the write cache, which is governed by the amount of dirty pages. So I would probably still need some amount of free RAM to allow for that (but that should be visible in the VM's RAM usage statistics in Proxmox, so everything above should still apply).
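On that write-cache point: the RAM a guest devotes to dirty pages can itself be bounded through the standard `vm.dirty_*` sysctls. A sketch with illustrative values (the file name `90-dirty.conf` is just an example):

```sh
# Cap dirty (not-yet-written-back) data at absolute sizes instead of
# percentages of RAM. Values are illustrative, not recommendations.
cat >> /etc/sysctl.d/90-dirty.conf <<'EOF'
# block writers once 256 MiB of dirty pages accumulate
vm.dirty_bytes = 268435456
# start background writeback at 64 MiB
vm.dirty_background_bytes = 67108864
EOF
sysctl --system
```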
Answer: I often (but not always, see below) optimize my hypervisors similarly to what you suggest: let the VMs rely heavily on the shared host disk cache.
However, using the drop_caches approach seems too heavy-handed to me, as it can evict far too much cached memory from the guest. At the same time, I don't know of any method to limit the page cache, short of configuring your applications to use direct I/O (a dd sketch at the end of this answer illustrates this). So the key is to correctly size your VM RAM resources: try to assign only the memory a guest really needs, plus 1 or 2 GB of "breathing room".
Managing memory in this manner has some important advantages, but there are some disadvantages as well:
- being some vmexit/vmenter transitions away, the peak and sustained speed of any host-based cache will be lower than that of the corresponding guest-side cache (and this is the reason I suggest you avoid repeated drop_caches in the guest);
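To illustrate the direct I/O escape hatch mentioned above: O_DIRECT bypasses the guest page cache entirely, leaving only the host-side ARC to cache the data. GNU dd can demonstrate this (the path is just an example; the filesystem must support O_DIRECT, as ext4 and XFS do):

```sh
# Write 1 GiB bypassing the guest page cache:
dd if=/dev/zero of=/var/tmp/testfile bs=1M count=1024 oflag=direct

# Read it back without polluting the guest cache either:
dd if=/var/tmp/testfile of=/dev/null bs=1M iflag=direct
```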