We have an ESXi 4.1 server with 48 GB RAM.
For each VM, we are allocating 4 GB of memory. Since the server will have 13 virtual machines (52 GB allocated in total), my manager thinks this is wrong.
I am going to explain to them that ESXi will actually manage memory itself, but they asked me how much memory I allocated for the ESXi server itself.
I did not allocate any (I have not even heard of an option for allocating memory for the ESXi server itself).
How is memory allocated for the ESXi server itself? How does it over-allocate and distribute RAM among virtual machines without issue?
There is a lot more than just ESXi in question here.
Memory Allocation & Overcommitment Explained
Note that the hypervisor won't allocate all of that memory upfront; how much it allocates depends on each VM's usage. However, it is worth understanding what will happen should the VMs try to use all of the memory allocated to them.
The maximum your VMs + host will try to use will be approximately 55 GB; mileage may vary.
There is another aspect to take into account: memory thresholds. By default VMware will aim to keep 6% free (the high memory threshold). So the 55 GB of used memory needs to be reduced to ~45 GB.
That means the host will have approximately 10,500 MB of memory it needs to reclaim from somewhere should the VMs use all the memory they've been allocated. There are three things ESX does to find that additional 10.5 GB.
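The arithmetic above can be sketched roughly as follows. This is only an illustration: the 3 GB overhead figure is an assumption chosen so that the total lands near the ~55 GB estimate, and the real per-VM and hypervisor overhead will differ on your host.

```python
# Rough overcommitment arithmetic for the host described in the question.
HOST_RAM_GB = 48.0          # physical RAM in the ESXi host
VM_COUNT = 13               # number of virtual machines
VM_RAM_GB = 4.0             # memory configured per VM
OVERHEAD_GB = 3.0           # ASSUMPTION: hypervisor + per-VM overhead, not a measured value
HIGH_THRESHOLD_FREE = 0.06  # default "high" threshold: keep 6% of host memory free


def worst_case_demand_gb(vm_count, vm_ram_gb, overhead_gb):
    """Memory the VMs plus host would use if every VM touched all of its RAM."""
    return vm_count * vm_ram_gb + overhead_gb


def usable_gb(host_ram_gb, free_fraction):
    """Memory the host can hand out while keeping the free threshold intact."""
    return host_ram_gb * (1.0 - free_fraction)


def to_reclaim_gb(demand_gb, usable):
    """Memory that has to be reclaimed; zero if the host is not overcommitted."""
    return max(0.0, demand_gb - usable)


demand = worst_case_demand_gb(VM_COUNT, VM_RAM_GB, OVERHEAD_GB)
usable = usable_gb(HOST_RAM_GB, HIGH_THRESHOLD_FREE)
reclaim = to_reclaim_gb(demand, usable)

print(f"worst-case demand: {demand:.2f} GB")
print(f"usable at 6% free: {usable:.2f} GB")
print(f"to reclaim:        {reclaim:.2f} GB")
```

With these assumed inputs the shortfall comes out at roughly 10 GB, which is in line with the estimate above. Change OVERHEAD_GB to match what you actually observe.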
Memory Reclamation Methods
You should read and understand "Understanding Memory Resource Management in VMware® ESX™ Server".
Depending on a large number of factors, a combination of all three could happen on an overcommitted host. You need to test your environment and monitor these metrics to understand the impact of overcommitting.
Some rough rules that are worth knowing (all in the above paper and other sources).
Ballooning kicks in next (thresholds are configurable; by default this is when the host has less than 6% memory free, between the high and soft thresholds). Make sure you install the balloon driver, and watch out for Java and managed applications in general. The OS has no insight into what the garbage collector will do next, so the collector will end up hitting pages that have been swapped to disk. It is not uncommon for servers that run Java applications exclusively to disable swap entirely to guarantee that doesn't happen. Have a look at Page 17 of vSphere Memory Management, SPECjbb.
Hypervisor swapping is the only one of the three methods that guarantees "memory" becomes available to the hypervisor within a set time. It will be used if 1 & 2 do not reclaim enough memory to stay clear of the hard threshold (default of 2% free memory). When you read through the performance metrics (and gather your own), you'll realise this is the worst performing of the three. Aim to avoid it at all costs, as the performance impact will be very noticeable on nearly all applications (double-digit percentages).
There is one more state to be aware of: low (by default 1% free memory). According to the manual, this can drastically cut your performance.
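As a rough mental model, the default thresholds mentioned above can be mapped to the reclamation technique you would expect to see. This is a simplified sketch, not how ESX implements it: the function name and labels are hypothetical, the soft threshold is left out because its value is not given here, and the real hypervisor uses more nuanced transitions between states.

```python
# Default free-memory thresholds as quoted in the answer (percent of host memory).
HIGH_FREE_PCT = 6.0  # above this, no ballooning or swapping is expected
HARD_FREE_PCT = 2.0  # below this, hypervisor swapping is used
LOW_FREE_PCT = 1.0   # below this, the host is in the "low" state


def expected_reclamation(free_pct):
    """Return a simplified label for what the host is likely doing.

    free_pct is the percentage of host memory that is currently free.
    """
    if free_pct < 0 or free_pct > 100:
        raise ValueError("free_pct must be between 0 and 100")
    if free_pct >= HIGH_FREE_PCT:
        return "none"        # plenty of free memory
    if free_pct >= HARD_FREE_PCT:
        return "ballooning"  # between the high and hard thresholds
    if free_pct >= LOW_FREE_PCT:
        return "swapping"    # hypervisor swapping
    return "low"             # worst case: drastic performance impact


for pct in (10.0, 5.0, 1.5, 0.5):
    print(f"{pct:4.1f}% free -> {expected_reclamation(pct)}")
```

Comparing labels like these against the free-memory figure you monitor gives a quick sanity check on how close a host is to the states you want to avoid.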
Summary
The key point to stress is it is impossible to predict from the whitepapers how your environment will behave.
Test your average scenario, your 95th percentile scenario, and finally your maximum to understand how your environment will run.
Edit 1
Worth adding that with vSphere 4 (or 4.1, I can't recall), it is now possible to place the hypervisor swap file on local disk and still vMotion the VM. If you're using shared storage, I strongly recommend you move the hypervisor swap file to local disk by default. This ensures that when one host is under severe memory pressure, it doesn't end up impacting all the other vSphere hosts/VMs on the same shared storage.
Edit 2
Based on comments, made the fact that ESX doesn't allocate the memory upfront in bold...
Edit 3
Explained a little more about memory thresholds.
VMware (and other virtualisation technologies) share resources (memory, processor time, I/O of various kinds) between VMs according to various algorithms.
It is possible to overcommit resources, because not all VMs will need all of their allocated processing, memory or I/O all the time. VMware's resource management guide is probably the best place to read up on what is possible in ESXi.
You can also manage the effect of the algorithms by weighting different VMs for different resources - e.g. you might give an application server VM a higher weighting for processor than a file server VM. However, the out of the box settings will handle most requirements very well. In some cases, doing some configuration here is enough to placate managers who don't quite get it, but of course be careful and read the docs for your version of VMware and understand what you are doing. If your manager doesn't need further placation, then just use the defaults.
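To illustrate the weighting idea, here is a minimal sketch of a proportional split. It assumes that, under contention, a resource is divided in proportion to each VM's weight (VMware calls these "shares"). The VM names, share values and capacity are made up, and the sketch ignores reservations, limits and idle VMs, all of which affect the real result.

```python
def split_by_shares(capacity, shares):
    """Divide a contended resource among VMs in proportion to their shares.

    capacity: total amount of the resource (for example CPU in MHz).
    shares:   mapping of VM name -> share value (hypothetical numbers).
    """
    total = sum(shares.values())
    if total <= 0:
        raise ValueError("total shares must be positive")
    return {name: capacity * value / total for name, value in shares.items()}


# Hypothetical example: the application server is weighted twice as
# heavily as the file server for CPU.
example = split_by_shares(6000, {"app-server": 2000, "file-server": 1000})
for name, mhz in example.items():
    print(f"{name}: {mhz:.0f} MHz")
```

The weights only matter when the VMs are actually competing for the resource; with spare capacity, each VM simply gets what it asks for.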
Note that overcommitment isn't necessarily always a good idea, particularly if you have virtualised on a single server. You should monitor your use of resources in your ESXi estate, and if necessary add additional hosts/resources if you are frequently consuming all of any one or more resources.
Let your VMware ESXi installation handle it. You can overcommit RAM on VMware systems thanks to memory ballooning, compression and deduplication techniques.
If the virtual machines are running similar operating systems, there are some savings there. Be sure to install VMware Tools inside the guest VMs to make full use of these features.