vSphere is pretty smart when it comes to managing physical memory and determining where best to place a virtual machine’s memory given how busy each NUMA node is in the physical server.  If a VM is running in a busy NUMA node, the ESXi kernel will automatically migrate the virtual machine to a less busy NUMA node in the server in order to get better performance.

  There are two main reasons why ESXi might decide to migrate a VM from one NUMA node to another. ESXi might migrate a VM to reduce CPU contention within a NUMA node; this is called a “balance migration”. The other reason is a “locality migration”, which occurs when ESXi detects that most of the VM’s memory is in a remote node. In that case it is generally better to schedule the virtual machine on the NUMA node that holds most of its memory, as long as doing so doesn’t create CPU contention on that node.

  ESXi keeps a constant eye on your virtual machines and their NUMA placement, performing migrations for balance or locality reasons to optimize overall system performance. ESXi generally does a great job at this, but there can be hiccups. If you notice performance problems caused by unbalanced NUMA nodes (high ready time) or by poor virtual machine NUMA locality (high memory latency), there are a few things you can try.

  First and most importantly, size your VMs with your physical server’s NUMA node size in mind: pick a vCPU count that either divides evenly into the NUMA node size or is a whole multiple of it. For instance, if your physical server has 6 cores per NUMA node, size your VMs as 2, 3, or 6 way (or 12 way, and so on). This KB article explains the potential impact of mismatched VM sizes to NUMA node sizes in more detail.
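To make the sizing arithmetic concrete, here is a quick sketch (a hypothetical helper, not a VMware tool) that lists NUMA-friendly vCPU counts for a given node size:

```python
def numa_friendly_sizes(cores_per_node, max_vcpus):
    """Return vCPU counts that either divide evenly into a single
    NUMA node or span a whole number of nodes."""
    sizes = []
    for n in range(1, max_vcpus + 1):
        fits_in_one_node = cores_per_node % n == 0   # even divisor of the node
        spans_whole_nodes = n % cores_per_node == 0  # whole multiple of the node
        if fits_in_one_node or spans_whole_nodes:
            sizes.append(n)
    return sizes

# A host with 6 cores per NUMA node:
print(numa_friendly_sizes(6, 12))  # [1, 2, 3, 6, 12]
```

Any of these sizes lets the scheduler keep each NUMA client entirely within a node, avoiding the remote-memory access that mismatched sizes cause.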

  You might also try tweaking the advanced NUMA settings for the VM. For instance, you can manually set NUMA node affinity for the VM on the Resources tab of the virtual machine’s Edit Settings screen in vCenter. Be cautious about manually setting VM–NUMA node affinity, though: manually balancing NUMA resources can become difficult to manage as your environment grows. Another setting that might be beneficial to look at is numa.vcpu.maxPerMachineNode. By default ESXi tries to place a VM on as few NUMA nodes as possible; this generally provides the best performance because it maximizes memory locality and minimizes memory latency. But some applications are memory bandwidth sensitive rather than memory latency sensitive, and these applications may benefit from the increased memory bandwidth that comes from using more NUMA nodes and thus more paths to memory. For these applications you may want to modify the VM advanced parameter numa.vcpu.maxPerMachineNode: setting it to a lower value will split the VM’s vCPUs across more NUMA nodes.
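As a sketch, both of these knobs end up as advanced configuration parameters in the VM’s .vmx file. The values below are purely illustrative, assuming a host with 6-core NUMA nodes:

```
# Restrict the VM's NUMA placement to nodes 0 and 1 (use sparingly).
numa.nodeAffinity = "0,1"

# For a bandwidth-sensitive 6-vCPU VM, allow at most 3 vCPUs per
# NUMA node so the VM spans two nodes (and two paths to memory).
numa.vcpu.maxPerMachineNode = "3"
```

Remember that affinity settings override the scheduler’s own balancing, so every manually pinned VM is one more thing you have to rebalance by hand later.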

  New in vSphere 5.0 is the vNUMA feature, which presents the physical NUMA topology to the guest operating system. vNUMA is enabled by default on VMs larger than 8 way, but if you have VMs that are 8 way or smaller yet still larger than your physical server’s NUMA node size, then you might want to enable vNUMA on those VMs as well. To enable vNUMA on 8 way or smaller VMs, modify the numa.vcpu.min setting. See “Advanced Virtual NUMA Attributes” in the vSphere 5.0 Resource Management guide for more details.
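For example, to expose vNUMA to a 6-way VM (a sketch; by default vNUMA only kicks in at larger vCPU counts):

```
# Present virtual NUMA to VMs with 6 or more vCPUs.
numa.vcpu.min = "6"
```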

  Lastly, if you are noticing a significant and unexplained performance problem, it might be best to call in the experts and notify VMware Support. They will be able to take a detailed look at the NUMA client stats for your VM, which include detailed counters such as the number of balance migrations and locality migrations the VM has had. If both counts are high and roughly equal, it may indicate NUMA migration thrashing: the VM is repeatedly moved away for balance and then pulled back for locality.
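If you want a first look at the NUMA counters yourself before opening a case, esxtop exposes per-VM NUMA statistics in its memory view. A rough sketch of the keystrokes:

```
esxtop        # start esxtop in an ESXi shell
m             # switch to the memory view
f             # open the field selector, then toggle the
              # NUMA STATS field group and press Enter
# Columns of interest per VM include:
#   NHN   - the VM's current NUMA home node(s)
#   NMIG  - number of NUMA migrations since power-on
#   N%L   - percentage of the VM's memory that is node-local
#           (a persistently low value means poor locality)
```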

 Bottom line… ESXi does its best to place VMs optimally for NUMA performance, but there are things you can do to help it, the most important of which is sizing and configuring your VMs with NUMA in mind.