CPU Hot Add is a feature that allows the addition of vCPUs to a running virtual machine. Enabling this feature, however, disables vNUMA for that virtual machine, resulting in the guest OS seeing a single vNUMA node. Without vNUMA support, the guest OS has no knowledge of the CPU and memory virtual topology of the ESXi host. This in turn could result in the guest OS making sub-optimal scheduling decisions, leading to reduced performance for applications running in large virtual machines. For this reason, enable CPU Hot Add only if you expect to use it. Alternatively, plan to power down the virtual machine before adding vCPUs, or configure the virtual machine with the maximum number of vCPUs that might be needed by the workload. If choosing the latter option, note that unused vCPUs incur a small amount of unnecessary overhead. Unused vCPUs could also cause the guest OS to make poor scheduling decisions within the virtual machine, again with the potential for reduced performance. For additional information see VMware KB article 2040375.
The reason for this is that if you enable CPU Hot Add, virtual NUMA is disabled. This means that the VM is not aware of which of its vCPUs are on the same NUMA node and might increase remote memory access. This removes the ability for the guest OS and applications to optimize based on NUMA and results in a possible reduction in performance.
Virtual NUMA (vNUMA) exposes NUMA topology to the guest operating system, allowing NUMA-aware guest operating systems and applications to make the most efficient use of the underlying hardware’s NUMA architecture. (For more information about NUMA, see page 27 in the Performance Best Practices Guide for vSphere 6.7 U2.)
To get an idea of what the performance impact can be by enabling CPU Hot Add, a simple test was run in our lab environment. This test found performance with the default setting of CPU Hot Add disabled performed from 2% to 8% better than when CPU Hot Add was enabled.
We have just published a new whitepaper on the performance of Oracle databases on vSphere 6.5 monster virtual machines. We took a look at the performance of the largest virtual machines possible on the previous four generations of four-socket Intel-based servers. The results show how performance of these large virtual machines continues to scale with the increases and improvements in server hardware.
Oracle Database Monster VM Performance on vSphere 6.5 across 4 generations of Intel-based four-socket servers
In addition to vSphere 6.5 and the four-socket Intel-based servers used in the testing, an IBM FlashSystem A9000 high performance all flash array was used. This array provided extreme low latency performance that enabled the database virtual machines to perform at the achieved high levels of performance.
Some similar tests with Microsoft SQL Server monster virtual machines were also recently completed on vSphere 6.5 by my colleague David Morse. Please see his blog post and whitepaper for the full details.