At VMworld US 2019, VMware announced Project Pacific, an evolution of vSphere into a Kubernetes-native platform. Project Pacific (among other things) introduces a vSphere Supervisor Cluster, which enables you to run Kubernetes pods natively on ESXi (called vSphere Native Pods) with the same level of isolation as virtual machines. At VMworld, we claimed that vSphere Native Pods running on Project Pacific, isolated by the hypervisor, can achieve up to 8% better performance than pods on a bare-metal Linux Kubernetes node. While it may sound a bit counter-intuitive that virtualized performance is better than bare metal, let’s take a deeper look to understand how this is possible.
Why are vSphere Native Pods faster?
This benefit primarily comes from ESXi doing a better job at scheduling the natively run pods on the right CPUs, thus providing better localization to dramatically reduce the number of remote memory accesses. The ESXi CPU scheduler knows that these pods are independent entities and takes great efforts to ensure their memory accesses are within their respective local NUMA (non-uniform memory access) domain. This results in better performance for the workloads running inside these pods and higher overall CPU efficiency. On the other hand, the process scheduler in Linux may not provide the same level of isolation across NUMA domains.
Many organizations today have a ROBO environment with local IT infrastructure. These remote locations usually have anywhere from a few servers running a few workloads to support local needs, to numerous servers spanning a large-scale datacenter. The distributed and remote nature of this infrastructure makes it hard to manage, difficult to protect, and costly to maintain. Further, the remote nature of servers makes it more challenging to perform important VM/host-related operations.
vSphere is designed to address these ROBO use cases, including IT infrastructure located in remote, distributed sites. VMware vCenter Server provides a centralized way to control and monitor the virtual infrastructure, including ESXi hosts, virtual machines, storage, and networking resources. It has been widely deployed in a ROBO environment to manage ESXi hosts that are distributed over large geographical distances over a wide range of networks with different network characteristics, including low/high bandwidth, network latency, and packet error rates. In the paper, we test:
LAN with high-bandwidth and low-latency links.
WAN with low-bandwidth and high-latency links.
Various networks in between; for example, DSL, T1, 4G, 5G, …
We demonstrate that vCenter Server performs well in the ROBO environment for both network bandwidth use, as well as virtual machine and ESXi host task execution times. Instead of a bandwidth restriction, we observe that network latency has a bigger impact on the overall performance. As the network latency between vCenter Server and ESXi hosts increases, the average operation latency also increases. The experimental results also show how efficiently vCenter Server executes VM operations in high-latency networks: The average VM operation execution time increases much more slowly when network latency increases by several times.