vSphere vMotion is one of the most popular features of VMware vSphere. vMotion enables live migration of workloads and provides flexibility and business continuity even in the face of physical server downtime due to patch updates or troubleshooting. Although vMotion has been used successfully since the earliest versions of VMware vSphere, certain workloads running in large VMs—configured with more than 64 vCPUs and 512GB memory—experienced undesirable performance loss during vMotion.
VMware vSphere 7.0 U1 addresses this challenge by delivering, among other things, an innovative hypervisor page-tracing mechanism that greatly reduces the performance impact on guest workloads during live migration.
A paper, vMotion Innovations in vSphere 7.0 U1, is now available. In that paper, we describe the completely rearchitected vMotion memory pre-copy with the new page-tracing mechanism, along with several other architectural enhancements that minimize stun time. We also provide a comprehensive look at the performance of live migrating virtual machines running typical Tier-1 applications. Tests measure characteristics such as total migration time and application performance during live migration.
In this blog, we share some of the performance highlights from the paper.
Figure 1 compares the performance impact of vMotion on an Oracle Database server running inside a 72-vCPU/1TB VM on both vSphere 6.7 and vSphere 7.0 U1. The figure plots Oracle DB transactions with 1-second granularity—before, during and after vMotion—when running the HammerDB workload.
Thanks to all the new performance optimizations discussed in the paper, we observed great improvement in guest performance during a 7.0 U1 vMotion. The takeaways from this performance test are:
- The overall live-migration time in vSphere 7.0 U1 was cut by more than half, from 271 seconds to 120 seconds
- Total time spent installing traces fell from 86 seconds to 7.5 seconds (11x improvement)
- Average throughput loss of 50 percent during 6.7 vMotion was brought to less than 5 percent during 7.0 U1 vMotion
- Oracle response time during the switch-over improved by an order of magnitude (from over 6 seconds to under one second)
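The ratios quoted in the list above follow directly from the measured times. As a minimal sketch, the arithmetic can be reproduced as follows; the helper names `percent_reduction` and `speedup` are illustrative and not part of any VMware tooling, and the inputs are the figures reported for the 72-vCPU/1TB VM.

```python
def percent_reduction(before: float, after: float) -> float:
    """Percentage by which `after` is smaller than `before`."""
    return (before - after) / before * 100.0


def speedup(before: float, after: float) -> float:
    """How many times smaller `after` is than `before`."""
    return before / after


# Times in seconds, as reported above (vSphere 6.7 vs. vSphere 7.0 U1).
migration_67, migration_70u1 = 271.0, 120.0
trace_67, trace_70u1 = 86.0, 7.5

print(f"Migration time reduction: {percent_reduction(migration_67, migration_70u1):.1f}%")
print(f"Trace-install speedup: {speedup(trace_67, trace_70u1):.1f}x")
# Migration time reduction: 55.7%
# Trace-install speedup: 11.5x
```

The 55.7 percent reduction is consistent with "cut by more than half," and the 11.5x ratio with the rounded "11x" figure.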
The performance optimizations added to vSphere 7.0 U1 enormously benefitted monster VMs—configured with hundreds of vCPUs and terabytes of memory—and had a trickle-down effect of benefitting the typical-sized VM deployments as well.
Figure 2 compares the performance impact of vMotion on an Oracle DB server running inside a 12-vCPU/64GB VM on both vSphere 6.7 and vSphere 7.0 U1. The figure plots Oracle DB transactions at half-second granularity before, during, and after vMotion while running the HammerDB workload. The takeaways from this test are:
- The overall live-migration time went down from 23 seconds in vSphere 6.7 to 11 seconds in vSphere 7.0 U1 (a reduction of more than 50 percent)
- Total time spent installing traces fell from 2.8 seconds to 0.4 seconds (7x improvement)
- Nearly 45 percent improvement in Oracle throughput during 7.0 U1 vMotion (average 3480 transactions per second) compared to 6.7 vMotion (average 2403 transactions per second)
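The throughput figure in the list above is a relative gain over the 6.7 baseline rather than a reduction. A short sketch of that calculation, using the average transactions per second reported for the 12-vCPU VM, is shown here; `percent_gain` is an illustrative helper name, not something taken from the paper.

```python
def percent_gain(baseline: float, improved: float) -> float:
    """Relative improvement of `improved` over `baseline`, in percent."""
    return (improved - baseline) / baseline * 100.0


# Average Oracle transactions per second during vMotion, as reported above.
tps_67, tps_70u1 = 2403.0, 3480.0

gain = percent_gain(tps_67, tps_70u1)
print(f"Throughput gain during vMotion: {gain:.1f}%")
# Throughput gain during vMotion: 44.8%
```

The result of about 44.8 percent matches the "nearly 45 percent" improvement quoted in the list.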
In summary, vSphere 7.0 U1 improves on vSphere 6.7 in two ways: vMotion completes faster, and it has less impact on guest performance while it runs. These improvements were seen at both of the VM sizes tested, from typical deployments to monster VMs.
For the full paper, see vMotion Innovations in vSphere 7.0 U1.