Virtualization has just begun to remake the datacenter. One only needs to look at the rapid pace of innovation to know that we are in the midst of a revolution. This is true not only for virtualization software, but also for the underlying hardware. A perfect example is the new hardware support for virtualized page tables provided by Intel’s Extended Page Tables (EPT) and AMD’s Rapid Virtualization Indexing (RVI). In general, these features reduce virtualization overhead and improve performance. A previous paper presented RVI performance data for a range of individual workloads. As a follow-on, we decided to measure the effects of RVI in a heterogeneous environment using VMmark™, the tile-based mixed-workload consolidation benchmark from VMware®.
VMware ESX has the following three modes of operation: software virtualization (Binary Translation, abbreviated as BT), hardware support for CPU virtualization (abbreviated in AMD systems as AMD-V™), and hardware support for both CPU and MMU virtualization utilizing AMD-V and RVI (abbreviated as AMD-V + RVI). For most workloads, VMware recommends that users let ESX automatically determine if a virtual machine should use hardware support, but it can also be valuable to determine the optimal settings as a sanity check.
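The three modes can be pictured as a pair of per-VM settings: one for CPU virtualization and one for MMU virtualization. The sketch below is illustrative only. It assumes the per-VM configuration keys `monitor.virtual_exec` and `monitor.virtual_mmu` with the values `software` and `hardware`; the helper name `vmx_lines` is hypothetical, and the key names should be verified against the documentation for the ESX release in use.

```python
# Assumed mapping from the three ESX modes named above to per-VM settings.
# The key names and values are assumptions to verify for your ESX release.
VMX_SETTINGS = {
    "BT": {
        "monitor.virtual_exec": "software",  # binary translation for CPU
        "monitor.virtual_mmu": "software",   # software MMU virtualization
    },
    "AMD-V": {
        "monitor.virtual_exec": "hardware",  # AMD-V for CPU
        "monitor.virtual_mmu": "software",   # MMU still handled in software
    },
    "AMD-V + RVI": {
        "monitor.virtual_exec": "hardware",  # AMD-V for CPU
        "monitor.virtual_mmu": "hardware",   # RVI for MMU
    },
}


def vmx_lines(mode: str) -> list[str]:
    """Return the configuration lines for one mode (hypothetical helper)."""
    return [f'{key} = "{value}"' for key, value in VMX_SETTINGS[mode].items()]


if __name__ == "__main__":
    for mode in VMX_SETTINGS:
        print(mode)
        for line in vmx_lines(mode):
            print("  " + line)
```

Leaving both settings unset corresponds to the recommended behavior of letting ESX choose automatically.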
Environment Configuration:
System under Test | Dell PowerEdge 2970
CPU | 2 x Quad-Core AMD Opteron 8384 (2.5GHz)
Memory | 64GB DDR2 Reg ECC
Hypervisor | VMware ESX (build 127430)
Application | VMmark v1.1
Virtual Hardware (per tile) | 10 vCPUs, 5GB memory, 62GB disk
· AMD RVI works in conjunction with AMD-V technology, which is a set of hardware extensions to the x86 system architecture designed to improve efficiency and reduce the performance overhead of software-based virtualization solutions. For more information on AMD virtualization technologies see here.
· VMmark is a benchmark intended to measure the performance of virtualization environments in an effort to allow customers to compare platforms. It is also useful in studying the effect of architectural features. VMmark consists of six workloads (Web, File, Database, Java, Mail and Standby servers). Multiple sets of workloads (tiles) can be added to scale the benchmark load to match the underlying hardware resources. For more information on VMmark see here.
Test Methodology
By default, ESX automatically runs 32-bit VMs (Mail, File, and Standby) with BT, and runs 64-bit VMs (Database, Web, and Java) with AMD-V + RVI. For these tests, we first ran the benchmark using the default configuration and determined the number of tiles it would take to saturate the CPU resources. All subsequent benchmark tests used this same load level. We next measured the baseline benchmark score with all VMs under test except Standby configured to use BT (i.e., no hardware virtualization features). A series of benchmark tests was then executed while varying the hardware virtualization settings for different workloads to assess their effects in a heavily utilized mixed-workload environment. All of the results presented are relative to the baseline score and illustrate the percentage performance gains achieved over the BT-only configuration.
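The relative results described above come down to a simple percentage calculation against the BT-only baseline. A minimal sketch follows; the function name `percent_gain` and the score values in the usage lines are hypothetical and are not results from this study.

```python
def percent_gain(score: float, baseline: float) -> float:
    """Percentage improvement of a benchmark score over the baseline score."""
    if baseline <= 0:
        raise ValueError("baseline score must be positive")
    return (score - baseline) / baseline * 100.0


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    bt_only_score = 10.0
    tuned_score = 11.0
    print(f"{percent_gain(tuned_score, bt_only_score):.1f}% over BT-only")
```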
We began by setting the Standby servers to use AMD-V + RVI. We then stepped through each of the remaining workloads in turn and altered the CPU/MMU hardware virtualization settings for that specific workload type. After determining which setting was best (BT, AMD-V, or AMD-V + RVI), we used that setting for the subsequent tests.
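The stepwise procedure above is a greedy, one-workload-at-a-time sweep. The sketch below illustrates the idea under stated assumptions: `run_benchmark` stands in for a full benchmark run and is hypothetical, the workload order is illustrative, and the toy scoring function exists only to make the example runnable. It is not the harness used for these tests.

```python
from typing import Callable, Dict, List

MODES = ["BT", "AMD-V", "AMD-V + RVI"]


def sweep(
    workloads: List[str],
    run_benchmark: Callable[[Dict[str, str]], float],
) -> Dict[str, str]:
    """Pick the best mode for each workload in turn, keeping earlier choices."""
    # Start from the baseline: every workload on BT, Standby on AMD-V + RVI.
    config = {w: "BT" for w in workloads}
    config["Standby"] = "AMD-V + RVI"

    for workload in workloads:
        best_mode, best_score = "BT", float("-inf")
        for mode in MODES:
            trial = {**config, workload: mode}  # change only this workload
            score = run_benchmark(trial)
            if score > best_score:
                best_mode, best_score = mode, score
        config[workload] = best_mode  # carry the winner into later runs
    return config


def toy_benchmark(config: Dict[str, str]) -> float:
    """Stand-in scorer with made-up weights; a real run would execute the benchmark."""
    weight = {"BT": 1.00, "AMD-V": 1.02, "AMD-V + RVI": 1.05}
    return sum(weight[mode] for mode in config.values())


if __name__ == "__main__":
    order = ["Web", "File", "Database", "Java", "Mail"]  # illustrative order
    print(sweep(order, toy_benchmark))
```

Because each workload is tuned with earlier choices held fixed, the sweep needs only three runs per workload rather than a run for every combination of settings, at the cost of possibly missing interactions between workloads.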
Results
The test results summarized in Table 1 are both interesting and insightful. ESX’s efficient utilization of AMD-V + RVI for each workload highlights a leap forward in virtualization platform performance. Remember that once we determined AMD-V + RVI to be the best for a workload, we continued to use that setting for that workload during all subsequent tests unless otherwise noted. For example, in the AMD-V File run, the Web server VMs were set to AMD-V + RVI, the File server VMs were set to use just AMD-V, and all other non-Standby servers were set to BT.
By taking advantage of hardware-assist features in the processor, ESX is able to achieve significant performance gains over using software-only virtualization. The default or “out of the box” settings produced good results, and further tuning for this particular set of workloads yielded additional performance gains of nearly 6% for our SUT.
It should be noted that these performance gains may not hold for dissimilar workloads, but for this configuration the improvement from running every workload with AMD-V and RVI enabled was very impressive. In addition, older processor versions with different cache sizes, clock rates, etc. may produce different results.
It’s probably safe to say that hardware technologies will continue to improve for virtualized environments. ESX’s ability to make effective use of the latest hardware innovations, combined with its flexibility in allowing users to run different workloads with different levels of hardware assist, is what truly sets it apart.
All information in this post regarding future directions and intent are subject to change or withdrawal without notice and should not be relied on in making a purchasing decision of VMware's products. The information in this post is not a legal obligation for VMware to deliver any material, code, or functionality. The release and timing of VMware's products remains at VMware's sole discretion.