Performance Evaluation of AMD RVI Hardware Assist

We recently released a technical paper that demonstrates the substantial performance gains VMware ESX Server achieves on the latest third-generation AMD Opteron processors. These processors introduce AMD’s second-generation hardware support for virtualization, which adds memory management unit (MMU) virtualization under the name Rapid Virtualization Indexing (RVI). Intel has also announced MMU virtualization support, called Extended Page Tables (EPT), for its “Nehalem” processors. ESX adopts these technologies as they are introduced, and many workloads benefit as a result: higher throughput and lower CPU utilization, which improve user experience and free up servers for greater consolidation. The paper can be found here: Performance Evaluation of AMD RVI Hardware Assist.

Compared to software-only virtualization, the performance gains observed in this paper were up to 42% for MMU-intensive benchmarks and up to 500% for MMU-intensive microbenchmarks. We also observed that RVI increases memory access latencies for a few workloads, because a TLB miss must be resolved through both the guest page tables and the nested page tables. This cost can be reduced by using large pages effectively in both the guest and the hypervisor. For optimal performance, ESX aggressively tries to use large pages for its own memory when RVI is in use.
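To see why large pages help, it can be useful to count the memory references a single TLB miss may need under nested paging. The sketch below is a back-of-the-envelope model and is not taken from the paper: it assumes x86-64 long mode with four table levels for 4 KB pages and three for 2 MB pages, and it ignores the TLB, page-walk caches, and any other hardware caching, so it gives a worst case rather than a measured cost. The function and constant names are made up for illustration.

```python
def native_walk_refs(levels: int) -> int:
    """Memory references for one native page walk: one read per table level."""
    return levels


def nested_walk_refs(guest_levels: int, nested_levels: int) -> int:
    """Worst-case references for one two-dimensional (guest + nested) walk.

    Each guest table entry lives at a guest-physical address, so reading it
    costs a full nested walk plus the read itself. The final guest-physical
    address of the data then needs one more nested walk.
    """
    per_guest_level = nested_levels + 1
    final_translation = nested_levels
    return guest_levels * per_guest_level + final_translation


# Assumed x86-64 level counts: 4 levels for 4 KB pages, 3 for 2 MB pages.
SMALL, LARGE = 4, 3

if __name__ == "__main__":
    cases = [
        ("4 KB guest, 4 KB nested", SMALL, SMALL),
        ("2 MB guest, 4 KB nested", LARGE, SMALL),
        ("2 MB guest, 2 MB nested", LARGE, LARGE),
    ]
    for name, guest, nested in cases:
        print(f"{name}: {nested_walk_refs(guest, nested)} references "
              f"(native: {native_walk_refs(guest)})")
```

Under these assumptions the worst-case walk drops from 24 references with 4 KB pages on both sides to 15 with 2 MB pages on both sides. Large pages also let each TLB entry cover more memory, so misses happen less often in the first place.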

Prior to the introduction in 2006 of first-generation hardware support for x86 virtualization (AMD-V and Intel VT-x), the VMware virtual machine monitor (VMM) relied on software-only techniques for virtualizing x86 processors.

We used:

  • Binary Translation (BT) for instruction-set virtualization
  • Shadow Paging for MMU virtualization
  • Device Emulation for device virtualization

With the advent of first-generation hardware support, the VMM could use hardware features for instruction-set virtualization, but MMU and device virtualization were still done in software. With the introduction of second-generation hardware support, the VMM can take advantage of hardware assist for both instruction-set and MMU virtualization. MMU virtualization ensures that the guest can access only those memory locations that belong to it. In software MMU virtualization, this requires the VMM to intercept guest execution whenever the guest updates its virtual memory data structures (page tables). In hardware MMU virtualization, the hardware provides a mechanism that removes the need for the VMM to intercept guest execution during page table updates. This results in significant performance improvements for workloads that stress the x86 MMU.
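The difference between the two approaches can be illustrated with a toy model. This is a deliberately simplified sketch, not how ESX implements either technique: page tables are flat dictionaries keyed by page number, every class and method name is hypothetical, and the only thing it tracks is how often the VMM has to step in when the guest edits its page tables.

```python
class GuestFault(Exception):
    """Raised when the guest touches memory the VMM has not given it."""


class ShadowPagingVMM:
    """Software MMU sketch: the VMM maintains a shadow page table."""

    def __init__(self, guest_to_host):
        self.guest_to_host = dict(guest_to_host)  # guest-physical -> host-physical
        self.guest_pt = {}   # guest-virtual -> guest-physical (the guest's view)
        self.shadow_pt = {}  # guest-virtual -> host-physical (what hardware uses)
        self.intercepts = 0

    def guest_map(self, vpage, gpage):
        # The guest's page-table write traps into the VMM ...
        self.intercepts += 1
        self.guest_pt[vpage] = gpage
        # ... which mirrors it into the shadow table only for guest-owned memory.
        if gpage in self.guest_to_host:
            self.shadow_pt[vpage] = self.guest_to_host[gpage]
        else:
            self.shadow_pt.pop(vpage, None)

    def translate(self, vpage):
        if vpage not in self.shadow_pt:
            raise GuestFault(vpage)
        return self.shadow_pt[vpage]


class NestedPagingVMM:
    """Hardware MMU sketch: hardware walks guest and nested tables itself."""

    def __init__(self, guest_to_host):
        self.nested_pt = dict(guest_to_host)  # guest-physical -> host-physical
        self.guest_pt = {}                    # guest-virtual -> guest-physical
        self.intercepts = 0

    def guest_map(self, vpage, gpage):
        # An ordinary memory write by the guest; the VMM is not involved.
        self.guest_pt[vpage] = gpage

    def translate(self, vpage):
        # Models the hardware composing both tables on a TLB miss.
        if vpage not in self.guest_pt:
            raise GuestFault(vpage)
        gpage = self.guest_pt[vpage]
        if gpage not in self.nested_pt:
            raise GuestFault(vpage)
        return self.nested_pt[gpage]


if __name__ == "__main__":
    owned = {0: 100, 1: 101}  # guest-physical pages 0 and 1 belong to the guest
    for vmm in (ShadowPagingVMM(owned), NestedPagingVMM(owned)):
        for vpage, gpage in [(10, 0), (11, 1), (10, 1)]:
            vmm.guest_map(vpage, gpage)
        print(type(vmm).__name__, "intercepts:", vmm.intercepts)
```

Both models produce the same translations and both keep the guest confined to its own memory. The shadow-paging model pays one VMM intercept per page table update, while the nested-paging model pays none and instead does more work at translation time.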