Author Archives: Bruce Herndon

Performance Best Practices for vSphere 6.0 is Available

We are pleased to announce the availability of Performance Best Practices for VMware vSphere 6.0. This is a book designed to help system administrators obtain the best performance from vSphere 6.0 deployments.

The book addresses many of the new features in vSphere 6.0 from a performance perspective. These include:

  • A new version of vSphere Network I/O Control
  • A new host-wide performance tuning feature
  • A new version of VMware Fault Tolerance (now supporting multi-vCPU virtual machines)
  • The new vSphere Content Library feature

We’ve also updated and expanded on many of the topics in the book. These include:

  • VMware vStorage APIs for Array Integration (VAAI) features
  • Network hardware considerations
  • Changes in ESXi host power management
  • Changes in ESXi transparent memory sharing
  • Using Receive Side Scaling (RSS) in virtual machines
  • Virtual NUMA (vNUMA) configuration
  • Network performance in guest operating systems
  • vSphere Web Client performance
  • VMware vMotion and Storage vMotion performance
  • VMware Distributed Resource Scheduler (DRS) and Distributed Power Management (DPM) performance

The book can be found here: http://www.vmware.com/files/pdf/techpaper/VMware-PerfBest-Practices-vSphere6-0.pdf.




Performance Best Practices for vSphere 5.5 is Available

We are pleased to announce the availability of Performance Best Practices for vSphere 5.5. This is a book designed to help system administrators obtain the best performance from vSphere 5.5 deployments.

The book addresses many of the new features in vSphere 5.5 from a performance perspective. These include:

  • vSphere Flash Read Cache, a new feature in vSphere 5.5 allowing flash storage resources on the ESXi host to be used for read caching of virtual machine I/O requests.
  • VMware Virtual SAN (VSAN), a new feature (in beta for vSphere 5.5) allowing storage resources attached directly to ESXi hosts to be used for distributed storage and accessed by multiple ESXi hosts.
  • The VMware vFabric Postgres database (vPostgres).

We’ve also updated and expanded on many of the topics in the book. These include:

  • Running applications that are sensitive to storage latency and network latency
  • NUMA and Virtual NUMA (vNUMA)
  • Memory overcommit techniques
  • Large memory pages
  • Receive-side scaling (RSS), both in guests and on 10 Gigabit Ethernet cards
  • VMware vMotion, Storage vMotion, and Cross-host Storage vMotion
  • VMware Distributed Resource Scheduler (DRS) and Distributed Power Management (DPM)
  • VMware Single Sign-On Server

The book can be found here.

VMmark 2.5 Released

I am pleased to announce the release of VMmark 2.5, the latest edition of VMware’s multi-host consolidation benchmark. The most notable change in VMmark 2.5 is the addition of optional power measurements for servers and servers plus storage. This capability will assist IT architects who wish to consider trade-offs in performance and power consumption when designing datacenters or evaluating new and emerging technologies, such as flash-based storage.

VMmark 2.5 contains a number of other improvements including:

  • Support for the VMware vCenter Server Appliance.
  • Support for VMmark 2.5 message and results delivery via Growl/Prowl.
  • Support for PowerCLI 5.1.
  • Updated workload virtual machine templates made from SLES for VMware, a free-use version of SLES 11 SP2.
  • Improved pre-run initialization checking.

Full release notes can be found here.

Over the past two years since its initial release, VMmark 2.x has become the most widely-published virtualization benchmark with over fifty published results. We expect VMmark 2.5 and its new capabilities to continue that momentum. Keep an eye out for new power and power-performance results from our hardware partners as well as a series of upcoming blog entries presenting interesting power-performance experiments from the VMmark team.

The power measurement capability in VMmark 2.5 utilizes the SPEC®™ PTDaemon (Power Temperature Daemon). The PTDaemon provides a straightforward and reliable building block with support for the many power analyzers that have passed the SPEC Power Analyzer Acceptance Test.

All currently published VMmark 2.0 and 2.1 results are comparable to VMmark 2.5 performance-only results. Beginning on January 8th, 2013, any submission of benchmark results must use the VMmark 2.5 benchmark kit.

Performance Best Practices for VMware vSphere 5.1

We’re pleased to announce the availability of Performance Best Practices for vSphere 5.1.  This is a book designed to help system administrators obtain the best performance from vSphere 5.1 deployments.

The book addresses many of the new features in vSphere 5.1 from a performance perspective.  These include:

  • Use of a system swap file to reduce VMkernel and related memory usage
  • Flex SE linked clones that can relinquish storage space when it’s no longer needed
  • Use of jumbo frames for hardware iSCSI
  • Single Root I/O virtualization (SR-IOV), allowing direct guest access to hardware devices
  • Enhancements to SplitRx mode, a feature allowing network packets received in a single network queue to be processed on multiple physical CPUs
  • Enhancements to the vSphere Web Client
  • VMware Cross-Host Storage vMotion, which allows virtual machines to be moved simultaneously across both hosts and datastores

We’ve also updated and expanded on many of the topics in the book.

These topics include:

  • Choosing hardware for a vSphere deployment
  • Power management
  • Configuring ESXi for best performance
  • Guest operating system performance
  • vCenter and vCenter database performance
  • vMotion and Storage vMotion performance
  • Distributed Resource Scheduler (DRS), Distributed Power Management (DPM), and Storage DRS performance
  • High Availability (HA), Fault Tolerance (FT), and VMware vCenter Update Manager performance
  • VMware vSphere Storage Appliance (VSA) and vCenter Single Sign-On Server performance

The book can be found at: http://www.vmware.com/pdf/Perf_Best_Practices_vSphere5.1.pdf.

Updated VMmark 2.1 Benchmarking Guide Available

Just a quick note to inform all benchmarking enthusiasts that we have released an updated VMmark 2.1 Benchmarking Guide. You can get it from the VMmark 2.1 download page. The updated guide contains a new troubleshooting section as well as more comprehensive instructions for using virtual clients and Windows Server 2008 clients.


VMmark 2.1 Released and Other News

VMmark 2.1 has been released and is available here. We had a list of improvements to VMmark 2.0 even as we finished up the initial release of the benchmark last fall. Most of the changes are intended to improve usability, manageability, and scale-out-ability of the benchmark. VMmark 2.0 has already generated tremendous interest from our partners and customers and we expect VMmark 2.1 to add to that momentum.

Only the harness and vclient directories have been refreshed for VMware VMmark 2.1. The notable changes include the following:

  • Uniform scaling of infrastructure operations as tile and cluster sizes increase. Previously, the dynamic storage relocation infrastructure workload was held at a single thread.
  • Allowance for multiple Deploy templates as tile and cluster sizes increase.
  • Addition of conditional support for clients running Windows Server 2008 Enterprise Edition 64-bit.
  • Addition of support for virtual clients, provided all hardware and software requirements are met.
  • Improved host-side reporter functionality.
  • Improved environment time synchronization.
  • Updates to several VMmark 2.0 tools to improve ease of setup and running.
  • Miscellaneous improvements to configuration checking, error reporting, debug output, and user-specified options.

All currently published VMmark 2.0 results are comparable to VMmark 2.1. Beginning with the release of VMmark 2.1, any submission of benchmark results must use the VMmark 2.1 benchmark kit.

In other news, Fujitsu published their first VMmark 2.0 result last week.

Also, Intel has joined the VMmark Review Panel. Other members are AMD, Cisco, Dell, Fujitsu, HP, and VMware. Every result published on the VMmark results page is reviewed for correctness and compliance by the VMmark Review Panel. In most cases this means that a submitter's result will be examined by their competitors prior to publication, which enhances the credibility of the results.

That's all for now, but we should be back soon with more interesting experiments using VMmark 2.1.

Cisco Publishes First VMmark 2.0 Result

Our partners at Cisco recently published the first official VMmark 2.0 result using a matched pair of UCS B200 M2 systems. You can find all of the details at the VMmark 2.0 Results Page. Using a matched pair of systems provides a close analogue to single-system benchmarks like VMmark 1.x while providing a more realistic performance profile by including infrastructure operations such as vMotion. Official VMmark 2.0 results are reviewed for accuracy and compliance by the VMmark Review Panel, consisting of AMD, Cisco, Dell, Fujitsu, HP, and VMware.


VMmark 2.0 Release

VMmark 2.0, VMware’s next-generation multi-host virtualization benchmark, is now generally available here.

We were motivated to create VMmark 2.0 by the revolutionary advancements in virtualization since VMmark 1.0 was conceived. The rapid pace of innovation in both the hypervisor and the hardware has quickly transformed datacenters by enabling easier virtualization of heavy and bursty workloads coupled with dynamic VM relocation (vMotion), dynamic datastore relocation (Storage vMotion), and automation of many provisioning and administrative tasks across large-scale multi-host environments. In this paradigm, a large fraction of the stresses on the CPU, network, disk, and memory subsystems is generated by the underlying infrastructure operations. Load balancing across multiple hosts can also greatly affect application performance. The benchmarking methodology of VMmark 2.0 continues to focus on user-centric application performance while accounting for the effects of infrastructure activity on overall platform performance. This approach provides a much more accurate picture of platform capabilities than less comprehensive benchmarks.

I would like to thank all of our partners who participated in the VMmark 2.0 beta program. Their thorough testing and insightful feedback helped speed the development process while delivering a more robust benchmark. I anticipate a steady flow of benchmark results from partners over the coming months and years.

I should also acknowledge the hard work of my colleagues on the VMmark team, who completed VMmark 2.0 on a relatively short timeline. We have performed a wide array of experiments during the development of VMmark 2.0 and will use the data as the basis for a series of upcoming posts in this forum. Some topics likely to be covered are cluster-wide scalability, performance of heterogeneous clusters, and networking tradeoffs between 1Gbit and 10Gbit for vMotion. I hope we can inspire others to use VMmark 2.0 to explore performance characteristics in multi-host environments in novel and interesting ways, all the way up to cloud scale.


VMmark 2.0 Beta Overview

As I mentioned in my last blog, we have been developing VMmark 2.0, a next-generation multi-host virtualization benchmark that models not only application performance in a virtualized environment but also the effects of common virtual infrastructure operations. This is a natural progression from single-host virtualization benchmarks like VMmark 1.x and SPECvirt_sc2010. Benchmarks measuring single-host performance, while valuable, do not adequately capture the complexity inherent in modern virtualized datacenters. With that in mind, we set out to construct a meaningfully stressful virtualization benchmark with the following properties:

  • Multi-host to model realistic datacenter deployments
  • Virtualization infrastructure workloads to more accurately capture overall platform performance
  • Heavier workloads than VMmark 1.x to reflect heavier customer usage patterns enabled by the increased capabilities of the virtualization and hardware layers
  • Multi-tier workloads driving both VM-to-VM and external network traffic
  • Workload burstiness to ensure robust performance under variable high loads

The addition of virtual infrastructure operations to measure their impact on overall system performance in a typical multi-host environment is a key departure from traditional single-server benchmarks. VMmark 2.0 includes the execution of the following foundational and commonly-used infrastructure operations:

  • User-initiated vMotion 
  • Storage vMotion
  • VM cloning and deployment
  • DRS-initiated vMotion to accommodate host-level load variations

The VMmark 2.0 tile features a significantly heavier load profile than VMmark 1.x and consists of the following workloads:

  • DVD Store 2 – multi-tier OLTP workload consisting of a 4-vCPU database VM and three 2-vCPU webserver VMs driving a bursty load profile
  • OLIO – multi-tier social networking workload consisting of a 4-vCPU web server and a 2-vCPU database server
  • Exchange2007 – 4-vCPU mailserver workload
  • Standby server – 1 vCPU lightly-loaded server
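
To make the size of a tile concrete, the workload list above can be tallied in a few lines of Python. This is only an illustrative sketch: the `TILE` table and the helper names `tile_totals` and `cluster_totals` are made up for this example and are not part of the VMmark kit, and the tally counts only the workload VMs listed above (not client systems or infrastructure operations).

```python
# Tile composition as listed above: (workload, VM role, vCPUs per VM, VM count).
# The table and the function names are hypothetical, for illustration only.
TILE = [
    ("DVD Store 2", "database", 4, 1),
    ("DVD Store 2", "webserver", 2, 3),
    ("OLIO", "web server", 4, 1),
    ("OLIO", "database", 2, 1),
    ("Exchange2007", "mailserver", 4, 1),
    ("Standby server", "standby", 1, 1),
]


def tile_totals(tile=TILE):
    """Return (total VMs, total vCPUs) for a single tile."""
    vms = sum(count for _, _, _, count in tile)
    vcpus = sum(vcpus * count for _, _, vcpus, count in tile)
    return vms, vcpus


def cluster_totals(num_tiles, tile=TILE):
    """Return (total VMs, total vCPUs) when num_tiles tiles run at once."""
    vms, vcpus = tile_totals(tile)
    return num_tiles * vms, num_tiles * vcpus


if __name__ == "__main__":
    vms, vcpus = tile_totals()
    print(f"One tile: {vms} VMs, {vcpus} vCPUs")
    vms3, vcpus3 = cluster_totals(3)
    print(f"Three tiles: {vms3} VMs, {vcpus3} vCPUs")
```

By this count a single tile comprises 8 workload VMs and 21 vCPUs, so the virtual CPU demand grows quickly as tiles are added across a cluster.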

We kicked off an initial partner-only beta program in late June and are actively polishing the benchmark for general release. We will be sharing a number of interesting experiments using VMmark 2.0 in our blog leading up to the general release of the benchmark, so stay tuned.

Surveying Virtualization Performance Trends with VMmark

The trends in published VMmark scores are an ideal illustration of the historical long-term performance gains for virtualized platforms. We began work on what would become VMmark 1.0 almost five years ago. At the time, ESX 2.5 was the state-of-the-art hypervisor. Today’s standard features such as DRS, DPM, and Storage VMotion were in various prototype and development stages. Processors like the Intel Pentium4 5xx series (Prescott) or the single-core AMD 2yy-series Opterons were the high-end CPUs of choice. Second-generation hardware-assisted virtualization features such as AMD’s Rapid Virtualization Indexing (RVI) and Intel’s Extended Page Tables (EPT) were not yet available. Nevertheless, virtualization’s first wave was allowing customers to squeeze much more value from their existing resources via server consolidation. Exactly how much value was difficult to quantify. Our VMmark odyssey began with the overall goal of creating a representative and reliable benchmark capable of providing meaningful comparisons between virtualization platforms.

VMmark 1.0 was released nearly three years ago after two years of painstaking work and multiple beta releases of the benchmark. The reference architecture for VMmark 1.x is a two-processor Pentium4 (Prescott) server running ESX 3.0. That platform was capable of supporting one VMmark tile (six VMs) and by definition achieved a score of 1.0. (All VMmark results are normalized to this reference score.) The graph below shows a sampling of published two-socket VMmark scores for each successive processor generation.

[Figure: sampling of published two-socket VMmark scores for each successive processor generation]

ESX 3.0, a vastly more capable hypervisor than ESX 2.5, had arrived by the time of the VMmark 1.0 GA in mid-2007. Greatly improved CPU designs were also available. Two processors commonly in use by that time were the dual-core Xeon 51xx series and the quad-core Xeon 53xx series. ESX 3.5 was released with a number of performance improvements, such as TCP Segmentation Offloading (TSO) support for networking, in the same timeframe as the Xeon 54xx. Both ESX 4.0 and Intel 55xx (Nehalem) CPUs became available in early 2009. ESX 4.0 was a major new release with a broad array of performance enhancements and supported new hardware features such as EPT and simultaneous multi-threading (SMT), providing a significant boost in overall performance. The recently released hexa-core Intel 56xx CPUs (Westmere) show excellent scaling compared to their quad-core 55xx brethren. (Overall, ESX delivers excellent scaling and takes advantage of increased core counts on all types of servers.) What is most striking to me in this data is the big picture: the performance of virtualized consolidation workloads as measured by VMmark 1.x has roughly doubled every year for the past five years.
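
To put "roughly doubled every year for the past five years" in perspective, the compounding can be worked out directly. The sketch below is plain arithmetic, not a VMmark result: the function names are invented for this example, and the factor of 2.0 is taken at face value even though the observed doubling is only approximate.

```python
def cumulative_gain(annual_factor: float, years: int) -> float:
    """Total improvement after compounding annual_factor for the given years."""
    return annual_factor ** years


def implied_annual_factor(total_gain: float, years: int) -> float:
    """Average yearly factor that would compound to total_gain over the period."""
    return total_gain ** (1.0 / years)


if __name__ == "__main__":
    # Exact doubling each year for five years (an idealized assumption).
    print(cumulative_gain(2.0, 5))
    # Working backwards from a total gain to the average yearly factor.
    print(implied_annual_factor(32.0, 5))
```

Exact yearly doubling over five years would compound to a 32-fold gain, which is why even a modest shortfall from doubling in any single year changes the cumulative figure noticeably.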

In fact, the performance of virtualized platforms has increased to the point that the focus has shifted away from consolidating lightly-loaded virtual machines on a single server to virtualizing the entire range of workloads (heavy and light) across a dynamic multi-host datacenter. Not only application performance but also infrastructure responsiveness and robustness must be modeled to characterize modern virtualized environments. With this in mind, we are currently developing VMmark 2.0, a much more complex, multi-host successor to VMmark 1.x. We are rapidly approaching a limited beta release of this new benchmark, so stay tuned for more. But in this post, I’d like to look back and remember how far we’ve come with VMmark 1.x. Let’s hope the next five years are as productive.