
Tag Archives: ESXi

Custom Power Management Settings for Power Savings in vSphere 5.5

VMware vSphere serves as a common virtualization platform for a diverse ecosystem of applications. Every application has different performance demands which must be met, but the power and cooling costs of running these applications are also a concern. vSphere’s default power management policy, “Balanced”, meets both of these goals by effectively preserving system performance while still saving some power.

For those who would like to prioritize energy efficiency even further, vSphere provides additional ways to tweak its power management under the covers. Custom power management settings in ESXi let you create your own power management policy, and your server’s BIOS also typically lets you customize hardware settings which can maximize power savings at a potential cost to performance.

When choosing a low power setting, we need to know whether it is effective at increasing energy efficiency, that is, the amount of work achieved for the power consumed. We also need to know how large of an impact the setting has on application throughput and latencies. A power saving setting that is too aggressive can result in low system performance. The best combination of power saving techniques will be highly individualized to your workload; here, we present one case study.

We used the VMmark virtualization benchmark to measure the effect of ESXi custom power settings and BIOS custom settings on energy efficiency and performance. VMmark 2.5 is a multi-host virtualization benchmark that uses diverse application workloads as well as common platform level workloads to model the demands of the datacenter. VMs running a complete set of the application workloads are grouped into units of load called tiles. For more details, see the VMmark 2.5 overview.

In this study, the best custom power setting produced an increase in energy efficiency of 17% with no significant drop in performance at moderate levels of load.

Test Methodology

All tests were conducted on a two-node cluster running VMware vSphere 5.5 U1. Each custom power management setting was tested independently to gauge its effects on energy efficiency and performance while all other settings were left at their defaults. The settings tested fall into two categories: ESXi custom power settings and BIOS custom settings. We discuss how to modify these settings at the end of the article.

Systems Under Test: Two Dell PowerEdge R720 servers
Configuration Per Server
    CPUs: Two 12-core Intel® Xeon® E5-2697 v2 @ 2.7 GHz (Turbo Boost enabled, up to 3.5 GHz), Hyper-Threading enabled
    Memory: 256GB ECC DDR3 @ 1866MHz
    Host Bus Adapter: QLogic ISP2532 Dual-Port 8Gb Fibre Channel to PCI Express
    Network Controller: Integrated Intel I350 Quad-Port Gigabit Adapter, one Intel I350 Dual-Port Gigabit PCIe Adapter
    Hypervisor: VMware ESXi 5.5 U1
Shared Resources
    Virtualization Management: VMware vCenter Server 5.5
    Storage Array: EMC VNX5800
        30 Enterprise Flash Drives (SSDs) and 32 HDDs, grouped as two 10-SSD RAID0 LUNs and four 8-HDD RAID0 LUNs; FAST Cache was configured from 10 SSDs
    Power Meters: One Yokogawa WT210 per server

Each configuration was tested at five load points: 1 tile (the lowest load level), 4, 7, 10, and 12 tiles, the maximum number of tiles that met Quality of Service (QoS) requirements. All data points are the mean of three tests of each configuration.

ESXi Custom Power Settings

ESXi custom power settings influence the power state of the processor. We tested the two custom power management settings which had the greatest impact on our workload: Power.MaxFreqPct and Power.CstateResidencyCoef. The advanced ESXi setting Power.MaxFreqPct (default value 100) reduces the processor frequency by placing a cap on the highest operating frequency it can reach. In practice, the processor can operate only at certain set frequencies (P-states), so if the frequency cap requested by ESXi (e.g. 2160MHz) does not match a set frequency state, the processor will run at the nearest lower frequency state (e.g. 2100MHz). Setting Power.MaxFreqPct = 99 put the cap at 99% of the processor’s nominal frequency, which limited Turbo Boost. Power.MaxFreqPct = 80 further limited the maximum frequency of the processor to 80% of its nominal frequency of 2.7GHz, for a maximum of 2.1GHz. Setting Power.CstateResidencyCoef = 0 (default value 5) puts the processor into its deepest available C-state, or lowest power state, when it is idle. As a prerequisite, deep C-states must be enabled in the BIOS. For a more in-depth discussion of power management techniques and other custom options, please see the vSphere documentation and the whitepaper Host Power Management in VMware vSphere 5.5.
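To make the P-state mapping concrete, here is a small illustrative sketch (plain Python, not ESXi code); the P-state table it uses is an assumption made up for the example, not the actual frequency table of the E5-2697 v2.

```python
# Illustrative only: how a Power.MaxFreqPct cap maps onto discrete P-states.
# The P-state table below is a hypothetical example (100MHz steps up to the
# 2.7GHz nominal frequency), not the real table of the Xeon E5-2697 v2.

NOMINAL_MHZ = 2700
P_STATES_MHZ = list(range(1200, 2701, 100))  # assumed available P-states

def effective_frequency(max_freq_pct: int) -> int:
    """Return the P-state a host would settle on for a given Power.MaxFreqPct."""
    requested = NOMINAL_MHZ * max_freq_pct / 100          # e.g. 80% -> 2160 MHz
    eligible = [f for f in P_STATES_MHZ if f <= requested]
    return max(eligible) if eligible else min(P_STATES_MHZ)

print(effective_frequency(100))  # 2700 MHz (Turbo Boost states not modeled here)
print(effective_frequency(80))   # cap of 2160 MHz -> runs at the 2100 MHz P-state
```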

VMmark models energy efficiency as performance score per kilowatt of power consumed. VMmark scores in the graph below have been normalized to the default “Balanced” 1-tile result, which does not use any custom power settings.

VMware ESXi Custom Power Management Settings improve efficiency

A major trend can be seen here: an increase in load is correlated with greater energy efficiency. As the CPUs become busier, throughput increases at a faster rate than the required power. This can be understood by noting that an idle server still consumes power, but with no work to show for it. A highly utilized server is typically the most energy efficient per request completed, and the results bear this out.
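A toy model with assumed numbers illustrates the point: because the idle power floor is paid regardless of load, the work completed per watt keeps improving as the server gets busier.

```python
# Toy model (assumed numbers): power = idle floor + a per-unit-of-work cost.
IDLE_WATTS = 130           # assumed idle power draw
WATTS_PER_UNIT_LOAD = 1.7  # assumed incremental power per unit of throughput

def efficiency(throughput_units: float) -> float:
    """Work completed per watt consumed."""
    power = IDLE_WATTS + WATTS_PER_UNIT_LOAD * throughput_units
    return throughput_units / power

for load in (10, 40, 70, 100, 120):
    print(f"load={load:>3}  efficiency={efficiency(load):.3f} units/watt")
# Efficiency keeps rising with load because the idle floor is amortized
# over more completed work.
```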

To more closely examine the relative impact of each custom setting compared to the default setting, we normalized all results within each load level to the default “Balanced” result for that number of tiles. The figure below shows the percent change at each load level.

VMware ESXi Custom Power Management Settings Change in Efficiency and Performance Results

All custom settings showed improvements in efficiency compared to the default “Balanced” setting, and the improvements varied with load. Setting MaxFreqPct to 99 had the greatest benefit to energy efficiency, improving it by 5% to 17% depending on load level. The largest improvement was seen at 4 tiles, where efficiency increased by 17% while performance decreased by only 3%. The performance cost increased with load, reaching 9% at 12 tiles. However, limiting processor frequency even further, to a maximum of 80% of nominal, did not produce an additive effect. Not only did efficiency decrease relative to MaxFreqPct=99, but performance was profoundly curtailed, falling from 96% of baseline at light load to 84% of baseline on a heavily loaded machine. CstateResidencyCoef=0 produced some modest increases in efficiency for a lightly loaded server, but the effect disappeared at higher load levels.

VMmark 2.5 performance scores are based on application and infrastructure workload throughput, while application latency reflects Quality of Service. For the Mail Server, Olio, and DVD Store 2 workloads, latency is defined as the application’s response time. We wanted to see how custom power management settings affected application latency as opposed to the VMmark score. All latencies are normalized to the lowest 1-tile results.

VMware ESXi Custom Power Management Settings Effect on Application Latencies

Naturally, latencies increase as load increases from 1 to 12 tiles. Fortunately, the custom power management policies caused only minimal increases in application latencies, if any, except for the MaxFreqPct=80 setting, which did create elevated latencies across the board.

BIOS Custom Power Settings

The Dell PowerEdge R720 BIOS provides another toolbox of power-saving knobs to tweak. Using the BIOS settings, we manually disabled Turbo Boost and reduced memory frequency from its default maximum speed of 1866MT/s (megatransfers per second) to either 1333MT/s or 800MT/s.

Custom Power Management BIOS Settings: Energy Efficiency Results

The Turbo Boost Disabled configuration produced the largest increase in efficiency, while 800MT/s memory frequency actually decreased efficiency at the higher load levels.
Again, we normalized all results within each load level to its default “Balanced” result. The figure below shows the percent change at each load level.

Custom Power Management BIOS Settings: Change in Efficiency and Performance Results

Disabling Turbo Boost was the most effective setting for increasing energy efficiency, with a performance cost ranging from 2% at low load levels to 8% at high load levels. Reducing memory frequency to 1333MT/s provided a reliable but small boost to efficiency and had no effect on performance, leading us to conclude that a memory speed of 1866MT/s is simply faster than this workload needs.

Custom Power Management BIOS Settings: Effect on Application Latencies

Disabling Turbo Boost and reducing memory frequency to 800MT/s increased DVD Store 2 latencies by 10% at 10 tiles and by 30% at 12 tiles, but all latencies were still well within Quality of Service requirements. Reducing memory frequency to 1333MT/s had no effect on application latencies.

Reducing the use of Turbo Boost, whether through the ESXi custom setting MaxFreqPct or through the BIOS, proved to be the most effective way to increase energy efficiency in our VMmark tests. The impact on performance was small, but it increased with load. MaxFreqPct is the preferred setting because, like all ESXi custom power management settings, it takes effect immediately and can easily be reversed without reboots or downtime. Other custom power management settings produced modest gains in efficiency but, when taken to the extreme, not only harmed performance but also failed to increase efficiency further. In addition, energy efficiency is strongly related to load; the most efficient server is a heavily utilized one. Taking steps to increase server utilization, such as server consolidation, is therefore an important part of a power saving strategy. Custom power management settings can produce gains in energy efficiency at a cost to performance, so consider this tradeoff when choosing custom power management settings for your own environment.


How to Configure Custom Power Management Settings

Disclaimer: The results presented above are a case study of the impact of custom power management settings and a starting point only. Results may not apply to your environment and do not represent best practices.

Exercise caution when choosing a custom power management setting, and change settings one at a time to evaluate their impact on your environment. Monitor your server’s power consumption through its UPS, or consult your vendor for the rated accuracy of your server’s internal power monitoring sensor; if it is highly accurate, you can view the server’s power consumption in esxtop (press ‘p’ to view Power Usage).

To customize power management settings, enter your server’s BIOS. Power Management settings vary by vendor but most include “OS Controlled” and “Custom” policies.

In the Dell PowerEdge R720, choosing the “Performance Per Watt (OS)” System Profile allows ESXi to control power management, while leaving hardware settings at their default values.

Screenshot of the R720 BIOS: selecting OS-controlled power management

Choosing the “Custom” System Profile and setting CPU Power Management to “OS DBPM” allows ESXi to control power management while enabling custom hardware settings.

Screenshot of the R720 BIOS: Custom System Profile settings

Using ESXi Custom Power Settings

To enable the vSphere custom power management policy,

  1. Browse to the host in the vSphere Web Client navigator.
  2. Click the Manage tab and click Settings.
  3. Under Hardware, select Power Management and click the Edit button.
  4. Select the Custom power management policy and click OK.

The power management policy changes immediately and does not require a server reboot.

Screenshots: VMware ESXi Host Power Management setting and Custom power management setting
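If you prefer to script this step, the same policy change is exposed through the vSphere API. The following is a minimal pyVmomi sketch, assuming a reachable vCenter and host; the hostnames and credentials are placeholders, and the policy is selected by its short name rather than a hard-coded key.

```python
# pyVmomi sketch: select the "custom" host power policy via the vSphere API.
# Hostnames and credentials are placeholders; verify against your environment.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()          # lab use only
si = SmartConnect(host="vcenter.example.com",
                  user="administrator@vsphere.local",
                  pwd="password", sslContext=ctx)
content = si.RetrieveContent()

# Find the target host (placeholder name).
view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.HostSystem], True)
host = next(h for h in view.view if h.name == "esxi01.example.com")

power_system = host.configManager.powerSystem
# Look up the policy whose shortName is "custom" instead of hard-coding a key.
custom = next(p for p in power_system.capability.availablePolicy
              if p.shortName == "custom")
power_system.ConfigurePowerPolicy(key=custom.key)   # takes effect immediately
print("Current policy:", power_system.info.currentPolicy.shortName)

Disconnect(si)
```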

To modify ESXi custom power management settings,

  1. Browse to the host in the vSphere Web Client navigator.
  2. Click the Manage tab and click Settings.
  3. Under System, select Advanced System Settings.
  4. Power management parameters that affect the Custom policy have descriptions that begin with “In Custom policy.” All other power parameters affect all power management policies.
  5. Select the parameter and click the Edit button.

Note: The default values of power management parameters match the Balanced policy.

Screenshot: VMware ESXi Advanced System Settings
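The advanced parameters can also be set programmatically through the host’s OptionManager. Below is a hedged pyVmomi sketch that reuses the `host` object from the previous example and uses Power.MaxFreqPct purely as an illustration; verify the option name and the expected value type in your own environment first.

```python
# pyVmomi sketch: set an ESXi advanced power option (Power.MaxFreqPct here)
# via the host's OptionManager. Assumes `host` is the vim.HostSystem object
# retrieved in the earlier example.
from pyVmomi import vim

option_manager = host.configManager.advancedOption

# Read the current value first (QueryOptions returns a list of OptionValue).
current = option_manager.QueryOptions("Power.MaxFreqPct")
print("Power.MaxFreqPct before:", current[0].value)

# Apply the new cap; the value type must match the option (an integer here).
option_manager.UpdateOptions(changedValue=[
    vim.option.OptionValue(key="Power.MaxFreqPct", value=99)
])

print("Power.MaxFreqPct after:",
      option_manager.QueryOptions("Power.MaxFreqPct")[0].value)
```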

 

Virtual SAP HANA Achieves Production Level Performance

VMware CEO Pat Gelsinger announced production support for SAP HANA on VMware vSphere 5.5 at EMC World this week during his keynote. This is the end result of a very thorough joint testing project over the past year between VMware and SAP.

HANA is an in-memory platform (including database capabilities) from SAP that has enabled huge gains in performance for customers and has been a high priority for SAP over the past few years.  In order for HANA to be supported in a virtual machine on vSphere 5.5 for production workloads, we worked closely with SAP to enable, design, and measure in-depth performance tests.

In order to enable the testing and ongoing production support of SAP HANA on vSphere, two HANA appliance servers were ordered, shipped, and installed in SAP’s labs in Walldorf, Germany.  These systems are dedicated to running SAP HANA on vSphere onsite at SAP.  Each system is an Intel Xeon E7-8870 (Westmere-EX) based four-socket server with 1TB of RAM.  They are used for performance testing and also for ongoing support of HANA on vSphere.  Additionally, VMware has onsite support engineering to assist with the testing and support.

SAP designed an extensive performance test suite that used a large number of test scenarios to stress all functions and capabilities of HANA running on vSphere 5.5.  They included OLAP and OLTP with a wide range of data sizes and query functions. In all, over one thousand individual test cases were used in this comprehensive test suite.  These same tests were run on identical native HANA systems and the difference between native and virtual tests was used as the key performance indicator.

In addition, we also tested vSphere features including vMotion, DRS, and VMware HA with virtual machines running HANA.  These tests were done with the HANA virtual machine under heavy stress.

The test results have been extremely positive and are one of the key factors in the announcement of production support.  The difference between virtual and native HANA across all the performance tests was on average within a few percentage points.

The vMotion, DRS, and VMware HA tests were all completed without issues.  Even with the large memory sizes of HANA virtual machines, we were still able to successfully migrate them with vMotion while under load with no issues.

One of the results of the extensive testing is a best practices guide for HANA on vSphere 5.5. This document includes a performance guide for running HANA on vSphere 5.5 based on this extensive testing.  The document also includes information about how to size a virtual HANA instance and how VMware HA can be used in conjunction with HANA’s own replication technology for high availability.

Line-Rate Performance with 80GbE and vSphere 5.5

With the increasing number of physical cores in a system, the networking bandwidth requirement per server has also increased. We often find many networking-intensive applications are now being placed on a single server, which results in a single vSphere server requiring more than one 10 Gigabit Ethernet (GbE) adapter. Additional network interface cards (NICs) are also deployed to separate management traffic and the actual virtual machine traffic. It is important for these servers to service the connected NICs well and to drive line rate on all the physical adapters simultaneously.

vSphere 5.5 supports eight 10GbE NICs on a single host, and we demonstrate that a host running with vSphere 5.5 can not only drive line rate on all the physical NICs connected to the system, but can do it with a modest increase in overall CPU cost as we add more NICs.

We configured a single host with four dual-port Intel 10GbE adapters for the experiment and connected them back-to-back with an IXIA Application Network Processor Server with eight 10GbE ports to generate traffic. We then measured the send/receive throughput and the corresponding CPU usage of the vSphere host as we increased the number of NICs under test on the system.

Environment Configuration

  • System Under Test: Dell PowerEdge R820
  • CPUs: 4 x Intel Xeon Processor E5-4650 @ 2.70GHz
  • Memory: 128GB
  • NICs: 8 x Intel 82599EB 10GbE, SFP+ Network Connection
  • Client: Ixia Xcellon-Ultra XT80-V2, 2U Application Network Processor Server

Challenges in Getting 80Gbps Throughput

To drive near 80 gigabits of data per second from a single vSphere host, we used a server that has not only the required CPU and memory resources, but also enough PCI Express (PCIe) bandwidth to perform the necessary I/O operations. We used a Dell PowerEdge server with the Intel E5-4650 processor because it belongs to the first generation of Intel processors that supports PCIe Gen 3.0, which doubles the PCI bandwidth available compared to PCIe Gen 2.0. Each dual-port Intel 10GbE adapter needs at least a PCIe Gen 2.0 x8 slot to reach line rate. The processor also has Intel Data Direct I/O Technology, which places incoming packets directly in the processor cache rather than in memory; this reduces memory bandwidth consumption and also helps reduce latency.
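A quick back-of-the-envelope calculation, using the standard per-lane rates and encodings for PCIe Gen 2.0 and Gen 3.0, shows why a x8 slot is the minimum for a dual-port 10GbE adapter:

```python
# Back-of-the-envelope PCIe bandwidth check (standard per-lane rates).
# PCIe Gen 2.0: 5 GT/s per lane with 8b/10b encoding   -> 4.0 Gbps usable/lane
# PCIe Gen 3.0: 8 GT/s per lane with 128b/130b encoding -> ~7.88 Gbps usable/lane

def usable_gbps(gt_per_s: float, payload_bits: int, line_bits: int, lanes: int) -> float:
    return gt_per_s * payload_bits / line_bits * lanes

gen2_x8 = usable_gbps(5.0, 8, 10, 8)      # ~32 Gbps per direction
gen3_x8 = usable_gbps(8.0, 128, 130, 8)   # ~63 Gbps per direction

dual_port_10gbe = 2 * 10                  # ~20 Gbps per direction at line rate

print(f"Gen 2.0 x8: {gen2_x8:.0f} Gbps, Gen 3.0 x8: {gen3_x8:.0f} Gbps, "
      f"dual-port 10GbE needs ~{dual_port_10gbe} Gbps")
```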

Experiment Overview

Each 10GbE port of the vSphere 5.5 server was configured with a separate vSwitch, and each vSwitch had two Red Hat 6.0 Linux virtual machines running an instance of Apache web server. The web server virtual machines were configured with 1 vCPU and 2GB of memory with VMXNET3 as the virtual NIC adapter.  The 10GbE ports were then connected to the Ixia Application Server port. Since the server had two x16 slots and five x8 slots, we used the x8 slots for the four 10GbE NICs so that each physical NIC had identical resources. For each physical connection, we then configured 200 web/HTTP connections, 100 for each web server, on an Ixia server that requested or posted the file. We used a high number of connections so that we had enough networking traffic to keep the physical NIC at 100% utilization.

Figure 1. System design of NICs, switches, and VMs

The Ixia Xcellon application server used an HTTP GET request to generate a send workload for the vSphere host. Each connection requested a 1MB file from the HTTP web server.

Figure 2 shows that we could consistently get the available[1] line rate for each physical NIC as we added more NICs to the test. Each physical NIC was transmitting 120K packets per second and the average TSO packet size was close to 10K. The NIC was also receiving 400K packets per second for acknowledgements on the receive side. The total number of packets processed per second was close to 500K for each physical connection.

Figure 2. vSphere 5.5 drives throughput at available line rates. TSO on the NIC resulted in lower packets per second for send.
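As a rough sanity check on the send-side numbers above (treating the ~10K average TSO packet size as roughly 10,000 bytes):

```python
# Rough sanity check on the send-side packet rates quoted above.
tso_pps = 120_000          # TSO packets transmitted per second per NIC
avg_tso_bytes = 10_000     # average TSO packet size, ~10K per the results above

gbps = tso_pps * avg_tso_bytes * 8 / 1e9
print(f"~{gbps:.1f} Gbps per NIC")   # ~9.6 Gbps, i.e. roughly 10GbE line rate
```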

Similar to the send case, we configured the application server to post a 1MB file using an HTTP POST request to generate receive traffic for the vSphere host. We used the same number of connections and observed similar throughput behavior. Since the NIC does not have hardware LRO support, each NIC was receiving 800K packets per second; with eight 10GbE NICs, the packet rate reached close to 6.4 million packets per second. ESXi performs software LRO for Linux guests, so the guest sees large packets and a packet rate of around 240K packets per second. There was also significant TCP acknowledgement traffic: the host was transmitting close to 120K acknowledgement packets per second for each physical NIC, bringing the total number of packets processed close to 7.5 million per second for eight 10GbE ports.

Figure 3. Average vSphere 5.5 host CPU utilization for send and receive

We also measured the average CPU usage reported for each of the tests. Figure 3 shows that the vSphere host’s CPU usage increased linearly as we added more physical NICs to the test, for both send and receive. This indicates that the CPU cost scales at an expected and acceptable rate as NICs are added.

These test results show that vSphere 5.5 is an excellent platform on which to deploy networking-intensive workloads. vSphere 5.5 makes use of all the physical bandwidth capacity available and does so with only a modest, predictable increase in CPU cost.

 


[1] A 10GbE NIC can achieve only about 9.4 Gbps of throughput with a standard MTU: of each 1500-byte packet, 40 bytes are consumed by the TCP/IP headers, and each frame carries an additional 38 bytes of Ethernet framing overhead.
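The arithmetic behind that footnote can be checked directly:

```python
# Arithmetic behind footnote [1]: effective 10GbE throughput with a 1500-byte MTU.
mtu = 1500                 # bytes of IP payload per packet
tcp_ip_headers = 40        # bytes of TCP/IP headers carried inside the MTU
ethernet_overhead = 38     # Ethernet framing overhead per packet

goodput_fraction = (mtu - tcp_ip_headers) / (mtu + ethernet_overhead)
print(f"{10 * goodput_fraction:.2f} Gbps")   # ~9.49 Gbps, in line with the ~9.4 Gbps above
```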

Power Management and Performance in ESXi 5.1

Power and cooling are a substantial portion of datacenter costs. Ideally, we could minimize these costs by optimizing the datacenter’s energy consumption without impacting performance. The Host Power Management feature, which has been enabled by default since ESXi 5.0, allows hosts to reduce power consumption while boosting energy efficiency by putting processors into a low-power state when they are not fully utilized.

Power management can be controlled by either the BIOS or the operating system. In the BIOS, manufacturers provide several types of Host Power Management policies. Although they vary by vendor, most include “Performance,” which does not use any power saving techniques, “Balanced,” which claims to increase energy efficiency with minimal or no impact to performance, and “OS Controlled,” which passes power management control to the operating system. The “Balanced” policy is variously known as “Performance per Watt,” “Dynamic,” and other labels; consult your vendor for details. If “OS Controlled” is enabled in the BIOS, ESXi will manage power using one of the policies “High performance,” “Balanced,” “Low power,” or “Custom.” We chose to study Balanced because it is the default setting.

But can the Balanced setting, whether controlled by the BIOS or by ESXi, reduce performance relative to the Performance setting? We have received reports from customers who encountered performance problems while using the BIOS-controlled Balanced setting. Without knowing Balanced’s effect on performance and energy efficiency, users might select the Performance policy to play it safe when performance is at a premium. To answer this question, we tested the impact of power management policies on performance and energy efficiency using VMmark 2.5.

VMmark 2.5 is a multi-host virtualization benchmark that uses varied application workloads as well as common datacenter operations to model the demands of the datacenter. VMs running diverse application workloads are grouped into units of load called tiles. For more details, see the VMmark 2.5 overview.

We tested three policies: the BIOS-controlled Performance setting, which uses no power management techniques, the ESXi-controlled Balanced setting (with the BIOS set to OS-Controlled mode), and the BIOS-controlled Balanced setting. The ESXi Balanced and BIOS-controlled Balanced settings cut power by reducing processor frequency and voltage among other power saving techniques.

We found that the ESXi Balanced setting did an excellent job of preserving performance, with no measurable performance impact at any load level. Not only was performance preserved, but the ESXi Balanced setting also produced consistent improvements in energy efficiency, even while idle. By comparison, the BIOS Balanced setting aggressively saved power but created higher latencies and reduced performance. The following results detail our findings.

Testing Methodology
All tests were conducted on a four-node cluster running VMware vSphere 5.1. We compared performance and energy efficiency of VMmark between three power management policies: Performance, the ESXi-controlled Balanced setting, and the BIOS-controlled Balanced setting, also known as “Performance per Watt (Dell Active Power Controller).”

Configuration
Systems Under Test: Four Dell PowerEdge R620 servers
CPUs (per server): One Eight-Core Intel® Xeon® E5-2665 @ 2.4 GHz, Hyper-Threading enabled
Memory (per server): 96GB DDR3 ECC @ 1067 MHz
Host Bus Adapter: Two QLogic QLE2562, Dual Port 8Gb Fibre Channel to PCI Express
Network Controller: One Intel Gigabit Quad Port I350 Adapter
Hypervisor: VMware ESXi 5.1.0
Storage Array: EMC VNX5700
62 Enterprise Flash Drives (SSDs), RAID 0, grouped as 3 x 8 SSD LUNs, 7 x 5 SSD LUNs, and 1 x 3 SSD LUN
Virtualization Management: VMware vCenter Server 5.1.0
VMmark version: 2.5
Power Meters: Three Yokogawa WT210

Results
To determine the maximum VMmark load supported for each power management setting, we increased the number of VMmark tiles until the cluster reached saturation, which is defined as the largest number of tiles that still meet Quality of Service (QoS) requirements. All data points are the mean of three tests in each configuration and VMmark scores are normalized to the BIOS Balanced one-tile score.

Effects of Power Management on VMmark 2.5 score

The VMmark scores were equivalent between the Performance setting and the ESXi Balanced setting with less than a 1% difference at all load levels. However, running on the BIOS Balanced setting reduced the VMmark scores an average of 15%. On the BIOS Balanced setting, the environment was no longer able to support nine tiles and, even at low loads, on average, 31% of runs failed QoS requirements; only passing runs are pictured above.

We also compared the improvements in energy efficiency of the two Balanced settings against the Performance setting. The Performance per Kilowatt metric, which is new to VMmark 2.5, models energy efficiency as VMmark score per kilowatt of power consumed. More efficient results will have a higher Performance per Kilowatt.
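As an illustration of how a score-per-kilowatt metric behaves, here is a small sketch; the scores and wattages in it are made-up placeholders, not our measured results.

```python
# Illustration of a performance-per-kilowatt style efficiency metric.
# The scores and average wattages are made-up placeholders, not measured data.
runs = [
    {"policy": "Performance",   "score": 1.00, "avg_watts": 1400},
    {"policy": "ESXi Balanced", "score": 1.00, "avg_watts": 1360},
    {"policy": "BIOS Balanced", "score": 0.85, "avg_watts": 1000},
]

for r in runs:
    per_kw = r["score"] / (r["avg_watts"] / 1000.0)   # score per kilowatt
    print(f'{r["policy"]:>14}: {per_kw:.2f} score/kW')
# A policy can raise score-per-kilowatt either by holding score steady while
# using less power, or by cutting power faster than it cuts score.
```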

Effects of Power Management on Energy Efficiency

Two trends are visible in this figure. As expected, the Performance setting showed the lowest energy efficiency. At every load level, ESXi Balanced was about 3% more energy efficient than the Performance setting, despite the fact that it delivered an equivalent score to Performance. The BIOS Balanced setting had the greatest energy efficiency, 20% average improvement over Performance.

Second, an increase in load is correlated with greater energy efficiency. As the CPUs become busier, throughput increases at a faster rate than the required power. This can be understood by noting that an idle server still consumes power, but with no work to show for it. A highly utilized server is typically the most energy efficient per request completed, which is confirmed in our results. Higher energy efficiency creates cost savings in host energy consumption and in cooling costs.

The bursty nature of most environments leads them to sometimes idle, so we also measured each host’s idle power consumption. The Performance setting showed an average of 128 watts per host, while ESXi Balanced and BIOS Balanced consumed 85 watts per host. Although the Performance and ESXi Balanced settings performed very similarly under load, hosts using ESXi Balanced and BIOS Balanced power management consumed 33% less power while idle.

VMmark 2.5 scores are based on application and infrastructure workload throughput, while application latency reflects Quality of Service. For the Mail Server, Olio, and DVD Store 2 workloads, latency is defined as the application’s response time. We wanted to see how power management policies affected application latency as opposed to the VMmark score. All latencies are normalized to the lowest results.

Effects of Power Management on VMmark 2.5 Latencies

Whereas the Performance and ESXi Balanced latencies tracked closely, BIOS Balanced latencies were significantly higher at all load levels. Furthermore, latencies were unpredictable even at low load levels, and for this reason, 31% of runs between one and eight tiles failed; these runs are omitted from the figure above. For example, half of the BIOS Balanced runs did not pass QoS requirements at four tiles. These higher latencies were the result of aggressive power saving by the BIOS Balanced policy.

Our tests showed that ESXi’s Balanced power management policy didn’t affect throughput or latency compared to the Performance policy, but did improve energy efficiency by 3%. While the BIOS-controlled Balanced policy improved power efficiency by an average of 20% over Performance, it was so aggressive in cutting power that it often caused VMmark to fail QoS requirements.

Overall, the BIOS-controlled Balanced policy produced substantial efficiency gains, but with unpredictable performance, failed runs, and reduced performance at all load levels. This policy may still be suitable for some workloads that can tolerate this unpredictability, but it should be used with caution. On the other hand, the ESXi Balanced policy produced modest efficiency gains while doing an excellent job of protecting performance across all load levels. These findings make us confident that the ESXi Balanced policy is a good choice for most types of virtualized applications.

Comparing ESXi 4.1 and ESXi 5.0 Scaling Performance

In previous articles on VROOM! we used VMmark 2 to investigate the effects of altering a single hardware component, such as a storage array or server model, in a vSphere cluster. In contrast to these earlier studies, we now examine the effects of upgrading the hosts’ software from ESXi 4.1 to ESXi 5.0 on the performance of a VMmark 2 cluster.

vSphere 5 includes many new features and virtual machine enhancements, the details of which can be found here. To the IT professional weighing the costs and benefits of upgrading their existing infrastructure to vSphere 5, an often important question is whether ESXi 5.0 can outperform ESXi 4.1 in the same environment. VMmark 2 is an ideal tool for answering this question with measurable results. We used VMmark 2.1.1 to see how ESXi 5.0 stacked up to ESXi 4.1 on an identically configured cluster.

VMmark 2 is a multi-host virtualization benchmark that models application performance as well as the effects of common infrastructure operations such as vMotion, Storage vMotion, and virtual machine deployments. Each VMmark tile contains a set of VMs running diverse application workloads and represents a unit of load. VMmark 2 scores are computed as a weighted average of application workload throughput and infrastructure operation throughput. For more details, see the overview and the release notes for VMmark 2.1 and 2.1.1.

Testing Methodology

All VMmark 2 tests were conducted on a cluster of four identically configured entry-level Dell PowerEdge R310 servers. To determine the impact of the vSphere 5 environment on performance, a series of tests was conducted with these hosts running ESXi 4.1, and then with ESXi 5.0. In addition, for the vSphere 5 environment, the virtual machine hardware and VMware Tools were upgraded on all workload VMs, and LUNs were reformatted as VMFS5. All other components in the environment remained unchanged during testing.

Configuration
Systems Under Test: Four Dell PowerEdge R310 Servers
CPUs: One Quad-Core Intel® Xeon® X3460 @ 2.8 GHz, hyper-threading enabled per server
Memory: 32GB DDR3 ECC @ 800 MHz per server
Storage Array: EMC VNX5500
Hypervisors under test:
VMware ESXi 4.1
VMware ESXi 5.0
Virtualization Management: VMware vCenter Server 5.0
VMmark version: 2.1.1

Results

To characterize cluster performance at multiple load levels, we increased the number of tiles until the cluster reached saturation, defined as the largest number of tiles whose runs still met Quality of Service (QoS) requirements. Scaling out the number of tiles until saturation allows us to determine the maximum VMmark 2 load the cluster could support and to compare the ESXi 4.1 and ESXi 5.0 configurations at each level of load.

The graph below shows the results of the VMmark 2 testing as described above with identically configured clusters running ESXi 4.1 and ESXi 5.0. All data points are the mean of three tests in each configuration.

Scaling: VMmark 2 scores for ESXi 4.1 and ESXi 5.0 at increasing tile counts

 

The ESXi 4.1 cluster reached saturation at 3 tiles, but ESXi 5.0 was able to support 4 tiles while still meeting workload Quality of Service requirements. The ESXi 5.0 cluster also outperformed ESXi 4.1 by 3% and 4% on the two and three-tile runs, respectively. Differences in CPU utilization were negligible. The results show that, in an equivalent environment, vSphere 5 handled greater load than ESXi 4.1 before reaching saturation, and showed increased performance at lower levels of load as well. At saturation, vSphere 5 showed a 22% increase in overall VMmark 2 scores over ESXi 4.1. In this cluster, vSphere 5 supported 33% more VMs and twice the number of infrastructure operations while meeting Quality of Service requirements.

VMmark 2 scores are based on application and infrastructure workload throughput, while application latency reflects Quality of Service. For the Mail Server, Olio, and DVD Store 2 workloads, latency is defined as the application’s response time. The completion time for vMotion, Storage vMotion, and VM Deploy is used as the latency measurement for the infrastructure operations. Latency can be very informative about the functioning of the environment and how the cluster as a whole performs under increasing loads. Examining latency at a 3-tile load, as seen in the figure below, reveals significant differences between the hypervisor versions. Latencies were normalized to the ESXi 4.1 results.

Latency: normalized VMmark 2 workload latencies at a 3-tile load

We saw decreases in latency for all VMmark 2 workloads with vSphere 5. The latency decreases were most striking in Olio, Storage vMotion, and DVD Store 2, with decreases of 20%, 19%, and 15%, respectively. These improvements to vMotion and Storage vMotion are consistent with publicized improvements in vMotion and Storage vMotion latency for vSphere 5 (details here).

A VMmark 2 run passes when all of its application QoS metrics, or latencies, remain below a specified threshold. These decreases in latency with ESXi 5.0 are directly related to why ESXi 5.0 was able to support an additional tile relative to ESXi 4.1.

Our comparison has shown that upgrading an ESXi 4.1 cluster to vSphere 5 had two high-level effects on performance. The vSphere 5 cluster supported 33% more VMs at saturation than the ESXi 4.1 cluster, and it also exhibited improved latency and throughput at lower levels of load, showing that ESXi 5.0 does outperform ESXi 4.1.

Performance Best Practices for VMware vSphere 5.0

A new version of Performance Best Practices for vSphere is now available.  This is a book designed to help system administrators obtain the best performance from vSphere deployments.

We've addressed many of the new features in vSphere 5.0 from a performance perspective.  These include:

  • Storage Distributed Resource Scheduler (Storage DRS), which performs automatic storage I/O load balancing
  • Virtual NUMA, allowing guests to make efficient use of hardware NUMA architecture
  • Memory compression, which can reduce the need for host-level swapping
  • Swap to host cache, which can dramatically reduce the impact of host-level swapping
  • SplitRx mode, which improves network performance for certain workloads
  • VMX swap, which reduces per-VM memory reservation
  • Multiple vMotion vmknics, allowing for more and faster vMotion operations

We've also significantly updated and expanded many of the topics we've covered in previous editions of the book.  These include:

  • Choosing hardware for a vSphere deployment
  • Power management
  • Configuring ESXi for best performance
  • Guest operating system performance
  • vCenter and vCenter database performance
  • vMotion and Storage vMotion performance
  • Distributed Resource Scheduler (DRS) and Distributed Power Management (DPM) performance
  • High Availability (HA), Fault Tolerance (FT), and VMware vCenter Update Manager performance

The book can be found at: Performance Best Practices for VMware vSphere 5.0.