VMware

January 25, 2012

VoIP Performance on vSphere 5

The majority of business-critical applications such as Web applications, database servers, and enterprise messaging systems have been successfully virtualized, proving the benefits of virtualization for reducing cost and streamlining IT management. However, the adoption of virtualization in the area of latency-sensitive applications has been slow partly due to unsubstantiated performance concerns. By taking VoIP service as an example, a newly published white paper demonstrates that vSphere 5 brings the same virtualization benefits to latency-sensitive applications. In particular, the paper shows that vSphere 5 delivers excellent out-of-the-box performance in terms of voice quality when running VoIP service.

The evaluation results demonstrate that good voice quality is maintained as the number of users (voice streams) and media server instances increases, even while the CPU is fully utilized. For example, vSphere 5 maintains great VoIP performance even when running 12 instances of a VoIP media server configured with a total of 48 vCPUs on a system with 8 cores. The paper further shows that the Network I/O Control (NetIOC) feature successfully prevents packet loss, helping to preserve voice quality under severe network contention.
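To put the 48-vCPU figure in perspective, here is a small sketch of the arithmetic behind it. It assumes the vCPUs are spread evenly across the 12 instances, which the summary above does not state; the function name is only illustrative.

```python
def vcpu_overcommit_ratio(total_vcpus: int, physical_cores: int) -> float:
    """Return the number of configured vCPUs per physical core."""
    if physical_cores <= 0:
        raise ValueError("physical_cores must be positive")
    return total_vcpus / physical_cores


# Figures quoted above: 12 media server instances, 48 vCPUs in total, 8 cores.
instances = 12
total_vcpus = 48
physical_cores = 8

# Assumption: every instance has the same number of vCPUs.
vcpus_per_instance = total_vcpus // instances
ratio = vcpu_overcommit_ratio(total_vcpus, physical_cores)

print(f"{vcpus_per_instance} vCPUs per instance, {ratio:.0f}:1 vCPU-to-core ratio")
```

In other words, the test configuration schedules six times as many vCPUs as there are physical cores.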

Read more about the VoIP Performance Evaluation on VMware vSphere 5.

 


January 23, 2012

Site Recovery Manager 5.0 Performance and Best Practices

VMware vCenter Site Recovery Manager (SRM) 5.0 provides business continuity and disaster recovery protection for VMware virtual environments. Protection can range from virtual machines (VMs) residing on a single, replicated datastore to all the VMs in a datacenter.

A new technical white paper about SRM has been published. In it, we look at several performance characteristics of SRM, including scalability and recovery, and how they behave in an environment that simulates real-life workloads. The paper includes several recommendations to enhance the performance of SRM and reduce recovery time. A couple of recommendations include:

  • Prefer fewer, larger NFS volumes; this reduces the time spent mounting volumes during recovery. It might also translate to fewer protection groups in your setup, further reducing recovery time.
  • Configuring VM dependencies across priority groups instead of setting per-VM dependencies is usually the better choice, because the VMs within each priority group are started in parallel.
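The effect of the second recommendation can be illustrated with a toy timing model. This is only a sketch: it assumes unlimited parallelism inside a priority group and compares it with the worst case in which per-VM dependencies form a single chain. The start times are hypothetical, not measurements from the paper, and the model does not reflect how SRM actually schedules power-on operations.

```python
def recovery_time_priority_groups(groups):
    """Toy model: groups run one after another, VMs inside a group start in parallel."""
    return sum(max(start_times) for start_times in groups if start_times)


def recovery_time_serial_chain(start_times):
    """Toy model: every VM waits for the previous one, so start times add up."""
    return sum(start_times)


# Hypothetical per-VM start times in seconds (illustrative values only).
priority_groups = [[60, 90], [45, 45, 120]]
all_vms = [t for group in priority_groups for t in group]

grouped = recovery_time_priority_groups(priority_groups)
chained = recovery_time_serial_chain(all_vms)

print(f"priority groups: {grouped}s, fully serialized chain: {chained}s")
```

In this model each group costs only as much as its slowest VM, which is why grouping tends to shorten the overall recovery.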

Please refer to VMware vCenter Site Recovery Manager 5.0 Performance and Best Practices for more recommendations, charts, and key takeaways.

 


January 19, 2012

VMware vCloud Director 1.5 Performance and Best Practices

VMware vCloud Director gives enterprise organizations the ability to build secure private clouds that dramatically increase datacenter efficiency and business agility. Many new features have been added to vCloud Director 1.5 to accelerate application delivery in the cloud. In this paper, we discuss some of the features of the vCloud Director 1.5 release, performance characterizations including latency trends, resource consumption, sizing guidelines and hardware requirements, and performance tuning tips.

Some highlights of vCloud Director performance and best practices include:

  • When using fast provisioning (linked clones) and a VMFS datastore, do not exceed eight hosts in a cluster.
  • Be aware that a clone can hit the snapshot chain length limit. If the current clone has become very slow compared to the prior clone, it may have hit the snapshot chain length limit of 30. This can be resolved by virtual machine consolidation.
  • For virtual machines that are not generating I/O-intensive workloads, linked clones offer the flexibility and agility of instant provisioning.
  • For cross-vCenter and cross-datastore linked clones, pre-allocating the vApp to the target datastore helps shorten the subsequent copy time.
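As a rough illustration of the chain length check, the sketch below walks parent links between linked-clone disks and flags a chain that has reached the limit of 30 quoted above. The `parent_of` mapping and disk names are hypothetical, and how vCloud Director itself counts the chain (for example, whether the base disk is included) is an assumption here.

```python
SNAPSHOT_CHAIN_LIMIT = 30  # limit quoted in the list above


def chain_length(disk, parent_of):
    """Count the disks from `disk` back to its base disk by following parent links."""
    length = 1
    seen = {disk}
    while disk in parent_of:
        disk = parent_of[disk]
        if disk in seen:
            raise ValueError("cycle detected in parent links")
        seen.add(disk)
        length += 1
    return length


def needs_consolidation(disk, parent_of, limit=SNAPSHOT_CHAIN_LIMIT):
    """Assumption: a chain that has reached the limit is a consolidation candidate."""
    return chain_length(disk, parent_of) >= limit


# Hypothetical chain: clone-0 is the base disk, each later clone links to the previous one.
parent_of = {f"clone-{i}": f"clone-{i - 1}" for i in range(1, 30)}

print(chain_length("clone-29", parent_of), needs_consolidation("clone-29", parent_of))
```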

For more details and performance tips, please refer to VMware vCloud Director 1.5 Performance and Best Practices.


November 16, 2011

SAP Three-Tier Benchmark on vSphere 5 Achieves New Levels of Performance

HP recently published an impressive SAP three-tier Sales and Distribution (SD) benchmark result that ran entirely on vSphere 5. A total of 32,125 SAP SD benchmark users and 170,320 SAPS were achieved using an HP VirtualSystem Solution that hosted 22 application server VMs and one database server VM. A "Monster VM" with 20 vCPUs was configured for the database server VM, which was a key component in enabling the new record-setting results.

The configuration used to run the SAP three-tier benchmark is impressive and is an indication of how vSphere is capable of running large enterprise workloads.  HP's SAP three-tier SD benchmark result used 11 servers with a total of 132 cores and 264 logical threads with HT enabled.  The high number of users and large number of systems in this benchmark are representative of a large SAP landscape, which can all be run on vSphere 5.

The HP SAP three-tier SD benchmark (certification #2011044) was run on Windows Server 2008 R2 with SQL Server 2008 for the database and SAP ERP 6.0 with EHP 4. The servers were HP ProLiant BL460c G7 models with Intel X5675 processors (2P/12C/24T) and 96GB of RAM. More details are available in a performance brief that HP has published and also on the SAP benchmark site (http://www.sap.com/benchmark).
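The core and thread totals follow directly from the per-server configuration, as the short check below shows. The per-core ratios at the end are derived here purely for illustration, under the assumption that all 11 servers are counted equally; they are not figures published with the benchmark.

```python
# Figures quoted above.
servers = 11
cores_per_server = 12     # 2 processors, 12 cores in total (2P/12C)
threads_per_server = 24   # 24 logical threads with HT enabled (24T)
sd_users = 32_125
saps = 170_320
app_server_vms = 22
db_server_vms = 1

total_cores = servers * cores_per_server
total_threads = servers * threads_per_server
total_vms = app_server_vms + db_server_vms

# Derived ratios (illustrative only, not part of the certified result).
users_per_core = sd_users / total_cores
saps_per_core = saps / total_cores

print(f"{total_cores} cores, {total_threads} threads, {total_vms} VMs")
print(f"about {users_per_core:.0f} SD users and {saps_per_core:.0f} SAPS per core")
```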


November 07, 2011

VMware View 5 resource optimization

In last week’s post, we discussed 4 simple settings that we have observed deliver significant resource savings, while preserving user experience for typical desktop users. While we discussed the benefits of each setting in isolation, I just wanted to illustrate the overall gains. For runs using View Planner (which simulates a typical office user, with MS Office apps, browsers, Adobe reader, video playback, photo albums etc – more details can be found here), we observe a significant reduction in bandwidth when these 4 resource control settings are applied in unison:

[Figure "View-bw": bandwidth as the four resource control settings are applied]

From the above plot it is apparent that the bandwidth reductions resulting from i) disabling build-to-lossless, ii) setting the maximum frame rate to 15, iii) setting maximum audio bandwidth to 100, and iv) performing simple in-guest operations (such as selecting "optimize for performance" and disabling ClearType) are mainly additive, and the cumulative benefit is pretty substantial: around a 1.8X reduction from the default! [Particularly compelling, given that for typical office users there is very little difference in user experience.]


November 04, 2011

Storage vMotion of a Virtualized SQL Server Database

vSphere Storage vMotion (svMotion) enables the live migration of disk files belonging to virtual machines (VMs). svMotion helps to eliminate the downtime of the applications running in VMs when the virtual disk files containing the applications' data have to be moved between storage devices for the purpose of hardware maintenance, upgrades, load-balancing storage resources, or proactive disaster recovery.

svMotion is the missing piece in liberating VMs and VMs’ associated files completely from the physical hardware on which they reside. Because of the importance of svMotion in the virtual landscape, we at VMware Performance Engineering Labs conducted a study involving the svMotion of the virtual disk files of a VM hosting a large SQL Server database. The focus of the study was to understand:

  • The impact on performance of the SQL Server database when migrating physical files of different database components such as data, index, and log.
  • The effect of the I/O characteristics of the database components on the migration time of the virtual disk containing the files of those components.

The results from the study show:

  • A consistent and predictable disk migration time that was largely influenced by the capabilities of the source and the destination storage hardware.
  • That the I/O characteristics of the database components do influence disk migration time.
  • A 5% to 22% increase, depending on the VM load conditions, in the CPU cost of a transaction of the database workload while migrating a virtual disk containing the physical files of the database.

For more details, refer to the white paper “Storage vMotion of a Virtualized SQL Server Database.”


November 03, 2011

Comparing ESXi 4.1 and ESXi 5.0 Scaling Performance

In previous articles on VROOM! we used VMmark 2 to investigate the effects of altering a single hardware component, such as a storage array or server model, in a vSphere cluster. In contrast to these earlier studies, we now examine the effects of upgrading the hosts’ software from ESXi 4.1 to ESXi 5.0 on the performance of a VMmark 2 cluster.

vSphere 5 includes many new features and virtual machine enhancements, the details of which can be found here. To the IT professional weighing the costs and benefits of upgrading their existing infrastructure to vSphere 5, an often important question is whether ESXi 5.0 can outperform ESXi 4.1 in the same environment. VMmark 2 is an ideal tool for answering this question with measurable results. We used VMmark 2.1.1 to see how ESXi 5.0 stacked up to ESXi 4.1 on an identically configured cluster.

VMmark 2 is a multi-host virtualization benchmark that models application performance as well as the effects of common infrastructure operations such as vMotion, Storage vMotion, and virtual machine deployments. Each VMmark tile contains a set of VMs running diverse application workloads as a unit of load. VMmark 2 scores are computed as a weighted average of application workload throughput and infrastructure operation throughput. For more details, see the overview, release notes for VMmark 2.1, and for 2.1.1.
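The weighted-average idea can be sketched as follows. The 80/20 split between application and infrastructure throughput is an assumption used as the default here, and the input scores are hypothetical; consult the VMmark 2 documentation for the exact normalization and weights.

```python
def weighted_score(app_score: float, infra_score: float, app_weight: float = 0.8) -> float:
    """Combine application and infrastructure scores into one weighted average.

    app_weight defaults to 0.8 as an assumption; the remainder goes to infrastructure.
    """
    if not 0.0 <= app_weight <= 1.0:
        raise ValueError("app_weight must be between 0 and 1")
    return app_weight * app_score + (1.0 - app_weight) * infra_score


# Hypothetical normalized scores, not results from the tests described in this post.
score = weighted_score(app_score=4.0, infra_score=2.0)
print(f"overall score: {score:.2f}")
```

Because the application weight dominates, a change in application throughput moves the overall score more than the same relative change in infrastructure throughput.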

Testing Methodology

All VMmark 2 tests were conducted on a cluster of four identically configured entry-level Dell PowerEdge R310 servers. To determine the impact of the vSphere 5 environment on performance, a series of tests was conducted with these hosts running ESXi 4.1, then with ESXi 5.0. In addition, for the vSphere 5 environment, the virtual machine hardware and VMware Tools were upgraded on all workload VMs, and LUNs were reformatted as VMFS5. All other components in the environment remained unchanged during testing.

Configuration
Systems Under Test: Four Dell PowerEdge R310 Servers
CPUs: One Quad-Core Intel® Xeon® X3460 @ 2.8 GHz, hyper-threading enabled per server
Memory: 32GB DDR3 ECC @ 800 MHz per server
Storage Array: EMC VNX5500
Hypervisors under test:
     VMware ESXi 4.1
     VMware ESXi 5.0
Virtualization Management: VMware vCenter Server 5.0
VMmark version: 2.1.1

Results

To characterize cluster performance at multiple load levels, we increased the number of tiles until the cluster reached saturation, defined as when the run failed to meet Quality of Service (QoS) requirements. Scaling out the number of tiles until saturation allows us to determine the maximum VMmark 2 load the cluster could support and to compare the ESXi 4.1 and ESXi 5.0 configurations at each level of load.

The graph below shows the results of the VMmark 2 testing as described above with identically configured clusters running ESXi 4.1 and ESXi 5.0. All data points are the mean of three tests in each configuration. 

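[Figure "Scaling": VMmark 2 results for the ESXi 4.1 and ESXi 5.0 clusters as the number of tiles increases]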

The ESXi 4.1 cluster reached saturation at 3 tiles, but ESXi 5.0 was able to support 4 tiles while still meeting workload Quality of Service requirements. The ESXi 5.0 cluster also outperformed ESXi 4.1 by 3% and 4% on the two- and three-tile runs, respectively. Differences in CPU utilization were negligible. The results show that, in an equivalent environment, vSphere 5 handled greater load than ESXi 4.1 before reaching saturation, and showed increased performance at lower levels of load as well. At saturation, vSphere 5 showed a 22% increase in overall VMmark 2 scores over ESXi 4.1. In this cluster, vSphere 5 supported 33% more VMs and twice the number of infrastructure operations while meeting Quality of Service requirements.
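The scale-out procedure described above amounts to a simple loop: add one tile at a time until a run fails QoS. The sketch below uses stand-in functions in place of real benchmark runs, with the tile counts reported above (3 and 4) hard-coded for illustration; `find_max_supported_tiles` is a hypothetical helper, not part of VMmark.

```python
from typing import Callable, Optional


def find_max_supported_tiles(run_passes_qos: Callable[[int], bool],
                             max_tiles: int = 64) -> Optional[int]:
    """Add one tile at a time and return the highest tile count that met QoS.

    Returns None if even a single tile fails.
    """
    supported = None
    for tiles in range(1, max_tiles + 1):
        if not run_passes_qos(tiles):
            break
        supported = tiles
    return supported


# Stand-ins for real benchmark runs, encoding the outcomes reported above.
esxi41_tiles = find_max_supported_tiles(lambda tiles: tiles <= 3)
esxi50_tiles = find_max_supported_tiles(lambda tiles: tiles <= 4)

gain_percent = (esxi50_tiles - esxi41_tiles) / esxi41_tiles * 100
print(f"ESXi 4.1: {esxi41_tiles} tiles, ESXi 5.0: {esxi50_tiles} tiles, "
      f"{gain_percent:.0f}% more load supported")
```

Since every tile contains the same set of VMs, going from 3 to 4 tiles is where the 33% figure comes from.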

VMmark 2 scores are based on application and infrastructure workload throughput, while application latency reflects Quality of Service. For the Mail Server, Olio, and DVD Store 2 workloads, latency is defined as the application’s response time. The completion time for vMotion, Storage vMotion, and VM Deploy is used as the latency measurement for the infrastructure operations. Latency can be very informative about the functioning of the environment and how the cluster as a whole performs under increasing loads. Examining latency at a 3-tile load, as seen in the figure below, reveals significant differences between the hypervisor versions. Latencies were normalized to the ESXi 4.1 results.

[Figure "Latency": workload latencies at a 3-tile load, normalized to the ESXi 4.1 results]

We saw decreases in latency for all VMmark 2 workloads with vSphere 5. The latency decreases were most striking in Olio, Storage vMotion, and DVD Store 2, with decreases of 20%, 19%, and 15%, respectively. These improvements to vMotion and Storage vMotion are consistent with publicized improvements in vMotion and Storage vMotion latency for vSphere 5 (details here).
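For readers who want to reproduce this kind of comparison, here is a minimal sketch of the normalization. The percentage decreases are the ones quoted above; the raw latencies in the last example are hypothetical and only show how a normalized value is obtained.

```python
def normalized_latency(latency: float, baseline: float) -> float:
    """Express a latency relative to a baseline (baseline = 1.0)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return latency / baseline


def percent_decrease(normalized: float) -> float:
    """Convert a normalized latency into a percentage decrease from the baseline."""
    return (1.0 - normalized) * 100.0


# Decreases quoted above, expressed relative to ESXi 4.1 = 1.00.
reported_decreases = {"Olio": 20, "Storage vMotion": 19, "DVD Store 2": 15}
for workload, decrease in reported_decreases.items():
    print(f"{workload}: {1.0 - decrease / 100.0:.2f} of the ESXi 4.1 latency")

# Hypothetical raw latencies in milliseconds (not measured values).
example = normalized_latency(latency=200.0, baseline=250.0)
print(f"normalized: {example:.2f}, decrease: {percent_decrease(example):.0f}%")
```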

A VMmark 2 run passes when all of its application QoS metrics, or latencies, remain below a specified threshold. These decreases in latency with ESXi 5.0 are directly related to why ESXi 5.0 was able to support an additional tile relative to ESXi 4.1.

Our comparison has shown that upgrading an ESXi 4.1 cluster to vSphere 5 had two high-level effects on performance. The vSphere 5 cluster supported 33% more VMs at saturation than the ESXi 4.1 cluster, and it also exhibited improved latency and throughput at lower levels of load, showing that ESXi 5.0 does outperform ESXi 4.1.


November 01, 2011

4 simple resource optimizations for VMware View 5

By default, the VMware View PCoIP protocol dynamically optimizes for the best user experience within the given resource constraints. In the majority of environments, this is the desired approach. However, there are times when individual users or group administrators want different resource utilization policies, and in past blogs and whitepapers we have discussed in detail how to configure PCoIP to optimize for constrained resource consumption. In this post, I just wanted to provide a concise summary of these recommendations by highlighting 4 simple optimizations that our extensive internal testing has shown yield significant benefits:

  1. Disable build-to-lossless: setting enable_build_to_lossless to 0 delivers about a 1.3X reduction in bandwidth for typical office workloads. PCoIP still builds to a high-quality lossy image that is virtually indistinguishable from fully lossless for office workloads.
  2. Optimize video frame rate: setting maximum_frame_rate to 15 reduces video bandwidth by almost 1.7X in many situations, yet continues to deliver a smooth motion experience.
  3. Optimize audio bandwidth: setting audio_bandwidth_limit to 100 reduces audio bandwidth by around 5X, while continuing to deliver good quality sound.
  4. In-guest optimization: setting Windows visual settings to "optimize for performance" reduces bandwidth by over 1.1X for typical office workloads. Additionally, disabling ClearType reduces bandwidth by a further 1.05X. Disabling desktop wallpaper and setting the screen saver to none can also deliver bandwidth savings, although the new client image caching support in View 5 often significantly reduces the additional bandwidth traditionally associated with these options. Finally, disabling Windows Update, Superfetch, and Windows indexing significantly reduces redo-log growth, minimizing storage requirements. Full details of in-guest optimizations can be found here.

[N.B. the PCoIP settings can be set via the Windows registry, or via GPO.]
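For the registry route, one option is to generate a .reg file and import it on the desktop image. The sketch below only builds the file text. The registry key path and the "pcoip." value-name prefix are assumptions that should be checked against the View documentation before use; the three setting names and values are the ones listed above.

```python
# Assumed registry location and value-name prefix; verify against the View documentation.
PCOIP_KEY = r"HKEY_LOCAL_MACHINE\SOFTWARE\Policies\Teradici\PCoIP\pcoip_admin_defaults"
VALUE_PREFIX = "pcoip."

# Settings and values from the list above.
settings = {
    "enable_build_to_lossless": 0,
    "maximum_frame_rate": 15,
    "audio_bandwidth_limit": 100,
}


def build_reg_file(key: str, values: dict, prefix: str = "") -> str:
    """Return the text of a .reg file that sets each value as a DWORD under `key`."""
    lines = ["Windows Registry Editor Version 5.00", "", f"[{key}]"]
    for name, value in values.items():
        if not 0 <= value <= 0xFFFFFFFF:
            raise ValueError(f"{name} does not fit in a DWORD")
        lines.append(f'"{prefix}{name}"=dword:{value:08x}')
    return "\r\n".join(lines) + "\r\n"


reg_text = build_reg_file(PCOIP_KEY, settings, VALUE_PREFIX)
print(reg_text)
```

Generating the file rather than editing the registry directly makes it easy to review the values first; in larger deployments a GPO is likely the more manageable choice.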

These simple changes significantly decrease bandwidth consumption, increase consolidation ratios, have minimal impact on typical user experiences and represent good defaults in many environments.


October 18, 2011

Virtualized Hadoop Performance on vSphere 5

In recent years the amount of data stored worldwide has exploded. This has led to the birth of the term 'Big Data'. While the scale of data brings with it complexity associated with storing and handling it, these large datasets are known to have business information buried in them that is critical to continued growth and success. The last few years have seen the birth of several new tools to manage and analyze such large datasets in a timely way (where traditional tools have had limitations). A natural question to ask is how these tools perform on vSphere. As the start of an ongoing effort to quantify the performance of big data tools on vSphere, we've chosen to test one of the more popular tools: Hadoop.

Hadoop has emerged as a popular platform for the distributed processing of data. It scales to thousands of nodes while maintaining resiliency to disk, node, or even rack failure. It can use any storage, but is most often used with local disks. A whitepaper giving an overview of Hadoop and the details of tests on commodity hardware with local storage is available here. One of the findings in the paper is that running 2 or 4 smaller VMs per physical machine usually resulted in better performance, often exceeding native performance.

As we continue our performance testing, stay tuned for results on a larger cluster with bigger data, with other Big Data tools, and on shared storage.


Running latency-sensitive applications on vSphere

For those of us interested in running latency-sensitive applications on vSphere, Bhavesh Davda, from the CTO's office, has created a comprehensive guide for tuning vSphere for such applications. 

Some of the tuning options are very familiar to those working with low-latency applications, e.g., interrupt coalescing settings, and some of them are relatively obscure vSphere specific options. Using a combination of these options, we saw noticeable improvement in performance of some latency-bound benchmarks. As a bonus, the guide provides in-depth reasoning for some options.

You can find more details here and the complete whitepaper here.


About this Blog

The VROOM! Blog from VMware's Performance Engineering Team.
