Monthly Archives: September 2011

VMware vCenter Update Manager 5.0 Performance and Best Practices

VMware vCenter Update Manager (also known as VUM) provides a patch management framework for VMware vSphere. IT administrators can use it to patch and upgrade ESX/ESXi hosts, upgrade VMware Tools and virtual hardware for virtual machines, as well as upgrade virtual appliances.

A white paper that examines the performance of VUM is available. This paper includes some interesting information, including:

  • VUM deployment recommendations that maximize performance
  • A look at the latencies of common VUM operations
  • The resource consumption profile of VUM operations for CPU, network, disk, and database
  • How much time it takes to remediate a cluster sequentially vs. in parallel
  • VUM performance in a low bandwidth, high latency, or lossy network
  • How the use of bandwidth throttling affects host staging and remediation over LAN and WAN
  • Performance tips and best practices

For the full paper, see VMware vCenter Update Manager 5.0 Performance and Best Practices.

Host Power Management in vSphere 5

Host power management (HPM) on ESXi 5.0 saves energy by placing certain parts of a computer system or device into a reduced power state when the system or device is inactive or does not need to run at maximum speed. This feature can be used in conjunction with distributed power management (DPM), which redistributes VMs among physical hosts in a cluster to enable some hosts to be powered off completely.

The default power policy for HPM in vSphere 5 is “balanced,” which reduces host power consumption while having little or no impact on performance for most workloads. The balanced policy uses the processor’s P-states, which save power when the workloads running on the system do not require full CPU capacity.

A technical white paper has been published that describes:

  • What to adjust in your ESXi host’s BIOS settings to achieve the maximum benefit of HPM.
  • The different power policy options in ESXi 5.0 and how to set a custom policy.
  • Using esxtop to obtain and understand statistics related to HPM, including the ESXi host’s power usage, the processor’s P-states, and the effect of HPM on %USED and %UTIL.
  • An evaluation of the power that HPM can save while different power policies are enabled. The amount of power saved varies depending on the CPU load and the power policy.

For the full paper, see Host Power Management in VMware vSphere 5.

Note: Some performance-sensitive workloads might require the "high performance" policy.


VMware View PCoIP & Build-to-lossless

We have talked in previous posts about the ability in View 5 to disable build to lossless (BTL). By default, PCoIP rapidly builds the client image to a high-quality but lossy image and then, if the image remains constant, continues to refine it in the background until it reaches a fully lossless state. When BTL is disabled, the build process stops once the image reaches the "perceptually lossless" stage. This can deliver significant bandwidth savings: for typical office workflows, we are seeing around a 30% bandwidth reduction.

Furthermore, in many situations, the difference between fully lossless and perceptually lossless images can be virtually impossible to discern. During our VMworld presentation, we used the following image to emphasize the quality of perceptually lossless:


In this qualitative comparison, we present a zoom-in of two small images. For both images, View fully lossless and View perceptually lossless (no BTL) images are shown side-by-side for comparison — hopefully conveying how difficult it is, even when zoomed, to find differences.

To further emphasize the perceptually lossless quality, it’s also interesting to examine quantitative data, for example PSNR (peak signal-to-noise ratio) and RMS (root-mean-square) error. For a fairly complex image (a fall-colors landscape with significant fine detail in the background tree colors), comparing the perceptually lossless build to a fully lossless build in RGB space yields a PSNR of 45.8dB and an RMS error of 1.3. This clearly illustrates how little quality is lost with perceptually lossless images. Consider the RMS error of 1.3: for 32-bit colors, each RGBA component has 8 bits of precision, with values ranging from 0 to 255. For this image, perceptually lossless introduces an average (RMS) error of about +/-1.3 to these values, which is fairly negligible for most use cases.

[While PSNR obviously varies from image to image, I'm seeing ~45dB much of the time.]
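
The two figures quoted above are related: for 8-bit components, PSNR can be computed directly from the RMS error. The following is a minimal sketch of that relationship, not the tooling used for the measurements; the function names are illustrative, and the peak value of 255 is taken from the 0 to 255 component range described above.

```python
import math


def rms_error(reference, candidate):
    """RMS error between two equal-length sequences of component values."""
    if len(reference) != len(candidate):
        raise ValueError("images must have the same number of components")
    squared = sum((a - b) ** 2 for a, b in zip(reference, candidate))
    return math.sqrt(squared / len(reference))


def psnr_from_rms(rms, peak=255.0):
    """PSNR in dB for a given RMS error; peak is the maximum component value."""
    if rms <= 0:
        raise ValueError("RMS error must be positive (identical images have infinite PSNR)")
    return 20.0 * math.log10(peak / rms)


# An RMS error of 1.3 on 8-bit components corresponds to roughly 45.8-45.9 dB.
print(f"{psnr_from_rms(1.3):.2f} dB")
```

Because the RMS error of 1.3 is itself a rounded value, the result agrees with the reported 45.8dB only to within rounding.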


Zimbra Collaboration Server Performance on vSphere 5

Zimbra Collaboration Server is VMware’s mail, calendaring, and collaboration software. A performance study looks at how the mail server performs on vSphere 5. Test results demonstrate the following:

  • Due to optimizations made within Zimbra Collaboration Server and the tickless timer within Red Hat Enterprise Linux (RHEL) 6, Zimbra Collaboration Server in a virtual machine can scale up effectively, and performs within 95% of a physical host.
  • Zimbra Collaboration Server scales out effortlessly, with only a 10% drop in sendmail latency as up to eight virtual machines are added.
  • Zimbra Collaboration Server consumes less than half of the CPU of Microsoft Exchange Server 2010 for the same number of users. The user provisioning time for the tests is also orders of magnitude better.

For the full paper, see Zimbra Collaboration Server Performance on VMware vSphere 5.


View 5 PCoIP Client-Side Image Caching

At the recent VMworld we mentioned that VMware View 5 introduces PCoIP support for client-side image caching. In our VMworld presentation, we highlighted that, on average, this caching optimization reduces bandwidth consumption by about 30%. However, there are a number of important scenarios where the ability of the PCoIP image cache to capture spatial, as well as temporal, redundancy delivers even bigger benefits.

For instance, consider scrolling through a PDF document.  As we scroll down, new content appears along the bottom edge of the window, and the oldest content disappears from the top edge. All the other content in the application window remains essentially constant, merely shifted upward. The PCoIP image cache is capable of detecting this spatial and temporal redundancy. As a result, for scrolling operations, the display information sent to the client device is primarily just a sequence of cache indices — delivering significant bandwidth savings.

This efficient scrolling has a couple of key benefits:

  • On LAN networks, where bandwidth is relatively unconstrained, there’s sufficient bandwidth available for high quality scrolling even when client-side caching is disabled. In these situations, enabling client-side image caching delivers significant bandwidth savings: experimenting with a variety of applications and content types (text heavy, image heavy, etc.), I'm seeing bandwidth reductions of over 4X compared with caching disabled. Mileage may vary, but I’m seeing this fairly consistently.
  • On WAN networks, where bandwidth is fairly scarce, when client-side caching is disabled, scrolling performance is often degraded to stay within the available bandwidth. In these situations, in addition to bandwidth reductions (which vary based on the degree to which scrolling performance is degraded when client-side caching is disabled), client-side caching also ensures smooth, highly responsive scrolling operations even in WAN environments with very constrained bandwidth.
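
To make the idea of sending cache indices instead of pixels concrete, here is a toy sketch of a content-addressed tile cache. It is not PCoIP's actual algorithm or wire format, which this post does not describe. The tile granularity (one "tile" per text row), the SHA-256 digest, the unbounded cache, and the class names `TileCacheEncoder` and `TileCacheDecoder` are all assumptions made for illustration.

```python
import hashlib


class TileCacheEncoder:
    """Sender side: emits a cache index for tiles it has already sent."""

    def __init__(self):
        self._index_by_digest = {}  # tile digest -> cache index

    def encode(self, tiles):
        messages = []
        for tile in tiles:
            digest = hashlib.sha256(tile).digest()
            index = self._index_by_digest.get(digest)
            if index is None:
                # First time this content is seen: send the raw tile.
                index = len(self._index_by_digest)
                self._index_by_digest[digest] = index
                messages.append(("raw", index, tile))
            else:
                # Content already cached on the client: send only the index.
                messages.append(("ref", index))
        return messages


class TileCacheDecoder:
    """Client side: rebuilds the frame from raw tiles and cache indices."""

    def __init__(self):
        self._tiles = []  # cache index -> tile content

    def decode(self, messages):
        frame = []
        for message in messages:
            if message[0] == "raw":
                _, index, tile = message
                if index != len(self._tiles):
                    raise ValueError("encoder and decoder caches are out of sync")
                self._tiles.append(tile)
                frame.append(tile)
            else:
                frame.append(self._tiles[message[1]])
        return frame


# Simulate scrolling down by one row: the top row leaves, a new row appears.
frame1 = [f"line {i}".encode() for i in range(8)]
frame2 = frame1[1:] + [b"line 8"]

encoder, decoder = TileCacheEncoder(), TileCacheDecoder()
decoder.decode(encoder.encode(frame1))
scroll_messages = encoder.encode(frame2)
raw_count = sum(1 for m in scroll_messages if m[0] == "raw")
print(f"{raw_count} of {len(scroll_messages)} tiles sent as raw data after scrolling")
```

Because tiles are looked up by content rather than by screen position, the rows that merely shifted upward are found in the cache, and only the newly exposed row has to be sent as image data.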


Microsoft Exchange Server 2010 Performance on vSphere 5

A white paper has been published that examines how Microsoft Exchange Server 2010 performs on vSphere 5 in terms of scaling up (adding more virtual CPUs) and scaling out (adding more VMs). Having the choice to scale up or out while maintaining a positive user experience gives IT more flexibility to right-size system deployments and minimize total cost of ownership with respect to licensing and hardware purchases.

Testing shows the effectiveness of vSphere 5 to add compute power by scaling up Exchange Server VMs, in increments, from 2 to 12 virtual CPUs. This allowed the total number of very heavy Exchange users to increase from 2,000 to 12,000 while sendmail latency remained well within the range of acceptable user responsiveness. Processor utilization remained low, at about 15% of the total host processing capacity for 12,000 very heavy Exchange users.

Testing also shows that scaling out to eight Exchange Server VMs supports a workload of up to 16,000 very heavy users, with the load consuming only 32% of the ESXi host processing capacity.

Additional tests were undertaken to show the performance improvements of vMotion and Storage vMotion in vSphere 5. vMotion migration time for a 4-vCPU Exchange mailbox server VM showed a 34% reduction in vSphere 5 over vSphere 4.1. Storage vMotion migration time for a 350GB database VMDK showed an 11% reduction in vSphere 5 over vSphere 4.1.

For the full paper, see Microsoft Exchange Server 2010 Performance on vSphere 5.


VMworld 2011 VMware View 5 PCoIP

Slides from our recent VMworld presentation on View 5 PCoIP performance (EUC1987) are now available here. In addition to discussing the latest PCoIP optimizations and best practices in detail, the presentation also includes competitive data.

Understanding Memory Resource Management in vSphere 5

Memory resource management plays a key role in the ability of vSphere systems to over-commit resources and thereby maximize the utilization of an ESXi host. Over-commitment allows the utilization of a host's memory by active pages to come as close to 100% as possible. vSphere achieves this by using several innovative techniques to reclaim virtual machine memory, which are:

  • Transparent page sharing (TPS)—removes redundant pages with identical content
  • Ballooning—artificially increases the memory pressure inside the guest
  • Memory compression—compresses the pages that need to be swapped out
  • Hypervisor swapping—ESXi directly swaps out the virtual machine’s memory
  • Swap to host cache (swap to SSD)—reclaims memory by storing the swapped out pages in the host cache on a solid-state drive

New to vSphere 5 is swap to host cache, also known as swap to SSD. This is a memory management technique that takes place after ballooning, transparent page sharing, and memory compression have already been tried to free memory. It is an alternative that gives better performance than hypervisor swapping. In the test environment employed, swapping to SSD performed 30% better than hypervisor swapping in the same test system.
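
As a rough illustration of the first technique in the list, removing redundant pages with identical content, the sketch below deduplicates pages by hashing their contents and then confirming a match with a full byte comparison. This is a simplified model, not ESXi's implementation: the `PageSharer` class, the SHA-1 digest, and the 4KB page size are assumptions made for the example, and copy-on-write handling for shared pages is omitted.

```python
import hashlib

PAGE_SIZE = 4096  # assumed page size for this example


class PageSharer:
    """Toy model of content-based page sharing."""

    def __init__(self):
        self._frames = []           # frame number -> page content
        self._refcounts = []        # frame number -> number of mappings
        self._frames_by_hash = {}   # content digest -> candidate frame numbers

    def map_page(self, content):
        """Return the frame backing this content, sharing it when possible."""
        digest = hashlib.sha1(content).digest()
        for frame in self._frames_by_hash.get(digest, []):
            # A full comparison guards against hash collisions.
            if self._frames[frame] == content:
                self._refcounts[frame] += 1
                return frame
        frame = len(self._frames)
        self._frames.append(content)
        self._refcounts.append(1)
        self._frames_by_hash.setdefault(digest, []).append(frame)
        return frame

    def frames_in_use(self):
        return len(self._frames)


sharer = PageSharer()
zero_page = bytes(PAGE_SIZE)
guest_pages = [zero_page, zero_page, b"\x01" * PAGE_SIZE, zero_page]
mappings = [sharer.map_page(page) for page in guest_pages]
print(f"{len(guest_pages)} guest pages backed by {sharer.frames_in_use()} frames")
```

In this model the memory reclaimed is the difference between the number of guest pages mapped and the number of frames actually in use.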

A recently published technical white paper describes how memory management works in ESXi, details the configuration options available, and provides results to show the performance impact of these options.

Data in several tests that appeared in earlier versions of the paper is also updated for vSphere 5.

For the full paper, see Understanding Memory Resource Management in VMware vSphere 5.


vMotion Architecture, Performance, and Best Practices in VMware vSphere 5

VMware vSphere vMotion enables the live migration of virtual machines from one VMware vSphere 5 host to another, with no perceivable impact to the end user. vMotion brings invaluable benefits to administrators—it enables load balancing, helps prevent server downtime, and provides flexibility for troubleshooting. vMotion in vSphere 5 incorporates a number of performance enhancements which allow vMotion to be used with minimal overhead on even the largest virtual machines running heavy-duty, enterprise-class applications.

A new white paper, vMotion Architecture, Performance, and Best Practices in VMware vSphere 5, is now available. In that paper, we describe the vMotion architecture and present the features and performance enhancements that have been introduced in vMotion in vSphere 5. Among these improvements are multiple-network-adapter capability for vMotion, better utilization of 10GbE bandwidth, Metro vMotion, and optimizations to further reduce impact on application performance.

Following the overview and feature description of vMotion in vSphere 5, we provide a comprehensive look at the performance of migrating VMs running typical Tier 1 applications including Rock Web Server, MS Exchange Server, MS SQL Server and VMware View. Tests measure characteristics such as total migration time and application performance during vMotion. Test results show the following:

  • Remarkable improvements in vSphere 5 towards reducing the impact on guest application performance during vMotion
  • Consistent performance gains in the range of 30% in vMotion duration on vSphere 5
  • Dramatic performance improvements over vSphere 4.1 when using the newly added multiple-network-adapter feature in vSphere 5 (for example, vMotion duration is reduced by a factor of more than 3)

Finally, we describe several best practices to follow when using vMotion.

For the full paper, see vMotion Architecture, Performance, and Best Practices in VMware vSphere 5.