VMware


May 26, 2009

Java Performance on vSphere 4

VMware ESX is an excellent platform for deploying Java applications. Many customers use it to support Java applications from the desktop to business-critical enterprise servers. However, we haven't recently published any results highlighting the performance of Java applications on VMware ESX. As a first step toward remedying this, we compared native and virtualized performance using SPECjvm2008, a benchmark suite containing several real-life applications and benchmarks that focus on core Java functionality. The results demonstrate that Java applications run on VMware vSphere at greater than 94% of native performance over a range of VM sizes. This is up to a 9% improvement over VMware ESX 3.5, which already runs this workload at close to or better than 90% of native performance.

We ran SPECjvm2008 on Red Hat Enterprise Linux 5 Update 3 using the latest JVM from Sun Microsystems, JRE 1.6 Update 13. Tests were conducted with both 32-bit and 64-bit versions of the OS and JVM. The server was an HP DL380G5 equipped with two quad-core Intel Xeon X5460 (Harpertown) processors running at 3.16GHz and 32GB of memory. For native runs using fewer than the full number of available CPU cores, we used the kernel boot parameter maxcpus= to limit the OS to a given number of cores. We also used the kernel boot parameter mem= to limit the memory to 16GB in all 64-bit runs. The runs on VMware vSphere 4.0 and VMware ESX 3.5 Update 4 were done in virtual machines (VMs) using the stated number of virtual CPUs and 16GB of memory.

The runs of SPECjvm2008 were all base runs, meaning that no Java tuning parameters were used. All SPECjvm2008 results are required to include a base run. Unfortunately, the default heap size of the Sun JVM in the 1 CPU case is not large enough to run the SPECjvm2008 workload. As a result, we were not able to generate 1 CPU results that comply with the run rules for SPECjvm2008. We did generate native and vSphere 4.0 results for 2, 4, and 8 CPUs, and ESX 3.5 results for 2 and 4 CPUs.

Figure 1 shows the SPECjvm2008 results for the native, VMware vSphere 4.0, and VMware ESX 3.5 cases.  Figure 2 presents the same results normalized to the native result for that server and CPU count.  These results show that VMs running on VMware vSphere 4.0 perform at greater than 95% of native on this benchmark at all VM sizes.  Even with 8 vCPUs running on a server with only 8 physical cores, the vSphere 4.0 VM achieves 99% of native performance.   The VMware ESX 3.5 VMs ran at close to or greater than 90% of native, which is still excellent for a virtualized environment.  However, for 64-bit VMs, vSphere 4.0 gives a performance improvement over ESX 3.5U4 of 9% in the 4 vCPU case, and about 3% in the 2 vCPU case.

Figure 1 SPECjvm2008 on 8-Core Intel Harpertown Server


Figure 2 SPECjvm2008 performance relative to native


To sanity-check the native results, we compared the 8-core Harpertown result using the 64-bit OS and JVM to the closest published result. There is no directly comparable result, but there is a result generated by Sun on a 16-core Intel Tigerton server. The Tigerton is architecturally similar to the Harpertown, but the Harpertown has a larger L2 cache. The Sun 16-core Tigerton result, using Solaris 10, a special performance build of the Sun JVM (1.6.0_06p), and 64GB of memory, achieved 260 SPECjvm2008 ops/m. Our native result on the 8-core Harpertown with 16GB of memory was 145 SPECjvm2008 ops/m. A native run on the Harpertown with 32GB and using the Sun 1.6.0_06p JVM achieved 174 SPECjvm2008 ops/m. This is well over half of the Tigerton result on half as many cores, and indicates that our native configuration is producing reasonable results.
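The sanity check above is simple arithmetic, and it can be reproduced with a short Python sketch. The scores and core counts are the ones quoted in the text; dividing a score by the core count is only a rough simplification for comparison and is not a metric defined by SPECjvm2008.

```python
# SPECjvm2008 scores (ops/m) and core counts quoted in the text above.
TIGERTON_SCORE, TIGERTON_CORES = 260.0, 16      # Sun result, Solaris 10, 64GB
HARPERTOWN_SCORE, HARPERTOWN_CORES = 174.0, 8   # native run, 32GB, 1.6.0_06p JVM


def per_core(score, cores):
    """Return benchmark throughput per physical core (a rough simplification)."""
    return score / cores


# Fraction of the 16-core Tigerton score reached by the 8-core Harpertown.
ratio = HARPERTOWN_SCORE / TIGERTON_SCORE

print(f"Harpertown as a fraction of Tigerton: {ratio:.2f}")
print(f"Tigerton ops/m per core:   {per_core(TIGERTON_SCORE, TIGERTON_CORES):.2f}")
print(f"Harpertown ops/m per core: {per_core(HARPERTOWN_SCORE, HARPERTOWN_CORES):.2f}")
```

A ratio above 0.5 with half the cores is what the paragraph means by "reasonable": the smaller server is not underperforming relative to the published result.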

Figure 3 shows the scaling of the results as we move from 2 to 4 and 8 CPUs for the 64-bit case.  The scaling is essentially the same for 32-bit.  The results are normalized to the 2 CPU results on the same platform.  These results show that VMware vSphere 4.0 scales as well as or better than native for this workload.  VMware ESX 3.5 scaling is just slightly below native.
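For readers who want to apply the same normalization to their own runs, here is a minimal Python sketch. The scores in it are made-up placeholders, not measurements from this study, and the function names are our own; only the method (dividing each result by the 2 CPU result on the same platform) comes from the text.

```python
def scaling_vs_baseline(scores_by_cpus, baseline_cpus=2):
    """Normalize each score to the score at the baseline CPU count."""
    base = scores_by_cpus[baseline_cpus]
    return {cpus: score / base for cpus, score in sorted(scores_by_cpus.items())}


def scaling_efficiency(scores_by_cpus, baseline_cpus=2):
    """Compare the observed speedup to perfectly linear scaling (1.0 = linear)."""
    speedups = scaling_vs_baseline(scores_by_cpus, baseline_cpus)
    return {cpus: s / (cpus / baseline_cpus) for cpus, s in speedups.items()}


# PLACEHOLDER scores (ops/m) keyed by CPU count. These are NOT results
# from the study; substitute the scores from your own benchmark runs.
example_scores = {2: 100.0, 4: 180.0, 8: 300.0}

speedups = scaling_vs_baseline(example_scores)
efficiency = scaling_efficiency(example_scores)
for cpus in speedups:
    print(f"{cpus} CPUs: {speedups[cpus]:.2f}x of 2 CPU, "
          f"{efficiency[cpus]:.0%} of linear")
```

Running the same functions on native and virtualized scores gives two scaling curves that can be compared directly, which is what Figure 3 plots.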

Figure 3 SPECjvm2008 Scaling from 2 CPUs


The SPECjvm2008 results presented here show that core Java functionality runs extremely well on VMware vSphere 4.0 and VMware ESX 3.5.  No special tuning was required to get results that are remarkably close to native performance.  We hope to soon produce additional results to demonstrate that this excellent performance extends to multi-tier Java Enterprise Edition applications as well.  For comments or questions, please join us in the VMware Performance Community at this thread: http://communities.vmware.com/message/1262696

May 21, 2009

VMware vCenter Update Manager Sizing Estimator Posted

VMware vCenter Update Manager is a component of VMware Infrastructure that automates patches and upgrades of ESX hosts, virtual machine Tools and hardware, Windows and Linux virtual machines, and virtual appliances. A new sizing tool, VMware vCenter Update Manager Sizing Estimator, is now available.

 

The following input parameters are used to estimate database size, patch store disk space, and temporary disk space:

-       Feasibility for virtual machine remediation

-       Number of ESX and ESXi flavors in the deployment

-       Number of hosts, virtual machines, and Windows distributions, plus the average number of locales and the average number of different Service Pack levels per Windows distribution

-       Patch scan frequency for virtual machines

-       VMware Tools upgrade scan frequency for virtual machines

-       Virtual machine hardware upgrade scan frequency

-       Patch scan frequency for hosts

-       Upgrade scan frequency for hosts

 

The following are the outputs from the tool:

-       VMware vCenter Update Manager 4.0 database deployment model recommendations

-       VMware vCenter Update Manager 4.0 server deployment model recommendations

-       Initial disk space utilization in MB for database, patch store, and temporary space

-       Monthly disk space utilization growth in MB for database and patch store

-       The upper and lower bounds on the estimation, assuming a 20% variance
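To show how outputs of this kind fit together, here is a minimal Python sketch. It does not reproduce the Sizing Estimator's internal formulas, which are not described here. It assumes that "20% variance" means the bounds sit 20% below and above the estimate, and that growth is linear per month. All input numbers are hypothetical.

```python
def estimate_bounds(estimate_mb, variance=0.20):
    """Return (lower, upper) bounds, assuming a symmetric +/- variance."""
    return estimate_mb * (1 - variance), estimate_mb * (1 + variance)


def projected_usage_mb(initial_mb, monthly_growth_mb, months):
    """Project disk usage, assuming constant (linear) monthly growth."""
    return initial_mb + monthly_growth_mb * months


# HYPOTHETICAL inputs for illustration only, not output of the real tool.
initial_db_mb = 1000
monthly_growth_mb = 50

after_one_year = projected_usage_mb(initial_db_mb, monthly_growth_mb, months=12)
lower, upper = estimate_bounds(after_one_year)
print(f"Estimated database size after 12 months: {after_one_year} MB")
print(f"Planning range: {lower:.0f} MB to {upper:.0f} MB")
```

Provisioning to the upper bound rather than the point estimate is the conservative choice when disk space cannot be extended easily.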


VMware vCenter Update Manager Performance and Best Practices White Paper Posted

VMware vCenter Update Manager is a component of VMware Infrastructure that automates patches and upgrades of ESX hosts, virtual machine Tools and hardware, Windows and Linux virtual machines, and virtual appliances. A new white paper, VMware vCenter Update Manager Performance and Best Practices, is now available.

In this paper we discuss VMware vCenter Update Manager 4.0 host deployment, latency, resource consumption, guest OS tuning, high-latency networks, and the impact of on-access virus scanning. We also provide performance tips to help customers tune the system for better performance.

Exchange 2007 performance on vSphere 4

VMware recently released a white paper showing the performance scalability of Exchange 2007 on VMware vSphere. This paper shows that vSphere 4.0 achieves excellent performance and scalability with regard to both scale up (adding more vCPUs) and scale out (adding more VMs). The results indicate that vSphere can easily support 4,000 heavy Exchange users with a single 8 vCPU VM or 8,000 heavy Exchange users with multiples of either 2 or 4 vCPU VMs. While supporting these high user counts, the latencies of most of our virtualized Exchange configurations are half the recommended threshold (500 ms) with little overhead compared to physical.

 

Even the largest configuration, which supports 8,000 heavy users with 16 vCPUs on an 8-way server, provides an outstanding user experience. For our 8,000 heavy user mailbox configuration, the 95th percentile Send Mail latency is 273 ms with eight 2 vCPU VMs and 304 ms with four 4 vCPU VMs.
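The per-VM and per-vCPU load implied by these configurations can be worked out with a short Python sketch. The user counts, VM shapes, and latencies are taken from the text. Reading "8-way server" as 8 physical cores is an assumption, and the helper name is our own.

```python
TOTAL_USERS = 8000
PHYSICAL_CORES = 8  # ASSUMPTION: "8-way server" read as 8 physical cores

# name -> (number of VMs, vCPUs per VM, 95th percentile Send Mail latency in ms)
configs = {
    "eight 2 vCPU VMs": (8, 2, 273),
    "four 4 vCPU VMs": (4, 4, 304),
}


def summarize(num_vms, vcpus_per_vm, total_users=TOTAL_USERS, cores=PHYSICAL_CORES):
    """Derive per-VM and per-vCPU user load and the vCPU-to-core ratio."""
    total_vcpus = num_vms * vcpus_per_vm
    return {
        "users_per_vm": total_users / num_vms,
        "users_per_vcpu": total_users / total_vcpus,
        "vcpus_per_core": total_vcpus / cores,
    }


for name, (num_vms, vcpus, latency_ms) in configs.items():
    s = summarize(num_vms, vcpus)
    print(f"{name}: {s['users_per_vm']:.0f} users/VM, "
          f"{s['users_per_vcpu']:.0f} users/vCPU, "
          f"{s['vcpus_per_core']:.1f} vCPUs per core, {latency_ms} ms")
```

Both layouts work out to the same load per vCPU, so the latency difference between them reflects VM shape rather than a change in per-vCPU user density.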

95th Percentile Send Mail Latency (2 vCPU VM vs. 4 vCPU VM)

In addition to these low latencies, this paper also shows that the 8,000 mailbox configuration consumes less than 60% of host CPU resources, which leaves room for further user growth and further consolidation. In addition, the paper shows that ESX provides consistent performance across all consolidated virtual machines. For example, the response times of the Exchange transactions in the eight 2 vCPU configuration were within 2% of each other. For more information on this research, read the full paper: Microsoft Exchange Server 2007 Performance on VMware vSphere.

May 18, 2009

350,000 I/O operations per Second, One vSphere Host

Summary

VMware vSphere includes a number of enhancements that enable it to deliver very high I/O performance. In this study, we demonstrate that vSphere can easily support even an extreme demand for I/O throughput made possible by new products like Enterprise Flash Drives (EFD) offered by EMC. In the experiments conducted at EMC labs, we were able to achieve just above 350,000 I/O operations per second with:

  • A single vSphere host with just three virtual machines running on it
  • Latencies under 2ms
  • An I/O block size of 8KB

What does such high throughput mean to customers? Consider this: the entire database of Wikipedia is supported by 20 MySQL servers, each 200GB to 300GB in size. On average, Wikipedia receives 50,000 HTTP requests or 80,000 SQL queries per second [1], which translates to 4.3 billion hits per day. With the storage infrastructure used in our experiments we could easily accommodate the entire database of Wikipedia and still have space left over. A single vSphere host driving more than 350,000 I/O requests per second could easily support the throughput requirements of Wikipedia.
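The Wikipedia figures above can be checked with a few lines of Python. All inputs are the numbers quoted in the text. The final ratio of I/O operations to SQL queries is only a rough indicator, since a single query may need no disk I/O at all or many I/Os.

```python
SECONDS_PER_DAY = 24 * 60 * 60

http_requests_per_sec = 50_000
sql_queries_per_sec = 80_000
mysql_servers = 20
db_size_gb_range = (200, 300)   # per-server database size, low and high
vsphere_iops = 350_000

# 50,000 requests per second sustained over a full day.
hits_per_day = http_requests_per_sec * SECONDS_PER_DAY

# Total database footprint across all servers (low, high), in GB.
total_db_gb = tuple(mysql_servers * size for size in db_size_gb_range)

# Rough headroom: measured I/O operations available per SQL query.
iops_per_query = vsphere_iops / sql_queries_per_sec

print(f"Hits per day: {hits_per_day / 1e9:.2f} billion")
print(f"Total database size: {total_db_gb[0]} GB to {total_db_gb[1]} GB")
print(f"I/O operations available per SQL query: {iops_per_query:.2f}")
```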

Background

In late May 2008, we published a blog article on achieving 100K I/O operations per second with ESX 3.5. To achieve that, we used 495 15K RPM Fibre Channel disks spread across three CX3-80 arrays. To push the envelope further with vSphere, we needed more storage bandwidth. It would have taken approximately 1750 15K RPM Fibre Channel drives in 120 Disk Array Enclosures to provide 350,000 I/O operations per second. Adding redundancy to the storage would increase the numbers further, to as many as 3500 drives for a RAID 1/0 configuration, doubling the entire SAN infrastructure.

Instead, only 30 EFDs housed in three CX4-960 arrays provided enough storage bandwidth for vSphere to drive just above 350,000 I/O requests per second.
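The drive counts above follow from a linear extrapolation of the earlier ESX 3.5 setup, which can be sketched in Python. The inputs are from the text. Assuming that throughput scales linearly with the number of drives is a simplification, so the result lands near, not exactly on, the approximate figures quoted.

```python
import math

esx35_iops, esx35_drives = 100_000, 495   # earlier ESX 3.5 study
target_iops = 350_000
efd_count = 30

# Average throughput each 15K RPM Fibre Channel drive delivered.
iops_per_fc_drive = esx35_iops / esx35_drives

# Drives needed for the target, ASSUMING linear scaling with drive count.
fc_drives_needed = math.ceil(target_iops * esx35_drives / esx35_iops)

# RAID 1/0 mirrors every drive, so the count doubles.
fc_drives_raid10 = 2 * fc_drives_needed

# Average throughput each Enterprise Flash Drive delivered.
iops_per_efd = target_iops / efd_count

print(f"IOPS per Fibre Channel drive: {iops_per_fc_drive:.0f}")
print(f"Fibre Channel drives needed: {fc_drives_needed} "
      f"({fc_drives_raid10} with RAID 1/0)")
print(f"IOPS per EFD: {iops_per_efd:.0f}")
print(f"One EFD replaces about {iops_per_efd / iops_per_fc_drive:.0f} FC drives")
```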

I/O workload

We could have achieved higher I/O operations per second with a smaller block size, but we focused our studies on an 8KB block size because it is the most representative of real applications. We chose an I/O pattern that was 100% random in nature.
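Block size and I/O rate together determine the data rate the host has to move. The Python sketch below converts the figures from this study into bytes per second. It assumes 1KB means 1024 bytes, and the function name is our own.

```python
def throughput_bytes_per_sec(iops, block_size_kb):
    """Data rate implied by an I/O rate and block size (1KB = 1024 bytes)."""
    return iops * block_size_kb * 1024


total = throughput_bytes_per_sec(350_000, 8)
print(f"350,000 IOPS at 8KB: {total / 1e9:.2f} GB/s "
      f"({total / 2**30:.2f} GiB/s)")

# At a fixed data rate, halving the block size doubles the I/O rate,
# which is why smaller blocks can report higher operations per second.
same_rate_4kb_iops = total / (4 * 1024)
print(f"Same data rate with 4KB blocks: {same_rate_4kb_iops:,.0f} IOPS")
```

The 4KB line is a pure arithmetic illustration of the trade-off; it is not a measured result from the study.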

Key Findings

  • 3 VMs on one vSphere host supported 350,000 I/O operations per second with 8KB block size (Figure 1)
  • A single VM with 2 vCPUs and 4GB of memory provided just under 120,000 I/O operations per second with 8KB block size
  • I/O latency as measured in ESX was just under 2 ms
  • VMware’s new paravirtualized SCSI adapter (pvSCSI) offered a 12% improvement in throughput at 18% less CPU cost compared to the LSI virtual adapter

Figure 1 Scaling I/O performance through vSphere
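Throughput and latency together imply how many I/Os must be in flight at once. Little's Law (concurrency = throughput × latency) gives a rough estimate, sketched below in Python with the numbers from the findings above. Because the measured latency was just under 2 ms, the results are upper bounds. The even split across the three VMs is an assumption.

```python
def outstanding_ios(iops, latency_ms):
    """Little's Law: average number of I/Os in flight at a given rate and latency."""
    return iops * latency_ms / 1000


total_outstanding = outstanding_ios(350_000, 2)   # host-wide upper bound
per_vm = total_outstanding / 3                    # ASSUMES an even split over 3 VMs

print(f"Outstanding I/Os across the host: up to {total_outstanding:.0f}")
print(f"Outstanding I/Os per VM: up to {per_vm:.0f}")
```

This is why queue depths along the whole I/O path matter at these rates: sustaining the measured throughput requires hundreds of concurrent requests.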

We are documenting all the experiments in detail in a white paper that will be posted on the VMware website. We encourage readers to refer to that white paper for more details.

This testing was the result of a joint effort between VMware and EMC. We would like to thank the Midrange Partner Solutions Engineering team at EMC, Santa Clara for providing access to the hardware, for the use of their lab, and for their joint collaboration throughout this project.

For comments or questions, please join us in the VMware Performance Community website.

About the Authors:
Chethan Kumar is a member of the Performance Engineering team at VMware. Radhakrishnan Manga is a member of the Midrange Partner Solutions Engineering team at EMC.