SPECweb2005 Performance on VMware ESX Server 3.5

I had a chance to attend the VMworld 2007 conference in San Francisco a little over three months ago. During the conference, many of my Performance group colleagues and I had the opportunity to speak with a number of customers from various segments of industry. They all seemed to love VMware products and were fully embracing virtualization technology across their IT infrastructure, clearly reflecting a paradigm shift. As Diane Greene described in her keynote, virtualization has become a mainstream technology. However, among the customers we spoke to there were a few who had some concerns about virtualizing I/O-intensive applications. Not surprisingly, the concerns had more to do with perception than with their actual experience.

Truth be told, with a number of superior features and performance optimizations in VMware ESX Server 3.5, performance is no longer a barrier to virtualization, even for the most I/O-intensive workloads. To dispel the misconceptions these customers had, we decided to showcase the performance of ESX Server by running industry-standard I/O-intensive benchmarks. We looked at the whole spectrum of I/O-intensive workloads. My colleague has already addressed database performance. Here, I’d like to focus on web server performance; in particular, the performance of a single virtual machine running the highly network-intensive SPECweb2005 benchmark.

SPECweb2005 is a SPEC benchmark for measuring a system’s ability to act as a web server. It is designed with three workloads to characterize different web usage patterns: Banking (emulates online banking), E-commerce (emulates an E-commerce site), and Support (emulates a vendor support site providing downloads). The three benchmark components have vastly different workload characteristics and we thus look at results from all three.

In our test environment we used an HP ProLiant DL385 G1 server as the system under test (SUT). The server was configured with two 2.2 GHz dual-core AMD Opteron 275 processors and 8GB of memory. In the native tests the system was booted with 1 CPU and 6GB of memory and ran RHEL4 64-bit. In the virtualized tests, we used a 1-vCPU virtual machine configured with 6GB of memory, running RHEL4 64-bit, and hosted on ESX Server 3.5. We used the same operating system version and web server software (Rock Web Server, Rock JSP/Servlet container) in both the native and virtualized tests. Note that neither the storage configuration nor the network configuration in the virtual environment required any additional hardware. In fact we used the same physical network and storage infrastructure when we switched between the native and virtual machine tests.

There are different dimensions to performance. For real-world applications the most significant of these are usually overall latency (execution time) and system throughput (maximum operations per second). We are also concerned with the physical resource utilization per request/response. We used the SPECweb2005 workloads to evaluate all these aspects of performance in a virtualized environment.

Figure 1 shows the performance we obtained using the SPECweb2005 Banking workload. The graph plots the latency in seconds against the total number of SPECweb2005-banking users. The blue dashed line corresponds to the performance observed in a physical environment and the green solid line corresponds to the performance observed in a virtual environment.

Figure 1. Response Time Curves

You can see from the graph that both curves have similar shapes. Both exhibit behavior observed in a typical response time curve. There are three regions of particular interest in the graph: the performance plateau, the stressed region, and the knee of the curve.

The part of the curve marked “Performance plateau” represents the behavior of the system under moderate stress, with CPU utilization typically well below 50%. Interestingly, we observed lower latency in the virtual environment than in the native environment. This may be because ESX Server intelligently offloads some functionality to the available idle cores, and thus in certain cases users may experience slightly better latency in a virtual environment.

The part of the curve marked “Stressed region” represents the behavior of the system under heavier stress, with CPU utilization above approximately 60%. In both curves, response time gradually increases as the load increases, but it remains within reasonable limits.

The knee of each curve is marked by a point where the solid red line intersects the curve. The knee represents the maximum throughput (or load) that can be sustained by the system while meeting reasonable response time requirements. Beyond that point the system can no longer gracefully handle higher loads.
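To make the idea of the knee concrete, here is a minimal Python sketch that picks the knee out of a set of (load, latency) samples. It assumes the knee is simply the highest load at which latency still stays within a chosen limit. The function name find_knee, the latency_limit parameter, and the sample numbers are all hypothetical; they are not taken from our measurements or from the SPECweb2005 run rules.

```python
def find_knee(samples, latency_limit):
    """Return the highest load whose latency stays within latency_limit.

    samples is an iterable of (load, latency_seconds) pairs.
    Returns None if even the lightest load exceeds the limit.
    """
    knee = None
    for load, latency in sorted(samples):
        if latency > latency_limit:
            # Past this point the system no longer meets the limit.
            break
        knee = load
    return knee


# Hypothetical sample points, NOT measurements from this study.
example_samples = [(500, 0.4), (1000, 0.5), (1500, 0.9), (2000, 1.8), (2500, 4.2)]
example_knee = find_knee(example_samples, latency_limit=2.0)
print(example_knee)
```

With these made-up samples and a 2-second limit, the last load that still meets the limit is 2000 users, so that is the value the sketch reports as the knee.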

From this graph we can draw the following conclusions:

When the CPU resources in the system are not saturated, you may not notice any difference in the application latency between the virtual and physical environments.
The behavior of the system in the virtual and physical environments is nearly identical, although the knee of the curve occurs slightly earlier in the virtual environment (because the virtualized system uses moderately more CPU resources).

We have similar results for the Support and E-commerce workloads. For brevity, I’ll focus on the portion of the response time curve that interests most system administrators. We chose a load point that is approximately 80% of the peak throughput obtained on a native machine. This represents the center of the ‘stressed region’ of the native response time curve, with a CPU utilization level of 70% to 80%. We applied the same load in the virtual environment to understand the latency characteristics.
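The choice of load point is simple arithmetic, sketched below in Python. The helper name load_point and the peak value of 2000 users are hypothetical placeholders for illustration only; the actual peak values are in the performance study, not in this sketch.

```python
def load_point(native_peak_users, fraction=0.8):
    """Return the load to apply, as a fraction of the native peak.

    The same load is then applied in both the native and the
    virtual environment so that latencies can be compared.
    """
    return round(native_peak_users * fraction)


# Hypothetical native peak, NOT a result from this study.
example_load = load_point(2000)
print(example_load)
```

For a made-up native peak of 2000 users, the sketch gives a load point of 1600 users.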

As you can see from Figure 2, we did not observe any appreciable difference in application latency between the native and virtual environments.

Figure 2. SPECweb2005 Latency

Figure 3 compares the knee points of the response time curves obtained in all three workloads.

The knee points represent the peak throughput (or the maximum number of connections) sustained by the native and virtual systems while still meeting the benchmark latency requirements.

Figure 3. SPECweb2005 Throughput

As shown in Figure 3, we obtained close to 90% of native throughput performance on the SPECweb2005 Banking workload, close to 80% of native performance on the E-commerce workload, and 85% of native performance on the Support workload.
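For readers who want to run the same comparison on their own results, the percentages above are just the virtual knee point divided by the native knee point. Here is a minimal Python sketch of that calculation. The function name percent_of_native and the connection counts are hypothetical and chosen only to illustrate the arithmetic; they are not the numbers behind Figure 3.

```python
def percent_of_native(virtual_peak, native_peak):
    """Return virtual peak throughput as a percentage of native peak."""
    if native_peak <= 0:
        raise ValueError("native_peak must be positive")
    return 100 * virtual_peak / native_peak


# Hypothetical peak connection counts, NOT results from this study.
example_ratio = percent_of_native(virtual_peak=1800, native_peak=2000)
print(f"{example_ratio:.0f}% of native")
```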

If you’d like to know more about the test configuration, tuning information, and performance statistics we gathered during the tests, check out our recently published performance study.

These tests demonstrate that performance in a virtualized environment can be close to that of a native environment even when running the most I/O-intensive applications. Virtualization does require moderately more processor resources, but this is typically not a concern given the highly underutilized CPU resources in many IT environments. In fact, with so many additional benefits, such as server consolidation, lower maintenance costs, higher availability, and fault tolerance, a very compelling case can be made to virtualize any application, irrespective of its workload characteristics.