Scaling real-life Web server workloads

In an earlier blog post, we compared the performance aspects (such as latency, throughput, and CPU resource utilization) of real-life web server workloads in a native environment and a virtualized data center environment. In this post, we focus on another important dimension of performance: scalability.

For our scalability evaluation tests, we used the widely deployed Apache/PHP as the Web serving platform. We used the industry-standard SPECweb2005 as the web server workload. SPECweb2005 consists of three workloads: banking, e-commerce, and support. The three workloads have vastly different characteristics, and we thus evaluated the results from all three.

First, we evaluated the scalability of the Apache/PHP Web serving platform in the native environment, with no virtualization, by varying the number of CPUs available at boot time. Note that all of these native configurations used a conventional, single operating environment consisting of a single RHEL5 kernel system image and a single Apache/PHP deployment. We applied all the well-documented performance tunings to the Apache/PHP configuration, for example, increasing the number of Apache worker processes and using an opcode cache to improve PHP performance.

The figure below shows the scaling results of the SPECweb2005 workload in the native environment. The scaling curve plots the aggregate SPECweb2005 metric (a normalized metric based on the throughput scores obtained on all three workloads: banking, e-commerce, and support) as the number of processors was increased. In our test configuration, there were no bottlenecks in the hardware environment.

As shown in the figure, scalability was severely limited as we increased the number of processors. In the single-CPU configuration, we achieved processor utilization above 95%. But as we increased the number of processors, we could no longer reach such high utilization. Performance was limited by software serialization points in the Apache/PHP/SPECweb2005 software stack. Analysis using the Intel VTune performance analyzer confirmed increasing hot-spot contention as we increased the number of CPUs. For the same workload size of 1,800 banking sessions, the CPI (cycles per instruction) jumped by a factor of roughly four as we increased the number of CPUs from three to eight, indicating a software scaling issue. As observed in our test configuration, such issues often show up as unacceptable latencies even when plenty of compute resources are available on the system. More often than not, diagnosing and fixing these issues is not practical in the time available.
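To make the CPI comparison concrete, here is a minimal sketch of the arithmetic. CPI is the number of clock cycles divided by the number of instructions retired over the same interval. The counter values below are hypothetical, chosen only to reproduce a roughly fourfold increase; they are not the measurements from our tests, and the function names are ours for illustration.

```python
def cycles_per_instruction(cycles: int, instructions: int) -> float:
    """Return CPI: clock cycles divided by instructions retired."""
    if instructions <= 0:
        raise ValueError("instructions must be positive")
    return cycles / instructions


def cpi_growth(baseline_cpi: float, scaled_cpi: float) -> float:
    """Return how many times larger the scaled CPI is than the baseline."""
    return scaled_cpi / baseline_cpi


# Hypothetical counter readings for the same amount of work (not measured data).
# Key: number of CPUs. Value: (clock cycles, instructions retired).
hypothetical_counters = {
    3: (1_200_000_000_000, 1_000_000_000_000),
    8: (4_800_000_000_000, 1_000_000_000_000),
}

cpi = {
    cpus: cycles_per_instruction(cycles, instructions)
    for cpus, (cycles, instructions) in hypothetical_counters.items()
}

for cpus, value in sorted(cpi.items()):
    print(f"{cpus} CPUs: CPI = {value:.2f}")
print(f"CPI growth from 3 to 8 CPUs: {cpi_growth(cpi[3], cpi[8]):.1f}x")
```

When the same work costs several times more cycles per instruction, the extra cycles are going to stalls (for example, waiting on contended locks or shared cache lines) rather than to useful work, which is why a rising CPI at constant load points to a software scaling problem.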

Most real-life web server workloads suffer from scalability issues such as those observed in our tests. To circumvent these issues, many businesses choose to deploy web server workloads on a multitude of one-CPU or dual-CPU machines. However, such an approach leads to a proliferation of servers in the data center, resulting in higher power and space costs. Virtualization offers an easier way to avoid software scaling issues while also using power and space efficiently. This is because virtualization enables several complex operating environments that are not easily scalable to run concurrently on a single physical machine and exploit the vast compute resources offered by today's power- and space-efficient multi-core systems. To quantify the effectiveness of this approach, we measured SPECweb2005 performance by deploying multiple Apache/PHP configurations in a virtual environment. We have submitted our test results to the SPEC committee, and they are under review.

In our virtualized tests, we configured the virtual machines in accordance with the general performance best practices recommended by VMware. Each VM was assigned one virtual CPU and 4 GB of memory. We then varied the number of simultaneously running virtual machines from one to six. We stopped at six because this workload is highly network intensive and ESX offloads some of the network processing to the other available cores. Stopping short of allocating virtual machines to all cores ensured that, with an I/O-intensive workload such as this one, ESX Server had enough resources to take care of virtual machine scheduling, I/O processing, and other housekeeping tasks. The following figure compares the SPECweb2005 scaling results between the native and the virtual environments.

As shown in the figure above, we observed good scaling in the virtual environment as we increased the number of virtual machines. The aggregate SPECweb2005 performance obtained in the tests with up to two virtual machines was slightly lower than the performance observed in the corresponding native configurations. However, as we further increased the number of processors, the cumulative performance of the configuration using multiple virtual machines far exceeded the performance of a single native environment.
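One common way to put a number on this kind of comparison is scaling efficiency: the speedup over the one-processor score, divided by the number of processors. The sketch below shows the calculation. The scores in it are hypothetical placeholders that only mimic the shape of the curves described above (native flattening out, virtual starting slightly lower but continuing to climb); they are not our SPECweb2005 results, and the function name is ours for illustration.

```python
def scaling_efficiency(scores: dict[int, float]) -> dict[int, float]:
    """Map processor count -> efficiency, where 1.0 means perfect linear scaling.

    Efficiency at n processors is (score[n] / score[1]) / n, so `scores`
    must include an entry for one processor.
    """
    if 1 not in scores:
        raise ValueError("scores must include a one-processor baseline")
    baseline = scores[1]
    return {n: (score / baseline) / n for n, score in sorted(scores.items())}


# Hypothetical aggregate scores (NOT measured results), keyed by the number
# of CPUs (native) or the number of one-vCPU virtual machines (virtual).
hypothetical_native = {1: 100.0, 2: 180.0, 4: 240.0, 6: 270.0}
hypothetical_virtual = {1: 95.0, 2: 175.0, 4: 340.0, 6: 500.0}

for label, scores in (("native", hypothetical_native),
                      ("virtual", hypothetical_virtual)):
    for n, efficiency in scaling_efficiency(scores).items():
        print(f"{label:8s} n={n}: score={scores[n]:6.1f} "
              f"efficiency={efficiency:.2f}")
```

Note that efficiency is relative to each environment's own one-processor score, so it describes the shape of a curve rather than absolute performance; the absolute scores still need to be compared directly.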

These results clearly demonstrate the benefit of using VMware Infrastructure to bypass software scalability limitations and improve overall efficiency when running real-life web server workloads.

To find out more about the test configuration, tuning information, and detailed results of all the individual SPECweb2005 workloads, check out our recently published performance study.