A recent virtualization performance test claimed a high performance overhead for a virtualized web application compared to a physical system. Unfortunately, the test was not an apples-to-apples comparison. Here VMware points out some of the problems with this test configuration and gives some guidance on how the benchmark might be improved.
Response to the article "Load Testing a Virtual Web Application"
All test results
are obtained using VMware Server, our freely available hosted product. VMware Server
uses a hosted architecture, as opposed to the bare-metal hypervisor architecture
used by VMware ESX Server. VMware expects the performance results for the same
test using VMware Infrastructure 3 would be better than those published in this
article. These results are another example of an "apples to oranges" comparison, and
the reason VMware requires a benchmark review to ensure that benchmark test
methodology is correct. Here are more details about how the physical and
virtual configurations differ:
CPU: In this case, the physical environment consists of
dual Intel Xeon processors with hyperthreading enabled, i.e., there are 4 logical
CPUs in the physical environment. The virtual environment details are not
provided, but assuming default values, we imagine the virtual machine is using a
single virtual CPU. So in essence the test is comparing results from a 4-processor
physical environment to a 1-processor virtual configuration. This can have a
huge impact on multi-threaded apps such as this .NET application.
Memory: The physical environment had 2GB of memory available
to the machine. In the virtual environment, the VM was also assigned 2GB (the
article implies that the physical machine has more memory). While all the
details are not available, this memory configuration may result in swapping
since the host operating system and VMware Server have their own memory requirements.
The article does
provide one data point that validates that the physical-to-virtual comparison is
When the tests were performed with hyperthreading
disabled, they saw 481 maximum users on physical versus 403 on virtual. This is a 16% decrease in performance (versus
the 43% claim). Performance within 16% of native is considered to be relatively
good (especially for a product that uses hosted architecture).
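
The arithmetic behind the 16% figure can be checked with a short calculation. The following is a minimal Python sketch. The function name `percent_decrease` is chosen here for illustration, and only the two user counts come from the article.

```python
def percent_decrease(baseline: float, measured: float) -> float:
    """Return how far `measured` falls below `baseline`, as a percentage."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - measured) / baseline * 100.0


# Maximum users reported with hyperthreading disabled.
physical_users = 481
virtual_users = 403

overhead = percent_decrease(physical_users, virtual_users)
print(f"Decrease relative to physical: {overhead:.1f}%")
# prints: Decrease relative to physical: 16.2%
```
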
To elaborate further, the test with hyperthreading disabled compares the
performance of a 2-physical CPU machine to a single virtual CPU machine. While
this data point still does not reflect a completely fair comparison, it is a very
good indication that the virtualization overheads for a web application are
So what would
you change with the test methodology to make this a fair comparison? There are
several ways to do this and here are just a few:
- Boot the physical environment as a single-CPU system with hyperthreading disabled, and then compare it with a 1-VCPU virtual machine.
- Or, configure the virtual machine with two virtual processors. VMware Server supports dual processor SMP virtual machines.
- Run multiple virtual machines so that the number of virtual and physical processors is equal.
- Make sure that the memory, disk and network
configurations for both physical and virtual environments match.
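
One way to apply the checklist above before running a benchmark is to record both configurations and list the fields on which they differ. The sketch below is illustrative only: the `BenchConfig` class and `mismatches` function are hypothetical names, the single virtual CPU is the assumed default discussed earlier (the article does not state it), and disk and network settings are left out because the article gives no values for them.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class BenchConfig:
    """Hypothetical summary of the resources visible to the system under test."""
    logical_cpus: int
    memory_gb: float


def mismatches(physical: BenchConfig, virtual: BenchConfig) -> list[str]:
    """List every field on which the two configurations differ."""
    return [
        f"{f.name}: physical={getattr(physical, f.name)}, "
        f"virtual={getattr(virtual, f.name)}"
        for f in fields(BenchConfig)
        if getattr(physical, f.name) != getattr(virtual, f.name)
    ]


# 4 logical CPUs on the physical side (dual Xeon with hyperthreading);
# 1 virtual CPU is an assumption, since the article does not say.
physical = BenchConfig(logical_cpus=4, memory_gb=2)
virtual = BenchConfig(logical_cpus=1, memory_gb=2)

for line in mismatches(physical, virtual):
    print(line)
```

An empty result would indicate that the two environments match on the recorded fields. Any non-empty result is a reason to adjust one side before comparing throughput.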
Another important point to notice is that the test measures the performance of a single
virtual machine. The best way to achieve performance scaling in a virtual
environment is to use multiple virtual machines. A good test would have been to
measure the performance of 1, 2, 3, 4 and 5 single CPU virtual machines versus the
performance of a physical box. Not only would this give more data points for
a fair comparison, it would also illustrate how customers are using virtualization
technology in real life to get around some application problems and achieve
better scalability using existing hardware investments.
VMware has an
externally published benchmark for a J2EE application based on IBM WebSphere.
Furthermore, there are several white papers that use another web application,
the open source DVD Store application,
at Dell Power Solutions.