VMware Server is a best-in-class hosted virtualization solution and an ideal way for new users to start using virtualization. It also works well for small deployments and for application test and development. Potential VMware users often ask how VMware Server differs from ESX Server, VMware's hypervisor-based virtualization solution. A complete response to that question would include long discourses on ESX Server's superior feature set, manageability, and overall reliability and robustness. Since I am hardly qualified to attempt such a thorough description, I'm going to stick with what I know - performance.
At VMworld 2006, I presented a performance comparison between ESX Server and VMware Server using VMmark, and I am often asked about it. We ran VMmark using ESX Server on an HP DL585 with four 2.2 GHz dual-core processors. We then ran VMmark on VMware Server using a similar HP DL585 with four 2.4 GHz dual-core processors. The results slide from that talk is shown below. (Thanks to my colleague Lisa Roderick in our Cambridge office for collecting the VMware Server numbers.)
These results show that ESX Server not only achieves higher throughput than VMware Server for a single VMmark tile (6 workload VMs) but also scales better when a second tile is added. This behavior is a natural consequence of the different virtualization approaches taken by the two products. VMware Server runs on top of a heavyweight, general-purpose host operating system, which manages the hardware resources. (The host OS was Windows Server 2003 in these experiments.) ESX Server, on the other hand, manages the hardware resources directly and is highly tuned to support virtual machines efficiently. This optimized design reduces the overhead for individual VMs and produces higher benchmark throughput in general. It is also unsurprising that ESX Server's highly tuned VM resource management provides superior scalability as more workloads are run.
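For readers who want to put a number on "scalability" when going from one tile to two, here is a minimal sketch. It assumes that scaling efficiency is defined as the two-tile score divided by twice the one-tile score, which is my own convention for illustration and not something taken from the slide. The function name and the scores are hypothetical placeholders, not results from the talk.

```python
def scaling_efficiency(one_tile_score: float, two_tile_score: float) -> float:
    """Return the two-tile score as a fraction of perfect linear scaling.

    Perfect scaling (1.0) means the second tile adds as much throughput
    as the first one delivered on its own.
    """
    if one_tile_score <= 0:
        raise ValueError("one_tile_score must be positive")
    return two_tile_score / (2 * one_tile_score)


# Placeholder scores for illustration only, NOT measurements from the talk.
example = scaling_efficiency(one_tile_score=1.0, two_tile_score=1.8)
print(f"two-tile run reaches {example:.0%} of linear scaling")
```

Computing this ratio for each product puts the two on a common footing, since it factors out the difference in their single-tile scores.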
I was looking through some slides from VMworld 2006 and came across an interesting project by one of my colleagues, Chirag Bhatt, who works on Distributed Resource Scheduler (DRS) performance. He was looking for a way to study the behavior of DRS under representative workloads and ended up using a variant of VMmark. He had only a single client machine available in his lab, so we suggested he omit the mailserver workload, since driving multiple mailservers with a single client can be problematic due to domain controller issues. Luckily for Chirag, the VMmark harness is flexible and can easily be reconfigured for this type of test. I've included a couple of slides of his results below for those who couldn't make his VMworld talk.
A two-node DRS cluster was created for these experiments using IBM eServer 336 systems. Two modified VMmark tiles, consisting of five server VMs each (file, database, Java, standby, and web servers), were used. A baseline throughput measurement was first taken to determine the optimal expected throughput (Scenario 1 in the slide). Subsequent tests were started with all workloads running on a single server within the DRS cluster (Scenario 2). The DRS scheduler was then run with both moderate and aggressive settings. The resulting VM placements are shown in Scenario 2a and Scenario 2b, respectively. The aggressive setting produces the intuitive placement: the tiles are essentially split across the two systems, except for the standby server. It is not surprising that the standby server does not migrate, since it consumes few resources.
Chirag's second slide quantifies the throughput of the server VMs. In both cases, DRS provides results close to the optimal. In fact, the nearly perfect workload division generated by the aggressive DRS setting achieved about 99% of the optimal throughput. Not only does DRS work well, but VMmark also shows good robustness along with the ability to measure cluster-level performance.
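As a rough illustration of how a "percent of optimal" figure can be derived, the sketch below normalizes each workload's throughput to the baseline run and summarizes the ratios with a geometric mean. Both the choice of geometric mean and the omission of the idle standby server are my assumptions here, not a description of how the slide was computed. All function names and numbers are hypothetical placeholders.

```python
from math import prod


def relative_throughput(measured: dict, baseline: dict) -> dict:
    """Per-workload throughput as a fraction of the baseline (optimal) run."""
    return {name: measured[name] / baseline[name] for name in baseline}


def geometric_mean(values) -> float:
    """Geometric mean of a non-empty collection of positive numbers."""
    values = list(values)
    return prod(values) ** (1.0 / len(values))


# Placeholder throughputs for illustration only, NOT data from the slides.
# The standby server is left out on the assumption that it is idle.
baseline = {"file": 100.0, "database": 200.0, "java": 50.0, "web": 400.0}
after_drs = {"file": 99.0, "database": 198.0, "java": 49.5, "web": 396.0}

ratios = relative_throughput(after_drs, baseline)
summary = geometric_mean(ratios.values())
print(f"DRS placement reaches {summary:.1%} of baseline throughput")
```

A geometric mean keeps a workload with large raw numbers (such as the web server in this example) from dominating the summary, which is why I picked it for the sketch.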
When we set out to design VMmark, we understood the need for a well-designed, representative multi-VM benchmark. However, as ESX Server has evolved into a complete datacenter infrastructure with VI3, we now see how broadly such a characterization tool is needed. VMmark displays the potential to develop along the same lines into a cluster-wide benchmark measuring much more than a single system's performance.