vSphere has a new feature called Fault Tolerance (FT) that allows a VM to run in vLockstep on two physical servers at the same time. If the primary VM fails, the secondary VM immediately takes over with no downtime for the VM. There is a great whitepaper that covers FT architecture and performance. There have also been a couple of recent blog posts on VROOM! that cover FT performance. One uses VMmark to show that FT performs very well in a heavily loaded multi-workload environment. The other shows how an Exchange VM maintains excellent performance while supporting 2000 users with FT enabled.
FT currently requires that 1vCPU VMs be used. This presents a challenge for some applications that have traditionally been run in 2vCPU VM configurations. At the same time, new processors have features that provide much higher performance than in the past. When combined with the performance enhancements of ESX 4, it is now possible to get much better performance per core.
A series of Exchange Server 2007 tests was conducted to compare the performance of 1vCPU VMs on the current processor generation with 2vCPU VMs on the previous processor generation. For the 1vCPU tests the Intel Xeon X5570 (Nehalem) processor was used with FT enabled. (For detailed test results comparing FT turned on and turned off on the same VMs, read my previous blog post on Exchange with FT Performance.) For the 2vCPU tests, two previous generation Intel processors were used: a Xeon X5355 (Clovertown) and a Xeon X5460 (Harpertown). The specific servers used were a Dell M600 and Dell 2950 respectively. Storage for all the tests was provided by several Dell EqualLogic PS5000XV iSCSI arrays. Microsoft Exchange Load Generator (LoadGen) was used to run the tests.
The VM was configured with 10GB of RAM and installed with Windows Server 2008 x64 Enterprise Edition and the Exchange Server 2007 mailbox role. A VM running on another ESX server served as the domain controller and Exchange Client Access and Hub Transport server roles.
The graph below shows the results in terms of the average latency for the sendmail action from LoadGen and the sum of the vCPU utilizations of the VM. For these results the sum was used instead of the average because some VMs had 1vCPU and some had 2vCPUs.
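To illustrate why the sum is the fairer metric here, the following is a minimal Python sketch. The utilization readings in it are hypothetical placeholders chosen for illustration, not measured results from these tests, and the function names are my own.

```python
def summed_vcpu_utilization(per_vcpu_percent):
    """Sum of per-vCPU utilization percentages for one VM."""
    return sum(per_vcpu_percent)


def average_vcpu_utilization(per_vcpu_percent):
    """Mean of per-vCPU utilization percentages for one VM."""
    return sum(per_vcpu_percent) / len(per_vcpu_percent)


# Hypothetical readings (NOT from the tests): each vCPU is 40% busy.
one_vcpu_vm = [40.0]
two_vcpu_vm = [40.0, 40.0]

for name, vm in (("1vCPU", one_vcpu_vm), ("2vCPU", two_vcpu_vm)):
    print(name,
          "average:", average_vcpu_utilization(vm),
          "sum:", summed_vcpu_utilization(vm))
```

With these placeholder values the average is the same for both VMs, which hides the fact that the 2vCPU VM consumes twice the CPU resources; the sum makes that difference visible.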
There are a couple of interesting things to note about the results.
The first is that the sendmail average latency with FT enabled on a 1vCPU Xeon X5570 based VM with 1500 users was within 5ms of the 2vCPU Xeon X5460 VM with 2000 users. That is 1500 users per vCPU on Nehalem versus 1000 users per vCPU on Harpertown, so the Nehalem based 1vCPU VM was supporting 50% more users per vCPU than the 2vCPU Harpertown based VM.
The second is that average CPU utilization on the 1vCPU VM with 2000 users and FT enabled was only 45%, which leaves headroom for spikes in usage. This means that 2000 heavy online LoadGen users ran comfortably in a 1vCPU VM.
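The arithmetic behind these two observations can be checked with a short Python sketch. The user counts, vCPU counts, and the 45% utilization figure come from the results above; the function and variable names are my own, and "headroom" is taken here simply as 100% minus the average utilization.

```python
def users_per_vcpu(users, vcpus):
    """Users supported per vCPU for a given VM configuration."""
    return users / vcpus


# Configurations with comparable sendmail latency (within 5ms).
nehalem_ft = users_per_vcpu(1500, 1)   # 1vCPU Xeon X5570 VM, FT enabled
harpertown = users_per_vcpu(2000, 2)   # 2vCPU Xeon X5460 VM

# Relative gain in users per vCPU, as a percentage.
gain_percent = (nehalem_ft / harpertown - 1) * 100

# Remaining CPU capacity on the 1vCPU FT VM at 2000 users.
average_utilization_percent = 45
headroom_percent = 100 - average_utilization_percent

print(f"Nehalem (FT): {nehalem_ft:.0f} users per vCPU")
print(f"Harpertown:   {harpertown:.0f} users per vCPU")
print(f"Gain:         {gain_percent:.0f}% more users per vCPU")
print(f"Headroom:     {headroom_percent}% CPU at 2000 users")
```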
A 1vCPU Xeon X5500 series based Exchange Server VM can support 50% more users per core than a 2vCPU VM based on previous generation processors while maintaining the same level of performance in terms of Sendmail latency. This is accomplished while the VM’s CPU utilization remains below 50%, allowing plenty of capacity for peaks in workload and making an FT VM practical for use with Exchange Server 2007.
9 comments have been added so far
Really an interesting comparison!
I would like to ask about one detail: did the X5570 have hyperthreading enabled or disabled?
Thanks in advance.
Hyperthreading, or Simultaneous MultiThreading (SMT), was enabled for these tests.
How do these numbers stack up against physical Harpertown or Nehalem CPUs? Thanks.
Very nice article. I would be curious to see what a dual core Nehalem would add to these graphs.
We have not published a direct comparison of virtual vs physical of Exchange on Nehalem. We did publish some virtual vs physical numbers on Tigerton earlier this year in a whitepaper – Microsoft Exchange Server 2007 Performance on VMware vSphere 4 – http://www.vmware.com/files/pdf/perf_vsphere_exchange-per-scaling.pdf
The results in that paper showed that vSphere VM performance was within 5% of physical as tested with Exchange LoadGen.
I wonder if having EVC enabled would mask out the specific features that benefited Exchange on the Nehalem?
It would have been nice to see the 5460 and 5355 with 1 CPU to see if it really was double in comparison.