
Using a Private Cloud to Improve Exchange Performance

There has been quite a bit of talk about clouds lately.  VMware uses the terms private cloud and external cloud to distinguish between two different types of clouds.  Although you may not realize it, if you are an existing VMware customer you might be running a private or internal cloud right now.  If you run a number of ESX servers in a cluster with vSphere features such as VMware High Availability (HA), the Distributed Resource Scheduler (DRS) for load balancing, and quick provisioning with clones and templates, then there is an actual cloud in your datacenter today (even if your datacenter is more like a data closet, it still counts).  Our recent testing of a simulated full business day across four time zones showed up to an 18% performance advantage for Exchange 2007 on a private cloud.

Putting Exchange in the Cloud

A couple of years ago I did some testing with Exchange Server 2007 and VMotion and presented the results at VMworld 2007.  The findings from that test showed that VMs could be moved with VMotion with minimal impact on performance.  VMotion is one of the enabling features for DRS within a vSphere cluster, so it made sense to test Exchange in that internal cloud environment and measure its performance.

Private Cloud Configuration

A simple but powerful private cloud was created in the Dell TechCenter lab with two Dell M905 blades and several EqualLogic PS5000 iSCSI storage arrays.  Each server had four quad-core AMD Opteron 8356 processors and 96 GB of RAM.  Four EqualLogic PS5000XV arrays, each with sixteen 15K RPM disks, hosted the Exchange mailbox databases and logs.  An EqualLogic PS5000E with sixteen 7,200 RPM SATA disks hosted the VM OS partitions.

Test Scenario

A 16,000-user company in the US has its employees evenly spread across four time zones: East, Central, Mountain, and Pacific.  There are two Exchange mailbox server VMs per time zone, with each VM supporting 2,000 users.  Each VM had four vCPUs and 14 GB of RAM.  As each group of users starts its eight-hour workday, load increases on the Exchange servers, and as each group ends its day, load decreases.  This scenario was simulated by scripting Microsoft Exchange Load Generator (LoadGen) to start each group of users one hour apart, as sketched below.  The chart below shows the I/O operations per second (IOPS) during the test, with the points where users start and stop their activity marked.

[Chart: IOPS over time, with time zone start and stop points marked]
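The post does not include the actual LoadGen scripting used in the test. As a rough illustration only, the sketch below staggers the start of each time zone's simulated users by one hour; the "loadgencmd" command and its per-zone configuration file names are hypothetical placeholders, not the real LoadGen command-line syntax.

```python
# Illustrative only: stagger per-time-zone load starts by one hour.
# "loadgencmd" and its arguments are hypothetical placeholders, not the
# actual LoadGen command line used in the original test.
import subprocess
import time

TIME_ZONES = ["east", "central", "mountain", "pacific"]
STAGGER_SECONDS = 3600  # one hour between group start times

processes = []
for zone in TIME_ZONES:
    config = f"loadgen-{zone}.xml"          # hypothetical per-zone config file
    print(f"Starting simulated users for {zone} using {config}")
    processes.append(subprocess.Popen(["loadgencmd", "/run", config]))
    time.sleep(STAGGER_SECONDS)             # wait before the next zone starts

# Wait for all simulated workdays to finish before collecting results.
for proc in processes:
    proc.wait()
```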

The test was run once with DRS disabled and once with DRS enabled.  The starting VM placement was the same in both cases, with the same four VMs on each server.  The VMs were placed “in order,” with the East and Central VMs on the first server and the Mountain and Pacific VMs on the second.
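The post does not show how DRS was turned on for the second run. A minimal sketch using the pyVmomi Python bindings for the vSphere API is below; the vCenter address, credentials, and cluster name are placeholders and not details from the original test.

```python
# Minimal sketch, assuming pyVmomi; connection details and the cluster
# name "ExchangeCluster" are placeholders, not from the original test.
from pyVim.connect import SmartConnect, Disconnect
from pyVim.task import WaitForTask
from pyVmomi import vim

si = SmartConnect(host="vcenter.example.com", user="administrator", pwd="password")
content = si.RetrieveContent()

# Find the cluster that holds the eight Exchange mailbox VMs.
view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.ClusterComputeResource], True)
cluster = next(c for c in view.view if c.name == "ExchangeCluster")
view.DestroyView()

# Enable DRS in fully automated mode so it can initiate VMotions on its own.
spec = vim.cluster.ConfigSpecEx()
spec.drsConfig = vim.cluster.DrsConfigInfo(
    enabled=True,
    defaultVmBehavior=vim.cluster.DrsConfigInfo.DrsBehavior.fullyAutomated)
WaitForTask(cluster.ReconfigureComputeResource_Task(spec, modify=True))

Disconnect(si)
```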

The first graph below shows the CPU utilization of the ESX servers during the test without DRS.  The load is uneven between the two servers at the beginning and end of the test, when load is changing, but even during the middle of the test, when all users are active.

[Graph: ESX host CPU utilization without DRS]

Adding the Exchange VMs to a DRS-enabled cloud keeps the load across the servers more even over the course of the test.  The next graph shows the CPU utilization of the ESX hosts while DRS was enabled, with the VMotion events initiated by DRS called out by vertical lines.  Each VMotion event was in response to a divergence in the two servers’ CPU utilization levels.

[Graph: ESX host CPU utilization with DRS enabled, with DRS-initiated VMotion events marked]
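DRS makes these migration decisions on its own, but the host CPU divergence it reacts to is easy to observe from outside. The sketch below, again assuming pyVmomi and placeholder host names, polls the two hosts' CPU utilization in the same terms the graphs above report.

```python
# Sketch: report per-host CPU utilization (%) as shown in the graphs above.
# Host names and connection details are placeholders; DRS uses its own
# internal metrics, so this is only for outside observation.
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim
import time

si = SmartConnect(host="vcenter.example.com", user="administrator", pwd="password")
content = si.RetrieveContent()

view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.HostSystem], True)
hosts = [h for h in view.view if h.name in ("esx-blade1", "esx-blade2")]
view.DestroyView()

def cpu_percent(host):
    """Overall CPU usage in MHz divided by total CPU capacity in MHz."""
    used_mhz = host.summary.quickStats.overallCpuUsage
    total_mhz = host.hardware.cpuInfo.hz / 1_000_000 * host.hardware.cpuInfo.numCpuCores
    return 100.0 * used_mhz / total_mhz

for _ in range(10):                      # sample a short window
    usage = {h.name: cpu_percent(h) for h in hosts}
    print(usage, "divergence:", abs(usage["esx-blade1"] - usage["esx-blade2"]))
    time.sleep(20)

Disconnect(si)
```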

The result was that performance was better by as much as 18% for some users, and by an average of 8% across all users, with the Exchange VMs running in the internal cloud.  The table below shows the complete results in terms of LoadGen Send Mail 95th percentile latency.

[Results table: LoadGen Send Mail 95th percentile latency with and without DRS]

Exchange LoadGen only reports performance at the end of the run, but the advantage in the DRS run comes from keeping the initial CPU spikes lower when new users log on at the beginning of their workday.  There is some variation from group to group as a result of the noise of the Exchange workload.  In seven of the eight user groups, latency was better when DRS was enabled, showing a broad advantage rather than one driven by a single group.
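For readers less familiar with the metric, the advantage quoted here is simply the percentage reduction in 95th percentile Send Mail latency. The sketch below shows the arithmetic on made-up latency samples; the actual per-group numbers are in the table above.

```python
# Illustration of the metric only: these latency samples are made up,
# not the measured LoadGen results from the table above.

def percentile_95(samples):
    """95th percentile by rank: the value below which 95% of samples fall."""
    ordered = sorted(samples)
    rank = max(0, int(round(0.95 * len(ordered))) - 1)
    return ordered[rank]

# Hypothetical Send Mail latencies (ms) for one user group.
without_drs = [210, 250, 340, 410, 520, 590, 640, 700, 880, 950]
with_drs    = [200, 240, 320, 380, 470, 530, 580, 620, 740, 780]

p95_without = percentile_95(without_drs)
p95_with = percentile_95(with_drs)

# Advantage = percentage reduction in 95th percentile latency with DRS enabled.
advantage = 100.0 * (p95_without - p95_with) / p95_without
print(f"95th percentile without DRS: {p95_without} ms")
print(f"95th percentile with DRS:    {p95_with} ms")
print(f"DRS advantage: {advantage:.1f}%")
```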

Conclusion

With all 16,000 users active, CPU utilization on the two servers stays under 50%, meaning this workload is not CPU constrained.  In cases where CPUs are oversubscribed and running at near 100% utilization, it is easy to see how DRS can achieve performance gains by moving VMs to servers with more available CPU cycles.  There is also a small performance cost each time a VMotion occurs, which works against the DRS-enabled test.  Despite these factors, there was still up to an 18% advantage for a given group of users and an average 8% advantage on the vSphere private cloud.

Additionally, simply measuring the performance advantage of DRS does not take into account all of the other benefits of running Exchange on a vSphere cloud.  Potential power savings with Distributed Power Management (DPM), time savings from easy deployment using templates and clones, and increased availability with HA and Fault Tolerance are all additional advantages of running in the cloud.