
VSAN vs. Nutanix — Head-to-Head Performance Testing — Part 4 — Exchange!

For those of you who have been following this thread for a while, you know we’re in the midst of head-to-head performance testing on two identical clusters: one running VSAN, the other running Nutanix.  Recently, we updated the Nutanix cluster to vSphere 6 and 4.1.3; however, no performance differences have been observed since the change.

Up to now, we’ve only been able to share our VSAN results.  That’s because Nutanix recently changed their EULA to prohibit anyone from publishing any test results.  As a result, it’s very hard to find reasonable Nutanix performance information.   That’s unfortunate.

By comparison, VMware not only regularly publishes the results of its own tests, but also frequently approves such publication by others once we’ve had a chance to review the methodology; just submit your results to [email protected].

Since the results are so interesting, we’re continuing to test!

As we start to move from synthetic workloads to specific application workloads, we recently finished a series of head-to-head Jetstress testing against our two identical clusters.  Previous results can be found here and here.

If you’re not familiar, Jetstress is a popular Microsoft tool for testing the storage performance of Exchange clusters. A busy Exchange environment can present a demanding IO profile to a storage subsystem, so it’s an important tool in the testing arsenal.

TL;DR: our basic 4-node VSAN configuration passed 1000 heavy Exchange users with flying colors, with ample performance to spare.  We can’t share with you how the identical Nutanix cluster did, but it’s certainly a worthwhile test to run yourself if you have the time and inclination.

That being said, there were no surprises — each product performed (or didn’t perform) as we would expect based on both prior testing as well as customer anecdotes.

Now, on to the details!

Our Test Environments

The background on our head-to-head clusters can be found here: basically, a very modest entry-level 4-node configuration, nothing exotic.  Relative pricing information for the two configurations is available here; it’s not a precise match for the tested environment, but it’s certainly indicative.

Jetstress — at a high level — is a pass-fail test. If the Exchange mailbox servers meet their target IOPS — and do so in less than the required response time — you pass.  Historically, Jetstress results have shown a good track record for accurately predicting production storage performance with Exchange.

However, the devil is in the details.  One of the appealing aspects of Jetstress is that it is very configurable to simulate a wide range of user profiles: number of users, level of activity, size of mailbox, etc.

The Tests We Ran

We ran through a large number of profiles in our testing, and thought we’d share the result of a single, demanding test to give you a flavor of what we saw.

For the test shared here, we decided to run a particularly demanding profile — a healthy number of busy users!

Specifically, 1000 active Exchange users, each with a 1 GB mailbox, using the “heavy” user profile of 0.2 IOPS per user. Yes, the mailboxes are a bit on the small side, as we only had a modest number of capacity devices in this head-to-head cluster test.

Other tests — not shared here — tried fewer users and larger mailboxes (e.g. 500 users and 2 GB mailboxes), and/or lighter profiles (e.g. light at 0.13 and medium at 0.17 IOPS per user). Across all the profiles tested, the results were largely consistent: plenty of predictable performance to spare.  We’ll be publishing a white paper with all the VSAN results before too long.
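
To make those per-user rates concrete, here’s a quick back-of-the-envelope sketch (our own arithmetic, not Jetstress output) of the target transactional IOPS that 1000 users would generate at each profile:

# Rough target transactional IOPS implied by the Jetstress per-user rates
# above, for 1000 simulated users. Illustrative arithmetic only -- Jetstress
# derives its own targets from the configured profile.
USERS = 1000
PROFILES = {"light": 0.13, "medium": 0.17, "heavy": 0.20}  # IOPS per user

for name, iops_per_user in PROFILES.items():
    print(f"{name:>6}: {USERS * iops_per_user:.0f} target transactional IOPS")

The heavy profile works out to 200 target transactional IOPS, which is exactly the per-server target you’ll see in the results below.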

Behind this test profile, we ran four mailbox servers — one per physical server — each in a VM configured with 8 vCPUs and 32 GB of RAM. Each mailbox server was configured with 8 x 175 GB databases and 8 x 32 GB log files. Default VSAN policies were used throughout.

For the specific test shown here, we used a moderate number of threads — 4 — per database instance. Generally speaking, more threads means more parallelism and more performance — if the storage can keep up, that is! We also configured Exchange non-DAG, as VSAN is providing the required resiliency.
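
For a sense of the raw capacity this layout consumes, here’s a minimal footprint sketch. Note the doubling factor is an assumption on our part: the default VSAN policy tolerates one failure (two mirrored copies of each object), and all we’ve said above is that default policies were used.

# Back-of-the-envelope storage footprint for the Jetstress layout described
# above: 4 mailbox servers, each with 8 x 175 GB databases and 8 x 32 GB
# log files.
SERVERS = 4
DB_COUNT, DB_GB = 8, 175
LOG_COUNT, LOG_GB = 8, 32
FTT_COPIES = 2  # assumption: default NumberOfFailuresToTolerate=1 mirroring

logical_gb = SERVERS * (DB_COUNT * DB_GB + LOG_COUNT * LOG_GB)
raw_gb = logical_gb * FTT_COPIES

print(f"Logical capacity consumed: {logical_gb} GB")   # 6624 GB
print(f"Raw VSAN capacity (FTT=1): {raw_gb} GB")       # 13248 GB

That footprint is why the mailboxes are on the small side for this test: we only had so many capacity devices to work with.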

Software versions:
– Windows Server 2012 Datacenter 6.2.9200.0
– ESE version 15.00.0516.026
– Jetstress version 15.00.0995.000
– vSphere 6

The reason we’re providing all these details is that if you care to reproduce our results, you can!

The Results

Each of our four mailbox servers showed results like these when running 1000 heavy users with a 1 GB mailbox each:

Achieved Transactional IO per second: 331.8
Target Transactional IO per second: 200

Observed Read Latencies (avg):  8.57 msec
Target Read Latency:  <20 msec

Observed Write Latencies (avg):  6.9 msec
Target Write Latency: <10 msec
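
If you run your own Jetstress tests, the pass criteria boil down to hitting the target IOPS while staying under the latency thresholds. Here’s a minimal sketch of that check (a simplification, not the Jetstress report itself), applied to the per-server figures above:

# Simplified pass/fail check based on the criteria and thresholds shown in
# the results above; this illustrates the logic only.
def passes(achieved_iops, target_iops, read_ms, write_ms,
           read_limit_ms=20.0, write_limit_ms=10.0):
    return (achieved_iops >= target_iops
            and read_ms < read_limit_ms
            and write_ms < write_limit_ms)

# Per-mailbox-server figures from the run above.
print(passes(achieved_iops=331.8, target_iops=200,
             read_ms=8.57, write_ms=6.9))  # True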

The Bottom Line

In this test, our humble 4-node test cluster supported 1000 busy Exchange users with stellar performance, and had more than enough resources available to support other applications on the same cluster.  More test results are available if you’re interested.

If you’re considering running Exchange on your hyperconverged environment, you would be well-served to run your own Jetstress tests.

In addition, you might be interested in this Microsoft reference architecture which shows VSAN running a significant Exchange workload (2,500 users) while concurrently pushing transactions using SQL Server.  Plenty of performance to spare …

Is this the maximum we can see from VSAN in an Exchange environment?  Hardly.  For one thing, we were capacity-limited in our test environment.  More hardware = more mailboxes = more users.  Larger clusters and multiple disk groups = more parallelism = more users.

And all of that is *before* we start talking all-flash VSAN.