Previous blog entries using VMmark 2.1 introduced the benchmark, showed the effects of generational scaling, and evaluated the scale-out performance of vSphere clusters. This article analyzes the performance impact of the type of storage infrastructure used, specifically comparing the effects of Enterprise Flash Drives (EFDs; often referred to as SSDs) versus traditional SCSI HDDs. There is a general perception, in both the consumer and business spaces, that EFDs are better than HDDs. Less clear, however, is how much better they are, and whether the performance benefits of the typically more expensive EFDs are observed in today’s more complex datacenters.
VMmark 2 Overview:
Once again we used VMmark 2.1 to model the performance characteristics of a multi-host heterogeneous virtualization environment. VMmark 2.1 is a combination of application workloads and infrastructure operations running simultaneously. In general, the infrastructure operations increase in an N/2 fashion, where N is the number of hosts. To calculate the score for VMmark 2.1, final results are generated from a weighted average of the two kinds of workloads; hence scores will not increase linearly as workload tiles are added. For more general information on VMmark 2.1, including the application and infrastructure workload details, take a look at the expanded overview in my previous blog post or the VMmark 2.1 release notification written by Bruce Herndon.
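To make the scaling and scoring behavior above concrete, here is a minimal illustrative sketch. The function names, the per-host-pair operation rate, and the 80/20 weighting are assumptions chosen for illustration only; they are not VMmark's actual published formula.

```python
# Illustrative sketch of VMmark-2.1-style score composition.
# The weights and rates below are assumptions, not the benchmark's real values.

def infrastructure_ops_rate(num_hosts: int, ops_per_pair: float = 1.0) -> float:
    """Infrastructure operations scale roughly as N/2 with host count N."""
    return (num_hosts / 2) * ops_per_pair

def blended_score(app_score: float, infra_score: float,
                  app_weight: float = 0.8, infra_weight: float = 0.2) -> float:
    """Final score as a weighted average of application and infrastructure
    workload scores; adding tiles therefore does not scale the score linearly."""
    return app_weight * app_score + infra_weight * infra_score

# A 4-host cluster performs twice the infrastructure operations of a 2-host one.
print(infrastructure_ops_rate(2), infrastructure_ops_rate(4))
print(blended_score(app_score=1.3, infra_score=1.1))
```

Because the application component dominates the weighting in this sketch, a large jump in infrastructure throughput moves the blended score only modestly, which matches the sub-linear scaling described above.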
- Systems Under Test: 2 HP ProLiant DL380 G6
- CPUs: 2 Quad-Core Intel® Xeon® X5570 CPUs @ 2.93 GHz with Hyper-Threading enabled per system
- Memory: 96GB DDR3 Reg ECC per system
- Storage Arrays Under Test:
- HDD: EMC CX3-80
- 8 Enclosures: RAID0 LUNs, 133.68GB FC HDDs
- EFD: EMC CX4-960
- 4 Enclosures: RAID0 LUNs, mix of 66.64GB and 366.8GB FC EFDs
- Hypervisor: VMware ESX 4.1
- Virtualization Management: VMware vCenter Server 4.1
To analyze the comparative performance of EFDs versus HDDs with VMmark 2.1, a vSphere DRS-enabled cluster consisting of two identically configured HP ProLiant DL380 servers was connected to the two EMC storage arrays. A series of tests was then conducted against the cluster, with the same VMs moved to the storage array under test and the number of tiles increased until the cluster approached saturation. Saturation was defined as the point where the cluster was unable to meet the minimum quality-of-service (QoS) requirements for VMmark 2.1. The minimum configuration for VMmark 2.1 is a two-host cluster running a single tile. The result from this minimal configuration on the HDD storage array was used as the baseline, and all VMmark 2.1 data in this article were normalized to that result. In addition to the standard VMmark 2.1 results, esxtop data was also collected during the measurement phase of the benchmark to provide additional statistics.
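The normalization step described above is simple but worth spelling out: every score is divided by the 1-tile HDD baseline, so the baseline itself normalizes to 1.0. A small sketch, using made-up raw scores rather than the actual measured values:

```python
# Normalize raw benchmark scores against a baseline result.
# The raw score values below are hypothetical examples, not measured data.

def normalize(scores, baseline):
    """Divide every raw score by the baseline so the baseline maps to 1.0."""
    return [s / baseline for s in scores]

hdd_1tile_baseline = 2.0              # hypothetical raw 1-tile HDD score
raw_scores = [2.0, 3.6, 5.0]          # hypothetical raw scores at 1, 2, 3 tiles
print(normalize(raw_scores, hdd_1tile_baseline))  # → [1.0, 1.8, 2.5]
```

Normalizing to a common baseline is what allows the HDD and EFD runs, at every tile count, to be compared on a single axis in the graphs that follow.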
In a top-down approach to reviewing the two storage technologies, it seems natural that the first point of comparison would be the overall performance of VMmark 2.1. By comparing the normalized scores, it’s possible to immediately see the impact of running our cluster on EFDs versus traditional HDDs at a variety of load levels.
[Graph: normalized VMmark 2.1 scores, EFD vs. HDD, by tile count]
The improvement in score is apparent at every point of utilization, from the lowest-loaded 1-tile configuration out to the saturation point of 6 tiles. Overall, the average improvement in score for the EFD configuration was 25.4%. And while the HDD configuration was unable to meet the QoS requirements at 6 tiles, the EFD configuration not only met the requirements, but also improved the overall VMmark 2.1 score, even when the cluster was completely saturated (as seen in the graph below). VMmark 2.1 can drive a considerable amount of I/O, up to many thousands of IOPS for large numbers of tiles. Digging deeper into the root cause of such dramatic improvement for EFDs led me to investigate the overall throughputs for each of the configurations.
[Graph: total throughput (Total MB/s), EFD vs. HDD, by tile count]
It’s apparent from the above graph that there was a significant improvement in total bandwidth, represented by Total MB/s, in the EFD configurations. Compared to the HDD configuration, the EFD configuration’s total throughput improved by 8%, 9.2%, 9.5%, 6.5%, and 14.5% across the tested load levels. The improvement generally grew as the I/O demands on the cluster increased. Another interesting detail that arose from reviewing the data over numerous points of utilization was that %CPU used on the EFD configuration was typically higher than on its HDD counterpart at the same load. Although slightly counter-intuitive at first, it makes sense: if the system spends less time waiting for I/Os to complete, it can spend more time doing actual work, as demonstrated by the higher VMmark 2.1 scores. This observation leads to another interesting comparison. Disk latency characteristics are often used to predict hardware performance. This can be useful, but it is less clear how those predictions translate to real-world disk latencies under a diverse set of workloads.
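As a quick sanity check, the average of the five per-configuration throughput improvements quoted above can be computed directly; the percentages are taken from the text, and the averaging itself is just arithmetic:

```python
# Average the per-tile throughput improvement percentages quoted in the text.
improvements = [8.0, 9.2, 9.5, 6.5, 14.5]  # percent, EFD vs. HDD
avg = sum(improvements) / len(improvements)
print(f"average throughput improvement: {avg:.2f}%")
```

A plain arithmetic mean like this weights every load level equally; small differences from any figure quoted elsewhere come down to rounding.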
[Graphs: average read and write I/O latencies, EFD vs. HDD (lower is better)]
Above is a series of graphs that display the average latency reported per write and read I/O (note that lower latency is better). In looking at each of the key latency counters, we can get a better sense of where the additional performance comes from. There’s a generalization that EFDs have poor write speeds compared to today’s HDDs. The results here show that the generalization doesn’t always apply. In fact, the average write latency for the tested EFDs across all data points was within 1% of the average write latency for the tested HDDs. Additionally, the read latency comparison showed massive reductions in latency across all workload levels, 76% on average. Depending on the workload being run, this in itself could be all the justification needed to move to the newer technology.
It isn’t surprising that EFDs outperformed HDDs. What is somewhat unexpected is the magnitude of the gains, and the ability of EFDs to show immediate advantages even on the most lightly loaded clusters. With an average VMmark 2.1 score improvement of 25.4%, an average bandwidth increase of 9.6%, and an average read latency reduction of 76%, it’s easy to imagine a great many environments that might benefit from the real-world performance of EFDs.
Comments (7)
Interesting test. Why RAID-0? No one runs production data at RAID-0. There is a performance penalty to any RAID with parity (4, 5, 10, etc), that I think would be even more obvious when comparing FC to EFD.
As I’m sure you’re aware, RAID0 is the most performant option and highlights the best that the two storage technologies can do. By using RAID0, instead of any of the other RAID levels, we’re able to compare the two technologies in a more generalized manner and leave it to readers to extrapolate how the results might translate to their individual environments.
Why did you use an older less powerful CX-3 storage array for the HDDs?
To have a fair comparison, both arrays should be identical.
For the purposes of this comparison (an older technology versus a newer one), using a CX-3 array is quite fair and likely more representative of the type of gains one might see when upgrading to EFDs. Regardless of the array, although it’s not quantified in this study, most of the gains were likely due to the use of EFDs.