The recently released VMware vRealize Operations (vR Ops) Management Pack for vSAN offers an all-new level of visibility into the performance and health of vSAN, courtesy of vR Ops. Since vSAN is built right into the kernel, vR Ops paired with the management pack provides an easy way to consume the right information about your infrastructure. What you measure, how you measure it, and where you measure it are key tenets behind accurate, meaningful infrastructure analytics. This is the reason why vSAN and vR Ops can be such a powerful combination.
The example below steps through a very basic, yet common scenario for new or existing vSAN users. Infrastructure administrators often want to introduce vSAN to an existing environment, but do it in a way that instills confidence that vSAN will deliver the performance expected. With a method to validate the performance expectations of vSAN, transitioning workloads from an existing storage array or other HCI solution can happen quickly and with confidence.
Let’s look at one way to use the new vR Ops Management Pack for vSAN, and how it can be helpful in this scenario.
Validating performance for workload migrations
In this example, we have two systems running a very similar workload. This occurs quite often in formal application farms such as SQL clusters, ERP systems, SharePoint servers, or some other multi-tiered application. Using similar systems simplifies the comparison, as the characteristics of the I/O, such as read/write ratios, sequential/random ratios, and block sizes, will be very similar on both systems. If two similar systems are not available, any two systems can be chosen, though differences in their I/O characteristics will make the comparison less direct.
Once a VM has been transitioned to vSAN based storage, go to the ‘Optimize vSAN Deployments’ dashboard added upon installation of the management pack. The dashboard allows you to easily select the VM served up on non-vSAN storage by selecting the vSphere cluster, followed by the datastore, then the VM. The actual performance of this VM will be presented in the lower left-hand corner of the screen. Then for comparison, choose the vSAN based VM by clicking on the desired VM in the heatmap. This will present the performance of the vSAN powered VM in the lower right-hand portion of the screen.
For this exercise, the time period has been changed to a 2-hour window, and the sampling rate adjusted from 5 minutes to 1 minute. These two adjustments allow us to look at a greater level of detail to better understand the performance behaviors of the two systems.
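To see why the finer sampling rate matters, here is a minimal Python sketch. It counts the data points each setting yields for a 2-hour window, then shows how a short spike can be flattened if 1-minute samples are averaged into a 5-minute value. The averaging rollup is an assumption made for illustration and may not match how vR Ops actually aggregates data, and the latency values are made up rather than taken from the dashboard.

```python
def samples_in_window(window_hours: float, interval_minutes: float) -> int:
    """Number of data points a chart holds for a given window and sampling interval."""
    return int(window_hours * 60 / interval_minutes)


def rollup_average(samples, bucket_size):
    """Average consecutive groups of `bucket_size` samples (assumed rollup behavior)."""
    buckets = [samples[i:i + bucket_size] for i in range(0, len(samples), bucket_size)]
    return [sum(bucket) / len(bucket) for bucket in buckets]


# Illustrative 1-minute latency samples in ms (made-up values, not from the dashboard).
one_minute = [1.0, 1.0, 6.0, 1.0, 1.0]
five_minute = rollup_average(one_minute, 5)

print(samples_in_window(2, 5))            # 24 points at 5-minute sampling
print(samples_in_window(2, 1))            # 120 points at 1-minute sampling
print(max(one_minute), max(five_minute))  # 6.0 2.0
```

Under that assumption, a 6ms spike lasting one minute would appear as a 2ms value at the coarser interval, which is why the shorter interval is the better choice when comparing peaks.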
Figure 1. IOPS comparison between non-vSAN and vSAN based VM. 2-hour window.
The dashboard view will default to showing 3 metrics for both workloads: aggregate virtual disk IOPS, read latency, and write latency. We can see in Figure 1 that the activity (in IOPS) of the two systems mirrors each other closely, peaking at around 260 IOPS when sampled every 1 minute. The absolute numbers are not as important here, as the purpose is to compare systems. We want to validate that vSAN can deliver performance equal to or greater than the VM living on the traditional flash based array.
Now, let’s untick the line in the key showing IOPS. This will change the display so that only read latency and write latency are shown, and will help us better understand the real performance comparison between storage systems. Each y-axis is independently scaled relative to the highest peak of a given time window.
Figure 2. Read and write latency comparison between non-vSAN and vSAN based VM. 2-hour window.
By unticking the IOPS metric, we can see the latency that correlates with the periods of activity. In Figure 2, we see that the VM living on the traditional flash based array had latency peaks of 6ms for reads and 7.5ms for writes, sampled every 1 minute. Compare that to the latency numbers on the vSAN based VM, and you can see that latency is dramatically improved. Latency peaks on vSAN storage were 460us for reads and 750us for writes. This is roughly a 10x improvement in latency for the same workload running on vSAN based storage versus a traditional flash based array. This reduction in latency is what your applications and your users feel. Just as with the comparison of IOPS, in this case the absolute numbers are not as important as the measurements relative to each other.
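The arithmetic behind that comparison can be sketched in a few lines of Python. The peak values are the ones quoted for Figure 2, converted to microseconds; the function and variable names are only illustrative.

```python
def improvement_factor(baseline_us: float, candidate_us: float) -> float:
    """How many times lower the candidate latency is than the baseline."""
    return baseline_us / candidate_us


# Peak latencies quoted for Figure 2, in microseconds.
array_peaks_us = {"read": 6000, "write": 7500}
vsan_peaks_us = {"read": 460, "write": 750}

factors = {
    op: improvement_factor(array_peaks_us[op], vsan_peaks_us[op])
    for op in array_peaks_us
}

for op, factor in factors.items():
    print(f"{op}: {factor:.1f}x lower peak latency on vSAN")
# read: 13.0x lower peak latency on vSAN
# write: 10.0x lower peak latency on vSAN
```

Working from ratios rather than raw values keeps the focus on the relative comparison, which is the point of the exercise.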
Now, let’s untick the line in the key showing read latency, and change the time window from 2 hours to 6 hours. This will help us better understand how consistent write latency is over a longer time period.
Figure 3. Write latency comparison between non-vSAN and vSAN based VM. 6-hour window.
Performance of write operations is typically the most difficult for any storage system to provide consistently. In Figure 3, we see that the deviation of latency on the flash based array is much broader than with vSAN. The highest latency spikes more than doubled relative to the 2-hour window, reaching nearly 20ms on the traditional flash based storage. The vSAN powered VM had a high latency spike of just 870us. In other words, the worst latency spike of the vSAN powered VM remained under 1ms. When focusing on the comparison as opposed to the absolute numbers, we can clearly see that vSAN is not only offering much lower latency, but providing it more consistently.
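One rough way to put a number on that consistency is to compare how much the worst write latency grew when the window widened from 2 hours to 6 hours. The Python sketch below uses only the peaks quoted above, treats "nearly 20ms" as 20ms, and assumes the two windows are directly comparable; it is an illustration, not a metric that the dashboard itself reports.

```python
def peak_growth(short_window_peak_us: float, long_window_peak_us: float) -> float:
    """Ratio of the longer window's worst peak to the shorter window's worst peak."""
    return long_window_peak_us / short_window_peak_us


# Write latency peaks quoted in the text, in microseconds.
write_peaks_us = {
    "flash array": {"2h": 7500, "6h": 20000},  # "nearly 20ms" taken as 20ms
    "vSAN": {"2h": 750, "6h": 870},
}

growth = {
    name: peak_growth(peaks["2h"], peaks["6h"])
    for name, peaks in write_peaks_us.items()
}

for name, ratio in growth.items():
    print(f"{name}: worst write peak grew {ratio:.2f}x over the longer window")
# flash array: worst write peak grew 2.67x over the longer window
# vSAN: worst write peak grew 1.16x over the longer window

worst_case_gap = write_peaks_us["flash array"]["6h"] / write_peaks_us["vSAN"]["6h"]
print(f"worst-case gap: {worst_case_gap:.1f}x")  # worst-case gap: 23.0x
```

A growth ratio close to 1 means the worst case barely moved as more history was included, which is what consistent latency looks like.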
Flexible and customizable
The power of the vR Ops platform comes not only from the data and insight it provides, but from the flexibility it gives the user in presenting the data. If you prefer targeting certain elements of your vSAN environment, you can create custom dashboards that focus specifically on what you want to understand more clearly. Time window defaults and sampling rates can also be easily adjusted to best suit your performance analysis requirements.
Figure 4. Custom dashboard that provides a variation of the data in the default ‘Optimize vSAN Deployments’ dashboard
In Figure 4, we see a slightly modified version of the ‘Optimize vSAN Deployments’ dashboard shipped with the management pack. The heatmap was removed so that the performance graphs can be larger. This is a customization that can be done quickly and easily.
Summary
Validating that new solutions are living up to expectations is an important step in building and maintaining a predictable data center. The vR Ops Management Pack for vSAN not only helps you better understand how your vSAN environment is performing, but also provides a unified control plane to measure infrastructure-related metrics in a consistent, reliable manner.