Using vR Ops Management Pack for vSAN to understand cluster performance

Accurate interpretation of metrics in the data center is critical to the efficient design, operation, and optimization of any environment. The vRealize Operations (vR Ops) Management Pack for vSAN extends the abilities of vR Ops to put the right information in front of the administrator running vSAN. This management pack exposes vSAN-related metrics through predefined dashboards that are designed to assist in common operation and optimization efforts. These predefined dashboards are a great way to get started, but the flexibility of vR Ops allows the administrator to take this a step further.

Ease and flexibility in customization is a fundamental design goal of management packs for vR Ops. Users can build dashboards from scratch, or modify existing vSAN dashboards using any of the vSAN performance metrics exposed by the management pack. vR Ops also provides a way to clone existing dashboards, so users can learn and customize while the original dashboards stay intact.


Figure 1. Adding vSAN-related entities to a custom dashboard.

As shown in Figure 1, the vR Ops Management Pack for vSAN provides access to general metrics such as disk capacity, or granular metrics such as write buffer characteristics within a disk group. Custom dashboards can easily be adjusted to time windows that suit the needs of the administrator. Some custom dashboards may specifically target very short time windows to examine data with a higher level of detail, while others are intended to identify broader trends or statistical anomalies. Each approach is equally important for better understanding the demands of the cluster, and how the cluster is responding.

The power that comes from vR Ops and the management pack isn’t just from looking at all vSAN-related metrics, but from associating metrics that might generally be considered unrelated, such as CPU and storage I/O, and learning more about how they may impact each other. This type of discovery can lead to opportunities to optimize a data center the right way, transforming a collection of metrics into meaningful analytics.


Putting a custom dashboard to work: Evaluating cluster performance

Let’s look at an example of a very simple custom dashboard that will provide a better understanding of the performance of a vSAN cluster. The motive for this type of view is to provide a cluster-wide overview that is simple, interactive, and easy to read. The dashboard will present the historical demands and behavior of a single vSAN datastore serving an entire vSphere cluster by looking at commonly known trailing indicators: IOPS, throughput, and latency.


Figure 2. Custom dashboard showing cluster-related IOPS, throughput, and latency for reads and writes.

The custom dashboard shown in Figure 2 has been optimized to show six key storage I/O metrics for interpreting general storage performance of a vSAN datastore:

  • Reads Per Second (IOPS)
  • Writes Per Second (IOPS)
  • Read Throughput (MBps)
  • Write Throughput (MBps)
  • Average Read Latency (ms)
  • Average Write Latency (ms)

There are certainly other critical storage metrics that contribute to storage performance and efficiency, but this simplified view can be an effective way to discover aspects of your environment not readily visible when looking at the metrics in isolation. Metrics often require context to make sense. For example, latency depends on other metrics to provide real meaning. With this custom dashboard, vR Ops will help you see the correlation between these metrics in a variety of ways.
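
As a rough offline illustration of what "correlation between these metrics" can mean, the sketch below computes a Pearson correlation coefficient between two series of samples. It assumes the samples have already been exported and aligned by timestamp. The function name `pearson` and the sample values are hypothetical, and this is not a vR Ops API, only plain arithmetic on two lists.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equally long sample series.

    Returns a value between -1 and 1. Values near 1 mean the two
    metrics rise and fall together; values near -1 mean they move
    in opposite directions.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length with 2+ samples")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical samples (illustration only, not measured data):
write_iops = [1200, 2500, 4000, 3100, 1500]
write_latency_ms = [0.8, 1.1, 2.0, 1.4, 0.9]
r = pearson(write_iops, write_latency_ms)
```

A coefficient close to 1 for these two series would suggest that write latency climbs as write activity climbs, which is the kind of relationship the dashboard lets you spot visually.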


Figure 3. Metrics can be toggled off or on by clicking on each metric in the key.

As shown in Figure 3, the six metrics in the dashboard view can be toggled on or off as needed by clicking on the metric in the key. Being selective about which metrics are visible at any given time helps achieve two objectives:

  • Minimize visual noise so that proper context can be illustrated between similar metrics. For instance, write IOPS versus read IOPS.
  • Make effective use of the single Y axis. The charts in vR Ops are limited to a single scale for the Y axis. Attempting to show two metrics with very different value ranges on that axis may prevent one of them from being visible. To understand the latency that corresponds to write I/Os, toggle on both “Writes Per Second (IOPS)” and “Average Write Latency (ms)”, then toggle “Writes Per Second (IOPS)” off and on to reveal the latency that corresponds to the write activity.
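
To make the single Y axis point concrete, here is a minimal sketch that estimates how much of the chart height each metric would occupy when several series share one axis. It assumes the axis starts at zero and scales to the largest visible value, which is an assumption about the chart rather than documented vR Ops behavior. The function name `axis_share` and the sample values are hypothetical.

```python
def axis_share(series_by_name):
    """Return each series' peak as a fraction of the shared Y-axis maximum.

    Assumes a zero-based axis that scales to the largest value among
    all visible series.
    """
    peaks = {name: max(values) for name, values in series_by_name.items()}
    axis_max = max(peaks.values())
    if axis_max <= 0:
        raise ValueError("at least one series needs a positive value")
    return {name: peak / axis_max for name, peak in peaks.items()}


# Hypothetical samples (illustration only, not measured data):
shares = axis_share({
    "Writes Per Second (IOPS)": [1200, 4000, 2500],
    "Average Write Latency (ms)": [0.8, 2.0, 1.1],
})
```

With these sample values the latency series peaks at a tiny fraction of the axis height, so it would sit flat along the bottom of the chart until the IOPS series is toggled off.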

This approach certainly isn’t the only way to present this data. Multiple predefined views in a dashboard could be used to show similar results. The objective of this chart is to provide a large display area and a way to interact with each metric individually or in combination.


Try it yourself using a click-through demonstration

A click-through demonstration of a vSAN Cluster Analysis can be found on StorageHub. It details an approach for interpreting the behavior of vSAN-based storage using the custom dashboard shown above. The click-through demo presents the steps for toggling specific metrics off and on so that they can be interpreted easily and accurately. Some of the toggling is repeated to best illustrate the result.

Near the end of the demonstration, the analysis of I/O behavior showcases two commonly misunderstood and overlooked aspects of storage I/O across data centers. Incorrect assumptions are often made about read-write distributions and about the I/O sizes commonly used to transmit storage payloads. The click-through demo using this custom view can help the viewer see these results more accurately than traditional methods.
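
Both aspects can be estimated from the six metrics on the dashboard. The sketch below is one way to do the arithmetic: the read-write distribution comes from the two IOPS values, and the average I/O size comes from dividing throughput by IOPS. It assumes 1 MB equals 1024 KB, and the function name `io_profile` and the sample numbers are hypothetical rather than taken from the demo.

```python
def io_profile(read_iops, write_iops, read_mbps, write_mbps):
    """Estimate read/write mix and average I/O size from dashboard metrics.

    Assumes throughput is in MB per second with 1 MB = 1024 KB.
    """
    total_iops = read_iops + write_iops
    read_pct = 100.0 * read_iops / total_iops if total_iops else 0.0
    write_pct = 100.0 * write_iops / total_iops if total_iops else 0.0

    def avg_kb(mbps, iops):
        # Average size of one I/O: throughput divided by operation count.
        return mbps * 1024.0 / iops if iops else 0.0

    return {
        "read_pct": read_pct,
        "write_pct": write_pct,
        "avg_read_kb": avg_kb(read_mbps, read_iops),
        "avg_write_kb": avg_kb(write_mbps, write_iops),
    }


# Hypothetical values (illustration only, not measured data):
profile = io_profile(read_iops=3000, write_iops=1000,
                     read_mbps=24.0, write_mbps=62.5)
```

In this made-up sample, reads account for three quarters of the I/O operations, yet writes move more data because each write is much larger on average. A gap like this between operation count and payload is the sort of result that is easy to miss when looking at IOPS alone.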



By taking advantage of the flexibility of vR Ops and the extended visibility provided by the management pack for vSAN, users can gain confidence that workload demands are being met, and can quickly see evolving demands in the data center. vR Ops paired with the management pack for vSAN can easily be the go-to tool for planning, operations, and optimization of a data center, and lets the administrator make data-driven decisions about the environment.

