vSAN Performance Diagnostics Now Shows “Specific Issues and Recommendations” for HCIBench

By Amitabha Banerjee and Abhishek Srivastava

The vSAN Performance Diagnostics feature, which helps customers optimize their benchmarks or their vSAN configurations to achieve the best possible performance, was first introduced in vSphere 6.5 U1. vSAN Performance Diagnostics is a "cloud connected" feature and requires participation in the VMware Customer Experience Improvement Program (CEIP). Performance metrics and data are collected from the vSAN cluster and sent to the VMware Cloud, where they are analyzed; the results are then sent back for display in the vSphere Client. These results are shown as performance issues, where each issue includes a description of the problem and a link to a knowledge base (KB) article.

In this blog, we describe how vSAN Performance Diagnostics can be used with HCIBench and show the new feature in vSphere 6.7 U1 that provides HCIBench specific issues and recommendations.

What is HCIBench?

HCIBench (Hyper-converged Infrastructure Benchmark) is a standard benchmark that vSAN customers can use to evaluate the performance of their vSAN systems. HCIBench is an automation wrapper around the popular and proven open-source benchmark tool VDbench that makes it easier to automate testing across an HCI cluster. HCIBench, available as a VMware Fling, simplifies and accelerates customer performance testing in a consistent and controlled way.

Example: Achieving Maximum IOPS

As an example, consider the following HCIBench workload that is run on a vSAN system:

  • Number of VMs: 1
  • Number of disks (VMDKs) to test: 10
  • Number of threads per disk: 1
  • Working set percentage: 100
  • Block size: 4 KB
  • Read/write percentage: 0/100
  • Randomness percentage: 100

If the goal is to achieve maximum IOs per second (IOPS), vSAN Performance Diagnostics for this workload yields the result shown in figure 1.

Figure 1

In this example, vSAN Performance Diagnostics reports, "The Outstanding IOs for the benchmark is too low to achieve the desired goal." The feedback identifies the problem and suggests a possible solution: increase the number of outstanding IOs to a value of 2 per host. The linked "Ask VMware" article explains the issue, and what to change in the benchmark, in more detail.
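To see why the recommendation takes this form, note that each VDbench worker thread keeps roughly one I/O in flight against its disk, so the outstanding IOs generated by an HCIBench run can be estimated directly from the workload parameters. The following minimal sketch illustrates that arithmetic; the helper name and the one-IO-per-thread assumption are ours, not part of HCIBench:

```python
# Illustrative sketch: estimate the outstanding IOs (OIO) an HCIBench
# workload generates, assuming each VDbench thread keeps ~1 IO in flight.
def total_outstanding_ios(num_vms: int, disks_per_vm: int,
                          threads_per_disk: int) -> int:
    """Rough estimate; actual queue depths vary with the storage stack."""
    return num_vms * disks_per_vm * threads_per_disk

print(total_outstanding_ios(num_vms=1, disks_per_vm=10, threads_per_disk=1))  # 10
print(total_outstanding_ios(num_vms=1, disks_per_vm=10, threads_per_disk=2))  # 20
```

Doubling the threads per disk doubles the total outstanding IOs, which is exactly the lever the HCIBench-specific recommendation shown next will pull.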

The new HCIBench-specific issues and recommendations feature removes the need to read through a KB article by mapping each recommendation precisely to one or more configurable HCIBench parameters.

Now let us check how vSAN Performance Diagnostics works with vSphere 6.7 U1, which has access to this new feature (figure 2).

Figure 2

Now vSAN Performance Diagnostics pinpoints the exact issue in our HCIBench workload configuration and gives a precise recommendation for resolving it: "Increase number of threads per disk from 1 to 2".

For reference, let us monitor the current write IOPS generated by the benchmark. In the vSphere Client, we go to Data Center > Cluster > Monitor > vSAN > Performance (figure 3).
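These write-IOPS numbers can also be pulled programmatically. Below is a minimal pyVmomi sketch that reads a VM's virtual-disk write IOPS from the vCenter performance manager. The host, credentials, and VM name are placeholders, and the counter choice is our assumption; the vSAN performance charts in the UI are fed by vSAN's own performance service, so the generic counter below is only a rough stand-in:

```python
# Minimal sketch (pyVmomi): read recent write IOPS for a benchmark VM.
# Host, credentials, and the VM name are placeholders for illustration.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()  # lab use only
si = SmartConnect(host="vcenter.example.com",
                  user="administrator@vsphere.local",
                  pwd="***", sslContext=ctx)
content = si.RetrieveContent()

# Find the benchmark VM by name (simplified inventory walk).
view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.VirtualMachine], True)
vm = next(v for v in view.view if v.name == "hci-vdb-vm-1")

# Map counter names to IDs, then query virtual-disk write IOPS.
perf = content.perfManager
ids = {f"{c.groupInfo.key}.{c.nameInfo.key}.{c.rollupType}": c.key
       for c in perf.perfCounter}
spec = vim.PerformanceManager.QuerySpec(
    entity=vm, maxSample=15, intervalId=20,  # 20-second real-time samples
    metricId=[vim.PerformanceManager.MetricId(
        counterId=ids["virtualDisk.numberWriteAveraged.average"],
        instance="*")])
for series in perf.QueryPerf(querySpec=[spec]):
    for metric in series.value:
        print(metric.id.instance, metric.value)  # write IOPS per virtual disk

Disconnect(si)
```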

Figure 3

Next, we apply the recommendation generated by vSAN Performance Diagnostics and check its impact. We re-run the HCIBench workload with 2 threads per disk and the following parameters:

  • Number of VMs: 1
  • Number of disks (VMDKs) to test: 10
  • Number of threads per disk: 2
  • Working set percentage: 100
  • Block size: 4 KB
  • Read/write percentage: 0/100
  • Randomness percentage: 100

After the benchmark completes, we use vSAN Performance Diagnostics again to see if we can now achieve the required goal of maximum IOPS. The result from Performance Diagnostics now shows “No Issues were found”, which means that we are achieving good IOPS from the vSAN system (figure 4).

Figure 4

Finally, we verify the actual change in IOPS after applying the recommendation. From figure 5 below (a screenshot of Data Center > Cluster > Monitor > vSAN > Performance), we can clearly see a 25-30% increase in IOPS, which confirms that the recommendation helped us achieve our goal.

Figure 5

We believe that this feature will be very useful for customers who want to tune their HCIBench workload for a desired goal.

Prerequisites

  • This feature requires HCIBench version 1.6.7 or later.
  • This feature is available in vSphere 6.7 U1 and newer releases. It is not available in patch releases of vSphere 6.7.

Storage DRS Performance Improvements in vSphere 6.7

Virtual machine (VM) provisioning operations such as create, clone, and relocate involve the placement of storage resources. Storage DRS (sometimes abbreviated as SDRS) is the resource management component in vSphere responsible for optimal storage placement and load-balancing recommendations in the datastore cluster.

A key contributor to VM provisioning times in Storage DRS-enabled environments is the latency of generating placement recommendations for the VM disks (VMDKs). This latency particularly comes into play when multiple VM provisioning requests are issued concurrently.

Several changes were made in vSphere 6.7 to improve the time to generate placement recommendations for provisioning operations. Specifically, the level of parallelism was improved for the case where there are no storage reservations for VMDKs. This resulted in significant improvements in recommendation times when there are concurrent provisioning requests.
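To make the measured quantity concrete, the sketch below requests a single Storage DRS placement recommendation through the public RecommendDatastores API and times it. The vCenter address, inventory names, and the minimal ConfigSpec are placeholders; this is our illustration of where the recommendation latency sits, not VMware's internal test harness:

```python
# Sketch (pyVmomi): time one Storage DRS placement recommendation for a
# VM-create operation. Inventory names are placeholders; no error handling.
import ssl
import time
from pyVim.connect import SmartConnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()  # lab use only
si = SmartConnect(host="vcenter.example.com",
                  user="administrator@vsphere.local", pwd="***",
                  sslContext=ctx)
content = si.RetrieveContent()

def find(vimtype, name):
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vimtype], True)
    return next(o for o in view.view if o.name == name)

pod = find(vim.StoragePod, "sdrs-pod-01")            # datastore cluster
rp = find(vim.ResourcePool, "Resources")
folder = content.rootFolder.childEntity[0].vmFolder  # datacenter VM folder

# Minimal "create" placement spec; a real one would also add VMDKs.
config = vim.vm.ConfigSpec(name="sdrs-probe-vm", memoryMB=1024, numCPUs=1,
                           files=vim.vm.FileInfo(vmPathName=""))
spec = vim.storageDrs.StoragePlacementSpec(
    type="create", configSpec=config, folder=folder, resourcePool=rp,
    podSelectionSpec=vim.storageDrs.PodSelectionSpec(storagePod=pod))

start = time.time()
result = content.storageResourceManager.RecommendDatastores(storageSpec=spec)
print(f"recommendation time: {time.time() - start:.3f} s")
for rec in result.recommendations:
    for action in rec.action:
        if isinstance(action, vim.storageDrs.StoragePlacementAction):
            print("recommended datastore:", action.destination.name)
```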

VMware vRealize Automation users who use blueprints to deploy large numbers of VMs quickly will notice the improvement in provisioning times for the case when no reservations are used.

Further performance optimizations were made inside key steps of Storage DRS recommendation processing. These improved the time to generate recommendations even for standalone provisioning requests, with or without reservations.

Test Setup and Results

We ran several performance tests to measure the improvement in recommendation times between vSphere 6.5 and vSphere 6.7. We ran these tests in an internal lab setup consisting of hundreds of VMs and a few thousand VMDKs. The VM operations tested are listed below; a sketch of how the concurrent case can be timed follows the list.

  1. CreateVM – A single VM per thread is created.
  2. CloneVM – A single clone per thread is created.
  3. ReconfigureVM – A single VM per thread is reconfigured to add an additional VMDK.
  4. RelocateVM – A single VM per thread is relocated to a different datastore.
  5. DatastoreEnterMaintenance – Put a single datastore into maintenance mode. This is a non-concurrent operation.
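As an illustration of the concurrent cases, the sketch below issues one placement-recommendation request per thread and reports the average latency, mirroring the one-operation-per-thread setup above. It reuses the connection and helpers from the previous sketch, and the thread count and inventory names are placeholders; again, this is our simplified stand-in, not the actual benchmark harness:

```python
# Sketch: issue one Storage DRS recommendation request per thread and
# report the average latency. Assumes `content`, `find`, `pod`, `rp`, and
# `folder` are set up exactly as in the previous sketch.
import time
from concurrent.futures import ThreadPoolExecutor
from pyVmomi import vim

def timed_recommendation(i: int) -> float:
    config = vim.vm.ConfigSpec(name=f"probe-vm-{i}", memoryMB=1024,
                               numCPUs=1,
                               files=vim.vm.FileInfo(vmPathName=""))
    spec = vim.storageDrs.StoragePlacementSpec(
        type="create", configSpec=config, folder=folder, resourcePool=rp,
        podSelectionSpec=vim.storageDrs.PodSelectionSpec(storagePod=pod))
    start = time.time()
    content.storageResourceManager.RecommendDatastores(storageSpec=spec)
    return time.time() - start

concurrency = 16  # vary this to reproduce the "varying concurrencies" axis
with ThreadPoolExecutor(max_workers=concurrency) as pool:
    latencies = list(pool.map(timed_recommendation, range(concurrency)))
print("avg recommendation time: %.3f s" % (sum(latencies) / len(latencies)))
```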

Shown below are the relative improvements in recommendation times for VM operations at varying concurrencies. The y-axis has a numerical limit of 10 to allow better visualization of the relative values of the average recommendation time.

The concurrent VM operations show an improvement of between 20x and 30x in vSphere 6.7 compared to vSphere 6.5.

Below we see the relative average time taken among all runs for serial operations.

The Datastore Enter Maintenance operation shows an improvement of nearly 14x in vSphere 6.7 compared to vSphere 6.5.

With much faster Storage DRS recommendation times, we expect customers to be able to provision multiple VMs much faster to service their in-house demands. In particular, we expect VMware vRealize Automation users to benefit greatly from these improvements.

SPBM compliance check just got faster in vSphere 6.7 U1!

vSphere 6.7 U1 includes several enhancements in Storage Policy-Based Management (SPBM) to significantly reduce CPU use and generate a much faster response time for compliance checking operations.

SPBM is a framework that allows vSphere users to translate their workload’s storage requirements into rules called storage policies. Users can apply storage policies to virtual machines (VMs) and virtual machine disks (VMDKs) using the vSphere Client or through the VMware Storage Policy API’s rich set of managed objects and methods. One such managed object is PbmComplianceManager. One of its methods, PbmCheckCompliance, helps users determine whether or not the storage policy attached to their VM is being honored.

PbmCheckCompliance is automatically invoked soon after provisioning operations such as creating, cloning, and relocating a VM. It is also automatically triggered in the background once every 8 hours to help keep the compliance records up-to-date.

In addition, users can invoke the method when checking compliance for a VM storage policy in the vSphere Client, or through the VMware Storage Policy API method PbmCheckCompliance.
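For API users, a compliance check looks roughly like the pyVmomi sketch below. Connecting to the SPBM endpoint requires a separate SOAP stub that reuses the vCenter session cookie (the pattern shown here follows the pyvmomi community samples); the vCenter address, credentials, and the VM managed-object ID are placeholders:

```python
# Sketch (pyVmomi): check SPBM compliance for one VM via PbmCheckCompliance.
# vCenter address, credentials, and the VM moref ID are placeholders.
import ssl
from pyVim.connect import SmartConnect
from pyVmomi import SoapStubAdapter, VmomiSupport, pbm

ctx = ssl._create_unverified_context()  # lab use only
si = SmartConnect(host="vcenter.example.com",
                  user="administrator@vsphere.local", pwd="***",
                  sslContext=ctx)

# Build an SPBM stub that reuses the vCenter session cookie.
VmomiSupport.GetRequestContext()["vcSessionCookie"] = \
    si._stub.cookie.split('"')[1]
pbm_stub = SoapStubAdapter(host="vcenter.example.com", path="/pbm/sdk",
                           version="pbm.version.version1", sslContext=ctx)
pbm_si = pbm.ServiceInstance("ServiceInstance", pbm_stub)
compliance_mgr = pbm_si.RetrieveContent().complianceManager

# Reference the VM by its managed-object ID (e.g. "vm-123").
vm_ref = pbm.ServerObjectRef(key="vm-123", objectType="virtualMachine",
                             serverUuid=si.content.about.instanceUuid)

for result in compliance_mgr.PbmCheckCompliance(entities=[vm_ref]):
    print(result.entity.key, result.complianceStatus)
```

The same connection pattern applies to PbmCheckRollupCompliance, which additionally rolls up the compliance results of the disks attached to each VM.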

We did a study in our lab to compare the performance of PbmCheckCompliance between vSphere 6.5 U2 and vSphere 6.7 U1. We present this comparison in the form of charts showing the latency (normalized on a 100-point scale) of PbmCheckCompliance for varying numbers of VMs.

The following chart compares the performance of PbmCheckCompliance on VMFS and vSAN environments.

As we see from the above chart, PbmCheckCompliance returns results much faster in vSphere 6.7 U1 than in 6.5 U2. The improvement is seen across all inventory sizes and all datastore types, and it becomes more prominent for larger inventories and higher numbers of VMs.

The enhancements also positively impact a similar method, PbmCheckRollupCompliance. This method also returns the compliance status of VMs and adds compliance results for all disks associated with these VMs. The following chart represents the performance comparison of PbmCheckRollupCompliance on VMFS and vSAN environments.

Our experiments show that compliance check operations are significantly faster and more lightweight in vSphere 6.7 U1.