What-If? Resource Management with vSphere DRS

vSphere Distributed Resource Scheduler (DRS) provides a simple and easy way to manage your cluster resources. DRS works well out of the box for most vSphere installations.

For cases where more flexibility is desired in how the cluster is managed, DRS provides many options in the form of cluster rules, settings and advanced options.

The impact of using rules in a DRS cluster is often not well understood, and the settings and advanced options are not well documented. Imagine being able to experiment with rules before actually applying them, or to change the DRS migration threshold without touching the setting in your live cluster, and still be able to visualize the impact of those actions.

Introducing DRS Dump Insight, a tool to help with simple queries regarding DRS behavior, like the following.

  • What if I dropped all the affinity rules in my cluster?
  • What if I set cluster advanced option “AggressiveCPUActive”?
  • What if I changed the DRS migration threshold from 3 to 5?

How does DRS Dump Insight work?

DRS Dump Insight is designed as an online portal where users can upload their drmdump file and specify the vCenter Server version. After analyzing the drmdump file, the portal presents a summary of the vMotion recommendations that DRS made during the interval for which the drmdump file was collected. The portal also displays a summary of how the host resource usage distribution changed after the given DRS run.

DRS Dump Insight also comes with an option for What-If? analysis. The What-If? analysis runs the DRS algorithm against the same drmdump file with the options specified by the user, and produces a new summary of vMotion recommendations and resource usage distribution. In effect, the What-If? options show how DRS would react to the options without actually having to apply them to the live cluster. Three What-If? options are provided.

Rules

Rules are specified to enforce certain constraints for VMs, with respect to other VMs or hosts within the cluster. While they provide the flexibility to override DRS load balancing in order to meet performance, licensing or compliance SLAs, rules can often hinder the ability of DRS to balance the load optimally. If you want to find out if rules in your cluster are preventing DRS from improving the current load balance, you can use the What-If? option for rules. There are three types of rules you can choose:

  • Affinity Rules
  • Anti-affinity Rules
  • VM Host Soft Rules

DRS Dump Insight will turn off all rules of the specified type(s), run DRS again and provide a summary of vMotion recommendations. This summary can then be compared against the original summary to see the impact of dropping the selected rules on DRS load balancing.
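To make the idea of "turning off a rule type" concrete, here is a minimal toy sketch in Python. It is not how DRS or DRS Dump Insight is implemented: the function `allowed_moves`, the rule dictionaries, and the VM and host names are all hypothetical, and VM Host soft rules are not modeled. It only shows how disabling a rule type widens the set of candidate vMotions.

```python
def allowed_moves(candidate_moves, vm_host, rules, disabled_types=()):
    """Keep only the candidate (vm, dst_host) moves that every enabled rule permits.

    Toy model (assumption): rules are dicts with a "type" and a set of "vms".
    """
    active = [r for r in rules if r["type"] not in disabled_types]
    allowed = []
    for vm, dst in candidate_moves:
        ok = True
        for rule in active:
            if vm not in rule["vms"]:
                continue
            # Hosts currently running the other VMs covered by this rule.
            peer_hosts = {vm_host[p] for p in rule["vms"] if p != vm}
            if rule["type"] == "affinity" and any(h != dst for h in peer_hosts):
                ok = False  # would separate VMs that must stay together
            if rule["type"] == "anti-affinity" and dst in peer_hosts:
                ok = False  # would co-locate VMs that must stay apart
        if ok:
            allowed.append((vm, dst))
    return allowed


# Hypothetical cluster state and rules, for illustration only.
vm_host = {"web1": "host-a", "web2": "host-a", "db1": "host-a", "db2": "host-b"}
rules = [
    {"type": "affinity", "vms": {"web1", "web2"}},
    {"type": "anti-affinity", "vms": {"db1", "db2"}},
]
candidates = [("web1", "host-b"), ("db1", "host-b"), ("db1", "host-c")]

with_rules = allowed_moves(candidates, vm_host, rules)
without_affinity = allowed_moves(candidates, vm_host, rules, disabled_types=("affinity",))
print(with_rules)
print(without_affinity)
```

With both rules active only one of the three candidate moves survives; dropping the affinity rules frees up a second one. Comparing the two lists is the same kind of before-and-after comparison the What-If? summary offers for a real drmdump.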

Let us look at an example. We uploaded a drmdump file corresponding to a DRS run from our lab’s cluster into DRS Dump Insight. From the summary of DRS recommendations, we could see that DRS did not recommend any vMotions. We then used DRS Dump Insight’s What-If? analysis to drop all soft rules in the cluster. At this point, DRS recommended 8 vMotions, and the Hosts Resource Usage Standard Deviation also improved after dropping all the soft rules. The following image compares the DRS recommendations with and without soft rules in the cluster.

We can clearly see that in this case, DRS was not able to recommend any vMotions, since it had to honor the soft rules defined in the cluster. Once we removed all of those rules, DRS was able to reduce the imbalance in the cluster by recommending vMotions.
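For readers who want a feel for what a host usage standard deviation measures, the following is a small sketch. It assumes the metric is the population standard deviation of per-host utilization (load divided by capacity); the exact formula DRS uses is not stated here, and the function name `host_usage_stddev`, the host names, and the numbers are made up for illustration.

```python
from statistics import pstdev


def host_usage_stddev(host_load, host_capacity):
    """Population standard deviation of per-host utilization (load / capacity)."""
    utilization = [host_load[h] / host_capacity[h] for h in host_load]
    return pstdev(utilization)


# Hypothetical CPU demand (MHz) per host before and after a set of vMotions.
capacity = {"host-a": 10000, "host-b": 10000, "host-c": 10000}
before = {"host-a": 8000, "host-b": 3000, "host-c": 4000}
after = {"host-a": 5000, "host-b": 5000, "host-c": 5000}

stddev_before = host_usage_stddev(before, capacity)
stddev_after = host_usage_stddev(after, capacity)
print(f"before: {stddev_before:.3f}, after: {stddev_after:.3f}")
```

A lower value means the hosts are more evenly loaded, which is why a drop in this number after a What-If? run indicates better balance.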

Migration Threshold

The DRS migration threshold controls the amount of imbalance that will be tolerated in the cluster. DRS has five threshold levels ranging from 1 (most conservative) to 5 (most aggressive). The more aggressive the level, the less imbalance DRS tolerates in the cluster. As a result, you might see DRS initiate more migrations and generate a more even load distribution when you increase the migration threshold level. By default, the DRS migration threshold level is set to 3. If you want to know how DRS will behave if you change its migration threshold, you can use the What-If? option for migration threshold. You can specify a level from 1 to 5, and compare the resulting DRS run summary against the current level.

Let us take an example. The following drmdump was taken when DRS was at a migration threshold of 3. DRS made 5 vMotion recommendations during this run. Using What-If? analysis, when we increased the migration threshold to 5, DRS made 78 vMotion recommendations, as shown in the figure.
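The relationship between threshold level and number of recommendations can be illustrated with a toy balancer. This is only a sketch under loud assumptions: the `TOLERANCE` values per level are invented, the greedy strategy is not the DRS algorithm, and `what_if_threshold`, the VM names, and the loads are hypothetical. It runs against a copy of the placement, much like a What-If? run leaves the live cluster untouched.

```python
from statistics import pstdev

# Hypothetical tolerated imbalance (std dev of host load) per threshold level.
# These numbers are invented for illustration; they are not DRS's real values.
TOLERANCE = {1: 40, 2: 25, 3: 10, 4: 5, 5: 1}


def what_if_threshold(placement, vm_load, level, max_moves=100):
    """Return the (vm, src, dst) moves a toy greedy balancer would recommend."""
    placement = {h: list(vms) for h, vms in placement.items()}  # work on a copy
    moves = []

    def load(host):
        return sum(vm_load[v] for v in placement[host])

    def imbalance():
        return pstdev([load(h) for h in placement])

    while imbalance() > TOLERANCE[level] and len(moves) < max_moves:
        src = max(placement, key=load)  # most loaded host
        dst = min(placement, key=load)  # least loaded host
        gap = load(src) - load(dst)
        # Moving a VM smaller than the gap always lowers the variance.
        candidates = [v for v in placement[src] if vm_load[v] < gap]
        if not candidates:
            break
        # The VM closest to half the gap evens out the two hosts the most.
        vm = min(candidates, key=lambda v: abs(vm_load[v] - gap / 2))
        placement[src].remove(vm)
        placement[dst].append(vm)
        moves.append((vm, src, dst))
    return moves


# Hypothetical snapshot: per-VM load and current VM placement.
vm_load = {"vm1": 20, "vm2": 15, "vm3": 10, "vm4": 10, "vm5": 5,
           "vm6": 5, "vm7": 10, "vm8": 10, "vm9": 15, "vm10": 10}
placement = {
    "host-a": ["vm1", "vm2", "vm3", "vm4", "vm5", "vm6"],
    "host-b": ["vm7", "vm8"],
    "host-c": ["vm9", "vm10"],
}

for level in (1, 3, 5):
    print(level, what_if_threshold(placement, vm_load, level))
```

In this toy snapshot, a more aggressive level never produces fewer moves than a more conservative one, which mirrors the direction of the change seen in the example above, though not its magnitude.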

Advanced Options

DRS generally works well with the default/recommended settings. However, not all clusters are the same, and some special cases might require specific customizations in DRS for best performance. DRS provides several advanced options to handle specific cluster requirements outside of recommended settings. You can now use What-If? to specify certain hand-picked advanced options against the current cluster state (drmdump file). This gives you an idea of how the cluster state will change as a result of enabling the advanced option, before actually enabling it in your live cluster.

 

DRS Dump Insight is an attempt at providing an under-the-hood view of how DRS is working to keep your cluster happy and healthy. It is also the first tool to provide a unique and powerful What-If? analysis to help make the best use of DRS capabilities in your clusters.