Capacity management is a critical aspect of managing vSphere clusters. Capacity management goals can vary depending on the type of VMs running in the cluster or based on business requirements. Failure to manage capacity based on business goals can lead to capacity shortfalls or performance problems.
vRealize Operations 6.7 and 7.0 offered only the demand model; with the release of vRealize Operations 7.5, allocation model is now available as well. Allocation model unlocks additional use cases that help guide you toward more efficient utilization of your clusters and better projections of future utilization.
The demand model gets its name because it looks at the demand for resources in the cluster to determine how much capacity is needed. Demand is utilization plus unmet demand caused by contention, such as CPU Ready and Co-Stop. The goal of demand model is to drive toward the most efficient utilization of a cluster based on the actual utilization of its resources.
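To make the definition of demand concrete, here is a minimal sketch of the arithmetic. It is not how the vRealize Operations analytics engine computes demand internally; the function name, the MHz units, and the sample numbers are all assumptions made for illustration.

```python
def demand_mhz(used_mhz: float, ready_mhz: float = 0.0, costop_mhz: float = 0.0) -> float:
    """Demand = what the VMs actually used + what they wanted but could not get.

    ready_mhz and costop_mhz stand in for unmet demand caused by contention
    (CPU Ready and Co-Stop). Expressing them in MHz is an assumption here.
    """
    return used_mhz + ready_mhz + costop_mhz


# Illustrative numbers only, not taken from a real cluster.
cluster_capacity_mhz = 100_000.0
demand = demand_mhz(used_mhz=62_000.0, ready_mhz=3_000.0, costop_mhz=1_000.0)
demand_pct = 100.0 * demand / cluster_capacity_mhz

print(f"demand: {demand:.0f} MHz ({demand_pct:.1f}% of capacity)")
```

The point of the sketch is that utilization alone (62% here) understates what the cluster is being asked for once contention is counted.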
I speak to a lot of customers about capacity management, and allocation model comes up in several use cases. The first one is avoiding overcommitment for business critical workloads. In these environments, the cost of the unused resources is not worth the risk of overcommitting resources.
The next use case is showback and reporting. There are typically restrictions, such as contractual obligations or SLAs, that mandate capacity not be overcommitted beyond an agreed-upon ratio. Note that these restrictions are usually non-technical.
Some customers like to do procurement planning based on overcommit ratios. A comfortable overcommit ratio is determined, and that ratio is then used to project utilization into the future. The overcommit ratio is intended to be a rough estimate of utilization; for example, a 4:1 CPU overcommit ratio assumes that, on average, each vCPU runs at only 25% utilization.
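The overcommit arithmetic is simple enough to sketch in a few lines. This is only an illustration of the reasoning above: the function names and the 64-core cluster are hypothetical, and the 4:1 ratio is the one from the example.

```python
def allocatable_vcpus(physical_cores: int, overcommit_ratio: float) -> float:
    """Number of vCPUs that can be allocated before the ratio is exceeded."""
    return physical_cores * overcommit_ratio


def implied_avg_utilization(overcommit_ratio: float) -> float:
    """Average per-vCPU utilization the ratio implicitly assumes (0.0 to 1.0)."""
    return 1.0 / overcommit_ratio


# Hypothetical cluster: 64 physical cores at a 4:1 CPU overcommit ratio.
limit = allocatable_vcpus(physical_cores=64, overcommit_ratio=4.0)
already_allocated = 200
remaining = limit - already_allocated

print(f"vCPU limit: {limit:.0f}, remaining: {remaining:.0f}")
print(f"implied average vCPU utilization: {implied_avg_utilization(4.0):.0%}")
```

Notice that nothing in this calculation looks at how busy the VMs really are, which is exactly the limitation of planning on ratios alone.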
Many customers I have talked with have historically used Excel for capacity planning because the formulas and charts are easy to build. Getting the needed data into Excel is time-consuming, so overcommit ratios are used instead of the actual demand for resources. Since overcommit ratios don’t consider actual utilization, there is a real risk of utilization spikes causing performance problems when the wrong overcommit ratio is used.
By default, vRealize Operations 7.5 only uses the demand model. There is no configuration necessary to enable it, and it cannot be disabled. It can’t be disabled because demand model is based on the actual demand for resources in a cluster. If a cluster is running out of capacity due to high demand, that could be a critical problem, because failure to address the capacity shortfall will most likely lead to performance problems.
Allocation model, on the other hand, is not enabled by default. Enabling allocation model can be done from the Assess Capacity page by clicking on the icon shown in the screenshot below.
The overcommit ratios for the cluster can be entered in the settings box. Note the Affected Policy at the top to see which policy will be changed. It’s possible to configure a different overcommit ratio for each cluster. For example, development vs. production vs. business critical environments may need different ratios. If different overcommit ratios are needed, a policy will be needed for each ratio and applied to the cluster before making the change to the settings. The example screenshot below shows the policy named Allocation Model has been configured for allocation model. More info on creating policies is available in the product documentation.
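One way to think about the policy-per-ratio idea is as a lookup from policy to overcommit ratios. The sketch below is purely conceptual and does not use any vRealize Operations API; the class, the policy keys, and every ratio value are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AllocationPolicy:
    """Hypothetical stand-in for a policy that carries overcommit ratios."""
    name: str
    cpu_ratio: float  # vCPUs per physical core
    mem_ratio: float  # allocated memory per unit of physical memory


# Illustrative ratios only; pick values that suit your own environments.
POLICIES = {
    "dev": AllocationPolicy("Development", cpu_ratio=8.0, mem_ratio=1.5),
    "prod": AllocationPolicy("Production", cpu_ratio=4.0, mem_ratio=1.0),
    "critical": AllocationPolicy("Business Critical", cpu_ratio=1.0, mem_ratio=1.0),
}


def remaining_vcpus(cores: int, allocated_vcpus: int, policy: AllocationPolicy) -> float:
    """vCPUs left before the policy's CPU ratio is reached (negative = over the ratio)."""
    return cores * policy.cpu_ratio - allocated_vcpus


# The same 32-core cluster looks very different under each policy.
for key, policy in POLICIES.items():
    print(f"{policy.name}: {remaining_vcpus(32, 100, policy):.0f} vCPUs remaining")
```

The same hardware and the same allocations give a different answer under each policy, which is why the policy has to be applied to the cluster before the ratios are changed.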
The beauty of how allocation model has been implemented in vRealize Operations 7.5 is that it uses the same capacity analytics engine that was introduced in vRealize Operations 6.7. That means projections for Time Remaining and Capacity Remaining now work equally for both demand and allocation model.
Time Remaining is based on the most constrained resource, for example, 23 Days until the Production_North cluster runs out of memory based on allocation model.
An excellent example of why demand model is always enabled is shown with CPU in the screenshot below. The cluster is projected to run out of CPU in 76 days based on demand, in contrast to 184 days based on allocation. In this situation, if only allocation model were enabled, performance problems could happen around 76 days in the future even though the overcommit ratio has not been reached. If this is deemed acceptable from a demand perspective, lowering the overcommit ratio is suggested.
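The logic of taking the most constrained projection can be sketched as a simple minimum. This is an illustration of the idea only, not the product's implementation; the function name is hypothetical, and the 76 and 184 day figures are the CPU numbers from the example above.

```python
from typing import Dict, Tuple


def effective_time_remaining(days_by_model: Dict[str, int]) -> Tuple[str, int]:
    """Return the (model, days) pair with the fewest days remaining."""
    model = min(days_by_model, key=days_by_model.get)
    return model, days_by_model[model]


# CPU projections from the example: demand vs. allocation.
cpu_days = {"demand": 76, "allocation": 184}
limiting_model, days_left = effective_time_remaining(cpu_days)

print(f"CPU runs out in {days_left} days, limited by the {limiting_model} model")
```

With only the allocation entry in the dictionary, the answer would be 184 days, and the demand-driven shortfall at 76 days would go unnoticed.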
Capacity Allocation Overview Dashboard
If you were a user of vRealize Operations 6.7 or 7.0, you may have used the Capacity Allocation Overview dashboard to track overcommit ratios. This dashboard has been updated to use the configured overcommit ratios, which helps a lot if there are multiple overcommit ratios throughout an environment, such as for development vs. production.
While I’ve covered the use cases for allocation model, how to enable it, and how it affects projections in this post, those are not the only areas affected by allocation model in vRealize Operations 7.5. A video walkthrough is available HERE. Keep an eye here for future posts covering allocation model’s impact on reclamation, Virtual Machines remaining, and costing. All exciting stuff!
I hope you have a better understanding of how allocation model can be used in vRealize Operations, and why it might be needed in some situations. If you don’t have vRealize Operations today, you can download a trial of vRealize Operations 7.5 and try it out in your environment! You can also find more demos and videos on vrealize.vmware.com.