A consistent theme has been cropping up lately in questions from customers around VMware DRS (Distributed Resource Scheduler), predictive DRS (pDRS) & workload placement. The questions concern how DRS & pDRS actually work, as well as some misunderstandings around the benefits they bring to customers’ environments. There are several descriptive posts on how DRS & pDRS work, why they were created and how to tune DRS for each unique environment. Even so, it warrants more discussion on exactly why we make some of the decisions we do with respect to balancing & workload placement – especially as other vendors are making inaccurate claims.
The business problem
It is important to take a step back from the technology and understand the business problems we are trying to solve. Firstly, where should I place new workloads? Should we place them round-robin across hosts, or how else can we make the right decision when a brand-new workload is onboarded into an environment? Secondly, once workloads are running, how can we ensure they have access to all the resources they need? Lastly, when I inevitably run out of resources, how can I ensure mission-critical applications get higher-priority access to resources?
What compounds these questions are the numerous ways that customers configure their datacenters and clusters. Some customers use a few single large clusters for everything, including production and test/dev. Other customers may use many smaller clusters that are purpose-built for a variety of reasons, e.g. clusters to minimize license costs or to separate production from test. Others build clusters, fill them to a set percentage, close them to any new workloads and create new clusters moving forward. It’s important that whatever is doing workload placement & balancing can work in each of these environments.
What metrics do DRS & pDRS consider?
The first thing to cover is what DRS & pDRS actually consider. Customers often tell me that DRS does not account for CPU Ready (%RDY), memory swap or [insert here] some other critical metric… In fact, DRS & pDRS consider the most critical metrics for compute, network and storage. To put it a different way, compared to any other workload placement solution on the market, we look at 2x or more data points and make decisions 3x more frequently. For a more in-depth look at how DRS & pDRS function, I recommend these blog posts.
https://blogs.vmware.com/vsphere/2015/05/drs-keeps-vms-happy.html
https://blogs.vmware.com/vsphere/2016/05/load-balancing-vsphere-clusters-with-drs.html
https://blogs.vmware.com/management/2016/11/predictive-drs.html
Q. Why do the number of metrics and the decision frequency matter? A. Efficiency.
Looking at the data every 5 minutes and making decisions allows DRS & pDRS to ensure that current and forecasted workloads will be able to run with minimal or no contention for resources. In our studies, this means less volatility in the datacenter, which ultimately allows for higher utilization of hardware and reduces potential impact to other workloads.
Initial placement
The initial provisioning time for a new workload is minimal compared to the weeks, months or years it may live on for. Because of this, initial placement of a new workload is generally a spot decision. There simply isn’t enough data to warrant longer-term decisions when the future demand is all but unknown. Even if it is a clone of an existing workload, there is no guarantee it will perform the same. For initial placement, DRS looks at the allocated resources of the workload being powered on and determines the best host on which to place and power on that workload.
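To make the spot-decision idea concrete, here is a minimal sketch of allocation-based placement: pick the host with the most remaining headroom after fitting the new workload’s allocation. The host names, capacity figures and scoring rule are all illustrative assumptions for this post – this is not DRS’s actual algorithm.

```python
# Hypothetical spot-decision placement: choose the host that keeps the
# most headroom on its scarcest resource after absorbing the new VM.
def place_workload(hosts, vm_cpu_mhz, vm_mem_mb):
    """Return the name of the host best able to absorb the new VM, or None."""
    best_host, best_headroom = None, -1.0
    for host in hosts:
        free_cpu = host["cpu_capacity"] - host["cpu_used"] - vm_cpu_mhz
        free_mem = host["mem_capacity"] - host["mem_used"] - vm_mem_mb
        if free_cpu < 0 or free_mem < 0:
            continue  # host cannot fit the allocation at all
        # Score by the scarcer of the two remaining resources (as a fraction).
        headroom = min(free_cpu / host["cpu_capacity"],
                       free_mem / host["mem_capacity"])
        if headroom > best_headroom:
            best_host, best_headroom = host["name"], headroom
    return best_host

hosts = [
    {"name": "esx01", "cpu_capacity": 40000, "cpu_used": 30000,
     "mem_capacity": 262144, "mem_used": 200000},
    {"name": "esx02", "cpu_capacity": 40000, "cpu_used": 12000,
     "mem_capacity": 262144, "mem_used": 90000},
]
print(place_workload(hosts, vm_cpu_mhz=4000, vm_mem_mb=16384))  # esx02
```

Note that the decision uses only allocation and current host state – exactly because, at power-on time, there is no demand history to reason about.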
Ongoing balance vs. contention
Another consideration about DRS & pDRS is that it is not just about balancing a cluster. Moving workloads simply to spread the load evenly has a cost. People tend to forget that vMotions & Storage vMotions are not free from a resource perspective, and constantly moving workloads in search of a best fit only exacerbates issues. There is minimal gain, if any, from evening out one host at 50% and another at 60%. In most environments, where there are 4-16 hosts per cluster, this constant movement can cause more issues than it solves. A good way to measure this is to use dedicated service accounts for any tools talking to vCenter and to monitor what they do. A lot of 3rd-party tools make exorbitant numbers of API calls, trigger vMotions and generate log data that fills up vCenter. A good way to keep those in check is with vRealize Log Insight, which can quickly highlight those noisy tools.
Contrary to just balancing, DRS & pDRS ensure that workloads move less often without performance issues. If all workloads on a given host can access all the resources they are entitled to, and there is therefore no contention, there is no performance gain in moving any of them. Again, moving a workload to another host would itself consume resources. Simply put, contention must be weighed heavily when balancing workloads.
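The reasoning above can be sketched as a simple cost/benefit check: only recommend a move when a workload is actually contending for resources and the benefit of relieving that contention outweighs the cost of the vMotion itself. The thresholds and weights here are made-up illustrations, not DRS’s real cost model.

```python
# Illustrative contention-weighted migration decision (not DRS's real model).
def should_migrate(cpu_ready_pct, imbalance_pct,
                   contention_threshold=5.0, migration_cost=10.0):
    """cpu_ready_pct: %RDY for the VM; imbalance_pct: host load spread."""
    if cpu_ready_pct < contention_threshold:
        # No contention: the VM already gets the resources it is entitled
        # to, so moving it yields no performance gain.
        return False
    # Benefit grows with contention; weigh it against the fixed move cost.
    benefit = cpu_ready_pct + 0.2 * imbalance_pct
    return benefit > migration_cost

# A 10% spread alone (hosts at 50% vs. 60%) does not justify a move...
print(should_migrate(cpu_ready_pct=1.0, imbalance_pct=10.0))   # False
# ...but sustained CPU Ready time on a busy host does.
print(should_migrate(cpu_ready_pct=12.0, imbalance_pct=10.0))  # True
```

The first call mirrors the 50%-vs-60% example from above: imbalance without contention produces no recommendation.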
Reactive vs. proactive
What VMware customers tell me they want is a solution that looks at the current state of the entire environment, understands past trends, and proactively moves workloads to avoid contention. This is where pDRS comes into the conversation. Most workloads have some consistent trends. One oversimplified example is a proxy server: traffic will generally be consistent throughout the day, with peaks at the morning arrival of employees, at lunch time and just prior to departure for the day. If you know the demand from historical data, then you can predict future demand with some level of accuracy. That prediction, which comes from vRealize Operations, feeds pDRS and DRS to proactively move workloads to ensure they have the resources they need now AND in the future.
Data granularity
In a previous post, I wrote about why the granularity of historical data matters. A good example is comparing vCenter’s yearly data charts to its hourly ones. You can see the peaks and valleys in the hourly chart, whereas in the yearly chart everything is much smoother. Essentially, when you roll up data over time you lose its fidelity. This becomes critical for making proactive decisions, and it is one of the reasons why vRealize Operations uses the same data sample size as DRS and keeps it for long periods of time. If the data is rolled up over time, there is simply no way to make accurate decisions based on it.
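A tiny numeric example makes the rollup problem obvious: a short CPU spike that is plainly visible in 5-minute samples disappears almost entirely in an hourly average. The sample values are invented for illustration.

```python
# Why rollups hide peaks: one hour of 5-minute CPU samples (in %)
# containing a 10-minute spike to 100%, vs. the hourly average.
from statistics import mean

five_min = [20] * 6 + [100] * 2 + [20] * 4  # twelve 5-minute samples
hourly_rollup = mean(five_min)

print(max(five_min))         # 100 -- the spike is visible in raw samples
print(round(hourly_rollup))  # 33  -- the rollup smooths it away
```

Any decision made from the rolled-up value would conclude the host is barely a third busy, while in reality it was saturated for ten minutes.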
pDRS maximum
Because of the amount of data that pDRS leverages, there is currently a supported limit of 4,000 VMs per cluster. This is half of the vCenter maximum of 8,000 VMs per cluster. To put that in perspective, with the maximum number of hosts per cluster (64), you would need to run more than 62.5 VMs per host to exceed the supported limit. Most customers I’ve spoken with use 16 hosts or fewer per cluster, which would mean more than 250 VMs per host to hit this limit. Again, this is a per-cluster supportability limit.
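The per-host arithmetic behind those figures is straightforward:

```python
# Per-host VM density needed to hit the 4,000-VM-per-cluster pDRS limit.
PDRS_VM_LIMIT = 4000

print(PDRS_VM_LIMIT / 64)  # 62.5  VMs/host at the 64-host cluster maximum
print(PDRS_VM_LIMIT / 16)  # 250.0 VMs/host in a typical 16-host cluster
```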
With the release of vRealize Operations 6.5, those scalability limits have been removed and pDRS is officially supported up to the cluster maximums. You can read more about this in the release notes.
Conclusion
Hopefully, this post helps address some of the more common questions & concerns you may have about DRS & pDRS. If you have more questions, I’d love to answer them or even continue this series with more posts, so please leave questions and comments down below.