Home > Blogs > VMware VROOM! Blog > Tag Archives: DRS

Tag Archives: DRS

vCenter performance improvements from vSphere 6.5 to 6.7: What does 2x mean?

In a recent blog, the VMware vSphere team shared the following performance improvements in vSphere 6.7 vs. 6.5:

Moreover, with vSphere 6.7 vCSA delivers phenomenal performance improvements (all metrics compared at cluster scale limits, versus vSphere 6.5):
2X faster performance in vCenter operations per second
3X reduction in memory usage
3X faster DRS-related operations (e.g. power-on virtual machine)

As senior engineers within the VMware Performance and vSphere teams, we are writing this blog to provide more details regarding these numbers and to explain how we measured them. We also briefly explain some of the technical details behind these improvements.

Cluster Scale

Let us first explain what all metrics compared at cluster scale limits means. What is cluster scale? Here, it is an environment that includes a vCenter server that is configured for the largest vSphere cluster that VMware currently supports, namely 64 hosts and 8,000 powered-on VMs. This setup represents a high-consolidation environment with 125 VMs per host. Note that this is different from the setup used in our previous blog about vCenter 6.5 performance improvements. The setup in that blog was our datacenter scale environment, which used the largest number of supported hosts (2000) and VMs (25,000), so the numbers from that blog should not be compared to these numbers.

2x and 3x

Let us now explain some of the performance numbers quoted. We produced the numbers by measuring workload runs in our cluster scale setup.

2x vCenter Operations Per Second, vSphere 6.7 vs. 6.5, cluster scale limits. We measure operations per second using an internal benchmark called vcbench. We describe vcbench below under “Benchmark Details.” One of the outputs of this workload is management operations (for example, clone, powerOn, vMotion) performed per second.

  • In 6.5, vCenter was capable of performing approximately 8.3 vcbench operations per second (described below under “Benchmark Details”) in the cluster-scale testbed.
  • In 6.7, vCenter is now capable of performing approximately 16.7 vcbench operations per second.

3x reduction in memory usage. In addition to our vcbench workload, we also include a simplified workload that simply executes a standard workflow: create a VM, power it on, power it off, and delete it. The rapid powerOn and powerOff of VMs in this setup puts more load on the DRS subsystem than the typical vcbench test.

  • In 6.5, the core vCenter process (vpxd) used on average about 10 GB to complete the workflow benchmark (described below under “Benchmark Details”).
  • In 6.7, the core vCenter process used approximately 3 GB to complete this run, while also achieving higher churn (that is, more workflow ‘create/powerOn/powerOff/delete’ cycles completed within the same time period).

3x faster DRS-related operations. In our vcbench workload, we measure not just the overall operations per second, but also the average latencies of individual operations like powerOn (which exercises the majority of the DRS software stack). We issue many concurrent operations, and we measure latency under such load.

  • In 6.5, the average latency of an individual powerOn during a vcbench run was 9.5 seconds.
  • In 6.7, the average latency of an individual powerOn during a vcbench run was 2.8 seconds.

The latencies above reflect the fact that a cluster has 8,000 VMs and many operations in flight at once. As a result, individual operations are slower than if they were simply run on a single-host, single-VM environment.

What does this mean to you as a customer?

As a result of these improvements, customers in high-consolidation environments will see reduced resource consumption due to DRS and reduced latency to generate VMotions for load balancing. Customers will also see faster initial placement of VMs.

Brief Deep Dive into Improvements

Before we describe the improvements, let us first briefly explain how DRS works, at a very high level.

When powering on a VM, vCenter must determine where to place the VM. This is called initial placement. Many subsystems, including DRS and policy management, must be consulted to determine valid hosts on which this VM can run. This phase is called constraint check. Once DRS determines the host on which a VM should be powered on, it registers the VM onto that host and issues the powerOn. This initial placement requires a snapshot of the inventory: by snapshot, we mean that DRS records the current configuration of hosts and VMs in the cluster.

In addition to balancing during initial placement, every 5 minutes, DRS re-examines the load of the cluster and performs a series of computations to generate VMotions that help balance the load across hosts. This phase is called periodic rebalancing. Periodic rebalancing requires an examination of the historical utilization statistics for each host and VM (for example, over the previous hour) in order to determine proper placement.

Finally, as VMs get moved around, the used capacity in resource pools changes. The vCenter server periodically exchanges messages called SpecSyncs with each host to push down the most recent resource pool configuration. The SpecSync operation requires traversing a host’s resource pool structure and changing it to make sure it matches vCenter’s configuration.

With this understanding in mind, let us now give some technical details behind the improvements above. As with our previous blog about vCenter performance improvement, we describe changes in terms of rocks (that is, somewhat large changes to entire subsystems) and pebbles (smaller individual changes throughout the code base).

Rocks

The three main rocks that we address in 6.7 are simplified initial placement, faster resource divvying, and faster SpecSyncs.

Simplified initial placement. As mentioned above, initial placement relies on a snapshot of the current state of the inventory. Creating this snapshot can be a heavyweight operation, requiring a number of data copies and locking of host and cluster data structures to ensure a consistent view of the data. In 6.7, we moved to a lightweight online approach that keeps the state up-to-date in a continuous manner, avoiding snapshots altogether. With this approach, we significantly reduce locking demands and significantly reduce the number of times we need to copy data. In some highly-contended clusters, this reduced the initial placement time from seconds down to milliseconds.

Faster (and more frequent) resource divvying. Divvying is the act of determining the resource allocations for each VM. Every five minutes, the state of the cluster is examined and both divvying and then rebalancing (using VMotion) are performed. To make the divvying phase faster, we performed a number of optimizations.

  • We changed the approach to examining historical usage statistics. Instead of storing metrics for every VM and every host over an hour, we aggregated the data, which allowed us to store a smaller number of metrics per host. This dramatically reduced memory usage and simplified the computation of the desired load for each host.
  • We restructured the code to remove compatibility checks (for example, those that help to determine which VMs can run on which hosts) during this divvying phase. In 6.5 and earlier, divvying a load also involved various host/VM compatibility calculations. Now, we store the compatible matrix and update it when compatibility changes, which is typically infrequent.
  • We have also done significant code refactoring (described below under “Pebbles”) to this code path.

By implementing these changes and making divvying faster in 6.7, we are now able to perform divvying more frequently: once per minute instead of once every five minutes. As a result, resources flow more quickly between resource pools, and we are better able to enforce fairness guarantees in DRS clusters.

Note that periodic load balancing (through VMotion) still occurs every five minutes.

Faster SpecSyncs. To perform a SpecSync, vCenter sends a resource pool configuration to a host. The host must examine this configuration and create a list of changes required to bring that host in sync with vCenter. In 6.5 and earlier, depending on the number of VMs, creating this list of changes could result in hundreds of operations on a host, and the runtime was highly variable. In 6.7, we made this computation more deterministic, reducing the number of operations and lowering the latency appropriately.

Pebbles

In addition to the changes above, we also performed a number of optimizations throughout our code base.

Code Refactoring. In 6.5 and before, admission control decisions were made by multiple independent subsystems within vCenter (for example, DRS would be responsible for some decisions, and HA would make others). In 6.7, we simplified our code such that all admission control decisions are handled by a module within DRS. Reducing multiple copies of this code simplifies debugging and reduces resource usage.

Finer-grained locks. In 6.7, we continued to make strides in reducing the scope of our locks. We introduced finer-grained locks so that DRS would not have to lock an entire VM to examine certain pieces of state. We made similar improvements to both hosts and clusters.

Removal of unnecessary classes, maps, and sets. In refactoring our code, we were able to remove a number of classes and thereby reduce the number of copies of data in our system. The maps and sets that were needed to support these classes could also be removed.

Preferring integers over strings. In many situations, we replaced strings and string comparisons with integers and integer comparisons. This dramatically reduces memory allocation overhead and speeds up comparisons, reducing CPU.

Benchmark Details

We measure Operations Per Second (OPS) using a VMware benchmark called vcbench. (For more details about vcbench, see “Benchmarking Details” in vCenter 6.5 Performance: what does 6x mean?) Briefly, vcbench is a java-based application that takes as an input a runlist, which is a list of management operations to perform. These operations include powering on VMs, cloning VMs, reconfiguring VMs, and migrating VMs, among many other operations. We chose the operations based on an analysis of typical customer management scenarios. The vcbench application uses vSphere APIs to issue these operations to vCenter. The benchmark creates multiple threads and issues operations on those threads in parallel (up to 32). We measure the operations per second by taking the total number of operations performed in a given timeframe (say, 1 hour), and dividing it by the time interval.

Our workflow benchmark is very similar to vcbench. The main difference is that more operations are issued per host at a time. In vcbench, one operation is issued per host at a time, while in workflow, up to 8 operations are issued per host at a time.

In many cases, the size of the VM has an impact on operational latency. For example, a VM with a lot of memory (say, 32 GB) and large disks (say, 100 GB) will take longer to clone because more memory will need to be snapshotted, and more disk data will need to be copied. To minimize the impact of the disk subsystem in our measurements, we use small VMs (<4GB memory, < 8GB disk).

Because we limit ourselves to 32 threads per vCenter in this single-cluster setup, throughput numbers are smaller than for our datacenter-at-scale setups (2,000 hosts; 25,000 VMs), which use up to 256 concurrent threads per vCenter.

Summary

In this blog, we have described some of the performance improvements in vCenter from 6.5 to 6.7. A variety of improvements to DRS have led to improved throughput and reduced resource usage for our vcbench workload in a cluster scale setup of 64 hosts and 8,000 powered-on VMs. Many of these changes also apply to larger datacenter-scale setups, although the scope of improvement may not be as pronounced.

Acknowledgments

The vCenter improvements described in this blog are the results of thousands of person-hours from vCenter developers, performance engineers, and others throughout VMware. We are deeply grateful to them for making this happen.

Authors

Zhelong Pan is a senior staff engineer in the Distributed Resource Management Team at VMware. He works on cluster management, including shared resource allocation, VM placement, and load balancing. He is interested in performance optimizations, including virtualization performance and management software performance. He has been at VMware since 2006.

Ravi Soundararajan is a principal engineer in the Performance Group at VMware. He works on vCenter performance and scalability, from the UI to the server to the database to the hypervisor management agents. He has been at VMware since 2003, and he has presented on the topic of vCenter Performance at VMworld from 2013-2017. His Twitter handle is @vCenterPerfGuy.

What-If? Resource Management with vSphere DRS

vSphere Distributed Resource Scheduler (DRS) provides a simple and easy way to manage your cluster resources. DRS works well, out of the box for most vSphere installations.

For cases where more flexibility is desired in how the cluster is managed, DRS provides many options in the form of cluster rules, settings and advanced options.

Often the impact of using rules in a DRS cluster is not very well understood. The settings and advanced options are not very well documented. Imagine if it was possible to play around with rules in your cluster before actually applying them, or changing the DRS migration threshold in your cluster without changing the setting in your live cluster – and yet, be able to visualize the impact of those actions in your cluster?

Introducing – DRS Dump Insight – to help with simple queries regarding DRS behavior, like the following.

  • What if I dropped all the affinity rules in my cluster?
  • What if I set cluster advanced option “AggressiveCPUActive”?
  • What if I changed the DRS migration threshold from 3 to 5?

Continue reading

Introducing DRS DumpInsight

In an effort to provide a more insightful user experience and to help understand how vSphere DRS works, we recently released a fling: DRS Dump Insight.

DRS Dump Insight is a service portal where users can upload drmdump files and it provides a summary of the DRS run, with a breakup of all the possible moves along with the changes in ESX hosts resource consumption before and after DRS run.

Users can get answers to questions like:

  • Why did DRS make a certain recommendation?
  • Why is DRS not making any recommendations to balance my cluster?
  • What recommendations did DRS drop due to cost/benefit analysis?
  • Can I get all the recommendations made by DRS?

Continue reading

DRS Lens – A new UI dashboard for DRS

DRS Lens provides an alternative UI for a DRS enabled cluster. It gives a simple, yet powerful interface to monitor the cluster real time and provide useful analyses to the users. The UI is comprised of different dashboards in the form of tabs for each cluster being monitored.

Continue reading

vSphere 6.5 DRS Performance – A new white-paper

VMware recently announced the general availability of vSphere 6.5. Among the many new features in this release are some DRS specific ones like predictive DRS, and network-aware DRS. In vSphere 6.5, DRS also comes with a host of performance improvements like the all-new VM initial placement and the faster and more effective maintenance mode operation.

If you want to learn more about them, we published a new white-paper on the new features and performance improvements of DRS in vSphere 6.5. Here are some highlights from the paper:

 

65wp-blog-3

 

65wp-blog-2

Expandable Reservation for Resource Pools

One of the questions I was often asked about resource pools (RP) is ‘Expandable reservation’. What is expandable reservation, and why should I care about it? Although it sounds intuitive, it can be easily misunderstood.

To put it simply, a resource pool with ‘expandable reservation’ can expand its reservation by asking more resources from its parent .

The need to expand reservation comes from the increase in reservation demand of its child objects (VMs or resource pools). If the parent resource pool is short of resources, then the parent expands it reservation asking resources from the grand parent.

Let us try to understand this with a simple example. Consider the following RP hierarchy. If RP-4 has to expand its reservation, it requests resources from its parent RP-3 and if RP-3 has to expand resources it eventually requests Root-RP.

exp-res-7

Continue reading

Latency Sensitive VMs and vSphere DRS

Some applications are inherently highly latency sensitive, and cannot afford long vMotion times. VMs running such applications are termed as being ‘Latency Sensitive’. These VMs consume resources very actively, so vMotion of such VMs is often a slow process. Such VMs require special care during cluster load balancing, due to their latency sensitivity.

You can tag a VM as latency sensitive, by setting the VM option through the vSphere web client as shown below (VM → Edit Settings → VM Options → Advanced)

edit-vmsettings-2
By default, the latency sensitivity value of a VM is set to ‘normal’. Changing it to ‘high’ will make the VM ‘Latency Sensitive’. There are other levels like ‘medium’ and ‘low’ which are experimental right now. Once the value is set to high, 100% of the VM configured memory should be reserved. It is also recommended to reserve 100% of its CPU. This white paper talks more about the VM latency sensitivity feature in vSphere.

Continue reading

Understanding vSphere DRS Performance – A White Paper

VMware vSphere Distributed Resource Scheduler (DRS) is responsible for placement of Virtual Machines and balancing of resources in a cluster. The key driver for DRS is VM/Application happiness, and it achieves this by effective VM placement and efficient load balancing. We have a new white paper, which tries to explain how DRS works in basic scenarios and how it can be tuned to behave differently for specific scenarios.

The white paper talks about the factors that influence DRS decisions and provides some useful insights into different parameters that can be tuned in specific scenarios to make DRS more effective. It also explains how to monitor DRS to better understand its behavior.

It covers DRS behavior in specific scenarios with some case studies. Some of these studies are around

  •  VM Consumed vs. Active Memory – How it impacts DRS behavior.
  •  Impact of VM overrides on cluster balance.
  •  Prerequisite moves during initial placement.
  •  Using shares to prioritize cluster resources.

The paper provides knowledge about the factors that affect DRS behavior and helps understand how DRS does what it does. This knowledge, along with monitoring and troubleshooting tips, including real case studies, will help tune DRS clusters for optimum performance.

DRS Doctor is here to diagnose your DRS clusters

Mystery revealed, DRS for VMware vSphere is no more a black box! DRS Doctor will tell you all you need to know about your DRS clusters.

Our latest fling, DRS Doctor, will monitor your DRS clusters for virtual machine and host resource usage data, DRS-recommended migrations, and the reason behind each migration. It also monitors all the cluster-related events, tasks, and cluster balance, and logs all this information into a plain text log file that anyone can read.

Read this blog for more information on how DRS Doctor can monitor and diagnose your clusters.

Download DRS Doctor from our flings site.

Performance Best Practices for VMware vSphere 5.0

A new version of Performance Best Practices for vSphere is now available.  This is a book designed to help system administrators obtain the best performance from vSphere deployments.

We've addressed many of the new features in vSphere 5.0 from a performance perspective.  These include:

  • Storage Distributed Resource Scheduler (Storage DRS), which performs automatic storage I/O load balancing
  • Virtual NUMA, allowing guests to make efficient use of hardware NUMA architecture
  • Memory compression, which can reduce the need for host-level swapping
  • Swap to host cache, which can dramatically reduce the impact of host-level swapping
  • SplitRx mode, which improves network performance for certain workloads
  • VMX swap, which reduces per-VM memory reservation
  • Multiple vMotion vmknics, allowing for more and faster vMotion operations

We've also significantly updated and expanded many of the topics we've covered in previous editions of the book.  These include:

  • Choosing hardware for a vSphere deployment
  • Power management
  • Configuring ESXi for best performance
  • Guest operating system performance
  • vCenter and vCenter database performance
  • vMotion and Storage vMotion performance
  • Distributed Resource Scheduler (DRS) and Distributed Power Management (DPM) performance
  • High Availability (HA), Fault Tolerance (FT), and VMware vCenter Update Manager performance

The book can be found at: Performance Best Practices for VMware vSphere 5.0.