
ESX scheduler support for SMP VMs: co-scheduling and more

ESX supports virtual machines configured with multiple virtual CPUs (for example, ESX 3.x supports up to
4 vCPUs). Handling mixed loads of uniprocessor and multiprocessor VMs can be challenging for a
scheduler to get right. This article answers some common questions about deploying multiprocessor VMs,
and describes the algorithms used by the ESX scheduler to provide both high performance and fairness.

When considering multiprocessor VMs, the following questions naturally arise for ESX users:

a) When should I configure multiple vCPUs for a VM?
b) What are the overheads of using multiprocessor VMs? What would I lose by overprovisioning vCPUs
     for VMs?
c) Does the ESX scheduler schedule (co-schedule) all of the vCPUs belonging to a VM together?
d) Why is co-scheduling necessary and important?
e) How does the ESX scheduler deal with some vCPUs belonging to a VM idling while others actively
     perform work? Do the idle vCPUs unnecessarily burn CPU?

Let’s answer these questions briefly:

a) It makes sense to configure multiple vCPUs for a VM when:
    1. The application you intend to run within the VM is multi-threaded (Apache Web Server, MS
         Exchange 2007, etc.) and its threads can actually make good use of the additional
         processors (multiple threads can be active and running at the same time).
    2. Multiple single-threaded applications are intended to run simultaneously within the VM.

     Running a single single-threaded application within a multiprocessor VM will not improve the
     performance of that application, since only one vCPU will be in use at any given time. Configuring
     additional vCPUs in such a case is unnecessary.

b) It’s best to configure only as many virtual CPUs as the application needs to handle its load. In other
    words, don’t overprovision vCPUs if they aren’t needed for additional application performance.

    Virtual machines configured with virtual CPUs that are not used still impose resource
    requirements on the ESX Server. In some guest operating systems, the unused virtual CPUs still
    take timer interrupts, which consumes a small amount of additional CPU. Please refer to KB
    articles 1077 and 1730.

c) For scheduling a VM with multiple vCPUs, ESX 2.x used a technique known as ‘strict co-scheduling’.
    With strict co-scheduling, the scheduler keeps track of a "skew" value for each vCPU. A vCPU’s skew
    increases if it is not making progress (neither running nor idling) while at least one of its sibling
    vCPUs is making progress.

   When the skew for any vCPU in a VM exceeds a threshold, the entire VM is descheduled. The VM is
   rescheduled only when enough physical processors are available to accommodate all of the VM’s vCPUs.
   This can lead to CPU ‘fragmentation’, especially on a system with fewer cores running a mix of UP and
   SMP VMs, resulting in relatively lower overall system utilization. As an example, consider a two-core
   system running a single UP VM and a single two-vCPU SMP VM. While the vCPU belonging to the UP VM
   is scheduled, the other physical processor cannot be used to execute just one of the SMP VM’s two
   vCPUs, so that physical CPU idles for that length of time.

   This algorithm was improved to a ‘relaxed co-scheduling’ scheme in ESX 3.x, wherein only the vCPUs
   that are skewed need to be co-scheduled, even when fewer physical processors are available than the
   VM has vCPUs. This increases the number of scheduling opportunities available to the scheduler and
   hence improves overall system throughput. Relaxed co-scheduling significantly reduces the possibility
   of co-scheduling fragmentation, improving overall processor utilization. The sketch below contrasts
   the two policies.
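
   To make the difference concrete, here is a minimal Python sketch of the two co-start decisions. It is
   purely illustrative and is not VMware’s implementation: the VCpu/Vm classes, the function names, and
   the SKEW_THRESHOLD_MS value are hypothetical stand-ins for internal ESX state.

    from dataclasses import dataclass, field

    # Hypothetical threshold, purely for illustration; the real value is internal to ESX.
    SKEW_THRESHOLD_MS = 1500


    @dataclass
    class VCpu:
        name: str
        skew: float = 0.0        # accumulated execution-time skew, in milliseconds


    @dataclass
    class Vm:
        vcpus: list = field(default_factory=list)


    def strict_costart_set(vm):
        """Strict co-scheduling (ESX 2.x): if any vCPU's skew exceeds the threshold,
        the entire VM is descheduled, and it can only be co-started again when
        enough physical CPUs are free for all of its vCPUs at once."""
        if any(v.skew > SKEW_THRESHOLD_MS for v in vm.vcpus):
            return list(vm.vcpus)                 # all vCPUs must start together
        return []


    def relaxed_costart_set(vm):
        """Relaxed co-scheduling (ESX 3.x): only the vCPUs that have actually fallen
        behind must be co-started; the others run whenever a physical CPU is free."""
        return [v for v in vm.vcpus if v.skew > SKEW_THRESHOLD_MS]

   For a four-vCPU VM in which only one vCPU has crossed the threshold, strict_costart_set requires four
   free physical CPUs before the VM can run again, while relaxed_costart_set requires just one.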

d) Briefly, co-scheduling (keeping the skew between the vCPUs’ execution times within reasonable
    limits) is necessary so that both the guest operating system and the applications within it run
    correctly and with good performance. Significant skew between the vCPUs of a VM can
    result in both severe performance and correctness issues.

    As an example, guest operating systems make use of spin locks for synchronization. If the vCPU
    currently holding a lock is descheduled, then the other vCPUs belonging to the VM will burn cycles
    busy-waiting until the lock is released. Similar performance problems can also show up in
    multi-threaded user applications, which may perform their own synchronization. In the worst case,
    significant skew between the vCPUs of a VM can cause correctness issues such as Windows BSODs or
    Linux kernel panics.
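
    To see why a descheduled lock holder hurts, consider this deliberately simplified spinlock sketch in
    Python (a real guest kernel uses an atomic test-and-set instruction; only the busy-wait loop matters
    for this discussion):

    class SpinLock:
        """Deliberately simplified guest-style spinlock; a real kernel uses an
        atomic test-and-set instruction, but only the busy-wait loop matters here."""

        def __init__(self):
            self.held = False

        def acquire(self):
            # If the vCPU running the current lock holder has been descheduled by
            # the hypervisor, every other vCPU spinning here burns its entire
            # timeslice without making any progress.
            while self.held:
                pass
            self.held = True

        def release(self):
            self.held = False

    Every cycle a running vCPU spends in that while loop is wasted whenever the holder’s vCPU is not
    currently scheduled on a physical CPU.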

e) Idle vCPUs, vCPUs on which the guest is executing the idle loop, are detected by ESX and descheduled
    so that they free up a processor that can be productively utilized by some other active vCPU.
    Descheduled idle vCPUs are considered to be making progress in the skew detection algorithm. As a
    result, for co-scheduling decisions, idle vCPUs do not accumulate skew and are treated as if they were
    running. This optimization ensures that idle guest vCPUs don’t waste physical processor resources,
    which can instead be allocated to other VMs. For example, an ESX Server with two physical cores may
    run one vCPU each from two different VMs, if their sibling vCPUs are idling, without incurring
    any co-scheduling overhead. Similarly, in the fragmentation example above, if one of the SMP VM’s
    vCPUs is idling, then there is no co-scheduling fragmentation, since its sibling vCPU can be
    scheduled concurrently with the UP VM’s vCPU.
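
    Here is a small sketch of this skew-accounting idea, again with hypothetical names and states
    ("running", "idle", "ready") rather than ESX’s actual internals:

    from dataclasses import dataclass

    @dataclass
    class VCpu:
        name: str
        state: str = "ready"     # "running", "idle" (guest idle loop), or "ready" (waiting for a pCPU)
        skew: float = 0.0


    def accumulate_skew(vcpus, delta_ms):
        """One accounting interval: a vCPU accrues skew only if it is ready (wants
        to run but has no physical CPU) while at least one sibling is making
        progress. Both running and idle vCPUs count as making progress, so an
        idle vCPU never accrues skew and never forces a co-stop of the VM."""
        sibling_progress = any(v.state in ("running", "idle") for v in vcpus)
        if not sibling_progress:
            return
        for v in vcpus:
            if v.state == "ready":
                v.skew += delta_ms

    Because both running and idle vCPUs count as making progress, only vCPUs that are ready but starved
    of a physical CPU accumulate skew.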

To summarize, the ESX scheduler supports SMP VMs with both high performance and fairness. ESX
users should leverage this SMP support to improve the performance of their applications by
configuring only as many vCPUs for a VM as the application load really needs.

For a broader technical overview of the ESX co-scheduling algorithms described above, please also refer to
the “Co-scheduling SMP VMs in VMware ESX Server” blog post.

Performance and scalability of virtualized Microsoft Exchange 2003 on VI 3

Many of our customers have already virtualized Microsoft Exchange 2003 on VMware ESX Server 3.  For customers who are considering virtualizing Exchange and want to know what to expect in terms of performance, we’ve published a whitepaper on the performance of Exchange in a virtual environment: http://www.vmware.com/pdf/Virtualizing_Exchange2003.pdf

The paper presents the results of a joint study with Dell that examined the performance implications of a virtualized Exchange environment. Specifically, we looked at:

  • The performance implications of running Exchange Server 2003 on a virtual machine versus a physical system.
  • The performance of Exchange Server 2003 in virtual machine configurations when “scaling-up” (adding more processors to a machine) and “scaling-out” (adding more machines).

The details of the configurations and results of the above experiments are documented in the white paper.

To briefly summarize, the results from the study indicate that, on a Dell PowerEdge 6850 server configured with four 2.66GHz dual-core Intel Xeon 7020 processors and 16GB of RAM:

  • A uniprocessor virtual machine can support up to 1,300 Heavy Exchange users.
  • Consolidating multiple instances of these uniprocessor Exchange virtual machines can cumulatively support up to 4,000 Heavy users while still providing acceptable performance and scaling.
  • Uniprocessor virtual machines are, from a performance perspective, equivalent to half as many multiprocessor (two virtual processors) virtual machines.

Performance Tuning Guide for ESX 3

Optimizing ESX’s performance is one of the primary tasks of a system administrator. One wants to make the best use of what ESX can
offer, not only in terms of its features but also their associated performance. Over time a number of customers have asked us for a single, comprehensive ESX performance tuning guide that covers its CPU, memory, storage, networking, and resource management (including DRS) optimizations. Finally, we have the Performance Tuning Best Practices for ESX Server 3 guide.

As indicated above, this paper provides a list of performance tips that cover the most performance-critical areas of Virtual Infrastructure 3 (VI3). The paper assumes that one has deployed ESX, has a decent working knowledge of both ESX and its virtualization concepts, and is now looking to optimize its performance.

Some customers will want to carefully benchmark their ESX
installations as a way to validate their configurations and determine their
sizing requirements. To help such customers with a systematic
benchmarking methodology for their virtualized workloads, we’ve added a
section to the paper called "Benchmarking Best Practices". It covers the
precautions that have to be taken and the things to keep in mind during such
benchmarking. We’ve already published similar benchmarking guidelines for our
hosted products in the whitepaper Performance Benchmarking Guidelines for VMware Workstation 5.5.

The strength of the paper is that it succinctly
(in 22 pages) captures the performance best practices and benchmarking tips
associated with the key components. Note that the document does not delve into the
architecture of ESX, nor does it provide specific performance data for the discussions.
It also doesn’t cover sizing guidelines or tuning tips for specific
applications running on ESX.

All of us from the Performance team hope you find the document useful.