VMware vSphere™ 4: The CPU Scheduler in VMware® ESX™ 4

VMware recently published a whitepaper that discusses changes to the CPU scheduler in ESX 4. The paper also describes a few key CPU scheduler concepts that are useful for understanding scheduler-related performance issues. Specifically, it attempts to answer the following questions:

How is CPU time allocated between virtual machines? How well does it work?
What is the difference between “strict” and “relaxed” co-scheduling? What is the performance impact of recent co-scheduling improvements?
What is the “CPU scheduler cell”? What happened to the scheduler cell in ESX 4?
How does the ESX scheduler exploit underlying CPU architecture features such as multi-core, Hyper-Threading, and NUMA?

The following is a brief summary of the paper:

ESX 4 introduces many improvements to the CPU scheduler, including further relaxed co-scheduling, lower lock contention, and multi-core-aware load balancing. Co-scheduling overhead has been further reduced by measuring the co-scheduling skew accurately and by allowing more scheduling choices. Lower lock contention is achieved by replacing the scheduler cell lock with finer-grained locks. With the scheduler cell eliminated, a virtual machine can get higher aggregate cache capacity and memory bandwidth. Lastly, multi-core-aware load balancing achieves high CPU utilization while minimizing the cost of migrations.
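To make the idea of co-scheduling skew more concrete, here is a minimal Python sketch. It is a toy model, not the ESX implementation: it assumes skew is measured per vCPU as the gap in progress relative to the slowest sibling vCPU, and the function names and the 3 ms threshold are hypothetical values chosen for illustration. The paper is the authoritative description of how ESX 4 actually measures skew.

```python
from typing import Dict, List


def skew_per_vcpu(progress_ms: Dict[str, float]) -> Dict[str, float]:
    """Return how far each vCPU is ahead of the slowest sibling (assumed definition)."""
    slowest = min(progress_ms.values())
    return {vcpu: progress - slowest for vcpu, progress in progress_ms.items()}


def vcpus_over_threshold(progress_ms: Dict[str, float],
                         threshold_ms: float = 3.0) -> List[str]:
    """List the vCPUs whose skew exceeds the (hypothetical) threshold.

    In this toy model only these vCPUs would be held back; the others
    remain free to run, which is the sense in which scheduling is "relaxed".
    """
    skew = skew_per_vcpu(progress_ms)
    return sorted(vcpu for vcpu, s in skew.items() if s > threshold_ms)


# Example: vcpu2 has run 5 ms further than the slowest vCPU.
progress = {"vcpu0": 10.0, "vcpu1": 10.5, "vcpu2": 15.0}
ahead = vcpus_over_threshold(progress)
```

In this example only `vcpu2` exceeds the threshold, so a per-vCPU policy would leave `vcpu0` and `vcpu1` schedulable instead of stopping the whole virtual machine.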

Experimental results show that the ESX 4 CPU scheduler faithfully allocates CPU resources as specified by users. While retaining the benefits of a proportional-share algorithm, the improvements to the co-scheduling and load-balancing algorithms are shown to benefit performance. Compared to ESX 3.5, ESX 4 significantly improves performance on both lightly loaded and heavily loaded systems.
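As a rough illustration of what "proportional-share" means, the sketch below divides a fixed CPU capacity among virtual machines in proportion to their share values. It is a simplified model under stated assumptions: the function name, the VM names, and the capacity figure are hypothetical, and it ignores reservations, limits, and actual demand, all of which a real scheduler has to take into account.

```python
from typing import Dict


def proportional_entitlement(shares: Dict[str, int],
                             capacity_mhz: float) -> Dict[str, float]:
    """Split capacity among VMs in proportion to their shares (simplified model)."""
    total_shares = sum(shares.values())
    if total_shares <= 0:
        raise ValueError("total shares must be positive")
    return {vm: capacity_mhz * s / total_shares for vm, s in shares.items()}


# Example: vm-b holds twice the shares of vm-a and vm-c (hypothetical values).
entitlement = proportional_entitlement(
    {"vm-a": 1000, "vm-b": 2000, "vm-c": 1000},
    capacity_mhz=8000.0,
)
```

With these inputs, `vm-b` is entitled to half of the capacity and the other two virtual machines to a quarter each. Only the ratio between share values matters in this model, not their absolute size.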

For more details, please download and read the full paper from here.