SIOC: I/O Distribution with Reservations & Limits – Part 1

The mClock scheduler was introduced with vSphere 5.5 Storage I/O Control (SIOC) and laid the foundation for new capabilities for scheduling storage resources.  vSphere 6.0 expands upon these capabilities and adds the ability to reserve IOPS, providing even more flexibility and control when delivering storage services to virtual machines.  However, this new capability introduces new questions about how resources are managed and allocated during periods of storage contention.

There are now two basic circumstances that can occur when there is storage contention.  The first is when the array can satisfy the reservations configured on the VMs, and the second is when the configured reservations exceed what the array is capable of delivering.  Let’s take a look at the two scenarios in more detail.

Scenario 1
In this scenario, we have a shared storage array that is capable of delivering 8000 IOPS and there are four VMs accessing the storage.  The configuration looks like the following:

[Figure: four VMs sharing a storage array capable of delivering 8000 IOPS]

In this example, the array is capable of delivering all the reservations that are configured. As long as all the reservations are satisfied, the mClock scheduler uses the shares values to determine how the resources should be distributed when the VMs are contending for resources.  Note: This scenario assumes that the applications within the VMs generate enough outstanding I/O to achieve the maximum IOPS possible.

The first step is to determine what percentage of the resource pool each VM will receive.  In this example there are 5000 shares in total, and the entitlement percentages are distributed as follows:

VM1: 20% (1000/5000)
VM2: 50% (2500/5000)
VM3: 10% (500/5000)
VM4: 20% (1000/5000)

Next, use the following formula to calculate the maximum IOPS for each VM:
(Array Capacity) * (% of Entitled Resources)

This results in the following distribution:

VM1: 20% x 8000 = 1600 IOPS
VM2: 50% x 8000 = 4000 IOPS
VM3: 10% x 8000 = 800 IOPS
VM4: 20% x 8000 = 1600 IOPS
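
The arithmetic above can be reproduced with a short Python sketch. It only models the simplified formula from this article and is not the mClock implementation; the function name `distribute_by_shares` and the variable names are hypothetical, and the share values are the ones implied by the percentages listed earlier.

```python
def distribute_by_shares(capacity_iops, shares):
    """Split capacity_iops across VMs in proportion to their shares.

    Illustrative helper only: it applies
    (Array Capacity) * (% of Entitled Resources) for each VM.
    """
    total_shares = sum(shares.values())
    return {vm: capacity_iops * s / total_shares for vm, s in shares.items()}


# Share values implied by the percentages in Scenario 1 (5000 shares in total).
shares = {"VM1": 1000, "VM2": 2500, "VM3": 500, "VM4": 1000}

allocation = distribute_by_shares(8000, shares)
for vm, iops in allocation.items():
    print(f"{vm}: {iops:.0f} IOPS")
```

Running it prints 1600, 4000, 800, and 1600 IOPS for VM1 through VM4, matching the list above.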

Scenario 2
The second scenario is when the configured reservations exceed what the array is capable of delivering.  In this example, the array has become congested and the maximum IOPS that can be provided has dropped to 500 IOPS.  In this situation the mClock scheduler will distribute the IOPS based on the percentage of the IOPS reservation rather than the configured shares.

[Figure: four VMs sharing a congested storage array that can deliver only 500 IOPS]

To figure out the maximum IOPS for each VM, the first step is to determine what percentage each VM will receive based on the configured reservation.  In this example there are 2000 total IOPS reserved, so the allocation would be as follows:

VM1: 20% (400/2000)
VM2: 40% (800/2000)
VM3: 20% (400/2000)
VM4: 20% (400/2000)

Since the array is only capable of delivering 500 IOPS, the percentages above are taken from that 500, which gives the following allocation:

VM1: 20% x 500 = 100 IOPS
VM2: 40% x 500 = 200 IOPS
VM3: 20% x 500 = 100 IOPS
VM4: 20% x 500 = 100 IOPS
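
The same calculation can be sketched in Python. This is again only a model of the simplified rule described in this article, not the mClock scheduler itself; the function name `distribute_by_reservation` is hypothetical, and the reservation values are the ones used in the percentages above.

```python
def distribute_by_reservation(capacity_iops, reservations):
    """Split capacity_iops across VMs in proportion to their IOPS reservations.

    Illustrative helper only: it models the case where the array cannot
    deliver the sum of all configured reservations.
    """
    total_reserved = sum(reservations.values())
    return {vm: capacity_iops * r / total_reserved for vm, r in reservations.items()}


# Reservations from Scenario 2 (2000 IOPS reserved in total).
reservations = {"VM1": 400, "VM2": 800, "VM3": 400, "VM4": 400}
capacity_iops = 500

# The reservation-based split applies only when the reservations exceed
# what the array can currently deliver.
reservations_exceed_capacity = sum(reservations.values()) > capacity_iops

allocation = distribute_by_reservation(capacity_iops, reservations)
for vm, iops in allocation.items():
    print(f"{vm}: {iops:.0f} IOPS")
```

With 2000 IOPS reserved against 500 IOPS available, the check is true and the sketch prints 100, 200, 100, and 100 IOPS for VM1 through VM4.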

I hope this helps to explain how the new mClock scheduler in vSphere 6.0 distributes the I/O for VMs.  This article is intended to introduce the concepts of storage resource scheduling and assumes that all VMs on a datastore are in contention at the same time.  In reality, however, not all VMs will be demanding resources at the same time.  A future article will dive into examples where only a portion of the VMs are contending for resources, and explain how the mClock scheduler handles the resource distribution in those scenarios.