vSphere 5.0 Storage Features Part 6 - Storage DRS - Balance On I/O Metrics
Another feature of Storage DRS is its ability to balance VMs across the datastores in a datastore cluster based on I/O metrics, specifically latency.
First, let us see how Storage DRS (SDRS) is able to capture this information.
SDRS uses Storage I/O Control (SIOC) to evaluate datastore capabilities and capture latency information for all the datastores in the datastore cluster. SIOC was first introduced in vSphere 4.1. Its purpose is to ensure that no single VM uses all the bandwidth of a particular datastore, and it achieves this by modifying the queue depth to the datastores on each ESX host.
In SDRS, its implementation is different. SIOC (on behalf of SDRS) checks the capabilities of the datastores in a datastore cluster by injecting various I/O loads. Once this information is normalized, SDRS will have a good indication of the types of workloads that a datastore can handle. This information is used in initial placement and load balancing decisions.
SDRS continuously uses SIOC to monitor how long it takes an I/O to do a round trip - this is the latency. This information about the datastore is passed back to Storage DRS. If the latency value for a particular datastore is above the threshold value (default 15ms) for a significant percentage of time over an observation period (default 16 hours), SDRS will try to rebalance the VMs across the datastores in the datastore cluster so that the latency value returns below the threshold. This may involve one or more Storage vMotion operations. In fact, even if SDRS is unable to bring the latency below the defined threshold value, it may still move VMs between datastores to balance the latency.
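To make the threshold check concrete, here is a minimal Python sketch of the idea. It is not VMware's actual algorithm. The 15ms threshold is the default mentioned above, but the 10% cut-off for "a significant percentage of time" is an assumption made purely for illustration, as are the function names and the idea of feeding in a flat list of latency samples gathered over the observation period.

```python
from typing import Sequence

DEFAULT_THRESHOLD_MS = 15.0  # default latency threshold mentioned in the post
ASSUMED_FRACTION = 0.10      # hypothetical stand-in for "a significant percentage of time"


def fraction_over_threshold(samples_ms: Sequence[float],
                            threshold_ms: float = DEFAULT_THRESHOLD_MS) -> float:
    """Return the fraction of latency samples strictly above the threshold."""
    if not samples_ms:
        return 0.0
    over = sum(1 for sample in samples_ms if sample > threshold_ms)
    return over / len(samples_ms)


def is_rebalance_candidate(samples_ms: Sequence[float],
                           threshold_ms: float = DEFAULT_THRESHOLD_MS,
                           min_fraction: float = ASSUMED_FRACTION) -> bool:
    """Flag a datastore whose latency exceeded the threshold often enough.

    samples_ms is assumed to cover one observation period (16 hours by default).
    """
    return fraction_over_threshold(samples_ms, threshold_ms) >= min_fraction


if __name__ == "__main__":
    busy_datastore = [5.0, 20.0, 30.0, 10.0]   # half the samples are above 15ms
    quiet_datastore = [5.0, 8.0, 12.0, 14.0]   # no sample is above 15ms
    print(is_rebalance_candidate(busy_datastore))
    print(is_rebalance_candidate(quiet_datastore))
```

A datastore flagged this way would only be a candidate for rebalancing. Which VMs to move, and to which datastore, is a separate decision that this sketch does not attempt to model.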
And since we now support Storage I/O Control on NFS in vSphere 5.0, we can also have NFS datastore clusters in SDRS.
If the datastore cluster is set to manual mode, SDRS will raise an alarm to bring to the administrator's attention that a recommendation has been made. By looking at the SDRS tab, the administrator can then see the recommendations made by SDRS in order to balance the I/O load. An example of a recommendation is shown here:
What is very cool about the recommendation is that it gives the administrator insight into what the latency measurements are on the source and destination datastores. The administrator can then refer to this information before deciding whether or not to migrate the VM.
Storage DRS provides customers with a way of automatically load-balancing their datastores, avoiding hot-spots on their storage.
I thought that the SDRS observation period was only 8, not 16 hours? Taken from the What's New in Storage Technical WP: "I/O load is evaluated by default every 8 hours."
Am I missing something or where is the typo? :)
Posted by: Steffen Oezcan | 07/26/2011 at 08:27 AM
Do you plan to add sub-vmdk balance?
Posted by: Attila Bognár | 07/26/2011 at 02:22 PM
Hi Steffen, SDRS needs at least 16 hours of statistics gathered before it will make its first recommendation. You are correct though - SDRS will check for I/O imbalance every 8 hours, but will look back over the last day's data before making a recommendation.
Posted by: Chogan | 07/28/2011 at 02:41 AM
Hi Attila, at this time, SDRS only does balancing at the VMDK level.
Posted by: Chogan | 07/28/2011 at 02:43 AM
Cool stuff, but it does not seem like it would work with tiered storage SANs such as Compellent.
Posted by: Jeff | 09/28/2011 at 11:40 AM
Hi Jeff, thanks for the comment.
Indeed, there are some considerations when using Storage DRS with tiered storage solutions. My colleague, Duncan Epping, has written some interesting posts around this. You can read them here http://www.yellow-bricks.com/2011/07/15/storage-drs-interoperability/ and here http://www.yellow-bricks.com/2011/08/05/sdrs-and-auto-tiering-solutions-the-injector/
Posted by: Chogan | 09/28/2011 at 01:55 PM