Virtual Machine I/O – Fairness versus Performance

Many of you will have read various articles related to queue depths, especially LUN/device queue depths, and how these can be tuned to provide different performance for your I/O. However, there are other queue settings internal to the VMkernel, which relate to how many I/Os a virtual machine can issue before it has to allow another virtual machine to send I/Os to the same LUN. What follows is some detail on these internal settings, and how they are used to achieve fairness and performance for virtual machine I/O.

Warning: These settings have already been pre-configured to allow virtual machines to perform optimally. There should be no reason to change them unless guided to do so by VMware Support Staff. This is all about performance versus fairness. Failure to follow this advice can give you some very fast virtual machines in your environment, but also some extremely slow ones.

The advanced setting Disk.SchedNumReqOutstanding represents the maximum number of I/Os that one individual virtual machine can issue all the way down to the LUN when more than one virtual machine is pushing I/O to the same LUN. That is the important point. When it is the only virtual machine issuing I/O, it can fill the device queue with I/O requests, because we obviously would not want to throttle it. But when it is sharing the LUN with other active virtual machines, this value throttles that I/O for fairness. The default value is 32, meaning each virtual machine can issue a maximum of 32 I/O requests when multiple virtual machines are actively issuing I/O to the same LUN.

The device queue depth may also be 32, but it can be set much higher, and a queue depth of 64 is common. Let’s say that in this example the device queue depth is 64. A single virtual machine can then have up to 64 I/Os outstanding. When a second virtual machine starts to issue I/O to the same LUN, Disk.SchedNumReqOutstanding halves the number of I/Os that the first virtual machine can send to the LUN (from 64 down to 32) in an effort to achieve fairness between all the virtual machines sharing the datastore.
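The rule described above can be sketched in a few lines of Python. This is only a toy model of the behavior as the article describes it, not VMkernel code: the function name `per_vm_io_limit` and its parameters are invented for illustration, and capping the shared limit at the device queue depth is an assumption.

```python
# Toy model of the per-VM I/O limit described in the article.
# Names are hypothetical; this is not how the VMkernel is implemented.

DISK_SCHED_NUM_REQ_OUTSTANDING = 32  # default value quoted in the article


def per_vm_io_limit(active_vms: int,
                    device_queue_depth: int,
                    sched_num_req_outstanding: int = DISK_SCHED_NUM_REQ_OUTSTANDING) -> int:
    """Return how many I/Os one VM may have outstanding to a LUN."""
    if active_vms < 1:
        raise ValueError("at least one VM must be issuing I/O")
    if active_vms == 1:
        # A lone VM is not throttled: it may fill the device queue.
        return device_queue_depth
    # Several active VMs share the LUN: apply the fairness limit.
    # Assumption: the limit can never exceed the device queue depth.
    return min(sched_num_req_outstanding, device_queue_depth)


if __name__ == "__main__":
    print(per_vm_io_limit(active_vms=1, device_queue_depth=64))
    print(per_vm_io_limit(active_vms=2, device_queue_depth=64))
```

With the article's example of a device queue depth of 64, the first call models the single virtual machine (64 I/Os) and the second models the moment a second virtual machine becomes active (32 I/Os).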

When multiple virtual machines share the same LUN, there may be occasions where we want one of them to send more I/Os than the number defined in Disk.SchedNumReqOutstanding. For example, in the case of a sequential I/O pattern, we might allow a number of additional I/Os to complete. The reason is that performance suffers if the storage has to seek back to the same location to continue the sequential I/O pattern when this virtual machine’s I/O is next serviced.

The parameter Disk.SchedQuantum represents the maximum number of consecutive “sequential” I/Os allowed from a virtual machine before we force a switch to service the I/Os from another virtual machine. Its default value is 8. Therefore, during a “sequential” I/O operation, we may allow a virtual machine to issue its 32 I/Os as per Disk.SchedNumReqOutstanding and then an additional 8 I/Os as per Disk.SchedQuantum. This improves the all-round performance of virtual machines, especially when they are doing sequential I/O.
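The arithmetic in that paragraph can be written out as a small Python sketch. It is a simplified illustration using the default values quoted in the article; the function name `max_ios_before_switch` is hypothetical and the real scheduler is certainly more involved.

```python
# Worked arithmetic for the burst a VM may issue before the scheduler
# switches to another VM. Hypothetical helper, defaults from the article.

DISK_SCHED_NUM_REQ_OUTSTANDING = 32  # per-VM limit on a shared LUN
DISK_SCHED_QUANTUM = 8               # extra consecutive sequential I/Os


def max_ios_before_switch(sequential: bool,
                          num_req_outstanding: int = DISK_SCHED_NUM_REQ_OUTSTANDING,
                          quantum: int = DISK_SCHED_QUANTUM) -> int:
    """Upper bound on I/Os from one VM before another VM is serviced."""
    extra = quantum if sequential else 0
    return num_req_outstanding + extra


if __name__ == "__main__":
    print(max_ios_before_switch(sequential=True))   # 32 + 8
    print(max_ios_before_switch(sequential=False))  # 32 only
```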

This raises the question: how do we figure out whether the next I/O is sequential or not? That is what the next setting gives us.

As stated, we need a measure of ‘proximity’ to decide whether the next I/O of a virtual machine is ‘sequential’. Disk.SectorMaxDiff provides it: the setting represents the maximum distance, in disk sectors, between two I/Os for them to still be considered “sequential”. It defaults to 2000. If the next I/O falls within this proximity, we give the virtual machine the benefit of the next I/O slot, as it will likely be served faster by the storage. If it falls outside this proximity, the slot goes to the next virtual machine for fairness.
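The proximity test can be illustrated with a short Python sketch. It is a minimal model under stated assumptions: the function name `is_sequential` is invented, distance is measured between the starting sectors of the two I/Os in either direction, and a distance exactly equal to the limit counts as sequential. The article does not specify these details.

```python
# Toy proximity check for deciding whether two I/Os are "sequential".
# Assumptions (not from the article): distance is measured between start
# sectors, in either direction, and the boundary value is inclusive.

DISK_SECTOR_MAX_DIFF = 2000  # default value quoted in the article


def is_sequential(previous_sector: int,
                  next_sector: int,
                  max_diff: int = DISK_SECTOR_MAX_DIFF) -> bool:
    """Return True if the next I/O is close enough to the previous one."""
    return abs(next_sector - previous_sector) <= max_diff


if __name__ == "__main__":
    # A nearby I/O keeps the slot; a distant one yields to another VM.
    print(is_sequential(10_000, 10_008))
    print(is_sequential(10_000, 500_000))
```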

The advanced setting Disk.SchedQControlVMSwitches is used to determine when to throttle down the number of I/Os sent by one virtual machine to the driver queue, so that other virtual machines can schedule their I/O. The setting refers to the number of times we switch between different virtual machines. If we switch between unique virtual machines this many times, we start to reduce the maximum number of commands that a virtual machine can queue to Disk.SchedNumReqOutstanding. By default, the number of switches that triggers this behavior is 6. This means that if we encounter a period of high I/O and switch between 6 unique virtual machines without returning to service the outstanding I/Os of any of the previous virtual machines, we automatically throttle back the number of I/Os that a virtual machine can schedule. This again is to balance fairness versus performance.
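The throttle-down trigger can be approximated with the following Python sketch. It is a deliberately simplified model: it counts every change of virtual machine between consecutive I/Os as one switch, whereas the article speaks of switching between unique virtual machines, and the names `count_vm_switches`, `should_throttle_down` and `VM_SWITCH_THRESHOLD` are all hypothetical.

```python
# Toy model of the throttle-down trigger: count switches between VMs in
# the order their I/Os were serviced. Simplification: any change of VM
# counts as a switch. Names are hypothetical, not VMkernel identifiers.
from typing import Hashable, Iterable

VM_SWITCH_THRESHOLD = 6  # default number of switches quoted in the article


def count_vm_switches(serviced: Iterable[Hashable]) -> int:
    """Count how often consecutive I/Os come from different VMs."""
    switches = 0
    previous = None
    first = True
    for vm in serviced:
        if not first and vm != previous:
            switches += 1
        previous = vm
        first = False
    return switches


def should_throttle_down(serviced: Iterable[Hashable],
                         threshold: int = VM_SWITCH_THRESHOLD) -> bool:
    """True once enough VM switches have been seen to throttle each VM."""
    return count_vm_switches(serviced) >= threshold


if __name__ == "__main__":
    busy = ["vm1", "vm2", "vm3", "vm4", "vm5", "vm6", "vm7"]
    quiet = ["vm1"] * 10
    print(should_throttle_down(busy))
    print(should_throttle_down(quiet))
```

In this model, the `busy` sequence contains six switches and so reaches the default threshold, while the `quiet` sequence from a single virtual machine never does.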

Once the period of high I/O has passed, the advanced parameter Disk.SchedQControlSeqReqs is used to determine when to throttle back up to the full number of scheduled I/Os per virtual machine. The setting refers to the number of I/Os we issue from the same virtual machine before we go back to using the full number of scheduled requests. The default value is 128 scheduled I/Os, which means that a virtual machine has been scheduled 4 consecutive times without interruption (4 x 32). If the device queue depth is set to 64, the virtual machine is then allowed to issue 64 I/Os at a time, since clearly no other virtual machines are issuing I/O at this point.
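The throttle-up condition can be modeled with a small counter, sketched here in Python. This is an illustrative model only: the class name `ThrottleUpTracker`, its fields and the exact reset behavior are assumptions, and it covers only the path back up to the full queue depth, not the throttle-down decision.

```python
# Toy model of throttling back up: once one VM has issued enough
# consecutive I/Os without another VM interleaving, it regains the full
# device queue depth. Names and reset rules are hypothetical.


class ThrottleUpTracker:
    def __init__(self,
                 device_queue_depth: int = 64,   # example depth from the article
                 throttled_limit: int = 32,      # per-VM limit while sharing
                 threshold: int = 128) -> None:  # default quoted in the article
        self.device_queue_depth = device_queue_depth
        self.threshold = threshold
        self.limit = throttled_limit  # start in the throttled state
        self._last_vm = None
        self._consecutive = 0

    def record_io(self, vm_id: str) -> int:
        """Record one scheduled I/O and return the current per-VM limit."""
        if vm_id == self._last_vm:
            self._consecutive += 1
        else:
            # Another VM issued I/O: restart the consecutive count.
            # Throttling down again is not modeled here.
            self._last_vm = vm_id
            self._consecutive = 1
        if self._consecutive >= self.threshold:
            # 128 uninterrupted I/Os (4 x 32): restore the full depth.
            self.limit = self.device_queue_depth
        return self.limit


if __name__ == "__main__":
    tracker = ThrottleUpTracker()
    limit = 0
    for _ in range(128):
        limit = tracker.record_io("vm1")
    print(limit)
```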

Hopefully this gives you an idea of the various checks and balances in the VMkernel that allow for both performance and fairness when many virtual machines share the same storage device/LUN.

I will finish with one final warning. I already highlighted that these parameters shouldn’t be changed unless you are under specific guidance from our support staff. Another reason for not touching these parameters is that they are global. If you want to tune the I/O behavior for a single LUN, you cannot do it. You will impact all of your datastores, and if you have multiple arrays, all datastores across all arrays. Be very careful!

Get notification of these blog postings and more VMware Storage information by following me on Twitter: @VMwareStorage

This entry was posted in vSphere by Cormac Hogan.

About Cormac Hogan

Cormac Hogan is a Senior Staff Engineer in the Office of the CTO in the Storage and Availability Business Unit (SABU) at VMware. He has been with VMware since April 2005 and has previously held roles in VMware’s Technical Marketing and Technical Support organizations. He has written a number of storage-related white papers and has given numerous presentations on storage best practices and vSphere storage features. He is also the co-author of the “Essential Virtual SAN” book published by VMware Press.

7 thoughts on “Virtual Machine I/O – Fairness versus Performance”

  2. Robert Nolan

    This article does a great job of explaining the queue settings internal to the VMkernel and the tradeoffs that have to be made between performance and fairness in a multi-VM environment. There are finite resources available and these settings enforce rules to ensure that all the VMs get their turn at the resource trough.

    As the article notes, the maximum number of I/O requests can be decreased when VMs are sharing the same LUN. In effect, some performance is sacrificed in the name of fairness. There are steps you can take to maximize performance even as Disk.SchedNumReqOutstanding does its thing: make the most of the I/O requests you are allowed. By defragmenting the Windows guest OS you reduce the total number of I/O requests that cross the storage stack. The Windows file system, NTFS, fragments files before anything is written to disk. If NTFS saves a 2GB file contiguously, that file is accessed in a single I/O request of 2GB. If NTFS saves the same file in 100 equal-size fragments, then the file is accessed in 100 I/O requests of 20MB each. If you only have 32 I/O requests, maximizing the size of each request will get you better throughput and performance.

    In testing we performed using VMware’s vscsiStats utility, we captured stats on a baseline of disks with fragmented files and free space. We then defragmented the files and free space on these disks and repeated our test. The vscsiStats utility is a great tool: it counts every I/O coming through and sorts them based on a number of metrics including size, latency and distance from previous I/O. The test showed that defragmenting produced a 28% reduction in total I/O. As one would expect, if you are doing fewer I/Os then the resulting I/Os should be larger. The largest bucket vscsiStats measures is >524K. Compared to the baseline disks, the defragmented disks had roughly 12 times as many I/Os in this bucket (247 vs. 2959). Defragmentation is one way to maximize the size of the I/O requests you are allowed.

    Disk.SectorMaxDiff determines if the next I/O is sequential or not by looking at the maximum distance in sectors from the previous I/O. If the I/O is sequential, the VM gets the next I/O slot because it is likely to be better served by the storage. In the same tests, vscsiStats showed the defragmented disks produced a 58% increase in the number of sequential I/O where the distance from the previous I/O was one sector.

    If fairness means throttling the I/O requests, then getting the most out of the available I/O requests makes sense. Defragmenting Windows guest systems generates larger I/Os and increases sequential I/O. Both would help performance when the number of outstanding I/O requests is constrained.

    Bob Nolan
    Raxco Software, Inc.
    VMware Elite TAP member

  3. Aboubacar Diare


    Since these are global parameters, is there any way to set these values in a more granular way? For example, by adding advanced configuration options to the VM’s vmx file that would allow a specific VM to override the global settings?


