Low Level VAAI Behaviour

We’ve been getting a lot of queries lately about how exactly VAAI behaves at a low level. It seems that more and more VMware customers are seeing the benefit of offloading certain storage-intensive tasks to the array, and the questions I have been getting recently are even more in-depth. I’ve been back over the VAAI notes I have gathered since 4.1 and put together the following article. I hope you find it useful.

VAAI first appeared in vSphere 4.1, and was only available to block storage devices (iSCSI, FC, FCoE). vSphere 5.0 added support for NAS device primitives and introduced an UNMAP primitive for reclaiming stranded space on a thin provisioned VMFS volume.

A closer look at the original primitives and a description of how they work follows. I've listed several names for some of the primitives, as they seem to have taken on numerous different names since they were first launched.

i) Atomic Test & Set (ATS)

This is a replacement lock mechanism for SCSI reservations on VMFS volumes when doing metadata updates. Basically, an ATS lock can be considered a mechanism to modify a disk sector which, when successful, allows an ESXi host to do a metadata update on a VMFS volume. This includes allocating space to a VMDK during provisioning, as certain characteristics need to be updated in the metadata to reflect the new size of the file. Interestingly enough, in the initial VAAI release, the ATS primitive had to be implemented differently on each storage array, so you had a different ATS opcode depending on the vendor. ATS is now a T10 standard and uses opcode 0x89 (COMPARE AND WRITE).
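To make the compare-and-write idea concrete, here is a minimal in-memory sketch. It is not how ESXi or any array implements ATS. The `SimulatedLun` class, the `lock_image` helper and the "all zeroes means free" lock format are assumptions made up purely for illustration, and the real VMFS on-disk lock layout is not shown here. The only point is the semantics: the write happens only if the sector still holds the contents the host expected.

```python
import threading

SECTOR_SIZE = 512  # assumed sector size for this sketch


class SimulatedLun:
    """In-memory stand-in for a LUN (hypothetical, not a real SCSI target)."""

    def __init__(self, sectors):
        self._data = bytearray(sectors * SECTOR_SIZE)
        # The mutex models the array's guarantee that compare and write
        # happen as one atomic step.
        self._mutex = threading.Lock()

    def compare_and_write(self, lba, expected, new):
        """Write `new` to sector `lba` only if it currently equals `expected`."""
        if len(expected) != SECTOR_SIZE or len(new) != SECTOR_SIZE:
            raise ValueError("buffers must be exactly one sector long")
        start = lba * SECTOR_SIZE
        end = start + SECTOR_SIZE
        with self._mutex:
            if self._data[start:end] != expected:
                return False  # miscompare: another host changed the sector
            self._data[start:end] = new
            return True


FREE = bytes(SECTOR_SIZE)  # hypothetical "unlocked" sector image


def lock_image(host):
    """Hypothetical lock record: the host name padded out to one sector."""
    return host.encode("ascii").ljust(SECTOR_SIZE, b"\0")


def try_acquire(lun, lba, host):
    # Succeeds only if the lock sector is still free.
    return lun.compare_and_write(lba, FREE, lock_image(host))


def release(lun, lba, host):
    # Succeeds only if this host still owns the lock sector.
    return lun.compare_and_write(lba, lock_image(host), FREE)
```

In this model, a second host calling `try_acquire` on the same sector gets `False` until the first host calls `release`, and no other sector on the LUN is affected in the meantime.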

ii) Write Same/Zero

One of the most common operations on virtual disks is initializing large extents of the disk with zeroes to isolate virtual machines and promote security. vSphere hosts can be configured to enable the WRITE SAME SCSI command to zero out large portions of a disk. With WRITE SAME enabled, VMware ESX/ESXi will issue the command to arrays during specific operations. This offload task will zero large numbers of disk blocks without transferring the data over the transport link. The WRITE SAME opcode is 0x93.

The following provisioning tasks are accelerated by the use of the WRITE SAME command:

  • Cloning operations for eagerzeroedthick target disks.
  • Allocating new file blocks for thin provisioned virtual disks.
  • Initializing previously unwritten file blocks for zeroedthick virtual disks.

The data-out buffer of the WRITE SAME command contains all zeroes. A single zero operation has a default zeroing size of 1MB. When monitoring VAAI counters in esxtop, you may observe the WRITE_SAME counter incrementing only in batches of 16. This is because we only ever launch 16 parallel worker threads for VAAI, so don’t be surprised if you see increments of 16 WRITE SAME commands at a time during a zero operation.

Note: Not all storage arrays need to do this directly to the disk. Some arrays need only do a metadata update to record a page of all zeroes. There is no need to actually write zeroes to every location, which speeds up the process dramatically all round.
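As a rough illustration of that note, the sketch below models an array that treats any page it has no data for as a page of zeroes. This is a toy model under my own assumptions (the `ThinArray` class, its method names and the 1MB page size are all hypothetical) and it does not describe any particular vendor's implementation. Zeroing a range becomes a metadata operation that simply forgets the pages, rather than a write of zeroes to every location.

```python
PAGE_SIZE = 1024 * 1024  # assumed 1MB page, chosen to match the zeroing size


class ThinArray:
    """Toy array model: pages with no stored data read back as zeroes."""

    def __init__(self):
        self.pages = {}  # page index -> page contents

    def write(self, page, data):
        if len(data) != PAGE_SIZE:
            raise ValueError("data must be exactly one page")
        self.pages[page] = bytes(data)

    def write_same_zero(self, first_page, count):
        """Zero `count` pages by dropping them from the metadata map."""
        for page in range(first_page, first_page + count):
            self.pages.pop(page, None)

    def read(self, page):
        # A missing entry means the page is all zeroes.
        return self.pages.get(page, bytes(PAGE_SIZE))
```

After `write_same_zero(0, 16)`, every one of those 16 pages reads back as zeroes even though no zeroes were ever stored for them.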

iii) Full Copy/XCOPY/Extended Copy

This primitive is used when a clone or migrate operation (such as a Storage vMotion) is initiated from a vSphere host, and we want the array to handle the operation on our behalf. vSphere hosts can be configured to enable the EXTENDED COPY SCSI command. When examining VAAI status in esxtop, you may see this counter increment in batches of 8. This is because the default size of a Full Copy transfer is 4MB, so a 32MB I/O is split into 8 XCOPY commands. The opcode for XCOPY is 0x83.
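If you are looking at a SCSI trace, the three opcodes mentioned so far are enough to pick out the VAAI block primitives, since the operation code is the first byte of a SCSI command descriptor block (CDB). The following is just a small lookup sketch using the opcodes quoted in this article. The `classify_cdb` helper is a hypothetical name of my own and is not part of any VMware tool.

```python
# Opcodes as quoted in this article for the T10 versions of the primitives.
VAAI_OPCODES = {
    0x89: "ATS (COMPARE AND WRITE)",
    0x93: "WRITE SAME",
    0x83: "XCOPY (EXTENDED COPY)",
}


def classify_cdb(cdb):
    """Return the VAAI primitive name for a CDB, or None if it is not one."""
    if not cdb:
        raise ValueError("empty CDB")
    # The operation code is always the first byte of the CDB.
    return VAAI_OPCODES.get(cdb[0])
```

For example, a 16-byte CDB starting with 0x93 is classified as WRITE SAME, while an ordinary read such as READ(10) (opcode 0x28) returns `None`.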

What do VAAI offloads look like from an I/O perspective?

I've had a number of requests to describe exactly what happens under the covers when some of these offload operations are taking place. The default XCOPY size is 4MB. With a 32MB I/O, one would expect to see this counter in esxtop incrementing in batches of 8. The default XCOPY size can be increased to a maximum value of 16MB.

The default WRITE SAME size is 1MB. A 32MB I/O therefore needs 32 WRITE SAME commands, but since we only ever launch 16 parallel worker threads for VAAI, one would expect to see this counter in esxtop incrementing in batches of 16. We currently do not support changing the WRITE SAME size of 1MB.
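The batch sizes above are simple arithmetic, and a few lines of Python make the reasoning explicit. This is only a sketch of the calculation described in this article. The function names are my own, and treating the 16 worker threads as a simple cap on the batch size is my simplified reading of the behaviour rather than a description of the Data Mover internals.

```python
import math

MAX_WORKER_THREADS = 16  # per the article: at most 16 parallel VAAI workers


def commands_per_io(io_size_mb, transfer_size_mb):
    """Number of offload commands needed to cover one I/O."""
    return math.ceil(io_size_mb / transfer_size_mb)


def observed_batch(io_size_mb, transfer_size_mb, workers=MAX_WORKER_THREADS):
    """Batch size one would expect to see a counter jump by in esxtop."""
    return min(commands_per_io(io_size_mb, transfer_size_mb), workers)


if __name__ == "__main__":
    print("XCOPY, 4MB default:  ", observed_batch(32, 4))
    print("XCOPY, 16MB maximum: ", observed_batch(32, 16))
    print("WRITE SAME, 1MB:     ", observed_batch(32, 1))
```

With a 32MB I/O this gives 8 for the default XCOPY size, 2 if the XCOPY size is raised to 16MB, and 16 for WRITE SAME, where the 32 commands required are capped by the 16 worker threads.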

Differences between VAAI in 4.x & VAAI in 5.x

One final piece of information I wanted to share with you is a distinction between our first phase/release of VAAI in vSphere 4.1 and our second phase of VAAI which was released with vSphere 5.0. A list of differences appears below.

  • VAAI now uses standard T10 primitives rather than bespoke array commands
  • Full ATS with VMFS-5. Here is a link to a blog post which talks about locking in more detail.
  • Support for NAS Primitives – An overview of the vSphere 5.0 primitives can be found here.
  • VCAI (View Composer Array Integration) – Read how View is using VAAI to offload clones to the array here.
  • UNMAP support – Some additional information on the new UNMAP primitive can be found here.
  • The VMware HCL now requires performance testing of the primitives before an array receives VAAI certification. This is important, as certain storage arrays which appear in the vSphere 4.1 HCL may not appear in the vSphere 5.0 HCL if the performance of the offload primitives does not meet our requirements.

My thanks to Ilia Sokolinski for clarifying some of the behaviours above.

Get notification of these blog postings and more VMware Storage information by following me on Twitter: @VMwareStorage

14 thoughts on “Low Level VAAI Behaviour”

  1. Hi Cormac,
    I was very happy when I could test the NFS VAAI capabilities of a NetApp FAS3240 with Data Ontap 8.1.1 RC1 and vSphere 5.01 for the cloning of View 5.1 desktops.
    But when testing I was a bit disappointed. At first it seemed very fast, but then all clones seemed to stop at 93%. At the datastore level it seems that, although linked clones are selected, the VAAI option in View created Full Clones, because they are the same size as the replica.
    Is this the correct way that VAAI should work? Or is it still Tech Preview and not yet fully functional?
    I watched a video on VMwareTV which shows exactly the same, but there it goes a lot faster….
    Keep up the good articles on VMware storage.
    Kind regards, Berend

      • Have you installed the NAS plugin? If you have followed the steps to offload the creation of the linked clones to the array, I would suggest having a discussion with NetApp to see why full clones are being created.

        • Hi Cormac,

          Yes, I installed the NAS plugin and followed the instructions. It sounds as if I need to check in with support unless you’ve heard something else…?

          Thanks!
          Scott

  2. Hi Cormac,
    just a quick question. In a blog post I’ve read ( https://www.ibm.com/developerworks/mydeveloperworks/blogs/anthonyv/entry/upgrading_to_esxi_5_0_how_to_confirm_vaai_status?lang=en ), it states the following:
    “Of course having seen the Hardware Acceleration Supported message only proves that Atomic Test and Set works”
    However, KB1021976 states that you need a full copy operation to a VMDK at least 4MB in size, to display the hardware acceleration support status.
    Could you shed some light on this?
    Thanks,
    Bas

  3. John – great question. I agree that information does seem to be all over the place at the moment. I am currently looking for approval to do a VAAI whitepaper in Q3 which will pull all of the disparate VAAI information into one place.

  4. Hi Bas,
    When I last checked, I had to do a clone operation as per the KB article before the UI status would change to supported. So my understanding is that the KB is correct. If you are observing something different, please let me know.
    Cormac

  5. Berend,
    Your observations are correct – Native Clone VMs will have the same size as their parent VMs. The sharing of blocks occurs at the array level and is not reflected in the vSphere file size reports. However, the datastore free space reported by vSphere should reflect the space savings.
    Also, with regards to the video, if it was the one which I created, then I edited it quite a bit to speed it up, so it is not reflective of the time taken to do an actual deployment.
    Rgds
    Cormac

  6. Hi Cormac,
    Can you comment on whether VAAI is used by ‘file copy’ tasks performed either through the API or through vSphere file copy/paste operations in the datastore browser?
    I was testing a tool that uses these VMware file copy APIs on virtual machine files to copy VM files, including VMDK files, across VAAI-supported datastores. I did not see any VAAI improvements during these operations.
    I am not sure what I could be doing wrong there, but after reading through the VAAI KB article below I did not find any specific limitations related to using file copy to migrate or clone VM files. So I was expecting it to leverage VAAI in my case, but it seems that it does not.
    http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1021976
    Thanks

  7. Sorry for the delayed response vmitguy. I only just came across your question.
    The datastore browser does not use the internal VMkernel Data Mover or VAAI for that matter – it has its own API.
    Therefore what you observe is correct – datastore browser copy/paste operations will not use VAAI.
