34 thoughts on “VMFS Locking Uncovered”

  1. Rob

    Good post, Cormac.
    I have a question about datastores that are upgraded in place as opposed to newly formatted. Do upgraded datastores get all of the benefits (1-8) above, or only a subset?
    Thanks,
    Rob

  2. Mostafa Khalil

    I think the reference was for “ATS only” flag on VMFS5 Datastores.
    It is enabled on freshly created datastores but not on upgraded ones. Note that it is not enabled out of the box. When the host detects that the array supports ATS on the device, the ATS Only flag is written to the datastore. From that point on, ATS will always be used on that datastore.
    To manually enable it, you may use the hidden option:
    vmkfstools --configATSOnly 1 [device path]
    e.g.
    vmkfstools --configATSOnly 1 /vmfs/devices/disks/[naa-id]:[partition-number]
    Or
    vmkfstools --configATSOnly 1 /dev/disks/[naa-id]:[partition-number]
    However, if for whatever reason the storage array does not support ATS and you enable this flag manually, the datastore will not be mounted, which is why it is not enabled by default.
    If you need to disable the flag, repeat the vmkfstools command using the value “0” instead of “1”.

  3. Mikael

    Hi Cormac, great post, and thanks for this explanation. I am working with Cody on this wonderful question, and I wonder what would occur in terms of “contention” when you have a lot of Storage vMotion operations between two LUNs in the VMFS3 (ESX 4.1) scenario. Let’s suppose four to eight Storage vMotion operations take place at the same time on the same volume: what is the impact (falling back to SCSI-2 commands)? You state that only commands (1) and (2) can use ATS in this case. What about the other commands, (3) to (8), in a VMFS3 (ESX 4.1) scenario? Do they adversely affect commands (1) and (2)? Thanks
    Mikael

  4. Scott Langer

    Awesome post, thanks. Can you tell me: if an array is “VAAI enabled”, will it automatically have ATS, or are some arrays “VAAI enabled” without the full feature set?
    I guess what I’m wondering is: must I ask my storage vendor, “Does your array support ATS specifically?”
    Thanks.

  5. Chogan

    Hi Scott,
    For newly created VMFS-5, ATS (if supported by the array) will be enabled by default.
    For upgraded VMFS-5, please see the comment from Mostafa above.
    To check if the storage array supports ATS, or indeed any of the VAAI primitives, you can use the following command:
    esxcli storage core device vaai status get -d [naa-id]
    naa.60a98000572d54724a346a6170627a52
    VAAI Plugin Name: VMW_VAAIP_NETAPP
    ATS Status: supported
    Clone Status: supported
    Zero Status: supported
    Delete Status: supported
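
    For anyone checking many devices, the output above can be parsed with a short script. This is only a sketch: it assumes the layout shown in this comment (an unindented device line followed by indented "Key: value" lines), and the helper names parse_vaai_status and supports_ats are made up for illustration.

```python
def parse_vaai_status(text):
    """Parse VAAI status text into {device: {field: value}}.

    Assumption: device names start at column 0 and their fields are
    indented "Key: value" lines, as in the sample below.
    """
    devices = {}
    current = None
    for raw in text.splitlines():
        if not raw.strip():
            continue  # skip blank separator lines
        if raw[:1].isspace():
            # Indented line: a field belonging to the current device.
            key, _, value = raw.partition(":")
            if current is not None:
                devices[current][key.strip()] = value.strip()
        else:
            # Unindented line: start of a new device entry.
            current = raw.strip()
            devices[current] = {}
    return devices


def supports_ats(fields):
    """True when the parsed fields report ATS as supported."""
    return fields.get("ATS Status", "").lower() == "supported"


# Sample text copied from the comment above (indentation is assumed).
SAMPLE = """\
naa.60a98000572d54724a346a6170627a52
   VAAI Plugin Name: VMW_VAAIP_NETAPP
   ATS Status: supported
   Clone Status: supported
   Zero Status: supported
   Delete Status: supported
"""

status = parse_vaai_status(SAMPLE)
for device, fields in status.items():
    print(device, "ATS:", "yes" if supports_ats(fields) else "no")
```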

  6. Chogan

    Hi Mikael,
    Thanks for commenting. You and Cody are doing some great work – thanks.
    To your specific question, indeed contention could occur with ATS & SCSI reservations, but it is handled in the ATS primitive implementation.
    One of the requirements in the design of ATS was to make it compatible with ESX hosts that use the legacy SCSI reservation based VMFS-3 lock manager.
    The ATS primitive behaves like a regular read/write CDB on the wire and fails with a reservation conflict if another host has the LUN reserved using SCSI-2 or SCSI-3 reservations.
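
    The behaviour described here can be illustrated with a toy model. This is a sketch only, not how any array or ESX host implements it: the ToyLun class, its method names, and the lock-sector contents are all hypothetical, and on a real array the compare and the write happen atomically on the target.

```python
class ReservationConflict(Exception):
    """Raised when another host holds a SCSI reservation on the LUN."""


class ToyLun:
    """Hypothetical in-memory stand-in for a shared LUN."""

    def __init__(self):
        self.sectors = {}        # sector number -> bytes on "disk"
        self.reserved_by = None  # host holding a legacy SCSI reservation

    def reserve(self, host):
        # Legacy lock manager: reserve the whole LUN.
        if self.reserved_by not in (None, host):
            raise ReservationConflict(host)
        self.reserved_by = host

    def release(self, host):
        if self.reserved_by == host:
            self.reserved_by = None

    def atomic_test_and_set(self, host, sector, expected, new):
        # Like an ordinary read/write, ATS is rejected while a
        # different host has the LUN reserved.
        if self.reserved_by not in (None, host):
            raise ReservationConflict(host)
        # Test: the sector must still hold the image the host last read.
        if self.sectors.get(sector, b"") != expected:
            return False
        # Set: write the new image (the host now owns the lock).
        self.sectors[sector] = new
        return True


lun = ToyLun()
LOCK_SECTOR = 0
FREE, HELD_A, HELD_B = b"", b"owner=A", b"owner=B"

# Two ATS-capable hosts race for the same lock sector; only one wins.
won_a = lun.atomic_test_and_set("A", LOCK_SECTOR, FREE, HELD_A)
won_b = lun.atomic_test_and_set("B", LOCK_SECTOR, FREE, HELD_B)

# A legacy host reserves the LUN; host A's next ATS gets a conflict.
lun.reserve("legacy")
try:
    lun.atomic_test_and_set("A", LOCK_SECTOR, HELD_A, FREE)
    conflict = False
except ReservationConflict:
    conflict = True

print(won_a, won_b, conflict)
```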

  7. Nick

    Great post and clears up a lot of the mysteries associated with SCSI reservations and LUN locking.
    In the past (i.e. ESX 3.x and 4.0), we typically sized LUNs with block-based storage (FCP, iSCSI) using a very popular rule of thumb – and that was to limit the # of VMs on each LUN to about 20. So if each VM was 30 GB in size and generated moderate I/O, we might use ~600 GB LUNs for optimal performance. But with VAAI-capable arrays and VMFS 4.1 and 5.0, we can probably do better than that. My question is how many VMs per LUN now? Should we “cap” the # of VMs per LUN at 50 or so now with VAAI/ATS? Or is it 100? What do you think the new rule of thumb should be?
    While I know it depends, I’d really like a range…similar to how we’ve said “20-30 VMs per LUN with mixed/moderate I/O” before. Thanks in advance and keep the articles coming.
    -Nick

  8. Chogan

    Thanks for the nice comments Nick.
    Certainly SCSI reservations were a limiting factor, and the introduction of complete ATS locking in VMFS-5 should remove this as a consideration when it comes to sizing VM density per volume.
    But there are too many contributing factors for me to come up with a rule of thumb for the number of VMs per datastore. What I will say is that if you were getting 20 VMs per datastore with SCSI reservations, and you now have an ATS-capable datastore, then you should be able to increase the number of VMs.
    But of course, the IOPS capability of the datastore, the latency and IOPS requirements of the apps in the datastore, and the sort of applications running in the VMs should also be considered.

  9. Chad

    Hi Cormac,
    What is the granularity of ATS locks within VMFS5, and how is space divided up (a) between the various ESX servers that have mounted the same VMFS5 datastore, and (b) between various provisioning operations occurring simultaneously on the same VMFS5 datastore from within the same ESX server? How many contiguous blocks does each lock guarantee? Further, how many locks can be granted at once?
    Chad

  10. Chogan

    Hi Chad,
    ATS locks are a mechanism to modify a disk sector, which when successful, allow an ESXi host to do a metadata update on a VMFS. This includes allocating space to a VMDK during provisioning, as certain characteristics would need to be updated in the metadata to reflect the new size of the file.
    When it comes to space allocation, the last time I looked into this (VMFS v3.31), we allocate 200 file block resources with each lock. If we take a 1MB file block, a cluster contains 64 file blocks, so we get 200 * 64MB each time we grow a file on a VMFS. This may have changed in 5.0, but I haven’t heard about it if it did.
    Some further information about the layout of VMDKs on VMFS can be found in this blog post – http://blogs.vmware.com/vsphere/2012/02/vmfs-extents-are-they-bad-or-simply-misunderstood.html.
    I do not believe that we have a limit on the number of locks that can be granted to an ESXi host, or if we do, I suspect that it is high enough to prevent us from reaching it during provisioning.
    HTH
    Cormac
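
    To put a number on the allocation figures in this reply, here is the arithmetic worked through. It is a sketch that takes the values quoted above at face value (VMFS v3.31, 1MB file block, 64 file blocks per cluster, 200 resources per lock) and assumes each of the 200 resources is one cluster, as the "200 * 64MB" product suggests.

```python
# Figures quoted in the comment above; they may differ on VMFS-5.
FILE_BLOCK_MB = 1          # file block size in MB
BLOCKS_PER_CLUSTER = 64    # file blocks per cluster
RESOURCES_PER_LOCK = 200   # resources allocated per lock (assumed: clusters)

mb_per_cluster = FILE_BLOCK_MB * BLOCKS_PER_CLUSTER
mb_per_lock = RESOURCES_PER_LOCK * mb_per_cluster
gb_per_lock = mb_per_lock / 1024

print(f"{mb_per_cluster} MB per cluster")
print(f"{mb_per_lock} MB ({gb_per_lock} GB) allocated per lock")
```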

  11. Chad

    What I’m curious about is this: is Atomic Test and Set the same as Compare and Write, which is a target-side operation? The SCSI COMPARE AND WRITE (CAW) command provides a means to write data without imposing the overhead of a SCSI reservation (a LUN-level lock).

  12. Andy

    I was just re-reading the VMFS-5_Upgrade_Considerations.pdf and a line on page 4 under Small File Support caught my eye. “VMFS-5 introduces support for very small files. For files less than or equal to 1KB, VMFS-5 uses the file descriptor location in the metadata for storage rather than file blocks.”
    Now I understand that VMFS-5 tries to use ATS natively, but if it can’t, it will fall back to SCSI-2 reservations.
    Now if we consider that a metadata update requires a lock, and if your array does not support ATS, then any small file will require a SCSI-2 lock therefore potentially impacting scalability & performance.
    I don’t think I have ever paid attention to the quantity of small files but now w/ datastore heartbeats and the plethora of other small files, how much of a concern is this?

  13. Chogan

    Hi Andy,
    The ‘small file’ mechanism that was introduced in VMFS-5 is a space saving technique. Rather than consuming disk blocks, the information is stored within the metadata.
    So the act of creating, modifying or deleting a small file from a locking perspective on VMFS-5 won’t have changed. If your array supports ATS, then yes, this locking procedure will be more efficient. As you state however, if your array does not support VAAI, then SCSI reservations will still have to be used to lock the LUN while the host places an exclusive lock on the file.
    But small file support on VMFS-5 should not introduce any additional overheads or latency.

  14. Jack

    Cormac, I have a question about concurrent ATS requests and read/write requests to the same blocks: does VMFS expect a concurrent ATS request and another read/write request to be mutually exclusive, i.e. must they be executed by the array in strict order? I know two ATS requests on the same block range would be serialized, but I am curious what VMFS expects for ATS versus other reads/writes.

  15. Clint Beilman

    Hi Cormac,

    We just got a Violin 6000 series array, which does not currently support ATS, and I’m trying to figure out how to size the LUNs. I’m currently on vSphere 5.0. Back before ATS, I kept fewer than 20 VMs on a LUN, but that was with VMFS3. Should I follow the same guideline with VMFS5?

    Thanks,
    Clint

  22. Neelima Bandla

    Hi Cormac,

    It looks like our VMs go offline when there are latency issues acquiring locks (on the storage array). It looks like ESX has a 15-second watchdog timeout for determining whether a lock is lost. Is the 15-second timeout for the VAAI ATS command? Can you please confirm whether that is the timeout for the ATS command? Also, can this be changed from the default?

    thanks,
    Neelima
