vSphere 5.0 Storage Features Part 1 - VMFS-5
One of the primary objectives of the storage enhancements in 5.0 is to make the management of storage much simpler. One way to do this is to reduce the number of storage objects that a customer has to manage, i.e. enable our customers to use far fewer and much larger datastores. To that end, we are increasing the scalability of the VMFS-5 filesystem. These scalability features are discussed here. In future postings, I will discuss further features which aim to fulfil this vision of simplifying storage management.
VMFS-5 Enhancements
- Unified 1MB File Block Size. Previous versions of VMFS used 1, 2, 4 or 8MB file blocks. The larger block sizes were needed to create large files (>256GB). They are no longer needed on VMFS-5: very large files can now be created using 1MB file blocks.
- Large Single Extent Volumes. In previous versions of VMFS, the largest single extent was 2TB. With VMFS-5, this limit has been increased to ~ 60TB.
- Smaller Sub-Block. VMFS-5 introduces a smaller sub-block. This is now 8KB rather than the 64KB we had in previous versions. Now small files < 8KB (but > 1KB) in size will only consume 8KB rather than 64KB. This will reduce the amount of disk space being stranded by small files.
- Small File Support. VMFS-5 introduces support for very small files. For files less than or equal to 1KB, VMFS-5 uses the file descriptor location in the metadata for storage rather than file blocks. When they grow above 1KB, these files will then start to use the new 8KB sub blocks. This will again reduce the amount of disk space being stranded by very small files.
- Increased File Count. VMFS-5 introduces support for greater than 100,000 files, a three-fold increase on the number of files supported on VMFS-3, which was ~ 30,000.
- ATS Enhancement. This Hardware Acceleration primitive, Atomic Test & Set (ATS), is now used throughout VMFS-5 for file locking. ATS is part of the VAAI (vSphere Storage APIs for Array Integration), and will be revisited in a future posting. This enhancement improves the file locking performance over previous versions of VMFS.
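To make the small-file and sub-block rules above concrete, here is a minimal Python sketch that estimates how much data space a single file would consume on a newly created VMFS-5 volume. The helper name `space_consumed` is hypothetical. Treating a file of exactly 8KB as fitting in one sub-block, and rounding anything larger up to whole 1MB file blocks, are simplifying assumptions on my part and are not stated above; the real allocation logic lives in the VMkernel.

```python
KB = 1024
MB = 1024 * KB

SMALL_FILE_LIMIT = 1 * KB   # files up to 1KB are stored in the file descriptor
SUB_BLOCK = 8 * KB          # VMFS-5 sub-block size on a newly created volume
FILE_BLOCK = 1 * MB         # unified VMFS-5 file block size


def space_consumed(file_size: int) -> int:
    """Rough estimate of the data space one file consumes (hypothetical helper)."""
    if file_size < 0:
        raise ValueError("file_size must be non-negative")
    if file_size <= SMALL_FILE_LIMIT:
        # Very small file: held in the file descriptor, so no sub-block
        # or file block is consumed.
        return 0
    if file_size <= SUB_BLOCK:
        # Small file: occupies one 8KB sub-block (assumption: exactly 8KB fits).
        return SUB_BLOCK
    # Assumption: larger files are rounded up to whole 1MB file blocks.
    blocks = -(-file_size // FILE_BLOCK)  # ceiling division
    return blocks * FILE_BLOCK


if __name__ == "__main__":
    for size in (512, 1 * KB, 4 * KB, 3 * MB + 1):
        print(size, "bytes ->", space_consumed(size), "bytes consumed")
```

On a volume with 64KB sub-blocks, the same 4KB file would strand 60KB rather than 4KB, which is the saving the smaller sub-block is aiming at.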
Here is a vmkfstools output of a newly created VMFS-5 volume showing many of the new scalability characteristics:
~ # vmkfstools -Pv 10 /vmfs/volumes/newly-created-vmfs5/
VMFS-5.54 file system spanning 1 partitions.
File system label (if any): newly-created-vmfs5
Mode: public
Capacity 3298534883328 (3145728 file blocks * 1048576), 3297500987392 (3144742 blocks) avail
Volume Creation Time: Tue Jun 14 14:35:53 2011
Files (max/free): 130000/129992
Ptr Blocks (max/free): 64512/64496
Sub Blocks (max/free): 32000/32000
Secondary Ptr Blocks (max/free): 256/256
File Blocks (overcommit/used/overcommit %): 0/986/0
Ptr Blocks (overcommit/used/overcommit %): 0/16/0
Sub Blocks (overcommit/used/overcommit %): 0/0/0
UUID: 4df771c9-f6419df2-81bc-0019b9f1ecf6
Partitions spanned (on "lvm"):
naa.60a98000572d54724a34642d71325763:1
DISKLIB-LIB : Getting VAAI support status for /vmfs/volumes/newly-created-vmfs5/
Is Native Snapshot Capable: NO
~ #
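If you want to pull these numbers into a script, a minimal Python sketch along the following lines could parse the capacity and max/free lines. The regular expressions are assumptions based solely on the sample output above, not on a documented output format, and `parse_vmkfstools` is a hypothetical helper name.

```python
import re

# A few lines copied from the vmkfstools output above.
SAMPLE = """\
Capacity 3298534883328 (3145728 file blocks * 1048576), 3297500987392 (3144742 blocks) avail
Files (max/free): 130000/129992
Ptr Blocks (max/free): 64512/64496
Sub Blocks (max/free): 32000/32000
"""

# Assumed line formats, inferred only from the sample above.
CAPACITY_RE = re.compile(
    r"Capacity (\d+) \((\d+) file blocks \* (\d+)\), (\d+) \((\d+) blocks\) avail"
)
MAX_FREE_RE = re.compile(r"^(.+?) \(max/free\): (\d+)/(\d+)$", re.MULTILINE)


def parse_vmkfstools(text: str) -> dict:
    """Extract capacity and max/free counters into a dictionary."""
    info = {}
    match = CAPACITY_RE.search(text)
    if match:
        capacity, blocks, block_size, avail, avail_blocks = map(int, match.groups())
        info["capacity_bytes"] = capacity
        info["total_blocks"] = blocks
        info["block_size"] = block_size
        info["avail_bytes"] = avail
        info["avail_blocks"] = avail_blocks
    for name, maximum, free in MAX_FREE_RE.findall(text):
        info[name] = {"max": int(maximum), "free": int(free)}
    return info


info = parse_vmkfstools(SAMPLE)
print("Block size (bytes):", info["block_size"])
print("Capacity (TiB):", info["capacity_bytes"] / 2**40)
print("Max files:", info["Files"]["max"])
```

For the volume shown, 3145728 file blocks of 1048576 bytes each come to exactly 3 TiB, a single extent well beyond the old 2TB limit, using 1MB file blocks.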
VMFS-3 to VMFS-5 Upgrades
- Upgrading from VMFS-3 to VMFS-5 is an online & non-disruptive upgrade operation, i.e. VMs can continue to run on the datastore.
- Upgraded VMFS-5 can use the new 1KB small-files feature.
- Upgraded VMFS-5 can be grown to ~ 60TB, same as a newly created VMFS-5.
- Upgraded VMFS-5 has all the VAAI ATS improvements that a newly created VMFS-5 has.
Here is a vmkfstools output on an upgraded VMFS-5 volume:
~ # vmkfstools -Pv 10 /vmfs/volumes/upgrade-testvol
VMFS-5.54 file system spanning 1 partitions.
File system label (if any): upgrade-testvol
Mode: public
Capacity 3298534883328 (3145728 file blocks * 1048576), 3297916223488 (3145138 blocks) avail
Volume Creation Time: Mon Jun 13 13:03:04 2011
Files (max/free): 30720/30713
Ptr Blocks (max/free): 64512/64496
Sub Blocks (max/free): 3968/3968
Secondary Ptr Blocks (max/free): 256/256
File Blocks (overcommit/used/overcommit %): 0/590/0
Ptr Blocks (overcommit/used/overcommit %): 0/16/0
Sub Blocks (overcommit/used/overcommit %): 0/0/0
UUID: 4df60a88-8eaa51ea-3108-0019b9f1ecf6
Partitions spanned (on "lvm"):
naa.60a98000572d54724a34642d71325763:1
DISKLIB-LIB : Getting VAAI support status for /vmfs/volumes/upgrade-testvol
Is Native Snapshot Capable: NO
~ #
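As a rough comparison of the two outputs, the Python sketch below puts the file and sub-block counters side by side. The counts are copied from the two vmkfstools listings in this post; the sub-block sizes (8KB on a newly created volume, 64KB on an upgraded one) come from the feature list and from the list of differences that follows. The "pool size" figure is simply my own arithmetic (count multiplied by size), not a documented design parameter.

```python
KB = 1024
MB = 1024 * KB

# Values taken from the two vmkfstools outputs in this post.
volumes = {
    "newly created": {"max_files": 130000, "sub_blocks": 32000, "sub_block_size": 8 * KB},
    "upgraded": {"max_files": 30720, "sub_blocks": 3968, "sub_block_size": 64 * KB},
}


def sub_block_pool_bytes(volume: dict) -> int:
    """Total space covered by sub-blocks: count multiplied by sub-block size."""
    return volume["sub_blocks"] * volume["sub_block_size"]


for name, volume in volumes.items():
    pool_mib = sub_block_pool_bytes(volume) / MB
    print(f"{name}: max files {volume['max_files']}, "
          f"{volume['sub_blocks']} sub-blocks of {volume['sub_block_size'] // KB}KB "
          f"(~{pool_mib:.0f} MiB in total)")
```

By this arithmetic the two volumes set aside a similar amount of space for sub-blocks, but the newly created volume divides it into roughly eight times as many, smaller units.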
Differences between newly created and upgraded VMFS-5 datastores:
- VMFS-5 upgraded from VMFS-3 continues to use the previous file block size which may be larger than the unified 1MB file block size.
- VMFS-5 upgraded from VMFS-3 continues to use 64KB sub-blocks and not the new 8KB sub-blocks.
- VMFS-5 upgraded from VMFS-3 continues to have a file limit of 30720 rather than the new file limit of > 100000 for newly created VMFS-5.
- VMFS-5 upgraded from VMFS-3 continues to use MBR (Master Boot Record) partition type; when the VMFS-5 volume is grown above 2TB, it automatically & seamlessly switches from MBR to GPT (GUID Partition Table) with no impact to the running VMs.
- VMFS-5 upgraded from VMFS-3 continues to have its partition starting on sector 128; newly created VMFS-5 partitions will have their partition starting at sector 2048.
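The partition details in the last two points come down to simple arithmetic, sketched below in Python. It assumes 512-byte sectors and relies on the fact that MBR records partition start and length as 32-bit sector counts; the function names are hypothetical.

```python
SECTOR_SIZE = 512  # bytes; assumes 512-byte logical sectors
KIB = 1024
MIB = 1024 * KIB
TIB = 1024 ** 4


def mbr_max_bytes(sector_size: int = SECTOR_SIZE) -> int:
    """Largest size addressable with MBR's 32-bit sector counts."""
    return (2 ** 32) * sector_size


def start_offset(start_sector: int, sector_size: int = SECTOR_SIZE) -> int:
    """Byte offset at which a partition begins."""
    return start_sector * sector_size


print("MBR limit (TiB):", mbr_max_bytes() / TIB)

for label, sector in (("upgraded VMFS-5", 128), ("newly created VMFS-5", 2048)):
    offset = start_offset(sector)
    aligned = offset % MIB == 0
    print(f"{label}: starts at sector {sector} = {offset // KIB} KiB, "
          f"1 MiB aligned: {aligned}")
```

With these assumptions the MBR ceiling works out to 2 TiB, which is why a volume grown past 2TB has to move to GPT. Sector 128 corresponds to a 64 KiB offset, and sector 2048 to a 1 MiB offset.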
RDM - Raw Device Mappings
- There is now support for passthru RDMs to be ~ 60TB in size.
- Non-passthru RDMs are still limited to 2TB - 512 bytes.
- Both upgraded VMFS-5 & newly created VMFS-5 support the larger passthru RDM.
Misc.
I decided to add this section as I know many of you will have questions about it.
- The maximum size of a VMDK on VMFS-5 is still 2TB -512 bytes.
- The maximum size of a non-passthru (virtual) RDM on VMFS-5 is still 2TB -512 bytes.
- The maximum number of LUNs that are supported on an ESXi 5.0 host is still 256.
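Since the per-VMDK limit is unchanged, a VM that needs a larger volume has to combine several virtual disks inside the Guest OS, or use a passthru RDM. The Python sketch below works out how many maximum-size VMDKs a given target would need. It assumes "TB" here means 2^40 bytes, and `vmdks_needed` is a hypothetical helper name.

```python
TB = 2 ** 40                  # assumption: binary terabytes
MAX_VMDK_BYTES = 2 * TB - 512  # maximum VMDK size on VMFS-5, as listed above


def vmdks_needed(target_bytes: int) -> int:
    """Number of maximum-size VMDKs needed to provide target_bytes of raw capacity."""
    if target_bytes <= 0:
        raise ValueError("target_bytes must be positive")
    return -(-target_bytes // MAX_VMDK_BYTES)  # ceiling division


print(vmdks_needed(2 * TB - 512))  # a single VMDK at the limit
print(vmdks_needed(8 * TB))        # a full 8TB of raw capacity
```

Because each VMDK falls 512 bytes short of 2TB, four of them provide 2048 bytes less than 8TB, so a full 8TB needs a fifth disk under this assumption. In practice you would more likely size the guest volume slightly under the round figure.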
These enhancements to the scalability of VMFS should assist in the consolidation of more VMs onto fewer datastores, reducing the number of storage objects that an administrator has to manage, and in turn making storage management that little bit easier in vSphere.
Recommendation
If you have the luxury of doing so, I would recommend creating a new VMFS-5 filesystem rather than upgrading VMFS-3 to VMFS-5. Storage vMotion operations can then be used to seamlessly move your VMs to the newly created VMFS-5. This way, you will enjoy all the benefits that VMFS-5 brings.
How do you do the upgrade?
Posted by: Ceri Davies | 07/12/2011 at 02:49 PM
Simply select the VMFS-3 datastore in the Configuration tab in the vSphere client, and there is a new link called 'Upgrade to VMFS-5' in the datastore details. Click that, and you're done.
Posted by: Chogan | 07/12/2011 at 03:08 PM
Any recommendations as far as number of vms per vmfs volume now w/ the new format?
Posted by: Bad Dos | 07/12/2011 at 09:18 PM
Well, in 5.0, the theoretical maximum number of powered on virtual machines that we document/support per VMFS volume is 2048. However, there are many considerations to take into account when scoping the number of VMs per datastore such as the capabilities of the underlying storage & the IOPS and latency requirements for the VMs themselves. So all of these would have to be taken into account before a VM:datastore ratio could be calculated.
Posted by: Chogan | 07/13/2011 at 04:53 AM
Fantastic! Almost can't wait to get started on upgrading! :)
Posted by: Chris | 07/14/2011 at 01:39 AM
Interesting - how disk size has been increased to 60TB without change in VMDK size. This coincides with the move to GPT disks rather than MBR which perhaps means that the data starts spanning VMDK files once the disk size exceeds 2TB(is this a band-aid patch ?). Wonder what happens to our existing SAN - this would perhaps need to be re certified for VMFS5 ...
Posted by: anjaneshbabu | 07/16/2011 at 07:31 AM
Did they return the ability to recover a deleted file in VMFS with this new version?
Posted by: Paul Henry | 07/19/2011 at 06:34 PM
Hi Anjaneshbabu, as you mention, the VMDK size is still 2TB -512 bytes & we need GPT to address larger partition sizes. I'm not sure I follow your comment about data spanning VMDK files. I guess the only way this would happen is if multiple VMDKs were presented to a VM, and the Guest OS used some software RAID technology. However this is a function of the Guest OS, and would not use any feature of the VMkernel to achieve this. I'm not sure about the requirement to recertify your SAN. The best thing to do is to check the HCL when 5.0 releases.
Posted by: Chogan | 07/28/2011 at 02:50 AM
Hi Paul, are you referring to the experimental vmfs-undelete script that appeared in ESX 3.5 as per http://kb.vmware.com/kb/1007243?
If so, then the answer is no. This script was unsupported in ESX 4.x, and is also not available in ESXi 5.0.
Posted by: Chogan | 07/28/2011 at 02:57 AM
So what is the point of VMFS5?
I can get VMFS3 to address 20TB of space already by just creating multiple 2TB disks within my server's RAID controller and expanding them within the ESX Client.
I mean, if the VMDK sizes are still limited to 2TB, the only major point to VMFS5 is that I don't have to create multiple 2TB virtual disks within the hardware RAID controller anymore. All that to save 10 minutes of extra work?
I was hoping to grow my VMDKs beyond 2TB... disappointing...
Posted by: Singh | 07/29/2011 at 09:58 AM
Cormac Hogan, the difference is no more messy extents and the ability to structure with fewer LUNs, ... which would be desirable if you've adopted tiered storage that's being rolled out by most vendors.
Posted by: Dan | 07/30/2011 at 06:30 AM
Hello Dan,
Absolutely. And this even gets more convoluted when you have to present these same LUNs/extents to all ESX hosts in a cluster.
Posted by: Chogan | 07/30/2011 at 06:43 AM
:o) sorry I read the name above instead of below the post. Previous post was in response to "Singh", but absolutely agree with Cormac, great post.
Posted by: Dan | 07/30/2011 at 06:55 AM
is there a tool now like fsck to repair damaged Datastores?
Posted by: Dennis | 08/12/2011 at 01:31 AM
Hi Dennis,
There are no tools shipped with ESXi, but there are internal tools available to diagnose datastore issues. If you suspect that you have a damaged datastore, open a Service Request with GSS for assistance & diagnosis.
Posted by: Chogan | 08/12/2011 at 01:51 AM
Only for diagnosis, or also for repair? Since we lost data caused by a corrupt VMFS, we are using only VRDMs for all data.
Posted by: Dennis | 08/12/2011 at 04:33 AM
Hi Dennis,
Sorry to hear about your experience, but yes, GSS has the expertise to diagnose and repair in certain scenarios, though obviously their ability is limited. It all depends on the type of issue.
Posted by: Chogan | 08/12/2011 at 06:21 AM
So, what is the proper way to get a larger than 2TB partition in a VM (Windows)? I’m concerned about performance. I was hoping to be able to present an 8TB partition to a few VMs, but I’m not sure what the best way to do it is while avoiding pitfalls such as a Guest OS corruption leaving the volume useless.
Also, some of my current VMFS-3 stores are very low on space and some warnings are showing in vSphere. I wonder whether they will upgrade to VMFS-5, or whether there are free space requirements that will cause the upgrade to fail.
Posted by: Nati | 08/17/2011 at 10:20 AM
Hi Nati,
You have two choices here - multiple 2TB VMDKs assigned to the same VM, and using the Guest OS Volume Manager to build an 8TB volume, OR, you can use pass-thru RDMs passed directly into the Guest. PT RDMs can now be much larger than before, as mentioned in the blog.
Good question on the amount of free space. In order for upgrade from VMFS-3 -> VMFS-5, you need at least 2 free file blocks and 1 free inode. If these are not available, the upgrade will not succeed.
Posted by: Chogan | 08/19/2011 at 02:10 AM
Are queue lengths (file locking) still an issue with VMFS-5? Have always had issues with multiple high-performance VMs in the same datastore queuing up commands and making overall performance suffer for everything in the datastore.. (Which is why we went to NFS)
Posted by: JD | 09/23/2011 at 09:28 AM
Hi John, thanks for commenting.
Have you looked at the Storage I/O Control feature which has been in vSphere since 4.1? This addresses exactly the issue you describe by allowing you to prioritize VMs and assigning each VM a certain amount of bandwidth to each datastore when contention arises.
Posted by: Chogan | 09/27/2011 at 04:42 AM
The big thing for me about the new vmfs 5 is the changes to the block size. We had a major meltdown of our vSphere 4 system. Our vendor configured our SAN raid controllers to 256Mb stripes and 8MB block size to support the 2TB vmdk. This resulted in a massive read overhead. Each 8MB block takes up 6.4 stripes. With our SAN we had 6 disks that had to be scanned 6 times with 2 being scanned 7 times. This is just under 40 scans to access a single 8MB block. We were getting read latency in the seconds!!! I am hoping the 1MB block size of vSphere 5 will alleviate this while still allowing the 2TB vmdk.
Our problem was the interaction of the larger than normal vmfs block size with the underlying raid stripe size and the unintended consequence to disk latency.
Moral of the story: small configuration can make HUGE impacts.
Posted by: Blair | 10/04/2011 at 09:27 AM
VMFS3 allowed a maximum of 8 systems to mount a volume concurrently. Has this limitation been changed in VMFS 5? And, secondly, can VMFS5 be used on ESXi 4.1, or does the system have to be upgraded to ESX5 first?
Posted by: jmayes | 01/20/2012 at 11:28 AM
This limitation has not changed in VMFS5/vSphere 5.0. However it is high on our agenda to address in a future release, since it has a direct impact on both our View & vCloud Director products. Both of these products use linked clones for provisioning VMs and would benefit from a higher number of hosts sharing a file.
On your second point, VMFS5 volumes are only recognised by ESXi 5.0 hosts. ESX hosts which wish to use VMFS5 will need to be upgraded to ESXi 5.0. vCenter will not allow you to upgrade a VMFS3 to VMFS5 unless it detects that all hosts accessing the datastore are running ESXi 5.0.
Posted by: Chogan | 01/23/2012 at 02:00 AM