Yep - this old chestnut.
This has come up time and time again, and I am going to share with you some conversations that have been occurring within VMware on this topic. In fact, we've been having these conversations for a long time now.
What is it that defragmentation is supposed to give you?
Well, historically, if you ran a defragmentation* operation against an OS disk (typically Windows), you would expect to see a performance improvement. Defragmentation moves blocks around the disk to bring together blocks belonging to the same file in an effort to make the file contiguous on disk. This means that sequential I/O operations should be faster after a defrag. Windows 7, for example, includes a Disk Defragmenter as part of its System Tools.
Defragmenting the Guest OS of a VM is very different to running a defrag on a physical host with a local disk. Typically you are going to have multiple VMs running together on a VMFS or NFS volume. Therefore the overall I/O to the underlying LUN is going to be random, so defragmenting individual Guest OSes is not really going to help performance. However, there are other concerns that you need to keep in mind. The easiest way to explain the concerns is to give you some scenarios of what might happen to a VM which is defragmented, and what impact it has on the various vSphere technologies. You can then make up your own mind about whether it is a good idea or not.
- Thin Provisioned VMs. If you defragment a Thin Provisioned VM, as file blocks are moved around, the TP VMDK bloats up, consuming much more disk space.
- Linked Clone VMs (vCloud Director, View). In the case of a VM running off of a linked clone, the defragmenter bloats up the linked clone redo logs.
- Replicated VMs (Site Recovery Manager, vSphere Replication). If your VM is being replicated and you defragment it on the protected site, the moved blocks count as changed data, so a lot of data could well be sent over the WAN to the recovery site.
- Snapshot'ed VMs. This is a similar use case to Linked Clones. Running a defrag in any VM that is running off of a snapshot would cause the snapshot to inflate considerably, depending on how many blocks were moved during the defrag operation.
- Changed Block Tracking (VMware Data Recovery). The CBT feature is used heavily by backup products, including VMware Data Recovery (VDR). This feature tracks which of a VM's disk blocks have changed, so that only those blocks need to be backed up. If a defrag is run during a backup operation, the number of blocks that change will increase, which means more data will have to be backed up, meaning a longer backup time.
- Storage vMotion. Storage vMotion also uses CBT in vSphere 4.0. If a VM was being Storage vMotion'ed when a defrag operation was initiated, it would also impact the time to complete the operation since the defrag is changing blocks during the migration.
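The common thread in the scenarios above is that a defrag rewrites blocks whose contents have not logically changed, and every layer underneath the Guest OS sees those rewrites as new data. Here is a rough sketch of that effect as a toy model in Python. It is not how vSphere actually accounts for blocks; the function name `simulate_defrag`, the block counts, and the assumption that every moved block lands on a never-written block are all made up for illustration.

```python
import random

def simulate_defrag(disk_blocks=1000, used_blocks=300, moved_fraction=0.5, seed=0):
    """Toy model (hypothetical): count what a defrag dirties below the Guest OS."""
    rng = random.Random(seed)

    # Blocks the guest file system is using, scattered across the virtual disk.
    in_use = set(rng.sample(range(disk_blocks), used_blocks))
    # Assumption: the thin disk has only allocated the blocks currently in use.
    allocated = set(in_use)
    # Blocks written since the last snapshot, backup or replication sync.
    changed = set()

    # Assumption: destinations are blocks that have never been written before.
    free = [b for b in range(disk_blocks) if b not in in_use]
    rng.shuffle(free)

    to_move = rng.sample(sorted(in_use), int(used_blocks * moved_fraction))
    for src in to_move:
        dst = free.pop()
        in_use.remove(src)   # the guest no longer uses the old location...
        in_use.add(dst)      # ...and uses the new one instead
        allocated.add(dst)   # the thin disk grows; the old block is not handed back
        changed.add(dst)     # snapshot / CBT / replication all see a changed block

    return {
        "guest_used": len(in_use),
        "thin_allocated_before": used_blocks,
        "thin_allocated_after": len(allocated),
        "changed_blocks": len(changed),
    }

if __name__ == "__main__":
    print(simulate_defrag())
```

In this model the guest still uses 300 blocks after the defrag, but the thin disk has grown from 300 to 450 allocated blocks, and 150 blocks are flagged as changed even though no file content is different. The same 150 blocks would land in a snapshot or linked clone redo log, be picked up by a CBT-based backup, or be shipped to a replication target.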
Defragmentation also generates more I/O to the disk. This could be more of a concern to customers than any possible performance improvement that might be gained from the defrag. I should point out that I have read that, internally at VMware, we have not observed any noticeable improvement in performance after a defragmentation of Guest OSes residing on SAN or NAS based datastores.
I also want to highlight an additional scenario that uses an array based technology rather than a vSphere technology. If your storage array is capable of moving blocks of data between different storage tiers (SSD/SAS/SATA), e.g. EMC FAST, then defragmentation of the Guest OS doesn't really make much sense. If your VM has been running for some time on tiered storage, then in all likelihood the array has already learnt where the hot-blocks are, and has relocated these onto the SSD. If you now go ahead and defrag, and move all of the VM's blocks around again, the array is going to have to relearn where the hot-spots are.
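To make the "relearn" point concrete, here is a deliberately simplified Python sketch. It assumes the array keeps access statistics per block address and promotes the busiest addresses to SSD. Real tiering products such as EMC FAST work at their own granularity and with their own policies, none of which is modelled here; the function names and numbers are hypothetical.

```python
def promote_hottest(access_counts, ssd_capacity):
    """Return the set of block addresses a (hypothetical) array would keep on SSD."""
    ranked = sorted(access_counts, key=access_counts.get, reverse=True)
    return set(ranked[:ssd_capacity])

def ssd_hit_rate(ssd_blocks, accessed_blocks):
    """Fraction of the accessed addresses that are currently on SSD."""
    hits = sum(1 for b in accessed_blocks if b in ssd_blocks)
    return hits / len(accessed_blocks)

def tiering_demo(ssd_capacity=10):
    # History the array has learnt: addresses 0-9 are hot, 10-99 are cold.
    access_counts = {addr: (100 if addr < 10 else 1) for addr in range(100)}
    ssd = promote_hottest(access_counts, ssd_capacity)

    hot_addresses = list(range(10))
    before = ssd_hit_rate(ssd, hot_addresses)

    # Assumption: the defrag relocates the same hot data to addresses 50-59.
    # The array's placement is still based on the old addresses.
    relocated = [addr + 50 for addr in hot_addresses]
    after = ssd_hit_rate(ssd, relocated)
    return before, after

if __name__ == "__main__":
    print(tiering_demo())
```

In this toy case all of the hot data is served from SSD before the defrag and none of it immediately afterwards, because the heat statistics are tied to addresses that the hot data no longer occupies. The array only recovers once it has gathered enough new statistics to promote the new locations.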
If you automate the defrag to run regularly, I think this could cause a performance decrease rather than give you any sort of performance gain if the VM is deployed on a datastore backed by tiered storage. Be aware that a scheduled defrag may already be enabled on some Operating Systems.
What do the Storage Array vendors say?
NetApp have a very good vSphere/NetApp interoperability white paper in which they briefly discuss this topic. Quoting directly from the paper - "VMs stored on NetApp storage arrays should not use disk defragmentation utilities because the WAFL file system is designed to optimally place and access data at a level below the guest operating system (GOS) file system. If a software vendor advises you to run disk defragmentation utilities inside of a VM, contact the NetApp Global Support Center before initiating this activity."
What do you recommend?
My recommendation is not to use any defrag tools in the Guest OS. If you are being advised to use a defragmentation tool, the content in this blog posting should give you a number of questions to raise about the possible outcomes.
* [1-March-2013] I wanted to add a clarification with regard to the defrag operation. This article is written with the generic Windows OS defragmenter in mind. Customers should be aware that VMware partners with vendors such as Condusiv/Diskeeper & Raxco, who provide products which intelligently avoid fragmentation occurring in the first place, and which also understand features like snapshots, etc. If excessive fragmentation is an issue in your environment, have a look at what these partners can offer.