As you use a computer, over time your files will tend to get scattered around the disk. This scattering is called fragmentation, and can slow down performance as the disk head has to seek back and forth between fragments (Note: doesn’t apply to solid state media, which doesn’t involve disk heads). Defragmentation (or defragging) is the act of reversing the process, putting order back into your system. With virtual machines, proper defragmentation is a little more complex than it is on a physical machine because of the layers involved.
Before we begin, it’s important to note that defragmentation isn’t a necessary task – your virtual machine will still work just fine even if you never defrag, and the effects of fragmentation are usually not noticeable. Personally, I’ve never felt the need to defrag. However, if for some reason you do feel the need to defrag, here’s how to do it. Note that snapshots get in the way of proper defragmenting.
Consider this hypothetical sparse virtual disk. The guest files (yellow) are fragmented in the guest filesystem (green), which is in turn fragmented in the sparse virtual disk (red), which is finally scattered over the host filesystem (blue). In all, a bit of a mess. The first thing to do is defrag the guest filesystem using your favorite third-party tool – this will of course depend on the guest OS you’re using.
Defragmenting the guest will organize guest files nicely on the virtual disk. In this example, it’s put the four-block file back together and eliminated some holes. However, defragging the guest on a sparse virtual disk will probably also cause the .vmdk file to grow (if previously untouched blocks need to be used for any reason, which they probably will). For this reason, it’s a bad idea to run automatic defragmentation on a sparse virtual disk – it’ll just cause the sparse virtual disk to keep growing, which sort of defeats the point of having a sparse virtual disk. If you really need automatic defragmentation, consider using a preallocated virtual disk.
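To make the growth effect concrete, here is a toy model in Python. It is only a sketch under simplifying assumptions: the `SparseDisk` class and `defragment` function are hypothetical names invented for illustration, and a real sparse .vmdk allocates space in larger units than single blocks. The point is just that moving a file into previously untouched blocks makes the disk bigger on the host, even though the guest ends up using the same amount of space.

```python
class SparseDisk:
    """Toy model of a sparse virtual disk: host space is allocated only for
    blocks that have ever been written, and ordinary guest activity never
    gives that space back."""

    def __init__(self, capacity_blocks):
        self.capacity_blocks = capacity_blocks
        self.allocated = set()  # block numbers currently backed by host storage

    def write(self, block):
        if not 0 <= block < self.capacity_blocks:
            raise ValueError("block out of range")
        self.allocated.add(block)

    def size_on_host(self):
        return len(self.allocated)


def defragment(disk, file_blocks, new_start):
    """Copy a file's blocks into a contiguous run starting at new_start.

    The old blocks become free inside the guest, but the sparse disk keeps
    the host space it already allocated for them."""
    new_blocks = list(range(new_start, new_start + len(file_blocks)))
    for block in new_blocks:
        disk.write(block)
    return new_blocks


disk = SparseDisk(capacity_blocks=100)
fragmented_file = [3, 17, 42, 58]  # a four-block file, scattered around
for b in fragmented_file:
    disk.write(b)

size_before = disk.size_on_host()
contiguous_file = defragment(disk, fragmented_file, new_start=60)
size_after = disk.size_on_host()

print(size_before, size_after)  # 4 8
```

In this model the file is now contiguous, but the disk has doubled in size on the host. Run that on a schedule and the sparse disk only ever grows.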
Guest defragmentation causing the .vmdk to grow is also why you want to defragment the guest before anything else – it makes no sense to clean up lower layers if you’re immediately going to mess them up again by cleaning up the upper layers. Clean up from the inside out, not the outside in.
Follow up by shrinking the virtual disk using VMware Tools (note: requires not having snapshots). This step isn’t really related to defragmentation, but I think it’d be a good idea anyway. This will free up unused space from the .vmdk file.
After doing that, go to the virtual machine’s Settings and check under the Hard Disks pane. Fusion will tell you if it thinks disk cleanup is required – if so, do it, if not, it’s probably fine to skip this step. You can also manually defrag the .vmdk file with vmware-vdiskmanager, but I wouldn’t recommend doing this unless you know what you’re doing and why.
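If you do decide to go the manual route, the call is roughly the one sketched below. This is a minimal Python wrapper, not an official recipe: it assumes `vmware-vdiskmanager` is on your PATH (with Fusion it normally lives inside the application bundle, so you may have to pass the full path yourself), and the .vmdk path in the example is a made-up placeholder. The `-d` option is the tool's defragment operation. The wrapper functions `defrag_command` and `defragment_vmdk` are hypothetical helpers.

```python
import subprocess

# Assumption: the tool is reachable under this name. Adjust to the full path
# inside your Fusion installation if it is not on PATH.
VDISKMANAGER = "vmware-vdiskmanager"


def defrag_command(vmdk_path, tool=VDISKMANAGER):
    """Build the argument list for defragmenting a virtual disk (-d)."""
    return [tool, "-d", vmdk_path]


def defragment_vmdk(vmdk_path, tool=VDISKMANAGER):
    """Run the defragment and raise if the tool reports failure.

    Only call this with the virtual machine shut down."""
    subprocess.run(defrag_command(vmdk_path, tool), check=True)


# Show the command that would be run for a placeholder disk path.
print(defrag_command("Example.vmwarevm/Example.vmdk"))
```

Printing the command first, rather than running it blindly, is deliberate: it lets you check which disk file you are about to touch.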
Finally, shut down the virtual machine (if you haven’t already) and defrag the host using your favorite third-party tool. While Apple claims OS X makes defragmenting mostly unnecessary, I believe OS X’s automatic defragmentation only applies to files smaller than 20 MB. Chances are your virtual machine is larger than that 🙂
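If you are curious which of your virtual machine's files are over that size, a quick script can list them. This is a rough sketch: the 20 MB threshold is simply the figure mentioned above, `files_over_threshold` is a hypothetical helper, and the bundle path in the example is a placeholder you would replace with your own.

```python
import os

THRESHOLD_BYTES = 20 * 1024 * 1024  # the 20 MB figure from the text


def files_over_threshold(bundle_path, threshold=THRESHOLD_BYTES):
    """Return (path, size) pairs for files under bundle_path larger than threshold."""
    large = []
    for root, _dirs, files in os.walk(bundle_path):
        for name in files:
            path = os.path.join(root, name)
            size = os.path.getsize(path)
            if size > threshold:
                large.append((path, size))
    return sorted(large)


# Placeholder path; an empty list is printed if the bundle does not exist.
for path, size in files_over_threshold("Example.vmwarevm"):
    print(f"{size / (1024 * 1024):8.1f} MB  {path}")
```

Anything it lists is too large for the automatic behavior described above, so if it gets defragmented at all, a host defragmenter would have to do it.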
Again, I think that for the most part, defragmentation isn’t necessary. Even though a virtual machine has three opportunities to become fragmented (vs. a normal file, which has only one), the actual effect isn’t 3x worse. The major penalty from a fragmented disk is the time it takes for a disk head to seek to the next segment. Even though there are three layers that can get fragmented instead of just one, only the host filesystem involves the disk head (and if you’re using a solid state drive, none of them do).
4 comments have been added so far
Could there be an impact on deduplication processes on a storage box?
UML (User-mode Linux) has a feature that lets guests mount a directory from the host; it simply passes all requests to the host filesystem.
Isn’t it possible to write a Windows device driver that can do this? For example, a driver could pass all file IO through Wine, and thereby use its emulation of the Windows filesystem.
This would totally eliminate the need for virtual disk files and probably speed up disk IO a lot.
And it would also eliminate the need to run defragmentation inside the VM.
Tomas: Can you be more specific about what sort of impact you have in mind?
falde: You’re pretty much describing HGFS shared folders, which are available in most guests that have Tools support. It doesn’t completely eliminate the need for a virtual disk, since such a service will depend on the guest filesystem and you need to bootstrap that somehow. There’s also additional overhead in translating between filesystems. You’re right that if you keep your files on an HGFS shared folder (or network share, or anything else not actually on the virtual disk) you don’t need to worry as much about fragmentation.