Delta Disk Support in OVF
Here is a conceptual figure of the virtual disks in a delta disk compressed OVF package and how it looks when it is deployed:
In this blog post you will learn all about how to design your OVF package to take advantage of delta disks and how to apply this type of compression to your package using OVF Tool.
- We start out by looking at a brief example of what a typical OVF package with multiple disks could look like and how delta disk compression would reduce its size.
- Next we look at what delta disk hierarchies are and how they are expressed in the OVF descriptor. Understanding them is important for making delta disk compression work.
- Then we give some advice on how you should construct your OVF package to utilize delta disk compression.
- Here we also give some tips on how to shrink your virtual disks to reduce the size of your OVF package.
- Finally, we show how OVF Tool can delta disk compress your OVF package.
Example
Let us assume the two VMs run the same Linux OS (for example Ubuntu Server 9). Then much of the data on the two disks would be identical and only the bits concerning the Apache HTTP Server, PHP software, and MySQL would be different. Here is a rough estimate of how much space each component will need when stored on a compressed virtual disk:
- Ubuntu Server 9: 500 MB
- Apache HTTP Server and PHP: 50 MB
- MySQL: 50 MB
SizeOf(Web server) + SizeOf(Database) = (500 MB + 50 MB) + (500 MB + 50 MB) = 1,100 MB
In this blog post we will explain how this space can be reduced using the delta disk feature supported by the OVF specification and OVF Tool. Using delta disk compression we can extract all the components that are equal in the two VMs (the Linux OS part) and keep only one copy of them. This leaves us with an OVF package that only takes up about 600 MB of space (500 MB + 50 MB + 50 MB).
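The estimate above can be checked with a few lines of Python. This is only a sketch of the arithmetic, using the rough component sizes listed earlier; the function names are made up for illustration and real disk sizes will vary.

```python
# Rough component sizes in MB, taken from the estimates above.
OS_MB = 500   # Ubuntu Server base install
WEB_MB = 50   # Apache HTTP Server and PHP
DB_MB = 50    # MySQL


def plain_package_size(os_mb, per_vm_extras):
    """Every VM disk carries its own full copy of the operating system."""
    return sum(os_mb + extra for extra in per_vm_extras)


def delta_package_size(os_mb, per_vm_extras):
    """One shared parent disk holds the OS; each VM adds only its delta."""
    return os_mb + sum(per_vm_extras)


extras = [WEB_MB, DB_MB]
print(plain_package_size(OS_MB, extras))  # 1100
print(delta_package_size(OS_MB, extras))  # 600
```

The saving grows with the number of VMs that share the same base: each additional VM costs only its own delta instead of another full copy of the operating system.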
Technical Details of Delta Disks
In the figure we see a tree with three nodes: Disk 1 (the root) with red data, Disk 2 with blue data, and Disk 3 with green data. A disk element in an OVF descriptor can refer to any of the nodes in the delta disk hierarchy. For instance, if a disk in the OVF descriptor refers to Disk 3, it will essentially get the flattened Disk 3 shown in the lower half of the picture when it is deployed. The deployment semantics of a delta disk node is basically to overlay the nodes in the parent chain (omitting the empty space) from the root all the way down to the chosen delta disk node. More concretely, to get the flattened Disk 3 in the example, we would first write Disk 1. Then we overwrite this with the contents of Disk 2 (omitting the empty space) and finally with the contents of Disk 3 (omitting the empty space).
In the above paragraph we mention empty space. Empty space is simply a segment of a disk containing only zeroes. The term can be a bit misleading, since such a segment may actually be in use by the VM that owns the disk. However, for all intents and purposes it does not matter which way we look at it.
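The overlay semantics described above can be sketched in a few lines of Python. This is a simplified model, not how VMDK files are actually laid out: each disk is a list of blocks, the EMPTY marker stands for a block of zeroes, and the block contents ("r", "b", "g" for red, blue, and green data) are invented for illustration.

```python
EMPTY = 0  # hypothetical marker for a block that contains only zeroes


def flatten(chain):
    """Overlay disks from the root down to the chosen node.

    `chain` is a list of equally sized block lists, root first.
    Empty blocks in a child leave the parent's data visible.
    """
    flat = list(chain[0])
    for disk in chain[1:]:
        for i, block in enumerate(disk):
            if block != EMPTY:
                flat[i] = block
    return flat


disk1 = ["r", "r", "r", EMPTY, EMPTY, EMPTY]      # root
disk2 = [EMPTY, "b", EMPTY, "b", EMPTY, EMPTY]    # child of disk1
disk3 = [EMPTY, EMPTY, "g", EMPTY, "g", EMPTY]    # child of disk2

print(flatten([disk1, disk2, disk3]))
# ['r', 'b', 'g', 'b', 'g', 0]
```

Note that a child can only add or replace data in this model; it cannot turn a non-empty parent block back into zeroes, which matches the "omitting the empty space" rule.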
In the figure, parentRef annotates the arrows that tie the disks together. This is also the name of the attribute in the OVF descriptor that links Disk elements together; it is set on Disk elements in the DiskSection of the OVF descriptor. This is what the disk section with the three disks could look like:
<DiskSection>
  <Info>Meta-information about the virtual disks</Info>
  <Disk ovf:capacity="1073741824"
        ovf:diskId="disk1"
        ovf:fileRef="diskFile1"
        ovf:format="http://www.vmware.com/interfaces/specifications/vmdk.html#streamOptimized" />
  <Disk ovf:capacity="1073741824"
        ovf:diskId="disk2"
        ovf:fileRef="diskFile2"
        ovf:parentRef="disk1"
        ovf:format="http://www.vmware.com/interfaces/specifications/vmdk.html#streamOptimized" />
  <Disk ovf:capacity="1073741824"
        ovf:diskId="disk3"
        ovf:fileRef="diskFile3"
        ovf:parentRef="disk2"
        ovf:format="http://www.vmware.com/interfaces/specifications/vmdk.html#streamOptimized" />
</DiskSection>
The LAMP example can be described as this delta disk hierarchy:
<DiskSection>
  <Info>Meta-information about the virtual disks</Info>
  <Disk ovf:capacity="1073741824"
        ovf:diskId="parentDisk"
        ovf:fileRef="parentDiskFile"
        ovf:format="http://www.vmware.com/interfaces/specifications/vmdk.html#streamOptimized" />
  <Disk ovf:capacity="1073741824"
        ovf:diskId="WebServerDisk"
        ovf:fileRef="WebServerDiskFile"
        ovf:parentRef="parentDisk"
        ovf:format="http://www.vmware.com/interfaces/specifications/vmdk.html#streamOptimized" />
  <Disk ovf:capacity="1073741824"
        ovf:diskId="DataBaseDisk"
        ovf:fileRef="DatabaseDiskFile"
        ovf:parentRef="parentDisk"
        ovf:format="http://www.vmware.com/interfaces/specifications/vmdk.html#streamOptimized" />
</DiskSection>
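To see how a tool might walk such a hierarchy, here is a minimal Python sketch that reads a DiskSection and resolves the parent chain of a disk, root first. It assumes the OVF 1.x envelope namespace is declared on the element (the snippets above omit the namespace declarations for brevity), and the helper names attr and parent_chain are made up for this example.

```python
import xml.etree.ElementTree as ET

# Assumption: the OVF 1.x envelope namespace, bound to both the
# default prefix and the ovf: prefix, as in a typical descriptor.
OVF_NS = "http://schemas.dmtf.org/ovf/envelope/1"

SAMPLE = f"""
<DiskSection xmlns="{OVF_NS}" xmlns:ovf="{OVF_NS}">
  <Disk ovf:diskId="parentDisk" ovf:fileRef="parentDiskFile"/>
  <Disk ovf:diskId="WebServerDisk" ovf:fileRef="WebServerDiskFile"
        ovf:parentRef="parentDisk"/>
  <Disk ovf:diskId="DataBaseDisk" ovf:fileRef="DatabaseDiskFile"
        ovf:parentRef="parentDisk"/>
</DiskSection>
"""


def attr(name):
    """Return the namespace-qualified name ElementTree uses for ovf:name."""
    return f"{{{OVF_NS}}}{name}"


def parent_chain(disk_section_xml, disk_id):
    """Return the disk ids from the root down to `disk_id`."""
    root = ET.fromstring(disk_section_xml.strip())
    # Map each diskId to its parentRef (None for a root disk).
    parents = {
        disk.get(attr("diskId")): disk.get(attr("parentRef"))
        for disk in root.iter(attr("Disk"))
    }
    chain = []
    current = disk_id
    while current is not None:
        if current in chain:
            raise ValueError(f"parentRef cycle at {current!r}")
        chain.append(current)
        current = parents[current]  # KeyError for an unknown disk id
    return list(reversed(chain))


print(parent_chain(SAMPLE, "WebServerDisk"))
# ['parentDisk', 'WebServerDisk']
```

Deploying WebServerDisk would then mean writing the disks in this order, each one overlaying the previous.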
Preparing an OVF Package for Delta Disk Compression
The second requirement can be difficult to satisfy if you are not careful about how you construct the OVF package, but there are ways to do it. To explain how, let us return to the LAMP stack example from the beginning of the blog post and see how we can prepare it for delta disk compression. This LAMP stack had one Linux VM running Apache HTTP Server and PHP and another Linux VM running MySQL. Each VM had a single disk.
The above example shows the canonical way to get the best results from delta disk compression when multiple VMs use the same operating system. To summarize:
- Install a plain operating system in a VM;
- Clone the plain VM the number of times you need for your solution;
- Install the remaining software specific to each VM.
If cloning is not an option when making the OVF package, then perhaps VMware Studio is. It can create VMs well suited for delta disk compression, since it builds each VM's operating system and other software components in a scripted manner that can be replayed to produce almost identical VMs.
Shrinking the Disks
When you export the VMs in your OVF package, you want to make sure that all unused space is zeroed out, since zeroes compress really well in the VMDK disk format. However, swap space and deleted files often still occupy space on disk, since most operating systems do not eagerly zero them out by default. This means that even though your VM reports that it only uses about 500 MB, its disk may actually take up a lot more space. Even worse, you may have confidential information on, e.g., your swap drive or in old deleted files that you do not want to distribute with the OVF package. There are several ways to solve this problem. On most Linux distributions you can do the following to clean up a disk before you export the VM: 1) un-mount the swap drive; 2) write a single file containing only zeroes to disk, as large as possible; 3) delete the file immediately after you have created it. On the command line you can do these three steps by invoking these commands:
- /sbin/swapoff -a (this will un-mount all swap disks)
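- dd if=/dev/zero of=zeroFile.tmp (this runs until the disk is full and then stops with a "No space left on device" error, which is expected)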
- rm zeroFile.tmp
We start out by installing VMware Tools on the Windows Server 2008 VM. Once it is installed, open VMware Tools and choose “Shrink…”. This will zero out the disk. To zero out the swap disk you need to set an option under Administrative Tools. Go to Administrative Tools -> Local Security Policy -> Security Settings -> Security Options and enable the policy “Shutdown: Clear virtual memory pagefile”. When you shut down the VM, the swap disk will then be zeroed out. Please note, however, that enabling this option will increase the shutdown time significantly for large swap disks. One way of working around this problem could be to first delete the swap disk, reboot the VM, disable the option again (hoping that no data is written to the swap disk in the meantime), and then shut down the VM before putting it in an OVF package.
Creating an OVF Package with Delta Disk Compression using OVF Tool
ovftool --makeDeltaDisks LAMP.ovf output-dir/
The disks that OVF Tool generates are compressed in the VMDK virtual disk format, but it is possible to apply a second layer of compression, which may yield even smaller disks, by using the --compress option. Use --compress=9 for the best compression. On a package the size of the LAMP OVF package (about 600 MB) it saves about 30-40 MB of disk space (in our experience). Delta disk compressing our LAMP OVF package with this extra option is then simply done by invoking:
ovftool --makeDeltaDisks --compress=9 LAMP.ovf output-dir/