In the previous blog posts we talked about how OVF packages can be deployed using the vSphere Client and OVF Tool, and how to use OVF to create self-configurable virtual machine templates. In this blog post we will look inside an OVF package to see how it is structured and organized.
OVF is an open standard developed by the Distributed Management Task Force (DMTF) with cooperation from VMware, Citrix, IBM, Microsoft, Sun and other companies. The standard has arisen to meet the growing demand from industry to create a portable format for virtual machines that is vendor and platform independent. The goal of OVF is to provide a standard format that can robustly and efficiently deliver software solutions inside a set of virtual machines. The OVF format needs to be extensible and flexible enough to be able to serve the needs of describing a simple single VM image as well as describing the next-generation, dynamic applications for the cloud. In this blog entry, we will discuss the basic file structure of the OVF package, as well as many of the extensibility features built into the format.
On the configuration side, OVF 1.0 standardizes the basic settings for a hypervisor, including virtual hardware, disks, networks and resource allocation. An OVF package can also carry additional meta-data, including:
- Product information that describes the software installed in the virtual machines, covering both operating systems and application-level components
- End-user-license agreements (EULA)
- Self-configuration parameters and deployment options for customization at deployment time
Ok, let's dive into the technical details to see how all this is achieved. We will cover three main areas: File layout and integrity, disk formats and compression, and the OVF descriptor and extensibility.
File Layout and Integrity
An OVF package consists of an OVF descriptor, a set of virtual disks, a set of localization bundles, a manifest, and a certificate (some of these are optional). For example:
The OVF descriptor (.ovf) is the main document of an OVF package. It contains all meta-data for the OVF package and has links to external files, such as virtual disks. We will get back into how this is structured in a moment.
The manifest (.mf) and certificate (.cert) files are optional and are used for integrity and authenticity checks. The manifest file contains the SHA1 digest of every file in the package (except for the .mf and .cert files themselves), and the certificate file contains a signed digest of the manifest file together with an X.509 certificate. If present, they must be in the same directory as the OVF descriptor and have the same base name. We will skip string bundles for localization for now; that is a topic for a future blog entry.
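To make the integrity check concrete, here is a minimal Python sketch of how a tool might verify a package against its manifest. It assumes each manifest line has the shape `SHA1(<file name>)= <hex digest>`; check that against the OVF specification before relying on it. The function names (`parse_manifest`, `verify_package`) are our own and not part of any VMware tool, and the sketch does not validate the certificate.

```python
import hashlib
import re
from pathlib import Path

# Assumed manifest line shape: "SHA1(<file name>)= <40 hex digits>"
_MANIFEST_LINE = re.compile(
    r"^SHA1\((?P<name>.+)\)\s*=\s*(?P<digest>[0-9a-fA-F]{40})\s*$"
)

def parse_manifest(text):
    """Return {file name: lowercase hex digest} for every recognised line."""
    entries = {}
    for line in text.splitlines():
        match = _MANIFEST_LINE.match(line.strip())
        if match:
            entries[match.group("name")] = match.group("digest").lower()
    return entries

def sha1_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large virtual disks do not fill memory."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_package(manifest_path):
    """Return the names of listed files that are missing or have a wrong digest."""
    manifest_path = Path(manifest_path)
    expected = parse_manifest(manifest_path.read_text())
    failures = []
    for name, digest in expected.items():
        target = manifest_path.parent / name
        if not target.is_file() or sha1_of_file(target) != digest:
            failures.append(name)
    return failures
```

An empty list from `verify_package` means every file listed in the manifest is present next to it and matches its recorded digest.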
An OVF package can be distributed as a set of discrete files, as shown above. For example, the files can be uploaded to a web server and the URL of the OVF descriptor (.ovf) can be emailed to the recipients. However, it is often more convenient to distribute a single file. The OVF specification defines a standard archive, called an Open Virtualization Format Archive (.ova), for exactly this purpose. This format is a "tarball" of the individual files that make up the OVF package (with certain restrictions). For example:
If you get an OVA package, try running tar tf <name>.ova and you can see the content. You can also create OVAs using the tar tool (or the OVF Tool or vSphere Client, of course). However, keep in mind that the OVA format is not simply a tar. It places certain restrictions on the ordering and naming of files. In particular, the OVF descriptor (.ovf) must be the first file and the files must be listed in the tar archive in the same order as they are listed in the OVF descriptor (see the OVF specification for all the details). These rules ensure that OVA archives are easy to stream – a tool or hypervisor does not need to download an entire OVA first and then unpack it.
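The ordering rule is easy to check programmatically. The following Python sketch reads an OVA as a forward-only stream with the standard `tarfile` module and reports the member names in archive order. The helper names are hypothetical, and the check covers only the "descriptor first" rule, not the full set of restrictions in the specification.

```python
import tarfile

def ova_member_names(stream):
    """List member names in archive order, reading the tar front to back.

    Mode "r|" treats the input as a non-seekable stream, which mirrors how
    a tool could consume an OVA while it is still being downloaded.
    """
    with tarfile.open(fileobj=stream, mode="r|") as archive:
        return [member.name for member in archive]

def descriptor_comes_first(names):
    """True if the first archive member looks like the OVF descriptor."""
    return bool(names) and names[0].lower().endswith(".ovf")
```

For a file on disk, something like `ova_member_names(open("MyPackage.ova", "rb"))` would give the same ordering that `tar tf` prints.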
Disk Formats and Compression
An OVF package is intended for distribution, so compression is the next big issue after ensuring package integrity. Compression is applied per file and works for both OVA archives and multi-file (.ovf) packages. In other words, you don't need to create an OVA to get file compression.
In the OVF descriptor, you can specify compression options on each external reference, such as a string bundle or an ISO file (using the compression attribute in the References section). The most important files to compress are the virtual disks. All OVF packages generated by VMware products use a variant of the VMDK format that is specifically designed for distribution and is already compressed; this is called the stream-optimized format. In this format, the content of a disk is stored, by default, in 64KB compressed chunks, and only non-zero chunks are included. The format is designed to be efficient to both consume and generate on the fly, and to provide good progress feedback while being downloaded.
Note that the VMDK format used for a deployed VM can, and typically will, be different from the stream-optimized format. For instance, when deploying on ESX, the flat VMDK format is often used. The conversion from the stream-optimized format (used in the OVF package) to the flat VMDK format (used at runtime) happens seamlessly during the import step and is performed on the fly as part of the download process.
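The idea behind chunked, sparse compression can be illustrated in a few lines of Python. To be clear, this is not the stream-optimized VMDK on-disk layout (which has its own headers and metadata); it is only a sketch of the principle, using `zlib` as an assumed compressor and hypothetical function names.

```python
import zlib

CHUNK_SIZE = 64 * 1024  # 64KB, the default chunk size mentioned above

def compress_nonzero_chunks(data, chunk_size=CHUNK_SIZE):
    """Return [(offset, compressed bytes)] for every chunk that is not all zeros."""
    chunks = []
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        if not any(chunk):
            continue  # all-zero chunk: nothing is stored for it
        chunks.append((offset, zlib.compress(chunk)))
    return chunks

def restore(chunks, total_size):
    """Rebuild the flat image; regions with no stored chunk stay zero-filled."""
    image = bytearray(total_size)
    for offset, payload in chunks:
        raw = zlib.decompress(payload)
        image[offset:offset + len(raw)] = raw
    return bytes(image)
```

Because each chunk carries its own offset and is compressed independently, a consumer can expand chunks as they arrive, which is the property that makes on-the-fly conversion during download possible.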
OVF Descriptor and Extensibility
Having covered the basic file-level infrastructure, compression, and integrity checking, let's take a look at the OVF descriptor (.ovf). The OVF descriptor is an XML document that stores or links to all information about the software contained in the OVF package. The purpose of the OVF descriptor itself is to provide the basic structure for embedding and discovering application meta-data. The outermost tag in the OVF descriptor is named Envelope, since the OVF descriptor is, in fact, modeled after an envelope into which a variety of notes can be put. The following outlines the structure of an OVF descriptor:
<Envelope xmlns="http://schemas.dmtf.org/ovf/envelope/1" … >
  <References>
    <File ovf:href="MyPackage.vmdk" ovf:id="disk1" ovf:size="68608"/>
  </References>
  … List of OVF sections …
  … A Virtual System or VirtualSystemCollection …
  … String bundle references …
</Envelope>
Each OVF descriptor has a required section, called References, that lists all files in the package (except for the descriptor itself, the manifest, and the certificate files). In this way a tool can always check an OVF package for integrity, or copy it, without understanding the content of each file or the meta-data sections that use the external file references.
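As a rough illustration of why the References section is useful, the Python sketch below lists every referenced file using only the standard library, without looking at any other section. It assumes the `ovf:` prefix is bound to the same envelope namespace as the default namespace, as in OVF 1.0 descriptors; the function name is our own.

```python
import xml.etree.ElementTree as ET

# Namespace of the OVF 1.0 envelope (taken from the Envelope example above)
OVF_NS = "http://schemas.dmtf.org/ovf/envelope/1"

def referenced_files(descriptor_xml):
    """Return one dict per <File> element in the References section."""
    root = ET.fromstring(descriptor_xml)
    files = []
    for node in root.findall(f"{{{OVF_NS}}}References/{{{OVF_NS}}}File"):
        files.append({
            "id": node.get(f"{{{OVF_NS}}}id"),
            "href": node.get(f"{{{OVF_NS}}}href"),
            "size": node.get(f"{{{OVF_NS}}}size"),
            # None when the file is stored uncompressed
            "compression": node.get(f"{{{OVF_NS}}}compression"),
        })
    return files
```

A copy or integrity tool only needs the `href` values from this list, plus the descriptor, manifest, and certificate, to know the complete set of files in the package.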
The rest of the OVF descriptor is made up of two parts: Entities and OVF sections. An entity is either a VirtualSystem or a VirtualSystemCollection, describing a single virtual machine or a container of multiple virtual machines, respectively. The syntax for describing a virtual machine is simply:
<VirtualSystem ovf:id="…">
  <Info> User-friendly description of the purpose of
         this entity (e.g., it describes a VM) </Info>
  <Name> Display name of the VM </Name>
  … OVF Sections …
</VirtualSystem>
An OVF package can also contain more than one virtual machine. This is done using the VirtualSystemCollection element:
<VirtualSystemCollection ovf:id="…">
  <Info> User-friendly description of the purpose of the entity
         (e.g., it describes a VM) </Info>
  <Name> Display name of the vApp </Name>
  … OVF Sections …
  … child entities – either VirtualSystem or VirtualSystemCollection …
</VirtualSystemCollection>
The real meat of a VirtualSystem or VirtualSystemCollection is in the OVF sections. An OVF section is an XML fragment that contains the data for a specific functionality or aspect, such as virtual hardware requirements, operating system type, or maybe a dependency on an external system. The general format of an OVF section is as follows:
<myns:MyOvfSection ovf:required="true or false">
  <Info>A description of the purpose of the section</Info>
  … section specific content …
</myns:MyOvfSection>
The fully-qualified element tag (myns:MyOvfSection in the above example) uniquely identifies the OVF section. The ovf:required attribute and the <Info> element are part of the extensible design of OVF. OVF is designed to handle both forward and backward compatibility. An OVF-compliant product cannot be expected to know about all possible sections, since OVF sections can be created by third parties or defined in newer versions of the OVF specification. The ovf:required attribute tells a tool whether it can safely ignore a section it does not understand, or whether it must fail. The <Info> element provides a user-friendly description of the section that can be shown to the user in either case. Note that ovf:required defaults to true, so it is commonly left out.
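A consumer's handling of unknown sections could look roughly like the Python sketch below. It makes a simplifying assumption that every child of a VirtualSystem other than <Info> and <Name> is an OVF section; a real implementation would rely on the XML schema instead. The function name and the `understood_tags` parameter are hypothetical.

```python
import xml.etree.ElementTree as ET

OVF_NS = "http://schemas.dmtf.org/ovf/envelope/1"

# Children of a VirtualSystem that are not sections (simplifying assumption)
NON_SECTION_TAGS = {f"{{{OVF_NS}}}Info", f"{{{OVF_NS}}}Name"}

def classify_unknown_sections(virtual_system, understood_tags):
    """Split sections the tool does not understand into (ignorable, blocking)."""
    ignorable, blocking = [], []
    for child in virtual_system:
        if child.tag in NON_SECTION_TAGS or child.tag in understood_tags:
            continue
        # ovf:required defaults to true when the attribute is absent
        required = child.get(f"{{{OVF_NS}}}required", "true").strip().lower()
        if required in ("true", "1"):
            blocking.append(child.tag)
        else:
            ignorable.append(child.tag)
    return ignorable, blocking
```

A tool would refuse to deploy if `blocking` is non-empty, and could show the <Info> text of each ignorable section as a warning.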
An example of a custom third-party section in an OVF descriptor could be:
<Info>Useful info for incident tracking purposes</Info>
<BuildSystem>Acme Corporation Official Build System</BuildSystem>
The OVF 1.0 specification defines 10 core sections that cover the basic vApp information:
- DiskSection: Information about all virtual disks in the OVF package
- NetworkSection: Defines the network topology inside an OVF package
- ResourceAllocationSection: Resource settings for a VirtualSystemCollection
- AnnotationSection: A description or annotation on an entity
- ProductSection: Information about the software installed in the guest, including properties for self-configuration
- EulaSection: An end-user license agreement
- StartupSection: Defines the start-up and shutdown order for the children in a VirtualSystemCollection
- DeploymentOptionSection: Defines a set of pre-defined configurations that can be selected at deployment time
- OperatingSystemSection: Specifies the operating system installed in a guest
- InstallSection: Specifies whether a guest OS needs to be booted once (or more) before the application is fully deployed
These sections are described in the OVF Specification. See DSP0243 for the details, as well as the accompanying XML Schemas DSP8023 and DSP8027. The OVF whitepaper DSP2017 also includes many examples of how these are used. The set of OVF sections is expected to grow significantly as the OVF format gains traction.
We have only dived a little into the OVF descriptor, focusing on the file layout and the overall structure. We have covered the basic features around distribution support and extensibility. In future blog posts, we will dive into specific areas, such as localization, virtual hardware description, delta-disk compression, custom OVF sections and extensibility, multi-VM encodings, and much more. So until next time, keep deploying those vApps!