
VMware Fusion 201: Split vs. Monolithic Virtual Disks

In addition to the sparse and preallocated virtual disks, there’s another, orthogonal set of options: split and monolithic. You can have a sparse/split virtual disk (the default in Fusion 2.0), a sparse/monolithic virtual disk (the default in Fusion 1.x), a preallocated/split virtual disk, or a preallocated/monolithic virtual disk.

While sparse vs. preallocated affects how the data inside the guest is stored in the .vmdk file, split vs. monolithic affects how the .vmdk file is stored on the host. In a monolithic virtual disk, everything in a virtual disk is kept in one file – this includes metadata about the virtual disk (e.g. size, geometry, parent disk, and so on). Note: You might still have multiple vmdk files in a virtual machine (either because you have multiple disks or because you have snapshots). The previous posts about sparse and preallocated virtual disks showed monolithic disks.

[Figure: split sparse virtual disk]
In contrast, a split virtual disk is, well, split into multiple files. There’s a small, plaintext metadata file, and a number of slice files. If you have a preallocated/split virtual disk, each slice (except possibly the last) will be 2 GB. If you have a sparse/split virtual disk, each slice can be up to 2 GB, depending on how much data falls into that slice. Preallocated/split virtual disks have a -f### suffix (where ### is a number), while sparse/split virtual disks use a -s### suffix.
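To make the slice layout concrete, here is a minimal Python sketch that lists the slice files you would expect for a split disk of a given size. The 2 GB slice size and the -f/-s suffixes come from the description above; the three-digit, zero-padded numbering starting at 001, the function names, and the "Windows" base name are assumptions made for illustration, not something Fusion exposes.

```python
GB = 1024 ** 3
SLICE_BYTES = 2 * GB  # maximum slice size described above


def split_slice_names(base, disk_bytes, preallocated):
    """Return illustrative slice file names for a split virtual disk."""
    letter = "f" if preallocated else "s"  # -f### preallocated, -s### sparse
    count = (disk_bytes + SLICE_BYTES - 1) // SLICE_BYTES  # round up
    # Assumption: three-digit, zero-padded numbering starting at 001.
    return [f"{base}-{letter}{i:03d}.vmdk" for i in range(1, count + 1)]


def preallocated_slice_sizes(disk_bytes):
    """Every slice is 2 GB except possibly the last one."""
    full, rest = divmod(disk_bytes, SLICE_BYTES)
    return [SLICE_BYTES] * full + ([rest] if rest else [])


if __name__ == "__main__":
    for name in split_slice_names("Windows", 5 * GB, preallocated=True):
        print(name)
    print([size // GB for size in preallocated_slice_sizes(5 * GB)])
```

For a sparse/split disk the names follow the same pattern with -s, but the size of each slice on the host depends on how much data has been written into it, so it can't be computed from the disk size alone.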

So why choose one over the other? Split disks are essential in some cases. For example, some filesystems (such as FAT) can’t handle files larger than a certain size, typically 4 GB. Because each slice of a split virtual disk stays below this limit, you can keep a virtual machine on such a filesystem without losing data. Another advantage of split disks is that you don’t need as much free space to consolidate snapshots or shrink virtual disks. We try hard not to lose data, so rather than doing these operations in place (where something could go wrong if the power fails), we make a copy and only replace the original when we’re sure the copy succeeded. Because of this, if you use a monolithic disk, you might need as much free space as the virtual disk occupies to complete such an operation. With a split virtual disk, by contrast, you only need 2 GB (or less, if you have a sparse slice that’s smaller), since each slice can be processed individually.
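The free-space argument comes down to simple arithmetic. As a rough sketch, assuming files are copied and replaced one at a time as described above, the scratch space you need is the size of the largest single file. The function name and the example sizes below are made up for illustration.

```python
GB = 1024 ** 3


def scratch_space_needed(file_sizes):
    """Free space needed if each file is copied, verified, then swapped in.

    Assumption: only one file is processed at a time, so the largest
    file determines how much free space is required.
    """
    return max(file_sizes, default=0)


# A 30 GB monolithic disk is a single 30 GB file on the host.
monolithic = [30 * GB]

# The same disk as preallocated/split: fifteen 2 GB slices.
split = [2 * GB] * 15

if __name__ == "__main__":
    print(scratch_space_needed(monolithic) // GB, "GB for monolithic")
    print(scratch_space_needed(split) // GB, "GB for split")
```

With sparse slices, the list would hold each slice's actual size on the host, so the result can be smaller than 2 GB.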

Monolithic disks have some advantages too. Besides the more obvious limited computing resources such as CPU or disk space, a less well-known one is file handles. An OS needs to keep track of which files are in use, and it has a limited number of file handles to do this with. If the OS runs out of file handles, no more files can be opened. Remember that you’re using a lot more files than just the documents you’re working on: programs need to open files to read resources, for temporary use, and for lots of other things that aren’t immediately obvious. With a monolithic virtual disk, you use only one file handle per virtual disk. With a split virtual disk, you use one file handle per slice, which can quickly add up if you’ve got a large virtual disk with a lot of snapshots.
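To get a feel for how quickly handles add up, here is a back-of-the-envelope Python sketch. It rests on two assumptions that go beyond what is stated above: that each snapshot adds another layer of disk files with the same number of slices as the base disk, and that the small metadata file isn't counted. The function name and the example numbers are hypothetical.

```python
def handles_needed(disk_gb, snapshots, split, slice_gb=2):
    """Estimate open file handles for one virtual disk.

    Assumptions (illustrative only): every snapshot adds one more
    layer of files with the same slice count as the base disk, and
    the metadata file is ignored.
    """
    slices = -(-disk_gb // slice_gb) if split else 1  # ceiling division
    layers = snapshots + 1  # base disk plus one layer per snapshot
    return slices * layers


if __name__ == "__main__":
    print(handles_needed(100, snapshots=5, split=False))  # monolithic
    print(handles_needed(100, snapshots=5, split=True))   # split, 2 GB slices
```

Under these assumptions, a 100 GB disk with five snapshots needs 6 handles when monolithic and 300 when split.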