[Updated for VSAN beta refresh – November 2013] I had a number of questions on VSAN over the past week at VMworld 2013 in Europe. Many of these questions related to the role of disk groups. I will try to answer a number of them in this post.
The first question is: what purpose do disk groups serve? A disk group may be considered a container in which a relationship between an SSD/PCIe flash device and magnetic disks is formed. When a virtual machine is created, it is placed on magnetic disks (hard disks). However, its I/O is accelerated through an SSD or PCIe flash device, which acts as a read cache and write buffer for that VM’s I/O. The SSD or PCIe flash device used for that I/O acceleration is the one in the same disk group as the magnetic disks on which the VM is placed.
Currently in the VSAN beta, there is a maximum of five disk groups per host. Each of these disk groups can contain a maximum of one SSD and seven HDDs. That is up to 35 magnetic disks per host, and with up to 8 hosts in a VSAN cluster, up to 280 magnetic disks in total. Pretty considerable.
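To make the arithmetic behind those limits explicit, here is a minimal sketch. The numbers are the VSAN beta maximums quoted above; the constant and function names are made up for illustration and are not part of any VMware API.

```python
# VSAN beta limits as quoted in this post; the names are illustrative only.
MAX_DISK_GROUPS_PER_HOST = 5
MAX_SSDS_PER_DISK_GROUP = 1
MAX_HDDS_PER_DISK_GROUP = 7
MAX_HOSTS_PER_CLUSTER = 8


def max_devices(hosts=MAX_HOSTS_PER_CLUSTER):
    """Return (magnetic disks, SSDs) for `hosts` fully populated hosts."""
    hdds_per_host = MAX_DISK_GROUPS_PER_HOST * MAX_HDDS_PER_DISK_GROUP
    ssds_per_host = MAX_DISK_GROUPS_PER_HOST * MAX_SSDS_PER_DISK_GROUP
    return hdds_per_host * hosts, ssds_per_host * hosts


hdds, ssds = max_devices()
print(hdds, ssds)  # 280 40
```

Calling `max_devices(1)` gives the per-host figures: 35 magnetic disks and 5 SSDs.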
Disk groups can contain at most one SSD. If a vSphere administrator finds that there are multiple SSDs in an ESXi host that is to participate in a VSAN cluster, multiple disk groups have to be created. One can then decide the ratio of SSD to HDD in each disk group if performance is a requirement (the higher the ratio of SSD to magnetic disk, the greater the size of the cache available to virtual machines). Alternatively, a vSphere administrator may decide to keep a constant SSD to magnetic disk ratio across all disk groups for consistent VM performance. This would be the VMware best practice for the VSAN beta: uniform configurations across all hosts.
Another reason for multiple disk groups is that it allows a vSphere administrator to define their failure domain.
When availability is chosen for a virtual machine (through the FailuresToTolerate policy setting), virtual machine objects are replicated across disk groups on multiple hosts. Data is never replicated across disk groups in the same host, or indeed within the same disk group. Doing so would defeat the purpose, since a single host failure could take down multiple replica copies of the virtual machine’s storage objects.
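The placement rule can be illustrated with a small sketch. This is not VSAN's actual placement algorithm, only a toy model of the constraint that no two replicas share a host. The host names, disk group names and the `place_replicas` function are hypothetical, and the replica count is passed in directly rather than derived from FailuresToTolerate.

```python
from typing import Dict, List, Tuple


def place_replicas(
    disk_groups_by_host: Dict[str, List[str]], replicas: int
) -> List[Tuple[str, str]]:
    """Pick one disk group on each of `replicas` distinct hosts."""
    # Only hosts that contribute at least one disk group can hold a replica.
    hosts = [h for h, groups in sorted(disk_groups_by_host.items()) if groups]
    if replicas > len(hosts):
        raise ValueError("not enough hosts to keep replicas on separate hosts")
    # One replica per host, so a single host failure costs at most one copy.
    return [(h, disk_groups_by_host[h][0]) for h in hosts[:replicas]]


# Hypothetical three-host cluster.
cluster = {
    "esx-01": ["dg-1", "dg-2"],
    "esx-02": ["dg-1"],
    "esx-03": ["dg-1", "dg-2"],
}
placement = place_replicas(cluster, replicas=2)
print(placement)  # [('esx-01', 'dg-1'), ('esx-02', 'dg-1')]
```

Note that the second disk group on esx-01 is never used for the second replica, even though it is a different disk group.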
One might ask what happens to a disk group when one drive fails. If the failed device is the SSD or PCIe flash device, then the virtual machine storage objects stored in that disk group become inaccessible. VSAN will then start rebuilding these storage objects on other disk groups in the cluster to maintain availability and adhere to policy requirements. If the failure is in a single magnetic disk, then only those objects residing on that magnetic disk need to be rebuilt.
With multiple disk groups, each containing a single SSD and a few magnetic disks, an SSD failure limits the failure domain to only those magnetic disks in the same disk group. With one very large disk group containing lots of magnetic disks, an SSD failure can impact a greater number of virtual machines. The failure domain should therefore also be a consideration when designing the disk group configuration.
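A toy model makes the difference in failure domain easier to see. The layout below (magnetic disk name mapped to the objects stored on it) and the `objects_affected` function are hypothetical and only mirror the behaviour described above: an SSD failure affects every object in the disk group, while a magnetic disk failure affects only the objects on that disk.

```python
from typing import Dict, List, Set

# Hypothetical layout of one disk group: magnetic disk -> objects stored on it.
DiskGroup = Dict[str, List[str]]


def objects_affected(disk_group: DiskGroup, failed_device: str) -> Set[str]:
    """Objects that need rebuilding when `failed_device` fails.

    The string "ssd" stands for the disk group's single flash device.
    """
    if failed_device == "ssd":
        # Losing the SSD makes every object in the disk group inaccessible.
        return {obj for objs in disk_group.values() for obj in objs}
    # Losing one magnetic disk affects only the objects on that disk.
    return set(disk_group.get(failed_device, []))


small = {"hdd-1": ["vm-a"], "hdd-2": ["vm-b"]}
large = {f"hdd-{i}": [f"vm-{i}"] for i in range(1, 8)}

print(len(objects_affected(small, "ssd")))  # 2
print(len(objects_affected(large, "ssd")))  # 7
print(objects_affected(large, "hdd-3"))     # {'vm-3'}
```

The same SSD failure touches two objects in the small disk group and seven in the large one, which is the trade-off to weigh when sizing disk groups.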
Get notification of these blog postings and more VMware Storage information by following me on Twitter: @VMwareStorage