VMware vSAN 6.2 and newer versions include the option to enable deduplication and compression for all-flash configurations. These are features that are implemented at the cluster level. In other words, data on all capacity drives in a vSAN cluster is deduplicated and compressed when the features are enabled. It is easy to turn on vSAN deduplication and compression – it is a simple checkbox.
Turning on deduplication and compression requires a rolling change across all hosts and vSAN disk groups in the cluster. This process can take a considerable amount of time – especially with larger clusters and/or larger numbers of drives and disk groups. However, the process does not require virtual machine downtime.
In some cases, there might be a need to add more capacity to an existing vSAN cluster with deduplication and compression enabled. That is what we will focus on in this article.
Capacity can be added in a few ways. The first is by adding one or more hosts to the cluster. This process is very simple. A server with drives installed is added to the cluster using the vSphere Web Client. One or more vSAN disk groups are then created in the Disk Management section of the UI.
A click-through demo of this process is available here: vSAN 6.5 Scale Out by Adding a Host
Recommendation: Initiate a proactive rebalance of disks after adding capacity to a vSAN cluster. Go to the vSAN Health UI in the vSphere Web Client, expand Cluster in the Health UI, click vSAN Disk Balance, and click Proactive Rebalance Disks.
Another way to add capacity is to add more drives to hosts in the cluster. This is commonly referred to as scaling up. One way to do this is to add capacity drives to existing disk groups. This is assuming you do not already have seven capacity drives in each disk group, which is the maximum number of capacity drives per disk group. If existing disk groups have seven capacity drives, you will need to create a new disk group to scale up. A click-through demo of this process is available here: vSAN 6.5 Scale Up by Adding Disks
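The scale-up decision above (fill existing disk groups up to seven capacity drives, then create a new disk group) can be sketched in a few lines of Python. This is only an illustrative planning helper: the function name, the disk group names, and the dictionary layout are hypothetical and do not correspond to any vSAN API.

```python
# Limit stated in the article: at most seven capacity drives per disk group.
MAX_CAPACITY_DRIVES_PER_DISK_GROUP = 7


def plan_scale_up(disk_groups, new_drives):
    """Distribute new capacity drives across existing disk groups.

    disk_groups: dict mapping a (hypothetical) disk group name to the number
                 of capacity drives it currently holds.
    new_drives:  number of capacity drives to add to the host.

    Returns (assignments, leftover), where assignments maps disk group names
    to the number of drives each can take, and leftover is the number of
    drives that would need a new disk group.
    """
    assignments = {}
    remaining = new_drives
    for name, count in disk_groups.items():
        if remaining == 0:
            break
        free_slots = MAX_CAPACITY_DRIVES_PER_DISK_GROUP - count
        take = min(free_slots, remaining)
        if take > 0:
            assignments[name] = take
            remaining -= take
    return assignments, remaining


if __name__ == "__main__":
    # One full disk group and one with five capacity drives; three new drives.
    print(plan_scale_up({"dg1": 7, "dg2": 5}, 3))
```

In this example, two drives fit into the second disk group and one drive is left over, which means a new disk group would be required for it.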
Note that when a drive is added to a disk group with deduplication and compression enabled, the newly added drive does not automatically participate in the deduplication and compression process. The new drive is used to store data, but that data is not deduplicated and compressed. The disk group must be deleted and recreated so that all of the capacity drives in the disk group (including the new drive) store deduplicated and compressed data.
Recommendation: After adding new drives to a host where deduplication and compression are enabled, place the host in maintenance mode using the “Evacuate all data to other hosts” option. Delete and recreate each disk group to include the new drives. This ensures all new drives are storing deduplicated and compressed data. A proactive rebalance will probably be needed after the addition of drives is completed.
The third option for adding capacity is replacing existing drives with higher capacity drives. This is easily accomplished by removing a drive in the Disk Management section of the vSAN UI (select “Evacuate all data to other hosts”), replacing the old drive in the server with the new drive, and then adding the new drive to the disk group.
This process works fine if deduplication and compression are not enabled. If these space efficiency features are enabled, you will not be able to remove an individual disk from a disk group. This is due to the way vSAN stores data in a disk group when deduplication and compression are enabled.
vSAN deduplicates data in fixed 4K blocks. Deduplication and compression are applied when data is destaged from the cache tier to the capacity tier. When data is destaged, a hash is computed for each block. vSAN checks this hash against the hashmap to see if it already exists. If not, a new block is allocated on a capacity drive and the data is written. If the hash already exists, a reference to the existing block is created instead of writing the data again. As a result, a single block on one capacity drive can have several references pointing to it. Removing this drive would “break” all of the references to that block.
vSAN prevents this issue by not allowing an individual drive to be removed from a disk group when deduplication and compression are enabled.
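The destage logic described above can be illustrated with a small toy model in Python. This is a minimal sketch and not vSAN's actual implementation or on-disk format: the `DedupStore` class, its method names, the drive labels, and the use of SHA-1 from `hashlib` are all assumptions made for illustration.

```python
import hashlib

BLOCK_SIZE = 4096  # fixed 4K deduplication block size, per the article


class DedupStore:
    """Toy model of hash-based deduplication within one disk group."""

    def __init__(self):
        self.hashmap = {}    # digest -> (drive, block_id) of the stored block
        self.refcount = {}   # digest -> number of references to that block
        self.next_block_id = 0

    def destage(self, data, drive):
        """Destage one block; return (location, allocated_new_block)."""
        if len(data) != BLOCK_SIZE:
            raise ValueError("expected a 4K block")
        digest = hashlib.sha1(data).hexdigest()  # hash choice is an assumption
        if digest in self.hashmap:
            # Hash already known: add a reference instead of writing again.
            self.refcount[digest] += 1
            return self.hashmap[digest], False
        # Hash not found: allocate a new block on the target capacity drive.
        location = (drive, self.next_block_id)
        self.next_block_id += 1
        self.hashmap[digest] = location
        self.refcount[digest] = 1
        return location, True

    def references_on(self, drive):
        """Count references that would break if this drive were removed."""
        return sum(
            self.refcount[digest]
            for digest, (stored_drive, _) in self.hashmap.items()
            if stored_drive == drive
        )


if __name__ == "__main__":
    store = DedupStore()
    block = bytes(BLOCK_SIZE)
    print(store.destage(block, "drive-1"))  # new block is allocated
    print(store.destage(block, "drive-2"))  # duplicate: reference to drive-1
    print(store.references_on("drive-1"))
```

In this model, the second write of identical data does not land on the drive it was aimed at. It becomes a reference to the block already stored on the first drive, which shows why pulling a single drive out of the disk group would leave references pointing at missing data.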
To replace one or more drives in a disk group when deduplication and compression are enabled, you must remove the entire disk group (select “Evacuate all data to other hosts”), replace the old drive(s) with the new drive(s), and then recreate the disk group with the new drive(s).
As recommended above, it is good to perform a proactive rebalance of disks in the cluster after adding new drives.
- If deduplication and compression are not turned on, you can add or remove an individual capacity drive in a disk group.
- If deduplication and compression are enabled, you can add a capacity drive to a disk group, but you cannot remove an individual drive. You must remove the entire disk group when removing or replacing cache and/or capacity drives. The disk group must be recreated using the new drive(s).
4 comments have been added so far
I’ve a question, on the official doc I read:
“If a capacity disk fails, the entire disk group becomes unavailable. To resolve this issue, identify and replace the failing component immediately. When removing the failed disk group, use the No Data Migration option.”
Imagine I have to wait for the replacement disk to arrive. Is it better to evacuate data from the failed DG?
Is there an efficient way to calculate the right amount of free space to handle that kind of failure?
Hi, Lorenzo. If the disk group is offline, you will not be able to evacuate data from it. vSAN will automatically rebuild the data on other healthy disk groups across the cluster using the redundant copy or copies of data (assuming all of your objects are protected with a minimum of FTT=1). It is best to maintain a minimum of 25-30% free “slack” space in a cluster for this reason and other reasons such as policy changes, host offline, etc. This is the case for nearly all storage types. As with any HCI storage, traditional storage array, and so on, running out of space causes bad things to happen.
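As a rough way to check a cluster against the 25-30% slack guideline mentioned above, the arithmetic can be sketched in Python. This is only a back-of-the-envelope helper: the function name, parameter names, and the use of TB as the unit are assumptions, and it does not account for deduplication ratios, storage policies, or rebuild behavior.

```python
def slack_report(raw_capacity_tb, used_tb, min_slack=0.25):
    """Compare a cluster's free space with a minimum slack fraction.

    raw_capacity_tb: total capacity of the cluster in TB.
    used_tb:         capacity currently consumed in TB.
    min_slack:       minimum free fraction to keep (0.25 = 25%).
    """
    if raw_capacity_tb <= 0:
        raise ValueError("raw_capacity_tb must be positive")
    free_fraction = (raw_capacity_tb - used_tb) / raw_capacity_tb
    # Highest usage that still leaves the requested slack free.
    max_used_tb = raw_capacity_tb * (1 - min_slack)
    return {
        "free_fraction": free_fraction,
        "headroom_tb": max_used_tb - used_tb,
        "meets_target": free_fraction >= min_slack,
    }


if __name__ == "__main__":
    print(slack_report(100, 70))
    print(slack_report(100, 80))
```

A negative `headroom_tb` value indicates how much data would have to be removed, or how much capacity added, to get back to the chosen slack target.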
Good article, one follow-up if you have a moment. When adding capacity disks into an existing disk group, the official documentation states “You can add a capacity disk to a disk group with enabled deduplication and compression. However, for more efficient deduplication and compression, instead of adding capacity disks, create a new disk group to increase cluster storage capacity.” I’m struggling to find out how much less efficient this simple expansion would be, without destroying and re-creating the disk group. Should we delete and re-add the disk groups, or can you quantify somehow the mention of “more efficient” in the vSAN documentation?