
Enhanced Durability During Maintenance Mode Operations

Placing a host in maintenance mode is a common operational task for vSphere administrators. Host reboots and firmware or software upgrades are typical reasons, and each time you must choose a data evacuation mode to apply while the host is not operational. Hosts in a vSAN cluster that are in maintenance mode do not contribute their resources to the cluster until they become operational again. “Ensure accessibility” is the default evacuation mode for placing a host into maintenance mode, as it provides the most flexibility in keeping data available while minimizing the amount of data movement.
 
The level of policy compliance during the maintenance window might be reduced for some objects, because some of the data replicas are not migrated off the host placed in maintenance mode.
 
vSAN 7 Update 1 adds protection against data loss by introducing enhanced durability during maintenance operations. This new method stores the incremental data writes so that they survive an unexpected host failure elsewhere in the cluster while a host is in maintenance mode. As a result, the level of data availability can be restored after planned maintenance even when it is accompanied by an additional, unforeseen failure. The enhancement applies when a vSAN host enters maintenance mode with the Ensure accessibility evacuation mode and objects in the cluster use a failures to tolerate (FTT) value of 1. Previously, if the sole host holding the active data replica suffered an unrecoverable failure, both copies of the object were unavailable for as long as the host containing the original data object remained in maintenance mode, and the writes made after that host entered maintenance mode existed only on the failed host.
 
Now, when a host is placed in maintenance mode, vSAN selects an alternate available host or fault domain to house a second copy of the data written from the moment the host enters maintenance mode. As long as the host containing the data mirror is healthy, every write is applied redundantly, both to the replica object and to the newly selected host. These new data writes are not complete replicas of the original objects; they are only the writes accumulated after the host entered maintenance mode. The differential writes appear in the UI as RAID-D components under the VM object placement section. These components are deleted as soon as the host containing the original data object exits maintenance mode.
 
If the host holding the sole replica of the data object fails, the accumulated writes are resynced with the original data once the host that holds it is no longer in maintenance mode. This way the VM object is available and up to date even though the data object was at risk. Take a look at the animated video below showcasing the process:
 

 

The same level of availability can be reached by applying FTT=2 or Full data evacuation. However, those options lack two significant advantages of enhanced durability. First, you get a faster resync after EMM, since only the incremental writes are resynced. Second, you save time and resources by using the Ensure accessibility mode instead of evacuating all the data residing on the host.
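
To make the resync difference concrete, here is a minimal toy model in Python of an FTT=1 mirrored object with a differential component. It is only a sketch of the behavior described in this post, not vSAN code or any vSAN API: the class names, method names, host names, and block counts are all hypothetical, and for simplicity the model always replays the differential writes when the host returns.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    """One copy of an object's data on a host: block address -> contents."""
    host: str
    blocks: dict = field(default_factory=dict)
    available: bool = True


class MirroredObject:
    """Toy FTT=1 object: two full mirrors plus an optional delta component."""

    def __init__(self, host_a: str, host_b: str):
        self.mirror_a = Component(host_a)
        self.mirror_b = Component(host_b)
        self.delta = None  # exists only while a mirror's host is in maintenance

    def write(self, block, data):
        # Every write goes to each available component, including the delta.
        for comp in (self.mirror_a, self.mirror_b, self.delta):
            if comp is not None and comp.available:
                comp.blocks[block] = data

    def enter_maintenance(self, mirror, delta_host):
        # The mirror stops receiving writes; a delta component is created
        # on another host to capture writes made from this point on.
        mirror.available = False
        self.delta = Component(delta_host)

    def exit_maintenance(self, mirror):
        """Bring the mirror back and return the number of blocks resynced."""
        mirror.available = True
        resynced = dict(self.delta.blocks)  # only the incremental writes
        mirror.blocks.update(resynced)
        self.delta = None  # delta is deleted once the host is back
        return len(resynced)


obj = MirroredObject("host-1", "host-2")
for i in range(1000):
    obj.write(i, "v1")  # initial data lands on both mirrors

obj.enter_maintenance(obj.mirror_a, delta_host="host-3")
for i in range(10):
    obj.write(i, "v2")  # written to mirror_b and to the delta on host-3

obj.mirror_b.available = False  # unexpected failure of the sole replica

resynced = obj.exit_maintenance(obj.mirror_a)
print(f"resynced {resynced} of {len(obj.mirror_a.blocks)} blocks")
```

In this model the returning mirror ends up current after copying 10 blocks rather than all 1,000, which is the idea behind the faster resync. A full data evacuation would instead have had to move every block before the host could enter maintenance mode.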
 
Although the benefit is greatest where FTT=1 is applied, FTT=2 cases can also benefit from this feature. For example, if there is sufficient capacity on an alternate host or fault domain, differential changes can be stored there to avoid losing data writes during maintenance mode.
 
Interested in learning more about vSAN 7 Update 1? [Check out our resource page](https://blogs.vmware.com/virtualblocks/vsan-7/ "Check out our resource page").