Continuous Improvement in Simplicity and Resilience with vSAN

When thinking about operations and complexity, look no further than the cockpit of a commercial jetliner. Modern commercial aircraft strive to simplify the operational experience while guarding against countless environmental conditions and errors. When done well, the payoff is a simpler experience that accommodates the underlying complexity of the operations. The outcome is far more predictable and less prone to error.

Simplicity and Robustness in the Data Center

VMware believes the benefit of simplified operations goes far beyond reducing the number of steps to accomplish a task. It increases repeatability and, perhaps most importantly, accommodates the unexpected. It allows the non-trivial to become trivial, enabling IT teams to focus on other, more strategic initiatives.

Introducing simplification for the end-user comes in several forms.

  • Careful design. Upfront effort in the design and architecture of a feature or product allows its capabilities to be extended while keeping the process simple and robust for the end-user.
  • Guided workflows. These are used when a set of tasks is needed to create the desired result. Guided workflows ensure all steps to achieve a result are executed in a complete and orderly way. When left unstructured, configuration errors may occur not only from incorrect entries, but also from steps performed in the incorrect order.
  • Adopting desired-state models. This allows a user to define the desired outcome, while the system takes care of how that outcome should be achieved. Storage Policy Based Management (SPBM) and the new vSphere Lifecycle Manager (vLCM) are great examples of desired-state design being introduced into VMware products.
  • Intelligence embedded into the product. More intelligence in the product allows it to monitor and adjust for conditions at a frequency that cannot be achieved by a human. Algorithms are ideal for continuously evaluating a set of conditions and determining the proper course of action.
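
To make the desired-state idea concrete, here is a minimal Python sketch. It is not how SPBM or vLCM is implemented: the names Policy, failures_to_tolerate, and plan_actions are hypothetical, and the rule that a mirrored object needs one more replica than the number of failures it should tolerate is a deliberate simplification. The user states only the outcome; the function works out the steps.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Policy:
    # Hypothetical field: how many host failures the object should survive.
    failures_to_tolerate: int


def plan_actions(desired: Policy, current_replicas: int) -> List[str]:
    """Return the steps needed to move from the observed state to the desired one."""
    # Simplifying assumption: mirroring needs failures_to_tolerate + 1 replicas.
    target = desired.failures_to_tolerate + 1
    if current_replicas < target:
        return ["add_replica"] * (target - current_replicas)
    if current_replicas > target:
        return ["remove_replica"] * (current_replicas - target)
    return []  # already compliant, nothing to do
```

Calling plan_actions repeatedly against the observed state is what makes the model self-correcting: when the state already matches the policy, the plan is empty.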

All of these add up to a solution that offers the proper boundaries and guidance for simple operation, and expected outcomes. This is the strategy that VMware has taken in a recent release that makes vSAN and VMware Cloud Foundation (VCF) environments powered by vSAN more robust, and easier to use.

A History of Improving the Simplicity and Robustness of vSAN

While delivering the features and capabilities that make vSAN the industry leader in HCI, we’ve balanced new capabilities with efforts to simplify operation, so that administration as a whole is as easy as possible. Let’s look at this effort to improve simplicity over the past few releases, up to and including vSAN 7.

vSAN 6.7

Adaptive Resync. VMware took the mystery out of how resources should be balanced when data is resynchronized. This feature evaluates the condition of the host’s storage stack and dynamically adjusts resources so that, under contention, front-end VM operations always have priority over back-end operations and guest I/O is sent and received in a timely manner.
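
The priority rule described above can be illustrated with a small Python sketch. This is only an illustration of "guest I/O first, resync gets what is left" and not vSAN's actual scheduler, which evaluates more signals than a single bandwidth figure. The function name split_bandwidth and its parameters are hypothetical.

```python
from typing import Tuple


def split_bandwidth(total: float, vm_demand: float, resync_demand: float) -> Tuple[float, float]:
    """Divide available bandwidth between guest VM I/O and resync traffic.

    Hypothetical model: VM I/O is satisfied first; resync traffic may only
    use whatever capacity remains.
    """
    vm_share = min(vm_demand, total)
    resync_share = min(resync_demand, total - vm_share)
    return vm_share, resync_share
```

Without contention both demands are met in full. Under contention only the resync share shrinks, which is the behavior the feature description calls for.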

Replica consolidation for more flexible EMM. This feature improved vSAN’s flexibility in data placement, increasing the scenarios in which a host could successfully enter maintenance mode. Replica component consolidation helps consolidate components of an object that become spread across multiple hosts as a result of rebalancing, component splitting, and other scenarios.

Improved policy inheritance of VM swap files. This enhancement simplified policy adherence to all objects tied to a VM, and improved resilience handling and capacity management of the VM.

Improved site continuity in stretched clusters. The operation of stretched clusters under failure conditions was simplified with this release, improving the behavior of partial and full restoration of the sites.

vSAN 6.7 U1

Cluster Quickstart for creating and extending clusters. This guided workflow has provided a dramatic improvement for all types of clusters, but especially vSAN. It includes the additional steps required for the basic configuration of a vSAN cluster.

EMM precheck simulations. Entering a host into maintenance mode in an inherently distributed storage system means a lot of intelligence must go into the process. This enhancement introduced a clever way of performing a precheck without moving any data.

Health and diagnostics enhancements. Alerting and automated health checks take the guesswork out of determining what may be in an error state, or simply not configured correctly. This enhancement brought a new level of capabilities to the vSAN health check engine.

vSAN 6.7 U3

Enhanced resync monitoring. The proper visibility into ongoing operations brings the clarity needed for an administrator. This feature introduced all-new levels of visibility to resynchronizations – an event that is inherent to a distributed storage system that places data across nodes for resilience. This enhancement offers a continuous, real-time status of the number of objects currently resyncing, the reason for the resynchronization, and the amount of data remaining to be resynchronized.

Cluster level EMM precheck analysis. This feature built on the EMM enhancements of the previous release and provided a cluster-wide view when entering a host into maintenance mode. Its precheck engine provides a detailed summary of object compliance and accessibility, cluster capacity, and the predicted health of the cluster should that host be entered into maintenance mode.
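
The idea of a precheck that predicts the outcome without moving any data can be sketched in Python as a pure "what if" calculation. This is a simplified illustration, not the vSAN precheck engine: real accessibility depends on more than a replica count, and the names simulate_maintenance, placement, and required_replicas are hypothetical.

```python
from typing import Dict, List, Set


def simulate_maintenance(
    placement: Dict[str, Set[str]], host: str, required_replicas: int
) -> Dict[str, List[str]]:
    """Predict object status if `host` entered maintenance mode.

    `placement` maps each object name to the hosts holding a replica.
    Nothing is moved or modified; the result is only a report.
    """
    report: Dict[str, List[str]] = {"inaccessible": [], "non_compliant": []}
    for obj, hosts in sorted(placement.items()):
        remaining = hosts - {host}  # builds a new set; placement is untouched
        if not remaining:
            report["inaccessible"].append(obj)
        elif len(remaining) < required_replicas:
            report["non_compliant"].append(obj)
    return report
```

Because the function never mutates its input, it can be run for every host in a cluster to compare outcomes before any maintenance action is taken.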

Disk group on-disk format prechecks. Much like the sophisticated precheck engine used for entering a host into maintenance mode, vSAN fully integrated the process of transitioning to a new on-disk format version into the health check engine, making the effort clearer and easier to do.

Pausing of resyncs during capacity constrained conditions. This feature automatically detects and acts to temporarily pause resynchronizations of data if it senses strained capacity resources.

Automatic rebalancing. New capabilities inside of vSAN allowed our engineering teams to automate the rebalancing of data so that it is more evenly distributed across the hosts in a cluster. Capacity symmetry rebalancing ensures better resource utilization across host resources, all in an automated, intelligent manner.

Automated capacity management of resyncs during policy changes. This improvement guards against a capacity-strained condition resulting from storage policy changes by automatically performing the changes in batches. This is especially important when applying large numbers of policy changes with limited cluster capacity.
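
A rough Python sketch of the batching idea follows. It assumes, purely for illustration, that each policy change temporarily needs a known amount of extra space and that this space is released once its batch completes. How vSAN actually sizes and orders its batches is not described here, and the name batch_policy_changes is hypothetical.

```python
from typing import List


def batch_policy_changes(extra_space_needed: List[int], free_capacity: int) -> List[List[int]]:
    """Group policy changes (by index) so no batch exceeds the free capacity."""
    batches: List[List[int]] = []
    current: List[int] = []
    used = 0
    for index, need in enumerate(extra_space_needed):
        if need > free_capacity:
            raise ValueError(f"change {index} cannot fit in the available capacity")
        if used + need > free_capacity:
            # Close the current batch and start a new one.
            batches.append(current)
            current, used = [], 0
        current.append(index)
        used += need
    if current:
        batches.append(current)
    return batches
```

For example, four changes needing 40, 40, 40, and 10 units against 100 free units are grouped as two batches, so the cluster never has to hold the temporary space for all four at once.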

Online resizing of iSCSI volumes. Thanks to this enhancement, volumes being provided courtesy of the vSAN iSCSI service can be resized without taking the volume offline. The guest I/O will be quiesced automatically during the brief moment that it is resized.

Adaptive parallel resynchronizations. While this enhancement is typically viewed as only a performance enhancement, it has an important role to play in improved simplicity and resilience. It allows vSAN to automatically detect when resources are available to spawn multiple data streams of resynchronization traffic. Paired with Adaptive Resync, this means that resynchronizations can happen faster than ever before, while maintaining proper resources for guest VM activity.

vSAN 7

Stretched cluster I/O redirect based on strained capacity. This enhancement improves the uptime of VMs running in a stretched cluster in which one site has a capacity strained condition. vSAN will adjust the status of objects to ensure that guest I/O continues on the VMs not under capacity constraints.

NVMe hotplug support. Introducing hotplug support for NVMe devices means that device replacement can follow the same operational procedures as for SAS-based storage devices.


Be an expert at outcomes, not discrete tasks. Running the very latest versions of vSphere and vSAN provides a level of robustness and operational simplicity that is simply unattainable in previous editions. Since vSAN provides the core cloud capabilities found in the full-stack private cloud offering of VCF, running the latest editions is the key to your private and hybrid cloud future.