
VMware Virtual SAN Stretched Cluster

With the release of vSphere 6 Update 1 and Virtual SAN 6.1, VMware is expanding the capabilities of Virtual SAN as a platform and introducing higher levels of enterprise availability and data protection to the solution.

Virtual SAN 6.1 introduces a new feature: Stretched Cluster. Virtual SAN Stretched Cluster provides customers with the ability to deploy a single Virtual SAN cluster across multiple data centers.

Some of the critical aspects of Virtual SAN have always focused on mitigating interruptions in operations and ensuring that data is never lost. In Virtual SAN 6.1, Stretched Cluster furthers that concept by enabling uninterrupted operations even in the event of a complete site failure.

Virtual SAN Stretched Cluster is a specific configuration implemented in environments where disaster/downtime avoidance is a key requirement.

The Virtual SAN Stretched Cluster feature is built upon the same architectural concepts applied in the creation of a traditional Virtual SAN cluster, and a minimum of three failure domains is required to successfully form a Virtual SAN cluster.

A bit of background on failure domains. In Virtual SAN 5.5, the concept of a failure domain was implemented at the host level. Virtual machine objects could be distributed amongst multiple hosts within the cluster (the minimum number of hosts being three) and would remain accessible in the event of a single hardware device or host failure.

With Virtual SAN 6.0, we expanded the failure domain concept and introduced a configurable feature called Fault Domains.

The Fault Domain feature basically introduced rack awareness in Virtual SAN 6.0. The feature allows customers to group multiple hosts into failure zones across multiple server racks in order to ensure that replicas of virtual machine objects are not provisioned onto the same logical failure zone or server rack. Here, similar to the host configuration, a minimum of three failure zones or server racks is required.

In this scenario, virtual machine components can be distributed across multiple hosts in multiple racks, and should a rack failure event occur, the virtual machines whose components are not hosted in the failed rack would continue to be available. However, racks are typically expected to be hosted in the same data center, and if a data center-wide failure event were to occur, fault domains would not be able to keep virtual machines available.


Virtual SAN Stretched Cluster builds on the foundation of Fault Domains, with the three required failure zones based on three sites (two active – active data sites and a witness site). The witness site is a unique concept, as it is only utilized to host a witness virtual appliance that stores witness objects and cluster metadata, and also provides cluster quorum services during failure events.


The witness virtual appliance is a nested ESXi host designed with the sole purpose of providing the services mentioned above. The appliance contributes neither compute nor storage resources to the cluster and is not able to host virtual machines. The witness virtual appliance is a solution that is exclusively available and supported for Virtual SAN Stretched Cluster and the Virtual SAN ROBO edition. In a Virtual SAN Stretched Cluster, the maximum supported “FailuresToTolerate” is 1, due to the support of only three fault domains.

Moving on to the supported distance for the solution: the distance between sites is predominantly dictated by the network latency requirements of the solution. For the most part, the configuration of Virtual SAN Stretched Cluster is dependent on high-bandwidth, low-latency links. The network bandwidth and latency requirements are listed below:


Network Requirements between active – active sites (data fault domains)

  • 10 Gbps connectivity or greater
  • < 5-millisecond latency RTT
  • Layer 2 or Layer 3 network connectivity with multicast

Network Requirements from active – active site (data fault domains) to witness site

  • 100 Mbps connectivity
  • < 100-millisecond latency one way (200 ms RTT)
  • Layer 3 network connectivity without multicast

While the minimum network connectivity and bandwidth requirement between data sites is 10 Gbps, higher bandwidth may be required depending on the size of the environment and the number of virtual machines hosted on each site. The network bandwidth requirement can be calculated based on the number of virtual machines and the write operations between the active – active sites. This approach should produce accurate results for the bandwidth required to support the size of a particular environment.

For example, an environment that consists of five nodes per site (5 + 5 + 1) with about 300 virtual machines would require about 4 Gbps (about 2 Gbps in each direction) for normal operations. This leaves spare bandwidth that can be utilized during failure scenarios. The formula below can be utilized to calculate bandwidth requirements for Virtual SAN Stretched Cluster: N (number of nodes) * W (amount of 4K IOPS per node) * 125 Kbps.
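The formula above can be sketched in a few lines of code. The node count and per-node IOPS figure below are hypothetical values chosen to reproduce the 4 Gbps example, not numbers taken from a real sizing exercise:

```python
def stretched_cluster_bandwidth_gbps(nodes, iops_4k_per_node):
    """Inter-site bandwidth estimate: N (nodes) * W (4K IOPS per node) * 125 Kbps."""
    kbps = nodes * iops_4k_per_node * 125
    return kbps / 1_000_000  # Kbps -> Gbps (decimal units)

# Hypothetical workload: 10 data nodes (5 + 5) at ~3,200 4K IOPS each
print(stretched_cluster_bandwidth_gbps(10, 3200))  # -> 4.0
```

Halving the per-node IOPS halves the estimate, so the sizing scales linearly with the write workload.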

The networking communication for Virtual SAN (the Virtual SAN network) can be fully implemented over layer 3 as well as the traditionally recommended implementation of a stretched layer 2 domain. In a layer 3 implementation, the virtual machine networks will need to be managed as independent layer 2 networks with a third-party overlay solution spanning both active – active sites. Solutions such as OTV, MPLS, or VPLS can be utilized to meet the necessary requirements. Whenever possible, I would prefer using VMware NSX to fulfill this requirement, as it is a scalable solution for these types of scenarios and also an organic complement to the Software-Defined Data Center. I’ll provide more information and details about Virtual SAN and NSX configuration semantics in a future post.


With Stretched Cluster, Virtual SAN now utilizes an algorithm that implements read locality on a per-site basis. Read operations are served from the copy of the data that resides on the same site (local) where the virtual machine is running. In the event a virtual machine is migrated to the other site, read operations will then be served from the copy of the data located on that site. This behavior applies exclusively to read operations, as there is no locality for write operations.
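The behavior can be illustrated with a toy model. This is a sketch of the semantics described above, not vSAN internals; the class and site names are invented for illustration:

```python
class StretchedObject:
    """Toy model of an object with one full replica per data site (FTT=1)."""

    def __init__(self):
        self.replicas = {"site-a": {}, "site-b": {}}

    def write(self, key, value):
        # No write locality: every write is committed to both site replicas.
        for replica in self.replicas.values():
            replica[key] = value

    def read(self, vm_site, key):
        # Read locality: serve the read from the replica on the VM's current site.
        return self.replicas[vm_site][key]

obj = StretchedObject()
obj.write("block0", "data")
print(obj.read("site-a", "block0"))  # VM on site A reads the local copy
print(obj.read("site-b", "block0"))  # after a migration, reads shift to site B's copy
```

Because both replicas receive every write, the read path can switch sites after a migration without any data movement.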

The configuration details of a Virtual SAN Stretched Cluster follow the same simple management and configuration principles as Virtual SAN. The entire configuration is performed in a few steps, with 95% of it done from the wizard in the vSphere Web Client UI (only).

After the necessary and traditional Virtual SAN cluster configuration requirements are completed – such as preparing the network for all sites (the necessary L3 and L2 network connectivity with multicast), enabling Virtual SAN on the vSphere cluster, and claiming the locally attached devices – proceed with the configuration of the Stretched Cluster.


In typical Virtual SAN fashion, all of the configuration steps in the workflow are performed from a simplified wizard that takes a couple of minutes to complete. Once the Stretched Cluster has been created, there are a few configuration steps necessary to control the behavior of virtual machines from a management and site-deployment perspective.

  • Create Host Groups – utilized to define the hosts that belong to each site (preferred and secondary).
  • Create VM Groups – utilized to group virtual machines based on their respective characteristics (site or application stack).
  • Create VM/Host Rules – utilized to associate a VM group with the host group where its virtual machines are to run.
  • Set HA Rule Settings – utilized to define the vSphere HA behavior in the event of a complete site failure. Set vSphere HA to respect VM/Host affinity rules.
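As an illustration of how the groups and rules above fit together (all host names, VM names, and group names below are hypothetical), the mapping can be sketched as:

```python
host_groups = {
    "preferred-site-hosts": {"esx01", "esx02"},
    "secondary-site-hosts": {"esx03", "esx04"},
}
vm_groups = {
    "preferred-site-vms": {"app01", "app02"},
    "secondary-site-vms": {"db01"},
}
# VM/Host "should" rules: VM group -> host group (soft affinity, so HA can
# still restart VMs on the surviving site if an entire site fails).
vm_host_rules = {
    "preferred-site-vms": "preferred-site-hosts",
    "secondary-site-vms": "secondary-site-hosts",
}

def preferred_hosts(vm):
    """Return the hosts a VM should run on under the rules above."""
    for group, members in vm_groups.items():
        if vm in members:
            return host_groups[vm_host_rules[group]]
    return set()

print(sorted(preferred_hosts("db01")))  # -> ['esx03', 'esx04']
```

The indirection through groups is the point: site placement is expressed once per group, not per virtual machine.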

Note: Prior to configuring the Stretched Cluster, and right after enabling Virtual SAN on the cluster, vSphere DRS and vSphere HA should be enabled and configured properly in order for the Stretched Cluster and its configuration to operate as expected.

vSphere HA Recommended Configuration Settings:

  • Ensure an adequate amount of resources is reserved and guaranteed for each site from an HA perspective. Use Admission Control to define the failover capacity as a percentage of the cluster’s resources; set CPU and Memory to 50%.
  • Enable the isolation response and configure the host isolation response to Power Off and Restart VMs.
  • Utilize the vSphere HA advanced configuration options to manually specify multiple isolation addresses for each site, using the advanced HA configuration syntax:
    •  das.isolationaddressX
  • Prevent vSphere HA from using the default gateway by setting:
    • das.useDefaultIsolationAddress=false
  • Change the default settings of vSphere HA and configure it to respect VM to Host affinity rules during failover.
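As a minimal sketch of the advanced options above, the HA settings for a two-site configuration could look like the following (the IP addresses are placeholders for illustration, not recommended values):

```
das.isolationaddress0 = 192.168.10.1    # pingable address at the preferred site (example IP)
das.isolationaddress1 = 192.168.20.1    # pingable address at the secondary site (example IP)
das.useDefaultIsolationAddress = false  # do not fall back to the default gateway
```

With one isolation address per site, a host that loses the inter-site link can still correctly determine whether it is isolated.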

vSphere DRS Recommended Configuration Settings:

  • Enable vSphere DRS with the fully automated option – DRS will only migrate virtual machines to hosts that belong to their respective VM/Host group.
  • Use vSphere DRS “should” rules and avoid the use of “must” rules. vSphere 6.0 is able to support and honor DRS “should” rules.

A Virtual SAN Stretched Cluster looks, feels, and is managed in the same fashion as a traditional Virtual SAN cluster. From an administrative perspective there isn’t much of a difference, which proves once again that vSphere admins can leverage their existing skill sets to administer this new solution.


Much more to come on the Virtual SAN Stretched Cluster topic and the greatest storage platform ever created.

– Enjoy

For future updates on Virtual SAN (VSAN), vSphere Virtual Volumes (VVol) and other Storage and Availability technologies, as well as vSphere Integrated OpenStack (VIO), and Cloud-Native Apps (CNA) be sure to follow me on Twitter: @PunchingClouds

