Recently I’ve participated in a number of discussions about Virtual SAN and vSphere HA in which several interesting questions came up about their interoperability and behavior.
For the most part, the discussions have centered on how vSphere HA handles network partitions and isolation events in Virtual SAN enabled clusters. Those questions need more detail than a short answer allows, so this post provides the technical guidance.
Before diving into the details, let me start with an official statement about Virtual SAN and vSphere HA. vSphere HA fully supports and is integrated with Virtual SAN. This support required some changes in vSphere HA which impact vSphere HA behavior and result in some unique Virtual SAN related configuration considerations for vSphere HA.
In this post I will detail the following information and recommendations:
- Architecture Changes Impacting Isolation and Partition Support
- Heartbeat Datastore Recommendations
- Host Isolation Address Recommendations
- Isolation Response Recommendations
Architecture Changes Impacting Isolation and Partition Support
In vSphere 5.5, whenever HA is enabled in a cluster that is also enabled for Virtual SAN, the vSphere HA FDM agents use the Virtual SAN network, rather than the Management Network(s), for agent communication and heartbeat monitoring.
These changes to the design and behavior of vSphere HA were made to prevent network events from producing HA partitions and Virtual SAN partitions that do not coincide, as illustrated here:
- HA partition A: hosts esxi-01, 05, 06
- Virtual SAN partition A: hosts esxi-01, 02, 03
- HA partition B: hosts esxi-02, 03, 04
- Virtual SAN partition B: hosts esxi-04, 05, 06
Such partitions are hard to reason about and troubleshoot. They would also have required significant additional HA logic to support.
The “same networks” constraint leads to simpler partitions. The desired and actual behavior is illustrated here:
- HA & Virtual SAN partition A: hosts esxi-01, 02, 03
- HA & Virtual SAN partition B: hosts esxi-04, 05, 06
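The difference between the two scenarios above can be checked mechanically. The following is a minimal sketch, not part of vSphere HA: the function name `same_partitions` and the host groupings are illustrative only and simply restate the two examples as sets.

```python
def same_partitions(ha_partitions, vsan_partitions):
    """Return True when both networks split the cluster into the same host groups."""
    ha = {frozenset(group) for group in ha_partitions}
    vsan = {frozenset(group) for group in vsan_partitions}
    return ha == vsan


# First scenario: HA and Virtual SAN on separate networks, partitions do not coincide.
ha_split = [
    {"esxi-01", "esxi-05", "esxi-06"},
    {"esxi-02", "esxi-03", "esxi-04"},
]
vsan_split = [
    {"esxi-01", "esxi-02", "esxi-03"},
    {"esxi-04", "esxi-05", "esxi-06"},
]

# Second scenario: both use the same network, so one split applies to both.
shared_split = [
    {"esxi-01", "esxi-02", "esxi-03"},
    {"esxi-04", "esxi-05", "esxi-06"},
]

print(same_partitions(ha_split, vsan_split))        # False
print(same_partitions(shared_split, shared_split))  # True
```

In the first case a host such as esxi-01 shares an HA partition with esxi-05 but a Virtual SAN partition with esxi-02, which is the situation the shared network is meant to rule out.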
In a vSphere 5.5 Virtual SAN enabled cluster, Virtual SAN datastores are not utilized by the vSphere HA agents as a means for monitoring partitioned or isolated hosts. This is because during a partition or isolation event the impacted hosts would not be able to access the heartbeat information stored on the Virtual SAN datastore.
In a partition event, the heartbeat information would be accessible to only one segment of the cluster, which defeats the purpose. Virtual SAN uses a proprietary mechanism in partition or isolation scenarios to prevent data corruption, and as a byproduct that mechanism would keep some hosts from accessing the heartbeat information.
Heartbeat Datastore Recommendations
Heartbeat datastores are not necessary in a Virtual SAN cluster but, as in a non-Virtual SAN cluster, they can provide additional benefits when available. VMware recommends provisioning Heartbeat datastores when the benefits they provide are sufficient to warrant any additional provisioning costs.
For example, if you are using converged networking, provisioning Heartbeat datastores can be quite expensive since separate switch infrastructure should be used for providing each host with access to a fault-isolated datastore. Hence, there is a higher cost to realizing the benefits. However, if you already have multiple physical networks, the cost of setting up an iSCSI or NFS datastore could be much lower.
Heartbeat Datastores provide the following benefits:
- They allow vCenter to report the actual state of a partitioned or isolated host rather than reporting that it appears to have failed.
- For non-Virtual SAN VMs, they increase the likelihood that an FDM master will respond to a VM that fails after its host becomes partitioned or isolated.
- They prevent vSphere HA from causing VM MAC address conflicts on the VM network after a host isolation or partition when the VM network is not affected by the event. The conflict will exist until the original instance is powered off, which for a partition would occur automatically only after the partition was resolved.
Heartbeat datastores provide these benefits by allowing an FDM master agent to determine if a non-responsive host is isolated, partitioned or dead and if alive, which VMs are running on that host.
Only use a datastore that all hosts will be able to access during a Virtual SAN network partition or isolation event. If you are already using a non-Virtual SAN datastore in a Virtual SAN cluster, there is no need to add another datastore just for heartbeating if the existing datastores are fault isolated from the Virtual SAN network.
To that end, look at your design holistically. For example, if you add an iSCSI datastore as a Heartbeat datastore and your Management, Virtual SAN and iSCSI VMkernel interfaces all use the same 10GbE link, you won’t get the same benefit as you would if the Virtual SAN and iSCSI interfaces used different links. In the shared-link case, if the 10GbE link fails, then even with a Heartbeat datastore the FDM master won’t be able to determine whether non-responsive slaves are isolated or dead.
Host Isolation Address Recommendations
The HA agent on a host declares the host isolated if it observes no HA agent-to-agent network traffic and if attempts to ping the configured isolation addresses fail. Thus, isolation addresses prevent an HA agent from declaring its host isolated when the agent cannot communicate with other HA agents for some other reason, such as the other hosts having failed. HA allows you to set up to 10 isolation addresses.
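The isolation rule described above reduces to a simple boolean condition. The sketch below only restates that rule; the function name `host_is_isolated` and its inputs are hypothetical and do not reflect how the FDM agent is implemented.

```python
def host_is_isolated(sees_agent_traffic, ping_results):
    """Apply the rule from the text: isolated only if there is no agent
    traffic AND every configured isolation address fails to respond.

    ping_results holds one boolean per isolation address (True = reply).
    An empty list is treated here as "no address replied".
    """
    return (not sees_agent_traffic) and not any(ping_results)


# Other hosts have failed, but an isolation address still answers:
# the host is not isolated.
print(host_is_isolated(False, [False, True]))   # False

# No agent traffic and no isolation address answers: the host is isolated.
print(host_is_isolated(False, [False, False]))  # True
```

The first call shows why isolation addresses matter: without a reachable address, a host whose peers have all failed would look the same as an isolated host.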
- When using Virtual SAN and vSphere HA, consider configuring isolation addresses that will allow all hosts to determine if they have lost access to the Virtual SAN network. For example, use the default gateway(s) of the Virtual SAN network(s). Isolation addresses are set using the vSphere HA advanced option das.isolationAddressX.
- Configure HA not to use the default management network’s default gateway. This is done using the vSphere HA advanced option das.useDefaultIsolationAddress=false
- If isolation and partitions are possible, ensure one set of isolation addresses is accessible by the hosts in each segment during a partition.
A couple of additional notes regarding the above recommendations:
- If the Virtual SAN network is non-routable and a single-host partition is possible, then provide pingable isolation addresses on the Virtual SAN subnet. If a single-host partition is not likely, either provision such an isolation address, or use some of the Virtual SAN network IP addresses of the cluster hosts as isolation addresses, selecting a subset from any physical partitions in your environment.
- Each Virtual SAN network should be on a unique subnet. Using the same subnet for two VMkernel networks can cause unexpected results. For example, vSphere HA may fail to detect Virtual SAN network isolation events.
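As an illustration of the recommendations above, here is a minimal sketch that assembles the two advanced options named in this section into a key/value mapping. It does not call any vSphere API. The helper name `build_isolation_options`, the sample addresses (taken from a documentation-only range) and the assumption that the X suffix starts at 0 are all illustrative and should be checked against your environment.

```python
import ipaddress

MAX_ISOLATION_ADDRESSES = 10  # limit stated in the text


def build_isolation_options(addresses):
    """Build HA advanced options for custom isolation addresses.

    Assumption: the X in das.isolationAddressX is numbered from 0.
    """
    if not addresses:
        raise ValueError("at least one isolation address is required")
    if len(addresses) > MAX_ISOLATION_ADDRESSES:
        raise ValueError("HA allows at most 10 isolation addresses")

    # Stop HA from using the management network's default gateway.
    options = {"das.useDefaultIsolationAddress": "false"}
    for index, address in enumerate(addresses):
        ipaddress.ip_address(address)  # raises ValueError if malformed
        options[f"das.isolationAddress{index}"] = address
    return options


# Hypothetical gateways, one reachable from each possible partition segment.
for key, value in build_isolation_options(["192.0.2.1", "192.0.2.129"]).items():
    print(f"{key} = {value}")
```

Listing one address per segment reflects the recommendation that each side of a partition should still be able to reach an isolation address.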
Isolation Response Recommendations
The HA isolation response configuration for a VM can be used to ensure the following during a host isolation event:
- To avoid VM MAC address collisions if independent heartbeat datastores are not used. Note: These can be caused by the FDM master restarting an isolated VM resulting in 2 instances of the same VM on the network.
- To allow HA to restart a VM that is running on an isolated host
- To minimize the likelihood that the memory state of the VM is lost when its host becomes isolated.
The isolation response to select depends on a number of factors. These are summarized in the tables below. The tables include recommendations for Virtual SAN and non-Virtual SAN virtual machines since clusters may contain a mixture of both.
Decisions Table 1
Decisions Table 2
Note: “Shutdown” may also be used anytime “power off” is mentioned if it is likely that a VM will retain access to some of its storage but not all during a host isolation. However, note that in such a situation some of its virtual disks may be updated while others are not, which may cause inconsistencies when the VM is restarted. Further, shutdown can take longer than power off.
I want to thank GS Khalsa from the Storage & Availability Technical Marketing Team, who is responsible for availability features such as HA, FT, App HA and SRM, for his contributions and discussion on Virtual SAN and vSphere HA. I also want to thank our engineers Keith Farkas and Manoj Krishnan for validating the accuracy of this article and contributing greatly to it.
For future updates, be sure to follow me on Twitter: @PunchingClouds