Monday, January 23, 2017

vSAN Availability Part 11 - Stretched Cluster Failure Scenarios

Previous articles have covered how vSAN responds to and recovers from various environmental failures in a standard (non-stretched) cluster. This article provides concise coverage of similar environmental failures in a stretched cluster environment.

Keep in mind the following when it comes to storage policies and component placement in a stretched cluster: RAID-1 mirroring is currently the only fault tolerance method that can be used. Erasure coding requires a minimum of four fault domains. A vSAN stretched cluster only has three fault domains - the preferred, secondary, and witness fault domains. Components that make up objects such as virtual disks are distributed across the sites, i.e., fault domains, to help ensure availability. One copy of each object is placed in the preferred fault domain. A second copy of each object is placed in the secondary fault domain. A witness component for each object is located on the witness VM at a third site.

Disk Failure

Normally in a stretched cluster, all reads are performed locally and writes occur at both sites synchronously. In other words, a VM in "Site A" will read from "Site A" and write to "Site A" and "Site B". When a disk goes offline, vSAN will continue to read from and write to the other copies of impacted objects. Reads will occur across the inter-site link if a VM is running in one site and the only accessible copy of the VM's object is located at the other site. If the vSAN cluster is a hybrid configuration, the read cache for the affected VMs will need to be re-warmed, which can briefly impact performance.
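The read and write behavior described above can be sketched in a few lines of Python. This is only an illustrative model, not vSAN code: the `Replica` type and the `choose_read_site` and `write_targets` helpers are hypothetical names made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    site: str      # fault domain holding this copy, e.g. "Site A"
    healthy: bool  # False if the disk holding this copy is offline


def choose_read_site(vm_site, replicas):
    """Return the site a read is served from, preferring the VM's own site."""
    healthy_sites = [r.site for r in replicas if r.healthy]
    if not healthy_sites:
        raise RuntimeError("no accessible copy of the object")
    if vm_site in healthy_sites:
        return vm_site        # normal case: local read
    return healthy_sites[0]   # disk failure: read crosses the inter-site link


def write_targets(replicas):
    """Writes go to every copy that is still accessible."""
    return [r.site for r in replicas if r.healthy]
```

With both copies healthy, a VM in "Site A" reads from "Site A" and writes to both sites. If the "Site A" copy is on a failed disk, the same VM reads from "Site B" until the local copy is rebuilt.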

The second copy of the object will be rebuilt on one or more healthy disks in the same site after 60 minutes if the failed disk is marked as absent. The rebuild will start immediately if the failed disk is marked as degraded. See Part 4 of this blog series for more information on component states and rebuilds.
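As a rough illustration of the rule above, the following Python sketch maps a component state to the delay before a rebuild starts. The function name and the string state labels are assumptions made for this example; only the 60-minute and immediate behaviors come from the text.

```python
from datetime import timedelta

# Delay described in this article for components marked as absent.
ABSENT_REBUILD_DELAY = timedelta(minutes=60)


def rebuild_delay(component_state):
    """Return how long to wait before rebuilding a failed component."""
    state = component_state.lower()
    if state == "degraded":
        return timedelta(0)          # rebuild starts immediately
    if state == "absent":
        return ABSENT_REBUILD_DELAY  # wait in case the disk comes back
    raise ValueError(f"unexpected component state: {component_state!r}")
```

The delay for absent components gives a disk or host that is only temporarily unavailable a chance to return before data is copied unnecessarily.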

Host Failure

Behavior in a host failure scenario is similar to a disk failure. The main difference is that a host failure will, of course, impact the VMs running on the failed host. Because vSAN and vSphere HA are tightly integrated, these VMs are automatically restarted by vSphere HA on other nodes in the cluster.

Naturally, a host failure will likely have a larger impact than a single disk failure as there are commonly multiple disks in each host. If a host is offline for an extended period of time or permanently, rebuilds will probably take considerably longer than what would be observed with the loss of a single disk. Note that vSAN prioritizes normal production traffic higher than rebuild traffic to minimize any potential performance impact from having to rebuild large amounts of data.

Local Network Failure

If there is a network failure within a site (inter-site link is still intact), vSAN will respond as described in Part 3 of this blog series. In the case where one or more hosts are isolated from the rest of the hosts at the same site, the isolated hosts will form a separate network partition group until the network issue is resolved. The isolated host(s) will lack the necessary number of components/votes (greater than 50%) to achieve quorum. vSphere HA will power off the VMs on the isolated hosts and attempt to restart the VMs on other hosts that have proper network connectivity in the cluster.

Preferred or Secondary Site Failure

When the preferred or secondary site (fault domain) goes offline, it is still possible for vSAN to achieve quorum using the components at the healthy site and the witness components. vSphere HA automatically restarts the VMs that were running at the offline site. VMs running at the healthy site continue to run without downtime. When the offline site is returned to normal operation, all changes made at the healthy site are synchronized between both sites.
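The quorum arithmetic behind this behavior can be sketched as follows. This is a simplified model that assumes one vote per fault domain for a given object; actual vote assignments in vSAN may differ, and the `has_quorum` helper is a hypothetical name used only for illustration.

```python
def has_quorum(votes, reachable):
    """Return True if the reachable fault domains hold more than 50% of votes."""
    total = sum(votes.values())
    available = sum(v for domain, v in votes.items() if domain in reachable)
    return available * 2 > total


# Assumed for illustration: one vote per fault domain for a single object.
votes = {"preferred": 1, "secondary": 1, "witness": 1}

# Secondary site offline: preferred + witness hold 2 of 3 votes.
print(has_quorum(votes, {"preferred", "witness"}))

# Witness offline: the two data sites still hold 2 of 3 votes.
print(has_quorum(votes, {"preferred", "secondary"}))

# Witness and one data site offline: 1 of 3 votes, no quorum.
print(has_quorum(votes, {"secondary"}))
```

Under these assumptions, any single fault domain can be lost without losing quorum, while losing two of the three cannot be tolerated.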

For details on the difference between the Preferred site and Secondary site in a vSAN stretched cluster, see Part 10 in this blog series.

DRS affinity rules can be used to automate the process of migrating running VMs back to their original location after a site outage has been resolved. This click-through demo shows the process of configuring a vSAN stretched cluster, its behavior when a site goes offline, and the return to normal operations when the failed site is back online.

Witness Site Failure

Loss of connectivity to the witness VM has minimal impact on the VMs running in a stretched cluster. VMs continue to run and all data remains accessible. If the witness VM loses connectivity with the preferred and/or secondary site, vSAN is still able to achieve quorum between the preferred and secondary sites.

However, the additional loss of connectivity between the preferred and secondary sites or one of these sites going offline while the witness is offline would cause considerable impact with behaviors similar to what is documented in the "All Hosts Are Disconnected" section of Part 3 in this blog series. That is why it is important to treat the loss of the witness site just like the loss of a main data site and bring the witness back online as soon as possible. If the witness VM is lost permanently, a new one can be deployed with minimal effort. Witness components and vSAN metadata will be re-sync'd with the new witness.

Also keep in mind that the witness must be able to communicate with both sites to participate normally in the cluster. If it is unable to communicate with both the preferred and secondary fault domains, the witness is removed from the cluster. The witness will automatically rejoin the cluster when connectivity to both sites is restored. vSAN verifies connectivity between sites through the use of "heartbeat" packets. These are transmitted between sites once per second. If five consecutive heartbeats are missed, the connection is considered down from a vSAN perspective.
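The heartbeat rule can be modeled with a small counter. The sketch below is illustrative only: the `LinkMonitor` class is a hypothetical name and does not represent how vSAN implements heartbeats internally. Only the one-second interval and the five-miss threshold come from the text.

```python
HEARTBEAT_INTERVAL_SECONDS = 1  # heartbeats are sent once per second
MISSED_HEARTBEAT_LIMIT = 5      # five consecutive misses mark the link as down


class LinkMonitor:
    """Tracks consecutive missed heartbeats for one inter-site connection."""

    def __init__(self, missed_limit=MISSED_HEARTBEAT_LIMIT):
        self.missed_limit = missed_limit
        self.consecutive_missed = 0

    @property
    def is_up(self):
        return self.consecutive_missed < self.missed_limit

    def record(self, received):
        """Record one heartbeat interval and return whether the link is up."""
        if received:
            self.consecutive_missed = 0  # any heartbeat resets the count
        else:
            self.consecutive_missed += 1
        return self.is_up
```

Because the count resets whenever a heartbeat arrives, four missed heartbeats followed by a successful one leave the connection up. Only an unbroken run of five misses, roughly five seconds at one heartbeat per second, marks it as down.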


A vSAN stretched cluster provides resiliency to disk failure, host failure, and the loss of an entire site. Stretched clusters are easy to configure as shown in the click-through demo mentioned earlier. vSphere HA is tightly integrated with vSAN to enable rapid, automated recovery from outages. The next and final installment of this series discusses data protection and disaster recovery in vSAN environments.


  1. Do I need an Enterprise Plus license for the hypervisor, or is Standard OK?

    1. Please see this licensing guide: