Monday, December 5, 2016

vSAN Availability Part 9 - Configuring a Stretched Cluster

A vSAN stretched cluster configuration provides a simple solution for extending a vSAN cluster across geographically dispersed locations. These locations could be opposite sides of a data center with separate power feeds or two different cities. Stretched clusters enable rapid recovery from site failure with no data loss. They also provide an excellent option for migrating workloads between locations with zero downtime if maintenance is needed at one site or the other. For more of an introduction to vSAN stretched clusters, take a look at the previous article (Part 8) in this blog series.

This article shows how simple it is to configure a vSAN stretched cluster, complete with a video demo. Before we get to the video, let's take a moment to cover the main prerequisites that must be in place prior to configuring the stretched cluster. For starters, note that stretched clusters are supported with vSAN versions 6.1 and higher, and vSAN Enterprise licensing is required for this feature.

Host Requirements

A vSAN stretched cluster consists of exactly two main sites, which are commonly referred to as "data sites". The data sites contain the larger components that make up virtual machine objects such as the configuration files (VMX, NVRAM, etc.) and virtual disks (VMDK files). vSAN currently supports as few as one host in each data site and as many as 15 hosts in each data site. In other words, a stretched cluster consists of at least two physical hosts and at most 30 physical hosts. While not a strict requirement, it is highly recommended that the host configuration (CPUs, memory, vSAN disk groups, etc.) is identical or very similar and that the number of hosts at each site is the same. Similar configurations at both sites help ensure the amount of compute and storage capacity is sufficient regardless of which site a particular group of virtual machines is running at. Hybrid and all-flash configurations are supported with stretched clusters.

Witness VM

vSAN stretched clusters require the deployment of a "witness" host at a third location. VMware provides a pre-configured OVA complete with its own license to make the deployment of a witness simple and fast. The witness stores only witness components (see Part 2 of this series for a refresher on component types, if needed). It does not store large amounts of data such as VMDK files. The purpose of the witness VM and the witness components stored on that VM is to serve as a "tie-breaker" in the event network connectivity between the two data sites is lost, and to enable the vSAN cluster to achieve quorum if either of the data sites is offline.

RAID-1 mirroring is currently the only supported fault tolerance method in a vSAN stretched cluster. As an example, let's take a look at component placement for a 100GB VMDK file in a stretched cluster. The diagram below shows that one copy of the VMDK - a single component, in this case - is placed at one data site and the second mirror component is placed at the other data site. The witness component is contained in the witness VM at the third site.

Remember that vSAN requires access to greater than 50% of the components that make up an object to achieve quorum and make the data available. If one of the data sites is offline, greater than 50% of an object's components are still available at the other data site and on the witness VM. Therefore, vSAN can achieve quorum and maintain access to the objects so that the VMs can run. If network connectivity is lost between the two data sites, the witness acts as a tie-breaker between the two data sites and enables one site to still achieve quorum to maintain data accessibility. We will take a closer look at this scenario in the next part of this series.
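The "greater than 50%" rule can be worked through with the three-component example above. The sketch below is a simplified model, not vSAN's actual implementation: it assumes every component counts equally (one vote each), and the site labels and function name are made up for illustration.

```python
def has_quorum(components_per_site, reachable_sites):
    """True if strictly more than half of an object's components are reachable.

    components_per_site maps a site label to the number of components
    stored there. One equal vote per component is a simplifying assumption.
    """
    total = sum(components_per_site.values())
    available = sum(
        count
        for site, count in components_per_site.items()
        if site in reachable_sites
    )
    # Integer comparison avoids floating-point issues with "more than 50%".
    return available * 2 > total


# The 100GB VMDK example: one mirror copy per data site plus a witness component.
vmdk = {"data-site-1": 1, "data-site-2": 1, "witness-site": 1}

if __name__ == "__main__":
    # One data site offline: 2 of 3 components remain reachable.
    print(has_quorum(vmdk, {"data-site-2", "witness-site"}))
    # A data site cut off from both the other site and the witness: 1 of 3.
    print(has_quorum(vmdk, {"data-site-1"}))
```

With one data site offline, two of three components are reachable and the object stays available. A data site that can reach neither its peer nor the witness holds only one of three, so it cannot achieve quorum on its own.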


Many individuals have asked what the supported maximum distance is between sites. The limitation is actually centered around network latency. The farther the distance, the higher the network latency will be in most cases. vSAN stretched clusters support a maximum of 5ms round-trip time (RTT) between the two data sites. vSAN writes data to each site in a synchronous fashion. If the RTT latency is greater than 5ms, performance will likely suffer. As for bandwidth, VMware strongly recommends a 10Gbps connection between data sites. The actual amount of available bandwidth required depends on the workload (number of VMs, amount of data written, etc.). See the vSAN Stretched Cluster Bandwidth Sizing Guide for more information. A Layer-2 (L2) network connection between data sites is also strongly recommended. While L3 is supported, it is not recommended due to the current requirement for multicast communication between the data sites.

Layer-3 (L3) network connectivity is suggested between the data sites and the witness site as this traffic is unicast. The bandwidth and latency requirements are significantly lower since the witness VM only stores witness components, which are very small compared to other vSAN component types. Only metadata updates are transmitted between the witness and each data site. The amount of bandwidth needed depends primarily on the number of objects present in the vSAN stretched cluster. The general recommendation is at least 2Mbps of available bandwidth for every 1000 objects. An RTT latency of up to 200ms is supported between the witness and each data site.
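As a rough worked example of the network figures above, here is a small Python sketch. It assumes the 2Mbps-per-1000-objects guideline scales linearly, which is our reading of the recommendation rather than a documented formula, and the function names and link labels are hypothetical. See the vSAN Stretched Cluster Bandwidth Sizing Guide for real sizing.

```python
# Figures from the article: 5ms RTT between data sites, 200ms RTT to the
# witness, and 2Mbps of witness bandwidth per 1000 objects.
MAX_RTT_MS = {"inter-site": 5.0, "witness": 200.0}
WITNESS_MBPS_PER_1000_OBJECTS = 2.0


def witness_bandwidth_mbps(object_count):
    """Estimate minimum witness bandwidth, assuming linear scaling."""
    return WITNESS_MBPS_PER_1000_OBJECTS * object_count / 1000


def rtt_within_limit(measured_rtt_ms, link):
    """Compare a measured RTT against the supported maximum for a link type."""
    return measured_rtt_ms <= MAX_RTT_MS[link]


if __name__ == "__main__":
    print(witness_bandwidth_mbps(2500))          # estimate for 2500 objects
    print(rtt_within_limit(4.2, "inter-site"))   # under the 5ms limit
    print(rtt_within_limit(8.0, "inter-site"))   # over the 5ms limit
```

Under these assumptions, a cluster with 2500 objects would want roughly 5Mbps available to the witness site, a small fraction of the 10Gbps recommended between the data sites.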


Now that we have covered the primary requirements for a vSAN stretched cluster, let's take a look at the actual process of configuring a stretched cluster. The video (no audio) below shows this process in approximately three minutes. As you might expect, we have already taken care of the prerequisites previously discussed: Physical hosts have been provisioned and configured at each data site, networking is in place, and the witness VM has been deployed to a third site.

The video starts with the separation of hosts into two fault domains: The "preferred" site and the "secondary" site. We will discuss the significance of preferred and secondary in the next article of this series. Next, we select a witness VM (there are two deployed in this particular environment, but only one is required for each stretched cluster). The third step consists of configuring the (virtual) disk group in the witness.

With the stretched cluster configured, the video then shows the resynchronization process for some of the objects in the cluster. This is because both data components for an object ended up on hosts in the same site when the cluster was divided into two sites. Each site becomes a fault domain for a total of three fault domains: two data sites plus the witness site. vSAN "reconfigures" (redistributes) the components so that one copy of a data component is located at each data site and the witness component for each object is located at the witness site. Note that the process of reconfiguring components in a stretched cluster can take a fair amount of time. Once the process is complete (1 minute, 20 seconds into the video), we can see that components for a virtual disk are properly distributed in the stretched cluster.
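To illustrate why some objects resynchronize and others do not, the sketch below flags objects whose two mirror copies landed in the same site after the hosts were split into fault domains. This is only a conceptual model in Python: the host names, site labels, and function name are hypothetical, and vSAN makes this decision internally.

```python
def needs_reconfiguration(data_component_hosts, host_to_site):
    """True if an object's data components do not span both data sites."""
    sites = {host_to_site[host] for host in data_component_hosts}
    return len(sites) < 2


# Hypothetical host-to-site assignment chosen when creating the fault domains.
host_to_site = {
    "host-a1": "site-a",
    "host-a2": "site-a",
    "host-b1": "site-b",
    "host-b2": "site-b",
}

if __name__ == "__main__":
    # Both mirror copies on site-a hosts: one copy has to move to site-b.
    print(needs_reconfiguration(["host-a1", "host-a2"], host_to_site))
    # One copy per site already: nothing to redistribute for this object.
    print(needs_reconfiguration(["host-a1", "host-b2"], host_to_site))
```

Objects that already had one copy at each site are left alone, which is why only some of the objects in the video show resynchronization activity.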

The second half of the video demonstrates behavior when the hosts at one site go offline. A number of VMs are shown running on Host 05. We power off Hosts 03, 04, and 05 - the three hosts at one of the data sites. At approximately the two-minute mark, we refresh the vSphere Web Client and show that the three hosts are offline. Since a vSAN stretched cluster is a synchronous solution, no data is lost and vSphere HA immediately begins restarting the affected VMs at the other site. About 10 seconds later in the video, we see some of the VMs that were running on Host 05 are now running on Host 07 at the other site. The recovery of these VMs took only a few minutes. The remainder of the video shows the affected hosts coming back online and VMs running at the site that was previously offline.

As mentioned above, the next article - Part 10 in this series - provides more details on the concept of a preferred site and a secondary site and how this relates more specifically to scenarios when network connectivity is lost between the two data sites.

For more details on vSAN stretched cluster configuration, requirements, and recommendations, see the vSAN Stretched Cluster & 2 Node Guide.

