Thursday, July 17, 2025

Technical Introduction to vSAN ESA 2-Node Clusters

  • VMware vSAN can be deployed in a 2-node cluster.
  • vSAN ESA 2-node clusters provide excellent performance and availability in a small form factor.
  • A vSAN Witness Host virtual appliance enables protection against "split-brain."

Introduction

VMware's vSAN Express Storage Architecture (ESA) has streamlined hyperconverged infrastructure (HCI) by optimizing performance and efficiency. A particularly compelling deployment model within this architecture is the 2-node cluster. This setup offers a high-availability solution ideal for small sites and edge computing environments where space and hardware resources are limited.

Core Architecture and Requirements

A vSAN ESA 2-node cluster is a specialized configuration that provides data redundancy and high availability with just two physical hosts at the primary site. A standard vSAN cluster requires a minimum of three nodes; a 2-node cluster replaces the third node with a virtual machine appliance called a vSAN Witness Host, which runs on a host other than the two physical nodes.

Figure: VMware vSAN 2-node cluster

Key Components

  1. Two Physical Nodes: These are two physical ESXi hosts that run virtual machines and store the data associated with those virtual machines. In ESA, these nodes must be all-NVMe, using high-performance, enterprise-class NVMe storage devices. There are no traditional vSAN disk groups and no separate caching tier; instead, each host contributes its devices to a single, flexible storage pool.

  2. Witness Host: The witness is a virtual machine appliance that resides on a third host. When a 2-node cluster is deployed at a remote site, the witness typically runs at a primary data center. Its primary role is to act as a tiebreaker in the event of a failure or network partition between the two data nodes. The witness host stores only metadata in the form of witness components. It does not run virtual machines or store virtual machine data (VMDK files).

Networking Essentials

Proper networking is critical for a stable 2-node cluster.

  • Data Node Interconnect: The two data nodes need a high-speed, low-latency network connection, usually a direct link or a dedicated switch. At least 10 Gbps is required to support vSAN traffic and vMotion effectively; 25 Gbps or higher is recommended.
  • vSAN Witness Host Traffic: The connection between the physical hosts and the vSAN Witness Host has its own requirements. The latency limit is less strict than for stretched clusters, but it still applies:

    • Latency: The round-trip time (RTT) to the witness should be less than 500 ms.

    • Bandwidth: A 2 Mbps connection is generally sufficient, as only metadata is transmitted.

    • Traffic: It's best practice to dedicate a separate VMkernel adapter to witness traffic and tag it for that traffic type (Witness Traffic Separation), which keeps witness traffic isolated from other network traffic.
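
The thresholds above can be turned into a quick pre-deployment check. The following is a minimal Python sketch, not a VMware tool: the `LinkMeasurement` type, its field names, and `check_links` are hypothetical, and the measured values are assumed to come from whatever monitoring you already use.

```python
from dataclasses import dataclass

# Thresholds taken from the requirements listed above.
MIN_INTERCONNECT_GBPS = 10.0   # data node interconnect minimum
MAX_WITNESS_RTT_MS = 500.0     # round-trip time to the witness must stay below this
MIN_WITNESS_MBPS = 2.0         # witness link bandwidth


@dataclass
class LinkMeasurement:
    """Hypothetical container for values measured out of band."""
    interconnect_gbps: float
    witness_rtt_ms: float
    witness_mbps: float


def check_links(m: LinkMeasurement) -> list[str]:
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    if m.interconnect_gbps < MIN_INTERCONNECT_GBPS:
        problems.append(
            f"interconnect is {m.interconnect_gbps} Gbps, need at least {MIN_INTERCONNECT_GBPS}"
        )
    if m.witness_rtt_ms >= MAX_WITNESS_RTT_MS:
        problems.append(
            f"witness RTT is {m.witness_rtt_ms} ms, need less than {MAX_WITNESS_RTT_MS}"
        )
    if m.witness_mbps < MIN_WITNESS_MBPS:
        problems.append(
            f"witness bandwidth is {m.witness_mbps} Mbps, need at least {MIN_WITNESS_MBPS}"
        )
    return problems


if __name__ == "__main__":
    # Example values are made up for illustration.
    site = LinkMeasurement(interconnect_gbps=25, witness_rtt_ms=80, witness_mbps=5)
    print(check_links(site) or "all link checks passed")
```

The check only compares numbers against the figures in this article; it does not measure anything itself and does not replace the built-in vSAN health checks.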

Data Placement and Protection

In a vSAN ESA 2-node cluster, data protection is achieved through mirroring. The default storage policy for a 2-node cluster is RAID 1 (Mirroring).

Here's how it works:

  1. A virtual machine's disk object (VMDK) is created on the vSAN datastore.

  2. The object has two complete copies, or replicas.

  3. One replica is placed on Node A, and the second replica is placed on Node B.

  4. A small witness component is created and placed on the witness host.

This creates a data layout of Replica 1, Replica 2, and Witness. This configuration ensures that if one data node fails, a complete copy of the data is still available on the surviving node. The witness component ensures that a quorum (more than 50% of the votes) of components remains available to keep the VM object online.
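
The layout described above (two replicas plus a witness component) can be sketched as a small data structure, which also makes the capacity cost of mirroring visible. This is an illustrative model only: the host names, the `place_object` and `raw_capacity_used` helpers, and the choice to treat the witness component as zero-sized are assumptions, and the sketch ignores how vSAN may split an object into multiple components internally.

```python
def place_object(name: str, size_gb: float) -> list[dict]:
    """Model RAID 1 placement: one full replica per data node, plus a witness."""
    return [
        {"object": name, "type": "replica", "host": "node-a", "size_gb": size_gb},
        {"object": name, "type": "replica", "host": "node-b", "size_gb": size_gb},
        # Assumption: the witness component holds only metadata, so its
        # size is treated as negligible (0) in this sketch.
        {"object": name, "type": "witness", "host": "witness-host", "size_gb": 0.0},
    ]


def raw_capacity_used(components: list[dict]) -> float:
    """Total raw capacity consumed on the data nodes by the replicas."""
    return sum(c["size_gb"] for c in components if c["type"] == "replica")


if __name__ == "__main__":
    layout = place_object("vm1.vmdk", 100.0)
    for component in layout:
        print(component["type"], "on", component["host"])
    # Mirroring stores the data twice, so raw usage is double the object size.
    print("raw GB used:", raw_capacity_used(layout))
```

The takeaway is that a mirrored object consumes roughly twice its size in raw capacity across the two data nodes, while the witness host needs almost none.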

For example, if Node A fails, Node B still has a complete data replica, and the witness host provides the second of the three component votes. The cluster recognizes that a valid, complete copy of the data exists and keeps the object accessible. VMs that were already running on Node B keep running, and VMs that were running on Node A are restarted on Node B by vSphere HA.
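
The failure example above comes down to a vote count. The sketch below is a simplified model under stated assumptions: each of the three components carries exactly one vote, the host names are placeholders, and `object_accessible` is a hypothetical helper rather than anything vSAN exposes. Real vote assignment can be more nuanced.

```python
# Assumption: one vote per component, one component per host.
VOTES = {"node-a": 1, "node-b": 1, "witness-host": 1}
REPLICA_HOSTS = {"node-a", "node-b"}  # hosts that hold a full data replica


def object_accessible(available_hosts: set[str]) -> bool:
    """An object stays online if the reachable components hold more than
    50% of the votes and include at least one complete replica."""
    votes = sum(VOTES[h] for h in available_hosts if h in VOTES)
    has_majority = votes * 2 > sum(VOTES.values())
    has_replica = bool(available_hosts & REPLICA_HOSTS)
    return has_majority and has_replica


if __name__ == "__main__":
    scenarios = {
        "node-a failed": {"node-b", "witness-host"},
        "witness failed": {"node-a", "node-b"},
        "node-a and witness failed": {"node-b"},
    }
    for label, available in scenarios.items():
        print(label, "->", "online" if object_accessible(available) else "offline")
```

Note the last scenario: Node B still holds a complete replica, but with only one of three votes it cannot form a quorum, so in this model the object goes offline.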

Handling Failures: Split-Brain Scenario

The primary function of the witness is to prevent a "split-brain" scenario. Imagine the direct network link between Node A and Node B fails, but both nodes can still communicate with the witness host.

  • Without a witness, both nodes would think the other is offline and would attempt to take sole ownership of the virtual machines, leading to data corruption.

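  • With a witness, only one of the two nodes can form a quorum. The witness joins a partition with a single data node (the one designated as preferred), and together they hold more than 50% of the votes for each VM object. That node keeps ownership of the VM objects and continues writing to its replica. The other node loses quorum and its components become inaccessible, which prevents a split-brain condition and ensures data integrity.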
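
One way to see why a third voter is needed is to count votes per network partition. The following sketch is a simplified model, not vSAN's actual arbitration logic: it assumes one vote per host, uses placeholder host names, and assumes the witness ends up in a partition with `node-a`. The `partitions_with_quorum` helper is hypothetical.

```python
def partitions_with_quorum(partitions: list[set[str]], total_votes: int) -> list[set[str]]:
    """Return the partitions that hold a strict majority of votes.

    Assumption: every host in a partition contributes exactly one vote.
    """
    return [p for p in partitions if len(p) * 2 > total_votes]


if __name__ == "__main__":
    # Interconnect between the data nodes is down.
    without_witness = [{"node-a"}, {"node-b"}]
    with_witness = [{"node-a", "witness-host"}, {"node-b"}]

    # Two voters: each side has exactly 50%, so no side has a majority
    # and nothing can safely decide who owns the data.
    print(partitions_with_quorum(without_witness, total_votes=2))

    # Three voters: exactly one side can exceed 50%, so exactly one
    # node keeps ownership of the objects.
    print(partitions_with_quorum(with_witness, total_votes=3))
```

With an odd number of voters, at most one partition can ever hold a strict majority, which is the property that rules out two nodes writing independently to their own replicas.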

Summary

vSAN ESA 2-node clusters offer a robust and efficient high-availability solution for small environments and edge sites such as remote offices. By combining NVMe hardware with a simple yet effective witness architecture, they deliver enterprise-grade speed and resilience in a compact form factor.