VSAN Component States
Another fairly common component state is Reconfiguring. This state is observed when a change is made to a storage policy or a new storage policy is assigned to an object. For example, it appears when the Failure Tolerance Method is changed from RAID-1 mirroring to RAID-5/6 erasure coding on an all-flash VSAN cluster. The screen shot below shows a component in the Reconfiguring state.
There are other component states related to availability that are observed when a disk or host is offline. Let's take a closer look at these states.
As mentioned above, a component is in an Active state when operations are normal. If an issue occurs that takes a storage device or entire host offline, components on that device or host are marked as Absent or Degraded depending on the issue. Let's start with Absent components.
As you would expect, losing access to a component typically reduces the availability level of an object. Consider this example: A 100GB VMDK object with a storage policy assigned where the Failure Tolerance Method is RAID-1 mirroring and FTT=1. VSAN creates two mirrored copies (replicas), each on a separate host. A witness component is placed on a third host in the cluster. A host containing one of these components goes offline. Two of the three components are still active, which means the object is still accessible. However, if a second host containing another one of those components goes offline, the object would be inaccessible.
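The example above can be sketched in a few lines of Python. This is a simplified model, not how VSAN is implemented: it assumes each component counts equally and that an object is accessible when more than half of its components are active. The `Component` class, the host names, and the `object_accessible` function are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Component:
    """One component of an object (hypothetical model, not a VSAN API)."""
    name: str
    host: str
    active: bool = True


def object_accessible(components: List[Component]) -> bool:
    """Return True when more than half of the components are active.

    Assumption: every component counts equally, which matches the
    three-component RAID-1, FTT=1 example in the text.
    """
    active = sum(1 for c in components if c.active)
    return active * 2 > len(components)


# RAID-1 mirroring with FTT=1: two replicas plus one witness.
vmdk = [
    Component("replica-1", "host-a"),
    Component("replica-2", "host-b"),
    Component("witness", "host-c"),
]

all_online = object_accessible(vmdk)

vmdk[0].active = False            # first host goes offline
one_host_down = object_accessible(vmdk)

vmdk[1].active = False            # second host goes offline
two_hosts_down = object_accessible(vmdk)
```

With one host down the function still returns `True` because two of three components remain active. With two hosts down it returns `False`, which corresponds to the inaccessible object described above.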
VSAN, like many other storage solutions, will perform rebuilds to restore the appropriate level of resiliency. This operation on any storage platform is expensive in terms of I/O - especially if large amounts of data must be copied. VSAN attempts to make an "educated guess" on whether components will become available again in a reasonable amount of time. If components will be available again shortly - after a host reboot, for example - it does not make sense to start a (costly) rebuild process.
If a component goes offline without additional information, VSAN expects the component will come back online shortly. This situation is often referred to as All Paths Down (APD). VSAN will mark missing components in this scenario as Absent.
Examples of this nature include host reboots, network partitions, and pulling a disk from a server chassis. VSAN will wait 60 minutes by default before starting the rebuild process for components marked as Absent. The goal of this approach is to avoid unnecessary rebuilds. The 60-minute rebuild timer can be changed. This VMware KB article discusses the process.
Recommendation: Avoid changing the default rebuild timer setting of 60 minutes. The default provides a good balance between avoiding unnecessary rebuilds and minimizing the risk of downtime. If a situation occurs where a more timely rebuild is desired, it is possible to trigger a rebuild using the Repair Objects Immediately option in the VSAN Health Check user interface.
A component is marked as Degraded when a storage device fails and error codes are sensed. VSAN assumes a defective device will not come back online, so the rebuild process is started immediately. This scenario is commonly referred to as a Permanent Device Loss (PDL).
As a side note, I want to point out why simply pulling a disk is not an accurate way to simulate a disk failure when performing a proof of concept. VSAN sees this as an APD situation and expects the device will come back online shortly. Therefore, VSAN will wait 60 minutes (default) before starting the rebuild process. This has led several administrators to believe that VSAN was not working properly when in fact it was working as designed.
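To make the difference between Absent and Degraded concrete, here is a minimal Python sketch of the decision described above. It is an illustration under stated assumptions, not VSAN code: the `classify` and `minutes_until_rebuild` functions and their parameters are hypothetical, and the only values taken from the text are the state names and the 60-minute default timer.

```python
from enum import Enum
from typing import Optional

# Default rebuild timer for Absent components, as described in the text.
DEFAULT_REPAIR_DELAY_MINUTES = 60


class ComponentState(Enum):
    ACTIVE = "Active"
    ABSENT = "Absent"
    DEGRADED = "Degraded"


def classify(reachable: bool, error_codes_sensed: bool) -> ComponentState:
    """Map what is observed about a component to a state (hypothetical helper)."""
    if reachable:
        return ComponentState.ACTIVE
    if error_codes_sensed:
        # Device reported a failure: treated as permanent (PDL).
        return ComponentState.DEGRADED
    # Offline with no additional information: expected to return (APD).
    return ComponentState.ABSENT


def minutes_until_rebuild(
    state: ComponentState,
    repair_delay: int = DEFAULT_REPAIR_DELAY_MINUTES,
) -> Optional[int]:
    """Return the wait before a rebuild starts, or None if no rebuild is needed."""
    if state is ComponentState.DEGRADED:
        return 0
    if state is ComponentState.ABSENT:
        return repair_delay
    return None


# Pulling a disk gives no error codes, so it is classified as Absent.
pulled_disk = classify(reachable=False, error_codes_sensed=False)
failed_disk = classify(reachable=False, error_codes_sensed=True)
```

In this model `minutes_until_rebuild(pulled_disk)` is 60 and `minutes_until_rebuild(failed_disk)` is 0, which is why a pulled disk does not behave like a failed disk during a proof of concept.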
The last component state I will discuss in this article is Stale, as shown in the following screen shot.
VSAN uses sequence numbers to verify that a component has the latest updates. This sequence number is normally kept consistent across the components that make up an object. When a change to the object occurs, the data is written to both replicas, for example, and the sequence number is updated for the components. If a component is active, but its sequence number is older than the current sequence number for the object, the component is marked as Active - Stale. This usually occurs when the components of an object go offline and come back online at different times.
To help clarify this, consider the screen shot above. The storage policy assigned to the object was RAID-1 mirroring and FTT=1. Host 02 went offline first. The object was still available as VSAN was able to achieve quorum with the replica on Host 04 and the witness on Host 01. Updates to the components on Host 04 and Host 01 continued to occur while Host 02 was offline. Then, Host 04 also went offline. At this point two of the three components were offline, so the object became inaccessible. Host 02 eventually came back online and its component became Active. However, the sequence number of the component on Host 02 was outdated and Host 04 was still offline. In other words, the component on Host 02 is missing the most recent changes and VSAN is "aware" of this due to the difference in sequence numbers. Even though two of the three components that make up the object are active, VSAN keeps the object inaccessible to avoid data loss or corruption. The object will remain inaccessible until the component on Host 04 is online with the most recent data. VSAN will then synchronize the stale component with the component that contains the latest data and enable access to the object.
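The scenario above can be modeled with a short Python sketch. This is a toy model built on assumptions, not the actual VSAN logic: it assumes the current sequence number is the highest one among online components, and that an object is accessible only when more than half of its components are online and current, including at least one data replica. The class, function names, status labels other than Active - Stale, and the sequence number values are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Component:
    """One component of an object (hypothetical model, not a VSAN API)."""
    host: str
    kind: str      # "replica" or "witness"
    online: bool
    seq: int       # last sequence number this component recorded


def latest_seq(components: List[Component]) -> Optional[int]:
    """Highest sequence number among online components (an assumption)."""
    online = [c.seq for c in components if c.online]
    return max(online) if online else None


def status(component: Component, latest: Optional[int]) -> str:
    """Label a component; "Offline" is a placeholder for Absent or Degraded."""
    if not component.online:
        return "Offline"
    return "Active" if component.seq == latest else "Active - Stale"


def object_accessible(components: List[Component]) -> bool:
    """Require a majority of current components, one of them a data replica."""
    latest = latest_seq(components)
    if latest is None:
        return False
    current = [c for c in components if c.online and c.seq == latest]
    has_quorum = len(current) * 2 > len(components)
    has_current_data = any(c.kind == "replica" for c in current)
    return has_quorum and has_current_data


# Host 02 is back but missed updates; Host 04 holds the latest data but is offline.
obj = [
    Component("host-01", "witness", online=True, seq=7),
    Component("host-02", "replica", online=True, seq=5),
    Component("host-04", "replica", online=False, seq=7),
]
host02_status = status(obj[1], latest_seq(obj))
accessible_before = object_accessible(obj)

obj[2].online = True              # Host 04 returns with the latest data
accessible_after = object_accessible(obj)
```

In this model the component on Host 02 is reported as Active - Stale and the object stays inaccessible, even though two of three components are online. Once Host 04 returns, the witness and the Host 04 replica agree on the latest sequence number and the object becomes accessible again.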
Part 5 in the series covers VSAN fault domains.