Thursday, September 22, 2016

Virtual SAN Availability Part 5

"Fault domain" is a term that comes up fairly often in availability discussions. In IT, a fault domain usually refers to a group of servers, storage, and/or networking components that would be impacted collectively by an outage. A very common example of this is a server rack. If a top-of-rack (TOR) switch or the power distribution unit (PDU) for a server rack would fail, it would take all of the servers in that rack offline even though the servers themselves are functioning properly. That server rack is considered a fault domain.

Virtual SAN (VSAN) includes a feature called "Rack Awareness", which enables an administrator to configure fault domains in the context of a Virtual SAN cluster. Before we get into the details of this feature, let's briefly revisit the default behavior of VSAN.
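The rack-as-fault-domain idea described above can be sketched in a few lines of Python. This is only an illustration: the host and rack names are hypothetical, and the code does not call any VSAN API or represent how Rack Awareness is implemented. It simply groups hosts by rack and reports which hosts would go offline together if a rack's TOR switch or PDU failed.

```python
from collections import defaultdict

# Hypothetical inventory (assumption): host name -> rack it is installed in.
HOST_TO_RACK = {
    "esxi-01": "rack-a",
    "esxi-02": "rack-a",
    "esxi-03": "rack-b",
    "esxi-04": "rack-b",
    "esxi-05": "rack-c",
}


def fault_domains(host_to_rack):
    """Group hosts by the rack (fault domain) they share."""
    domains = defaultdict(list)
    for host, rack in sorted(host_to_rack.items()):
        domains[rack].append(host)
    return dict(domains)


def hosts_lost(host_to_rack, failed_rack):
    """Return the hosts taken offline when the given rack fails."""
    return fault_domains(host_to_rack).get(failed_rack, [])


if __name__ == "__main__":
    for rack, hosts in fault_domains(HOST_TO_RACK).items():
        print(f"{rack}: {', '.join(hosts)}")
    print("Lost if rack-a fails:", hosts_lost(HOST_TO_RACK, "rack-a"))
```

Every host in the failed rack is lost at once, even though each server is healthy on its own.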

Monday, September 19, 2016

Virtual SAN Availability Part 4

VSAN Component States


VSAN components can be found in a few different states. The most common state is Active, which means the component is accessible and is up to date. Below we see two components that are Active.


Another fairly common component state is Reconfiguring. This state is observed when a change to a storage policy is made or a new storage policy is assigned to an object. For example, it appears when the Failure Tolerance Method is changed from RAID-1 mirroring to RAID-5/6 erasure coding on an all-flash VSAN cluster. The screenshot below shows a component in the Reconfiguring state.


There are other component states related to availability that are observed when a disk or host is offline. Let's take a closer look at these states.

Wednesday, September 14, 2016

Virtual SAN Availability Part 3

VSAN Utilizes the Network


VSAN consists of two or more physical hosts typically connected by 10GbE networking. 1GbE is supported with hybrid VSAN configurations, but 10GbE is recommended. 10GbE is required for all-flash VSAN clusters. These network connections are required to replicate data across hosts for redundancy and to share metadata updates such as the location of an object's components.

As with any storage fabric, redundant connections are strongly recommended. The VMware Virtual SAN 6.2 Network Design Guide provides more details on network configurations and recommendations. VSAN's dependence on the network often brings up questions about what happens if one or more hosts lose network connectivity with other hosts in the cluster. This article aims to address those questions.

Friday, September 9, 2016

Virtual SAN Availability Part 2

Storage Policies Affect the Number of Components


In the first part of this blog series, we started with the basics of Virtual SAN (VSAN) architecture and how data is stored on a VSAN datastore. As discussed in that post, objects such as virtual disks (VMDKs) are stored as one or more components. The maximum size of a component is 255GB. If an object is larger than 255GB, it is split up into multiple components.
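The 255GB limit lends itself to a quick back-of-the-envelope calculation. The sketch below is a minimal illustration, not a VSAN tool: the function name is hypothetical, and it only computes the minimum number of components needed to hold a single copy of an object based on size alone. The actual count on a datastore also depends on the storage policy, as the rest of this article discusses.

```python
import math

MAX_COMPONENT_GB = 255  # maximum component size stated in the text


def min_components(object_size_gb):
    """Minimum components needed to hold one copy of an object, by size only."""
    if object_size_gb <= 0:
        raise ValueError("object size must be positive")
    return math.ceil(object_size_gb / MAX_COMPONENT_GB)


if __name__ == "__main__":
    for size_gb in (100, 255, 256, 600):
        print(f"{size_gb} GB -> at least {min_components(size_gb)} component(s)")
```

For example, a 600GB virtual disk needs at least three components per copy, because two components can hold at most 510GB.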

Another factor that affects the number of components that make up an object is the level of availability. This is determined by the availability rule(s) configured in the storage policy assigned to the object. These rules and how they affect component counts are the topic of this article.

Wednesday, September 7, 2016

Virtual SAN Availability Part 1

Introduction


VMworld 2016 U.S. featured many popular breakout sessions covering VMware storage and availability products, including Virtual SAN, Site Recovery Manager, and Virtual Volumes. One of these sessions was STO8179 - Understanding the Availability Features of Virtual SAN, which was delivered by GS Khalsa and me. Most of the VMworld sessions are available for playback online, but I thought it made sense to create a blog series on this topic considering the popularity of the session. Only a finite amount of content can be delivered within the 60-minute time frame of a VMworld breakout session. A series of blog articles offers another way to consume the information and allows for the addition of supplemental content. This article is the first of the series. As stated in the video recording of the VMworld session, this discussion assumes you have some basic knowledge of Virtual SAN, or "VSAN" as it is often called. If you need a primer, start with this VMware Virtual SAN 6.2 Data Sheet.


Friday, July 8, 2016

Assigning a Storage Policy to Multiple VMs with PowerCLI

Storage Policy-Based Management (SPBM) is one of the most compelling benefits of Virtual SAN. With SPBM, you can define a number of policies, each with a rule set that governs items such as availability and performance. A policy can then be assigned to existing virtual machines or applied when a new virtual machine is created. It is even possible to assign a policy to individual virtual disks. The ability to assign policies at the virtual machine and virtual disk levels enables precise management of storage without having to create LUNs, define RAID sets, mask LUNs, etc. There are cases where an administrator might need to assign a policy to a group of virtual machines. This article provides a short script for automating the assignment of a storage policy to multiple virtual machines using VMware vSphere PowerCLI.


Friday, May 27, 2016

vSphere Replication Appliance Failure Prevention And Recovery

vSphere Replication is an excellent host-based, per-VM replication solution that is included with vSphere Essentials Plus Kit and higher editions. That’s right – if you have vSphere Essentials Plus or higher, you have vSphere Replication. There are several use cases for vSphere Replication: Migrating VMs from old hardware to new hardware, migrating VMs between data centers, and disaster recovery – with or without vCenter Site Recovery Manager (SRM) – to name a few. When talking with customers, we tend to cover the features and benefits for starters and move on to how it works – and then what happens when issues such as hardware failure, administrative mistakes, etc. occur.

In this article, we will not discuss all of the details around how it works, but at a high level, changed data for each protected VM is replicated from vSphere hosts at the source location to one or more vSphere Replication virtual appliance(s) at the target location. The vSphere Replication appliance(s) then write this replicated data to vSphere storage at the target location. This often raises questions about what happens if these vSphere Replication appliances go offline or are lost. That is what we will cover in this post.