vSphere 7.0 Update 1 (released in October 2020) introduced a new feature called vSphere Cluster Services (vCLS). The purpose of vCLS is to ensure that cluster services, such as vSphere DRS and vSphere HA, are available to maintain the resources and health of the workloads running in the cluster. vCLS is independent of vCenter Server availability.
vCLS uses agent virtual machines to maintain cluster services health. vCLS runs in every cluster, even when cluster services like vSphere DRS and vSphere HA are not enabled.
The vCLS control plane consists of a maximum of three virtual machines, also called system or agent VMs. The vCLS VMs are placed on separate hosts in a cluster. In a smaller environment (fewer than three hosts), the number of vCLS VMs equals the number of hosts. SDDC (Software Defined Datacenter) admins do not need to maintain the life cycle of these vCLS VMs.
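The sizing rule above can be sketched as a small Python function. This is only an illustration of the rule as described here, not VMware code; the names `MAX_VCLS_VMS` and `expected_vcls_vm_count` are hypothetical.

```python
# Illustrative sketch only: names below are hypothetical, not a VMware API.
MAX_VCLS_VMS = 3  # upper bound on vCLS VMs per cluster, as stated in the text


def expected_vcls_vm_count(host_count: int) -> int:
    """Return how many vCLS VMs a cluster with `host_count` hosts should have."""
    if host_count < 0:
        raise ValueError("host_count must not be negative")
    # One vCLS VM per host, capped at three.
    return min(host_count, MAX_VCLS_VMS)
```

For example, a two-host cluster would be expected to run two vCLS VMs, while an eight-host cluster would run three.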
The architecture for the vSphere Cluster Services is displayed in this image.
The vCLS VMs that form the cluster quorum are self-correcting. This means that when the vCLS VMs are not available, vSphere Cluster Services will try to create, update, or power on the vCLS VMs automatically.
There are three health states for the cluster services:
- Healthy: The vSphere Cluster Services health is green when at least one vCLS VM is running in the cluster. To maintain vCLS VM availability, a cluster quorum of three vCLS VMs is deployed.
- Degraded: This is a transient state in which at least one of the vCLS VMs is not available, but DRS remains functional. The cluster can also be in this state while vCLS VMs are being redeployed or powered on after some impact to the running vCLS VMs.
- Unhealthy: A vCLS unhealthy state occurs when DRS loses its functionality because the vCLS control plane is not available.
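One simplified way to read the three states is as a function of how many vCLS VMs are running compared to how many the cluster should have. The sketch below is an assumption-laden model, not how vCenter computes the state: the Healthy and Degraded descriptions overlap, and this sketch resolves the overlap by treating any missing vCLS VM as Degraded and no running vCLS VM at all as Unhealthy. The names `VclsHealth` and `classify_vcls_health` are hypothetical.

```python
from enum import Enum


class VclsHealth(Enum):
    """The three cluster service health states described above."""
    HEALTHY = "healthy"
    DEGRADED = "degraded"
    UNHEALTHY = "unhealthy"


def classify_vcls_health(running_vms: int, expected_vms: int) -> VclsHealth:
    """Simplified model (assumption): derive the state from VM counts only."""
    if running_vms <= 0:
        # No vCLS VM is running, so the control plane is unavailable.
        return VclsHealth.UNHEALTHY
    if running_vms < expected_vms:
        # At least one vCLS VM is missing, but at least one is still running.
        return VclsHealth.DEGRADED
    return VclsHealth.HEALTHY
```

In this model, a three-host cluster with two running vCLS VMs is classified as degraded until the third VM is redeployed or powered on again.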
The vCLS VMs are automatically placed in their own folder within the cluster.
The vCLS VMs are small and use minimal resources. If no shared storage is available, the vCLS VMs are created on local storage. If a cluster is created before shared storage (for instance vSAN) is configured on the ESXi hosts, it is strongly recommended to move the vCLS VMs to the shared storage once it is available.
The vCLS VMs are running a customized Photon OS. In the image below you see the resources of a vCLS VM.
The 2 GB virtual disk is thin provisioned. The vCLS VM has no NIC. It does not need one, because vCLS uses a VMCI/vSocket interface to communicate with the hypervisor.
The health of vCLS VMs, including power state, is managed by vSphere ESX Agent Manager (EAM). If a vCLS VM fails to power on, or if the first instance of DRS for a cluster is skipped due to a lack of quorum of vCLS VMs, a banner appears on the cluster summary page along with a link to a Knowledge Base article to help troubleshoot the error state. Because vCLS VMs are treated as system VMs, you do not need to back up or snapshot these VMs. The health state of these VMs is managed by vCenter services.