vSphere Availability
Update 1
VMware vSphere 6.5
VMware ESXi 6.5
vCenter Server 6.5

This document supports the version of each product listed and supports all subsequent versions until the document is replaced by a new edition. To check for more recent editions of this document, see http://www.vmware.com/support/pubs.
You can find the most up-to-date technical documentation on the VMware Web site at: http://www.vmware.com/support/

The VMware Web site also provides the latest product updates. If you have comments about this documentation, submit your feedback to: docfeedback@vmware.com

Copyright © 2009–2017 VMware, Inc. All rights reserved. Copyright and trademark information.

VMware, Inc. 3401 Hillview Ave. Palo Alto, CA 94304 www.vmware.com
Contents

About vSphere Availability
1 Business Continuity and Minimizing Downtime
    Reducing Planned Downtime
    Preventing Unplanned Downtime
    vSphere HA Provides Rapid Recovery from Outages
    vSphere Fault Tolerance Provides Continuous Availability
    Protecting the vCenter Server Appliance with vCenter High Availability
    Protecting vCenter Server with VMware Service Lifecycle Manager
2 Creating and Using vSphere HA Clusters
    How vSphere HA Works
    vSphere HA Admission Control
    vSphere H
    Configure MSCS for High Availability
Index
About vSphere Availability

vSphere Availability describes solutions that provide business continuity, including how to establish vSphere High Availability (HA) and vSphere Fault Tolerance.

Intended Audience

This information is for anyone who wants to provide business continuity through the vSphere HA and Fault Tolerance solutions. The information in this book is for experienced Windows or Linux system administrators who are familiar with virtual machine technology and data center operations.
1 Business Continuity and Minimizing Downtime

Downtime, whether planned or unplanned, brings considerable costs. However, solutions that ensure higher levels of availability have traditionally been costly, hard to implement, and difficult to manage. VMware software makes it simpler and less expensive to provide higher levels of availability for important applications.
The vSphere vMotion and Storage vMotion functionality in vSphere makes it possible for organizations to reduce planned downtime because workloads in a VMware environment can be dynamically moved to different physical servers or to different underlying storage without service interruption. Administrators can perform faster and completely transparent maintenance operations, without being forced to schedule inconvenient maintenance windows.
vSphere HA has several advantages over traditional failover solutions:

Minimal setup
    After a vSphere HA cluster is set up, all virtual machines in the cluster get failover support without additional configuration.
Reduced hardware cost and setup
    The virtual machine acts as a portable container for the applications and it can be moved among hosts. Administrators avoid duplicate configurations on multiple machines.
Protecting the vCenter Server Appliance with vCenter High Availability

vCenter High Availability (vCenter HA) protects not only against host and hardware failures but also against vCenter Server application failures. Using automated failover from active to passive, vCenter HA supports high availability with minimal downtime.

vCenter HA Deployment Options

vCenter HA protects your vCenter Server Appliance.
2 Creating and Using vSphere HA Clusters

vSphere HA clusters enable a collection of ESXi hosts to work together so that, as a group, they provide higher levels of availability for virtual machines than each ESXi host can provide individually. When you plan the creation and usage of a new vSphere HA cluster, the options you select affect the way that cluster responds to failures of hosts or virtual machines.
Master and Subordinate Hosts

When you add a host to a vSphere HA cluster, an agent is uploaded to the host and configured to communicate with other agents in the cluster. Each host in the cluster functions as a master host or a subordinate host. When vSphere HA is enabled for a cluster, all active hosts (those that are not in standby or maintenance mode and are not disconnected) participate in an election to choose the cluster's master host.
If a master host cannot communicate directly with the agent on a subordinate host, the subordinate host does not respond to ICMP pings, and the agent is not issuing heartbeats, the subordinate host is viewed as failed. The host's virtual machines are restarted on alternate hosts. If such a subordinate host is exchanging heartbeats with a datastore, the master host assumes that the subordinate host is in a network partition or is network isolated.
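The failure-detection rules in this passage can be sketched as a small decision function. This is only an illustration of the described behavior, not the vSphere HA agent's actual logic. The function name, the `HostState` values, and the `UNDETERMINED` fallback for combinations that the passage does not describe are assumptions made for the example.

```python
from enum import Enum


class HostState(Enum):
    LIVE = "live"
    FAILED = "failed"
    PARTITIONED_OR_ISOLATED = "partitioned or isolated"
    UNDETERMINED = "undetermined"  # assumption: case not covered by the text


def classify_subordinate(agent_reachable: bool,
                         responds_to_ping: bool,
                         datastore_heartbeats: bool) -> HostState:
    """Classify a subordinate host from the master host's point of view."""
    if agent_reachable:
        # The master can talk to the agent directly, so nothing is wrong.
        return HostState.LIVE
    if datastore_heartbeats:
        # The host is still alive on storage, so only its network is affected.
        return HostState.PARTITIONED_OR_ISOLATED
    if not responds_to_ping:
        # No agent traffic, no ICMP reply, no heartbeats: treat as failed.
        return HostState.FAILED
    return HostState.UNDETERMINED
```

For example, `classify_subordinate(False, False, True)` yields `HostState.PARTITIONED_OR_ISOLATED`, in which case the master would not treat the host as failed.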
To use the Shutdown and restart VMs setting, you must install VMware Tools in the guest operating system of the virtual machine. Shutting down the virtual machine provides the advantage of preserving its state. Shutting down is better than powering off the virtual machine, which does not flush the most recent changes to disk or commit transactions. Virtual machines that are in the process of shutting down take longer to fail over while the shutdown completes.
Host limits
    In addition to resource reservations, a virtual machine can only be placed on a host if doing so does not violate the maximum number of allowed virtual machines or the number of in-use vCPUs.
Feature constraints
    If the advanced option has been set that requires vSphere HA to enforce VM to VM anti-affinity rules, vSphere HA does not violate this rule.
After failures are detected, vSphere HA resets virtual machines. The reset ensures that services remain available. To avoid resetting virtual machines repeatedly for nontransient errors, by default, virtual machines will be reset only three times during a certain configurable time interval. After virtual machines have been reset three times, vSphere HA makes no further attempts to reset the virtual machines after subsequent failures until after the specified time has elapsed.
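The reset limit described above behaves like a rate limiter. The sketch below shows one possible interpretation, a sliding time window, and is not taken from the product. The class name, the method names, and the 3600-second window are hypothetical; only the limit of three resets comes from the text.

```python
from collections import deque


class ResetLimiter:
    """Allow at most max_resets resets within any window_seconds interval."""

    def __init__(self, max_resets: int = 3, window_seconds: float = 3600.0):
        self.max_resets = max_resets          # three resets, per the text
        self.window_seconds = window_seconds  # hypothetical interval
        self._resets = deque()                # timestamps of recent resets

    def may_reset(self, now: float) -> bool:
        # Forget resets that have aged out of the window.
        while self._resets and now - self._resets[0] >= self.window_seconds:
            self._resets.popleft()
        return len(self._resets) < self.max_resets

    def record_reset(self, now: float) -> None:
        self._resets.append(now)
```

A caller would check `may_reset(now)` after each detected failure and call `record_reset(now)` only when a reset is actually performed.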
Network Partitions

When a management network failure occurs for a vSphere HA cluster, a subset of the cluster's hosts might be unable to communicate over the management network with the other hosts. Multiple partitions can occur in a cluster. A partitioned cluster leads to degraded virtual machine protection and cluster management functionality. Correct the partitioned cluster as soon as possible.
• Virtual machine protection.
vSphere HA Security

vSphere HA is enhanced by several security features.

Select firewall ports opened
    vSphere HA uses TCP and UDP port 8182 for agent-to-agent communication. The firewall ports open and close automatically to ensure they are open only when needed.
Configuration files protected using file system permissions
    vSphere HA stores configuration information on the local storage or on ramdisk if there is no local datastore.
vSphere HA Admission Control

vSphere HA uses admission control to ensure that sufficient resources are reserved for virtual machine recovery when a host fails. Admission control imposes constraints on resource usage. Any action that might violate these constraints is not permitted.
3 Calculates the Current CPU Failover Capacity and Current Memory Failover Capacity for the cluster.
4 Determines if either the Current CPU Failover Capacity or Current Memory Failover Capacity is less than the corresponding Configured Failover Capacity (provided by the user). If so, admission control disallows the operation.

vSphere HA uses the actual reservations of the virtual machines.
Figure 2‑1. Admission Control Example with Percentage of Cluster Resources Reserved Policy

Virtual machine   CPU     Memory
VM1               2GHz    1GB
VM2               2GHz    1GB
VM3               1GHz    2GB
VM4               1GHz    1GB
VM5               1GHz    1GB
Total resource requirements: 7GHz, 6GB

Host   CPU     Memory
H1     9GHz    9GB
H2     9GHz    6GB
H3     6GHz    6GB
Total host resources: 24GHz, 21GB

The total resource requirements for the powered-on virtual machines are 7GHz and 6GB. The total host resources available for virtual machines are 24GHz and 21GB.
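The arithmetic behind this example can be reproduced with a short sketch. It assumes that the current failover capacity is the share of total host resources left over after the virtual machine requirements are subtracted. The function names and the 25% configured failover capacity are hypothetical values chosen for illustration.

```python
def failover_capacity(total_host: float, total_required: float) -> float:
    """Fraction of host resources still free after VM requirements."""
    return (total_host - total_required) / total_host


def operation_allowed(current_cpu: float, current_mem: float,
                      configured_cpu: float, configured_mem: float) -> bool:
    """Disallow the operation if either capacity is below its configured value."""
    return current_cpu >= configured_cpu and current_mem >= configured_mem


# Values from Figure 2-1: 7GHz and 6GB required, 24GHz and 21GB available.
cpu_capacity = failover_capacity(total_host=24.0, total_required=7.0)
mem_capacity = failover_capacity(total_host=21.0, total_required=6.0)

# Hypothetical configured failover capacity of 25% for CPU and memory.
allowed = operation_allowed(cpu_capacity, mem_capacity, 0.25, 0.25)

print(f"CPU: {cpu_capacity:.1%}, memory: {mem_capacity:.1%}, allowed: {allowed}")
```

With these inputs both capacities are roughly 70%, well above the assumed 25% threshold, so the operation would be permitted.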
Slot size consists of two components, CPU and memory.
• vSphere HA calculates the CPU component by obtaining the CPU reservation of each powered-on virtual machine and selecting the largest value. If you have not specified a CPU reservation for a virtual machine, it is assigned a default value of 32MHz. You can change this value by using the das.vmcpuminmhz advanced option.
Figure 2‑2. Admission Control Example with Host Failures Cluster Tolerates Policy

Virtual machine   CPU     Memory
VM1               2GHz    1GB
VM2               2GHz    1GB
VM3               1GHz    2GB
VM4               1GHz    1GB
VM5               1GHz    1GB
Slot size: 2GHz, 2GB

Host   CPU     Memory   Slots
H1     9GHz    9GB      4
H2     9GHz    6GB      3
H3     6GHz    6GB      3
6 slots remaining if H1 fails

1 Slot size is calculated by comparing both the CPU and memory requirements of the virtual machines and selecting the largest.
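The slot numbers in Figure 2‑2 can be checked with the following sketch. It assumes that a host's slot count is limited by whichever resource runs out first, that each powered-on virtual machine needs one slot, and that failover capacity is found by removing the largest hosts first. The function names are hypothetical, and memory overhead is ignored.

```python
# (CPU in GHz, memory in GB), taken from Figure 2-2.
vms = {"VM1": (2, 1), "VM2": (2, 1), "VM3": (1, 2), "VM4": (1, 1), "VM5": (1, 1)}
hosts = {"H1": (9, 9), "H2": (9, 6), "H3": (6, 6)}


def slot_size(vm_requirements):
    """Largest CPU requirement and largest memory requirement across VMs."""
    return (max(cpu for cpu, _ in vm_requirements.values()),
            max(mem for _, mem in vm_requirements.values()))


def slots_per_host(host_resources, size):
    """Whole slots each host can hold; the scarcer resource is the limit."""
    slot_cpu, slot_mem = size
    return {name: min(cpu // slot_cpu, mem // slot_mem)
            for name, (cpu, mem) in host_resources.items()}


def tolerated_host_failures(slots, slots_needed):
    """Remove the largest hosts first while the rest still fit every VM."""
    remaining = sorted(slots.values(), reverse=True)
    tolerated = 0
    while remaining:
        remaining.pop(0)  # worst case: the host with the most slots fails
        if sum(remaining) < slots_needed:
            break
        tolerated += 1
    return tolerated


size = slot_size(vms)
slots = slots_per_host(hosts, size)
capacity = tolerated_host_failures(slots, slots_needed=len(vms))
print(size, slots, capacity)
```

Under these assumptions the cluster in the figure can lose one host, because the six slots left after H1 fails still cover the five powered-on virtual machines.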
vSphere HA Interoperability

vSphere HA can interoperate with many other features, such as DRS and vSAN. Before configuring vSphere HA, you should be aware of the limitations of its interoperability with these other features or products.

Using vSphere HA with vSAN

You can use vSAN as the shared storage for a vSphere HA cluster. If enabled, vSAN aggregates the specified local storage disks available on the hosts into a single datastore shared by all hosts.
Capacity Reservation Settings

When you reserve capacity for your vSphere HA cluster with an admission control policy, you must coordinate this setting with the corresponding vSAN setting that ensures data accessibility on failures. Specifically, the Number of Failures Tolerated setting in the vSAN rule set must not be lower than the capacity that the vSphere HA admission control setting reserved.
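The coordination rule above amounts to a simple comparison. The sketch below assumes that the reserved vSphere HA capacity is expressed as a number of host failures. The function and parameter names are hypothetical and do not correspond to any vSphere API.

```python
def settings_are_coordinated(vsan_failures_tolerated: int,
                             ha_host_failures_reserved: int) -> bool:
    """True when the vSAN setting is not lower than the reserved HA capacity."""
    return vsan_failures_tolerated >= ha_host_failures_reserved


# Example: vSAN tolerates one failure but HA reserves capacity for two hosts.
print(settings_are_coordinated(vsan_failures_tolerated=1,
                               ha_host_failures_reserved=2))
```

In that example the check fails, which signals that one of the two settings should be adjusted before relying on the cluster for failover.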
vSphere HA and DRS Affinity Rules

If you create a DRS affinity rule for your cluster, you can specify how vSphere HA applies that rule during a virtual machine failover. The two types of rules for which you can specify vSphere HA failover behavior are the following:
• VM anti-affinity rules force specified virtual machines to remain apart during failover actions.
In addition to the previous restrictions, the following IPv6 address types are not supported for use with the vSphere HA isolation address or management network: link-local, ORCHID, and link-local with zone indices. Also, the loopback address type cannot be used for the management network.

Note To upgrade an existing IPv4 deployment to IPv6, you must first disable vSphere HA.
• vSphere HA supports both IPv4 and IPv6. See “Other vSphere HA Interoperability Issues” for considerations when using IPv6.
• For VM Component Protection to work, hosts must have the All Paths Down (APD) Timeout feature enabled.
• To use VM Component Protection, clusters must contain ESXi 6.0 hosts or later.
• Only vSphere HA clusters that contain ESXi 6.0 or later hosts can be used to enable VMCP.
c Select Turn ON vSphere HA.
d Select Turn ON Proactive HA to allow proactive migrations of VMs from hosts on which a provider has notified a health degradation.
6 Under Failures and Responses, select Enable Host Monitoring.
With Host Monitoring enabled, hosts in the cluster can exchange network heartbeats and vSphere HA can take action when it detects failures. Host Monitoring is required for the vSphere Fault Tolerance recovery process to work properly.
Configuring Responses to Failures

The Failures and Responses pane of the vSphere HA settings allows you to configure how your cluster should function when problems are encountered. In this part of the vSphere Web Client, you can determine the specific responses the vSphere HA cluster has for host failures and isolation.
Respond to Host Isolation

You can set specific responses to host isolation that occurs in your vSphere HA cluster. This page is editable only if you have enabled vSphere HA.

Procedure
1 In the vSphere Web Client, browse to the vSphere HA cluster.
2 Click the Configure tab.
3 Select vSphere Availability and click Edit.
4 Click Failures and Responses and expand Response for Host Isolation.
4 Click Failures and Responses and expand VM Monitoring.
5 Select VM Monitoring and Application Monitoring. These settings turn on VMware Tools heartbeats and application heartbeats, respectively.
6 To set the heartbeat monitoring sensitivity, move the slider between Low and High or select Custom to provide custom settings.
7 Click OK.

Your monitoring settings take effect.
Configure Admission Control

After you create a cluster, you can configure admission control to specify whether virtual machines can be started if they violate availability constraints. The cluster reserves resources so that failover can occur for all running virtual machines on the specified number of hosts. The Admission Control page appears only if you enabled vSphere HA.

Procedure
1 In the vSphere Web Client, browse to the vSphere HA cluster.
5 To instruct vSphere HA about how to select the datastores and how to treat your preferences, select from the following options.

Table 2‑3. Datastore Heartbeating Options
• Automatically select datastores accessible from the host
• Use datastores only from the specified list
• Use datastores from the specified list and complement automatically if needed

6 In the Available heartbeat datastores pane, select the datastores that you want to use for heartbeating.
vSphere HA Advanced Options

You can set advanced options that affect the behavior of your vSphere HA cluster.

Table 2‑4. vSphere HA Advanced Options

das.isolationaddress[...]
    Sets the address to ping to determine if a host is isolated from the network. This address is pinged only when heartbeats are not received from any other host in the cluster. If not specified, the default gateway of the management network is used.
Table 2‑4. vSphere HA Advanced Options (Continued)

fdm.isolationpolicydelaysec
    The number of seconds the system waits before executing the isolation policy after it is determined that a host is isolated. The minimum value is 30. If set to a value less than 30, the delay will be 30 seconds.
das.respectvmvmantiaffinityrules
    Determines if vSphere HA enforces VM-VM anti-affinity rules. Default value is "false", whereby the rules are not enforced.
Table 2‑4. vSphere HA Advanced Options (Continued)

das.reregisterrestartdisabledvms
    When vSphere HA is disabled on a specific VM, this option ensures that the VM is registered on another host after a failure. This allows you to power on that VM without needing to re-register it manually.
    Note When this option is used, vSphere HA does not power on the VM, but only registers it.
das.
Best Practices for VMware vSphere® High Availability Clusters

To ensure optimal vSphere HA cluster performance, you must follow certain best practices. This section highlights some of the key best practices for a vSphere HA cluster. You can also refer to the vSphere High Availability Deployment Best Practices publication for further discussion.

Best Practices for Networking

Observe the following best practices for the configuration of host NICs and network topology for vSphere HA.
Network Isolation Addresses

A network isolation address is an IP address that is pinged to determine whether a host is isolated from the network. This address is pinged only when a host has stopped receiving heartbeats from all other hosts in the cluster. If a host can ping its network isolation address, the host is not network isolated, and the other hosts in the cluster have either failed or are network partitioned.
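The order of checks described above can be sketched from the point of view of a single host. This is an illustration only: the function name and return strings are hypothetical, and treating a failed ping as isolation is an inference from the definition of the isolation address rather than a statement in this passage.

```python
from typing import Callable


def assess_isolation(receiving_heartbeats: bool,
                     ping_isolation_address: Callable[[], bool]) -> str:
    """Decide whether this host should consider itself network isolated."""
    if receiving_heartbeats:
        # Heartbeats still arrive, so the isolation address is never pinged.
        return "connected"
    if ping_isolation_address():
        # The network works; the other hosts have failed or are partitioned.
        return "not isolated"
    # Assumption: no heartbeats and no ping reply means the host is isolated.
    return "isolated"
```

Passing the ping as a callable makes it explicit that the isolation address is contacted only after heartbeats from all other hosts have stopped.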
Best Practices for Interoperability

Observe the following best practices for allowing interoperability between vSphere HA and other features.

vSphere HA and Storage vMotion Interoperability in a Mixed Cluster

In clusters where ESXi 5.x hosts and ESX/ESXi 4.1 or earlier hosts are present and where Storage vMotion is used extensively or Storage DRS is enabled, do not deploy vSphere HA.
3 Providing Fault Tolerance for Virtual Machines

You can use vSphere Fault Tolerance for your virtual machines to ensure continuity with higher levels of availability and data protection. Fault Tolerance is built on the ESXi host platform, and it provides availability by having identical virtual machines run on separate hosts. To obtain the optimal results from Fault Tolerance, you must be familiar with how it works, how to enable it for your cluster and virtual machines, and the best practices for its usage.
A fault tolerant virtual machine and its secondary copy are not allowed to run on the same host. This restriction ensures that a host failure cannot result in the loss of both VMs.

Note You can also use VM-Host affinity rules to dictate which hosts designated virtual machines can run on. If you use these rules, be aware that for any Primary VM that is affected by such a rule, its associated Secondary VM is also affected by that rule.
CPUs that are used in host machines for fault tolerant VMs must be compatible with vSphere vMotion or improved with Enhanced vMotion Compatibility. Also, CPUs that support Hardware MMU virtualization (Intel EPT or AMD RVI) are required. The following CPUs are supported.
• Intel Sandy Bridge or later. Avoton is not supported.
• AMD Bulldozer or later.

Use a 10-Gbit logging network for FT and verify that the network is low latency.
• Linked clones. You cannot use Fault Tolerance on a virtual machine that is a linked clone, nor can you create a linked clone from an FT-enabled virtual machine.
• VM Component Protection (VMCP). If your cluster has VMCP enabled, overrides are created for fault tolerant virtual machines that turn this feature off.
• Virtual Volume datastores.
• Storage-based policy management.
• I/O filters.
Using Fault Tolerance with DRS

You can use vSphere Fault Tolerance with vSphere Distributed Resource Scheduler (DRS) only when the Enhanced vMotion Compatibility (EVC) feature is enabled. This process allows fault tolerant virtual machines to benefit from better initial placement.
Host Requirements for Fault Tolerance

You must meet the following host requirements before you use Fault Tolerance.
• Hosts must use supported processors.
• Hosts must be licensed for Fault Tolerance.
• Hosts must be certified for Fault Tolerance. See http://www.vmware.com/resources/compatibility/search.php and select Search by Fault Tolerant Compatible Sets to determine if your hosts are certified.
Prerequisites

Multiple gigabit Network Interface Cards (NICs) are required. For each host supporting Fault Tolerance, a minimum of two physical NICs is recommended. For example, you need one dedicated to Fault Tolerance logging and one dedicated to vMotion. Use three or more NICs to ensure availability.

Note The vMotion and FT logging NICs must be on different subnets. If you are using legacy FT, IPv6 is not supported on the FT logging NIC.
Validation Checks for Turning On Fault Tolerance

If the option to turn on Fault Tolerance is available, this task still must be validated and can fail if certain requirements are not met. Several validation checks are performed on a virtual machine before Fault Tolerance can be turned on.
• SSL certificate checking must be enabled in the vCenter Server settings.
• The host must be in a vSphere HA cluster or a mixed vSphere HA and DRS cluster.
• The host must have ESXi 6.
Turn On Fault Tolerance

You can turn on vSphere Fault Tolerance through the vSphere Web Client. When Fault Tolerance is turned on, vCenter Server resets the virtual machine's memory limit and sets the memory reservation to the memory size of the virtual machine. While Fault Tolerance remains turned on, you cannot change the memory reservation, size, limit, number of vCPUs, or shares. You also cannot add or remove disks for the VM.
Fault Tolerance is turned off for the selected virtual machine. The history and the secondary virtual machine for the selected virtual machine are deleted.

Suspend Fault Tolerance

Suspending vSphere Fault Tolerance for a virtual machine suspends its Fault Tolerance protection, but preserves the Secondary VM, its configuration, and all history. Use this option to resume Fault Tolerance protection in the future.
Test Restart Secondary

You can induce the failure of a Secondary VM to test the Fault Tolerance protection provided for a selected Primary VM. This option is unavailable (dimmed) if the virtual machine is powered off.

Procedure
1 In the vSphere Web Client, browse to the Primary VM for which you want to conduct the test.
2 Right-click the virtual machine and select Fault Tolerance > Test Restart Secondary.
Host Configuration

Hosts running the Primary and Secondary VMs should operate at approximately the same processor frequencies, otherwise the Secondary VM might be restarted more frequently. Platform power management features that do not adjust based on workload (for example, power capping and enforced low frequency modes to save power) can cause processor frequencies to vary greatly.
For virtual machines with Fault Tolerance enabled, you might use ISO images that are accessible only to the Primary VM. In such a case, the Primary VM can access the ISO, but if a failover occurs, the CD-ROM reports errors as if there is no media. This situation might be acceptable if the CD-ROM is being used for a temporary, noncritical operation such as a patch.
Table 3‑2. Differences Between Legacy FT and vSphere FT (Continued)

Eager-zeroed thick .vmdk disk files
    Legacy FT: Required
    vSphere FT: Not required because vSphere FT supports all disk file types, including thick and thin
.vmdk redundancy
    Legacy FT: Only a single copy
    vSphere FT: Primary VMs and Secondary VMs always maintain independent copies, which can be placed on different datastores to increase redundancy.
vCenter Server version 6.5 or later can manage existing legacy FT VMs, but you cannot create legacy FT VMs, even on hosts with a version earlier than version 6.5. The following vSphere FT operations can be performed in this scenario:
• Suspend or resume FT
• Test failover
• Restart secondary
• Migrate secondary
• Turn off FT

Note Legacy FT VMs can exist only on ESXi hosts that are running on vSphere versions earlier than 6.5.
4 vCenter High Availability

vCenter High Availability (vCenter HA) protects vCenter Server Appliance against host and hardware failures. The active-passive architecture of the solution can also help you reduce downtime significantly when you patch vCenter Server Appliance. After some network configuration, you create a three-node cluster that contains Active, Passive, and Witness nodes. Different configuration paths are available. What you select depends on your existing configuration.
7 Patching a vCenter High Availability Environment
You can patch a vCenter Server Appliance which is in a vCenter High Availability cluster by using the software-packages utility available in the vCenter Server Appliance shell. For more information, see vSphere Upgrade.

Plan the vCenter HA Deployment

Before you can configure vCenter HA, you have to consider several factors.
Table 4‑1. vCenter HA Nodes

Active
• Runs the active vCenter Server Appliance instance
• Uses a public IP address for the management interface
• Uses the vCenter HA network for replication of data to the Passive node.
• Uses the vCenter HA network to communicate with the Witness node.
vCenter HA Deployment Options

You can set up your vCenter HA environment with an embedded Platform Services Controller or with an external Platform Services Controller. If you decide to use an external Platform Services Controller, you can place it behind a load balancer for protection in case of Platform Services Controller failure.
• 2147038 Configuring F5 BIG-IP Load Balancer for use with vSphere Platform Services Controller (PSC) 6.5
• 2147046 Configuring NSX Edge Load Balancer for use with vSphere Platform Services Controller (PSC) 6.5

The environment setup is as follows.

Figure 4‑3.
Basic Configuration Workflow

Basic configuration automatically clones the Active node. You must meet one of the following requirements to perform Basic configuration.
• Either the vCenter Server Appliance that will become the Active node is managing its own ESXi host and its own virtual machine. This configuration is sometimes called a self-managed vCenter Server.
Configure the Network

Regardless of the deployment option and inventory hierarchy that you select, you have to set up your network before you can start configuration. To set the foundation for the vCenter HA network, you add a port group to each ESXi host, and add a virtual NIC to the vCenter Server Appliance that later becomes the Active node.
• The wizard prompts you to clone the Active node. As part of the clone process, you perform additional network configuration. See “Configure vCenter HA With the Advanced Option.”

Configure vCenter HA With the Basic Option

When you use the Basic option, the vCenter HA wizard creates and configures a second network adapter on the vCenter Server Appliance, clones the Active node, and configures the vCenter HA network.
The Passive and Witness nodes are created. When vCenter HA configuration is complete, vCenter Server Appliance has high availability protection.

What to do next

See “Manage the vCenter HA Configuration” for a list of cluster management tasks.
3 Log in to the vCenter Server Appliance that will initially become the Active node, directly.

  Interface: vCenter Server Appliance
      Go to https://appliance-IP-address-or-FQDN:5480
  Interface: vSphere Web Client
      a Go to https://appliance-IP-address-or-FQDN/vsphere-client
      b Select Administration > System Configuration

4 Configure the IP settings for the second network adapter.
2 For the first clone, which will become the Passive node, enter the following values.

  New Virtual Machine Name
      Name of the Passive node. For example, use vcsa-peer.
  Select Compute Resource, Select Storage
      Use a different target host and datastore than for the Active node if possible.
2 Wait for vCenter HA setup to complete.

Manage the vCenter HA Configuration

After you configure your vCenter HA cluster, you can perform management tasks. These tasks include certificate replacement, replacement of SSH keys, and SNMP setup. You can also edit the cluster configuration to disable or enable vCenter HA, enter maintenance mode, and remove the cluster configuration.
Set Up SNMP Traps

You can set up Simple Network Management Protocol (SNMP) traps to receive SNMP notifications for your vCenter HA cluster. The traps default to SNMP version 1. Set up SNMP traps for the Active node and the Passive node. You tell the agent where to send related traps by adding a target entry to the snmpd configuration.

Procedure
1 Log in to the Active node by using the Virtual Machine Console or SSH.
Manage vCenter HA SSH Keys

vCenter HA uses SSH keys for password-less authentication between the Active, Passive, and Witness nodes. The authentication is used for heartbeat exchange and file and data replication. To replace the SSH keys in the nodes of a vCenter HA cluster, you disable the cluster, generate new SSH keys on the Active node, transfer the keys to the Passive node, and enable the cluster.

Procedure
1 Edit the cluster and change the mode to Disabled.
Table 4‑3. vCenter HA Cluster Modes of Operation

Mode          Automatic Failover   Manual Failover   Replication
Enabled       Yes                  Yes               Yes
Maintenance   No                   Yes               Yes

Enabled
    This default mode of operation protects the vCenter Server Appliance from hardware and software failures by performing automatic failover.
Maintenance
    Used for some maintenance tasks. For other tasks, you have to disable vCenter HA.
Prerequisites

Verify the interoperability of vCenter HA and the backup and restore solution. One solution is vCenter Server Appliance file-based restore.

Procedure
1 Back up the Active node. Do not back up the Passive node and Witness node.
2 Before you restore the cluster, power off and delete all vCenter HA nodes.
3 Restore the Active node. The Active node is restored as a standalone vCenter Server Appliance.
4 Reconfigure vCenter HA.
Change the Appliance Environment

When you deploy a vCenter Server Appliance, you select an environment. For vCenter HA, Small, Medium, Large, and X-Large are supported for production environments. If you need more space and want to change the environment, you have to delete the Passive node virtual machine before you change the configuration.

Procedure
1 Log in to the Active node with the vSphere Web Client, edit the cluster configuration, and select Disable.
• VMware vCenter® HA Alarms and Events
  If a vCenter HA cluster is in a degraded state, alarms and events show errors.

vCenter HA Clone Operation Fails During Deployment

If the vCenter HA configuration process does not create the clones successfully, you have to resolve that cloning error.

Problem
Clone operation fails.

Cause
Look for the clone exception. It might indicate one of the following problems.
• You have a DRS-enabled cluster, but do not have three hosts.
Cause
The cluster can be in a degraded state for a number of reasons.

One of the nodes fails
• If the Active node fails, a failover of the Active node to the Passive node occurs automatically. After the failover, the Passive node becomes the Active node. At this point, the cluster is in a degraded state because the original Active node is unavailable.
2 If you cannot resolve the connectivity problem, you have to log in to the Active node's console directly.
  a Power off and delete the Passive node and the Witness node virtual machines.
  b Log in to the Active node by using SSH or through the Virtual Machine Console.
  c To enable the Bash shell, enter shell at the appliancesh prompt.
  d Run the following command to remove the vCenter HA configuration.
      destroy-vcha -f
  e Reboot the Active node.
VMware vCenter® HA Alarms and Events

If a vCenter HA cluster is in a degraded state, alarms and events show errors.

Problem

Table 4‑4. The following events raise the VCHA health alarm in vpxd:

Event Name | Event Description | Event Type | Category
vCenter HA cluster state is currently healthy | vCenter HA cluster state is currently healthy | com.vmware.vcha.cluster.state.
Table 4‑7. Database replication-related events

Event Name | Event Description | Event Type | Category
Database replication mode changed to {newState} | Database replication state changed: sync, async or no replication | com.vmware.vcha.DB.replication.state.changed | info

Table 4‑8. File replication-related events

Event Name | Event Description | Event Type | Category
Appliance {fileProviderType} is {state} | Appliance File replication state changed | com.vmware.vcha.file.replication.state.
Chapter 5 Using Microsoft Clustering Service for vCenter Server on Windows High Availability

When you deploy vCenter Server, you must build a highly available architecture that can handle workloads of all sizes. Availability is critical for solutions that require continuous connectivity to vCenter Server. To avoid extended periods of downtime, you can achieve continuous connectivity for vCenter Server by using a Microsoft Cluster Service (MSCS) cluster.
The process for vCenter Server high availability in an MSCS environment is as follows.

1 Remove the MSCS configuration for vCenter Server.
2 Upgrade the vCenter Server from version 6.0 to version 6.5.
3 Configure MSCS to make vCenter Server highly available.

Prerequisites

- Verify that you are not deleting the primary node VM.
- Verify that the primary node is the current active node.
- Verify that all the services of vCenter Server 6.0 are running on the primary node.
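The prerequisites listed above act as a gate before the first step of the process. A minimal Python sketch of such a gate follows. The `UpgradeReadiness` class, its field names, and `unmet_prerequisites` are hypothetical and exist only for this illustration. The values are filled in by the administrator after checking the environment by hand, and nothing here queries vCenter Server or MSCS.

```python
from dataclasses import dataclass

# The three-step process from the text, kept as plain data.
UPGRADE_STEPS = (
    "Remove the MSCS configuration for vCenter Server",
    "Upgrade the vCenter Server from version 6.0 to version 6.5",
    "Configure MSCS to make vCenter Server highly available",
)


@dataclass
class UpgradeReadiness:
    """Hypothetical checklist; each field is set manually by the administrator."""
    primary_node_vm_kept: bool             # the primary node VM is not being deleted
    primary_is_active_node: bool           # the primary node is the current active node
    all_services_running_on_primary: bool  # all vCenter Server 6.0 services run there


def unmet_prerequisites(readiness):
    """Return the labels of the listed prerequisites that are not yet satisfied."""
    checks = {
        "the primary node VM is not being deleted": readiness.primary_node_vm_kept,
        "the primary node is the current active node": readiness.primary_is_active_node,
        "all vCenter Server 6.0 services are running on the primary node":
            readiness.all_services_running_on_primary,
    }
    return [label for label, ok in checks.items() if not ok]


# Example: one prerequisite is still open, so the process should not start.
ready = UpgradeReadiness(True, True, False)
missing = unmet_prerequisites(ready)
if missing:
    print("Not ready:", "; ".join(missing))
else:
    for number, step in enumerate(UPGRADE_STEPS, start=1):
        print(number, step)
```

Returning the list of unmet items, and not a single pass or fail flag, shows the administrator exactly which check still needs attention.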
Configure MSCS for High Availability

Use the following steps to set up Microsoft Cluster Service (MSCS) as an availability solution for vCenter Server.

Prerequisites

- Create a virtual machine (VM) with one of the following guest operating systems:
  - Windows 2008 R2 Datacenter
  - Windows 2012 R2 Datacenter
- Add two raw device mapping (RDM) disks to this VM.
5 Power off the VM.
6 Detach the RDM disks. Detaching the RDM disks is not a permanent deletion. Do not select Delete from disk and do not delete the vmdk files.
7 Clone the VM and select the Customize the operating system option, so that the clone has a unique identity. Create a unique identity through either the default sysprep file or the custom sysprep file.
8 Attach the shared RDMs to both VMs and power them on.
9 Change the host name and IP address on the first VM (VM1).