Site Recovery Manager Administration
vCenter Site Recovery Manager 5.5

This document supports the version of each product listed and supports all subsequent versions until the document is replaced by a new edition. To check for more recent editions of this document, see http://www.vmware.com/support/pubs.
You can find the most up-to-date technical documentation on the VMware Web site at: http://www.vmware.com/support/
The VMware Web site also provides the latest product updates.
If you have comments about this documentation, submit your feedback to: docfeedback@vmware.com

Copyright © 2008–2013 VMware, Inc. All rights reserved. This product is protected by U.S. and international copyright and intellectual property laws.
Contents

About VMware vCenter Site Recovery Manager Administration
1 SRM Privileges, Roles, and Permissions
  How SRM Handles Permissions
  SRM and the vCenter Server Administrator Role
  SRM and vSphere Replication Roles
  Managing Permissions in a Shared Recovery Site Configuration
  Assign SRM Roles and Permissions
  SRM Roles Reference
  vSphere Replication Roles Reference
2 Replicating Virtual Machines
  How the Recovery Point Objective Affects Replication Scheduling
  Replicating a V
  Create, Test, and Run a Recovery Plan
  Create a Recovery Plan
  Edit a Recovery Plan
  Suspend Virtual Machines When a Recovery Plan Runs
  Test a Recovery Plan
  Clean Up After Testing a Recovery Plan
  Run a Recovery Plan
  Recover a Point-in-Time Snapshot of a Virtual Machine
  Export Recovery Plan Steps
  View and Export Recovery Plan History
  Cancel a Test or Recovery
  Delete a Recovery Plan
5 Reprotecting Virtual Machines After a Recovery
  Ho
  Specify a Nonreplicated Datastore for Swap Files
  Recovering Virtual Machines Across Multiple Hosts on the Recovery Site
  Resize Virtual Machine Disk Files During Replication Using Replication Seeds
  Resize Virtual Machine Disk Files During Replication Without Using Replication Seeds
  Reconfigure SRM Settings
  Change Local Site Settings
  Change Logging Settings
  Change Recovery Settings
  Change Remote Site Settings
  Change the Timeout for the Creation of Placeholder Virtual Machin
  vSphere Replication RPO Violations
  vSphere Replication Does Not Start After Moving the Host
  Unexpected vSphere Replication Failure Results in a Generic Error
  Generating Support Bundles Disrupts vSphere Replication Recovery
  Recovery Plan Times Out While Waiting for VMware Tools
Index
About VMware vCenter Site Recovery Manager Administration

VMware vCenter Site Recovery Manager (SRM) is an extension to VMware vCenter Server that delivers a business continuity and disaster recovery solution that helps you plan, test, and run the recovery of vCenter Server virtual machines. SRM can discover and manage replicated datastores, and automate migration of inventory from one vCenter Server instance to another.
1 SRM Privileges, Roles, and Permissions

SRM provides disaster recovery by performing operations for users. These operations involve managing objects, such as recovery plans or protection groups, and performing operations, such as replicating or powering off virtual machines. SRM uses roles and permissions so that only users with the correct roles and permissions can perform operations. SRM adds several roles to vCenter Server, each of which includes privileges to complete SRM and vCenter Server tasks.
- Assign SRM Roles and Permissions on page 13
  During installation, SRM administrator rights are assigned to the vCenter Server administrator role. At this time, only vCenter Server administrators can log in to SRM, unless they explicitly grant access to other users.
- SRM Roles Reference on page 14
  SRM includes a set of roles. Each role includes a set of privileges, which allow users with those roles to complete different actions.
SRM and vSphere Replication Roles

When you install vSphere Replication with SRM, the vCenter Server administrator role inherits all of the SRM and vSphere Replication privileges. If you manually assign an SRM role to a user or user group, or if you assign an SRM role to a user or user group that is not a vCenter Server administrator, these users do not obtain vSphere Replication privileges.
You can also create isolated resources on the shared recovery site and map the resources on the protected sites to their own dedicated resources on the shared recovery site. You might use this configuration if you must keep all of the customers' virtual machines separate from each other, for example if all of the customers belong to different organizations.
Assign SRM Roles and Permissions

During installation, SRM administrator rights are assigned to the vCenter Server administrator role. At this time, only vCenter Server administrators can log in to SRM, unless they explicitly grant access to other users. To allow other users to access SRM, vCenter Server administrators must grant them permissions in the SRM interface. Permission assignments apply on a per-site basis.
7 Select Propagate to Child Objects to apply the selected role to all of the child objects of the inventory objects that this role can affect. For example, if a role contains privileges to modify folders, selecting this option extends the privileges to all the virtual machines in a folder. You might deselect this option to create a more complex hierarchy of permissions.
Table 1‑1. SRM Roles

Role: SRM Administrator
Actions that this Role Permits: The SRM Administrator role grants permission to perform all SRM configuration and administration operations.
- Configure advanced settings.
- Configure connections.
- Configure inventory preferences.
- Configure placeholder datastores.
- Configure array managers.
- Manage protection groups.
- Manage recovery plans.
- Perform reprotect operations.
Table 1‑1. SRM Roles (Continued)

Columns: Role | Actions that this Role Permits | Privileges that this Role Includes | Objects in vCenter Server Inventory that this Role Can Access

Privileges that this Role Includes:
- Site Recovery Manager.Remote Site.Modify
- Datastore.Replication.Protect
- Datastore.Replication.Unprotect
- Resource.Recovery Use
- Virtual Machine.SRM Protection.Protect
- Virtual Machine.SRM Protection.
Table 1‑1. SRM Roles (Continued)

Role: SRM Recovery Plans Administrator
Actions that this Role Permits: The SRM Recovery Plans Administrator role allows users to create and test recovery plans.
- Add protection groups to recovery plans.
- Remove protection groups from recovery plans.
- Configure custom command steps on virtual machines.
- Create recovery plans.
- Test recovery plans.
- Cancel recovery plan tests.
Site Recovery Manager Administration Table 1‑2. vSphere Replication Roles 18 Role Actions that this Role Permits VRM replication viewer n VRM virtual machine replication user View replications. Manage datastores. n Configure and unconfigure replications. n Manage and monitor replications. Requires a corresponding user with the same role on the target site and additionally vSphere Replication target datastore user role on the target datacenter, or datastore folder or each target datastore.
Table 1‑2. vSphere Replication Roles (Continued)

Columns: Role | Actions that this Role Permits | Privileges that this Role Includes | Objects in vCenter Server Inventory that this Role Can Access

Role: VRM administrator
Actions that this Role Permits: Incorporates all vSphere Replication privileges.
Privileges that this Role Includes:
- VRM remote.Manage VR
- VRM remote.View VR
- VRM remote.Manage VRM
- VRM remote.View VRM
- VRM datastore mapper.Manage
- VRM datastore mapper.View
- VRM diagnostics.Manage
- VRM session.Terminate
- Datastore.
Table 1‑2. vSphere Replication Roles (Continued)

Columns: Role | Actions that this Role Permits | Privileges that this Role Includes | Objects in vCenter Server Inventory that this Role Can Access

Role: VRM target datastore user
Actions that this Role Permits: Configure and reconfigure replications. Used on target site in combination with the VRM virtual machine replication user role on both sites.
Privileges that this Role Includes:
- Datastore.Browse datastore
- Datastore.
2 Replicating Virtual Machines

Before you create protection groups, you must configure replication on the virtual machines to protect. You can replicate virtual machines by using array-based replication, vSphere Replication, or a combination of both. This information concerns replication using vSphere Replication. To configure array-based replication on virtual machines, consult the documentation from your storage replication adapter (SRA) vendor.
The replication scheduler tries to satisfy these constraints by overlapping replications to optimize bandwidth use and might start replications for some virtual machines earlier than expected. To determine the replication transfer time, the replication scheduler uses the duration of the last few instances to estimate the next one.
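The estimation idea can be illustrated with a short sketch. The guide does not say how many past instances the scheduler considers or how it combines them, so the window size of 3 and the use of a plain average below are assumptions made only for illustration, and the class name is hypothetical.

```python
from collections import deque
from statistics import mean
from typing import Optional


class TransferTimeEstimator:
    """Estimate the next transfer duration from the last few observed durations.

    Hypothetical illustration only: the window size and the averaging rule
    are assumptions, not the documented behavior of vSphere Replication.
    """

    def __init__(self, window: int = 3) -> None:
        # A bounded deque silently discards the oldest duration when full.
        self._durations = deque(maxlen=window)

    def record(self, seconds: float) -> None:
        """Store the duration of a finished replication instance."""
        self._durations.append(seconds)

    def estimate(self) -> Optional[float]:
        """Return the average of the retained durations, or None with no history."""
        if not self._durations:
            return None
        return mean(self._durations)
```

With a window of 3, recording durations of 100, 200, 300, and 400 seconds leaves only the last three in the history, so the estimate is their average.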
Every time that a virtual machine reaches its RPO target, vSphere Replication records approximately 3800 bytes of data in the vCenter Server events database. If you set a low RPO period, this can quickly create a large volume of data in the database. To avoid creating large volumes of data in the vCenter Server events database, limit the number of days that vCenter Server retains event data.
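To get a feel for the numbers, the following sketch estimates the event data volume from the approximate 3800-byte figure. It assumes that each virtual machine reaches its RPO target once per RPO period, which is a simplification, and the function names are hypothetical.

```python
BYTES_PER_RPO_EVENT = 3800  # approximate figure quoted in this guide


def daily_event_bytes(vm_count: int, rpo_minutes: float) -> float:
    """Approximate bytes written to the events database per day.

    Assumption: every virtual machine reaches its RPO target exactly once
    per RPO period, so a 15-minute RPO yields 96 events per VM per day.
    """
    if rpo_minutes <= 0:
        raise ValueError("rpo_minutes must be positive")
    events_per_vm_per_day = (24 * 60) / rpo_minutes
    return vm_count * events_per_vm_per_day * BYTES_PER_RPO_EVENT


def retained_event_bytes(vm_count: int, rpo_minutes: float, retention_days: int) -> float:
    """Approximate steady-state volume for a given event retention period."""
    return daily_event_bytes(vm_count, rpo_minutes) * retention_days
```

Under these assumptions, 100 virtual machines with a 15-minute RPO produce about 36.5 MB of event data per day, or roughly 1.1 GB if events are retained for 30 days. Shortening the retention period reduces the total proportionally.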
7 Select a replication destination for each media device for the virtual machine. The next pages are created dynamically depending on the media devices installed on the virtual machine. They might include multiple virtual drives, all of which you can configure individually. Configurable settings include whether the virtual drive is replicated, the virtual drive's replication destination, and information about how the replicated virtual drive is configured.
2 Select a folder or datacenter in the left pane and click the Virtual Machines tab.
3 Select the virtual machines to replicate using the Ctrl or Shift keys.
4 Right-click the virtual machines and select vSphere Replication.
5 Use the RPO slider or enter a value to configure the maximum amount of data that can be lost in the case of a site failure. The available RPO range is from 15 minutes to 24 hours.
3 In the left pane, browse to the datastore that contains the files for the virtual machine, select the datastore, and in the right pane, click Browse this datastore.
4 Select the folders for all virtual machines to be physically moved, right-click the selection, and click Download.
5 Select a destination to which to copy the files and click OK.
6 Click Yes.
7 After the download finishes, transfer the files to a location on the paired site to upload them.
Chapter 2 Replicating Virtual Machines 6 7 (Optional) Change the target location for the virtual machine files. Option Description Reconfigure replication of a single virtual machine Click Browse to change the target location for the virtual machine files. Reconfigure replication of multiple virtual machines Select Initial copies of .vmdk files have been placed on the target datastore if you have copied replication seeds to a new target datastore.
3 Creating Protection Groups

After you configure a replication solution, you can create protection groups. A protection group is a collection of virtual machines and templates that you protect together by using SRM. You include one or more protection groups in each recovery plan. A recovery plan specifies how SRM recovers the virtual machines in the protection groups that it contains.
To configure array-based replication, you must assign each virtual machine to a resource pool, folder, and network that exist at the recovery site. You can specify defaults for these assignments by selecting inventory mappings. SRM applies inventory mappings when you create the protection group. If you do not specify inventory mappings, you must configure them individually for each member of the protection group.
- Two virtual machines share a raw disk mapping (RDM) device on a SAN array, as in the case of a Microsoft cluster server (MSCS) cluster.
- Two datastores span extents corresponding to different partitions of the same device.
- A single datastore spans two extents corresponding to partitions of two different devices.
4 Select a datastore group from the list, and click Next. When you select a datastore group, the virtual machines in that datastore group appear in the Virtual Machines on the Selected Datastore Group pane, and are marked for inclusion in the protection group after you create the protection group.
5 Type a name and optional description for the protection group, and click Next.
4 Type a name and optional description for the protection group, and click Next.
5 Click Finish to create the protection group.

What to do next
Create a recovery plan with which to associate your protection groups. See “Create a Recovery Plan,” on page 43.

Edit vSphere Replication Protection Groups

You can edit a vSphere Replication protection group to change its name and to add virtual machines to the group or remove them from it.
4 Creating, Testing, and Running Recovery Plans

After you configure SRM at the protected and recovery sites, you can create, test, and run a recovery plan. A recovery plan is like an automated run book. It controls every step of the recovery process, including the order in which SRM powers on and powers off virtual machines, the network addresses that recovered virtual machines use, and so on. Recovery plans are flexible and customizable. A recovery plan includes one or more protection groups.
- How SRM Interacts with vSphere High Availability on page 41
  You can use SRM to protect virtual machines on which vSphere High Availability (HA) is enabled.
- Protecting Microsoft Cluster Server and Fault Tolerant Virtual Machines on page 41
  You can use SRM to protect Microsoft Cluster Server (MSCS) and fault tolerant virtual machines, with certain limitations.
Test Networks and Datacenter Networks

When you test a recovery plan, SRM can create a test network that it uses to connect recovered virtual machines. Creating a test network allows the test to run without potentially disrupting virtual machines in the production environment.
Running a Recovery with Forced Recovery

If the protected site is offline and SRM cannot perform its usual tasks, you can run the recovery with the forced recovery option. Forced recovery starts the virtual machines on the recovery site without performing any operations on the protected site.
Table 4‑1. How Testing a Recovery Plan Differs from Running a Recovery Plan (Continued)

Columns: Area of Difference | Test a Recovery Plan | Run a Recovery Plan

Area of Difference: Effect on replication
Test a Recovery Plan: SRM creates temporary snapshots of replicated storage at the recovery site. For array-based replication, SRM rescans the arrays to discover them.
- If you enable Storage DRS on the protected site, a datastore cluster must contain one and only one consistency group. Do not include any datastore that does not belong to the consistency group in the cluster. Placing multiple consistency groups into the same cluster might result in virtual machines being lost during a recovery. This guideline also applies on the recovery site if Storage DRS is enabled on the recovery site.
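The Storage DRS guideline can be expressed as a simple check. The sketch below is not an SRM or vSphere API: it assumes that you have already collected, by whatever means, a mapping from each datastore in a cluster to the name of its consistency group (or None if it belongs to no group). The function name and the data shape are hypothetical.

```python
from typing import List, Mapping, Optional


def cluster_violations(datastore_to_group: Mapping[str, Optional[str]]) -> List[str]:
    """Return guideline violations for one datastore cluster.

    datastore_to_group maps each datastore name to the name of its
    consistency group, or to None when it belongs to no consistency group.
    An empty result means the cluster follows the guideline.
    """
    problems = []
    # Datastores outside any consistency group must not be in the cluster.
    ungrouped = sorted(ds for ds, group in datastore_to_group.items() if group is None)
    if ungrouped:
        problems.append("datastores outside any consistency group: " + ", ".join(ungrouped))
    # The cluster must contain exactly one consistency group.
    groups = sorted({g for g in datastore_to_group.values() if g is not None})
    if len(groups) != 1:
        problems.append(
            "expected exactly one consistency group, found %d: %s"
            % (len(groups), ", ".join(groups) or "none")
        )
    return problems
```

For example, a cluster whose datastores all map to the same group returns an empty list, while a cluster that mixes two groups returns one violation naming both.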
How SRM Interacts with vSphere High Availability

You can use SRM to protect virtual machines on which vSphere High Availability (HA) is enabled. HA protects virtual machines from ESXi host failures by restarting the virtual machines from failed hosts on other hosts within the same site. SRM protects virtual machines against full site failures by restarting the virtual machines at the recovery site.
- You can run a cluster of MSCS virtual machines in the following possible configurations.
  Cluster-in-a-box: The MSCS virtual machines in the cluster run on a single ESXi Server. You can have a maximum of five MSCS nodes on one ESXi Server.
  Cluster-across-boxes: You can spread the MSCS cluster across a maximum of five ESXi Server instances. You can protect only one virtual machine node of any MSCS cluster on a single ESXi Server instance.
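The two MSCS layouts and their limits can be checked with a small helper. This is a sketch of one reading of the limits stated above, not an SRM feature: it treats any cluster whose nodes all share one host as cluster-in-a-box and any other cluster as cluster-across-boxes. The function name and the node-to-host mapping are hypothetical.

```python
from collections import Counter
from typing import Mapping

MAX_MSCS_NODES_PER_HOST = 5  # cluster-in-a-box limit stated in this guide
MAX_MSCS_HOSTS = 5           # cluster-across-boxes limit stated in this guide


def classify_mscs_layout(node_to_host: Mapping[str, str]) -> str:
    """Classify an MSCS cluster layout or raise ValueError if it breaks a limit.

    node_to_host maps each MSCS virtual machine node to the ESXi host it runs on.
    """
    if not node_to_host:
        raise ValueError("cluster has no nodes")
    nodes_per_host = Counter(node_to_host.values())
    if len(nodes_per_host) == 1:
        # All nodes share one host: cluster-in-a-box.
        if len(node_to_host) > MAX_MSCS_NODES_PER_HOST:
            raise ValueError("more than five MSCS nodes on one host")
        return "cluster-in-a-box"
    # Nodes are spread over several hosts: cluster-across-boxes.
    if len(nodes_per_host) > MAX_MSCS_HOSTS:
        raise ValueError("cluster spans more than five hosts")
    crowded = sorted(host for host, count in nodes_per_host.items() if count > 1)
    if crowded:
        raise ValueError("more than one node of the cluster on: " + ", ".join(crowded))
    return "cluster-across-boxes"
```

A mixed layout, such as two nodes on one host and a third node on another, is rejected because it fits neither configuration under this reading.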
4 Test a Recovery Plan on page 44
  When you test a recovery plan, SRM runs the recovery plan on a test network and a temporary snapshot of replicated data at the recovery site. SRM does not disrupt operations at the protected site.
5 Clean Up After Testing a Recovery Plan on page 45
  After you test a recovery plan, you can return the recovery plan to the Ready state by running a cleanup operation.
6 Click Next.
7 Review the summary information and click Finish to make the specified changes to the recovery plan.

You can monitor the update of the plan in the Recent Tasks view.

Suspend Virtual Machines When a Recovery Plan Runs

SRM can suspend virtual machines on the recovery site during a recovery and a test recovery.
6 Click the Recovery Steps tab to monitor the progress of the test and respond to messages. The Recovery Steps tab displays the progress of individual steps. The Summary tab reports the progress of the overall plan.

NOTE SRM initiates recovery steps in the prescribed order, with one exception. It does not wait for the Prepare Storage step to finish for all protection groups before continuing to the next steps.
Procedure
1 Click Recovery Plans in the left pane, select the recovery plan to run, and click Recovery.
2 Review the information in the confirmation prompt, and select I understand that this process will permanently alter the virtual machines and infrastructure of both the protected and recovery datacenters.
3 Select the type of recovery to run.

Option: Planned Migration
Description: Recovers virtual machines to the recovery site when both sites are running.
5 Run the recovery plan. When the recovery plan is finished, the virtual machine is recovered to the recovery site, with the number of point-in-time snapshots that you configured.
6 In the VMs and Templates view, right-click the recovered virtual machine and select Snapshot > Snapshot Manager.
7 Select one of the point-in-time snapshots of this virtual machine and click Go to.
Cancel a Test or Recovery

You can cancel a recovery plan test at any time during its run. You can cancel a planned migration or disaster recovery at certain times during its run. When you cancel a test or recovery, SRM does not start any new steps, and uses certain rules to stop steps that are in progress.
- Steps that cannot be stopped, such as powering on or waiting for a heartbeat, run to completion before the cancellation finishes.
5 Reprotecting Virtual Machines After a Recovery

After a recovery, the recovery site becomes the new protected site, but it is not protected yet. If the original protected site is operational, you can reverse the direction of protection to use the original protected site as a new recovery site to protect the new protected site. Manually reestablishing protection in the opposite direction by recreating all protection groups and recovery plans is time consuming and prone to errors.
Figure 5‑1. SRM Reprotect Process
[Figure: Site A and Site B. The protected site becomes the recovery site, and the recovery site becomes the protected site. Replica virtual machines power off. The direction of replication for the protection group is reversed after a planned migration.]

- How SRM Performs Reprotect on page 50
  The reprotect process involves two stages.
Preconditions for Performing Reprotect

You can perform reprotect only if you meet certain preconditions. You can perform reprotect on protection groups that contain virtual machines that are configured for array-based replication and on protection groups that contain virtual machines that are configured for vSphere Replication. Before you can run reprotect, you must satisfy the following preconditions.
1 Run a planned migration and make sure that all steps of the recovery plan finish successfully.
Reprotect States

The reprotect process passes through several states that you can observe in the recovery plan in the SRM plug-in in the vSphere Client. If reprotect fails, or succeeds partially, you can perform remedial actions to complete the reprotect.

Table 5‑1. Reprotect States

Columns: State | Description | Remedial Action

State: Reprotect In Progress
Description: SRM is running reprotect.
6 Restoring the Pre-Recovery Site Configuration By Performing Failback

To restore the original configuration of the protected and recovery sites after a recovery, you can perform a sequence of optional procedures known as failback. After a planned migration or a disaster recovery, the former recovery site becomes the protected site. Immediately after the recovery, the new protected site has no recovery site to which to recover.
Figure 6‑1. SRM Failback Process
[Figure labels: 2. Reprotect: Recovery site becomes protected site. 1.
5 (Optional) If necessary, rerun reprotect until it finishes without errors. At the end of the reprotect operation, SRM has reversed replication, so that the original recovery site, site B, is now the protected site.
6 (Optional) Click Test and follow the prompts to test the recovery plan. Testing the recovery plan verifies that the recovery plan works after the reprotect operation.
7 Customizing a Recovery Plan

You can customize a recovery plan to run commands, display messages that require a response when the plan runs, and change the recovery priority of protected virtual machines. A simple recovery plan that specifies only a test network to which the recovered virtual machines connect, and the response times that the test should expect, can provide an effective way to test an SRM configuration. Most recovery plans require configuration for use in production.
- Some steps are always skipped during test recoveries.

Understanding recovery steps, their order, and the context in which they run is important when you customize a recovery plan.

Recovery Order

When you run a recovery plan, it starts by powering off the virtual machines at the protected site. SRM powers off virtual machines according to the priority that you set, with high-priority machines powering off last. SRM omits this step when you test a recovery plan.
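The power-off ordering can be pictured with a short sketch. It assumes, for illustration only, that priorities are stored as numbers where 1 is the highest priority. The function name and the dictionary input are hypothetical and are not part of SRM.

```python
from typing import Dict, List


def power_off_order(priorities: Dict[str, int], is_test: bool = False) -> List[str]:
    """Return virtual machine names in the order they would be powered off.

    Assumption: a lower number means a higher priority (1 is highest).
    High-priority machines power off last, so the list is sorted from the
    lowest priority to the highest. A test recovery skips the power-off step.
    """
    if is_test:
        return []
    # Largest priority number (lowest priority) first, priority 1 last.
    return sorted(priorities, key=lambda name: priorities[name], reverse=True)
```

For a database at priority 1, an application server at priority 2, and a web server at priority 3, this ordering powers off the web server first and the database last.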
During reprotect, SRM preserves all custom recovery steps in the recovery plan. If you perform a recovery or test after a reprotect, custom recovery steps are run on the new recovery site, which was the original protected site. After reprotect, you can usually use custom recovery steps that show messages directly without modifications.
Message Prompt Recovery Steps

Present a message in the SRM user interface during the recovery. You can use this message to pause the recovery and provide information to the user running the recovery plan. For example, the message can instruct users to perform a manual recovery task or to verify steps. The only action users can take in direct response to a prompt is to click OK, which dismisses the message and allows the recovery to continue.
9 Click OK to add the step to the recovery plan.

Create Top-Level Message Prompt Steps

You can add top-level message prompts anywhere in the recovery plan.

Prerequisites
You have a recovery plan to which to add custom steps.

Procedure
1 Click Recovery Plans in the SRM interface, and select a recovery plan.
2 Click the Recovery Steps tab.
3 Right-click a step before or after which to add a custom step, and select Add Step.
4 Select Prompt.
Create Message Prompt Steps for Individual Virtual Machines

You can configure custom recovery steps to prompt users to perform tasks for a virtual machine before and after the virtual machine powers on. SRM associates message prompt steps with a protected virtual machine in the same way as customization information. If different recovery plans contain the same virtual machine, the commands and prompts are the same.
Table 7‑1. Environment Variables Available to All Command Steps

Name | Value | Example
VMware_RecoveryName | Name of the recovery plan that is running. | Plan A
VMware_RecoveryMode | Recovery mode. | Test or recovery
VMware_VC_Host | Host name of the vCenter Server at the recovery site. | vc_hostname.example.com
VMware_VC_Port | Network port used to contact vCenter Server.
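A command step could read these variables as in the following sketch. It only uses the four variable names listed in the table above. Running a Python script as a command step assumes that a Python interpreter is available where the step runs, and the function name, placeholder defaults, and output format are hypothetical.

```python
import os
from typing import Mapping, Optional


def describe_recovery(env: Optional[Mapping[str, str]] = None) -> str:
    """Build a one-line summary from the variables SRM passes to command steps.

    env defaults to the process environment. Missing variables fall back to
    placeholder text so the script also runs outside of a recovery plan.
    """
    env = os.environ if env is None else env
    name = env.get("VMware_RecoveryName", "<unknown plan>")
    mode = env.get("VMware_RecoveryMode", "<unknown mode>")
    host = env.get("VMware_VC_Host", "<unknown host>")
    port = env.get("VMware_VC_Port", "<unknown port>")
    return "plan=%s mode=%s vcenter=%s:%s" % (name, mode, host, port)


if __name__ == "__main__":
    # Print the summary so that it appears in the command step output.
    print(describe_recovery())
```

A script like this can branch on the recovery mode, for example to skip a notification during a test. Check the exact mode strings in your environment before relying on them, because the table lists only "Test or recovery" as the example value.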
The customizations that you specify become associated with the protected virtual machine. As a result, the settings are shared between all recovery plans that apply to this virtual machine.

NOTE If you remove the protection of a virtual machine, all recovery customizations are lost.
8 Customizing IP Properties for Virtual Machines

You can customize IP settings for virtual machines for the protected site and the recovery site. Customizing the IP properties of a virtual machine overrides the default IP settings when the recovered virtual machine starts at the destination site. If you do not customize the IP properties of a virtual machine, SRM uses the IP settings for the recovery site during a recovery or a test from the protected site to the recovery site.
After the IP customization process finishes, virtual machines power on according to the priority groups and any dependencies that you set. The power on process happens immediately before the Wait for VMTools process for each virtual machine.

NOTE To customize the IP properties of a virtual machine, you must install VMware Tools or the VMware Operating System Specific Packages (OSP) on the virtual machine. See http://www.vmware.com/download/packages.html.
9 Repeat Step 5 through Step 8 to configure recovery or protection settings, if required. For example, if you configured IP settings for the protected site, you might want to configure settings for the recovery site.
10 Repeat the configuration process for other NICs, as required, beginning by choosing another NIC as described in Step 3.
You can customize the IP settings for the protected and the recovery sites so that SRM uses the correct configurations during reprotect operations. See the Compatibility Matrix for vCenter Site Recovery Manager 5.5 for the list of guest operating systems for which SRM supports IP customization.
Table 8‑1. DR IP Customizer Options (Continued)

Columns: Option | Description | Mandatory

Option: --cmd arg
Description: You specify different commands to run DR IP Customizer in different modes.
- The apply command applies the network customization settings from an existing CSV file to the recovery plans on the SRM Server instances.
- The generate command generates a basic CSV file for all virtual machines that SRM protects for a vCenter Server instance.
Mandatory: Yes
Table 8‑2. Columns of the DR IP Customizer CSV File

Columns: Column | Description | Customization Rules

Column: VM ID
Description: Unique identifier that DR IP Customizer uses to collect information from multiple rows for application to a single virtual machine. This ID is internal to DR IP Customizer and is not the same as the virtual machine ID that vCenter Server uses.
Customization Rules: Not customizable. Cannot be blank.
Table 8‑2. Columns of the DR IP Customizer CSV File (Continued)

Columns: Column | Description | Customization Rules

Column: Secondary WINS
Description: DR IP Customizer validates that WINS settings are applied only to Windows virtual machines, but it does not validate NetBIOS settings.
Customization Rules: Customizable. Can be left blank.

Column: IP Address
Description: IPv4 address for this virtual machine.
Customization Rules: Customizable. Cannot be blank. Virtual machines can have multiple virtual network adapters.
Modifying the DR IP Customizer CSV File

You modify the DR IP Customizer comma-separated value (CSV) file to apply customized networking settings to virtual machines when they start on the recovery site. One challenge of representing virtual machine network configurations in a CSV file is that virtual machine configurations include hierarchical information.
This generated CSV file shows two virtual machines, vm-3-win and vm-1-linux. The virtual machines are present on the protected site and on the recovery site, vcenter-server-site-B and vcenter-server-site-A. DR IP Customizer generates an entry for each virtual machine and each site with Adapter ID 0. You can add lines to customize each NIC after you know how many NICs are on each virtual machine.
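Because each virtual machine and site can span several rows, it can help to group the rows before editing or reviewing them. The sketch below uses Python's standard csv module. The column names follow the CSV column headings used in this guide, but the sample rows are made up for illustration and contain only a subset of the columns, and the function name is hypothetical.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, Tuple

# Illustrative sample only: a subset of the columns, with invented values.
SAMPLE = """VM ID,VM Name,vCenter Server,Adapter ID,IP Address,DNS Server(s)
protected-vm-10301,vm-3-win,vcenter-server-site-B,0,,
protected-vm-10301,vm-3-win,vcenter-server-site-B,1,dhcp,1.1.1.1
protected-vm-10301,vm-3-win,vcenter-server-site-A,0,,
protected-vm-10301,vm-3-win,vcenter-server-site-A,1,192.168.0.21,
"""


def group_by_vm_and_site(csv_text: str) -> Dict[Tuple[str, str], Dict[int, dict]]:
    """Group CSV rows by (VM ID, vCenter Server), then by Adapter ID."""
    grouped = defaultdict(dict)
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = (row["VM ID"], row["vCenter Server"])
        grouped[key][int(row["Adapter ID"])] = row
    return dict(grouped)


if __name__ == "__main__":
    for (vm_id, site), adapters in group_by_vm_and_site(SAMPLE).items():
        print(vm_id, site, sorted(adapters))
```

Grouping this way restores the hierarchy that the flat file hides: one virtual machine, then one entry per site, then one row per adapter.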
  - Adds a NIC, Adapter ID 2, with primary and secondary WINS servers 2.2.3.4 and 2.2.3.5, a static IPv4 address 192.168.1.22, and DNS server 1.1.1.2.
- On the vcenter-server-site-A site:
  - Sets the DNS suffixes example.com and eng.example.com for all NICs for this virtual machine.
  - Sets the DNS servers 1.1.0.1 and 1.1.0.2 for all NICs for this virtual machine.
  - Adds a NIC, Adapter ID 1, with a static IPv4 address 192.168.0.21.
The information in this CSV file applies different static and dynamic IPv4 settings to vm-3-win on the protected site and on the recovery site.
- On site vcenter-server-site-B:
  - Sets the DNS suffixes example.com and eng.example.com for all NICs for this virtual machine.
  - Adds a NIC, Adapter ID 1, with primary and secondary WINS servers 2.2.3.4 and 2.2.3.5, that uses DHCP to obtain an IP address and sets the static DNS server 1.1.1.1.
Table 8‑5. Setting Static and DHCP IPv4 and IPv6 Addresses in a Modified CSV File (Continued)

Columns: VM ID | VM Name | vCenter Server | Adapter ID | Primary WINS | Secondary WINS | IP Address | Subnet Mask | Gateway(s) | IPv6 Address | IPv6 Subnet Prefix length | IPv6 Gateway(s) | DNS Server(s) | DNS Suffix(es)

VM ID: protected-vm-10301
vCenter Server: vcenter-server-site-B
Adapter ID: 2
Primary WINS: 2.2.3.4
Secondary WINS: 2.2.3.5
IP Address: dhcp
IPv6 Address: ::ffff:192.168.1.22
IPv6 Subnet Prefix length: 32
IPv6 Gateway(s): ::ffff:192.168.1.1
DNS Server(s): 1.1.1.
- Adds a NIC, Adapter ID 1, that uses DHCP to obtain an IPv4 address and sets a static IPv6 address ::ffff:192.168.1.22. Adapter ID 1 uses static IPv6 DNS servers ::ffff:192.168.0.250 and ::ffff:192.168.0.251.
- Adds a NIC, Adapter ID 2, with primary and secondary WINS servers 1.2.3.4 and 1.2.3.5, a static IPv4 address 192.168.0.22, and DNS server 1.1.1.1. By leaving the IPv6 column blank, Adapter ID 2 uses DHCP for IPv6 addresses.
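Before you apply a modified CSV file, you might want to check that each address cell holds what you intended. The sketch below uses Python's standard ipaddress module to classify a cell as blank, dhcp, IPv4, or IPv6. It is a hypothetical pre-check, not something that DR IP Customizer provides, and treating the dhcp keyword case-insensitively is an assumption.

```python
import ipaddress


def classify_ip_field(value: str) -> str:
    """Classify one address cell as 'blank', 'dhcp', 'ipv4', or 'ipv6'.

    Raises ValueError when the cell is neither blank, the dhcp keyword,
    nor a well-formed IP address.
    """
    text = value.strip()
    if not text:
        return "blank"
    if text.lower() == "dhcp":
        return "dhcp"
    address = ipaddress.ip_address(text)  # raises ValueError if malformed
    return "ipv4" if address.version == 4 else "ipv6"


if __name__ == "__main__":
    for cell in ("dhcp", "192.168.0.22", "::ffff:192.168.1.22", ""):
        print(repr(cell), "->", classify_ip_field(cell))
```

The IPv6 examples in this section are IPv4-mapped addresses. The `ipv4_mapped` attribute of an `ipaddress.IPv6Address` returns the embedded IPv4 address, which can make such values easier to compare with the IPv4 columns.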
9 Advanced SRM Configuration

The SRM default configuration enables some simple recovery scenarios. Advanced users can customize SRM to support a broader range of site recovery requirements.
Procedure

1 Click Protection Groups in the SRM interface and select the protection group that includes the virtual machine to configure.
2 On the Virtual Machines tab, right-click a virtual machine and select Configure Protection.
3 In the Virtual Machine Properties window, review and configure properties as needed.
  a Click Folder to specify an alternate destination folder.
Procedure

1 In the vSphere Client, right-click an ESXi cluster and click Edit Settings.
2 In the Settings page for the cluster, click Swapfile Location, select Store the swapfile in the datastore specified by the host, and click OK.
3 For each host in the cluster, select a nonreplicated datastore.
  a Select a host and click the Configuration tab.
  b In the Software panel, click Virtual Machine Swapfile Location, and click Edit at the top right of the main panel.
Resize Virtual Machine Disk Files During Replication Using Replication Seeds

vSphere Replication prevents you from resizing the virtual machine disk file during replication. If you used replication seeds for the target disk, you can resize the disk manually.

Procedure

1 Unconfigure replication on the virtual machine.
2 Resize the disk on the source site.
3 Resize the target disk that is left over after you unconfigure replication.
To apply changes that you make to advanced settings to virtual machines that you have already protected, you must reconfigure protection for those virtual machines individually. Alternatively, you can remove protection from a virtual machine by removing it from its protection group, and then reconfigure protection by adding it back to the protection group.
error
  Records panic and error log entries. Error messages occur in cases of problems that might or might not result in a failure.
warning
  Records panic, error, and warning log entries. Warning messages occur for behavior that is undesirable but that might be part of the expected course of operation.
info
  Records panic, error, warning, and information log entries. Information messages provide information about normal operation.
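The levels described above are cumulative: each level records everything that the previous level records, plus one more type of entry. The following Python sketch models only that relationship for the three levels listed here. The names ENTRY_TYPES, entries_recorded, and is_recorded are hypothetical and are not part of SRM.

```python
# Entry types ordered from most to least severe, limited to the
# levels described above. This is a model, not SRM code.
ENTRY_TYPES = ("panic", "error", "warning", "info")


def entries_recorded(level):
    """Return the entry types written to the log at the given level."""
    if level not in ENTRY_TYPES[1:]:
        raise ValueError(f"level not covered by this sketch: {level!r}")
    return ENTRY_TYPES[: ENTRY_TYPES.index(level) + 1]


def is_recorded(entry_type, level):
    """Report whether an entry of entry_type appears at the given level."""
    return entry_type in entries_recorded(level)


for level in ("error", "warning", "info"):
    print(level, "->", ", ".join(entries_recorded(level)))
```

For example, is_recorded("info", "error") is False, which is why lowering the level to error hides messages about normal operation.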
Option: Set logging level for the SOAP Web Services adapter
Description: Select a logging level from the logManager.SoapAdapter drop-down menu. Because of the volume of traffic that the SOAP adapter generates, setting the logging level to trivia might affect performance. By default, SOAP adapter logging is set to info.

Option: Set logging level for storage issues
Description: Select a logging level from the logManager.Storage drop-down menu.
Change Remote Site Settings

You can modify the default values that the SRM Server at the protected site uses to determine whether the SRM Server at the remote site is available.

SRM monitors the connection between the protected site and the recovery site and raises alarms if the connection breaks. You can change the criteria that cause SRM to raise a connection event and change the way that SRM raises alarms.
4 Modify the storage settings.

Option: Change SRA update timeout
Action: Type a new value in the storage.commandTimeout text box.

Option: Change the maximum number of concurrent SRA operations
Action: Type a new value in the storage.maxConcurrentCommandCnt text box.

Option: Change the minimum amount of time in seconds between datastore group computations
Action: Type a new value in the storage.minDsGroupComputationInterval text box.
Option: Delay host scans during testing and recovery
Action: SRAs can send responses to SRM before a promoted storage device on the recovery site is available to the ESXi hosts. When SRM receives a response from an SRA, it rescans the storage devices. If the storage devices are not yet fully available, ESXi Server does not detect them, and SRM does not find the replicated devices when it rescans.
4 Modify the vSphere Replication settings.

Option: Allow vSphere Replication to recover virtual machines that are included in SRM recovery plans independently of SRM
Description: If you configure vSphere Replication on a virtual machine and include the virtual machine in an SRM recovery plan, you cannot recover the virtual machine by using vSphere Replication independently of SRM.
2 Set the srmMaxBootShutdownOps setting.
  Option text box: Type srmMaxBootShutdownOps.
  Value text box: Type the maximum number of boot shutdown operations, for example 32.
3 Click OK to save your changes.
4 Log in to the SRM Server host.
5 Open the vmware-dr.xml file in a text editor.
  The vmware-dr.xml file is in the C:\Program Files\VMware\VMware vCenter Site Recovery Manager\config folder.
Table 9-1. Settings that Modify the Number of Simultaneous Power On or Power Off Operations

Option: srmMaxBootShutdownOps
Description: Specifies the maximum number of concurrent power-on operations for any given cluster. Guest shutdowns, but not forced power offs, are throttled according to this value. Guest shutdowns occur during primary site shutdowns (planned failover) and IP customization workflows.
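To illustrate what a cap on concurrent operations means in practice, the following Python sketch runs a batch of simulated power-on requests with a fixed upper bound on how many run at once. It does not show how SRM schedules operations internally. The functions run_throttled and fake_power_on, the InFlightTracker class, and the limit of 4 are all assumptions chosen for the example.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class InFlightTracker:
    """Records the highest number of operations running at the same time."""

    def __init__(self):
        self._lock = threading.Lock()
        self._current = 0
        self.peak = 0

    def __enter__(self):
        with self._lock:
            self._current += 1
            self.peak = max(self.peak, self._current)
        return self

    def __exit__(self, *exc_info):
        with self._lock:
            self._current -= 1
        return False


def run_throttled(vm_names, max_ops, operation):
    """Call operation(vm) for every VM with at most max_ops calls in flight."""
    with ThreadPoolExecutor(max_workers=max_ops) as pool:
        # pool.map returns results in the same order as vm_names.
        return list(pool.map(operation, vm_names))


tracker = InFlightTracker()


def fake_power_on(vm_name):
    """Stand-in for a real power-on request; it only sleeps briefly."""
    with tracker:
        time.sleep(0.01)
    return f"{vm_name}: powered on"


results = run_throttled(
    [f"vm-{i}" for i in range(12)], max_ops=4, operation=fake_power_on
)
print(len(results), "operations finished, peak concurrency", tracker.peak)
```

With a lower limit, the same batch takes longer but places less simultaneous load on the target, which is the trade-off that a setting such as srmMaxBootShutdownOps controls.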
10 Troubleshooting SRM Administration

To help identify the cause of any problems you encounter during the day-to-day running of SRM, you might need to collect SRM Server or client log files to review or send to VMware Support.

Errors that you encounter during SRM operations appear in error dialog boxes or appear in the Recent Tasks window. Most errors also generate an entry in an SRM log file. Check the recent tasks and log files for the recovery site and the protected site.
- If a virtual machine is attached to a Raw Disk Mapping (RDM) disk device, you must store the mapping file in the same folder as the VMX file. RDM snapshots are only available if you create the RDM mapping using Virtual Compatibility Mode.

If you are running an ESX or ESXi Server 4.1 or later, these limitations do not apply.

vSphere Replication supports the protection of virtual machines with snapshots, but you can only recover the latest snapshot.
Disaster Recovery and Reprotect of Virtual Machines on Datastores that Use SIOC

If you run a recovery with SIOC enabled, the recovery succeeds with errors. After the recovery, you must manually disable SIOC on the protected site and run a planned migration recovery again. You cannot run reprotect until you have successfully run a planned migration.
2 Create a post-power on command step in the recovery plan to reenable Admission Control after the virtual machine powers on.

  Get-Cluster cluster_name | Set-Cluster -HAAdmissionControlEnabled:$true

If you disable Admission Control during recovery, you must manually reenable Admission Control after you perform cleanup following a test recovery. Disabling Admission Control might affect the ability of High Availability to restart virtual machines on the recovery site.
- When the number of skipped failed pings exceeds a higher limit, SRM sends a RemoteSiteDownEvent event for every failed ping and stops sending RemoteSitePingFailedEvent events. You can configure this higher limit of failed pings by setting the remoteSiteStatus.panicDelay setting.
- SRM continues to send RemoteSiteDownEvent events until the connection is reestablished.

Configure SRM Alarms

SRM adds alarms to the alarms that vCenter Server supports.
Table 10-1. Site Status Events (Continued)

Event Key: RemoteSiteCreatedEvent
Event Description: Remote Site Created
Cause: Remote site is created.

Event Key: RemoteSiteUpEvent
Event Description: Remote Site Up
Cause: SRM Server re-establishes its connection with the remote SRM Server.

Event Key: RemoteSiteDeletedEvent
Event Description: Remote Site Deleted
Cause: Remote SRM site has been deleted.

Protection Group Events

Protection Group events provide information about actions and status related to protection groups.
Table 10-2. Protection Group Replication Informational Events (Continued)

Event Key: PlaceholderVmCreatedEvent
Event Description: The placeholder virtual machine was created in the VMware vCenter Server inventory.
Cause: Posted on the recovery site vCenter Server only when SRM creates the placeholder virtual machine as a result of protection, repair.
Recovery Events

Recovery events provide information about actions and status related to the SRM recovery processes.

Table 10-5. Recovery Events

Event Key: RecoveryVmBegin
Event Description: Recovery Plan has begun recovering the specified virtual machine.
Cause: Signaled when the recovery virtual machine was successfully created. If an error occurs before the virtual machine ID is known, the event is not fired.
Table 10-5. Recovery Events (Continued)

Event Key: RecoveryPlanServerCommandBegin
Event Description: Recovery Plan has started to run a Command on the SRM Server machine.
Cause: Signaled on the recovery site when SRM has started to run a Callout Command on the SRM Server machine.

Event Key: RecoveryPlanServerCommandEnd
Event Description: Recovery Plan has completed the execution of a Command on the SRM Server machine.
Table 10-8. Array Pair Events

Type: StorageArrayPairDiscovered
Description: Discovered replicated array pair with Array Manager.
Content: User created Array Manager which discovered replicated array pairs.

Type: StorageArrayPairEnabled
Description: Enabled replicated array pair with Array Manager.
Content: User enabled an Array Pair.

Type: StorageArrayPairDisabled
Description: Disabled replicated array pair with Array Manager.
Content: User disabled an Array Pair.
Table 10-10. Protection Events (Continued)

Type: StorageProviderDatastoreReplicationLost
Description: Datastore included in specified protection group is no longer replicated.
Content: User turned off replication for devices backing the datastore.

Type: StorageProviderGroupProtectionRestored
Description: Protection has been restored for specified protection group.
Content: The previous (non-empty) issues of a protection group are cleared.
Licensing Events

Licensing events provide information about changes in SRM licensing status.

Table 10-11. Licensing Events

Type: LicenseExpiringEvent
Description: The SRM License at the specified site expires in the specified number of days.
Content: Every 24 hours, non-evaluation, expiring licenses are checked for the number of days left. This event is posted with the results.
Table 10-13. SNMP Traps

Type: RecoveryPlanExecuteTestBeginTrap
Description: This trap is sent when a Recovery Plan starts a test.
Content: SRM site name, recovery plan name, recovery type, execution state.

Type: RecoveryPlanExecuteTestEndTrap
Description: This trap is sent when a Recovery Plan ends a test.
Content: SRM site name, recovery plan name, recovery type, execution state, result status.
vSphere Replication Events and Alarms

vSphere Replication supports event logging. For each event, you can define an alarm that triggers when the event occurs. This feature provides a way to monitor the health of your system and to resolve potential problems, helping to ensure reliable virtual machine replication.

Configure vSphere Replication Alarms

You can define and edit alarms to alert you when a specific vSphere Replication event occurs.
Table 10-14. vSphere Replication Events (Continued)

Event Type: com.vmware.vcHms.remoteSiteDownEvent
Category: Error
Event Target: Folder

Event Description: Connection to the remote vSphere Replication site is established
Event Type: com.vmware.vcHms.remoteSiteUpEvent
Category: Info
Event Target: Folder

Event Name: VR Server disconnected
Event Description: vSphere Replication server disconnected
Event Type: com.vmware.vcHms.
Table 10-14. vSphere Replication Events (Continued)

Event Type: com.vmware.vcHms.failedResolvingStoragePolicyEvent
Category: Error
Event Target: Datastore

Event Description: vSphere Replication was paused as a result of a configuration change, such as a disk being added or reverting to a snapshot where disk states are different
Event Type: hbr.primary.SystemPausedReplication
Category: Error
Event Target: Virtual Machine

Event Name: Invalid vSphere Replication configuration
Event Description: Invalid vSphere Replication configuration
Event Type: hbr.primary.
Table 10-14. vSphere Replication Events (Continued)

Event Type: hbr.primary.ConnectionRestoredToHbrServerEvent
Category: Info
Event Target: Virtual Machine

Event Type: hbr.primary.
- Download Details provides information on the log bundle file name and destination for the log bundle file.

This process does not collect client logs. You must collect client logs separately.

Collect SRM Log Files Manually

You can download SRM Server log files in a log bundle that you generate manually. This is useful if you are unable to access the vSphere Client.
5 Set the maximum number of log files to retain.
  You set the maximum number of logs by adding a section to the section. The default is 10 log files.
  50
6 Change the location on the SRM Server in which to store the logs.
  You change the log location by modifying the section in the section.
  C:\ProgramData\VMware\VMware vCenter Site Recovery Manager\Logs
11 (Optional) Set the level of logging for storage replication adapters.
  Setting the SRM logging level does not set the logging level for SRAs. You change the SRA logging level by adding a section to vmware-dr.xml. The possible logging levels are error, warning, info, trivia, and verbose.
Resolve SRM Operational Issues

If you encounter problems with creating protection groups and recovery plans, failover, recovery, or guest customization, you can troubleshoot the problem.

SRM Doubles the Number of Backslashes in the Command Line When Running Callouts

When a backslash is a part of the callout command line, SRM doubles all backslashes.

Problem

The command-line system interpreter treats double backslashes as a single backslash only in file paths.
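The effect of the doubling described above can be modeled in a few lines of Python. This sketch only mimics the transformation that the text describes and is not SRM code. The function name double_backslashes and the example command line, including the script path and the DOMAIN\user argument, are hypothetical.

```python
def double_backslashes(command_line):
    """Mimic the described behavior: every backslash becomes two."""
    return command_line.replace("\\", "\\\\")


# A hypothetical callout: a script path followed by a non-path argument.
original = r"C:\scripts\tag.cmd DOMAIN\user"
as_run = double_backslashes(original)

print("configured:", original)
print("executed:  ", as_run)
```

In this model the script path still resolves, because the interpreter collapses doubled backslashes in file paths, but the script receives DOMAIN\\user as its argument, because that argument is not a file path.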
Powering on Many Virtual Machines Simultaneously on the Recovery Site Can Lead to Errors

When many virtual machines perform boot operations at the same time, you might see errors during array-based and vSphere Replication recovery.

Problem

When powering on many virtual machines simultaneously on the recovery site, you might see these errors in the recovery history reports:

- The command 'echo "Starting IP customization on Windows ..." > > % VMware_GuestOp_OutputFile %.
Cause

SRM does not check how snapshot volumes are presented to ESXi hosts. SRM does not support setting the LVM.enableResignature flag to 0. If you change the flag from 1 to 0, a virtual machine outage might occur each time you perform a test failover or an actual failover. Setting the LVM.enableResignature flag on ESXi hosts is a host-wide operation.
Cause

The infrastructure on the recovery site is unable to handle the volume of concurrent creations of placeholder virtual machines.

Solution

Increase the replication.placeholderVmCreationTimeout setting from the default of 300 seconds. See “Change the Timeout for the Creation of Placeholder Virtual Machines,” on page 86.

You do not need to restart SRM Server after changing this setting.
Recovery Fails with Unavailable Host and Datastore Error

Recovery or test recovery fails with an error about host hardware and datastores being unavailable if you run the recovery or test shortly after changes occur in the vCenter Server inventory.

Problem

Recovery or test recovery fails with the error No host with hardware version '7' and datastore 'ds_id' which are powered on and not in maintenance mode are available....
Solution

Install VMware Tools on the protected virtual machines. If you cannot or choose not to install VMware Tools on the protected virtual machines, you must configure SRM not to wait for VMware Tools to start in the recovered virtual machines and to skip the guest operating system shutdown step. See “Change Recovery Settings,” on page 85.
Scalability Problems when Replicating Many Virtual Machines with a Short RPO to a Shared VMFS Datastore on ESXi Server 5.0

Performance might be slow if you replicate a large number of virtual machines with a short Recovery Point Objective (RPO) to a single virtual machine file store (VMFS) datastore that is accessible by multiple hosts on the recovery site.

Problem

This problem occurs when running ESXi Server 5.0 on the recovery site.
Application Quiescing Changes to File System Quiescing During vMotion to an Older Host

vSphere Replication can create an application quiesced replica for virtual machines with Windows Server 2008 and Windows 8 guest operating systems running on an ESXi 5.1 or newer host.

Problem

The ESXi 5.1 or newer host is in a cluster with hosts that run older versions, and you use vMotion to move the replicated virtual machine to an older host.
Configuring Replication Fails for Virtual Machines with Two Disks on Different Datastores

If you try to configure vSphere Replication on a virtual machine that includes two disks that are contained in different datastores, the configuration fails.

Problem

Configuration of replication fails with the error Multiple source disks, with device keys device_keys, point to the same destination datastore and file path. The replication group remains in the error state.
vSphere Replication Does Not Start After Moving the Host

If you move the ESXi Server on which the vSphere Replication appliance runs to the inventory of another vCenter Server instance, vSphere Replication operations are not available. vSphere Replication operations are also unavailable if you reinstall vCenter Server.
In addition to the generic error, the message provides more detailed information about the problem, similar to the following examples.

- VRM Server generic error. Please check the documentation for any troubleshooting information. The detailed exception is: 'org.apache.http.conn.HttpHostConnectException: Connection to https://vCenter_Server_address refused'.
  This error relates to problems connecting to vCenter Server.
Cause

SRM uses VMware Tools heartbeat to discover when recovered virtual machines are running on the recovery site. Recovery operations require that you install VMware Tools on the protected virtual machines. Recovery fails if you did not install VMware Tools on the protected virtual machines, or if you did not configure SRM to start without waiting for VMware Tools to start.

Solution

Install VMware Tools on the protected virtual machines.