Site Recovery Manager Administration Guide vCenter Site Recovery Manager 4.1 This document supports the version of each product listed and supports all subsequent versions until the document is replaced by a new edition. To check for more recent editions of this document, see http://www.vmware.com/support/pubs.
Site Recovery Manager Administration Guide You can find the most up-to-date technical documentation on the VMware Web site at: http://www.vmware.com/support/ The VMware Web site also provides the latest product updates. If you have comments about this documentation, submit your feedback to: docfeedback@vmware.com Copyright © 2008–2010 VMware, Inc. All rights reserved. This product is protected by U.S. and international copyright and intellectual property laws.
Contents About This Book 5 1 Administering VMware vCenter Site Recovery Manager 7 Protected Sites and Recovery Sites 7 Array-Based Replication 8 About Protection Groups and Recovery Plans 10 Understanding Recovery and Test Recovery 11 Operational Limits of Site Recovery Manager 12 About Failback 12 SRM and vCenter 13 About the Site Recovery Manager Database 14 SRM Licensing 14 SRM Authentication 15 How SRM Uses Network Ports 16 Site Recovery Manager Roles and Permissions 17 2 Installing and Updating Site
Site Recovery Manager Administration Guide Limitations on Recovery of Snapshots and Linked Clones 37 Create a Recovery Plan 37 Edit a Recovery Plan 38 Remove a Recovery Plan 39 4 Test Recovery, Recovery, and Failback 41 Test a Recovery Plan 41 Pause, Resume, or Cancel a Test 42 Run a Recovery Plan 42 Configuring and Executing Failback 43 Review and Execute Post-Failover Cleanup Tasks 44 Reconfigure Replication 44 Reconfigure SRM to Enable Failback to the Protected Site Restore the Original Configuration
About This Book
VMware® vCenter Site Recovery Manager (SRM) is an extension to VMware vCenter that enables integration with array-based replication, discovery and management of replicated datastores, and automated migration of inventory from one vCenter to another.
Site Recovery Manager Administration Guide Technical Support and Education Resources The following technical support resources are available to you. To access the current version of this book and other books, go to http://www.vmware.com/support/pubs. Online and Telephone Support To use online support to submit technical support requests, view your product and contract information, and register your products, go to http://www.vmware.com/support.
Administering VMware vCenter Site Recovery Manager 1 VMware vCenter Site Recovery Manager is a business continuity and disaster recovery solution that helps you plan, test, and execute a scheduled migration or emergency failover of vCenter inventory from one site to another.
Site Recovery Manager Administration Guide Site Pairing The protected and recovery sites must be paired before you can use SRM. Site pairing includes three main steps: 1 Exchange of authentication information between the two sites. 2 Discovery of the replicated storage arrays that support the protected site, and discovery of peer arrays at the recovery site. 3 Discovery of the replicated devices supported by the arrays, and mapping of these devices to datastores that support virtual machines.
Chapter 1 Administering VMware vCenter Site Recovery Manager You cannot designate a third site as a recovery site for one that is already paired with another site. If you want to use SRM to provide business continuity and disaster recovery services at a recovery site, you must configure that site as a protected site that uses its own array managers to replicate data to the other member of the site pair.
About Protection Groups and Recovery Plans A protection group is a collection of virtual machines and templates that use the same replicated datastore or datastore group. A recovery plan specifies how the virtual machines in a protection group are recovered. When the replicated devices that support a datastore group fail over, that operation affects all of the virtual machines and templates that use the datastores in the group.
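As a rough illustration of the grouping rule above, the following Python sketch derives group membership from datastore group membership. The inventory, the names, and the `group_by_datastore_group` helper are hypothetical, and the sketch assumes each virtual machine uses a single datastore. SRM performs this discovery itself through the array managers; this is not an SRM API.

```python
from collections import defaultdict

def group_by_datastore_group(vm_to_datastore, datastore_to_group):
    """Map each datastore group to the sorted list of VMs and templates that use it."""
    members = defaultdict(list)
    for vm, datastore in vm_to_datastore.items():
        members[datastore_to_group[datastore]].append(vm)
    return {group: sorted(vms) for group, vms in members.items()}

# Hypothetical inventory: which datastore each VM or template lives on,
# and which datastore group each datastore belongs to.
vm_to_datastore = {"web01": "ds-a", "db01": "ds-b", "app01": "ds-c", "tmpl-win": "ds-a"}
datastore_to_group = {"ds-a": "group-1", "ds-b": "group-1", "ds-c": "group-2"}

groups = group_by_datastore_group(vm_to_datastore, datastore_to_group)
```

In this model, a failover of the devices behind `group-1` affects `web01`, `db01`, and `tmpl-win` together, which is why they belong to the same protection group.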
Chapter 1 Administering VMware vCenter Site Recovery Manager machines. If you need to override inventory mappings for a few members of a protection group, use the vSphere Client to connect to the recovery site and edit the network settings of the placeholders or move them to a different folder or resource pool. If a member of a protection group loses its protection, its placeholder is removed from the recovery site until the protection has been restored.
Site Recovery Manager Administration Guide How SRM Interacts with DPM and DRS During Recovery Distributed Power Management (DPM) is a VMware facility that manages power consumption by ESX hosts. Distributed Resource Scheduler (DRS) is a VMware facility that manages the assignment of virtual machines to ESX hosts. When DPM and DRS are enabled on a recovery site cluster, SRM temporarily disables DPM for the cluster and ensures that all hosts in it are powered on before recovery begins.
Chapter 1 Administering VMware vCenter Site Recovery Manager A typical failback has two phases. In the first phase, the protected and recovery sites switch roles, and the virtual machines are migrated from the recovery site to the protected site under the control of a recovery plan. In the second phase, the relationship of the protected and recovery sites is restored, so that future failovers migrate the protected virtual machines from the protected site to the recovery site.
Site Recovery Manager Administration Guide About the Site Recovery Manager Database The SRM server requires its own database, which it uses to store recovery plans, inventory information, and similar data. The SRM database is a critical part of any SRM installation. The database must be initialized and a database connection created before you can install SRM.
Chapter 1 Administering VMware vCenter Site Recovery Manager SRM Authentication All communications between SRM and vCenter servers take place over an SSL connection and are authenticated by public key certificates or stored credentials. When you install an SRM server, you must choose either credential-based authentication or certificate-based authentication. You cannot mix authentication methods.
Requirements When Using Public Key Certificates If you have installed SSL certificates issued by a trusted certificate authority (CA) on the vCenter server that supports SRM, the certificates you create for use by SRM must meet specific criteria. Although SRM uses standard PKCS#12 certificates for authentication, it places a few specific requirements on the contents of certain fields of those certificates.
Site Recovery Manager Roles and Permissions SRM uses vCenter roles and permissions but includes additional ones that allow fine-grained control over SRM-specific tasks and operations. SRM and vCenter use the same authorization model. The set of permissions applied to or inherited by an object determines the operations that are allowed on the object and the list of roles that can perform those operations.
• Protection SRM Administrator role at the SRM site recovery root level (propagate).
• Protection Groups Administrator role at the SRM protection groups level (propagate).
You must have the following permissions at the recovery site:
• Recovery Inventory Administrator role at the vCenter root.
• Recovery Datacenter Administrator role at the datacenter level (propagate).
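The effect of the propagate setting in the role assignments above can be sketched with a small model. This is a simplified illustration, not SRM or vCenter code: the `InventoryObject` class and `effective_roles` function are hypothetical, and the model ignores users, groups, and any override rules.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class InventoryObject:
    name: str
    parent: Optional["InventoryObject"] = None
    # role name -> True if the assignment propagates to child objects
    assignments: dict = field(default_factory=dict)

def effective_roles(obj):
    """Roles assigned directly on obj plus propagating roles inherited from ancestors."""
    roles = set(obj.assignments)
    ancestor = obj.parent
    while ancestor is not None:
        roles.update(role for role, propagate in ancestor.assignments.items() if propagate)
        ancestor = ancestor.parent
    return roles

# Hypothetical recovery-site inventory using the role names listed above.
root = InventoryObject("vCenter root",
                       assignments={"Recovery Inventory Administrator": False})
datacenter = InventoryObject("Datacenter", parent=root,
                             assignments={"Recovery Datacenter Administrator": True})
cluster = InventoryObject("Cluster", parent=datacenter)

cluster_roles = effective_roles(cluster)
```

Here the cluster inherits only the datacenter-level role, because the root-level assignment is not marked to propagate.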
Installing and Updating Site Recovery Manager 2 You must install an SRM server at the protected site and also at the recovery site. After the SRM servers are installed, you can download the client plug-in from either server to any vSphere Client. You use the SRM client plug-in to configure and manage SRM at each site. Prerequisites SRM requires the support of a vCenter server at each site. The SRM installer must be able to connect with this server during installation.
Site Recovery Manager Administration Guide The SRM database at each site holds information about virtual machine configurations, protection groups, and recovery plans. SRM cannot use the vCenter database because it has different database schema requirements, though you can use the vCenter database server to create and support the SRM database. Each SRM site requires its own instance of the SRM database. The database must exist before SRM can be installed.
DB2 Server Configuration A DB2 Server configuration must meet specific requirements to support SRM. DB2 Server has the following configuration requirements when used as the SRM database:
• When creating the database instance, specify utf-8 encoding.
• Because DB2 uses Windows authentication, you must specify the database owner as a domain account.
7 On the VMware vCenter Server page, enter information about the vCenter server at the site where you are installing SRM and then click Next.
• vCenter Server Address: Enter the hostname or IP address of the vCenter Server. If you use the hostname, enter it in lowercase.
10 Enter the database configuration information and click Next.
• Database Client: Select a database client type from the pulldown control.
• Data Source Name: Select an existing DSN from the pulldown, or click ODBC DSN Setup to view existing DSNs or create a new one.
• Username: A user ID valid for the specified database.
• Password: The password for the specified user ID.
• Connection Count: The initial connection pool size.
Site Recovery Manager Administration Guide Procedure 1 Download the SRA. You can download storage replication adapters and their documentation from http://www.vmware.com/download/srm/. Storage replication adapters downloaded from other sites are not supported by VMware. 2 Install the SRA on each SRM server host. Storage replication adapters come with their own installation instructions. The adapter you are using must be installed on the SRM server host at the protected and recovery sites.
Chapter 2 Installing and Updating Site Recovery Manager What to do next You can now install the updated client plug-in. See “Install the SRM Client Plug-In,” on page 25. Install the SRM Client Plug-In To install the Site Recovery Manager client plug-in, use a vSphere Client to connect to the vCenter Server at the protected or recovery site, then download the plug-in from the server and enable it in the vSphere Client.
Site Recovery Manager Administration Guide Repair a Site Recovery Manager Server Installation If you need to change any of the information you supplied when you installed the SRM Server, you can repair the installation and supply the changed information. Installing the SRM server binds the installation to a number of values that you supply, including the vCenter server to extend, the SRM database DSN and credentials, the type of authentication you want to use, and so on.
7 On the Database Configuration page, enter the following database configuration information and click Next:
• Data Source Name: Select an existing DSN from the pulldown, or click ODBC DSN Setup to view existing DSNs or create a new one.
• Username: A user ID valid for the specified database.
• Password: The password for the specified user ID.
• Connection Count: The initial connection pool size.
Configuring the Protected and Recovery Sites 3 After you have installed SRM at the protected and recovery sites, you must connect the two sites to create a site pair, configure the array managers at each site, and configure SRM at each site. You use the SRM client plug-in to administer SRM. Site pairing requires vSphere administrative privileges at both sites. Prerequisites Before you can connect the protected and recovery sites, you must: 1 Install an SRM server at each site.
Site Recovery Manager Administration Guide Procedure 1 Open a vSphere client and connect to the vCenter server at the site that you want to designate as the protected site. Log in as a vSphere administrator. NOTE The recovery site must be the replication target of arrays managed by the SRA at the protected site. 2 On the vSphere Client Home page, click the Site Recovery icon. 3 In the Protection Setup area of the Summary window, navigate to the Connection line and click Configure.
Chapter 3 Configuring the Protected and Recovery Sites Procedure 1 Open a vSphere client and connect to the vCenter server at the protected site. Log in as a vSphere administrator. 2 On the vSphere Client Home page, click Licensing. 3 For the report view, select Asset. 4 Right-click an SRM asset and select Change license key. 5 Select Assign a new license key and click Enter Key. 6 Enter the license key, enter an optional label for the key, and click OK. 7 Click OK.
Site Recovery Manager Administration Guide 5 Make sure that the array manager type that you want SRM to use appears in the Manager Type field. If more than one SRA has been installed on the SRM server host, click the drop-down arrow and select the manager type you want to use. If no manager type is displayed, no SRA has been installed on the SRM server host. For more information, see “Install the Storage Replication Adapters,” on page 23.
Chapter 3 Configuring the Protected and Recovery Sites Procedure 1 Open a vSphere Client and connect to the vCenter server at the recovery site. Log in as a vSphere administrator. 2 On the vSphere Client Home page, click the Site Recovery icon. 3 In the Recovery Setup area of the Summary window, navigate to the Recovery Plans line and click Repair Array Managers.
Site Recovery Manager Administration Guide Procedure 1 Open a vSphere Client and connect to the vCenter server at the protected site. Log in as a vSphere administrator. 2 On the vSphere Client Home page, click the Site Recovery icon. 3 In the Protection Setup area of the Summary window, navigate to the Inventory Mappings line and click Configure. The Inventory Mappings page displays a tree of resources at the protected site and a corresponding tree of resources at the recovery site.
Chapter 3 Configuring the Protected and Recovery Sites Configure Resource Mappings for a Virtual Machine If you have not specified inventory mappings for your site, you must configure resource mappings for individual virtual machines. You can configure resource mappings only if site-wide inventory mappings have not been established. If inventory mappings have been established for a site, you cannot override them by configuring the protection of individual virtual machines.
Site Recovery Manager Administration Guide 4 On the Name and Description page of the Create Protection Group wizard, type a name and optional description for the protection group, and click Next. 5 On the Select a Datastore Group page, select a datastore group from the list, and click Next. The datastores listed were discovered when you configured the array managers. Each datastore in the list is replicated to the recovery site and supports at least one virtual machine at the protected site.
Chapter 3 Configuring the Protected and Recovery Sites Adding and Removing Members of a Protection Group When you create a protection group, it includes all the virtual machines on the selected datastore. You can add or remove protection group members by adding or moving virtual machines to the datastore, or by removing them from the datastore. All virtual machines and templates that reside on a protected datastore are part of the protection group that applies to that datastore.
Site Recovery Manager Administration Guide Procedure 1 Open a vSphere Client and connect to the vCenter server at the recovery site. Log in as a vSphere administrator. 2 On the vSphere Client Home page, click the Site Recovery icon. 3 In the Recovery Setup area of the Summary window, navigate to the Recovery Plans line and click Create.
Chapter 3 Configuring the Protected and Recovery Sites Procedure 1 Open a vSphere Client and connect to the vCenter server at the recovery site. Log in as a vSphere administrator. 2 On the vSphere Client Home page, click the Site Recovery icon. 3 In the Recovery Setup area of the Summary window, navigate to the Recovery Plans line, right-click the plan that you want to edit, and select Edit Recovery Plan.
Test Recovery, Recovery, and Failback 4 After you have configured SRM at the protected and recovery sites, you can test your recovery plan without affecting services at either site. You can also run a recovery plan and, if necessary, configure the two sites for failback so that you can restore services at the protected site. SRM makes it easy to test a recovery plan. The test does not disrupt replication or any ongoing activities at the protected site.
Site Recovery Manager Administration Guide 5 Click the Recovery Steps tab to monitor the progress of the test and respond to messages. The Recovery Steps tab displays the progress of individual steps. The Recent Tasks area reports the progress of the overall plan. NOTE If the SRM server loses contact with the recovery site vCenter while a recovery plan is being tested or run, the recovery plan fails and displays the message Error: The session is not authenticated.
Chapter 4 Test Recovery, Recovery, and Failback 5 Review the information in the confirmation prompt, and when you are ready to proceed, select I understand that this process cannot be undone and click Run Recovery Plan. 6 To monitor the progress of the recovery and respond to messages, click the Recovery Steps tab. The Recovery Steps tab displays the progress of individual steps. The Recent Tasks area reports the progress of the overall plan.
Site Recovery Manager Administration Guide 3 Reconfigure SRM to Enable Failback to the Protected Site on page 45 Before you can run a failback, you must create the protection groups and recovery plans required to migrate protected inventory from the recovery site back to the protected site. 4 Restore the Original Configuration on page 45 After a failback is complete, you can restore the original configuration so that the protected and recovery sites resume the roles they had before the failover.
Chapter 4 Test Recovery, Recovery, and Failback Reconfigure SRM to Enable Failback to the Protected Site Before you can run a failback, you must create the protection groups and recovery plans required to migrate protected inventory from the recovery site back to the protected site. After you have prepared both sites for failback, reconfigured array replication, and replicated the source devices at the recovery site to their targets at the protected site, you can create the environment needed for failback.
3 Configure the array managers (see “Configure Array Managers,” on page 31).
4 Configure the inventory mappings (see “Configure Inventory Mappings,” on page 33).
5 Create the protection groups (see “Create Protection Groups,” on page 35).
6 Create the recovery plans (see “Create a Recovery Plan,” on page 37).
7 Test the recovery plan (see “Test a Recovery Plan,” on page 41).
Customizing Site Recovery Manager 5 In its default configuration, SRM enables a number of simple recovery scenarios. Advanced users can customize SRM to support a broader range of site recovery requirements. The default protection and recovery capabilities of SRM can be appropriate for sites that have simple configurations or recovery objectives.
5 To apply the selected role to all child objects of the selected inventory object, select Propagate to Child Objects.
6 To select the user or group for the role, click the Add button.
7 Identify the user or group.
a From the Domain drop-down menu, select the domain where the user or group is located.
b Either enter a name in the Search text box or select a name from the Name list.
c Click Add and then click OK when finished.
Chapter 5 Customizing Site Recovery Manager Virtual machines in all other priority groups are recovered serially per ESX host to enable a group of machines that spans several hosts to recover in parallel. During this type of recovery, machines on a specific ESX host are recovered in the order specified by the list, but the recovery order of the entire list is subject to the assignment of virtual machines to hosts.
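The ordering behavior described above can be sketched as follows. This is an illustrative model only: the function names, virtual machine names, and host names are hypothetical, and the lock-step "rounds" assume that every virtual machine takes the same time to recover, which is not something SRM guarantees.

```python
def per_host_queues(recovery_list, vm_host):
    """Split an ordered recovery list into one ordered queue per ESX host."""
    queues = {}
    for vm in recovery_list:
        queues.setdefault(vm_host[vm], []).append(vm)
    return queues

def recovery_rounds(queues):
    """List the VMs recovered together in each round: the next VM from every host queue."""
    depth = max((len(q) for q in queues.values()), default=0)
    return [[q[i] for q in queues.values() if i < len(q)] for i in range(depth)]

# Hypothetical recovery list (in plan order) and placement of VMs on hosts.
recovery_list = ["vm1", "vm2", "vm3", "vm4", "vm5"]
vm_host = {"vm1": "esx-a", "vm2": "esx-a", "vm3": "esx-b", "vm4": "esx-a", "vm5": "esx-b"}

queues = per_host_queues(recovery_list, vm_host)
rounds = recovery_rounds(queues)
```

In this example `vm3` starts in the first round even though `vm2` precedes it in the list, because the two machines are on different hosts. Order is preserved only within each host's queue.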
Site Recovery Manager Administration Guide 3 Power on the virtual machine and verify that VMware Tools reports an OS heartbeat within the specified period. 4 Run any post-power-on command or message steps. NOTE Post-power-on command steps provide an application-specific way to verify that a recovered virtual machine has all the capabilities that you expect.
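The heartbeat check in step 3 amounts to polling with a timeout. The following sketch shows that pattern under stated assumptions: `has_heartbeat` is a hypothetical callable standing in for whatever reports the VMware Tools heartbeat, and the timeout and polling interval are illustrative values, not SRM defaults. SRM performs this check internally.

```python
import time

def wait_for_heartbeat(has_heartbeat, timeout_s, poll_s=5.0,
                       clock=time.monotonic, sleep=time.sleep):
    """Poll has_heartbeat() until it returns True or timeout_s seconds elapse."""
    deadline = clock() + timeout_s
    while True:
        if has_heartbeat():
            return True          # heartbeat seen within the allowed period
        if clock() >= deadline:
            return False         # period expired without a heartbeat
        sleep(poll_s)
```

The `clock` and `sleep` parameters exist only so that the function can be exercised without real waiting. A post-power-on command step could use the same pattern to wait for an application-level check, such as a service answering on its port.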
Table 5-1. Environment Variables Available to All Command Steps
Name | Value | Example
VMware_RecoveryName | Name of the recovery plan that is executing | "Plan A"
VMware_RecoveryMode | Recovery mode | "test" or "recovery"
VMware_VC_Host | Host name of the vCenter host at the recovery site | "vc_hostname.example.
Site Recovery Manager Administration Guide Specify Virtual Machine Recovery Priority By default, all virtual machines in a new recovery plan are members of the normal priority group. Members of this group are recovered in the order that they were created on the protected datastore. You can move a virtual machine to a different priority group or to a different priority within a group. Procedure 1 Open the Recovery Steps page for the plan, as described in “Customize Recovery Plan Steps,” on page 51.
Chapter 5 Customizing Site Recovery Manager Add Commands to a Recovery Plan You can customize a recovery plan to include commands that are executed on the SRM server host at the recovery site when the plan is tested or run. You can add command steps to any part of a recovery plan. When you create a command step to add to a recovery plan, make sure that it takes into account the environment in which it must run. For more information, see “Guidelines for Writing Command Steps,” on page 50.
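A command step can read the environment variables that SRM provides, such as those listed in Table 5-1, to adapt its behavior to the plan and mode. The sketch below assumes that a Python interpreter is available on the SRM server host, which this guide does not state. The helper names are hypothetical, and only the three variable names and the "test" and "recovery" mode values come from the table.

```python
import os

def describe_run(env):
    """Build a log line from the SRM-provided variables; missing ones show as 'unknown'."""
    plan = env.get("VMware_RecoveryName", "unknown")
    mode = env.get("VMware_RecoveryMode", "unknown")
    vc_host = env.get("VMware_VC_Host", "unknown")
    return f"plan={plan} mode={mode} vcenter={vc_host}"

def is_real_recovery(env):
    """True only when the plan is being run, not tested."""
    return env.get("VMware_RecoveryMode") == "recovery"

# Sample values taken from Table 5-1; a real command step would pass os.environ.
sample_env = {"VMware_RecoveryName": "Plan A", "VMware_RecoveryMode": "test"}
summary = describe_run(sample_env)
```

Checking the mode before doing anything irreversible lets the same script be used when the plan is tested and when it is run.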
Site Recovery Manager Administration Guide The customizations you specify are saved as properties of the placeholder virtual machine and then applied to the recovered virtual machine when a recovery plan is run or tested. NOTE If you remove the protection of a virtual machine, all recovery customizations are lost.
3 Run the dr-ip-customizer.exe command, as shown in this example.
dr-ip-customizer.exe -cfg ..\config\vmware-dr.xml -csv c:\tmp\example.
Table 5-3. IP Customization Spreadsheet (Continued)
Columns: VM ID | VM Name | Adapter ID | MAC Address | DNS Domain | NetBIOS | Primary WINS | Secondary WINS | IP Address | Subnet Mask | Gateway(s) | DNS Server(s) | DNS Suffix(es)
Example row values, in the order they appear: shdw31, shdw31, 00:1a:3f:b8:f3:79, example.com, 10.10.10.10, 10.13.99.5, 255.255.0.0, 10.10.10.100, 10.10.10.1, 10.10.10.2
The following rules apply when you modify a CSV file created by the dr-ip-customizer utility.
The specified customizations are applied to all of the virtual machines named in the CSV file during a recovery. (You do not need to select a customization specification for these machines when you edit their properties in a recovery plan.) Configure Protection for a Virtual Machine or Template You can edit the protection properties of any virtual machine or template in a protection group.
Site Recovery Manager Administration Guide 5 In the Edit Virtual Machine Properties window, review and configure properties as needed. a In the resource list, click Folder to review the recovery site folder to which this virtual machine is assigned. If inventory mappings have not been established for this site, you can edit this property. b Click Next to review the recovery site host to which the virtual machine is assigned.
Chapter 5 Customizing Site Recovery Manager Procedure 1 Open a vSphere Client and connect to the vCenter server at the protected site. Log in as a vSphere administrator. 2 On the vSphere Client Home page, click the Site Recovery icon. 3 In the Site Recovery tree view, expand the Protection Groups item. Protection groups that include virtual machines that need repair are highlighted with a warning icon. 4 Open the protection group and click the Virtual Machines tab.
Site Recovery Manager Administration Guide Procedure 1 Open a vSphere Client and connect to the vCenter server at the protected site. Log in as a vSphere administrator. 2 On the vSphere Client Home page, click the Site Recovery icon. 3 Right-click Site Recovery in the vSphere Client navigation pane and click Advanced Settings. 4 In the navigation pane of the Advanced Settings window, click a setting category. 5 In the category window, make your changes.
Procedure
1 Right-click Site Recovery in the vSphere Client navigation pane and click Advanced Settings.
2 In the navigation pane of the Advanced Settings window, click SanProvider.
3 Modify the SAN provider settings as needed.
• To change the length of time that SRM waits for a command issued by the SRA to complete, enter a new value in the SanProvider.CommandTimeout text box.
Change Remote Site Settings Use the Advanced Settings remoteSiteStatus page to modify the default values that the local SRM server (the one at the site to which the vSphere Client is currently connected) uses to determine whether the SRM server at the remote site is available. SRM monitors the connection between the members of an SRM site pair (a protected site and its recovery site) and, by default, raises alarms when this connection is interrupted.
Chapter 5 Customizing Site Recovery Manager Procedure 1 In the vSphere Client, right-click an ESX cluster and click Edit Settings. 2 In the Settings window for the cluster, click Swapfile Location and select Store the swapfile in the datastore specified by the host, then click OK. 3 For each host in the cluster, select a nonreplicated datastore. a Click the Configuration tab. b On the Swapfile Location line, click Edit.
Site Recovery Manager Administration Guide 7 At the recovery site, use the vmkfstools command to create a clone of the copied disk. Create one clone for every placeholder virtual machine, but do not attach any clones to a virtual machine. The clones are assigned as part of the protection configuration process and are attached during recovery. 8 At the protected site, configure each protected virtual machine. a Use the vmkfstools command to clone the disk.
Troubleshooting SRM 6 If you have problems with storage replication, site pairing, or guest customization, you can try to troubleshoot the problem. To help identify the cause, you might need to collect SRM server or client log files to review or send to VMware Support. Errors encountered during SRM operations are displayed in error dialogs or shown in the Recent Tasks window. Most errors also generate an entry in an SRM log file.
3 In the Protection Setup area of the SRM Summary window, navigate to the Array Managers line and click Configure. 4 In the Configure Array Managers wizard, click Next on the Protected Site Array Managers page and then click Next on the Recovery Site Array Managers page. The Review Replicated Datastores page should now display each replicated datastore that contains at least one virtual machine.
Chapter 6 Troubleshooting SRM Cause This error usually occurs when a virtual machine has been recently created but its files have not yet been replicated to the recovery site. For instance, you have created a virtual machine at the protected site, added it to a protection group, and then tested or run a recovery plan that includes the new virtual machine. If the virtual machine files have not yet been replicated to the recovery site, the recovery plan cannot recover the virtual machine.
Site Recovery Manager Administration Guide Collecting SRM Log Files SRM creates several log files that contain information that can help VMware Support diagnose problems. You can use the SRM log collector to simplify log file collection. The SRM server and client generate separate sets of log files. The SRM server log files contain information about the server configuration and messages related to server operations.
Index A administration, overview of 7 advanced settings dialog boxes guest customization 60 local site 61 recovery site 60 remote site 62 SAN provider 60 alarms, SRM-specific 59 array managers and storage replication adapters 31 replicated device discovery 31 to configure 31 to configure when protected site is down 32 to rescan arrays 33 authentication certificate warnings and 15 methods used by Site Recovery Manager 15 C certificate public key 15 requirements for 16 to change type 26 to update 26 certif
log files collecting 68 SRM client 68 SRM server 68 N network, test 11 P permissions Site Recovery Manager 17 to assign 47 placeholders in vCenter inventory 10 to repair 58 plug-in Site Recovery Manager Client 25 to install 25 ports, used by SRM 16 protected site configure array managers for 31 configuring 29 host compatibility requirements 7 to designate 29 to disconnect from 30 protection group maximum number supported 12 relationship to datastore group 10 rel