Designing Disaster Recovery Clusters using Metroclusters and Continentalclusters, Reprinted October 2011 (5900-1881)

Restoring Disaster Tolerance
After a failover to a recovery cluster, restoring disaster tolerance poses several challenges. The
most significant are:
Restoring the failed cluster.
Depending on the nature of the disaster, it may be necessary either to create a new cluster or
to restore the failed cluster.
Before starting up the new or the failed cluster, make sure the AUTO_RUN flag is disabled for
all of the Continentalclusters application packages. This prevents the packages from starting
unexpectedly when the cluster starts.
Resynchronizing the data.
To resynchronize the data, either restore the data to the failed cluster and continue with the
same data replication procedure, or set up data replication to run in the opposite direction.
The following sections briefly outline some scenarios for restoring disaster tolerance.
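One way to verify the AUTO_RUN setting described above before starting a cluster is to scan the
package configuration files. The following is a minimal sketch, not part of the Continentalclusters
product. It assumes that each package has a text configuration file with one parameter per line
(for example, AUTO_RUN YES). The function name find_auto_run_enabled and the file paths are
hypothetical.

```shell
#!/bin/sh
# Sketch: report package configuration files whose AUTO_RUN parameter is
# still enabled. Assumes one "PARAMETER VALUE" pair per line; the function
# name and any file paths are hypothetical, not Serviceguard tooling.

# Prints "<file>: AUTO_RUN enabled" for each file that sets AUTO_RUN to YES.
# Matching is case-insensitive, and lines whose first field starts with '#'
# are treated as comments. Returns 1 if any file still has AUTO_RUN enabled,
# 0 otherwise.
find_auto_run_enabled() {
    rc=0
    for file in "$@"; do
        if awk '
            $1 ~ /^#/ { next }
            toupper($1) == "AUTO_RUN" && toupper($2) == "YES" { found = 1 }
            END { exit(found ? 0 : 1) }
        ' "$file"; then
            echo "$file: AUTO_RUN enabled"
            rc=1
        fi
    done
    return "$rc"
}

# Example call (hypothetical paths):
#   find_auto_run_enabled /etc/cmcluster/salespkg/salespkg.conf
```

A nonzero return status means at least one package would still start automatically with the
cluster, so its configuration should be corrected before the cluster is started.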
Restore Clusters to their Original Roles
If the disaster did not destroy the failed cluster, you can return both clusters in a recovery
pair to their original roles. To do this:
1. Make sure that both clusters are up and running, with the recovery packages continuing to
run on the surviving cluster.
2. Compare the clusters to make sure their configurations are consistent. Correct any
inconsistencies.
3. For each recovery group where the repaired cluster will run the primary package:
a. Synchronize the data from the disks on the surviving cluster to the disks on the repaired
cluster. This may be time-consuming.
b. Halt the recovered application on the surviving cluster if necessary, and start it on the
repaired cluster.
c. To keep application downtime to a minimum, start the primary package on the repaired
cluster before resynchronizing the data of the next recovery group.
4. View the status of the Continentalclusters configuration:
# cmviewconcl
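The order of operations in steps 3 and 4 can be sketched as a script that prints a command plan
for review rather than executing anything. This is only an illustration under stated assumptions:
the manual does not name the commands for step 3b, so the use of the Serviceguard commands
cmhaltpkg and cmrunpkg here is an assumption, the data resynchronization in step 3a depends on
the replication technology and is shown only as a placeholder, and the package names are
hypothetical.

```shell
#!/bin/sh
# Sketch: print (do not run) the per-recovery-group failback sequence.
# Each argument is a hypothetical "primary_pkg:recovery_pkg" pair.
# cmhaltpkg and cmrunpkg are assumed here for step 3b; the data
# resynchronization in step 3a is replication-specific and not shown.
print_failback_plan() {
    for pair in "$@"; do
        primary_pkg=${pair%%:*}    # text before the first ':'
        recovery_pkg=${pair#*:}    # text after the first ':'
        echo "# recovery group for $primary_pkg"
        echo "surviving cluster: resynchronize data to the repaired cluster (replication-specific)"
        echo "surviving cluster: cmhaltpkg $recovery_pkg"
        echo "repaired cluster: cmrunpkg $primary_pkg"
    done
    # Step 4: check the overall status once all groups are moved back.
    echo "any node: cmviewconcl"
}

# Example call (hypothetical package names):
#   print_failback_plan "salespkg:salespkg_bak" "hrpkg:hrpkg_bak"
```

Handling one recovery group at a time, and starting its primary package before moving on to the
next group, mirrors step 3c: each application is down only while its own data is being moved back.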
Primary Packages Remaining on the Surviving Cluster
Configure the failed cluster in a recovery pair as a recovery-only cluster and the surviving cluster
as a primary-only cluster. This minimizes the downtime involved in moving the applications back
to the restored cluster. This approach assumes that the surviving cluster has sufficient resources
to run all critical applications indefinitely.
NOTE: In a scenario with multiple recovery pairs, where more than one primary cluster is
configured to share the same recovery cluster, do not use the following procedure to switch the
roles of the failed cluster and the surviving cluster.
Use the following procedure:
1. Halt the monitor packages. Issue the following command on each cluster:
# cmhaltpkg ccmonpkg
2. Edit the Continentalclusters ASCII configuration file. Change the definitions of the
monitoring clusters, and swap the names of the primary and recovery packages in the definitions
of the recovery groups. It may also be necessary to re-create the data sender and data receiver
packages.