Building Disaster Recovery Serviceguard Solutions Using Continentalclusters for Linux B.01.00.00

4 Restoring disaster recovery cluster after a disaster
After a failover to a cluster occurs, restoring disaster recovery is a manual process, the most
significant steps of which are:
Restoring the failed cluster.
Depending on the nature of the disaster it might be necessary to either create a new cluster
or to repair the failed cluster.
Before starting up the new or the failed cluster, ensure the auto_run flag for all of the
Continentalclusters application packages is disabled. This prevents the packages from starting
unexpectedly when the cluster starts.
Resynchronizing the data.
To resynchronize the data, you either restore the data to the cluster and continue with the
same data replication procedure, or set up data replication to function in the other direction.
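As a pre-start check for the auto_run requirement above, the following minimal sketch scans package ASCII configuration files and lists those in which auto_run is still set to yes. The function name auto_run_enabled and any file paths passed to it are hypothetical, and the sketch assumes the parameter appears as a keyword followed by its value on one line. It inspects only the files it is given, not the configuration currently applied to the cluster.

```shell
# Report package configuration files whose auto_run parameter is still "yes".
# Arguments: paths to package ASCII configuration files (hypothetical paths).
# Prints one matching file path per line; matching is case-insensitive.
auto_run_enabled() {
    for conf in "$@"; do
        awk '
            $1 ~ /^#/ { next }    # ignore comment lines
            tolower($1) == "auto_run" && tolower($2) == "yes" { found = 1 }
            END { exit found ? 0 : 1 }
        ' "$conf" && echo "$conf"
    done
    return 0
}
```

Any file the function prints still needs auto_run set to no, and the change applied, before the cluster is started.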
The following sections briefly outline some scenarios for restoring disaster tolerance.
Retaining the original roles for primary and recovery cluster
After disaster recovery, the packages running on the recovery cluster can be moved back to the
primary cluster. To do this:
1. Ensure that both clusters are up and running, with the recovery packages continuing to run
on the surviving cluster.
2. Compare the clusters to ensure the configurations are consistent. Correct any inconsistencies.
3. For every recovery group where the repaired cluster will run the primary package, do the
following:
a. Synchronize the data from the disks on the surviving cluster to the disks on the repaired
cluster. This might be time-consuming.
b. Halt the recovered application on the surviving cluster if necessary, and start it on the
repaired cluster.
c. To minimize application downtime, start the primary package on the repaired cluster before
resynchronizing the data of the next recovery group.
4. View the status of the Continentalclusters.
# cmviewconcl
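The ordering in step 3 can be sketched as a small planning aid that prints, for each recovery group, the actions in the order described above. This is only a sketch under stated assumptions: the function name failback_plan and the group:primary_package:recovery_package argument format are hypothetical, the data resynchronization step is left as a comment because it depends on the replication method in use, and cmhaltpkg and cmrunpkg are shown as the usual Serviceguard commands for halting and starting a package. The sketch prints the plan and does not run any cluster command.

```shell
# Print a failback plan, one recovery group at a time, so that each primary
# package is started before the next group's data is resynchronized.
# Each argument has the hypothetical form group:primary_pkg:recovery_pkg.
failback_plan() {
    for entry in "$@"; do
        group=${entry%%:*}          # text before the first colon
        rest=${entry#*:}            # text after the first colon
        primary_pkg=${rest%%:*}
        recovery_pkg=${rest#*:}
        echo "# recovery group $group"
        # Step a: replication-specific, so only a reminder is printed.
        echo "# resynchronize data from the surviving cluster to the repaired cluster"
        # Step b: run on the surviving cluster, if the package is still running.
        echo "cmhaltpkg $recovery_pkg"
        # Steps b and c: run on the repaired cluster.
        echo "cmrunpkg $primary_pkg"
    done
    # Step 4: view the resulting Continentalclusters status.
    echo "cmviewconcl"
}
```

Calling the function with one argument per recovery group yields a checklist that can be reviewed before any package is moved.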
Switching the primary and recovery cluster roles
Configure the failed cluster in a recovery pair as a recovery-only cluster and the recovery cluster
as a primary-only cluster. This minimizes the downtime involved with moving the applications back
to the restored cluster. This approach assumes that the original recovery cluster has sufficient
resources to run critical applications indefinitely.
NOTE: In a multiple recovery pairs scenario, where more than one primary cluster is configured
to share the same recovery cluster, do not perform the following steps to switch the role of the
failed cluster and the surviving cluster.
To switch the role of the failed cluster and the surviving cluster:
1. Halt the monitor packages. Run the following command on every cluster.
# cmhaltpkg ccmonpkg
2. Edit the Continentalclusters ASCII configuration file. It is necessary to change the definitions
of monitoring clusters, and switch the names of primary and recovery packages in the definitions