Building Disaster Recovery Serviceguard Solutions Using Continentalclusters for Linux B.01.00.00

4 Restoring disaster recovery cluster after a disaster

After a failover to a cluster occurs, restoring disaster recovery is a manual process, the most

significant steps of which are:

• Restoring the failed cluster.

Depending on the nature of the disaster, it might be necessary to either create a new cluster

or to repair the failed cluster.

Before starting up the new or the failed cluster, ensure the auto_run flag for all of the

Continentalclusters application packages is disabled. This is to prevent starting the packages

unexpectedly with the cluster.

• Resynchronizing the data.

To resynchronize the data, you either restore the data to the cluster and continue with the

same data replication procedure, or set up data replication to function in the other direction.
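The auto_run check described above can be partly scripted. The following is a minimal sketch, not a documented procedure: it assumes that every Continentalclusters application package has an ASCII package configuration file containing a line such as `auto_run yes`, and that these files are kept together in one directory with a `.conf` suffix. The function name `find_autorun_enabled` and the directory layout are hypothetical.

```shell
#!/bin/sh
# Sketch: list package configuration files in which auto_run is still enabled.
# Assumptions (not from the manual): the files live in one directory, end in
# ".conf", and set the flag with a line of the form "auto_run yes".

# Print the path of every *.conf file under "$1" whose auto_run is "yes".
find_autorun_enabled() {
    conf_dir=$1
    for f in "$conf_dir"/*.conf; do
        # Skip the unexpanded pattern when the directory has no .conf files.
        [ -f "$f" ] || continue
        # Match an uncommented auto_run line, ignoring case and leading blanks.
        if grep -Eiq '^[[:space:]]*auto_run[[:space:]]+yes([[:space:]]|$)' "$f"; then
            printf '%s\n' "$f"
        fi
    done
}
```

Calling `find_autorun_enabled` with the directory that holds the package configuration files prints the files that still need attention. After a listed file is edited to set auto_run to no, the change would typically be reapplied with `cmapplyconf -P` on that file before the cluster is started; verify the exact procedure against the Serviceguard documentation for your release.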

The following sections briefly outline some scenarios for restoring disaster tolerance.

Retaining the original roles for primary and recovery cluster

After disaster recovery, the packages running on the recovery cluster can be moved back to the

primary cluster. To do this:

1. Ensure that both clusters are up and running, with the recovery packages continuing to run

on the surviving cluster.

2. Compare the clusters to ensure the configurations are consistent. Correct any inconsistencies.

3. For every recovery group where the repaired cluster will run the primary package, do the

following:

a. Synchronize the data from the disks on the surviving cluster to the disks on the repaired

cluster. This might be time-consuming.

b. Halt the recovered application on the surviving cluster if necessary, and start it on the

repaired cluster.

c. To minimize application down time, start the primary package on the repaired cluster before

resynchronizing the data of the next recovery group.

4. View the status of the Continentalclusters.

# cmviewconcl
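Step 3b above can be sketched as a pair of small shell helpers. This is an illustration under stated assumptions, not the documented procedure: it assumes the standard Serviceguard commands `cmhaltpkg` and `cmrunpkg` are used to halt and start the packages, and the helper names and the `DRY_RUN` switch are hypothetical. With `DRY_RUN=1` (the default in this sketch) the commands are only printed, so the sequence can be reviewed before anything is run.

```shell
#!/bin/sh
# Sketch of step 3b for one recovery group.
# Assumptions (not from the manual): cmhaltpkg and cmrunpkg are the commands
# used, the package names passed in are site-specific, and each helper is run
# on a node of the cluster named in its comment.

DRY_RUN=${DRY_RUN:-1}

# Print the command when DRY_RUN is 1, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Run on the surviving cluster: halt the recovery package.
halt_recovery_pkg() {
    run cmhaltpkg "$1"
}

# Run on the repaired cluster, after data resynchronization has completed:
# start the primary package.
start_primary_pkg() {
    run cmrunpkg "$1"
}
```

For each recovery group, the data is resynchronized first, then `halt_recovery_pkg` is called with the recovery package name on the surviving cluster and `start_primary_pkg` with the primary package name on the repaired cluster. Once all groups are processed, `cmviewconcl` shows the resulting status.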

Switching the primary and recovery cluster roles

Configure the failed cluster in a recovery pair as a recovery-only cluster and the recovery cluster

as a primary-only cluster. This minimizes the downtime involved with moving the applications back

to the restored cluster. This approach assumes that the original recovery cluster has sufficient resources

to run critical applications indefinitely.

NOTE: In a multiple recovery pairs scenario, where more than one primary cluster is configured

to share the same recovery cluster, do not perform the following steps to switch the role of the

failed cluster and the surviving cluster.

To switch the role of the failed cluster and the surviving cluster:

1. Halt the monitor packages. Run the following command on every cluster.

# cmhaltpkg ccmonpkg

2. Edit the Continentalclusters ASCII configuration file. It is necessary to change the definitions

of monitoring clusters, and switch the names of primary and recovery packages in the definitions
