Designing Disaster Recovery Clusters using Metroclusters and Continentalclusters, Reprinted October 2011 (5900-1881)

7. If using physical data replication, do not resync from the recovery cluster to the primary cluster.
Instead, manually issue a command that will overwrite any changes on the recovery disk array
that may inadvertently have been made.
8. Start the package up on the primary cluster and allow connection to the application.
Testing Continentalclusters Operations
Use the following procedures to exercise typical Continentalclusters behaviors:
1. Halt both clusters in a recovery pair, then restart both clusters. The monitor packages on both
clusters should start automatically. The Continentalclusters packages (primary, data sender,
data receiver, and recovery) should not start automatically. Any other packages may
or may not start automatically, subject to their configuration.
NOTE: If an UP status is configured for a cluster, then an appropriate alert notification (email,
SNMP, etc.) should be received at the configured time interval from the node running the
monitor package on the other cluster. Due to delays in email or SNMP, the notifications may
arrive later than expected.
Alerts and alarms are sent using the mechanisms defined in the Continentalclusters
configuration file. They are also recorded in the file /var/opt/resmon/log/cc/eventlog
on the system reporting the event.
2. While the monitor package is running on a monitoring cluster, halt the monitored cluster
(cmhaltcl -f). An appropriate alert notification (email, SNMP, etc.) should be received at
the configured time interval from the node running the monitor package. Run cmrecovercl
at this point. The command should fail, because only an alert, not an alarm, has been
issued so far. Additional notifications should be received at the configured time
intervals. After the alarm notification is received, run cmrecovercl again. Any data receiver
packages on the monitoring cluster should halt and the recovery package(s) should start with
package switching enabled. Halt the recovery packages.
3. Rerun test 2 under a variety of conditions (and combinations of conditions), such as the
following:
  • Rebooting and powering off systems one at a time
  • Rebooting and powering off all systems at the same time
  • Running the monitor package on each node in each cluster
  • Disconnecting the WAN connection between the clusters
  • If physical data replication is used, disconnecting the physical replication links between
    the disk arrays:
      • Powering off the disk array at the primary site
      • Powering off the disk array at the recovery site
  • Testing cmrecovercl -f as well as cmrecovercl
Depending on the condition, the primary packages should be running so that the test reflects
real-life failures and recovery procedures.
4. After each scenario in tests 2 and 3, restore both clusters to their production state, restart the
primary package(s) (as well as any data sender and data receiver packages), and note any
issues, time delays, etc.
5. Halt the monitor package on one cluster. Halt the other cluster. No notifications are generated
that the other cluster has failed. What mechanism is available to the organization to monitor
the monitor?
6. Halt the packages on one cluster, but do not halt the cluster. No notifications are generated
that the packages on that cluster have failed. What mechanism is available to the organization
to monitor package status?
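The conditions in test 3 are easier to track as an explicit checklist. The following Python
sketch is one way to generate such a checklist; it is not part of Continentalclusters. It pairs
each condition, and each pair of conditions, with both forms of the recovery command. The
function name build_test_matrix, the condition labels, and the max_combined limit are
assumptions made for illustration.

```python
from itertools import combinations

# Conditions taken from test 3; the short labels are this sketch's own wording.
CONDITIONS = [
    "reboot/power off systems one at a time",
    "reboot/power off all systems at the same time",
    "run the monitor package on each node in each cluster",
    "disconnect the WAN connection between the clusters",
    "disconnect the physical replication links",
    "power off the disk array at the primary site",
    "power off the disk array at the recovery site",
]

# Both forms of the recovery command named in test 3.
RECOVERY_COMMANDS = ["cmrecovercl", "cmrecovercl -f"]


def build_test_matrix(conditions, commands, max_combined=2):
    """Return one checklist row per (set of conditions, recovery command).

    max_combined (hypothetical parameter) limits how many conditions
    are applied together in a single scenario.
    """
    rows = []
    for size in range(1, max_combined + 1):
        for combo in combinations(conditions, size):
            for command in commands:
                rows.append({"conditions": combo, "command": command})
    return rows


if __name__ == "__main__":
    matrix = build_test_matrix(CONDITIONS, RECOVERY_COMMANDS)
    for number, row in enumerate(matrix, 1):
        print(f"{number:3d}. {' + '.join(row['conditions'])} -> {row['command']}")
```

Some generated pairs are contradictory or not applicable (for example, the disk array rows
apply only when physical data replication is used), so prune the list before scheduling tests.
The issues and time delays noted in test 4 can then be recorded against each remaining row.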
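Because email and SNMP notifications can arrive late, the event log named in the NOTE under
test 1 is a useful second check during each test. The following Python sketch returns the lines
appended to the log since a saved position, so the position can be recorded before a test and
the new entries reviewed afterward. The function name read_new_entries and the offset
bookkeeping are assumptions for illustration. The format of the log entries is not described
here, so lines are returned unparsed.

```python
import os

# Path taken from the NOTE under test 1.
EVENT_LOG = "/var/opt/resmon/log/cc/eventlog"


def read_new_entries(path, offset=0):
    """Return (lines, new_offset) for complete lines appended after offset.

    Lines are returned as-is; this sketch makes no assumption about
    the layout of an individual log entry.
    """
    size = os.path.getsize(path)
    if size < offset:
        # The file is smaller than the saved position: assume it was
        # rotated or truncated, and start again from the beginning.
        offset = 0
    with open(path, "rb") as handle:
        handle.seek(offset)
        data = handle.read()
    # Keep only complete lines; a partial last line waits for the next call.
    end = data.rfind(b"\n") + 1
    lines = data[:end].decode("utf-8", errors="replace").splitlines()
    return lines, offset + end
```

Call read_new_entries(EVENT_LOG) before a test and keep the returned offset, then pass that
offset in after the test to see only the entries the test produced.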