Designing Disaster Tolerant High Availability Clusters, 10th Edition, March 2003 (B7660-90013)

Cascading Failover in a Continental Cluster

Data Replication Procedures

Chapter 8

Scenario 2—Secondary Site within the Primary Cluster Fails

When the secondary site fails, or when all SRDF links between the
primary Symmetrix and the secondary Symmetrix fail, the application
running on the primary site is not aware of the failure and continues
to run there, unless domino mode is used. This scenario is illustrated
in Figure 8-11.

Figure 8-11 Failure of Secondary Site in Primary Cluster

Without the secondary site, the current configuration provides no means
of replicating new data from the primary Symmetrix directly to the
recovery Symmetrix. The longer the secondary site is down, the further
the data in the recovery Symmetrix falls behind. If the primary site
fails during this time and the recovery site takes over, the customer
will have to operate on an old copy of the data. It is therefore
important to repair the secondary site and bring it back up as soon as
possible.

When the secondary site is repaired, the SRDF volume pairs between the
primary Symmetrix and the secondary Symmetrix will be in “Suspended”
mode. If the BCV/R1 devices in the secondary Symmetrix contain a good
copy of the data, they must be split from the mirror group before the
SRDF volume pairs between the primary Symmetrix and the secondary
Symmetrix are re-established. This protects the good copy from
corruption in the event of a rolling disaster. Use the following steps:

1. Split the BCV/R1 devices in the secondary Symmetrix from the
mirror group. Run the following command from a host that is connected
to the primary Symmetrix:

# symmir -g <prisymdevgrpname> split -rdf