HP Serviceguard Extended Distance Cluster for Linux A.01.00 Deployment Guide, Second Edition, May 2008

Disaster Scenarios and Their Handling
Disaster Scenario:
This is a multiple-failure scenario in which the failures occur in a particular sequence, in the configuration that corresponds to figure 2, where Ethernet and FC links do not go over DWDM.

The package (P1) is running on a node (N1). P1 uses a mirror, md0, consisting of S1 (local to node N1, for example /dev/hpdev/mylink-sde) and S2 (local to node N2).

The first failure occurs when all FC links between the two data centers fail, causing N1 to lose access to S2 and N2 to lose access to S1.

After recovery for the first failure has been initiated, the second failure occurs: N1 goes down while re-mirroring is in progress.
What Happens When This Disaster Occurs:
The package (P1) continues to run on N1 after the first failure, with md0 consisting of only S1.

After the second failure, the package (P1) fails over to N2 and starts with S1. Since S2 is also accessible, the extended distance cluster adds S2 and starts re-mirroring of S2.
Recovery Process:
For the first failure scenario, complete the following procedure to initiate a recovery:

1. Restore the links in both directions between the data centers. As a result, S2 (/dev/hpdev/mylink-sdf) is accessible from N1 and S1 is accessible from N2.

2. Run the following commands to remove and add S2 to md0 on N1:

# mdadm --remove /dev/md0 /dev/hpdev/mylink-sdf
# mdadm --add /dev/md0 /dev/hpdev/mylink-sdf

The re-mirroring process is initiated. The re-mirroring process starts from the beginning on N2 after the second failure. When it completes, the extended distance cluster detects S2 and accepts it as part of md0 again.
Table 4-1 Disaster Scenarios and Their Handling (Continued)
Columns: Disaster Scenario | What Happens When This Disaster Occurs | Recovery Process
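The recovery procedure in this row ends with re-mirroring in progress, so it can be useful to check how far the resynchronization has advanced. The following is a minimal sketch, not part of the Serviceguard product: the function name remirror_status is hypothetical, and it assumes the standard Linux md status file /proc/mdstat, where devices are listed by short name (md0, not /dev/md0). The command mdadm --detail /dev/md0 reports similar information.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Serviceguard): print the re-mirroring
# percentage that mdstat-format text reports for one md device.
# Usage: remirror_status DEVICE [MDSTAT_FILE]
#   DEVICE       short md name as it appears in mdstat, for example md0
#   MDSTAT_FILE  file to read; defaults to /proc/mdstat
remirror_status() {
    dev=$1
    file=${2:-/proc/mdstat}
    awk -v dev="$dev" '
        # A line of the form "NAME : ..." starts a new stanza.
        $2 == ":" { in_dev = ($1 == dev); next }
        # A blank line ends the current stanza.
        NF == 0   { in_dev = 0; next }
        # Inside the requested stanza, take the percentage from the
        # recovery/resync progress line.
        in_dev && /(recovery|resync)/ {
            if (match($0, /[0-9.]+%/)) {
                print substr($0, RSTART, RLENGTH)
                found = 1
                exit
            }
        }
        # No percentage reported for this device.
        END { if (!found) print "none" }
    ' "$file"
}
```

For example, running remirror_status md0 on the node where P1 is active prints a percentage while S2 is being re-mirrored and prints none when no resynchronization percentage is reported for md0.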