HP Serviceguard Extended Distance Cluster for Linux A.01.00 Deployment Guide, Second Edition, May 2008

Disaster Scenarios and Their Handling
Chapter 4
Table 4-1 Disaster Scenarios and Their Handling (Continued)

Disaster Scenario

In this case, the package (P1) runs with RPO-TARGET set to 60 seconds.

Initially, the package (P1) is running on node N1. P1 uses a mirror md0 consisting of S1 (local to node N1, for example /dev/hpdev/mylink-sde) and S2 (local to node N2). The first failure occurs when all FC links between the two data centers fail, causing N1 to lose access to S2 and N2 to lose access to S1. Immediately afterwards, a second failure occurs: node N1 goes down because of a power failure.

After N1 is repaired and brought back into the cluster, package switching of P1 to N1 is enabled.

IMPORTANT: While it is not a good idea to enable package switching of P1 to N1, it is described here to show recovery from an operator error.

The FC links between the data centers are not repaired, and N2 becomes inaccessible because of a power failure.

What Happens When This Disaster Occurs

When the first failure occurs, the package (P1) continues to run on N1 with md0 consisting of only S1.

When the second failure occurs, the package fails over to N2 and starts with S2.

When N2 fails, the package does not start on node N1 because a package is allowed to start only once with a single disk. You must repair this failure, and both disks must be synchronized and be a part of the MD array, before another failure of the same pattern occurs.

In this failure scenario, only S1 is available to P1 on N1, as the FC links between the data centers are not repaired. As P1 started once with S2 on N2, it cannot start on N1 until both disks are available.

Recovery Process

Complete the following steps to initiate a recovery:

1. Restore the FC links between the data centers. As a result, S2 (/dev/hpdev/mylink-sdf) becomes available to N1 and S1 (/dev/hpdev/mylink-sde) becomes accessible from N2.
2. To start the package P1 on N1, check the package log file in the package directory and run the commands that appear in it to force a package start.

When the package starts up on N1, it automatically adds S2 back into the array and the re-mirroring process is started. When re-mirroring is complete, the extended distance cluster detects and accepts S1 as part of md0.
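
The rule that a package may start only once with a single disk can be illustrated with a small model. The Python sketch below is not Serviceguard code: the class `Md0Model`, its methods, and `replay_scenario` are hypothetical names introduced only for illustration. It replays the three failures from this scenario and assumes that completing re-mirroring clears the single-disk restriction. It does not model the RPO-TARGET check or the forced-start commands taken from the package log.

```python
from dataclasses import dataclass, field


@dataclass
class Md0Model:
    """Toy model of the mirror md0 (S1 + S2); an illustration, not product code."""
    single_disk_start_used: bool = False
    starts: list = field(default_factory=list)  # (node, disk) of each allowed start

    def start_with_single_disk(self, node: str, disk: str) -> bool:
        """Attempt an automatic package start with only one mirror half visible."""
        if self.single_disk_start_used:
            # Rule from the table: only one start with a single disk is allowed.
            return False
        self.single_disk_start_used = True
        self.starts.append((node, disk))
        return True

    def resync_complete(self) -> None:
        """Assumption: once both disks are synchronized, the restriction clears."""
        self.single_disk_start_used = False


def replay_scenario() -> list:
    """Replay the failures of this scenario and record each outcome."""
    md0 = Md0Model()
    events = []
    # Failure 1: all FC links fail. P1 keeps running on N1 with S1 only.
    # The package is already running, so this is not a new start.
    events.append(("links fail, P1 stays on N1 with S1", True))
    # Failure 2: N1 loses power. P1 fails over to N2 and starts with S2 only.
    events.append(("start on N2 with S2",
                   md0.start_with_single_disk("N2", "S2")))
    # Failure 3: N2 loses power while the links are still down.
    # Only S1 is visible on N1, and the single-disk start is already used.
    events.append(("start on N1 with S1",
                   md0.start_with_single_disk("N1", "S1")))
    # Recovery: links restored, package force-started, re-mirroring completes.
    md0.resync_complete()
    events.append(("single-disk start after resync",
                   md0.start_with_single_disk("N1", "S1")))
    return events


if __name__ == "__main__":
    for description, allowed in replay_scenario():
        print(f"{description}: {'allowed' if allowed else 'refused'}")
```

In this model, the refused third step corresponds to P1 not starting on N1, which is why the recovery requires restoring the FC links and forcing the start with the commands listed in the package log.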