HP StorageWorks P9000 Cluster Extension Software Administrator Guide (TB534-96009, February 2011)

Activating the pair/resync monitor
The pair/resync monitor detects and reacts to suspended Continuous Access links. To activate the
pair/resync monitor, set the ResyncMonitor object to YES. To activate automatic disk pair
resynchronization, set the ResyncMonitorAutoRecover object to YES.
When an RHCS service or SLE HA resource group is stopped, the pair/resync monitor is stopped
for the RAID Manager device/copy group the service or resource group uses.
The pair/resync monitor does not allow online changes in the P9000 Cluster Extension resource
configuration file when the corresponding RHCS service or SLE HA resource group is online.
If the ResyncMonitor object is changed to YES while the RHCS service or SLE HA resource
group is running, the pair/resync monitor is not started.
If the ResyncMonitor object is changed to NO while the RHCS service or SLE HA resource
group is running, a running pair/resync monitor is not stopped.
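The behavior described above can be summarized in a small model. The following Python sketch is illustrative only and is not part of P9000 Cluster Extension. The class and method names are hypothetical, and the sketch assumes that the ResyncMonitor value is evaluated only when the service or resource group starts.

```python
class ServiceMonitorModel:
    """Toy model of one service and its pair/resync monitor (hypothetical names)."""

    def __init__(self, resync_monitor: str = "NO") -> None:
        self.configured = resync_monitor   # ResyncMonitor value in the configuration file
        self.service_running = False
        self.monitor_running = False

    def set_resync_monitor(self, value: str) -> None:
        """Change the configured value; a running monitor is not started or stopped."""
        if value not in ("YES", "NO"):
            raise ValueError("ResyncMonitor must be YES or NO")
        self.configured = value

    def start_service(self) -> None:
        """Assumption: the monitor state is decided when the service starts."""
        self.service_running = True
        self.monitor_running = self.configured == "YES"

    def stop_service(self) -> None:
        """Stopping the service also stops the monitor for its device/copy group."""
        self.service_running = False
        self.monitor_running = False
```

In this model, a change to the ResyncMonitor value takes effect only after the service or resource group is stopped and started again.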
CAUTION:
If an RHCS service or SLE HA resource group cannot be stopped gracefully, disable monitoring of
the device/copy group for the service or resource group. To avoid data corruption, this task must
be part of the recovery procedure when P9000 Cluster Extension is deployed in the RHCS or SLE
HA environment. See “Stopping the pair/resync monitor” (page 73).
Ensure that the pair/resync monitor does not monitor and resynchronize the disk pair (device/copy
group) from both disk array sites.
Timing considerations
P9000 Cluster Extension gives priority to disk array operations over cluster software operations.
If P9000 Cluster Extension invokes disk pair resynchronization or gathers information about the
remote disk array, P9000 Cluster Extension waits until the requested status information is reported.
This ensures the priority of data integrity over cluster software failover processes. This behavior
can lead to failed resources, as follows:
• P9000 Cluster Extension uses RAID Manager instances to communicate with the remote disk
  array. Depending on the setting of the RAID Manager instance timeout parameter and the
  number of remote instances, the service or resource group start operation can time out. This
  can occur if the local RAID Manager instance cannot reach the remote RAID Manager instance.
  In an SLE HA environment, the timeout value defined for the start operation can be adjusted
  to an appropriate value to avoid this situation. In an RHCS environment, the timeout value
  depends on the timeout value specified in the script resource agent
  (/usr/share/cluster/script.sh).
• If the ApplicationStartup object is set to RESYNCWAIT, P9000 Cluster Extension tries to
  resynchronize disk pairs and waits until the RAID Manager device/copy group is in the PAIR
  state. RAID Manager and the disk array microcode fully support delta resynchronization; however,
  the delta between the primary and secondary disks can be large enough for the copy process
  to exceed the service or resource group startup timeout value.
• The ResyncWaitTimeout object can cause the resource to fail if its value is higher than the
  resource startup timeout value.
• If running in fence-level ASYNC, the default value of AsyncTakeoverTimeout can cause the
  resource to fail if its value is set beyond the recommended startup timeout value. The default
  is large because the takeover process for fence-level ASYNC can take longer when communication
  links are slow.
To prevent the takeover timeout from terminating the takeover commands, measure the time
required to copy the installed disk array cache and adjust the resource startup timeout interval.
When measuring the copy time, measure only the slowest link used for Continuous Access.
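As a rough planning aid, the timing conditions above can be checked before the service or resource group is brought online. The following Python sketch is illustrative only and is not part of P9000 Cluster Extension or RAID Manager. The class, function, and field names are hypothetical, the copy-time estimate ignores protocol overhead, and the worst-case wait for unreachable remote instances assumes that each remote instance is tried in turn for the full RAID Manager instance timeout.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TimingSettings:
    """Hypothetical container for the timeouts discussed above (all in seconds)."""
    startup_timeout: int                          # cluster start-operation timeout
    raid_manager_timeout: int                     # RAID Manager instance timeout
    remote_instances: int                         # number of remote RAID Manager instances
    resync_wait_timeout: Optional[int] = None     # ResyncWaitTimeout (RESYNCWAIT only)
    async_takeover_timeout: Optional[int] = None  # AsyncTakeoverTimeout (fence-level ASYNC only)


def estimate_copy_seconds(cache_gib: float, slowest_link_mbit_s: float) -> float:
    """Idealized time to copy the installed cache over the slowest link."""
    if slowest_link_mbit_s <= 0:
        raise ValueError("link speed must be positive")
    cache_bits = cache_gib * 1024**3 * 8
    return cache_bits / (slowest_link_mbit_s * 1_000_000)


def timing_warnings(s: TimingSettings, copy_seconds: float = 0.0) -> List[str]:
    """Return a warning for each value that could outlast the startup timeout."""
    warnings = []
    # Assumption: unreachable remote instances are tried one after another.
    unreachable_wait = s.raid_manager_timeout * s.remote_instances
    if unreachable_wait >= s.startup_timeout:
        warnings.append("RAID Manager timeout x remote instances reaches the startup timeout")
    if s.resync_wait_timeout is not None and s.resync_wait_timeout > s.startup_timeout:
        warnings.append("ResyncWaitTimeout is higher than the startup timeout")
    if s.async_takeover_timeout is not None and s.async_takeover_timeout > s.startup_timeout:
        warnings.append("AsyncTakeoverTimeout is higher than the startup timeout")
    if copy_seconds > s.startup_timeout:
        warnings.append("estimated copy time is higher than the startup timeout")
    return warnings
```

For example, 64 GiB of cache copied over a 1000 Mbit/s link gives an idealized estimate of about 550 seconds, which would exceed a 300-second startup timeout. Measured copy times on the actual Continuous Access link should take precedence over this estimate.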
68 Configuring P9000 Cluster Extension for Linux