Building Disaster Recovery Serviceguard Solutions Using Metrocluster with EMC SRDF

1. Clean the Site Safety Latch on the site by running the cmresetsc tool.
On a node from the site, run the following command:
# /usr/sbin/cmresetsc <Site_controller_package_name>
IMPORTANT: Root user credentials are required to run this command.
2. Check the package log file of the Site Controller Package on the node it failed on and fix any
reported issues.
3. Enable node switching for the Site Controller package on that node.
# cmmodpkg -e -n <node name> <site_controller_package_name>
4. Check the package log file on all the nodes of the MNP packages managed by the Site
Controller package on the site.
5. Fix any issues reported in the package log files and enable node switching for the MNP
packages on the failed nodes.
# cmmodpkg -e -n <node 1 name> -n <node 2 name> <MNP Package name>
6. Restart the Site Controller package and enable global switching.
# cmrunpkg <site_controller_package_name>
# cmmodpkg -e <site_controller_package_name>
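The command-line parts of steps 1, 3, 5, and 6 can be collected into a small script. The following is a minimal sketch, not a supported tool. The function names run and recover_site_controller and the DRY_RUN variable are hypothetical, and all package and node names are placeholders supplied by the caller. By default the script only prints the commands, because steps 2 and 4 (reading the package logs and fixing the reported issues) must still be carried out by hand before switching is re-enabled.

```shell
#!/bin/sh
# Sketch only: prints (or, with DRY_RUN=0, runs as root) the commands
# from steps 1, 3, 5, and 6. Names used here are illustrative assumptions.

DRY_RUN="${DRY_RUN:-1}"

run() {
    # In dry-run mode, print the command with a root prompt; otherwise execute it.
    if [ "$DRY_RUN" = "1" ]; then
        echo "# $*"
    else
        "$@"
    fi
}

recover_site_controller() {
    # $1 = Site Controller package, $2 = node it failed on,
    # $3 = MNP package, remaining arguments = nodes where the MNP package failed
    if [ "$#" -lt 4 ]; then
        echo "usage: recover_site_controller <sc_pkg> <sc_node> <mnp_pkg> <node>..." >&2
        return 2
    fi
    sc_pkg="$1"; sc_node="$2"; mnp_pkg="$3"
    shift 3

    # Step 1: clean the Site Safety Latch on the site.
    run /usr/sbin/cmresetsc "$sc_pkg" || return 1

    # Step 3: enable node switching for the Site Controller package.
    run cmmodpkg -e -n "$sc_node" "$sc_pkg" || return 1

    # Step 5: enable node switching for the MNP package on each failed node.
    node_args=""
    for n in "$@"; do
        node_args="$node_args -n $n"
    done
    # Word splitting of $node_args is intended (node names contain no spaces).
    run cmmodpkg -e $node_args "$mnp_pkg" || return 1

    # Step 6: restart the Site Controller package and enable global switching.
    run cmrunpkg "$sc_pkg" || return 1
    run cmmodpkg -e "$sc_pkg"
}
```

Calling recover_site_controller with a Site Controller package name, the node it failed on, an MNP package name, and the failed MNP nodes prints the five commands in the order given above. Setting DRY_RUN=0 runs them instead and stops at the first command that fails.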
In addition to the cmresetsc tool, use the cmviewsc tool to view information on the workload
packages managed by the Site Controller package. This tool is available in the /usr/sbin
directory. Run the following command to use the cmviewsc tool.
# /usr/sbin/cmviewsc [-v] [site_controller_package_name]
The cmviewsc tool displays the following information:
Number of critical and managed packages at each site.
Status of the Site Controller managed packages (halted or started).
Whether the Site Controller managed packages halted cleanly.
Whether each site is active or passive.
Site Safety Latch value on each node. The value can be Close, Open, or Intermediate.
For more information about using cmviewsc, see cmviewsc (1m).
Identifying and cleaning MNP stack packages that are halted
The Site Controller package does not start if the MNP stack packages are not halted cleanly. An
MNP package is halted uncleanly when the halt script does not run successfully on all the configured
nodes of the package. This implies that some stray resources configured with the package might
still be online in the cluster. The Site Controller package logs the following message in
its log file on the node where it failed to start:
Package <package name> has not halted cleanly on node <node name>
The following command shows whether an MNP package halt was clean or unclean:
# cmviewcl -v -f line <MNP Package name>
Check for the field last_halt_failed under each instance of the MNP package. When set to
Yes, that instance of the MNP package did not successfully execute the halt script when it was
halted. Find all similar instances.
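To find all such instances without scanning the output by eye, the output can be filtered. The sketch below assumes that the line format prints one attribute per line, with the node given as a node:<node name> field followed by last_halt_failed=<value>, and fields separated by a vertical bar. This layout is an assumption and should be checked against the actual output on the cluster. The function name unclean_nodes is hypothetical.

```shell
#!/bin/sh
# Sketch: list the nodes on which an MNP package reports last_halt_failed=yes.
# Reads "cmviewcl -v -f line <MNP Package name>" output on standard input.
# ASSUMPTION: per-node attributes look like  node:<node name>|last_halt_failed=<value>

unclean_nodes() {
    awk -F'|' '
        {
            node = ""; failed = ""
            for (i = 1; i <= NF; i++) {
                # "node:" is 5 characters, so the name starts at position 6.
                if ($i ~ /^node:/)             node = substr($i, 6)
                # "last_halt_failed=" is 17 characters, so the value starts at 18.
                if ($i ~ /^last_halt_failed=/) failed = tolower(substr($i, 18))
            }
            # Compare case-insensitively, since the text writes the value as "Yes".
            if (node != "" && failed == "yes") print node
        }
    '
}
```

A typical invocation pipes the command into the function, for example: cmviewcl -v -f line <MNP Package name> | unclean_nodes. Each node name printed is a node whose package log file should be examined.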
The unclean nodes might have stray resources. See the MNP package log file on the corresponding
node to identify why the halt script failed. Clean any stray resources that are still
online on the node and enable node switching on the node for the package. This clears the flag