Building Disaster Recovery Serviceguard Solutions Using Metrocluster with 3PAR Remote Copy

Table 6 Error Messages and their Resolution (continued)
Log Messages    Cause    Resolution
(Entry continued from the previous page)

Log Messages:
    Fix any issue reported in the package log files and enable node switching for the packages on nodes they have failed.
    Reset the site siteA using cmresetsc command and start hrdb_sc again.
    Site Controller startup failed.

Cause:
    case, the remote site is siteB.

Resolution:
    3. Enable node switching for the package managed by Site Controller Package on the other site.
    4. Clean the remote site using the cmresetsc tool on a node in the other site.
    5. Restart the Site Controller package.

Log Messages:
    Executing: cmrunpkg siteA_mg1 siteA_mg2. siteA_mg3.
    Unable to run package siteA_mg1 on node ccia6, the node switching is disabled.
    Unable to run package siteA_mg1 on node ccia7, the node switching is disabled.
    cmrunpkg: Unable to start some package or package instances.
    Check the log files of the packages managed by Site Controller for more details.
    Check for any error messages in the package log file on all nodes in the site siteA for the packages managed by Site Controller (hrdb_sc).
    Fix any issue reported in the package log files and enable node switching for the packages on nodes they have failed.
    Reset the site siteA using cmresetsc command and start hrdb_sc again.
    Site Controller startup failed.

Cause:
    This message is logged because one of the packages managed by the Site Controller package has failed to start at this site.

Resolution:
    1. Check the log file of the package on the nodes where node switching is not enabled.
    2. Clean any stray resources owned by the package that are still online on the node.
    3. Enable node switching for the package on the nodes.
    4. Clean the site using the cmresetsc tool.
    5. Start the Site Controller package.
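
To make step 3 of the resolution above easier to follow, here is a minimal sketch that scans package log text for the "node switching is disabled" message and prints one cmmodpkg command per affected package and node. The function names and the sample log are illustrative assumptions, and real log lines may carry prefixes such as timestamps. The sketch only builds command lines and does not run them. Review each command before use, and only after the stray resources have been cleaned as described in step 2.

```python
import re

# Pattern taken from the log message shown in this table entry. It is searched
# rather than anchored, because real log lines may carry extra prefixes.
_DISABLED = re.compile(
    r"Unable to run package (\S+) on node (\S+), the node switching is disabled"
)


def disabled_switching(log_text):
    """Return (package, node) pairs in order of first appearance, no duplicates."""
    seen = []
    for pkg, node in _DISABLED.findall(log_text):
        if (pkg, node) not in seen:
            seen.append((pkg, node))
    return seen


def enable_commands(pairs):
    """Build (but do not run) one cmmodpkg command line per pair."""
    return [f"cmmodpkg -e -n {node} {pkg}" for pkg, node in pairs]


# Sample text assembled from the messages in this entry (assumed layout).
SAMPLE_LOG = """\
Executing: cmrunpkg siteA_mg1 siteA_mg2. siteA_mg3.
Unable to run package siteA_mg1 on node ccia6, the node switching is disabled.
Unable to run package siteA_mg1 on node ccia7, the node switching is disabled.
cmrunpkg: Unable to start some package or package instances.
"""

if __name__ == "__main__":
    for cmd in enable_commands(disabled_switching(SAMPLE_LOG)):
        print(cmd)
```

Keeping the parsing separate from the command construction means the list of affected nodes can be checked by hand before anything is executed.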

Log Messages:
    Failed to get remote site name, cluster might be in transient state.
    Exiting
    Or
    Failed to get local site name, cluster might be in transient state.
    Exiting.

Cause:
    This message is logged because the Serviceguard command cmviewcl failed due to cluster reformation or transient error conditions.

Resolution:
    1. Wait for the cluster to reform (until there is no node in reforming state).
    2. Restart the Site Controller package.
    If the cmviewcl command failure is due to memory, network or CPU transient error conditions, fix the issue and restart the Site Controller package.
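
The "wait for the cluster to reform" step can be sketched as a small polling loop. The example below assumes that `cmviewcl -f line` reports each node's state on a line of the form `node:<name>|state=<value>` and that a reforming node shows the value `reforming`. Both are assumptions to verify against the output on your own cluster. The function names, attempt count, and polling interval are illustrative.

```python
import subprocess
import time


def reforming_nodes(line_output):
    """Return names of nodes whose state is 'reforming' in `cmviewcl -f line` text."""
    nodes = []
    for line in line_output.splitlines():
        key, sep, value = line.strip().partition("=")
        if not sep or not key.startswith("node:"):
            continue
        ident, _, field = key.partition("|")
        # Only the node-level "state" field is of interest here; nested
        # keys (for example per-interface fields) are skipped.
        if field == "state" and value == "reforming":
            nodes.append(ident[len("node:"):])
    return nodes


def run_cmviewcl():
    """Run the real command; raises CalledProcessError if cmviewcl itself fails."""
    result = subprocess.run(
        ["cmviewcl", "-f", "line"], capture_output=True, text=True, check=True
    )
    return result.stdout


def wait_for_reformation(fetch=run_cmviewcl, attempts=30, interval=10.0,
                         sleep=time.sleep):
    """Poll until no node is reforming. Return True on success, False on timeout."""
    for _ in range(attempts):
        if not reforming_nodes(fetch()):
            return True
        sleep(interval)
    return False
```

Once `wait_for_reformation()` returns True, the Site Controller package can be restarted as described in step 2. A `CalledProcessError` from `run_cmviewcl()` corresponds to the case where cmviewcl itself fails and the underlying memory, network, or CPU condition must be fixed first.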

Log Messages:
    Error: Failed to get Site Controller EMS resource value from <node_name>
    Unable to find the site safety latch value at remote site siteB
    Or
    Error: Failed to set site safety latch value on <node_name>

Cause:
    This message is logged because the cluster is either reforming or because of transient error conditions in the EMS framework.

Resolution:
    1. Wait for the cluster to reform (until there is no node in reforming state).
    2. Restart the Site Controller package.
    3. Check that there are no issues with the EMS framework by running the following command on all the nodes:
       resls -s -q /dts/mcsc/<site_controller_package_name>
    4. If this command fails, fix the issue and restart the Site Controller package.

Log Messages:
    Failed to delete Site Controller detach flag file /etc/cmcluster/hrdb_sc/DETACH

Cause:
    This message is logged because the attempts to remove the file using the rm command failed.

Resolution:
    1. Check the file permissions for the DETACH file.
    2. Remove the file before restarting the Site Controller Package.
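
As a rough illustration of the two steps above, the sketch below reports the permissions on the DETACH file and its parent directory, then removes the file. The parent directory is included because on POSIX file systems, deleting a file requires write permission on the directory that contains it. The path comes from the log message, and the function names are hypothetical.

```python
import os
import stat

# Path as it appears in the log message for the hrdb_sc package.
DETACH_FILE = "/etc/cmcluster/hrdb_sc/DETACH"


def describe_permissions(path):
    """Return {target: (mode string, uid, gid)} for the file and its directory."""
    info = {}
    for target in (path, os.path.dirname(path) or "."):
        st = os.stat(target)
        info[target] = (stat.filemode(st.st_mode), st.st_uid, st.st_gid)
    return info


def remove_detach_file(path=DETACH_FILE):
    """Remove the flag file. Return True if removed, False if already absent."""
    try:
        os.remove(path)
    except FileNotFoundError:
        return False
    return True


if __name__ == "__main__":
    if os.path.exists(DETACH_FILE):
        for target, (mode, uid, gid) in describe_permissions(DETACH_FILE).items():
            print(f"{mode} uid={uid} gid={gid} {target}")
    print("removed" if remove_detach_file() else "already absent")
```

A `PermissionError` raised by `remove_detach_file()` points back to step 1: fix the ownership or mode shown by `describe_permissions()` before restarting the Site Controller package.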

Log Messages:
    Execution of cmviewcl -f line failed
    or
    Execution of cmviewcl -v -f line failed

Resolution:
    1. Wait for the cluster to reform (until there is no node in reforming state).
    2. Restart the Site Controller package.

Cause:
    This message is logged because the Serviceguard command cmviewcl failed due to cluster