HP-UX SNAplus2 R7 Administration Guide

Appendix D Using SNAplus2 in a High Availability Environment
The run and halt commands must be designed to allow ServiceGuard to migrate the SNAplus2 package from the
primary server to the backup server. If the SNAplus2 package fails on the primary server (which is indicated by the
termination of the snapmon process), ServiceGuard will invoke the halt commands on the primary server. Most
often the command, snap stop, is sufficient because that command will halt all of the SNAplus2 software. Insert
this command in the customer_defined_halt_cmds section of the Package Control Script as follows:
function customer_defined_halt_cmds
{
    # Halt all of the SNAplus2 software on this server.
    snap stop
}
After ServiceGuard stops the SNAplus2 package on the primary server, it will attempt to start the package on the
backup server. Using our example, it might seem simple to just add the following command to start the
HALS LS on the backup server:
snapadmin start_ls, ls_name=HALS
But this command will fail if any of the following are true:
• The SNAplus2 control daemon is not running on the backup server. The SNAplus2 control daemon must always
  be running in order to activate an LS.
• The SNAplus2 port HAPORT is not running on the backup server.
In addition, you must make sure the following requirements are satisfied:
• The remote SNA system does not restrict which HP 9000 server can activate the same PU configuration. For
  example, the remote SNA system allows communication from any MAC address in a Token Ring LAN. This
  requirement is necessary to ensure that the backup server will be allowed to activate the same LS that the
  primary server used.
• The primary server and the backup server both have a compatible I/O configuration. This is an important
  requirement that is explained further in Section D.3.7, I/O Compatibility Constraints.
• The backup server is not running SNAplus2 when ServiceGuard attempts to migrate the package. If the backup
  server is running SNAplus2, then the third command (snapadmin init_node) will fail. The reason is that
  SNAplus2 only allows one node to run on a server.
With this in mind, you might be tempted to issue the command snap stop as the first run command. However,
there are certain failure conditions where this command is not sufficient. If the primary server panics or
loses all networking capability, it will be unable to send a message to other SNAplus2 servers that indicates
the node has stopped on the primary server. In this case, SNAplus2 will refuse to start the node on the backup
server until SNAplus2 recognizes that the primary server is down. This time period can be lengthy (up to 30
minutes).
Therefore, if the backup server is running SNAplus2, it is safest to completely stop the SNAplus2 software on
the backup server before issuing the activation commands. The complete command set, then, is:
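function customer_defined_run_cmds
{
    # Stop any SNAplus2 software already running on the backup server.
    snap stop
    # Start the SNAplus2 software, including the control daemon.
    snap start
    # Start the SNAplus2 node.
    snapadmin init_node
    # Start the port, then the LS that uses it.
    snapadmin start_port, port_name=HAPORT
    snapadmin start_ls, ls_name=HALS
}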
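The run commands above execute unconditionally, so a failure in an early step (for example, snapadmin init_node) is followed by further commands that cannot succeed. The following is a minimal sketch of one way to stop at the first failing step. It is not part of the SNAplus2 or ServiceGuard software: the function names sna_step and start_hals_on_backup are hypothetical, and the sketch assumes that snap and snapadmin return a nonzero exit status on failure, which you should verify on your system. It also assumes that a nonzero status from snap stop can be ignored, because the software may simply not be running.

```shell
# Hypothetical helper (not part of SNAplus2): run one step, report a
# failure on stderr, and pass the step's exit status back to the caller.
sna_step() {
    "$@"
    rc=$?
    if [ "$rc" -ne 0 ]; then
        echo "SNAplus2 step failed (exit $rc): $*" >&2
    fi
    return "$rc"
}

# Hypothetical wrapper: same commands, same order as the run commands,
# but stops at the first step that fails.
start_hals_on_backup() {
    # Assumption: the status of snap stop is ignored, because SNAplus2
    # may not be running on the backup server at all.
    snap stop
    sna_step snap start                             || return 1
    sna_step snapadmin init_node                    || return 1
    sna_step snapadmin start_port, port_name=HAPORT || return 1
    sna_step snapadmin start_ls, ls_name=HALS       || return 1
}
```

With these definitions placed in the Package Control Script, customer_defined_run_cmds could call start_hals_on_backup in place of the individual commands, and the message written to standard error identifies which step failed.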
With these commands specified in your Package Control Script, you will be able to start an SNAplus2 LS called
HALS on a backup server when the primary server fails to keep the LS active. These commands work best in
an SNAplus2 client/server environment where the applications run on client systems and automatically attempt to
reestablish LU-LU sessions anytime a session outage occurs. For standalone environments, you will also have to
consider how your applications will be impacted by various failures, including entire server system failures.
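The automatic reestablishment of sessions described above amounts to a retry loop on the application side. The following is a generic sketch of such a loop in shell, not an SNAplus2 interface: the function name retry_session is hypothetical, and the command that actually opens the LU-LU session is application-specific and must be supplied by the caller.

```shell
# Hypothetical helper: retry a command up to max_tries times, pausing
# delay seconds between attempts. Returns 0 on the first success and
# 1 if every attempt fails.
retry_session() {
    max_tries=$1
    delay=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$max_tries" ]; do
        # "$@" is the caller-supplied command that opens the session.
        if "$@"; then
            return 0
        fi
        attempt=$((attempt + 1))
        # Pause only when another attempt will follow.
        if [ "$attempt" -le "$max_tries" ]; then
            sleep "$delay"
        fi
    done
    return 1
}
```

A caller would invoke it as retry_session 10 30 followed by its own session command. Because the backup server needs time to start the node, port, and LS, the retry period should be long enough to cover a package migration.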