HP-UX SNAplus2 R7 Administration Guide

Appendix D Using SNAplus2 in a High Availability Environment

The run and halt commands must be designed to allow ServiceGuard to migrate the SNAplus2 package from the primary server to the backup server. If the SNAplus2 package fails on the primary server (which is indicated by the termination of the snapmon process), ServiceGuard invokes the halt commands on the primary server. In most cases the snap stop command is sufficient, because it halts all of the SNAplus2 software. Insert this command in the customer_defined_halt_cmds section of the Package Control Script as follows:

function customer_defined_halt_cmds
{
    snap stop
}
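If you want the package log to show whether the halt succeeded, the halt function could be extended along the following lines. This is only a sketch and not part of the SNAplus2 or ServiceGuard documentation: it assumes that snap stop returns a nonzero exit status when it fails, and it uses the portable name() function form instead of the function keyword so that it runs in any POSIX shell.

```shell
customer_defined_halt_cmds() {
    # Assumption: "snap stop" exits nonzero when it cannot halt SNAplus2.
    if snap stop; then
        echo "SNAplus2 halted"
    else
        status=$?
        echo "WARNING: snap stop exited with status $status" >&2
        # Pass the failure back to the caller instead of hiding it.
        return "$status"
    fi
}
```

Whether a nonzero return should mark the package halt as failed depends on how your control script handles the exit status of this function.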

After ServiceGuard stops the SNAplus2 package on the primary server, it attempts to start the package on the backup server. Using our example, it might seem sufficient to add just the following command to start the HALS LS on the backup server:

    snapadmin start_ls, ls_name=HALS

However, this command fails if either of the following is true:

• The SNAplus2 control daemon is not running on the backup server. The SNAplus2 control daemon must always be running in order to activate an LS.

• The SNAplus2 port HAPORT is not running on the backup server.

In addition, you must make sure the following requirements are satisfied:

• The remote SNA system does not restrict which HP 9000 server can activate the same PU configuration. For example, the remote SNA system allows communication from any MAC address in a Token Ring LAN. This requirement ensures that the backup server is allowed to activate the same LS that the primary server used.

• The primary server and the backup server have compatible I/O configurations. This important requirement is explained further in Section D.3.7, I/O Compatibility Constraints.

• The backup server is not running SNAplus2 when ServiceGuard attempts to migrate the package. If the backup server is running SNAplus2, the snapadmin init_node command (the third run command in the complete command set below) will fail, because SNAplus2 allows only one node to run on a server.

With this in mind, you might be tempted to issue the command snap stop as the first run command. However, there are certain failure conditions where this command is not sufficient. If the primary server panics or loses all networking capability, it will be unable to send a message to other SNAplus2 servers that indicates the node has stopped on the primary server. In this case, SNAplus2 will refuse to start the node on the backup server until SNAplus2 recognizes that the primary server is down. This time period can be lengthy (up to 30 minutes).

Therefore, if the backup server is running SNAplus2, it is safest to completely stop the SNAplus2 software on the backup server before issuing the activation commands. The complete command set, then, is:

function customer_defined_run_cmds
{
    snap stop
    snap start
    snapadmin init_node
    snapadmin start_port, port_name=HAPORT
    snapadmin start_ls, ls_name=HALS
}
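Because each activation command depends on the one before it (the LS needs the port, and the port needs the node and the control daemon), it can help to stop at the first failing step instead of running the remaining commands. The following is a minimal sketch of that idea, not the documented method. The run_step helper is hypothetical, the sketch assumes that snap and snapadmin return a nonzero exit status on failure, and it assumes that a failure of the initial snap stop (for example, because SNAplus2 was not running) can be ignored. It uses the portable name() function form so that it runs in any POSIX shell.

```shell
# Hypothetical helper: log one command, run it, and report a failure.
run_step() {
    echo "SNAplus2 run step: $*"
    "$@" || { echo "ERROR: step failed: $*" >&2; return 1; }
}

customer_defined_run_cmds() {
    # Assumption: a nonzero status from "snap stop" is harmless here,
    # so it does not abort the sequence.
    snap stop || true

    # Each later step needs the previous one, so stop at the first failure.
    run_step snap start                              || return 1
    run_step snapadmin init_node                     || return 1
    run_step snapadmin start_port, port_name=HAPORT  || return 1
    run_step snapadmin start_ls, ls_name=HALS        || return 1
}
```

With this arrangement the function returns nonzero as soon as a step fails, and the log shows which step it was. How ServiceGuard reacts to that status depends on how your control script checks it.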

With these commands specified in your Package Control Script, you can start an SNAplus2 LS called HALS on a backup server when the primary server fails to keep the LS active. These commands work best in an SNAplus2 client/server environment where the applications run on client systems and automatically attempt to reestablish LU-LU sessions whenever a session outage occurs. For standalone environments, you must also consider how your applications will be affected by various failures, including the failure of an entire server system.
