HP XC System Software Installation Guide Version 3.1


Table 1-2 Role and Service Placement for Improved Availability (continued)

Service Name: Nagios master

Service is Delivered in This Role: management_server

Special Considerations for Role Assignment:

By default, the management_server role is installed on the head node. If you want improved availability for Nagios, the management_server role must be assigned to two nodes: the head node and one additional node. In this case, the head node cannot have the management_hub and console_network roles assigned to it, so you must move those roles to the other node in the availability set.

The other node in the availability set acts as a Nagios monitor unless the Nagios master fails over; at that time the other node acts both as a Nagios master and a Nagios monitor.

HP recommends that the other node in the availability set also has an external Ethernet connection so that you can run the Nagios Web interface on it.

For more information about the management_server role, see “Management Server Role” (page 140).

Service Name: Network Address Translation (NAT)

Service is Delivered in This Role: external

Special Considerations for Role Assignment:

To achieve improved availability of NAT, you must assign the external role to both nodes in the availability set, and both nodes must have a configured external Ethernet connection. If you assign the external role to another node in the system, it will be ignored.

During cluster_config processing, you are prompted to supply the IP addresses of the NAT servers.

For more information about the external role, see “External Role” (page 140).
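The placement rules for the Nagios master and NAT services can be checked on paper before you run cluster_config. The following is a minimal planning sketch only; it is not part of the HP XC software. The Node record, the function names, and the node names n15 and n16 are assumptions made for illustration. Only the role names come from this guide.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical planning record for one node (not an HP XC data structure)."""
    name: str
    roles: set = field(default_factory=set)
    is_head: bool = False
    has_external_ethernet: bool = False


def check_nagios_placement(head, other):
    """Return a list of problems with the Nagios master placement rules."""
    problems = []
    if not head.is_head:
        problems.append(f"{head.name}: expected the head node as the first argument")
    # The management_server role must be on the head node and one additional node.
    for node in (head, other):
        if "management_server" not in node.roles:
            problems.append(f"{node.name}: missing management_server role")
    # These two roles must move from the head node to the other node in the set.
    for role in ("management_hub", "console_network"):
        if role in head.roles:
            problems.append(f"{head.name}: move {role} to {other.name}")
    return problems


def check_nat_placement(first, second):
    """Return a list of problems with the NAT placement rules."""
    problems = []
    # Both nodes need the external role and a configured external connection.
    for node in (first, second):
        if "external" not in node.roles:
            problems.append(f"{node.name}: missing external role")
        if not node.has_external_ethernet:
            problems.append(f"{node.name}: no configured external Ethernet connection")
    return problems


# Hypothetical two-node availability set that satisfies both sets of rules.
head = Node("n16", {"management_server", "external"},
            is_head=True, has_external_ethernet=True)
peer = Node("n15", {"management_server", "management_hub",
                    "console_network", "external"},
            has_external_ethernet=True)
print(check_nagios_placement(head, peer))
print(check_nat_placement(head, peer))
```

An empty list from each function means the planned assignment is consistent with the considerations listed in Table 1-2; any other result names the node and the rule that needs attention.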

Configuring Improved Availability for the /hptc_cluster File System

To configure improved availability for the /hptc_cluster file system (service name hptc_cluster_fs), use the HP StorageWorks Scalable File Share (SFS) software, which must be purchased separately from HP. SFS is also required for successful failover of the dbserver service. During the HP XC Kickstart installation procedure, you are prompted to configure the /hptc_cluster file system on a disk somewhere other than the head node. If you purchase and configure SFS, you can locate the file system on SFS storage.

Configuring Failover Capabilities for SLURM and LSF-HPC with SLURM

Improved availability for SLURM and LSF-HPC with SLURM is not achieved through availability sets or availability tools. Failover capabilities for SLURM and LSF-HPC with SLURM are achieved by placing the resource_management role on two or more nodes. These nodes are not members of any availability set, and the SLURM and LSF-HPC with SLURM software is not managed by any availability tool.

When you assign the resource_management role to two or more nodes, SLURM availability is automatically enabled. For LSF-HPC with SLURM, however, you must enable availability manually; see “Perform LSF Postconfiguration Tasks” (page 87) for instructions.

Standard LSF also contains its own automatic failover mechanisms. See the Platform LSF documentation for more information on node failure scenarios with standard LSF.
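The resource_management rule described above reduces to counting nodes. The following planning sketch illustrates that rule under stated assumptions: the function name, the dictionary layout, and the node names are hypothetical, and the code does not query or configure SLURM or LSF in any way.

```python
def resource_management_failover(role_assignments):
    """Summarize failover status from a planned role assignment.

    role_assignments is a hypothetical dict mapping a node name to the set
    of role names planned for that node.
    """
    nodes = sorted(
        name for name, roles in role_assignments.items()
        if "resource_management" in roles
    )
    # Failover needs the resource_management role on two or more nodes.
    enabled = len(nodes) >= 2
    return {
        "nodes": nodes,
        # SLURM availability is enabled automatically in that case.
        "slurm_failover_automatic": enabled,
        # Relevant only if LSF-HPC with SLURM is installed: availability
        # must then be enabled manually as a postconfiguration task.
        "lsf_hpc_manual_step_needed": enabled,
    }


# Hypothetical plan: two nodes carry the resource_management role.
plan = {
    "n16": {"management_server"},
    "n14": {"resource_management", "compute"},
    "n13": {"resource_management", "compute"},
}
print(resource_management_failover(plan))
```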

1.9.7 Use the Improved Availability Planning Worksheet

After you have completed the advance planning of your service availability strategy, use the worksheet in Table 1-3 to record the following information:

• The node names to associate into availability sets.

• The availability tool that will manage the services in each availability set (if you installed and configured more than one availability tool).

• The roles (and thus, the services) to assign to both nodes in each availability set.

The cluster_config utility prompts you for this information, so have the worksheet handy.
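If you prefer to keep the worksheet in electronic form, the three items it records map onto a small data record. The sketch below is one possible layout, not a format that cluster_config reads; the class name, field names, node names, and the tool name example_tool are all placeholders. It assumes two nodes per availability set, as implied by "both nodes" above.

```python
from dataclasses import dataclass, field


@dataclass
class AvailabilitySetEntry:
    """One hypothetical worksheet row: nodes, managing tool, and roles."""
    node_names: tuple
    availability_tool: str
    roles: list = field(default_factory=list)

    def problems(self):
        """Return a list of gaps in this worksheet row."""
        found = []
        if len(set(self.node_names)) != 2:
            found.append("an availability set needs two distinct node names")
        if not self.roles:
            found.append("no roles recorded for this availability set")
        return found

    def summary(self):
        """Format the row as one line to read from at the prompts."""
        nodes = ", ".join(self.node_names)
        roles = ", ".join(self.roles)
        return f"nodes: {nodes} | tool: {self.availability_tool} | roles: {roles}"


# Placeholder values; substitute your own node names, tool, and roles.
entry = AvailabilitySetEntry(
    ("n15", "n16"), "example_tool", ["management_server", "external"]
)
print(entry.problems())
print(entry.summary())
```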
