HP XC System Software Installation Guide Version 3.1

Table 1-2 Role and Service Placement for Improved Availability (continued)
Columns: Service Name; Service is Delivered in This Role; Special Considerations for Role Assignment
Service Name: Nagios master
Service is Delivered in This Role: management_server
Special Considerations for Role Assignment:
By default, the management_server role is installed on the head node.
If you want improved availability for Nagios, the management_server
role must be assigned to two nodes, the head node and one additional node.
In this case, the head node cannot have the management_hub and
console_network roles assigned to it, so you must move those roles to
the other node in the availability set.
The other node in the availability set acts as a Nagios monitor unless the
Nagios master fails over; at that time the other node acts both as a Nagios
master and a Nagios monitor.
HP recommends that the other node in the availability set also has an
external Ethernet connection so that you can run the Nagios Web interface
on it.
For more information about the management_server role, see
“Management Server Role” (page 140).
Service Name: Network Address Translation (NAT)
Service is Delivered in This Role: external
Special Considerations for Role Assignment:
To achieve improved availability of NAT, you must assign the external
role to both nodes in the availability set, and both nodes must have a
configured external Ethernet connection. If you assign the external role
to another node in the system, it will be ignored.
During cluster_config processing, you are prompted to supply the IP
addresses of the NAT servers.
For more information about the external role, see “External Role”
(page 140).
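The placement rules in the two table rows above can be checked on paper before running cluster_config. The following is a minimal sketch of such a check, not part of the HP XC software: the Node structure, the function names, and the node names are hypothetical and exist only for illustration. It assumes an availability set of two nodes, as implied by the references to "both nodes" in the table.

```python
# Illustrative planning aid only. Node, MOVED_ROLES, and both check functions
# are hypothetical; they are not part of HP XC or the cluster_config utility.
from dataclasses import dataclass, field

# Roles the head node must give up when Nagios has improved availability.
MOVED_ROLES = ("management_hub", "console_network")


@dataclass
class Node:
    name: str
    roles: set = field(default_factory=set)
    is_head: bool = False
    has_external_ethernet: bool = False


def check_nagios_placement(availability_set):
    """Return a list of problems with the Nagios master placement rules."""
    problems = []
    heads = [n for n in availability_set if n.is_head]
    if len(availability_set) != 2 or len(heads) != 1:
        problems.append("expected the head node plus one additional node")
    for node in availability_set:
        if "management_server" not in node.roles:
            problems.append(f"{node.name}: management_server role is missing")
        for role in MOVED_ROLES:
            if node.is_head and role in node.roles:
                problems.append(f"{node.name}: head node must not hold {role}")
            if not node.is_head and role not in node.roles:
                problems.append(f"{node.name}: {role} should be moved here")
    return problems


def check_nat_placement(availability_set):
    """Return a list of problems with the NAT (external role) placement rules."""
    problems = []
    for node in availability_set:
        if "external" not in node.roles:
            problems.append(f"{node.name}: external role is missing")
        if not node.has_external_ethernet:
            problems.append(f"{node.name}: no configured external Ethernet connection")
    return problems


# Example with made-up node names: a plan that satisfies both sets of rules.
head = Node("head", {"management_server", "external"},
            is_head=True, has_external_ethernet=True)
peer = Node("peer", {"management_server", "management_hub",
                     "console_network", "external"},
            has_external_ethernet=True)
print(check_nagios_placement([head, peer]))
print(check_nat_placement([head, peer]))
```

An empty list from each function means the planned assignment is consistent with the considerations listed in the table. The check does not cover the recommendation that the other node have an external Ethernet connection for the Nagios Web interface, because that is advice rather than a requirement.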
Configuring Improved Availability for the /hptc_cluster File System
To configure improved availability for the /hptc_cluster file system (service name hptc_cluster_fs),
use the HP StorageWorks Scalable File Share (SFS) software, which must be purchased separately from
HP. SFS is also required for successful failover of the dbserver service. During the HP XC Kickstart
installation procedure, you are prompted to configure the /hptc_cluster file system on a disk somewhere
other than the head node. If you purchase and configure SFS, you can locate the file system on SFS storage.
Configuring Failover Capabilities for SLURM and LSF-HPC with SLURM
Improved availability for SLURM and LSF-HPC with SLURM is not achieved through availability sets or
availability tools. Failover capabilities for SLURM and LSF-HPC with SLURM are achieved by placing the
resource_management role on two or more nodes. These nodes are not members of any availability
set, and the SLURM and LSF-HPC with SLURM software is not managed by any availability tool.
When you assign the resource_management role to two or more nodes, SLURM availability is
automatically enabled. Availability for LSF-HPC with SLURM is not enabled automatically; you must
enable it manually. See “Perform LSF Postconfiguration Tasks” (page 87) for instructions.
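The relationship between the number of resource_management nodes and the resulting failover behavior can be summarized in a few lines. The sketch below is illustrative only: the function name, the dictionary keys, and the node names are hypothetical, and the code does not query or configure a real system.

```python
# Illustrative summary only. failover_status and its result keys are
# hypothetical names, not part of SLURM, LSF-HPC, or the HP XC software.
def failover_status(resource_management_nodes):
    """Summarize failover behavior for a planned resource_management placement."""
    # Count distinct nodes; listing the same node twice adds no redundancy.
    count = len(set(resource_management_nodes))
    has_failover_nodes = count >= 2
    return {
        # SLURM availability is enabled automatically with two or more nodes.
        "slurm_failover_automatic": has_failover_nodes,
        # LSF-HPC with SLURM still needs a manual postconfiguration step.
        "lsf_hpc_manual_step_required": has_failover_nodes,
    }


# Example with made-up node names.
print(failover_status(["n15", "n16"]))
```

With a single resource_management node, both values are False because no failover is possible.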
Standard LSF also contains its own automatic failover mechanisms. See the Platform LSF documentation
for more information on node failure scenarios with standard LSF.
1.9.7 Use the Improved Availability Planning Worksheet
After you have completed the advance planning of your service availability strategy, use the worksheet
in Table 1-3 to record the following information:
• The node names to associate into availability sets.
• The availability tool that will manage the services in each availability set (if you installed and
  configured more than one availability tool).
• The roles (and thus, the services) to assign to both nodes in each availability set.
The cluster_config utility prompts you for this information, so have the worksheet handy.
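If you prefer to keep the worksheet information in electronic form, the three items listed above map naturally onto a small record per availability set. The following sketch is one possible layout, not the format of Table 1-3 or of any cluster_config input: the AvailabilitySetPlan structure, the format_worksheet function, and the node and tool names are all hypothetical.

```python
# Illustrative record layout only. AvailabilitySetPlan and format_worksheet
# are hypothetical; cluster_config does not read this structure.
from dataclasses import dataclass, field


@dataclass
class AvailabilitySetPlan:
    node_names: tuple               # the nodes to associate into the set
    roles: list = field(default_factory=list)  # roles assigned to both nodes
    availability_tool: str = ""     # needed only if more than one tool is installed


def format_worksheet(plans):
    """Render one line per availability set for quick reference."""
    lines = []
    for number, plan in enumerate(plans, start=1):
        tool = plan.availability_tool or "(only one tool installed)"
        lines.append(
            f"Set {number}: nodes={', '.join(plan.node_names)}; "
            f"tool={tool}; roles={', '.join(plan.roles)}"
        )
    return "\n".join(lines)


# Example with made-up node names and a placeholder tool name.
plans = [
    AvailabilitySetPlan(("n15", "n16"), ["management_server", "external"], "tool_a"),
]
print(format_worksheet(plans))
```

Having the entries in this order matches the order in which the items are listed above, which may make it easier to answer the cluster_config prompts.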