ClusterPack Index of Tutorial Sections

Index | Administrators Guide | Users Guide | Tool Overview | Related Documents | Dictionary

Administrators Guide
1.0 ClusterPack Install QuickStart
1.1 ClusterPack General Overview
1.2 Comprehensive Install Instructions
1.3 Installation and Configuration of Optional Components
1.4 Software Upgrades and Reinstalls
1.5 Golden Image Tasks
1.6 System Maintenance Tasks
1.7 System Monitoring Tasks
1.8 Workload Management Tasks
1.9 System Troubleshooting Tasks

Users Guide 2.
Dictionary of Cluster Terms

Back to Top

Copyright 1994-2004 Hewlett-Packard Company
1.0 ClusterPack Install QuickStart
Step Q1 Fill Out the ClusterPack Installation Worksheet Print out this form and fill out all information for each node in your cluster. Installation Worksheet (pdf) Note: You will not be able to complete the following steps if you have not collected all of this information. For more information, see the Comprehensive Instructions for this step.
of the operating environment. The minimum release versions required are:

- MySQL Version 3.23.58 or higher
- Perl Version 5.8 or higher

For more information, see the Comprehensive Instructions for this step.

References:
- Step 2 Install Prerequisites

Back to Top

Step Q3 Allocate File System Space

Allocate file system space on the Management Server. Minimum requirements are listed below.
For more information, see the Comprehensive Instructions for this step.

References:
- Step 4 Obtain a License File

Back to Top

Step Q5 Prepare Hardware Access

Get a serial console cable long enough to reach all the Compute Nodes from the Management Server.

Note: If you are installing ClusterPack on Compute Nodes for the first time, DO NOT power up the systems; ClusterPack will do that for you automatically. If you do accidentally power up the compute nodes, DO NOT answer the HP-UX boot questions.
Back to Top

Step Q7 Configure the ProCurve Switch

- Select an IP address from the same IP subnet that will be used for the Compute Nodes.
- Connect a console to the switch.
- Log onto the switch through the console.
- Type 'set-up'.
- Select IP Config and select the "manual" option.
- Select the IP address field and enter the IP address to be used for the switch.

For more information, see the Comprehensive Instructions for this step.
For more information, see the Comprehensive Instructions for this step.
References:
- Step 11 Run mp_register on the Management Server

Back to Top

Step Q12 Power up the Compute Nodes

Use the clbootnodes program to power up all Compute Nodes whose connected Management Processor you specified in the previous step. Provide the following information to the clbootnodes program:

- Language to use
- Host name
- Time and time zone settings
- Network configuration
- Root password

For more information, see the Comprehensive Instructions for this step.
repeat the installation process, performing all steps in the order specified. For more information, see the Comprehensive Instructions for this step.
1.1 ClusterPack General Overview

1.1.1 ClusterPack Overview
1.1.2 Who should use the material in this tutorial?
1.1.3 What is the best order to review the material in the tutorial?
1.1.4 Operating System and Operating Environment Requirements
1.1.5 System Requirements
1.1.
options of Gigabit Ethernet or HyperFabric2. The common components of a cluster are:

- Head Node - provides user access to the cluster. In smaller clusters, the Head Node may also serve as a Management Server.
- Management Server - server that provides a single point of management for all system components in the cluster.
- Cluster LAN/switch - usually an Ethernet network used to monitor and control all the major system components. May also handle traffic to the file server.
transferring data between nodes. A cluster LAN is also configured to separate the system management traffic from application message passing and file serving traffic.

Management Software and Head Node

The ability to manage and use a cluster as easily as a single compute system is critical to the success of any cluster solution. To facilitate ease of use for both system administrators and end-users, HP has created a software package called ClusterPack.
11i Version 2.0. The ClusterPack has a server component that runs on a Management Server, and client agents that run on the managed Integrity compute servers.
The Data Dictionary contains definitions for common terms that are used throughout the tutorial.

Back to Top

1.1.3 What is the best order to review the material in the tutorial?

System Administrators

Initial installation and configuration of the cluster requires a complete understanding of the steps involved and the information required. Before installing a new cluster, the system administrator should read and understand all of the steps before beginning the actual installation.
a link to the printable version at the bottom of the page.

References:
- Printable Version

Back to Top

1.1.4 Operating System and Operating Environment Requirements

The key components of the HP Integrity Server Technical Cluster are:

- Management Server: HP Integrity server with HP-UX 11i Version 2.0 TCOE
- Compute Nodes: HP Integrity servers with HP-UX 11i Version 2.0 TCOE
- Cluster Management Software: ClusterPack V2.3

The following prerequisites are assumed:

- HP-UX 11i V2.
Back to Top
1.2 Comprehensive Install Instructions
Processor. Verify the Management Server and the initial Compute Node.

Configure the remaining Compute Nodes with a Golden Image:

- Create a Golden Image.
- Add nodes to the configuration that will receive the Golden Image.
- Distribute the Golden Image to remaining nodes.
- Install and configure the Compute Nodes that received the Golden Image.
- Verify the final cluster configuration.

These processes are further broken down into a number of discrete steps.
Note: You will not be able to complete the following steps if you have not collected all of this information.

Details

At various points during the configuration you will be queried for the following information:

- DNS Domain name [ex. domain.com]
- NIS Domain name [ex. hpcluster]
- Network Connectivity:
  - Information on which network cards in each Compute Node connect to the Management Server
  - Information on which network card in the Management Server connects to the Compute Nodes.
- HP-UX 11i V2.0 TCOE

ClusterPack depends on certain open source software which is normally installed as a part of the operating environment. The minimum release versions required are:

- MySQL Version 3.23.58 or higher
- Perl Version 5.8 or higher

The Management Server requires a minimum of two LAN connections. One connection must be configured prior to installing ClusterPack. The Compute Nodes must have Management Processor (MP) cards.

Details

Install these items when you do a fresh install of HP-UX.
- /share - 500MB (Clusterware edition only)

Details

Allocate space for these file systems when you do a fresh install of HP-UX on the Management Server.

To resize /opt:

1. Go to single user mode.
   % /usr/sbin/shutdown -r now
2. Interrupt auto boot.
3. Select the EFI shell.
4. Select the appropriate file system. (Should be fs0: but may be fs1:)
   Shell> fs0:
5. Boot HP-UX.
   fs0:\> hpux
6. Interrupt auto boot.
7. Boot to single user mode.
   HPUX> boot vmunix -is
8. Determine the lvol of /opt.
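The excerpt ends before the actual resize commands. As a hedged sketch only: assuming /opt is a VxFS file system on /dev/vg00/lvol5 (verify the real logical volume with bdf /opt before rebooting; your lvol number may differ), the remaining steps from single-user mode typically look like this:

```
# Sketch, not verified against this release. Run only while /opt is unmounted.
# Grow the logical volume to the new size in MB (4096 MB here is illustrative).
% /usr/sbin/lvextend -L 4096 /dev/vg00/lvol5

# Extend the file system to fill the enlarged logical volume (raw device).
% /usr/sbin/extendfs -F vxfs /dev/vg00/rlvol5
```

After extending, reboot normally and confirm the new size with bdf /opt.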
Step 4 Obtain a License File Background For ClusterPack Base Edition, please refer to the Base Edition License certificate for instructions on redeeming your license. For ClusterPack Clusterware Edition, you will need to redeem BOTH the Base Edition license certificate AND the Clusterware Edition license certificate. You will need TWO license files in order to run manager_config.
This document does not cover hardware details. It is necessary, however, to make certain hardware preparations in order to run the software.

Overview

Get a serial console cable long enough to reach all the Compute Nodes from the Management Server.

Details

To allow the Management Server to aid in configuring the Management Processors, it is necessary to have a serial console cable to connect the serial port on the Management Server to the console port on the Management Processor to be configured.
Step 7 Configure the ProCurve Switch

Background

The ProCurve Switch is used for the management network of the cluster.

Overview

The IP address for the ProCurve Switch should be selected from the same IP subnet that will be used for the Compute Nodes.

Details

- Select an IP address from the same IP subnet that will be used for the Compute Nodes.
Step 9 Install ClusterPack on the Management Server

Background

The ClusterPack software is delivered on a DVD.

Overview

- Mount and register the ClusterPack DVD as a software depot.
- Install the ClusterPack Manager software (CPACK-MGR) using swinstall.
- Leave the DVD in the DVD drive for the next step.

Details

How to mount a DVD on a remote system to a local directory

On the system with the DVD drive (i.e. the remote system):

1. Mount the DVD.
   % mount /dev/dsk/xxx /mnt/dvdrom
2.
When you are finished, on the local machine:

6. Unmount the DVD file system.
   % /etc/umount /mnt/dvdrom

On the remote system:

7. Unexport the DVD file system.
   % exportfs -u -i /mnt/dvdrom
8. Unmount the DVD.
   % /etc/umount /mnt/dvdrom

How to enable a DVD as a software depot

During the installation process, two DVDs will be required. Generic instructions for making a DVD accessible as a software depot for installation onto the Management Server are provided here.
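The generic registration step can be sketched with the standard HP-UX Software Distributor tools (a hedged example; the /mnt/dvdrom path assumes the mount point used above):

```
# Register the mounted DVD as a software depot (sketch; path from the mount step above)
% /usr/sbin/swreg -l depot /mnt/dvdrom

# Verify the depot is registered, then list the products it contains
% /usr/sbin/swlist -l depot
% /usr/sbin/swlist -s /mnt/dvdrom
```

Once registered, the depot path can be used as the source argument to swinstall in the next step.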
% /usr/sbin/swinstall -s :/mnt/dvdrom CPACK-MGR

The ClusterPack DVD will be referenced again in the installation process. Please leave it in the DVD drive until the "Invoke /opt/clusterpack/bin/manager_config on Management Server" step has completed.

Back to Top

Step 10 Run manager_config on the Management Server

Background

This program is the main installation and configuration driver. It should be executed on the Management Server.
Provide the following information to the manager_config program:

- The path to the license file(s)
- Whether to store passwords
- The DNS domain and NIS domain for the cluster
- The host name of the manager and the name of the cluster
- The cluster LAN interface on the Management Server
- The count and starting IP address of the Compute Nodes
- Whether to mount a home directory
- The SCM admin password
- The LSF admin password
manager_config Invocation

manager_config is an interactive tool that configures the Management Server based on some simple queries (most of the queries have default values assigned, and you just need to press RETURN to accept those default values).
When you telnet to an MP, you will initially access the console of the associated server. Other options such as remote console access, power management, remote re-boot operations, and temperature monitoring are available by typing control-B from the console mode. It is also possible to access the MP as a web console. However, before it is possible to access the MP remotely, it is first necessary to assign an IP address to each MP.
console port on the MP card of each Compute Node. When you are ready to run mp_register, use this command: % /opt/clusterpack/bin/mp_register Back to Top Step 12 Power up the Compute Nodes Background The clbootnodes utility is intended to ease the task of booting Compute Nodes for the first time. To use clbootnodes, the nodes' MP cards must have been registered and/or configured with mp_register.
When booting a node, clbootnodes answers the first boot questions so that you do not have to answer them manually. The questions are answered using the following information:

- Language selection: All language selection options are set to English.
- Keyboard selection: The keyboard selection is US English.
- Timezone: The time zone information is determined based on the setting of the Management Server.
- Time: The current time is accepted.
Background

This tool is the driver that installs and configures appropriate components on every Compute Node.

- Registers Compute Nodes with SCM and SIM on the Management Server.
- Pushes agent components to all Compute Nodes.
- Sets up each Compute Node as an NTP client, NIS client, and NFS client.
- Starts necessary agents on each of the Compute Nodes.
- Modifies configuration files on all Compute Nodes to enable auto-startup of agents after reboots.
% /opt/clusterpack/bin/compute_config

Back to Top

Step 14 Set up HyperFabric (optional)

Background

The utility clnetworks assists in setting up a HyperFabric network within a cluster. For clnetworks to recognize the HyperFabric (clic) interface, it is necessary to first install the drivers and/or kernel patches that are needed. Once the clic interface is recognized by lanscan, clnetworks can be used to set (or change) the IP address and configure the card.
systems.

Overview

If the InfiniBand IPoIB drivers are installed prior to running compute_config, the InfiniBand HCAs are detected and the administrator is given a chance to configure them. The administrator can also configure the InfiniBand HCAs with IP addresses by invoking /opt/clusterpack/bin/clnetworks. See the man pages for clnetworks for usage instructions.
the basic state of a computer system. An image does not generally include all files, however. By default, / and other temporary files, network directories, and host-specific configuration files are not included. A system image may be referred to as a golden image or a recovery image. The different names used to refer to the image reflect the different reasons for creating it.
Background This command adds the new node with the specified host name and IP address to the cluster. It also reconfigures all of the components of ClusterPack to accommodate the newly added node. Details Invoke /opt/clusterpack/bin/manager_config with the "add node" option (-a). You can include multiple host:ip pairs if you need to.
Back to Top

Step 20 Install and Configure the remaining Compute Nodes

Background

This tool is the driver that installs and configures appropriate components on every Compute Node.

Overview

Perform this process in the same way as configuring the first Compute Node.

Details

Use the following command to install and configure a Compute Node that received the Golden Image. Perform this for all nodes. You can specify multiple nodes on the command line.
Finalize and validate the installation and configuration of the ClusterPack software.
1.3 Installation and Configuration of Optional Components

1.3.1 HP-UX IPFilter
1.3.2 External /home File Server
1.3.3 Adding Head Nodes to a ClusterPack cluster
1.3.4 Set up TCP-CONTROL
1.3.
Nodes in a private IP sub-net (10.x.y.z range, 192.168.p.q range), which also alleviates the need for numerous public IP addresses.

IP Aliasing or Network Address Translation (NAT)

ClusterPack comes with HP-UX IPFilter, a software component with powerful packet filtering and firewalling capabilities. One of the features that it supports is Network Address Translation. For more information on HP-UX IPFilter, please refer to the HP-UX IPFilter manual and release notes at docs.hp.com: http://docs.hp.
HP-UX IPFilter Validation

HP-UX IPFilter is installed with the default HP-UX 11i V2 TCOE bundle. To validate its installation, run the following command:

% swverify B9901AA

Automatic setup of HP-UX IPFilter rules

ClusterPack V2.3 provides a utility called nat.server to automatically set up the NAT rules, based on the cluster configuration. This tool can be invoked as follows:

% /opt/clusterpack/lbin/nat.
% man 8 ipf

- List the input and output filter rules:

% ipfstat -hio

Setup the NAT rules

In this section, we will walk through the steps of setting up HP-UX IPFilter rules that translate the source IP addresses of all packets from the compute private subnet to the IP address of the gateway node. For adding more sophisticated NAT rules, please refer to the IPFilter documentation.

1. Create a file with NAT rules.

Example 1: Map packets from all Compute Nodes in the 192.168.0.x subnet to a single IP address 15.99.84.
map lan0 192.168.0.4/32 -> 15.99.84.23/32 portmap tcp/udp 40000:60000
map lan0 192.168.0.4/32 -> 15.99.84.23/32
EOF

More examples of NAT and other IPFilter rules are available at /opt/ipf/examples.

2. Enable NAT based on this rule set:

% ipnat -f /tmp/nat.rules

Note: If there are existing NAT rules that you want to replace, you must flush and delete that rule set before loading the new rules:

% ipnat -FC -f /tmp/nat.rules

For more complicated manipulations of the rules, refer to the ipnat man pages.
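Once loaded, the active rules and translation sessions can be inspected with the standard IPFilter listing options (a brief sketch):

```
# List the NAT rules currently loaded and any active address mappings
% ipnat -l

# Show NAT statistics (mapped sessions, rule matches, failures)
% ipnat -s
```

If the rules you loaded from /tmp/nat.rules do not appear in the ipnat -l output, re-check the rule file syntax before testing connectivity.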
If there is no packet loss, then NAT is enabled.

- DISPLAY Server Interaction Test

1. On the Compute Node, set the DISPLAY variable to a display server that is not part of the cluster, for instance your local desktop.
   % setenv DISPLAY 15.99.22.42:0.0 (if it is csh)
2. Try to bring up an xterm on the DISPLAY server:
   % xterm &

If the xterm is brought up on the DISPLAY server, then NAT is enabled.

References:
- 3.6.1 Introduction to NAT (Network Address Translation)

Back to Top

1.3.
The default use model of a ClusterPack cluster is that end users will submit jobs remotely through the ClusterWare GUI or by using the ClusterWare CLI from the Management Node. Cluster administrators generally discourage users from logging into the Compute Nodes directly. Users are encouraged to use the Management Server for accessing files and performing routine tasks.
ALL:ALL@

By uncommenting these lines, all users from the Management Server will be denied access. There is also a /etc/hosts.allow file that explicitly permits access to some users. It is configured, by default, to allow access to root and lsfadmin:

ALL:root@ALL
ALL:lsfadmin@ALL

Although the hosts.deny file disallows all access, the entries in hosts.allow override the settings of hosts.deny. The hosts.
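To grant an additional administrative account the same login access as root and lsfadmin, an entry can be appended to /etc/hosts.allow on the Compute Nodes. A hedged example follows; the user name opadmin is hypothetical and should be replaced with your own account:

```
# /etc/hosts.allow -- example additional entry (user name is illustrative)
ALL:opadmin@ALL
```

Because hosts.allow overrides hosts.deny, this single line is sufficient; no change to hosts.deny is needed.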
1.4 Software Upgrades and Reinstalls

1.4.1 Software Upgrades and Reinstalls Overview
1.4.2 Prerequisites for Software Upgrades and Reinstalls
1.4.3 Reinstallation and Configuration Steps
1.4.
Installation and Setup). The reinstallation path is only meant to ensure that all of the ClusterPack software is correctly installed and the cluster layout described by earlier invocations of manager_config is configured correctly.

References:
- 1.2.1 Comprehensive Installation Overview

ClusterPack V2.3 supports an upgrade path from ClusterPack V2.2.

Back to Top

1.4.
1.4.4 Upgrading from Base Edition to Clusterware Edition

Upgrading from Base Edition to Clusterware Edition is done using the "forced reinstall" path that is documented below. During manager_config you will be given an opportunity to provide a valid Clusterware license key. If you have a key, Clusterware will be installed and integrated into the remaining ClusterPack tools. Please obtain your Clusterware license key BEFORE reinstalling the ClusterPack software.
This tool is the main installation and configuration driver. Invoke this tool with "force install" option -F: % /opt/clusterpack/bin/manager_config -F Note: manager_config will ask for the same software depot that was used the last time the cluster was installed. If you are using the ClusterPack V2.
1.4.5 Upgrading from V2.2 to V2.3 ClusterPack V2.3 supports an upgrade path from ClusterPack V2.2. Customers that currently deploy ClusterPack V2.2 on HP Integrity servers use HP-UX 11i Version 2.0 TCOE. ClusterPack V2.3 provides a mechanism for the use of the majority of V2.2 configuration settings for the V2.3 configuration. Before starting the upgrade, it is important to have all of your Compute Nodes in good working order. All Compute Nodes and MP cards should be accessible.
% /opt/clusterpack/bin/compute_config -u z Verify that everything is working as expected.
1.5 Golden Image Tasks

1.5.1 Create a Golden Image of a Compute Node from the Management Server
1.5.2 Distribute Golden Image to a set of Compute Nodes

1.5.1 Create a Golden Image of a Compute Node from the Management Server

A system image is an archive of a computer's file system. Capturing the file system of a computer captures the basic state of a computer system.
% badmin hclose

- In addition, you should either wait until all running jobs complete, or suspend them:

% bstop -a -u all -m

- Execute sysimage_create on the Management Server and pass the name of the node from which you would like the image to be made. For example:

% /opt/clusterpack/bin/sysimage_create

- Monitor the output for possible error conditions.
To distribute a golden image to a set of Compute Nodes, you need to first register the image. To register the image, use the command: % /opt/clusterpack/bin/sysimage_register If the image was created with sysimage_create, the full path of the image was displayed by sysimage_create.
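The register-then-distribute sequence can be sketched end to end. This is a hedged outline only: the image path is hypothetical, and the exact argument order for clbootnodes -i (mentioned in section 1.6.1) should be confirmed against the man pages before use:

```
# Register the image produced by sysimage_create (path is illustrative)
% /opt/clusterpack/bin/sysimage_register /var/opt/clusterpack/images/node1.img

# Install the registered image on the target nodes (full image path required),
# then reconfigure the nodes once they are back up
% /opt/clusterpack/bin/clbootnodes -i /var/opt/clusterpack/images/node1.img
% /opt/clusterpack/bin/compute_config
```

As noted elsewhere in this guide, allow several minutes after clbootnodes returns before running compute_config so the daemons can start.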
1.6 System Maintenance Tasks

1.6.1 Add Node(s) to the Cluster
1.6.2 Remove Node(s) from the Cluster
1.6.3 Install Software in Compute Nodes
1.6.4 Remove Software from Compute Nodes
1.6.5 Update Software in Compute Nodes
1.6.6 Add Users to Compute Nodes
1.6.7 Remove Users from Compute Nodes
1.6.8 Change System Parameters in Compute Nodes
1.6.
The steps in this section have to be followed in the specified order to ensure that everything works correctly.

Step 1 Invoke /opt/clusterpack/bin/manager_config on Management Server

Invoke /opt/clusterpack/bin/manager_config with an "add node" option, -a.

% /opt/clusterpack/bin/manager_config -a :

This command adds the new node with the specified hostname and IP address to the cluster.
sysimage_register. You can see a list of registered images by executing: The full path of the image must be given to clbootnodes:

Note: After installing an image with clbootnodes -i, it may be necessary to wait several minutes after clbootnodes returns before running compute_config, as the daemons may need time to start and stabilize.

Step 4 Invoke /opt/clusterpack/bin/compute_config on Management Server

This tool is the driver that installs and configures appropriate components on every Compute Node.
correctly. Step 1 Invoke /opt/clusterpack/bin/manager_config on Management Server Invoke /opt/clusterpack/bin/manager_config with a "remove node" option -r. % /opt/clusterpack/bin/manager_config -r This command removes the node with the specified hostname from the cluster. It also reconfigures all of the components of ClusterPack to accommodate the removal of the node. The '-r' option can be repeated if more than one node needs to be removed from the system.
- Under "Tools", select "Software Management", and then double-click on "Install Software".
- Select the node(s) and/or node group to install on.
- This will bring up the swinstall GUI, from which you can specify the software source and select the software to be installed.

References:
- 3.2.3 How to Run SCM Web-based GUI

Using CLI

Software can also be installed on Compute Nodes using the /opt/clusterpack/bin/clsh tool to run the swinstall command. However, this may not work in a guarded cluster.
- To remove product PROD1 on all Compute Nodes:

% /opt/clusterpack/bin/clsh /usr/sbin/swremove PROD1

- To remove product PROD1 on just the Compute Node group "cae":

% /opt/clusterpack/bin/clsh -C cae /usr/sbin/swremove PROD1

Back to Top

1.6.5 Update Software in Compute Nodes

The process for updating software is the same as for installing software. (See "Install Software in Compute Nodes".) swinstall will verify that the software you are installing is a newer version than what is already present.
Using the CLI

To add users to the Compute Nodes, first add the user to the Management Server with the useradd command. (See useradd(1M) for more information.)

% useradd

Use ypmake to push the new user's account information to the Compute Nodes:

% /var/yp/ypmake

Back to Top

1.6.7 Remove Users from Compute Nodes

Using the SCM GUI

To remove users from the cluster, do the following:

- Select the Management Server.
1.6.8 Change System Parameters in Compute Nodes

Using the SCM GUI:

- Select one or more nodes.
- Under "Tools", select "System Administration", and then click on "System Properties".
- A SAM System Properties window will appear for each node selected.

For greater efficiency and consistency, perform this operation only on a single Compute Node; a golden image can then be created from that Compute Node and pushed to the other Compute Nodes.

References:
- 3.2.3 How to Run SCM Web-based GUI
- 1.5.
- LogicalMemory). For fine control over inventory collection, use "Advanced Settings" to select or unselect specific items.

Back to Top

1.6.10 Define Consistency Check Timetables on Compute Node Inventories

To define Compute Node inventories for consistency checks, use the SCM GUI to access the SIM GUI.

Using the SCM GUI:

- Select one or more nodes.
- Under "Tools", select "System Inventory", and then click "SysInvMgr portal". This launches the SIM GUI.

References:
- 3.2.
- This launches the SIM GUI.

References:
- 3.2.3 How to Run SCM Web-based GUI

Using the SIM GUI:

- Log in as "admin".
- Select the "Filter" folder.
- Click "Create Filter".
- Enter a name to uniquely identify the inventory filter.
- Enter an optional description.
- Select one or more categories (e.g. System, Memory, I/O Devices).
- Select one or more Groups from the selected categories (e.g. BundleContents, LogicalMemory).
Back to Top

1.6.13 Copy files within nodes in a cluster

The 'clcp' command in /opt/clusterpack/bin is used to copy files between cluster nodes. Each file or directory argument is either a remote file name of the form "%h:path" or "cluster:path", or a local file name (containing no ':' characters).
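The "%h:path" convention is easiest to see by expanding it by hand. The snippet below is a sketch only: expand_targets is a hypothetical helper written for illustration, not part of ClusterPack; it simply shows how one pattern turns into one host-qualified path per node, which is what clcp does internally for each cluster node.

```shell
# Hypothetical helper (illustration only): expand a "%h:path" pattern into
# one host-qualified path per node name given on the command line.
expand_targets() {
  pattern=$1; shift
  for h in "$@"; do
    # substitute the hostname for every "%h" in the pattern
    printf '%s\n' "$(printf '%s' "$pattern" | sed "s/%h/$h/g")"
  done
}

expand_targets "%h:/etc/hosts" node1 node2
```

Running the last line prints one target per node, mirroring how `clcp localfile "%h:/etc/hosts"` would address each cluster member.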
- List all processes on node3 and node4:

% clps -C node3+node4 -a

For more details on the usage of clps, invoke the command:

% man clps

Back to Top

1.6.15 Kill a user's process (or all of the user's processes) on some/all Cluster Nodes

The 'clkill' command in /opt/clusterpack/bin is used to kill processes on Cluster Nodes. Since using PIDs on a cluster is not feasible (the same process will have different PIDs on different hosts), clkill can kill processes by name.
The following example creates a node group "cae" containing compute cluster nodes "lucky000", "lucky001", and "lucky002": % /opt/clusterpack/bin/clgroup -a cae lucky000 lucky001 lucky002 clgroup can also form groups from existing groups. For more details on the usage of clgroup, invoke the command: % man clgroup Back to Top 1.6.17 Remove a Cluster Group Groups of Compute Nodes can be removed from ClusterPack using /opt/clusterpack/bin/clgroup.
1.6.19 Remove Nodes from a Cluster Group Compute Nodes can be removed from existing groups in ClusterPack using /opt/clusterpack/bin/clgroup. The following example removes node "lucky006" from the node group "cae" : % /opt/clusterpack/bin/clgroup -r cae lucky006 Groups can also have entire groups of nodes removed by using the name of a pre-existing group. For more details on the usage of clgroup, invoke the command: % man clgroup Back to Top 1.6.
ClusterPack Base Edition The ClusterPack Base Edition license server is based on FlexLM licensing technology. The Base Edition license server is installed and configured by the manager_config tool. The license server is started by manager_config, and it is installed to start during a normal system boot. To manually start the ClusterPack license server: % /sbin/init.d/cpack.server start To manually stop the ClusterPack license server: % /sbin/init.d/cpack.
1.7 System Monitoring Tasks

1.7.1 Get an Overview of Cluster Health
1.7.2 Get an Overview of the Job Queue Status
1.7.3 Get details on health of specific Compute Nodes
1.7.4 View Usage of Resources in Compute Node(s)
1.7.5 Monitor Compute Nodes based on resource thresholds
1.7.
- State refers to the state of the host.
- Batch State refers to the state of the host, and the state of the daemons running on that host. A detailed list of batch states is shown below.

For more information, select the online help:

- Select Help->Platform Help
- Select "View" under the "Hosts" section in the left hand pane.
- Select "Change your hostview" to see a description of the icons.

Using the Clusterware Pro V5.
have exceeded their thresholds.

- closed_Excl - The host is not accepting jobs until the exclusive job running on it completes.
- closed_Full - The host is not accepting new jobs. The configured maximum number of jobs that can run on it has been reached.
- closed_Wind - The host is not accepting jobs. The dispatch window that has been defined for it is closed.
- unlicensed - The host is not accepting jobs. It does not have a valid LSF license for sbatchd and LIM is down.
or

% bqueues -l

For more information, see the man page:

% man bqueues

Common Terms

Both the Web interface and the CLI use the same terms for the health and status of the job submission queues. These terms are used to define the State of an individual queue.

- Open - The queue is able to accept jobs.
- Closed - The queue is not able to accept jobs.
- Active - Jobs in the queue may be started.
- Inactive - Jobs in the queue cannot be started for the time being.

References:
- 3.7.
Default status from each node is available using: % bhosts STATUS shows the current status of the host and the SBD daemon. Batch jobs can only be dispatched to hosts with an ok status.
1.7.4 View Usage of Resources in Compute Node(s)

Using the Clusterware Pro V5.1 Web Interface:

From the Hosts Tab:

- Select the host to be monitored using the checkbox next to each host. More than one host can be selected.
- From the menu select Host->Monitor
- A new window will open that displays the current resource usage of one of the selected hosts.
- Four resources are displayed: total system memory, CPU utilization, swap space available, and /tmp space available.
- 3.7.9 How do I access the Clusterware Pro V5.1 Command Line Interface?

Back to Top

1.7.5 Monitor Compute Nodes based on resource thresholds

Using the Clusterware Pro V5.1 Web Interface:

From the Hosts Tab:

- From the View menu select View->Choose Columns
- Add the Available Column resource to the Displayed Columns list.
- Click OK
- The new resource to be monitored will be displayed on the Host tab screen.

Using the Clusterware Pro V5.1 CLI:

Using the lshosts command, a resource can be specified.
1.8 Workload Management Tasks

1.8.1 Add new Job Submission Queues
1.8.2 Remove Queues
1.8.3 Restrict user access to specific queues
1.8.4 Add resource constraints to specified queues
1.8.5 Change priority of specified queues
1.8.6 Add pre/post run scripts to specified queues
1.8.7 Kill a job in a queue
1.8.8 Kill all jobs owned by a user
1.8.9 Kill all jobs in a queue
1.8.
After adding, removing, or modifying queues, it is necessary to reconfigure LSF to read the new queue information. This is done from the Management Server using the Clusterware Pro V5.1 CLI:

% badmin reconfig

Verify the queue has been added by using the Clusterware Pro V5.1 CLI:

% bqueues -l

References:

- 3.7.9 How do I access the Clusterware Pro V5.1 Command Line Interface?

Back to Top

1.8.
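For reference, a queue is added by appending a stanza to the lsb.queues file discussed in the sections below. This is a minimal sketch; the queue name, priority value, and description are illustrative:

```
Begin Queue
QUEUE_NAME   = night
PRIORITY     = 30
DESCRIPTION  = Example queue for overnight batch work
End Queue
```

After saving the file, run badmin reconfig as shown above so LSF picks up the new queue.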
Back to Top

1.8.3 Restrict user access to specific queues

Using the Clusterware Pro V5.1 CLI: The file /share/platform/clusterware/conf/lsbatch/<cluster_name>/configdir/lsb.queues controls which users can submit to a specific queue. The name of your cluster can be determined by using the Clusterware Pro V5.1 CLI:

% lsid

Edit the lsb.queues file and look for a USERS line for the queue you wish to restrict. If a USERS line exists, you can add or remove users from it.
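As a sketch, a USERS line inside a queue definition might look like the following; the queue name, user names, and group name are hypothetical:

```
Begin Queue
QUEUE_NAME   = chem
USERS        = alice bob chemgrp
End Queue
```

Only the listed users (or user groups) can then submit to the queue. Remember to run badmin reconfig after editing the file.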
% lsid

Find the queue definition you wish to modify. The following entries for maximum resource usage can be modified or added for each queue definition:

- CPULIMIT = minutes on a host
- FILELIMIT = file size limit
- MEMLIMIT = bytes per job
- DATALIMIT = bytes for data segment
- STACKLIMIT = bytes for stack
- CORELIMIT = bytes for core files
- PROCLIMIT = processes per job

RES_REQ is a resource requirement string specifying the condition for dispatching a job to a host.
Add a line of the form PRIORITY = <value> to the queue definition. Queues with higher priority values are searched first during scheduling. After adding, removing, or modifying queues, it is necessary to reconfigure LSF to read the new queue information. This is done from the Management Server using the Clusterware Pro V5.1 CLI:

% badmin reconfig

Verify the queue has been modified by using the Clusterware Pro V5.1 CLI:

% bqueues -l

References:

- 1.8.1 Add new Job Submission Queues
- 3.7.
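Putting the entries above together, a queue definition with resource limits and a priority might look like the following sketch; all names and values are illustrative:

```
Begin Queue
QUEUE_NAME   = short
PRIORITY     = 40
CPULIMIT     = 60      # minutes on a host
MEMLIMIT     = 200000  # per-job memory limit
PROCLIMIT    = 8       # processes per job
End Queue
```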
to the queue definition. The command or tool should be accessible and runnable on all nodes that the queue services. After adding, removing, or modifying queues, it is necessary to reconfigure LSF to read the new queue information. This is done from the Management Server using the Clusterware Pro V5.1 CLI:

% badmin reconfig

Verify the queue has been modified by using the Clusterware Pro V5.1 CLI:

% bqueues -l

References:

- 1.8.1 Add new Job Submission Queues

Back to Top

1.8.
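A sketch of pre- and post-execution commands in a queue definition; the script paths are hypothetical and, as noted above, must exist on every node the queue services:

```
Begin Queue
QUEUE_NAME   = staged
PRE_EXEC     = /share/scripts/stage_in.sh
POST_EXEC    = /share/scripts/stage_out.sh
End Queue
```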
Users can kill their own jobs. Queue administrators can kill jobs associated with a particular queue.

References:

- 3.7.9 How do I access the Clusterware Pro V5.1 Command Line Interface?

Back to Top

1.8.9 Kill all jobs in a queue

Using the Clusterware Pro V5.1 CLI: All of the jobs in a queue can be killed by using the bkill command with the -q option:

% bkill -q <queue_name> -u all 0

Users can kill their own jobs. Queue administrators can kill jobs associated with a particular queue.

References:

- 3.
1.8.11 Suspend all jobs owned by a user

Using the Clusterware Pro V5.1 CLI: All of a user's jobs can be suspended using the special 0 job id:

% bstop -u <user_name> 0

Users can suspend their own jobs. Queue administrators can suspend jobs associated with a particular queue.

References:

- 3.7.9 How do I access the Clusterware Pro V5.1 Command Line Interface?

Back to Top

1.8.12 Suspend all jobs in a queue

Using the Clusterware Pro V5.
- 3.7.9 How do I access the Clusterware Pro V5.1 Command Line Interface?

Back to Top

1.8.14 Resume all suspended jobs owned by a user

Using the Clusterware Pro V5.1 CLI: All of a user's jobs can be resumed using the special 0 job id:

% bresume -u <user_name> 0

Users can resume their own jobs. Queue administrators can resume jobs associated with a particular queue.

References:

- 3.7.9 How do I access the Clusterware Pro V5.1 Command Line Interface?

Back to Top

1.8.
ClusterPack System Troubleshooting Tasks

1.9.1 Locate a Compute Node that is down
1.9.2 Get to the console of a Compute Node that is down
1.9.3 Bring up a Compute Node with a recovery image
1.9.4 View system logs for cause of a crash
1.9.5 Bring up the Management Server from a crash
1.9.6 Troubleshoot SCM problems
1.9.7 Replace a Compute Node that has failed with a new machine
1.9.
% lshosts -l
% bhosts -l

References:

- 1.7.1 Get an Overview of Cluster Health
- 1.7.3 Get details on health of specific Compute Nodes
- 3.7.8 How do I access the Clusterware Pro V5.1 Web Interface?
- 3.7.9 How do I access the Clusterware Pro V5.1 Command Line Interface?

Back to Top

1.9.
This will reboot the machine and cause it to install from the golden image you specified.

References:

- 1.5.2 Distribute Golden Image to a set of Compute Nodes

Back to Top

1.9.4 View system logs for cause of a crash

The system logs are located in /var/adm/syslog/syslog.log. The crash logs are stored in /var/adm/crash. The installation and configuration logs for ClusterPack are stored in /var/opt/clusterpack/log.

Back to Top

1.9.
When I try to add a node, I get "Properties file for doesn't exist."

Solution:

- Make sure that the hostname is fully qualified in /etc/hosts on both the Management Server and the managed node, if it exists in /etc/hosts, and that any shortened host names are aliases instead of primary names. For example:

  10.1.2.3 cluster.abc.com cluster

  should be used instead of:

  10.1.2.3 cluster

- Make sure that AgentConfig is installed on the managed node, and that mxrmi and mxagent are running.
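To check whether the primary names in a hosts file are fully qualified, as the solution above requires, a quick scan can be scripted. The sample file below stands in for /etc/hosts in this sketch:

```shell
# Print primary host names that are NOT fully qualified (contain no dot).
# /tmp/hosts.sample stands in for /etc/hosts in this illustration.
cat > /tmp/hosts.sample <<'EOF'
10.1.2.3 cluster.abc.com cluster
10.1.2.4 node1
EOF
awk '$1 !~ /^#/ && NF >= 2 && $2 !~ /\./ {print $2}' /tmp/hosts.sample
```

Any name printed by the scan should be changed so that the fully qualified name comes first and the short name becomes an alias.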
with a new name and IP address.

Replacing with a new hostname and IP address: In this case, the replacement node is handled simply by removing the failed node and adding the new node. Remove the failed node from the cluster using the following commands:

% manager_config -r
% compute_config -r

The node's MP will automatically be removed from the MP register database.
ClusterPack Job Management Tasks

2.1.1 Invoke the Workload Management Interface from the Management Server
2.1.2 Invoke the Workload Management Interface from the intranet
2.1.3 Prepare for job submission
2.1.4 Submit a job to a queue
2.1.5 Submit a job to a group
2.1.6 Set a priority for a submitted job
2.1.7 Check the status of a submitted job
2.1.8 Check the status of all submitted jobs
2.1.
- Go to the following URL in the web browser:

% /opt/netscape/netscape http://<management_server>:8080/Platform/login/Login.jsp

- Enter your Unix user name and password. This assumes that the gaadmin services have been started by the LSF Administrator.

Note: The user submitting a job must have access to the Management Server and to all the Compute Nodes that will execute the job. To prevent security problems, the super user account (i.e. root) cannot submit any jobs.

References:

- 3.7.
Using the Clusterware Pro V5.1 Web Interface: From the Jobs tab:

- Select Job->Submit.
- Enter job data.
- Click Submit.

Data files required for the job may be specified using the '-f' option to the bsub command. This optional information can be supplied on the "Advanced" tab within the Job Submission screen. For an explanation of the '-f' options, please see "Transfer a file from intranet to specific Compute Nodes in the cluster".

Using the Clusterware Pro V5.
Using the Clusterware Pro V5.1 CLI:

% bsub -q <queue_name> <command>

Use bqueues to list available queues:

% bqueues

References:

- 3.7.8 How do I access the Clusterware Pro V5.1 Web Interface?
- 3.7.9 How do I access the Clusterware Pro V5.1 Command Line Interface?

Back to Top

2.1.5 Submit a job to a group

Using the Clusterware Pro V5.1 Web Interface: From the Jobs tab:

- Select Job->Submit.
- Enter relevant Job information.
- Select the "Resources" tab.
Using the Clusterware Pro V5.1 Web Interface: Set a priority at submission:

- From the Jobs Tab, select Job->Submit.
- Using the Queue pull-down menu, select a queue with a high priority.

After submission:

- From the Jobs Tab, select a job from the current list of pending jobs.
- Select Job->Switch Queue.
- Switch the job to a queue with a higher priority.

The relative priority of the different Queues can be found on the "Queue Tab".

Using the Clusterware Pro V5.
% bjobs
% bjobs -l

References:

- 3.7.8 How do I access the Clusterware Pro V5.1 Web Interface?
- 3.7.9 How do I access the Clusterware Pro V5.1 Command Line Interface?

Back to Top

2.1.8 Check the status of all submitted jobs

Using the Clusterware Pro V5.1 Web Interface: From the Jobs tab:

- Review the Jobs table.
- Use the Previous and Next buttons to view more jobs.

Using the Clusterware Pro V5.1 CLI:

% bjobs
% bjobs -l

References:

- 3.7.
2.1.10 Register for notification on completion of a submitted job

Using the Clusterware Pro V5.1 Web Interface: From the Jobs tab:

- Select Job->Submit.
- Click Advanced.
- Select "done" from "Send email notification when job is".
- Enter the email address in the "email to" field.

Using the Clusterware Pro V5.1 CLI: Using the CLI, users are automatically notified when a job completes.

References:

- 3.7.8 How do I access the Clusterware Pro V5.1 Web Interface?
- 3.7.9 How do I access the Clusterware Pro V5.
2.1.12 Kill all jobs submitted by the user

Using the Clusterware Pro V5.1 Web Interface: From the Jobs tab:

- Select Tools->Find.
- Select User from the Field list.
- Type the user name in the Value field.
- Click Find.
- Click Select All.
- Click Kill.

Using the Clusterware Pro V5.1 CLI:

% bkill -u <user_name> 0

References:

- 3.7.8 How do I access the Clusterware Pro V5.1 Web Interface?
- 3.7.9 How do I access the Clusterware Pro V5.1 Command Line Interface?

Back to Top

2.1.
References:

- 3.7.8 How do I access the Clusterware Pro V5.1 Web Interface?
- 3.7.9 How do I access the Clusterware Pro V5.1 Command Line Interface?

Back to Top

2.1.14 Suspend a submitted job in a queue

Using the Clusterware Pro V5.1 Web Interface: From the Jobs tab:

- Select the job from the Jobs table.
- Select Job->Suspend.

Using the Clusterware Pro V5.1 CLI:

% bstop <job_id>

References:

- 3.7.8 How do I access the Clusterware Pro V5.1 Web Interface?
- 3.7.9 How do I access the Clusterware Pro V5.
References:

- 3.7.8 How do I access the Clusterware Pro V5.1 Web Interface?
- 3.7.9 How do I access the Clusterware Pro V5.1 Command Line Interface?

Back to Top

2.1.16 Suspend all jobs submitted by the user in a queue

Using the Clusterware Pro V5.1 Web Interface: From the Jobs tab:

- Select Tools->Find.
- Select the Advanced tab.
- Select User from the Field list in the Define Criteria section.
- Type the user name in the Value field.
- Click <<.
- Select Queue from the Field list.
- Select Job->Resume.

Using the Clusterware Pro V5.1 CLI:

% bresume <job_id>

References:

- 3.7.8 How do I access the Clusterware Pro V5.1 Web Interface?
- 3.7.9 How do I access the Clusterware Pro V5.1 Command Line Interface?

Back to Top

2.1.18 Resume all suspended jobs submitted by the user

Using the Clusterware Pro V5.1 Web Interface: From the Jobs tab:

- Select Tools->Find.
- Select the Advanced tab.
- Select User from the Field list in the Define Criteria section.
Using the Clusterware Pro V5.1 Web Interface: From the Jobs tab:

- Select Tools->Find.
- Select the Advanced tab.
- Select User from the Field list in the Define Criteria section.
- Type the user name in the Value field.
- Click <<.
- Select Queue from the Field list.
- Select the queue from the Queue list.
- Click <<.
- Click Find.
- Click Select All.
- Click Resume.

Using the Clusterware Pro V5.1 CLI:

% bresume -u <user_name> -q <queue_name> 0

References:

- 3.7.8 How do I access the Clusterware Pro V5.
- 3.7.9 How do I access the Clusterware Pro V5.1 Command Line Interface?

Back to Top

2.1.21 Suspend a submitted MPI job

Using the Clusterware Pro V5.1 Web Interface: From the Jobs tab:

- Select the job from the Jobs table.
- Select Job->Suspend.

Using the Clusterware Pro V5.1 CLI:

% bstop <job_id>

References:

- 3.7.8 How do I access the Clusterware Pro V5.1 Web Interface?
- 3.7.9 How do I access the Clusterware Pro V5.1 Command Line Interface?

Back to Top

2.1.
ClusterPack File Transfer Tasks

2.2.1 Transfer a file from intranet to the Management Server in the cluster
2.2.2 Transfer a file from intranet to all Compute Nodes in the cluster
2.2.3 Transfer a file from intranet to specific Compute Nodes in the cluster
2.2.4 Transfer a file from a Compute Node to a system outside the cluster
2.2.
Back to Top

2.2.2 Transfer a file from intranet to all Compute Nodes in the cluster

If the cluster is a Guarded Cluster, this operation is done in two steps:

- FTP the file to the Management Server.
- Copy the file to all nodes in the cluster:

% clcp /a/input.data %h:/data/input.data
% clcp /a/input.data cluster:/data/input.data

For more details on the usage of clcp, invoke the command:

% man clcp

References:

- 2.2.1 Transfer a file from intranet to the Management Server in the cluster

Back to Top

2.
- < : Copies the remote file to the local file after the job completes. Overwrites the local file if it exists.
  % bsub -f <
- << : Appends the remote file to the local file after the job completes. The local file must exist.
  % bsub -f <<
- >< : Copies the local file to the remote file before the job starts. Overwrites the remote file if it exists. Then copies the remote file to the local file after the job completes. Overwrites the local file.
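As a hedged illustration of the full syntax (in LSF-based products, -f generally takes a quoted "local_file operator remote_file" string; verify against the bsub man page on your cluster, and note the file names here are hypothetical):

```
% bsub -f "/home/user/results.out < /tmp/results.out" my_job
```

This would copy the remote /tmp/results.out back to /home/user/results.out after my_job completes, per the < operator described above.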
- FTP the file from the Head node to the external target.

References:

- Guarded Cluster

Back to Top

2.2.5 Transfer a file from a Compute Node to another Compute Node in the cluster

The 'clcp' command in /opt/clusterpack/bin is used to copy files between cluster nodes. This command can be invoked either from the Management Server or any Compute Node.

[From the Management Server]
% clcp node1:/a/data node2:/b/data

Back to Top

2.2.
For more details on the usage of clcp, invoke the command:

% man clcp

Back to Top
ClusterPack Miscellaneous Tasks

2.3.1 Run a tool on a set of Compute Nodes
2.3.2 Check resource usage on a Compute Node
2.3.3 Check Queue status
2.3.4 Remove temporary files from Compute Nodes
2.3.5 Prepare application for checkpoint restart
2.3.6 Restart application from a checkpoint if a Compute Node crashes
2.3.7 Determine if the application fails to complete
2.3.
Using the Clusterware Pro V5.1 Web Interface: From the Jobs tab:

- Select Jobs->Submit.
- Enter job information.
- Click Advanced.
- On the Advanced dialog, enter script details in the Pre-execution command field.
- Click OK.
- Click Submit.

Using the CLI:

% bsub -E 'pre_exec_cmd [args ...]' command

References:

- 3.7.8 How do I access the Clusterware Pro V5.1 Web Interface?
- 3.7.9 How do I access the Clusterware Pro V5.1 Command Line Interface?

Back to Top

2.3.
2.3.3 Check Queue status

Using the Clusterware Pro V5.1 Web Interface: From the Jobs tab:

- Review the Queues table. Use the Previous and Next buttons to view more Queues.

Using the Clusterware Pro V5.1 CLI:

% bqueues [<queue_name>]

References:

- 3.7.8 How do I access the Clusterware Pro V5.1 Web Interface?
- 3.7.9 How do I access the Clusterware Pro V5.1 Command Line Interface?

Back to Top

2.3.
and should not be used while AppRS jobs are running. % apprs_clean all For jobs submitted to non-AppRS queues, the user's job submission script should include commands to remove files that are no longer needed when the job completes. In the event that the job fails to run to completion it may be necessary to remove these files manually.
#APPRS TARGETUTIL 1.0
#APPRS TARGETTIME 10
#APPRS REDUNDANCY 4
# Your job goes here:
if [ "$APPRS_RESTART" = "Y" ]; then
    # job as it is run under restart conditions
else
    # job as it is run under normal conditions
fi

The names of all files that need to be present for the application to run from a restart should be listed with the HIGHLYAVAILABLE tag:

#APPRS HIGHLYAVAILABLE

Other AppRS options can be set in the job submission script.
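The fragment above can be assembled into a complete submission script. This is a minimal runnable sketch: the echo commands are placeholders for the real job, and the checkpoint file name is purely illustrative:

```shell
#!/bin/sh
# AppRS directives, read from comments as described above:
#APPRS TARGETUTIL 1.0
#APPRS TARGETTIME 10
#APPRS REDUNDANCY 4
#APPRS HIGHLYAVAILABLE checkpoint.dat

# Branch on APPRS_RESTART, which AppRS sets to Y on a restarted run.
if [ "$APPRS_RESTART" = "Y" ]; then
    echo "resuming from checkpoint.dat"   # restart path
else
    echo "starting normal run"            # first-run path
fi
```

Run outside AppRS (with APPRS_RESTART unset), the script takes the first-run path and prints "starting normal run".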
2.3.6 Restart application from a checkpoint if a Compute Node crashes If a Compute Node crashes, jobs submitted to an AppRS queue will automatically be restarted on a new node or set of nodes as those resources become available. No user intervention is necessary. Back to Top 2.3.7 Determine if the application fails to complete The job state of EXIT is assigned to jobs that end abnormally. Using the Clusterware Pro V5.1 Web Interface: From the Jobs tab: z z Review the job states in the Jobs table.
% bhist

or, for more information:

% bhist -l

For jobs submitted to an AppRS queue, details of the job, including failover progress, can be viewed using the command:

% apprs_hist

References:

- 3.7.9 How do I access the Clusterware Pro V5.1 Command Line Interface?

Back to Top

2.3.9 Get a high-level view of the status of the Compute Nodes

Using the Clusterware Pro V5.1 Web Interface: From the Jobs tab:

- Review the Hosts table.
ClusterPack Cluster Management Utility Zone Overview

3.1.1 What is Cluster Management Utility Zone?
3.1.2 What are the Easy Install Tools?
3.1.3 What are the system imaging tools?
3.1.4 What are the MSA (Multi System Aware) Tools?
3.1.5 clsh - Runs commands on one, some, or all nodes in the cluster.
3.1.6 clcp - Copies files to one, some, or all cluster nodes.
3.1.
The ClusterPack suite includes a set of utilities for setting up a cluster of Itanium 2 nodes. The tools manager_config, mp_register, clbootnodes, compute_config, and finalize_config are key components for establishing and administering an Itanium 2 cluster.
- sysimage_register
- sysimage_distribute

These scripts use ClusterPack's knowledge of the cluster configuration to simplify the creation and distribution of system (golden) images. With the use of scripts, creating and distributing images is as simple as running these three tools and providing the name of a host and/or path of the image.

References:

- 1.5.1 Create a Golden Image of a Compute Node from the Management Server
- 1.5.2 Distribute Golden Image to a set of Compute Nodes

Back to Top

3.1.
clsh exits non-zero if there are problems running the remote shell commands. A summary of hosts on which problems occurred is printed at the end. clsh is used as follows: % clsh [-C cluster-group] [options] cmd [args] Examples To grep for something on all hosts in the cluster: % clsh grep pattern files ...
- single local to multiple remote:
  % clcp src dst:%h or % clcp src cluster-group:dst
- multiple local to multiple remote:
  % clcp src dst.%h %h:dst
- multiple remote to multiple local:
  % clcp %h:src dst.%h

Examples

1. Assume that the file /etc/checklist needs to be updated on all HP hosts. Also assume that this file is different on all hosts. The following is a way in which this can be done:

% clcp %h:/etc/checklist checklist.%h
% vi checklist.*

Make necessary changes.

% clcp checklist.
% rcp host1:/etc/checklist checklist.1
% vi checklist.0 checklist.1
% rcp checklist.0 host0:/etc/checklist
% rcp checklist.1 host1:/etc/checklist

3. The following is an example if log files are needed:

% clcp %h:/usr/spool/mqueue/syslog %h/syslog.%Y%M%D.%T

This would save the files in directories (which are the host names) with file names of the form: YYMMDD.TT:TT. The above might map to:

% rcp host0:/usr/spool/mqueue/syslog host0/syslog.921013.14:43
% rcp host1:/usr/spool/mqueue/syslog host1/syslog.
Back to Top

3.1.8 clps - Cluster-wide ps command

clps and clkill are the same program, with clps producing a "ps" output that includes the host name and clkill allowing processes to be killed. clps is used as follows:

% clps [-C cluster] [-ad] {tty user command pid regexp}

For more details on the usage of clps, invoke the command:

% man clps

Back to Top

3.1.9 clkill - Kills specified processes on specified nodes.
- Short format (enabled by the -s option): The short format lists the cluster (followed by a colon) and the hosts it contains, one cluster per line. Long lines do not wrap. If there is only one cluster to be listed and the -v option has not been used, the leading cluster and colon are omitted. This is the default mode if the output is not to a tty device, facilitating the use of clinfo as a component in a larger script.
- Medium format (enabled by the -m option): The medium format is tabular.
The first form of this command allows the user to add node groups to a compute cluster. The initial definition of the node group can be specified as a list of individual nodes and/or other groups. When a previously existing group is used in the formation of a new group, all members of the pre-existing group are added to the new group. The second form allows the user to remove a node group or nodes from a node group.
Back to Top 3.1.12 clbroadcast - Telnet and MP based broadcast commands on cluster nodes. The clbroadcast command is used to broadcast commands to various nodes in the cluster using the Management Processor (MP) interface or telnet interface. The tool opens a window with a telnet or an MP connection on each target and another "console window" with no echo where all input keyboard actions will be broadcast in all target windows.
ClusterPack ServiceControl Manager (SCM) Overview

3.2.1 What is ServiceControl Manager?
3.2.2 How to install, configure, manage, and troubleshoot SCM
3.2.3 How to Run SCM Web-based GUI

3.2.1 What is ServiceControl Manager?

ServiceControl Manager (SCM) makes system administration more effective by distributing the effects of existing tools efficiently across nodes.
3.2.2 How to install, configure, manage, and troubleshoot SCM: ServiceControl Manager is installed as part of ClusterPack, and should not need to be installed manually. For additional information about the configuration, management, or general troubleshooting, please refer to the ServiceControl Manager Technical Reference: http://docs.hp.com/hpux/onlinedocs/B8339-90030/B8339-90030.html Back to Top 3.2.
ClusterPack System Inventory Manager (SIM) Overview

3.3.1 What is System Inventory Manager?
3.3.2 How to invoke SIM

3.3.1 What is System Inventory Manager?

The SIM application is a tool that allows you to easily collect, store, and manage inventory and configuration information for the Compute Nodes in the HP-UX Itanium 2 cluster.
- The filtering facility allows you to define and view only the information that you need at any given time.
- The Command Line Interface (CLI) that is provided enables scripting capabilities.

Documentation for SIM is available at: http://software.hp.com/products/SIM/info.html

Online help is also available by clicking the Help Tab in the SIM GUI.

Back to Top

3.3.
ClusterPack Application ReStart (AppRS) Overview

3.4.1 What is AppRS?

3.4.1 What is AppRS?

AppRS is a collection of software that works in conjunction with Platform Computing's Clusterware™ to provide a fail-over system that preserves the current working directory (CWD) contents of applications in the event of a fail-over.
To use AppRS, users must add the following line to their ~/.cshrc file:

source /share/platform/clusterware/conf/cshrc.lsf

and the following line to their ~/.profile file:

. /share/platform/clusterware/conf/profile.lsf

References:

- 2.3.4 Remove temporary files from Compute Nodes
- 2.3.5 Prepare application for checkpoint restart
- 2.3.
ClusterPack Cluster Management Utility (CMU) Overview

3.5.1 What is CMU?
3.5.2 Command line utilities
3.5.3 Nodes monitoring
3.5.4 Invoking CMU
3.5.5 Stopping CMU
3.5.6 CMU main window
3.5.7 Monitoring By Logical Group
3.5.8 Contextual Menu
3.5.9 Logical Group Administration Menu

3.5.1 What is CMU?

CMU is designed to manage a large group of Compute Nodes.
3.5.3 Nodes monitoring

- Cluster monitoring: Enhanced monitoring capabilities for up to 1024 nodes in a single window (with vertical scrollbars).
- Monitoring tools: Provides tools to monitor remote node activities.
- Node Administration: Allows execution of an action on several nodes with one command. The actions are:
  1. Boot and reboot selected nodes.
  2.
window enabled. CMU will display the last monitored logical group. Note: When starting the CMU window for the first time, the monitoring action is performed with the “Default” Logical Group. Note: Some of the menus and functions within CMU will allow the user to act on more than one selected item at a time. When appropriate, the user can select multiple items by using the Ctrl or Shift keys in conjunction with the left mouse button.
- Terminal Server Configuration
- PDU Configuration
- Network Topology Adaptation
- Node Management
- Event Handling Configuration

Back to Top

3.5.7 Monitoring By Logical Group

The following section describes the different actions that the user can perform in the "Monitoring By Logical Group" window.

- Select/Unselect one node: Left click on the name of this node. The node becomes darker when selected, or returns to its original color when unselected.
A contextual menu window appears with a right click on a node displayed in the central frame of the main monitoring CMU window. The following menu options are available:

- Telnet Connection: Launches a telnet session to this node. The telnet session is embedded in an Xterm window.
- Management Card Connection: Launches a telnet connection to the management card of this node. The telnet session is embedded in an Xterm window.
Many management actions such as boot, reboot, halt, or monitoring will be applied to all of the selected nodes.

- Halt: This sub-menu allows a system administrator to issue the halt command on all of the selected nodes. The halt command can be performed immediately (this is the default), or delayed for a given time (between 1 and 60 minutes). The administrator can also have a message sent to all the users on the selected nodes by typing in the "Message" edit box.
before booting a node.

- Reboot: This sub-menu allows a system administrator to issue the reboot command on all of the selected nodes. The reboot command can be performed immediately (this is the default), or delayed for a given time (between 1 and 60 minutes). The administrator can also have a message sent to all the users on the selected nodes by typing in the "Message" edit box.

Note: The reboot command is performed on the nodes using "rsh".
To improve the Xterm windows display appearance, every window can be shifted (in x and y) from the previous one to make sure that they fit nicely on the screen. By default, the shift values are computed so that the windows tile the screen and no window is displayed outside of the screen. If the user does not need to visualize the telnet sessions, or does not want to crowd the display, the user has the option to start the Xterm windows minimized.
ClusterPack NAT/IPFilter Overview

3.6.1 Introduction to NAT (Network Address Translation)

3.6.1 Introduction to NAT (Network Address Translation)

Network Address Translation (NAT) or IP Aliasing provides a mechanism to configure multiple IP addresses in the cluster to present a single image view with a single external IP address.
IP Aliasing or Network Address Translation (NAT)

ClusterPack comes with HP-UX IPFilter, a software component with powerful packet filtering and firewalling capabilities. One of the features that it supports is Network Address Translation. For information on HP-UX IPFilter, please refer to the HP-UX IPFilter manual and release notes at docs.hp.com:

http://docs.hp.com/hpux/internet/index.html#IPFilter/9000

For information on NAT features of HP-UX IPFilter, refer to the public domain how-to document.
ClusterPack Platform Computing Clusterware Pro V5.1 Overview

3.7.1 What is Clusterware Pro?
3.7.2 How do I obtain and install the Clusterware Pro V5.1 license file?
3.7.3 Where is Clusterware Pro V5.1 installed on the system?
3.7.4 How can I tell if Clusterware Pro V5.1 is running?
3.7.5 How do I start and stop the Clusterware Pro V5.1 daemons?
3.7.
- Organizations experience increased productivity from transparent single system, cluster-as-server access to compute resources.
- Platform Computing's Clusterware Pro V5.1 solution dramatically reduces time to market through continuous access to the cluster's compute power.
- Platform Computing's Clusterware Pro V5.1 solution enables organizations to achieve higher quality results by running simulations and analyses faster than previously possible.
Setup and Configuration of a DEMO license

The use of a DEMO license file (license.dat) for Clusterware Pro, as part of the ClusterPack V2.3 Clusterware Edition, requires some modification of installed configuration files. These modifications will have to be removed in order to use a purchased license key (LSF_license.oem).

1. Place the DEMO license key onto the Management Server: /share/platform/clusterware/conf/license.dat
2. Modify the /share/platform/clusterware/conf/lsf.
The /etc/exports file on the Management Server and the /etc/fstab file on each Compute Node are updated automatically by ClusterPack.

Back to Top

3.7.4 How can I tell if Clusterware Pro V5.1 is running?

On the Management Server, several Clusterware Pro V5.1 services must be running in order to provide full functionality for the tool. All of these services are located in /share/platform/clusterware.
To START services on the Management Server, issue the following command on the Management Server as the super user (i.e. root):

% /share/platform/clusterware/lbin/cwmgr start

To STOP services on the Management Server, issue the following command on the Management Server as the super user (i.e. root):

% /share/platform/clusterware/lbin/cwmgr stop

To START services on ALL Compute Nodes, issue the following command on the Management Server as the super user (i.e.
% /share/platform/clusterware/lbin/cwagent stop

References:

- 3.1.5 clsh - Runs commands on one, some, or all nodes in the cluster.

Back to Top

3.7.6 How do I start and stop the Clusterware Pro V5.1 Web GUI?

The Web GUI is started and stopped as part of the tools that are used to start and stop the other Clusterware Pro V5.1 services. No additional steps are required.

Note: The Clusterware Pro Web GUI is not automatically started during a reboot of the Management Server.
- The username and password are the same as for any normal user account on the Management Server.

References:

- 3.7.6 How do I start and stop the Clusterware Pro V5.1 Web GUI?

Back to Top

3.7.9 How do I access the Clusterware Pro V5.1 Command Line Interface?

Before using the Clusterware Pro V5.1 CLI, you must set a number of environment variables. This must be done once in each shell before using any of the Clusterware Pro V5.1 commands.
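The environment variables are set by sourcing the setup file that matches your shell (the same files noted in the AppRS overview); a sketch:

```
# csh/tcsh:
% source /share/platform/clusterware/conf/cshrc.lsf
# sh/ksh and other POSIX shells:
% . /share/platform/clusterware/conf/profile.lsf
```

After sourcing the appropriate file, commands such as bjobs, bqueues, and badmin are available in that shell.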
% badmin reconfig
% badmin mbdrestart -f

Restarting the Clusterware Pro V5.1 Services

As an alternative, the Clusterware Pro V5.1 services can simply be restarted on all nodes in the cluster. This will cause any information about jobs that are running to be lost, but the jobs will continue to run. Please see "How do I start and stop the Clusterware Pro V5.1 daemons?" for more information.

References:

- 3.7.5 How do I start and stop the Clusterware Pro V5.1 daemons?

Back to Top

3.7.
Management Processor (MP) Card Interface Overview

3.8.1 Using the MP Card Interface

The MP cards allow the Compute Nodes to be remotely powered up, which eases the initial installation and configuration of the Compute Nodes. In order to access the MP Card Interface (using HPUX 11i V2.
Back to Top
Related Documents

4.1.1 HP-UX 11i Operating Environments
4.1.2 HP-UX ServiceControl Manager
4.1.3 HP Application ReStart
4.1.4 HP System Inventory Manager
4.1.5 HP-UX IPFilter
4.1.6 ClusterPack V2.3

4.1.1 HP-UX 11i Operating Environments

HP-UX 11i March 2002 Release Notes
http://www.docs.hp.com/hpux/onlinedocs/5185-4391/5185-4391.html
http://www.docs.hp.com/hpux/os/11i/index.
http://www.docs.hp.com/hpux/onlinedocs/5187-4543/5187-4543.html

ServiceControl Manager Troubleshooting Guide
http://www.docs.hp.com/hpux/onlinedocs/5187-4198/5187-4198.html

Back to Top

4.1.3 HP Application ReStart

HP Application ReStart Release Note: AppRS Release Notes (pdf)
HP Application Restart User's Guide: AppRS User's Guide (pdf)

Back to Top

4.1.4 HP System Inventory Manager

SIM Info
http://software.hp.com/products/SIM/info.html

Back to Top

4.1.
ClusterPack V2.3 Release Note http://www.docs.hp.com/hpux/onlinedocs/T1843-90009/T1843-90009.
ClusterPack is the Cluster Management Software for system administrators and end users.

Back to Top

Guarded Cluster
A cluster where only the Management Server has a network connection to nodes outside of the cluster. All of the Compute Nodes are connected within the cluster on a private subnet (i.e. IP addresses of 10.*.*.* or 192.168.*.*).

Back to Top

Head Node
A Head Node provides user access to the cluster. In smaller clusters, the Management Server may also serve as a Head Node.
Back to Top

Management Server
The Management Server provides a single point of management for all system components in the cluster. In smaller clusters, the Management Server may also serve as a Head Node.

References:
- Head Node

Back to Top

Network Attached Storage (NAS)
Network Attached Storage (NAS) attaches directly to Ethernet networks, providing easy installation, low maintenance, and high uptime.

Back to Top

Storage
Storage can either be local to each Compute Node, or external to the cluster.