HP XC System Software Administration Guide Version 3.2

useful commands to collect and present data in a scalable and intuitive fashion. The Web pages
update automatically at a preconfigured interval (120 seconds by default).
To open the Web page, open a browser on the head node and point it to the following URL:
https://head_node_fully_qualified_domain_name/resmon
You are prompted to supply your Nagios user name and password (which were defined during
the initial installation and configuration of the HP XC system).
The resmon utility gathers individual node CPU load and memory data from the metrics
monitoring infrastructure. The data is obtained from the HP XC shownode metrics load
and shownode metrics mem commands. The resmon utility also gathers a CPU count for
each node from the cluster management database (CMDB).
The resmon utility gathers node state and job information from the resource management
components that have been configured on the HP XC system. These components are the Load
Sharing Facility (LSF) by Platform Computing, the open-source Simple Linux Utility for Resource
Management (SLURM) by Lawrence Livermore National Laboratory, or both. The LSF bhosts and
bjobs commands and the SLURM scontrol, sinfo, and squeue commands are used to
gather node and job state information.
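To see which of these data sources are present on a given node, a check along the following lines can help. This is a minimal sketch, not part of the resmon utility: the function name available_cmds is hypothetical, and the script only tests whether each command is on the PATH without running it.

```shell
#!/bin/sh
# Hypothetical helper (not part of HP XC): print each named command
# that can be found on the current PATH, one per line.
available_cmds() {
    for cmd in "$@"; do
        if command -v "$cmd" >/dev/null 2>&1; then
            printf '%s\n' "$cmd"
        fi
    done
}

# Check the commands that resmon is described as relying on.
found=$(available_cmds shownode bhosts bjobs scontrol sinfo squeue)
if [ -n "$found" ]; then
    echo "Data-source commands found on PATH:"
    echo "$found"
else
    echo "None of the resmon data-source commands were found on PATH."
fi
```

If only the LSF commands or only the SLURM commands are listed, that suggests which resource management component is configured. Running a listed command directly, for example bhosts or sinfo, shows the raw data that resmon summarizes.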
For more information on this utility, see resmon(1).
7.10 The netdump and crash Utilities
The netdump utility is a tool developed by Red Hat that sends a kernel dump (oops data and
memory dumps) from a monitored client system to another system in the network. That system,
which runs a utility named netdump-server, stores the kernel dumps. The default location
for these kernel dumps is the /var/crash directory.
The crash utility is a self-contained tool that you can use to investigate live systems or to examine
kernel core dumps created with the netdump package.
NOTE: The crash utility is designed to examine an uncompressed kernel image (a vmlinux
file) that was compiled with the compiler's -g option so that it can be debugged. Consider editing
the kernel Makefile to add the -g option to the CFLAGS line.
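As an illustration of how the two utilities fit together, the following sketch lists the dump files found under the dump directory on the system that runs netdump-server. It is an assumption-laden example rather than a documented procedure: the function name list_dumps is hypothetical, and the script assumes that each dump is stored as a file whose name begins with vmcore in a subdirectory of /var/crash.

```shell
#!/bin/sh
# Hypothetical helper: list kernel dump files under a dump directory.
# Assumption: dump files are named vmcore* and live in subdirectories
# of the directory given as $1 (default: /var/crash).
list_dumps() {
    dump_dir=${1:-/var/crash}
    # A missing directory simply means there are no dumps to list.
    [ -d "$dump_dir" ] || return 0
    find "$dump_dir" -type f -name 'vmcore*' 2>/dev/null | sort
}

list_dumps
```

A dump file listed this way can then be passed to the crash utility together with the matching uncompressed kernel image, in the form crash vmlinux_file dump_file, where both file names are placeholders for the actual paths on your system.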
7.10.1 Installing Netdump and Crash
Two RPMs for Netdump are available:
• The netdump client runs on the nodes to be monitored.
• The netdump-server runs on the nodes that can receive the kernel dumps over the internal
network.
Use the following procedure to install Netdump on the HP XC system:
1. Log in as superuser (root) on the head node.
2. Use the rpm command to install the appropriate RPMs:
rpm -ihv netdump-rev_number.platform.rpm
rpm -ihv netdump-server-rev_number.platform.rpm
where:
rev_number  Is the revision number of the software. At the time of this
            publication, the revision number is 0.7.14-4.
platform    Is the platform code:
            Use x86_64 for CP4000 systems.
            Use ia64 for CP6000 systems.
3. Use the rpm command to install the crash software:
rpm -ihv crash-rev_number.platform
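The file names in the Netdump commands above follow the pattern name-rev_number.platform.rpm. The following sketch shows one way to build those names from the values given in this section. The function names rpm_file and platform_code are hypothetical, and the script only prints the rpm commands as a dry run; it does not install anything.

```shell
#!/bin/sh
# Hypothetical helper: build an RPM file name from package name,
# revision number, and platform code.
rpm_file() {
    printf '%s-%s.%s.rpm\n' "$1" "$2" "$3"
}

# Hypothetical helper: map the system type to the platform code
# described in this section.
platform_code() {
    case "$1" in
        CP4000) echo x86_64 ;;
        CP6000) echo ia64 ;;
        *) echo "unknown system type: $1" >&2; return 1 ;;
    esac
}

rev=0.7.14-4                      # revision stated in this section
plat=$(platform_code CP4000)      # example system type

# Dry run: print the commands instead of executing them.
for pkg in netdump netdump-server; do
    echo "rpm -ihv $(rpm_file "$pkg" "$rev" "$plat")"
done
```

On a CP6000 system, passing CP6000 to platform_code yields ia64 instead. The crash package is left out of the loop because its revision number is not stated here.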