HP Insight Control for Linux V6.
© Copyright 2008, 2010 Hewlett-Packard Development Company, L.P. Confidential computer software. Valid license from HP required for possession, use or copying. Consistent with FAR 12.211 and 12.212, Commercial Computer Software, Computer Software Documentation, and Technical Data for Commercial Items are licensed to the U.S. Government under vendor's standard commercial license. The information contained herein is subject to change without notice.
Table of Contents
1 Using HP Insight Control for Linux  11
1.1 Integration with HP Systems Insight Manager  11
1.2 HP Insight Control for Linux licenses  11
1.3 HP Insight Control for Linux extensions to HP SIM
Installing and setting up managed systems  45
4.1 Populating the Insight Control for Linux repository  45
4.2 Setting up management hubs  45
4.2.1 Creating a management hub  46
4.
7.1.3 Support for custom or other variants of Linux operating systems  82
7.2 Using Kickstart or AutoYaST files for unattended installations  82
7.2.1 Naming conventions for installation configuration files  83
7.2.2 Customizing installation configuration files  84
7.2.
11 Understanding tasks and task results  125
11.1 Task results overview  125
11.2 Understanding task results  125
11.3 Task results page
14.4 Services monitored by Nagios  164
14.5 Understanding Nagios alert messages  166
14.6 Understanding system event log monitoring  167
14.7 Configuring Nagios email alerts
18 Connecting to a remote console  193
18.1 Console management facility overview  193
18.2 How CMF works  193
18.3 Accessing a remote console
23.3 Apache service does not start  216
23.4 Troubleshooting CMF problems  217
23.5 Troubleshooting configuration problems  220
23.6 Troubleshooting connection problems
24.3.2 Websites  262
24.3.3 Troubleshooting resources  263
24.4 Typographic conventions  263
A Sample SLES version 9 installation media copy session
1 Using HP Insight Control for Linux
This chapter addresses the following topics:
• “Integration with HP Systems Insight Manager” (page 11)
• “HP Insight Control for Linux licenses” (page 11)
• “HP Insight Control for Linux extensions to HP SIM” (page 12)
• “HP Insight Control for Linux toolboxes” (page 15)
• “HP Insight Control for Linux command environment” (page 15)
• “Using the HP SIM quick launch feature” (page 16)
• “Internal task queuing and management” (page 16)
• “Synchronized syst
IMPORTANT: Exercise caution when assigning an Insight Control for Linux license, particularly when assigning licenses to multiple targets. Only servers require this license. Licensing any other device, such as a management processor, wastes a license; licenses are difficult to remove after they are assigned.
1.3 HP Insight Control for Linux extensions to HP SIM
Table 1-1 lists the HP Insight Control for Linux features, by category, that are integrated with HP SIM.
Table 1-1 Insight Control for Linux extensions to the HP Insight Control user interface (continued)
Menu item: Options→IC-Linux→Define Networks
Description: The Define Networks tool provides an interface through which the administrator can create and edit network definitions that can be used by the Network Configuration Editor tool. The network definitions are used by the OS installation tools to implement booting using the virtual media mechanism.
Documented in: Chapter 2 (page 25)
Table 1-1 Insight Control for Linux extensions to the HP Insight Control user interface (continued)
Menu items:
Tools→Server Controls→Power Off Server...
Tools→Server Controls→Power On Server...
Tools→Server Controls→Reboot Server...
Description: Accesses the management processor on the selected target managed system or systems to power on, power off, or reboot the managed system or systems.
Documented in: Chapter 17 (page 191)
Table 1-1 Insight Control for Linux extensions to the HP Insight Control user interface (continued)
Menu item: Configure→Boot to SmartStart Toolkit
Description: Boots a managed system to the HP SmartStart Toolkit environment for maintenance.
Documented in: Section 23.2.2 (page 216)
Menu item: Deploy→Deploy Drivers, Firmware and Agents→IC-Linux→Update ProLiant Firmware...
Description: Initiates a firmware update on any ProLiant server licensed for Insight Control for Linux on your CMS.
Table 1-2 HP Insight Control for Linux commands
Command: console
Description: Enables access to the serial consoles of all managed systems.
Manpage: console(8)
Command: headnode
Description: Returns the name of the CMS.
Manpage: headnode(1)
Command: nodename
Description: Displays the Insight Control for Linux internal name for the CMS or managed system on which it is run.
Manpage: nodename(1)
Command: nrg
Description: Uses data collected by Nagios to generate reports, including an analysis of the state of all HP Insight Control for Linux collections.
Manpage: nrg(8)
A timeout of five days is in effect for an OS installation using Kickstart or AutoYaST files so that tasks do not hang indefinitely. You can change the number of Insight Control for Linux tasks that can be run concurrently. For information, see Section 21.8 (page 208). 1.
IMPORTANT: HP recommends that HP Insight Control for Linux deployment and capture facilities be used in a trusted network environment because of inherent insecurity with the PXE boot protocol. The PXE boot protocol is insecure because of its design. The CMS cannot verify the identity of a system booting into the RAM disk. Also, the booting managed system cannot verify the identity of the host from which it receives the RAM disk. Virtual media is provided as a secure alternative to PXE. 1.
1.13 Managed system names
HP Insight Control for Linux command line commands recognize managed systems by the following name types.
Managed system name type: Host name
Description: The name assigned to the managed system when the Linux OS was installed on it, if a name was assigned. HP SIM recognizes managed systems by this host name.
Managed system name type: Fully qualified host name
Description: The managed system host name with the domain name appended, for example, earth.example.com. In this document, example.
Removing a managed system does not change the numbering scheme. If you want to reuse a node number, remove it from this file and reconfigure the Insight Control for Linux Management Services with the Options→IC-Linux→Configure Management Services menu item.
Each line in this file that is neither blank nor a comment contains three fields:
number name GUID
Where:
number  Is the node number of the managed system. This number must be nonnegative.
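The three-field record layout described above can be illustrated with a short parsing sketch. This is only an illustration under stated assumptions, not part of the product: the function name `parse_node_lines`, the use of whitespace as the field separator, and the use of `#` to mark comment lines are all assumptions, and the sample records in the usage note are invented.

```python
from typing import NamedTuple


class NodeEntry(NamedTuple):
    """One record from the node-number file: number, name, GUID."""
    number: int
    name: str
    guid: str


def parse_node_lines(lines):
    """Parse 'number name GUID' records, skipping blank and comment lines.

    Assumptions (not confirmed by the manual): fields are separated by
    whitespace, and comment lines start with '#'.
    """
    entries = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or (assumed) comment line
        fields = line.split()
        if len(fields) != 3:
            raise ValueError(f"expected 3 fields, got {len(fields)}: {raw!r}")
        number = int(fields[0])
        if number < 0:
            # The manual states that the node number must be nonnegative.
            raise ValueError(f"node number must be nonnegative: {raw!r}")
        entries.append(NodeEntry(number, fields[1], fields[2]))
    return entries
```

For example, calling `parse_node_lines(["# nodes", "", "1 icelx1 guid-a"])` yields a single entry whose `number` is 1. A check like this can help confirm that a hand-edited file is still well formed before you reconfigure the Management Services.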
1.14 Security 1.14.1 Integrated security features This section describes features that are integrated into HP SIM and HP Insight Control for Linux to make them secure. Security features are also discussed in context of the associated topic throughout this document. • Browser Connections HP SIM enforces a secure connection to the web browser.
• Secure boot mechanism Virtual media support is provided as the secure boot mechanism. PXE booting provides no authentication or encryption. Data used to authenticate either the CMS or a managed system, or used to set up login credentials on a management processor, must be secured. This information is secured with the virtual media mechanism. Specifically, the data includes the SSH public key and any certificates needed to secure the communication between the CMS and the managed system.
Use the Import button to import the iLO’s self-signed certificate. You can obtain the iLO’s self-signed certificate by connecting to the iLO using your browser. In Microsoft Internet Explorer for Windows Vista, for example:
1. Select Page→Security Report.
2. Select View Certificates.
3. Select the Details tab.
4. Select the Copy to File... button.
5. In the Certificate Export Wizard, select the Base-64 encoded X.509 (.CER) radio button and proceed to save your file.
1.16 Backing up HP Insight Control for Linux files The following table contains a list of user files and directories that must be backed up in order to regenerate your customized installation, if needed.
2 Configuring network parameters for virtual media
Topics include:
• “Introduction” (page 25)
• “Preparing for virtual media” (page 26)
• “Using the Define Networks tool” (page 29)
• “Using the Network Configuration Editor” (page 31)
• “Next Step” (page 35)
2.1 Introduction
Virtual media is a mechanism available only for systems with an iLO-based management processor. Virtual media allows a system to boot an ISO image over the network; it is the alternate boot mechanism to PXE.
IMPORTANT: Use these tools to define the network configuration parameters before running any other tool that uses virtual media, especially Initiate Bare Metal Discovery.
Usually, network configuration is performed in two stages:
• In the first stage, you define the network configuration parameters and store them under a network name. You can have as many network name definitions as you want.
3. Select either the Discover a group of systems or the Discover a single system button, as appropriate. There is a slight difference in the window for these two choices. The Discover a group of systems choice is shown in the illustration.
4. Enter a descriptive name in the Name text field. The descriptive name must be either listed in the CMS's hosts file or known to the CMS's name server. Otherwise, enter an IP address.
5. Ensure that the Schedule check box is not checked.
6.
2.2.2 Creating a user account and enabling virtual media on the management processor You must create a user account on the management processor, if one doesn’t already exist. The user name and password must match the management processor user name and password you specified when you installed Insight Control for Linux. The iLO is capable of supporting multiple user accounts; if your iLO was already configured with other user accounts you can just add another user account.
5. Select Save User Information. NOTE: Do not disconnect your browser from this management processor address. You might need it to license virtual media, which is described in the next section. 2.2.3 Licensing virtual media on the management processor Your iLO Advanced license key activates iLO Advanced features. For the latest instructions, which may supersede those shown below, see the following website: www.hp.
Figure 2-1 Define networks tool The parameters in the Define Networks tool include the following: • Available Networks This is a list of the network definitions. When you create a new network definition, its name is displayed in this list after pressing Save. When a network name in the Available Networks list is selected and you select the Load button, its network parameters are displayed in the appropriate fields; you can select only one network at a time.
3. Enter the value of the network mask in the text field reserved for that parameter; this is a required entry.
4. Optionally enter any other network parameters you need.
5. Select the Save button. A dialog box reports success or failure. If successful, the Available Networks list is updated.
2.3.2 Loading a network definition
1. Choose the network definition from the Available Networks list.
2. Select the Load button.
2. Select the target. The target must be the corresponding iLO-based management processor. Ensure that the management processor has already been discovered by HP SIM; the management processor should not be licensed. NOTE: Take care that all target management processors are iLO-based: • If you select a single target management processor that is not an iLO-based management processor, the tool will not let you proceed and will display No in the Tool launch OK? field of the Verify Target Systems window.
NOTE: Even though a server might have more than one NIC, you can only specify one name for the server. 6. The Port/MAC Address column offers a list (a drop-down menu) of the MAC addresses of the embedded NICs in the server associated with the management processor. Choose the appropriate MAC address from the drop-down menu.
• Multiple rows, one for each management processor, appear in the Network Configuration Editor page. Choose the one you want to concentrate on by selecting the corresponding check box in the first column. TIP: You can assign a network to multiple nodes by: 1. Selecting the nodes. 2. Selecting the desired network from the drop down box at the bottom of the page. 3. Selecting the Apply Network button. • You can sort the information in the Network Configuration Editor page by selecting a column heading.
2.4.4 Freeing an IP address stored in Network Configuration Editor
The Network Configuration Editor distributes IP addresses from a range that you specify when you define a network. It stores the network range and the assigned IP addresses in the /opt/mx/icle/config/network_map.xml file. If a managed system is deleted from HP SIM, that managed system and its IP address assignment are not removed from the network_map.xml file, and the IP address is not released.
3 Discovering managed systems, switches, and enclosures
This chapter addresses the following tasks, which you must complete in the following order when you are configuring and setting up HP Insight Control for Linux:
1. “Discovering systems” (page 37)
2. “Assigning HP Insight Control for Linux licenses to discovered systems” (page 40)
3. “Preparing and discovering switches and enclosures” (page 41)
4. “Changing the boot method” (page 42)
3.
NOTES: • You can update a server's firmware automatically as part of the bare-metal discovery process. For information on enabling this feature, see Section 8.2.3 (page 101). • You can initiate a one-time PXE boot, or set the server to always PXE boot before booting from the local hard disk. Either method is acceptable. • For the servers to PXE boot the Insight Control for Linux RAM disk, you must have configured DHCP as described in the HP Insight Control for Linux Installation Guide.
3.1.2 Discovering bare-metal servers using virtual media
The Initiate Bare-Metal Discovery tool allows you to perform a discovery of a bare-metal server.
Figure 3-1 Initiate Bare-Metal Discovery tool
IMPORTANT: You can use the Initiate Bare-Metal Discovery tool for servers that will be PXE-booted or that will use virtual media. If you are using virtual media, you can only use this tool for bare-metal discovery of an iLO-based management processor.
NOTE: You can also use this tool to initiate a bare-metal discovery of a server that will be PXE-booted. Select the PXE radio button in step 4. The target system must be the management processor for the server. 3.1.3 Discovering running systems NOTES: • This section applies only to systems with iLO or iLO 2 management processors. For servers with LO100 management processors, after you discover the server with HP SIM, run the Configure SNMP on DL1xx Servers task described in Section 21.9 (page 209).
NOTES:
• When you apply the Insight Control for Linux license, the license is locked immediately when it is assigned to the server. Before Version 6.0, the license was assigned, but locked later during an Insight Control for Linux operation (for example, installation or setting up monitoring).
• Exercise caution when assigning an Insight Control for Linux license, particularly when assigning licenses to multiple targets. Only servers require this license.
a. Select New...
b. In the Ping inclusions range text box, enter the IP addresses or host names of the OAs and switches to be discovered, one entry per line.
c. Enter a name for this new discovery task. Do not use any special characters, such as the apostrophe, in the name.
d. Select Save.
e. Deselect the box next to Schedule.
f. Select Run Now to start the discovery process.
3.
4. Select the boot method, either PXE or virtual media, for each system by selecting the radio button in the PXE column or Virtual Media column. NOTE: You can select the boot method for all the systems listed by selecting the check box in the PXE or Virtual Media column heading. 5. After you have selected the boot method for all the systems you want to change, select Save. NOTE: If you exit the window before selecting Save, the boot method for the system or systems is unchanged. 3.
4 Installing and setting up managed systems
This chapter describes how to install operating systems on managed systems and set them up for HP Insight Control for Linux monitoring. This chapter addresses the following tasks, which you must complete in this order:
1. “Populating the Insight Control for Linux repository” (page 45)
2. “Setting up management hubs” (page 45)
3. “Installing a Linux OS on managed systems” (page 48)
4. “Setting up managed systems for monitoring” (page 48)
4.
Figure 4-1 Management hub aggregation
HP recommends that you create at least one management hub for every 256 servers. For example, if you plan to monitor 1000 servers, you should create a minimum of 4 management hubs (1000 divided by 256 is about 3.9, which rounds up to 4). The management hub also runs the Supermon aggregator (supermond) and Nagios (nagios_monitor) services.
4.2.1 Creating a management hub
Use the following procedure to create a management hub: NOTE: 1. 2. 3.
a. Select Customize... in the System and Event Collections panel. This figure shows the location with a red arrow. The Customize Collections window appears. b. Scroll down the list to access Systems Managed by IC-Linux in the Customize Collections window and select the + icon. The list expands. c. Select the + icon in the + icelx line. The list expands again. d. Select the icelx_Management_Hubs line and select Edit.... A new window resembling the following opens.
g. Use the >> button to move the selected servers from the Available Items: list to the Selected Members: list.
h. Select OK.
i. Select OK to close the dialog box that indicates the collection was saved successfully.
4.3 Installing a Linux OS on managed systems
You must install a Linux OS if one or more servers that you intend to monitor and manage with Insight Control for Linux do not have a supported OS installed.
Table 4-1 Open ports on managed systems (continued)
Port number   Service                                              Protocol   Inbound or outbound
2381          compaq-https                                         TCP        Inbound
2709          mond                                                 TCP        Inbound
5666          nrpe                                                 TCP        Inbound
5989          WBEM                                                 TCP        Both
60000         Insight Control for Linux repository web server (1)  TCP        Inbound
(1) This port must be opened for managed systems running VMware ESX only. If you changed the default port during installation, you need to open the port you chose instead.
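One way to spot-check the TCP ports in the rows shown above is to attempt a connection to each one. The following is a minimal sketch, assuming it is run from a host (such as the CMS) that is expected to reach the managed system; the function names and the default timeout are illustrative and not part of Insight Control for Linux. Only the ports visible in this part of the table are listed, and a failed connection can mean either that a firewall blocks the port or that the corresponding service is not running.

```python
import socket

# TCP ports taken from the rows of Table 4-1 shown above. The table is
# continued from an earlier page, so this is not the complete list.
TABLE_4_1_PORTS = {
    2381: "compaq-https",
    2709: "mond",
    5666: "nrpe",
    5989: "WBEM",
    60000: "repository web server (VMware ESX only, default port)",
}


def tcp_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution errors.
        return False


def check_ports(host, ports=TABLE_4_1_PORTS, timeout=2.0):
    """Map each port number to True (reachable) or False (not reachable)."""
    return {port: tcp_port_open(host, port, timeout) for port in ports}
```

Calling `check_ports("pluto.example.com")` (the host name here is only an example) returns a dictionary keyed by port number, which you can compare against the table to see which ports still need attention.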
4.4.3 Installing additional components of the PSP This task is optional. Insight Control for Linux does not automatically install the complete set of PSP and ESX agents. If you install VMware ESX virtual machine OS, you must run Configure→Configure or Repair Agents if you want to install the ESX agents.
Figure 4-3 Settings for configure or repair agents 5. Make the following settings to configure SNMP: • Select Set read community string and enter the appropriate value for your network configuration. 4.
NOTE: To discover or identify a server that will become a managed system, HP SIM requires that an SNMP read community string be set to public in the global credentials for that server. Other read community strings can be set in addition to public, but public must be specified.
• Select Send traps to refer to this instance of HP SIM.
• You can optionally set Send a sample SNMP trap to this instance of HP SIM, but it is not required.
6.
Therefore, configuring the managed system is a two-step process: Step 1: Make the appropriate association on the system BIOS. Depending on how you decide to configure your system, you might not need to do anything. As a general rule, the factory default system BIOS settings are as follows.
splashimage=(hd0,0)/grub/splash.xpm.gz
hiddenmenu
title Red Hat Enterprise Linux Server (2.6.18-92.el5xen)
        root (hd0,0)
        kernel /xen.gz-2.6.18-92.el5 com1=115200,8n1 (1)
        module /vmlinuz-2.6.18-92.el5xen \ (2)
        ro root=/dev/VolGroup00/LogVol00 rhgb \
        quiet console=ttyS0 (3)
        module /initrd-2.6.18-92.el5xen.img
title Red Hat Enterprise Linux Server-base (2.6.18-92.el5)
        root (hd0,0)
        kernel /vmlinuz-2.6.18-92.el5 \
        ro root=/dev/VolGroup00/LogVol00 rhgb quiet
        initrd /initrd-2.6.18-92.el5.img
(1) Add com1=115200,8n1 here.
5 Configuring monitoring services
This chapter describes how to configure HP Insight Control for Linux monitoring services. In addition to the overview in Section 5.1 (page 55), this chapter addresses the following tasks, which you must complete in this order: 1. 2. 3. 4.
• It also deploys the Insight Control for Linux management agents to all servers in the {collection_name}_Servers subcollection. For information on managing subcollections, see Chapter 13 (page 147).
Only the objects in these collections are monitored by HP Insight Control for Linux:
• Either all licensed servers are automatically added to the {collection_name}_Servers subcollection or only the servers in the {collection_name}_Servers collection, depending on your response to the Auto-populate option.
• Enter no if you want Insight Control for Linux to manage and monitor only the servers in the {collection_name}_Servers collection.
NOTE: You should populate your collections manually before proceeding.
5. Select Run Now. This task can take several minutes to configure services. The Stdout tab shows the scripts that are running, and Done appears when this task is complete.
6.
charon: 11:02am up 0:49, 1 user, load average: 0.38, 0.36, 0.36
poseidon: 9:46am up 1 day 4:46, 3 users, load average: 1.10, 1.23, 1.34
4. Verify that the nrpe daemon is working on all the managed systems with the following command:
# /opt/hptc/nagios/libexec/gather_all_data --verbose
write 4048, 2, 2, eth1 to db => icelx2 (charon.example.com)
write 4048, 2, 2, eth1 to db => icelx4 (pluto.example.com)
write 4048, 2, 2, eth1 to db => icelx1 (poseidon.example.com)
5. Ensure that the vars.
If Warnings Are Reported If one or more warnings are reported in the Warning column, use the analyze option to obtain an analysis of the problem. When possible, the command output provides potential corrective action or the reasons for a given state.
6 Managing the Insight Control for Linux repository
This chapter provides an overview of the Insight Control for Linux repository and how to perform activities related to it. The following topics are addressed:
• “Introduction to the Insight Control for Linux repository” (page 61)
• “Registering items in the Insight Control for Linux repository” (page 64)
• “Copying software to the Insight Control for Linux repository” (page 69)
• “Editing and deleting registered items” (page 79)
6.
After an OS is registered with the repository, manually copy the vendor-supplied installation media to the appropriate directories in the repository. The media can be a physical CD or DVD, or it can be an .iso image. You must expand the .iso image into flat files. IMPORTANT: Be aware that repository management tasks do not follow typical authorization models. All HP SIM users can select, add, delete, or modify all Insight Control for Linux repository items regardless of their user authorizations. 6.1.
Figure 6-2 Remote repository using the CMS as a gateway
6.1.2 Repository contents
Table 6-1 lists the classes of items that are stored in the repository.
Table 6-1 Repository item types
Name: PSP
Description: An OS-specific bundle of ProLiant optimized drivers, utilities, and management agents.
Name: Supported OS
Description: Vendor-supplied installation files for supported versions of RHEL or SLES.
Name: Custom OS
Description: Vendor-supplied installation files for another type of Linux OS (a custom OS).
Table 6-2 Default repository contents
Item type: PSP Dependency Script
File name examples: example_dependency.sh
Description: Provides a sample PSP dependency script that installs RPM dependencies on the managed systems that are required by the PSP installation process.
is a simple process: you register the OS in the repository, copy the vendor-supplied installation files to the repository, and copy the appropriate boot files to the associated boot target directory.
To register an OS in the repository, follow these steps:
1. Select the following menu item from the HP Insight Control user interface: Options→IC-Linux→Manage Repository
2. Select New.
3. From the Item Type drop down list, select either Supported OS or Custom OS.
4. Select Next.
5.
Table 6-3 OS registration information (continued)
Registration information: Kernel name
Supply for supported OS, custom OS, or both: Custom
Description: The name of the kernel boot file. If you do not supply a kernel name, vmlinuz is the default file name. In the context of a custom OS, the file name might be different. You must examine the installation sources to determine what the OS has named them.
Registration information: RAM disk name
Supply for supported OS, custom OS, or both: Custom
Description: The name of the RAM disk boot file. If you do not supply a RAM disk name, initrd.
The PSP software components listed in Table 9-1 (page 105) install the required agents on the managed systems and are required for proper management of the managed systems. These components are installed automatically when HP Insight Control for Linux installs a supported Linux OS using an installation configuration file, but you can install additional PSP components.
IMPORTANT: Write down or copy and paste into a file the path displayed in the PSP path on disk field because this is the location in the repository where you will download the PSP. 8. Select OK to return to the Manage Repository screen. To download a PSP into the repository, see Section 6.3.7. 6.2.4 Registering automated installation configuration files (Kickstart and AutoYaST) To support a completely unattended installation of a Linux OS, both Red Hat, Inc.
IMPORTANT: Write down or copy and paste into a file the path displayed in the Installation configuration path on disk or Installation configuration path via http: (for remotely hosted repositories) field because this is the location in the repository to which you will copy the installation configuration file. 8. Select OK to return to the Manage Repository screen. 6.2.
After you register an OS in the repository, you must copy the vendor-supplied installation files and boot files to the appropriate OS-specific directories under the /opt/repository/os and /opt/repository/boot directories. The OS distribution medium can be a series of CDs, DVDs, or .iso files.
6.3.3 Copying SLES into the repository
The steps required to copy SLES OS installation and boot files differ between SLES Version 9 and Versions 10 and 11:
• “Copying SLES version 10 or version 11 into the repository” (page 71)
• “Copying SLES version 9 into the repository” (page 73)
6.3.3.
Example 6-1 Repository directory structure for SLES versions 10 and 11 OS installation files on CD
3. Copy the contents of each installation disk into its own directory. For example, copy the contents of the first CD into the CD1 directory, and so on.
4. Copy the kernel and RAM disk boot files to the related boot target directory. The kernel file name is linux and the RAM disk file name is initrd.
6.3.3.2 Copying SLES version 9 into the repository IMPORTANT: SLES Version 9 supports network installations from web servers running only on port 80, and the default port used by Insight Control for Linux for network installations is Port 60000. Configuring your repository web server to use port 80 will cause a port collision with the Apache web server that came with your OS because the Apache server was pre-configured to also listen on port 80.
NOTES: Do not confuse SUSE Professional Version 9 with SLES Version 9: • SUSE Professional Version 9 often comes on a single DVD. SUSE Professional Version 9 is free, but it is unsupported for use with HP Insight Control for Linux. • SLES Version 9 is supported on managed systems only. The OS directory and the boot target directory where you copy the installation files were provided to you during the OS registration process described in Section 6.2.2 (page 64).
Example 6-2 Repository directory structure for SLES version 9 OS installation files
Follow these guidelines to create the subdirectory names:
• Use all capital letters for the directory names.
• The CD directory names must use the form CDn.
• The CD subdirectory name must include a number, for example: CD1.
The following example creates a portion of the required subdirectories:
# cd /opt/repository/os/SLES9-SP3-i386
# mkdir CORE9
# cd CORE9
# mkdir CD1
# mkdir CD2
# mkdir CD3
6.
# mkdir CD4
# mkdir CD5
# cd ..
# mkdir SLES9
# cd SLES9
# mkdir CD1
.
.
.
2. Follow these guidelines to open the SLES Version 9 distribution media to copy each CD to the appropriate directory. On a Linux workstation or laptop, it is common to mount a CD or loopback mount an ISO image. For example:
# mount /dev/cdrom /mnt
Or
# mount -o loop /home/sles9-disc1.iso /mnt
Begin by inserting or mounting the first CD and copying the contents to the appropriate subdirectory as described in step 3.
# printf "/SLES9/CD1\t/SLES9/CD1\n" >> yast/order
# printf "/CORE9/CD1\t/CORE9/CD1\n" >> yast/order
6. Copy the kernel and RAM disk boot files to the related boot target directory. The kernel file name is linux and the RAM disk file name is initrd.
# cp SVRP3/CD1/boot/loader/linux /opt/repository/boot/SLES-9-SP3-i386Boot/
# cp SVRP3/CD1/boot/loader/initrd /opt/repository/boot/SLES-9-SP3-i386Boot/
7.
unique identifier for the server you are installing the OS on. This file contains a list of variables in name=value format. The following is an example of the file contents:
hostname="host_name"
mac="MAC_address"
ip="IP_address"
netmask="255.255.240.0"
gateway="gateway_IP_address"
nameservers="name_servers_IP_addresses"
domains="domain_names"
boot_method=pxe
kernel="vmlinuz"
ramdisk="/CentOS5Boot/initrd.
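The name=value format shown in the example above is simple enough to read with a few lines of code, which can be handy when you want to inspect or validate such a file. The following is a minimal sketch only: the function name `parse_name_value` is hypothetical, and the handling of quotes (stripping one pair of surrounding double quotes) is an assumption based on the sample contents, not a documented rule.

```python
def parse_name_value(text):
    """Parse lines of the form name=value or name="value" into a dict.

    Assumption: a value may be wrapped in one pair of double quotes,
    as in the sample file contents; lines without '=' are ignored.
    """
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or "=" not in line:
            continue
        # Split on the first '=' only, so values may contain '='.
        name, _, value = line.partition("=")
        value = value.strip()
        if len(value) >= 2 and value[0] == value[-1] == '"':
            value = value[1:-1]  # drop the surrounding double quotes
        settings[name.strip()] = value
    return settings
```

Given the two lines `netmask="255.255.240.0"` and `boot_method=pxe`, the function returns a dictionary in which `netmask` maps to `255.255.240.0` and `boot_method` maps to `pxe`.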
4. In the Description column, find the PSP description that corresponds to the PSP you want to download and select it. Do not select Download >> here. Wait until the next step. Select the link in the Description column that corresponds to a multi-part download. 5. 6. Select the Download >> button that corresponds to the PSP *tar.gzip file name, for instance psp-8.2.sles10.linux.en.tar.gz. When prompted, select Save to save the *tar.
4. Select OK at the bottom right of the screen to complete the deletion process. The following message appears:
The selected items have been deleted
NOTE: Deleting an item from the repository does not delete the corresponding directory in the /opt/repository directory nor does it delete the files that you might have copied to that directory. If you want to delete or move the directory and files, you must do so manually.
5. Select OK again to refresh the screen and remove the item from the table.
7 Installing operating systems on managed systems
This chapter addresses the following topics:
• “Linux OS installation overview” (page 81)
• “Using Kickstart or AutoYaST files for unattended installations” (page 82)
• “Prerequisites to OS installations on managed systems” (page 87)
• “Installing RHEL on managed systems” (page 89)
• “Installing SLES on managed systems” (page 91)
• “Installing VMware ESX and VMware ESXi operating systems” (page 91)
• “Installing another variant of Linux on managed systems
• An unattended installation method reads RHEL Kickstart and SLES AutoYaST installation files, which contain the responses to the OS installation process. This method requires no user interaction. For more information about using Kickstart and AutoYaST files for unattended installations, see Section 7.2 (page 82).
IMPORTANT: • HP recommends copying and using the default files as templates to create customized installation configuration files that are suitable for your environment. Familiarize yourself with the contents and usage comments in the configuration file templates, and use them to make versions that are appropriate for your own environment. • Some default templates specify port 60000, the default web server repository port number, by that number.
OS version | Directory name in /opt/repository/instconfig
RHEL Version 5 Update 2 (for Virtual Guests) | rh052-virt-guest
RHEL Version 5 Update 3 | rh053
RHEL Version 5 Update 3 (for Management Hubs) | rh053-management-hub
VMware ESX Version 3.0 | esx030
VMware ESX Version 3.5 | esx035
VMware ESX Version 4.0 | esx040
• The associated installation configuration files are stored in the appropriate OS-specific directory under /opt/repository/instconfig and use the same naming convention. For example, rh053.
Table 7-1 HP Insight Control for Linux macros for installation configuration files Macro name Description %%agentinstall%% This macro is unique to Insight Control for Linux. During installation, it expands into a shell script that downloads the two PSP components from the CMS and installs only the packages that HP SIM and Insight Control for Linux need to be able to monitor the managed system properly.
7.2.3 Installation configuration files for custom operating systems You can upload installation configuration files for unsupported operating systems into the Insight Control for Linux repository. However, the OS installation process does not have a built-in mechanism for linking the installation configuration files to a given installation.
s/$/ console=ttyS0/
w
!
ex /etc/inittab <> /etc/securetty
1 The designation Space-TAB means a space character followed immediately by a tab character. Thus, this line can be interpreted as: /^[ \t]*kernel
For managed systems that are virtual hosts:
ex /boot/grub/menu.lst <
• The target managed system or systems and their management processors have been discovered and are associated with each other in HP SIM.
• You have set the user name and password on the management processors. For more information about setting or changing management processor credentials, see Section 21.1 (page 205). For more information on management processor credentials themselves, see “Management Processor Credentials” (page 211).
Table 7-2 Download web address for initrd files (continued)
Operating system | Architecture | FTP download web address
SLES 10 SP3 | x86 and AMD64 | No modifications required
SLES 11 | x86 and AMD64 | No modifications required
Before you can use Insight Control for Linux to install Linux on these servers, you must:
• Download the files and copy them to the appropriate directories under /opt/repository/boot, overwriting the original initrd supplied with the distribution of the corresponding Linux operati
NOTES:
• During installation, when specifying the HTTP setup, you will be prompted for the IP address of the CMS and the path name for the RHEL installation. For example:
http://CMS-IP-addr:CMS-port/path-name
Where:
CMS-IP-addr is the IP address of the CMS.
CMS-port is the port number of the repository web server that you specified when you installed Insight Control for Linux. The factory default value is 60000.
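The URL form described in this note can be assembled from its parts. The following is a minimal shell sketch and not part of the product: the function name repo_url, the address 192.0.2.10, and the path os/rhel are hypothetical placeholders, and the port defaults to 60000 only because that is the factory default stated above.

```shell
# Hypothetical helper: print http://CMS-IP-addr:CMS-port/path-name.
# Arguments: CMS IP address, path name, optional port (defaults to 60000).
repo_url() {
    cms_ip=$1
    path_name=${2#/}            # drop one leading slash, if present
    cms_port=${3:-60000}
    printf 'http://%s:%s/%s\n' "$cms_ip" "$cms_port" "$path_name"
}

# Placeholder values for illustration only.
repo_url 192.0.2.10 os/rhel
repo_url 192.0.2.10 /os/rhel 8080
```

Use the port that was actually chosen when Insight Control for Linux was installed; the default applies only if it was never changed.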
7.5 Installing SLES on managed systems
This section describes the two methods for installing SLES to one or more managed systems:
• “Installing SLES using an unattended method” (page 91)
• “Installing SLES interactively” (page 91)
NOTE: When you use HP Insight Control for Linux installation tools to install SLES on a managed system, HP Insight Control for Linux automatically edits the /etc/ssh/sshd_config file and turns on password authentication in this file.
IMPORTANT:
• Installing a virtualization OS on a server erases any pre-existing data on that system. Before you begin, be sure that you have captured or backed up any data you want to retain. Preserving user data on volumes other than the principal target volume is not guaranteed. Presume that existing data on primary and secondary volumes is erased. The tasks for installing the virtualization OS are launched from the following HP SIM menu: Deploy→Operating System
• The VMware ESX 3.
8. Optionally, you may set the root account password at this step. If you want the target system to use the default root password (root), select the Use Default Root Password option. To set a root password other than the default, select the Specify Root Password option, enter the root password, and verify the entry. HP recommends setting a strong root password on all your servers.
9. Do one of the following to start the installation:
• Select Run Now to launch the installation operation immediately.
5. Select the virtualization OS to install and select Next>. Only the virtual machine OS that applies to your installation is available for you to select from the menu. IMPORTANT: The list contains only those virtualization operating systems that are registered in the repository and copied to it. If you select a virtualization OS that has been registered, but the installation files have not been copied to the repository, a validation error appears. 6. 7.
NOTE: When performing an ESXi installation using virtual media, to facilitate the installation, Insight Control for Linux does not automatically remove the ISO image that was created. This ISO image contains the RAM Disk and removing the ISO image while RAM disk is loaded causes the installation to fail. HP recommends, if disk space is a concern, that you remove the ISO image manually. The ISO image is named using the server's Globally Unique IDentifier (GUID).
2. Select the menu item that reflects the appropriate OS type and installation method you want to use. Your choices are:
Red Hat Interactive
Red Hat (Kickstart)
SLES Interactive
SLES (AutoYaST)
Custom or Other
3. Do one of the following to select and verify that the server or servers shown in the target list are the servers to which you want to install an OS:
• Proceed to the next step if the target list is correct.
• Select Add Targets... or Remove Target to modify the list, if the list is incorrect.
In the previous kernel append line, 172.1.1.1 is the IP address of the CMS management interface, 60000 is the TCP port on which the repository web server is listening, and /instconfig/directory/filename.cfg is the path to the configuration file. 9. Optionally, for unattended installations using a Kickstart or AutoYaST file, you may set the root account password at this step. If you want the target system to use the default root password (root), select the Use Default Root Password option.
8 Using HP Insight Control for Linux to update HP ProLiant firmware
This chapter addresses the following topics:
• “Overview of updating HP ProLiant firmware” (page 99)
• “Basic firmware update functionality” (page 100)
• “Advanced firmware update functionality” (page 103)
8.1 Overview of updating HP ProLiant firmware
Keeping firmware up to date is a challenging but necessary task. Each ProLiant server usually has several devices that require regular firmware updates, which can create a burden.
8.2 Basic firmware update functionality Basic firmware update functionality is designed to provide an easy to set up and easy to use way of updating firmware for people who simply want to keep their firmware up to date. Just download the latest firmware CD, install it into the Insight Control for Linux repository, and run the update. 8.2.1 Initial setup Before you can initiate a firmware update on a server, you need to download and prepare the firmware files and tools that will do the work.
8.2.3 Updating firmware on systems during bare-metal discovery
With Insight Control for Linux, you can update the firmware on a server automatically as part of the bare-metal discovery process. This is a convenient way to ensure all the hardware added to your environment has the latest firmware. To enable this feature, edit the Insight Control for Linux properties file, /opt/mx/icle/icle.
8.2.6 Adding or removing firmware files from the firmware tar file
HP continuously releases new firmware for various devices, and these new releases are usually included in the next revision of the Firmware Maintenance CD. However, there might be times when you want to start using this new firmware before the next CD is released.
To modify the firmware update timeout, edit the /opt/mx/icle/icle.properties file. The FW_UPG_WAIT_TIMEOUT parameter controls the length of the firmware update timeout; that line in the file resembles the following: FW_UPG_WAIT_TIMEOUT=900 Determine the value (in seconds) that is appropriate for your installation and assign it to the FW_UPG_WAIT_TIMEOUT value. 8.
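The timeout edit described above can also be scripted. The following is a minimal sketch under stated assumptions: the function name set_fw_timeout is hypothetical, the value 1800 is only an illustration, and the demonstration works on a scratch copy rather than on /opt/mx/icle/icle.properties. It replaces an existing FW_UPG_WAIT_TIMEOUT line, or appends one if the file has none.

```shell
# Hypothetical helper: set FW_UPG_WAIT_TIMEOUT in a properties file.
# Arguments: properties file, timeout in seconds.
set_fw_timeout() {
    file=$1
    seconds=$2
    case $seconds in
        ''|*[!0-9]*)
            echo "timeout must be a whole number of seconds" >&2
            return 1 ;;
    esac
    tmp=$(mktemp) || return 1
    if grep -q '^FW_UPG_WAIT_TIMEOUT=' "$file"; then
        # Replace the existing line.
        sed "s/^FW_UPG_WAIT_TIMEOUT=.*/FW_UPG_WAIT_TIMEOUT=$seconds/" "$file" > "$tmp"
    else
        # No such line yet: keep the file and append one.
        cat "$file" > "$tmp"
        echo "FW_UPG_WAIT_TIMEOUT=$seconds" >> "$tmp"
    fi
    # Write back in place so the file keeps its owner and permissions.
    cat "$tmp" > "$file" && rm -f "$tmp"
}

# Demonstration on a scratch file, not the live properties file.
props=$(mktemp)
printf 'FW_UPG_WAIT_TIMEOUT=900\n' > "$props"
set_fw_timeout "$props" 1800
grep '^FW_UPG_WAIT_TIMEOUT=' "$props"   # prints FW_UPG_WAIT_TIMEOUT=1800
rm -f "$props"
```

Back up the properties file before changing it on a production CMS.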
172.31.64.99=prodfirmware.tar
server1=firmware.tar
01:00:ab:67:45:ff=latest-firmware.tar
IMPORTANT: Ensure that system values are unique in the file. For example, there should not be two identical MAC addresses in the same configuration file. Wildcards are not supported in the configuration file. MAC addresses are case insensitive and must be separated by colons (:).
8.3.2 Example firmware configuration files
The following are examples of configuration files:
Example 1
prod-server-1=production-firmware.
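The rules in the IMPORTANT note in Section 8.3.1 (unique system values, no wildcards, case-insensitive MAC addresses) can be checked before a configuration file is used. The following is a minimal sketch, not a product tool: the function name validate_fw_config is hypothetical, only * and ? are treated as wildcard characters, and every system value is compared in lower case, which assumes that host names can also be treated as case insensitive.

```shell
# Hypothetical checker for a firmware configuration file (system=tarfile lines).
# Prints one message per problem; returns non-zero if any problem was found.
validate_fw_config() {
    awk -F= '
        /^[ \t]*$/ { next }                         # ignore blank lines
        NF != 2 || $1 == "" || $2 == "" {
            printf "line %d: expected system=tarfile\n", NR; bad = 1; next
        }
        $1 ~ /[*?]/ {
            printf "line %d: wildcards are not supported\n", NR; bad = 1
        }
        {
            key = tolower($1)                       # compare case-insensitively
            if (key in seen) {
                printf "line %d: duplicate system %s\n", NR, $1; bad = 1
            }
            seen[key] = NR
        }
        END { exit bad + 0 }
    ' "$1"
}

# Demonstration with the three sample entries, then with a duplicated MAC.
good=$(mktemp)
printf '%s\n' '172.31.64.99=prodfirmware.tar' 'server1=firmware.tar' \
    '01:00:ab:67:45:ff=latest-firmware.tar' > "$good"
validate_fw_config "$good" && echo "sample entries pass"

bad=$(mktemp)
cat "$good" > "$bad"
echo '01:00:AB:67:45:FF=other.tar' >> "$bad"
validate_fw_config "$bad" || echo "duplicate MAC address rejected"
rm -f "$good" "$bad"
```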
9 Installing PSPs on managed systems
This chapter addresses the following topics:
• “Overview of the PSP installation tool” (page 105)
• “Required PSP components” (page 105)
• “Creating a PSP dependency script” (page 106)
• “PSP installation procedure” (page 107)
9.1 Overview of the PSP installation tool
The HP Insight Control for Linux PSP installation tool enables you to install any or all PSP components on one or more managed systems.
NOTE: The RPMs for these PSPs are OS- and platform-specific and are named as such, for example, the HP ProLiant Channel Interface for Red Hat Enterprise Linux 4 (x86_64). 9.3 Creating a PSP dependency script Some of the utilities contained in the PSP have RPM dependencies that must be met in order for them to install correctly. These dependencies are documented in the HP ProLiant Support Pack User Guide and are not listed here.
IMPORTANT: If an errata kernel is installed on the managed system, ensure that the errata kernel version is supported by the PSP package you want to install.
9.4 PSP installation procedure
NOTE: If you use a PSP dependency script, ensure that it is registered and copied to the Insight Control for Linux repository.
After you have created and copied the PSP dependency script to the /opt/repository/pspscript/example_dependency.sh directory, follow these steps to begin the installation process:
1.
If any of the selected software components did not install successfully on the target managed system for any reason, including package dependency failures, the final state of the task on that system is Failed. Review the log carefully because it contains important PSP installation results. If any of the PSP components failed to install due to RPM dependency requirements, you must resolve the RPM dependencies and run the Install ProLiant Support Pack (PSP)... tool again.
10 Capturing and deploying Linux images
This chapter addresses the following topics:
• “Overview of capturing and deploying Linux images” (page 109)
• “Prerequisites to capturing a Linux image” (page 112)
• “Capturing a Linux image from a managed system” (page 114)
• “Preparing for scalable deployment” (page 116)
• “Deploying a captured Linux image to one or more managed system” (page 119)
• “HP Insight Control for Linux partition wizard overview” (page 121)
10.
Using the Partition Wizard during image deployment requires an in-depth understanding of Linux, kernel modules, and the grub boot loader.
NOTE: To account for the time it may take to capture or deploy a very large image over a slow network, a timeout of five days is in effect for capturing or deploying a Linux image so that you can determine if an operation hangs. HP recommends that you check your task results to verify the status of any running jobs.
10.1.
GATEWAY
GATEWAYDEV
IMAGESERVER
The script is run in a chroot environment so there is no need to configure paths relative to the HP Insight Control for Linux environment. For information on how these scripts can be used, see the comments in the example scripts provided with Insight Control for Linux.
10.1.
NOTE: The Predeployment and Postdeployment scripts run in the Insight Control for Linux RAM disk. The managed server's file system is mounted as read/write in the RAM disk under the /mnt/target mountpoint. If your scripts manipulate any files on the managed server, remember to specify /mnt/target in the path; otherwise you will only be manipulating the RAM disk files. You can use any script that is registered and copied to the repository. Insight Control for Linux also creates the /tmp/variables.
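The /mnt/target rule in this note is easy to forget inside a script. The following is a minimal sketch, not taken from the example scripts shipped with the product: the helper name target_path and the TARGET_ROOT variable are hypothetical, and /etc/motd is only a placeholder path. The helper turns a path on the managed server into the path under which the RAM disk sees it.

```shell
# Root of the managed server's file system as seen from the RAM disk.
# It can be overridden so the sketch can be tried outside a deployment.
TARGET_ROOT=${TARGET_ROOT:-/mnt/target}

# Hypothetical helper: map a managed-server path to its RAM disk location.
target_path() {
    printf '%s/%s\n' "$TARGET_ROOT" "${1#/}"
}

# A script that wrote to /etc/motd directly would change the RAM disk copy;
# writing to the mapped path changes the managed server's file instead.
target_path /etc/motd
```

With the default TARGET_ROOT, the last command prints /mnt/target/etc/motd.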
IMPORTANT: You must ensure that the captured image kernel supports the selected file system types, including the Logical Volume Manager (LVM). It is especially important that the initial RAM disk of the captured image (initrd*.img) has file system support for the selected /boot and / (root) partitions, because the initial RAM disks are not remade by the Deploy→Operating System→Deploy Linux Image... task. It is possible to remake the initial RAM disk as part of a post-installation script.
Example 10-1 shows a sample /etc/fstab file. In Example 10-1, two additional serial SCSI disks are located on the source system (/dev/cciss/c0d1 and /dev/cciss/c0d2), and the system disk resides on /dev/cciss/c0d0.
IMPORTANT: For the capture operation to be successful, /dev/cciss/c0d1 and /dev/cciss/c0d2 must be partitioned, and the dump flag must be set to 1.
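The dump flag is the fifth field of an /etc/fstab entry, so the requirement above can be checked before a capture is started. The following is a minimal sketch, not a product tool: the function name check_dump_flag is hypothetical, the fstab lines in the demonstration are invented for illustration and are not the contents of Example 10-1, and the check assumes that cciss partitions are named by appending p and a number to the disk name (for example, /dev/cciss/c0d1p1).

```shell
# Hypothetical check: every fstab entry on the named disks has dump flag 1.
# Arguments: fstab file, then one or more disk devices.
check_dump_flag() {
    fstab=$1
    shift
    rc=0
    for dev in "$@"; do
        awk -v dev="$dev" '
            /^[ \t]*#/ { next }                         # skip comment lines
            $1 == dev || index($1, dev "p") == 1 {      # the disk or one of its partitions
                found = 1
                if ($5 != 1) bad = 1                    # field 5 is the dump flag
            }
            END { if (found && !bad) exit 0; exit 1 }
        ' "$fstab" || {
            echo "$dev: not listed in $fstab, or dump flag is not 1" >&2
            rc=1
        }
    done
    return $rc
}

# Demonstration with an invented fstab; c0d2 has its dump flag set to 0.
demo=$(mktemp)
printf '%s\n' \
    '/dev/cciss/c0d0p1 /boot  ext3 defaults 1 2' \
    '/dev/cciss/c0d1p1 /data1 ext3 defaults 1 2' \
    '/dev/cciss/c0d2p1 /data2 ext3 defaults 0 2' > "$demo"
check_dump_flag "$demo" /dev/cciss/c0d1 && echo "c0d1 is ready for capture"
check_dump_flag "$demo" /dev/cciss/c0d2 || echo "c0d2 needs attention"
rm -f "$demo"
```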
3. Do one of the following to select the managed system from which you want to capture an image:
• If no server is shown in the list, do the following:
a. Select Collection.
b. Select All Servers from the drop down menu.
c. Select View Contents to display and select from the list of available servers.
d. Select Apply when you have selected a server.
• Select Next> if the target list is correct.
• Select Add Targets... or Remove Target to modify the list, if the list is incorrect.
NOTE: After an image has been captured from a managed system, a sanity size check is performed on the captured image to verify that it is valid. By default, if the captured image size is less than 1,000,000 bytes, the image is considered invalid, because most operating systems yield an image size greater than that. Typically an image size less than 1,000,000 bytes indicates that an error occurred that prevented Insight Control for Linux from capturing the image.
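The size check described in this note can be reproduced by hand on a captured image file. The following is a minimal sketch of the same rule, not the product's implementation: the function name image_size_ok is hypothetical, and the second argument exists only so that a different threshold can be tried.

```shell
# Hypothetical check: an image is plausible only if it is at least
# min_bytes long (1,000,000 bytes unless a second argument is given).
image_size_ok() {
    image=$1
    min_bytes=${2:-1000000}
    [ -f "$image" ] || return 1
    size=$(( $(wc -c < "$image") ))   # arithmetic strips any padding from wc
    [ "$size" -ge "$min_bytes" ]
}

# Demonstration with two scratch files.
small=$(mktemp)
big=$(mktemp)
printf 'truncated' > "$small"
dd if=/dev/zero of="$big" bs=1000 count=1000 2>/dev/null   # exactly 1,000,000 bytes
image_size_ok "$small" || echo "small file would be considered invalid"
image_size_ok "$big" && echo "large file passes the size check"
rm -f "$small" "$big"
```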
Figure 10-1 Network groups example The concept behind a Scalable Deployment is to transfer an OS image tar file from the CMS to the group leader in each network group. After the image tar file is completely transferred, the group leader transfers the image to each of the remaining servers in the network group. The advantage to this concept is that all network traffic is kept local to the switch or enclosure.
NOTE: This does not create a collection for the network group.
Verify the collection entry for the group by examining the netgroups.conf file. It has an entry similar to the following:
enclosureA=n121 n[122-124] n133
For Switches
1. Select Customize... in the System and Event Collections panel. This figure shows the location with a red arrow. The Customize Collections window appears.
2. Select New... in the Customize Collections window.
i. Select OK to continue.
j. Generate the netgroups.conf file with the following command:
# /opt/hptc/bin/netgroup --ofile /opt/mx/icle/netgroups.conf
k. Verify the collection entry for the group by examining the netgroups.conf file. If the name of the group is Switch1 and the servers that comprise it are n1, n3, n4, and n5, the netgroups.
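When verifying a netgroups.conf entry, it can help to list the group members one per line. The following is a minimal sketch, not a product tool. It assumes that a bracketed form such as n[122-124] stands for the inclusive range n122 through n124; that reading is an assumption, and the function name expand_nodes is hypothetical. Numbers with leading zeros are not handled.

```shell
# Hypothetical helper: print each member of a netgroups.conf value on its
# own line, expanding prefix[first-last] into individual names.
expand_nodes() {
    set -f              # keep n[122-124] from being treated as a file glob
    set -- $1           # split the value on white space
    set +f
    for token in "$@"; do
        case $token in
            *\[*-*\])
                prefix=${token%%\[*}
                range=${token#*\[}
                range=${range%\]}
                i=${range%-*}
                last=${range#*-}
                while [ "$i" -le "$last" ]; do
                    printf '%s%s\n' "$prefix" "$i"
                    i=$((i + 1))
                done
                ;;
            *)
                printf '%s\n' "$token"
                ;;
        esac
    done
}

# Split a sample entry into group name and members, then expand the members.
entry='enclosureA=n121 n[122-124] n133'
echo "group: ${entry%%=*}"
expand_nodes "${entry#*=}"
```

For the sample entry, this lists n121, n122, n123, n124, and n133.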
b. Select All Servers from the drop down menu.
c. Select View Contents to display a list of available managed systems in the collection.
d. Select one or more managed systems from the list.
e. Select Apply.
f. Select Next> after you verify that the managed systems list is correct.
• Select Next> if the target list of selected managed systems is already correct.
• Select Add Targets... or Remove Target to modify the list, if the list of selected managed systems is incorrect.
Figure 10-2 Existing disk partition scheme
See Section 10.6 (page 121) for a general overview of the Partition Wizard and how to use it to edit disk partitions and volume groups. Select Next> after you have completed customizing the disk partition layout.
9. Optionally select any or all of the following types of scripts (one of each):
• Predeployment script
• Postdeployment script
• Final Deployment script
For information on these scripts, see Section 10.1.3 (page 111).
10.
(root) and swap partition. You could capture this image and later deploy it to multiple servers as /, swap, /opt, /usr , and /var without having to manually manipulate the image. The Partition Wizard user interface provides a representation of the disk partition layout to be applied to the target server before laying down the image. Because the Partition Wizard does not know about the storage media, it works with a generic representation that has been created to describe the storage media.
IMPORTANT: A deployed image might not boot if you do not follow these guidelines and requirements. If you cannot meet the requirements or you are not experienced with Linux, kernel modules or grub, HP recommends that you deploy an image using the partitioning scheme in the image itself rather than using the advanced Partition Wizard. For additional limitations or restrictions regarding the use of the Partition Wizard, see the HP Insight Control for Linux Release Notes. 10.6.
11 Understanding tasks and task results
This chapter addresses the following topics:
• “Task results overview” (page 125)
• “Understanding task results” (page 125)
• “Task results page” (page 125)
• “Common task results” (page 127)
• “SIM standard task results format” (page 130)
• “Scalable task results format” (page 134)
11.1 Task results overview
HP SIM and HP Insight Control for Linux enable you to manage systems by scheduling and running tasks.
Figure 11-1 Task results page
Table 11-1 lists the components of the Task Results page.
Table 11-1 Components of the Task Results page
Component | Description | Available in SIM standard view, scalable view, or common to both views
Task Instance Results | Provides the status of the running task or the task that is selected in the task list log at the top of the page. | Common
Use SIM Standard Task | This option is only offered when you run an Insight Control for Linux task. | Common
Table 11-1 Components of the Task Results page (continued)
Component | Description | Available in SIM standard view, scalable view, or common to both views
Use Scalable Task Results Format radio button | This format is unique to Insight Control for Linux tasks and is only available as an option when you run an Insight Control for Linux task. Selecting this radio button provides an operation oriented format that enables you to view the status of each operation in a task as it completes on each target. | Common
11.4.1.1 Stopping a task When you select the Stop button in the Task Instance area, the task status is immediately set to Cancelled. The stop process attempts to cancel the task for all targets with non-terminal statuses, regardless of whether or not they have begun running. The stop operation does not affect targets that have already reached a terminal status.
• All task level results
• All parameters displayed in the Parameters pop-up window
• Target level results, including:
— All information displayed in the target status table
— All target details, including all information displayed in the operation status table and the log for each operation
NOTE: If you select All Systems for the report, the target level results are displayed for all targets, each separated by a line.
11.4.2.
Figure 11-5 View of the operation details log 11.5 SIM standard task results format This section describes the portions of the Task Results page that are specific to the SIM Standard Task Results Format, which is the default view. Figure 11-6 illustrates the SIM Standard Task Results format. The figure shows the task results for an instance of a Red Hat Kickstart OS installation task running on three target servers.
Figure 11-6 SIM standard task results format 11.5.1 Summary status and target status area Figure 11-7 illustrates the Summary status: area and target status area, which provide the overall status of a task on each target server. Figure 11-7 View of the summary status and target status areas Table 11-2 describes the information displayed in the Summary status: area. 11.
Table 11-2 Description of target status area
Column heading | Description
Target Name | Name of the target managed system on which the task was run.
Status | The status of a target is computed from the status of its operations.
Non-terminal target status
Pending: All operations can have the Pending status.
Running: At least one operation has the status Running. A percent complete is also displayed.
11.5.1.2 Log button in the target status area
When you select the Log button, a new window opens that displays the log for all operations for the task, including the following information:
• A summary of the task level information
• The information displayed in the target status table for the currently selected target
• A block of information for each operation in the task, including the log
The log screen does not auto-refresh.
Table 11-3 Description of target details table
Column heading | Description
Operation Name | The name of the operation
Status | Non-terminal operation status
Pending: If this is the first operation, execution of the task for the target has not started. For any other operation, Pending means that one or more of the preceding operations has a non-terminal status.
Running: This operation is currently being run. A percent complete is also displayed. Only one operation for a target can be run at a time.
Figure 11-9 Scalable task results format 11.6.1 Operations table Figure 11-10 illustrates the Operations table, which lists individual operations within a task and provides the status of the entire operation as it starts and completes on each target server. The important thing to know is that operation status represents the status of the operation on every target server.
Table 11-4 lists the information displayed in the Operations table.
Table 11-4 Description of the operations table
Column heading | Description
Operation Name | The name of the operation that is run as a component of an Insight Control for Linux task.
Status | Complete: The operation has successfully completed on all target servers.
Pending: The operation has not yet started or is not yet complete on all target servers.
Table 11-5 lists the information displayed in the Operation Target Details table.
Table 11-5 Description of the operation target details table
Column heading | Description
Target Name | The name of the target on which the operation ran or is running.
Status | Complete: The operation has successfully completed on the target servers.
Pending: The operation has not yet started or is not yet complete on the target servers.
12 Installing and setting up virtual machines This chapter addresses the following tasks, which you must complete in this order: 1. 2. 3. 4. 5.
12.2 Registering the virtual host with HP Insight Control virtual machine management IMPORTANT: Before a host can be registered, either through the VM Host registration menu item or with the Configure→Configure or Repair Agents... task, the sign-in credentials must be specified either as global credentials, discovery credentials, or system credentials.
12.3 Creating and installing virtual guests Generally this section discusses how: • To create the virtual guest: HP suggests that you use the vCenter application for VMware ESX and VMware ESXi. HP suggests that you use the virt-manager utility for Xen. • To install and configure an operating system that will run on the virtual guest.
boot: linux ks=http://cms:port/instconfig/os/os.cfg ksdevice=device
Where:
cms | Is the fully-qualified IP address of the CMS
port | Is the port used
os_specifier | Specifies the operating system to be installed on the virtual guest, for example, RHEL5U2-i386 or SLES10SP2-i386/DVD1. Some releases of SLES may specify CD1 instead of DVD1.
device | Is the network device to connect to the network.
For example,
boot: ks=http://mercury.example.com:60000/instconfig/rh054-virt-guest/rh054-virt-guest.
TIP: Consider matching the machine name to the host name in the virtual machine map you established earlier. See Section 21.10 (page 210). • • Select the Paravirtualized option for the virtualization method.
NOTE: During the discovery of the virtual host, HP SIM identifies the virtual guests from the perspective of the virtual host, assigning the name {virtual-host-name}_{virtual-guest-name}. During the discovery of the virtual guest, HP SIM becomes aware of the host name and IP address given to the virtual guest during OS installation.
The virtual hosts and virtual guests can now be monitored with HP Insight Control for Linux. You can perform additional HP Insight Control virtual machine management operations at this time. For information on HP Insight Control virtual machine management operations, see Insight Control Virtual Machine Management User Guide. 12.
13 Managing Insight Control for Linux collections for monitoring
This chapter addresses the following topics:
• “Introduction to collections” (page 147)
• “Populating a collection” (page 148)
• “Adding servers and switches to an existing Insight Control for Linux collection” (page 148)
• “Removing a managed system or switch from an Insight Control for Linux collection” (page 149)
• “Removing a management hub” (page 150)
13.
Table 13-1 HP Insight Control for Linux subcollections (continued) Object type Subcollection name Description How populated Enclosures {collection_name}_Enclosures If the hardware configuration contains HP blade servers and enclosures, this collection provides access to the enclosures. Switches Populated manually only. {collection_name}_Switches All switches placed in this subcollection are monitored by Insight Control for Linux.
1. Use the instructions in Chapter 3 (page 37) and Chapter 4 (page 45) to perform the following tasks to prepare servers: • Discover the server or servers. Make sure you follow the appropriate discovery process because the procedure differs for bare-metal servers and servers that already have a supported Linux OS installed on them. • Deploy a Linux OS to the server if it does not have an OS installed.
1. Select the following menu item from the HP Insight Control user interface to remove the management agents from the managed systems that you no longer want HP Insight Control for Linux to monitor and manage: Deploy→Deploy Drivers, Firmware and Agents→IC-Linux→Uninstall Agents... NOTE: Run Uninstall Agents... only if you are removing a managed system. Omit this step if you are removing a switch. 2. Remove the managed systems or switches from the existing Insight Control for Linux collection: a.
14 Using graphical tools to monitor managed systems
This chapter addresses the following topics:
• “HP Insight Control for Linux system monitoring overview” (page 151)
• “Nagios overview” (page 152)
• “Using Nagios” (page 156)
• “Services monitored by Nagios” (page 164)
• “Understanding Nagios alert messages” (page 166)
• “Understanding system event log monitoring” (page 167)
• “Configuring Nagios email alerts” (page 167)
• “Monitoring Metrics in real time” (page 168)
14.
NOTE: Insight Control for Linux does not support monitoring of virtual hosts running VMware ESXi, and does not support servers or virtual guests running Microsoft Windows.
14.1.1 Collecting metrics through a management processor
HP Insight Control for Linux supports management processors using the iLO or IPMI protocols for gathering sensor and system event log information. To access a system’s management processor, you must configure the management processor credentials in HP SIM.
Nagios, as provided with HP Insight Control for Linux, is configured with system and network service checks already in place for your system; these network service checks are automatically configured for each managed system. Nagios obtains its sensor and metric data from the Supermon open source monitoring application, which is integrated with Insight Control for Linux. Figure 14-1 illustrates the interaction of these tools.
14.2.2 Launching Nagios To launch Nagios, you must have a valid certificate for the Apache service. To configure an Apache certificate, see Section 5.2 (page 55). Select the following menu item from the HP Insight Control user interface to launch Nagios: Tools→Integrated Consoles→Nagios The Nagios main window shown in Figure 14-2 appears when you launch Nagios. Figure 14-2 Nagios main window From the Nagios main window, you can choose any of the menu options on the left navigation bar.
Figure 14-3 Nagios menu options 14.
The Nagios menu options offer various views of the managed systems. After you choose an option, Nagios prompts for a login name and a password. This login and password were established when you installed and configured Insight Control for Linux. Insight Control for Linux provides plug-ins that monitor these views and other system statistics.
14.3.1 Viewing network health Select the Tactical Overview menu option to obtain an overall view of the managed systems. Figure 14-4 shows an example of this view. Figure 14-4 Nagios tactical overview The top of the window provides information about the network. It provides the number of network outages and information on the network health in terms of the Nagios hosts and Nagios services. The next portion of the window contains information about the Nagios hosts.
In Figure 14-4, one host is down. Select the link in that box to open the Host Status Details window for that host, which identifies the hosts and provides some status information about them. Nagios services are described in the next portion of the window.
14.3.2 Viewing Nagios hosts and services
When the hardware configuration contains dozens of managed systems, the Service Detail view provides a good view of the Nagios hosts and their corresponding Nagios services.
Figure 14-6 Nagios service detail view The Status column displays any problems that might be occurring. To display the status of a service, select the link for the service in the Service column to open the Nagios Service Information view shown in Figure 14-7. 14.
Figure 14-7 Nagios service information view

14.3.3 Displaying hosts and services that are experiencing problems

The Service Problems view, which is accessed by selecting Problems Services (Unhandled) in the Nagios menu, is useful for configurations with hundreds of systems. It identifies the Nagios hosts that are experiencing problems and shows only the corresponding Nagios services whose status is not OK, so that you can monitor only those Nagios hosts that need attention.
Figure 14-8 Nagios service problems view

Select the link that corresponds to a Nagios host to open the Nagios Host Information view for that Nagios host. You can also use the Nagios report generator, nrg, to obtain an analysis of all Nagios services:

# nrg --mode analyze

For more information and examples of its use, see nrg(8).

14.3.
Figure 14-9 HP Graph default overview display Figure 14-10 HP Graph detail display of all managed systems If you want to display the graphical data for a selected Nagios host (a Nagios host can be a virtual host), select an item in the menu in the upper left-hand side. Figure 14-11 (page 164) shows the graphs for one managed system, osmone.
cpu system
    Shows how much of the CPU time has been spent on system-level tasks.
cpu usage
    Reports how much of the server's CPU set was spent in the user, system, and nice states. This is the default view.
load average
    Reports the 1, 5, and 15 minute load averages.
mem buffers
    Shows how much of the server's memory is allocated to system-wide memory buffers.
mem shared
    Reports the amount of memory shared among applications.
NOTE: The detail graphs for a system show the graphs for a specified metric on all Nagios hosts. The detail graphs for a Nagios host show all applicable metrics for that Nagios host.

Figure 14-11 HP Graph host display for one managed system

14.3.5 Gathering and displaying system environment data

HP Insight Control for Linux provides plug-ins that monitor environment data on each managed system, such as temperature and fan speed, which can be indicators of possible system failure.
Nagios plug-ins are located in the /opt/hptc/nagios/libexec directory on the CMS. Table 14-1 lists each Nagios plug-in service that runs on the CMS. The items in the Service Name column correspond to the Service column of the Nagios Service Detail View and Service Problems View windows, which are shown in Figure 14-5 (page 158) and Figure 14-8 (page 161), respectively.
Table 14-2 Services monitored on managed systems (continued)

Service name |Function/Description
Syslog Alerts¹ |Links to any consolidated log messages that match patterns in the /opt/hptc/nagios/etc/syslogAlertRules file.
System Event Log¹ |Links to any System Event Log messages that match patterns in the /opt/hptc/nagios/etc/selRules file. The System Event Log is collected through the management processor, either an iLO or an IPMI BMC.
1 Warning
2 Critical
other Unknown
3 The name of the Nagios service description. For more information, see the corresponding /opt/hptc/nagios/etc/templates/*_template.cfg template file.
4 The alert applies to this host name.
5 The IP address of the host.
6 The message text generated from the plug-in.

In the following example, indicates that this data was collected by the Nagios monitor running on icelx47.
# 'nagios' contact definition
define contact{
        contact_name                    nagios
        alias                           Nagios Admin
        service_notification_period     24x7
        host_notification_period        24x7
        service_notification_options    w,u,c,r
        host_notification_options       d,u,r
        service_notification_commands   notify-by-email,notify-by-epager
        host_notification_commands      host-notify-by-email,host-notify-by-epager
        email                           nagios@localhost.localdomain
        pager                           nagios@localhost.
        }
• Allows user customized metrics as well as predefined metrics

14.8.3 Performance Dashboard requirements

The servers you want to monitor must meet the following requirements to use the Performance Dashboard tool. The servers must be:
• Licensed for HP Insight Control for Linux
• Configured to use HP Insight Control for Linux monitoring services, as described in Chapter 5 (page 55)

14.8.
Figure 14-13 Monitoring three metrics using Performance Dashboard

14.8.4.1 Ring plot color coding

The colors used by the Performance Dashboard ring plot segments represent the following:
• Light Gray means that a server is actively reporting data.
• Pink represents the actual value of the metric.
• Dark Gray means that a server is not reporting data and might be down. In that case, click the left mouse button on the server to launch the Nagios application focused on that server to investigate further.

14.8.4.
2. Select target managed systems. You can select individual servers or all servers in the icelx_servers subcollection.
3. Select Apply to move the selected servers to the target list.
4. Verify the target list.
5. Select Run Now to launch the Performance Dashboard tool.

14.8.6 Using the mouse buttons to manipulate the Performance Dashboard tool

Table 14-3 describes how to use the mouse to manipulate the Performance Dashboard tool.
• User Time
• System Time
• Nice Time
• Idle Time
• Load Averages (1-Minute, 5-Minute, And 15-Minute Intervals)
• Total Processes
• Total User Processes
• Total Zombie Processes
• Network Received MB
• Network Received Packets
• Network Received Dropped Packets
• Network Received Errors
• Network Transmitted MB
• Network Transmitted Packets
• Network Transmitted Dropped Packets
• Network Transmitted Errors
• Total Swap
• Swap In Use
• Pages In
• Pages Out
• Pages Swapped In
• Pages Swapped Out

14.8.
15 Customizing Nagios The Nagios configuration is designed so that you can customize it as needed. Complete documentation for customizing Nagios is available on the following Nagios website: www.nagios.
nrpe_user=new_nagios_user

# NRPE GROUP
# This determines the effective group that the NRPE daemon should run as.
# You can either supply a group name or a GID.
#
# NOTE: This option is ignored if NRPE is running under either inetd or xinetd

nrpe_group=new_nagios_group

Where new_nagios_group is the group name of the new Nagios user's account. Save the file.

5. Edit the /opt/hptc/nagios/etc/nagios.
# You can either supply a username or a UID.

nagios_user=new_nagios_user

# NAGIOS GROUP
# This determines the effective group that Nagios should run as.
# You can either supply a group name or a GID.

nagios_group=new_nagios_group

Save the file.

9. Run the Options→IC-Linux→Configure Management Services task.

NOTE: The Task Results window might report completion before the operation is actually complete. Monitor the console to determine the result.

10.
To avoid these alerts, shut down Nagios before performing any maintenance operations or tasks, and start or restart it afterward, using the commands listed in the following table.

Purpose |Command line
To shut down Nagios on the CMS immediately before performing maintenance operations and tasks |# /etc/init.d/nagios stop
To start Nagios after a maintenance operation |# /etc/init.d/nagios start
To restart Nagios after changing its configuration |# /etc/init.d/nagios restart

15.2.
Figure 15-1 Nagios configuration 15.2.3 Changing sensor threshold values Job loads, usage patterns, process types, counts, memory, cache, disk subsystems, and so on contribute input to Nagios. Nagios uses threshold values to determine whether or not to send an alert and to determine whether that alert is critical or a warning. Nagios monitors the sensor thresholds and generates alerts when a threshold is reached.
Nagios for the thresholds. Modify these values to change the point at which Nagios alerts you that a subsystem has reached a threshold. The nagios_vars.ini file also contains variables that are commented out. Examine the content of the file to determine whether those variables are appropriate for your system. If so, remove the comment characters accordingly. The following shows a portion of the nagios_vars.
Table 15-1 Supermon metrics collection intervals (continued)

Metric name |Collection interval
avenrun |%LOADAVECOLLECTIONPERIOD% **
mdadm |%MDADMCOLLECTIONPERIOD% **

* The default is 5 minutes.
** This value is specified in the /opt/hptc/nagios/etc/nagios_vars.ini file.

15.2.5.1 Global service check timeout limit

The master Nagios configuration file, nagios.cfg, contains global settings that control overall behavior. One of these settings is the service_check_timeout interval.
Table 15-2 Default settings for monitored Nagios services

Service description |Actively launched on managed system? |Maximum check attempts |Normal check interval |Retry check interval
Configuration Monitor |Yes |3 |60h 0m 0s |0h 2m 0s
IP Assignment - DHCP |Yes |3 |0h 5m 0s |0h 1m 0s
Switch Data Collection |Yes |3 |0h 10m 0s |0h 2m 0s
Apache HTTPS Server |Yes |3 |0h 5m 0s |0h 1m 0s
Host Monitor |Yes |3 |0h 5m 0s |0h 1m 0s
Nagios Monitor |Yes |3 |0h 5m 0s |0h 1m 0s
Sensor Collection Monitor |Yes |3 |0h
: Deleting an alert can be useful if you receive duplicate alerts, one from HP SIM and one from Nagios, for the same event.

15.5 Controlling Nagios messages

Nan is an open source utility that supplements the Nagios application. HP Insight Control for Linux incorporates the Nan notification aggregator and delimiter for the Nagios paging system. Nagios can send large numbers of messages, especially when the CMS and managed systems are starting up, shutting down, or experiencing a failure.
16 Using the command line to view managed system status

HP Insight Control for Linux provides a set of commands that you can run on the CMS to determine the status of one or more managed systems. This chapter addresses the following topics:
• “Archiving sensor metrics on an individual basis” (page 183)
• “Displaying usage, statistics, and metrics with the shownode command” (page 184)
• “Displaying environmental data” (page 189)
• “Reporting usage information and host and service status” (page 189)

16.
Example 16-2 Expanded sensor metrics

# shownode metrics sensors icelx1
Timestamp |Node_Id |Name |Value |Description
--------------------------------------------------------------------------
date_and_time |icelx1 |Temp 8 Memory |54 |Celsius; ok
date_and_time |icelx1 |Temp 5 CPU |31 |Celsius; ok
date_and_time |icelx1 |Temp 2 CPU |33 |Celsius; ok
date_and_time |icelx1 |Temp 7 CPU 2 |30 |Celsius; ok
date_and_time |icelx1 |Temp 1 System |40 |Celsius; ok
date_and_time |icelx1 |Temp 6 CPU 2 |30 |Celsius; ok
date_an
Admin: device: gateway: hwaddr: iftype: ifusage: interface_number: ipaddr: ipv6addr: mtu: name: netmask: port: switch: install_disk: is_blade: level: location: memory: n_sockets: node_number: power_setting_dts: power_setting_on: region: server_type: services: gather_data: hosts: provider_type: eth2 Admin 192.0.2.3 earth.example.com Unknown (edit /etc/snmp/snmpd.
--------------------------------------------------------------------------------
icelx1 |192.0.2.1 |earth |earth.example.com |192.0.2.7 |ILO2
icelx2 |192.0.2.2 |neptune |neptune.example.com |Unknown |Unknown
icelx3 |192.0.2.3 |saturn |saturn.example.com |192.0.2.8 |ILO2
icelx4 |192.0.2.4 |mercury |mercury.example.com |192.0.2.9 |ILO2
icelx5 |192.0.2.5 |192.0.2.5 |192.0.2.5 |Unknown |Unknown
icelx6 |192.0.2.6 |pluto |pluto.example.com |192.0.2.
NOTES:
• Metrics that return data based on time (for example, cpu idle time) might be inaccurate when collected from a virtual guest by Supermon and Nagios. To ensure accuracy in these time-based metrics, use the vCenter application for VMware ESX and VMware ESXi virtual guests and virt-manager for Xen virtual guests.
• On ESX 3.5 systems, the output from the shownode metrics command for paging, diskinfo, and so on is displayed as 0 because these metrics are unavailable from the ESX 3.5 host.
date_and_time |icelx1 |0 |1854680 |1 |2308294 |564006498 |223267 |2258 |48882
 | |1 |3645524 |295 |2669669 |530638968 |31391474 |1 |97907
 | |2 |3118507 |60 |2828907 |550132258 |12305645 |21605 |36849
 | |3 |2585423 |1 |2680954 |561896064 |1266708 |13143 |1529
 | |4 |2700043 |215 |2259501 |563186698 |295084 |0 |2273
 | |5 |3517129 |107 |2372128 |561278491 |1274550 |0 |1402
 | |6 |3684936 |300 |2303954 |561833592 |619651 |0 |1368
 | |7 |3705238 |221 |2321097 |562124531 |291321 |0 |1387

The shownode metric
16.3 Displaying environmental data

Depending on the platform, certain tools might enable you to collect information specific to the platform. You can use the /sbin/hplog utility to display the following environment data:
• Thermal sensor data
• Fan data
• Power data

For more information, see hp-health(4) and hplog(8).

16.
# nrg --help
--help
--verbose - Report more details
--log|l - logfile, default $statuslog
--severity - default is all
    c - critical, w - warning, o - ok, u - unknown, p - pending
--hosts - Only list hosts status
--services - Only list service status
--monitors - Only list monitor status
--up - Only up nodes
--down - Only down nodes
--sort t,h,s - Sort by (t)ime, (h)ost, (s)ervice
--sort - Summary mode only (as cwoup as in severity)
--mode - Report mode: (f)ull, (s)ummary, (r)aw, (w)at
17 Remote server controls The menu items on the Tools→Server Controls menu enable you to remotely manage power control on a physical managed system. IMPORTANT: Be aware that the Insight Control for Linux server controls operate by contacting the management processor of the server directly and executing the requested power function. That means that servers are powered off or cycled abruptly without a graceful shutdown.
18 Connecting to a remote console

This chapter addresses the following topics:
• “Console management facility overview” (page 193)
• “How CMF works” (page 193)
• “Accessing a remote console” (page 193)
• “Serial connections on DL100 series servers” (page 194)
• “Enabling telnet access to iLO management processors” (page 195)

18.1 Console management facility overview

The Console Management Facility (CMF) daemon, cmfd, collects and stores console output for all managed systems.
NOTE: You can find which servers are management hubs by examining the icelx_Management_Hubs subcollection. You can also use the following command, but be aware that it returns internal names:

# shownode roles --role management_hub

2. Log in to the console with the console command. You can specify either the internal name or the host name. This example uses the internal name icelx16 instead of the host name mercury:

$ console icelx16
Locating server for icelx16
Server for icelx16 is mercury.example.
18.5 Enabling telnet access to iLO management processors

IMPORTANT: The telnet protocol transmits the user name and password in clear text over the network to the iLO management processor. HP does not recommend using telnet if your environment is untrusted.

By default, the cmfd daemon connects to the management processor using the SSH protocol.
19 Using SSH for remote server management

HP Insight Control for Linux provides several ways for you to access a managed system through SSH. This chapter addresses the following topics:
• “Setting SSH credentials on managed systems” (page 197)
• “Setting SSH credentials for users” (page 197)
• “Running a command on multiple managed systems” (page 198)
• “Using HP Insight Control for Linux to run commands and scripts through SSH” (page 199)

19.
19.3 Running a command on multiple managed systems The open source Parallel Distributed Shell (pdsh) command is a multi-threaded remote shell client that runs commands on multiple managed systems in parallel. You can specify all, a given number of, or only certain managed systems on which to perform the command or commands that are passed as arguments to the pdsh command. All three forms of managed system names can be used by the pdsh command.
• The -w option enables you to specify a range of managed systems or a specific managed system.

Command line:
# pdsh -w icelx[1-5] hostname
earth: earth.example.com
eris: eris.example.com
mercury: mercury.example.com
mars: mars.example.com
pluto: pluto.example.com
Description: Runs the hostname command on the first five managed systems in the collection.

Command line:
# pdsh -w mercury uptime
Description: Runs the uptime command on a specific managed system.
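To make the bracketed range form concrete, the following Python sketch expands a specification such as icelx[1-5] into individual internal names. It is illustrative only: the function name expand_host_range is hypothetical, it handles just the simple prefix[start-end] form shown in the table, and it is not part of pdsh, whose own host list syntax may accept more than this.

```python
import re
from typing import List

# Matches only the simple "prefix[start-end]" form, for example icelx[1-5].
_RANGE = re.compile(r"(?P<prefix>[^\[\]]*)\[(?P<start>\d+)-(?P<end>\d+)\]")


def expand_host_range(spec: str) -> List[str]:
    """Expand 'icelx[1-5]' into individual names; return [spec] if no range is present."""
    match = _RANGE.fullmatch(spec)
    if match is None:
        # A plain name such as 'mercury' is passed through unchanged.
        return [spec]
    start, end = int(match.group("start")), int(match.group("end"))
    if start > end:
        raise ValueError("range start must not exceed range end: " + spec)
    return [match.group("prefix") + str(n) for n in range(start, end + 1)]


if __name__ == "__main__":
    print(expand_host_range("icelx[1-5]"))
    print(expand_host_range("mercury"))
```

The sketch returns internal names only; mapping them to host names such as earth or mercury is left to the tools described in this chapter.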
4. Enter the command you want to run. To enter multiple commands, put them on one line and separate each command with a semicolon (;). The maximum length of a command is 255 characters.
5. Select Run Now to run the command immediately, or select Schedule to schedule the task to occur at some point in the future.
6. Select the following menu item from the HP Insight Control user interface to view the task results:

Tasks & Logs→View Task Results...
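As an aid for preparing the text entered in step 4, the following Python sketch joins several commands with semicolons and checks the result against the 255-character limit. The helper name join_commands is hypothetical, and the sketch assumes that the limit applies to the whole line that is entered rather than to each individual command.

```python
from typing import Iterable

MAX_COMMAND_LENGTH = 255  # limit stated in step 4


def join_commands(commands: Iterable[str]) -> str:
    """Join commands with '; ' and reject a line longer than the limit."""
    # Drop empty entries and surrounding whitespace before joining.
    cleaned = [c.strip() for c in commands if c.strip()]
    line = "; ".join(cleaned)
    if len(line) > MAX_COMMAND_LENGTH:
        raise ValueError(
            "command line is %d characters; the maximum is %d"
            % (len(line), MAX_COMMAND_LENGTH)
        )
    return line


if __name__ == "__main__":
    print(join_commands(["hostname", "uptime"]))
```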
state. If the return code from the script indicates success, the task is shown in a Complete state. 19.
20 Managing licenses

This chapter describes the following topics:
• “Licensing overview” (page 203)
• “Adding the Insight Control for Linux license key to HP SIM” (page 203)
• “Licensing virtual guests” (page 204)

20.1 Licensing overview

HP Insight Control for Linux uses the HP SIM License Manager as its licensing model. Every server you want to monitor and manage with HP Insight Control for Linux requires an Insight Control for Linux license to be applied to it.
20.3 Licensing virtual guests When a virtual host (VM host) is licensed for HP Insight Control for Linux, all guests of that VM host are considered licensed for Insight Control for Linux as well, provided that the virtual guests are properly associated with their virtual host. You can license a virtual machine guest (VM guest) without licensing its host or you can license it in addition to licensing its host, in either case unnecessarily consuming licenses.
21 Miscellaneous topics

This chapter addresses the following topics:
• “Changing management processor credentials” (page 205)
• “Changing the default port for the repository web server” (page 205)
• “Increasing the number of servers that can be discovered concurrently” (page 206)
• “Changing the IP address of the CMS” (page 206)
• “Uninstalling HP Insight Control for Linux” (page 206)
• “Determining the installed HP Insight Control for Linux version” (page 207)
• “Event logging overview” (page 207)
Wait for two to three minutes for HP SIM to restart completely. 21.3 Increasing the number of servers that can be discovered concurrently When performing a bare-metal discovery on a set of servers, the maximum number of nodes that will be discovered concurrently is 16. Perform the following steps to increase that number: 1. Edit the /opt/mx/icle/icle.
4. Change directory to the uninstall directory, and run the uninstall script to remove all Insight Control for Linux RPMs:

NOTE: You must run the script from the uninstall directory.

# cd /opt/hp/icelx/config/uninstall
# ./uninstall.sh

5. Remove the following HP Insight Control for Linux monitoring directories. If you have any files in these directories that you want to preserve, make sure you save a copy of the files before you remove them.
21.7.2 The syslog-ng.conf rules file The syslog-ng.conf rules file defines the order of importance by which the log files are arranged. The /opt/hptc/syslog-ng/etc/syslog-ng/syslog-ng.conf file defines a series of rules for the syslogng_forward service on how to handle messages from its clients. The syslog-ng.conf file contains five types of rules: Options Defines generic information such as reconnection timeouts, FIFO size limits, and so on.
• The value of MAX_CONCUR_CHAINS variable in the /opt/mx/icle/icle.properties file. The default value is 64. The sum of the weight values for all the tasks running concurrently cannot exceed the value of the MAX_CONCUR_CHAINS variable. For example, if you wanted to run several Deploy Linux Image tasks (each of these tasks carries a weight of 6), the sum of the first ten tasks is 60. The eleventh task would exceed the value of the MAX_CONCUR_CHAINS variable.
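The weight arithmetic in this example can be sketched in Python as follows. The function names are hypothetical and the code only mirrors the rule as stated here (the sum of the weights of concurrently running tasks cannot exceed MAX_CONCUR_CHAINS); it is not the scheduler used by Insight Control for Linux.

```python
from typing import Sequence

MAX_CONCUR_CHAINS = 64  # default value from icle.properties, as stated above


def max_concurrent_tasks(weight: int, limit: int = MAX_CONCUR_CHAINS) -> int:
    """Largest number of equally weighted tasks whose total weight stays within the limit."""
    return limit // weight


def can_start(running_weights: Sequence[int], new_weight: int,
              limit: int = MAX_CONCUR_CHAINS) -> bool:
    """True if adding a task of new_weight keeps the total weight within the limit."""
    return sum(running_weights) + new_weight <= limit


if __name__ == "__main__":
    deploy_weight = 6  # weight of a Deploy Linux Image task
    print(max_concurrent_tasks(deploy_weight))           # ten tasks total 60
    print(can_start([deploy_weight] * 10, deploy_weight))  # an eleventh would total 66
```

With the default limit of 64, ten tasks of weight 6 fit (60), and an eleventh would bring the total to 66, which is over the limit.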
3. Use the text editor of your choice to create the /etc/icelx/uuidformat/nodename file on the managed system. This file consists only of the following text:

UUIDFORMAT=n

Where n is original, lowercaseG6, or G6, as appropriate.

4. If the IP address of the management processor is on a separate subnet from the server and is inaccessible, perform these additional steps so that the server can be bare-metal discovered successfully.
a.
22 Advanced topics Topics include: • • “Management Processor Credentials” (page 211) “Deploying WBEM provider components using Configure or Repair Agents task” (page 213) 22.
5. Select OK. 22.1.2.2 Discovering and setting up servers with virtual media deployment If your site uses the virtual media deployment features of Insight Control for Linux, you need to perform these additional steps when you discover the management processors: 1. 2. 3. 4. 5. For the initial part of the process, create an account on the management processor being discovered that matches the default Insight Control for Linux MP credentials.
When a new set of credentials is entered with the Configure →Management Processor→Credentials... task, Insight Control for Linux attempts to find an existing user with the same user name. If one is found, the user password is changed to match the new credential. If no match is found, then the new credentials are placed in slot 15, overwriting the existing credentials. For this reason, do not store credentials, other than those for Insight Control for Linux, in slots 15 and 16. 22.
On SLES 10 SP2 and SLES 11:
• xen
• xen-devel
• xen-libs
• xen-tools
• kernel-xen
• kernel-source
• sblim-indication_helper

For SLES 10 SP2, the openwbem package must not be installed. All Xen virtual hosts must have the HP ProLiant Support Pack (PSP) installed. For information on deploying PSP, including dependent packages, see the Minimum requirements for Linux servers section at http://h20000.www2.hp.com/bc/docs/support/SupportManual/c00472061/c00472061.pdf.
23 Troubleshooting

This chapter addresses the following topics:
• “General troubleshooting topics” (page 215)
• “Alternative booting” (page 216)
• “Apache service does not start” (page 216)
• “Troubleshooting CMF problems” (page 217)
• “Troubleshooting configuration problems” (page 220)
• “Troubleshooting connection problems” (page 225)
• “Troubleshooting DHCP problems” (page 226)
• “Troubleshooting discovery problems” (page 228)
• “Troubleshooting firmware update proble
Problem |See:
Unable to get SSH credentials: SSH credentials for the specified server were not set or are missing |Section 23.21 (page 251)
SSH authentication failed |Section 23.21 (page 251)
Unable to create SSH connection: Connection refused |Section 23.21 (page 251)
Error retrieving BMC for server. Root cause: Could not determine the BMC associated with the server (x.x.x.x) in the database |Section 23.18 (page 246)
Unable to power off server: Error retrieving BMC for server.
• RHEL Version 4: Look in the /var/log/httpd/error_log log file for service errors. You might see an error similar to this:

Init: Unable to read server certificate from file /var/log/httpd/error_log

To create a self-signed certificate, see Section 5.2 (page 55).

23.4 Troubleshooting CMF problems

The following table describes possible causes of problems with the Console Management Facility (CMF) and provides actions to correct them.
Cause/Symptom: Debug the cmfd daemon
Corrective actions: List the cmfd daemon's debug mode options with the cmfd -h command, then run the cmfd daemon with the appropriate options. The output is logged in the /opt/hptc/cmf/logs/cmfd.log file. For example, to see the connection attempts and output:

# /etc/init.d/cmfd stop
# /opt/hptc/cmf/sbin/cmfd -d 00000102

Console command cannot connect to console.
Cause/Symptom: CMF encounters a fatal error.
Corrective actions: Examine the /opt/hptc/cmf/logs/cmfd.log file for any errors.

Cause/Symptom: Console output is not being collected for the managed system.
Corrective actions: Perform the appropriate actions:
• Ensure that the target system’s VSP (Virtual Serial Port) has been properly configured in its management processor.
• Follow these steps:
  1. Ensure that the target system’s console is being redirected to ttyS0 (COM1) or ttyS1 (COM2).
23.5 Troubleshooting configuration problems

The following table describes possible configuration problems and provides actions to correct them.

Cause/Symptom: Configure Insight Control for Linux management services fails
Corrective actions: Perform the appropriate actions:
• Verify that the task has indeed completed. The Task Results window might report completion before the operation is actually complete. Monitor the console to determine the result.
Cause/Symptom: Configure Insight Control for Linux services might fail if the CMS is not properly represented in HP SIM
Corrective actions: If the Options→IC-Linux→Configure Management Services task fails, look at the CMS entry in the HP SIM database. On rare occasions, in the following instances, the CMS is not properly represented in HP SIM:
• Two entries for the CMS exist in the HP SIM database. For example, if the CMS host name is earth, two instances of earth are returned by the mxnode command.
Cause/Symptom Corrective actions If you experience similar issues, follow these troubleshooting recommendations: • Verify that the /etc/hosts file is correct. For example, make sure the real host name is not equated to localhost and make sure there is only one real and valid entry for the host name and IP address. • Verify that the DNS configuration is correct.
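The /etc/hosts checks described above can be sketched in Python as follows. This is an illustrative helper, not a tool shipped with Insight Control for Linux; the function name check_hosts_entries is hypothetical, and the sketch assumes the usual hosts file layout of an address followed by one or more names, with # starting a comment.

```python
from typing import List


def check_hosts_entries(hosts_text: str, hostname: str) -> List[str]:
    """Return a list of problems found for hostname in the given hosts file text."""
    problems = []
    real_addresses = []
    for raw_line in hosts_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        address, names = fields[0], fields[1:]
        if hostname not in names:
            continue
        if address.startswith("127.") or address == "::1":
            problems.append("%s is equated to loopback address %s" % (hostname, address))
        else:
            real_addresses.append(address)
    if not real_addresses:
        problems.append("no non-loopback entry found for %s" % hostname)
    elif len(real_addresses) > 1:
        problems.append("multiple entries found for %s: %s"
                        % (hostname, ", ".join(real_addresses)))
    return problems


if __name__ == "__main__":
    sample = "127.0.0.1 localhost earth\n192.0.2.1 earth.example.com earth\n"
    for problem in check_hosts_entries(sample, "earth"):
        print(problem)
```

An empty result means the sketch found exactly one non-loopback entry for the host name; it does not verify that the address itself is the correct one.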
Cause/Symptom: Enclosures collection monitor will report a CRITICAL status if the OA credentials have not been configured properly
Corrective actions: The Enclosures Collection Monitor Nagios service reports a CRITICAL alert if the OA credentials have not been properly configured. Locate the value for the command[encchk_all] command definition in the /opt/hptc/nagios/etc/nrpe_local.cfg file. Run the command associated with the command definition.
Cause/Symptom: Configuration fails if a subcollection exists
Corrective actions: The configuration of the Insight Control for Linux Management Services fails if a subcollection exists under any of the following Insight Control for Linux collections:
• prefix_Servers
• prefix_Console_Ports
• prefix__Enclosures
• prefix_Switches
• prefix_Management_Hubs
• prefix_headnode
Where prefix is the prefix for your system; the default prefix is icelx. To remedy, remove the subcollections by selecting Customize...
23.6 Troubleshooting connection problems

The following table provides actions to correct a possible connection problem.

Cause/Symptom: Cannot connect to network
Corrective actions: Perform the following actions:
• Verify the network connection.
• Examine the firewall.
• Verify that HP SIM is operating properly.
• Verify the following settings in the /opt/hptc/etc/sysconfig/cmsserver.ini file:
  - The value of cmsServer should be the IP address of the CMS.
  - The value of cmsPort should be 50001.
23.7 Troubleshooting DHCP problems

The following table describes possible causes of problems with Dynamic Host Configuration Protocol (DHCP) and provides actions to correct them.

Cause/Symptom: DHCP Process Not Running. The DHCP server process is absent from the process list, verified with the following command:
Corrective actions: Perform the appropriate action:
• Verify that the /etc/dhcpd.conf service configuration file exists and that it is not empty.
• Verify that the /etc/dhcpd.
Cause/Symptom: IP Addresses Are Wrong. The IP addresses assigned to the managed systems do not match the configuration of your DHCP server.
Corrective actions:
• Verify that there is only one DHCP server on the network.
• Check that the CMS and managed servers are networked properly, that is, either using a dedicated management network or obtaining approval from your network administrator to provide DHCP on an existing network.
• Temporarily disable the DHCP service on the CMS with the following command:
23.8 Troubleshooting discovery problems The following table describes possible causes of problems that may occur during the device discovery process and provides actions to correct them.
Cause/Symptom: The Configure System to Boot from Local Disk operation failed.
Corrective actions: Perform the appropriate action:
• Select Options→Data Collection, specifying the server that failed as the target. Verify that this task completed successfully in the Task Results window.
• Run the Data Collection Report on the system, which is accessible from the Tools & Links page for the system, and verify that there is a Network Interface section containing one or more MAC addresses.
23.9 Troubleshooting firmware update problems

The following table provides the actions to correct a firmware update task failure.

Cause/Symptom: Firmware Update Task Failed
NOTE: If the task fails, the system is left up in the Insight Control for Linux RAM disk, so that you can examine the
Corrective actions: Perform the appropriate action:
• Examine the Task Results. If the task failed on any step other than the firmware step, perform the steps to correct that problem.
23.11 Troubleshooting large scale deployment problems The following table provides the actions to correct a large scale deployment failure. Cause/Symptom Corrective Actions Large Scale Deployment Failed Perform the appropriate action: • Examine the log in Operation Details section of the Task Results window for errors or other information.
23.13 Troubleshooting monitoring problems The following table describes possible causes of problems related to monitoring and provides actions to correct them. Problems related to the Performance Dashboard tool are described in a subsequent table. Cause/Symptom Corrective Actions Cannot distribute pdsh keys Ensure that the Configure→Configure or Repair Agents task was run on the managed systems.
Cause/Symptom Corrective Actions Configure Management Services task fails Verify that a proxy is not used to communicate between the CMS and the managed system. Insight Control for Linux does not have proxy server support; the Insight Control for Linux features do not communicate through proxy servers, and require direct network connectivity between the CMS and the managed systems. Verify communication by using the wget command on the managed system to retrieve a file from the CMS.
The following table describes possible causes of problems with the HP Graph tool and provides actions to correct them.

Cause/Symptom: Cannot Launch HP Graph After Upgrade. HP Graph cannot launch on a CMS that was upgraded from an older release of Insight Control for Linux.
Corrective actions: Add a symbolic link of the hpcgraph.conf file to the web server's configuration directory and restart the web server as follows:
For RHEL operating systems:
1.
The following table describes possible causes of problems with the Performance Dashboard tool and provides actions to correct them. Cause/Symptom Corrective Actions Performance Dashboard initiates without data. It displays all dark gray. Perform the appropriate action: • Restart the Performance Dashboard tool. • Determine if Nagios is collecting data. The Performance Dashboard tool uses the same metric gathering infrastructure as Nagios.
• “Nagios gather_all_data script reports check_nrpe errors” (page 240)
• “Troubleshooting Nagios problems” (page 240)

23.14.1 Determining the status of the Nagios service

Use the following command to determine if Nagios is running properly:

# /etc/init.d/nagios status
Nagios ok: located 1 process, Gathering status for nrpe ... Nagios nsca: icelx7: 0 data packet(s) sent icelx5: 0 data packet(s) sent icelx6: 0 data packet(s) sent status log updated 22 seconds ago icelx<3-8>NRPE v2.
# nrg --mode analyze
Nodelist    Description
--------    -------------------------------------------------------------------------
USE6371RA4  Enclosure Status - Warning> The enclosure is reporting one or more warning conditions for environmental sensors gathered from the device. Check the sensor status on the enclosure. Verify the status of the Enclosures Collection Monitor which provides this data.
A warning or critical message indicates that one or more monitored sensors reported that a threshold has been exceeded. Correct the condition.
Service: Load Average
Status Information: Node Load Ave: x/y/z QueLen: n
A warning or critical message indicates that load average thresholds for the specific managed system have been exceeded. Thresholds can be set on a per-managed system, per-class, or per-system basis in the nagios_vars.ini file. These values are specific to the site and depend on site load.
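As a rough illustration of how a load average is compared against warning and critical thresholds, consider the following sketch. The function name check_load and the threshold values in the example are assumptions made for illustration; they are not taken from the nagios_vars.ini file, whose format is not shown here. The exit codes follow the common Nagios plug-in convention (0 = OK, 1 = WARNING, 2 = CRITICAL).

```shell
#!/bin/sh
# Hypothetical helper: compare a load average with warning and critical
# thresholds. awk is used because load averages are fractional numbers.
# Usage: check_load LOAD WARN CRIT
check_load() {
    awk -v load="$1" -v warn="$2" -v crit="$3" 'BEGIN {
        if (load + 0 > crit + 0) { print "CRITICAL"; exit 2 }
        if (load + 0 > warn + 0) { print "WARNING";  exit 1 }
        print "OK"
    }'
}

# Example: check the 1-minute load average of the local system against
# site-specific thresholds (4.0 and 8.0 are placeholders, not defaults):
#   check_load "$(cut -d' ' -f1 /proc/loadavg)" 4.0 8.0
```

Because suitable thresholds depend on site load, treat the example values as placeholders only.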
Reports the number of new records processed in the /hptc_cluster/adm/logs/consolidated.log file. A warning or critical message occurs when there is insufficient time to process a large volume of messages before the Nagios service_check_timeout period expires. Nagios examines the recent incoming consolidated log messages and issues a warning or critical message if the incoming rate since the last interval exceeds a configured number of records. The default values are 2 for warnings and 20 for critical.
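The threshold logic can be sketched as follows. This is a minimal illustration, not the actual plug-in: the function name classify_log_rate is hypothetical, and the sketch assumes that "exceeds" means strictly greater than the configured value. The defaults of 2 and 20 come from the description above; the exit codes follow the common Nagios plug-in convention (0 = OK, 1 = WARNING, 2 = CRITICAL).

```shell
#!/bin/sh
# Hypothetical helper: classify the number of new log records seen since
# the last interval. Usage: classify_log_rate COUNT [WARN] [CRIT]
classify_log_rate() {
    new_records=$1
    warn=${2:-2}     # default warning threshold from the text
    crit=${3:-20}    # default critical threshold from the text
    if [ "$new_records" -gt "$crit" ]; then
        echo "CRITICAL"
        return 2
    elif [ "$new_records" -gt "$warn" ]; then
        echo "WARNING"
        return 1
    fi
    echo "OK"
    return 0
}

# Example: 5 new records with the default thresholds
#   classify_log_rate 5
```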
• If the output reports that vars.ini has been resynchronized for a managed system, verify that there is a self-signed certificate for the Apache service and that the service is running. For troubleshooting information on the Apache service, see Section 23.3 (page 216).
23.14.7 Nagios gather_all_data script reports check_nrpe errors
These errors include socket timeouts and refused connections. The nrpe daemon is unable to configure the server because the check_nagios_vars script is unable to write vars.
Cause/Symptom: Nagios “Management Settings Monitor” service reports vars warning
Corrective Actions: Run the following commands to resynchronize the vars.ini file across all managed systems:
# cd /opt/hptc/nagios/libexec
# ./check_nagios_vars --update
Cause/Symptom: Nagios services report a non-OK status. Under very rare circumstances, the Nagios cache might become unsynchronized.
Corrective Actions: Remove the nagios_vars.db file:
# rm /opt/hptc/nagios/etc/nagios_vars.db
Cause/Symptom: Can no longer install older OS after upgrade
Corrective actions: Ensure that older operating systems were added manually to the /opt/mx/icle/SupportMatrix.xml file. For information, see the HP Insight Control for Linux Installation Guide.
Cause/Symptom: Kickstart / AutoYaST install completes but task in SIM UI still shows that it is running
Corrective actions: Ensure that you removed the following two files:
• autoInstallComplete_jsp.class
• autoInstallComplete_jsp.
Cause/Symptom: OS Installations fail, cannot connect to managed system
Corrective actions: Verify that a proxy is not used to communicate between the CMS and the managed system. Insight Control for Linux does not have proxy server support; the Insight Control for Linux features do not communicate through proxy servers, and require direct network connectivity between the CMS and the managed systems. Proxy information can be defined using an environment variable or various configuration files.
23.15.3 Capturing Linux images
Cause/Symptom: The CMS disk partition with /opt/repository is full.
Corrective actions: Perform the appropriate action:
• Create a new disk partition with more space for /opt/repository on the CMS.
• Turn off the dump flag in the /etc/fstab file on the target servers to capture fewer partitions.
Cause/Symptom: The target server has lost association with its management processor.
Corrective actions: For the corrective action, see Section 23.
Cause/Symptom: OS Installations fail, cannot connect to managed system
Corrective actions: Verify that a proxy is not used to communicate between the CMS and the managed system. Insight Control for Linux does not have proxy server support; the Insight Control for Linux features do not communicate through proxy servers, and require direct network connectivity between the CMS and the managed systems. Proxy information can be defined using an environment variable or various configuration files.
Cause/Symptom: The PSP file is corrupted.
Corrective actions: Recopy the PSP to the appropriate /opt/repository/psp subdirectory.
Cause/Symptom: The PSP installation fails on an ESX or ESXi system.
Corrective actions: The Deploy→Deploy Drivers, Firmware and Agents→IC-Linux→Install ProLiant Support Pack is not supported on managed systems running ESX or ESXi.
NOTE: If you need to install ESX agents, run Configure→Configure or Repair Agents…
23.
Cause/Symptom: Problem manipulating EV
Corrective actions: Wait a minute or two, then retry. This problem usually occurs when a power control command is sent to a server's management processor while the server is booting. Most commonly, the system is in a BIOS boot and the management processor cannot determine the power status.
Cause/Symptom: Unable to power off server: Error retrieving BMC for server.
23.19.3 Rebuilding a server-to-management processor association
It might be necessary to rebuild an association between a previously discovered managed system and its management processor. There are several ways to do this, depending on the problem and on the state of the managed system.
23.19.3.1 Repairing the association of a booted managed system running an OS
If a managed server is booted and running a supported OS, follow these steps to repair a lost association.
servers, data fields in the system BIOS contain system serial number and asset tag information. These fields are set at the factory, but you can override them. Verify that these fields appear valid and do not contain any special characters. Abnormal data in these fields cause the iLO to generate an error and cause the server-to-iLO association to break. 6. • If the BIOS data is valid and the iLO XML call is still reporting errors, a hardware problem might be the cause.
Figure 23-2 PXE configuration files
6. PXE boot the managed server as described in Section 3.1.1 (page 37).
7. Monitor the HP Insight Control user interface and the server's remote console and verify that the server has been successfully rediscovered and associated with its management processor.
8. If the server does not have a valid OS installed on it, power off the server when it has completed its discovery.
9.
sysUpTimeInstance = Timeticks: (29922197) 3 days, 11:07:01.97
sysContact.0 = STRING: Root root@localhost (configure /etc/snmp/snmp.local.conf)
sysName.0 = STRING: pluto.example.com
sysLocation.0 = STRING: Unknown (edit /etc/snmp/snmpd.conf)
sysORLastChange.0 = Timeticks: (0) 0:00:00.00
sysORID.1 = OID: ifMIB
sysORID.2 = OID: snmpMIB
sysORID.3 = OID: tcpMIB
sysORID.4 = OID: ip
sysORID.5 = OID: udpMIB
sysORID.6 = OID: vacmBasicGroup
sysORID.7 = OID: snmpFrameworkMIBCompliance
sysORID.
Cause/Symptom: Unable To Get SSH Credentials: SSH Credentials For the Specified Server Were Not Set Or Are Missing. The credentials for the SSH protocol settings are missing.
Corrective actions: Set the SSH credentials for the target system using the HP SIM Options→Credentials tool. Specify Global or System credentials as appropriate. For more information, see “Setting SSH credentials on managed systems” (page 197) and the HP Systems Insight Manager online help.
In place of 192.0.2.1:60000, substitute the actual IP address of your CMS and the repository web server port. This entry automatically copies your master /etc/hosts file from the CMS as part of the installation procedure.
23.23 Troubleshooting supermon problems
The following table describes possible causes of problems with Supermon and provides actions to correct them.
23.25 Troubleshooting HP Insight Control for Linux tool problems
Cause/Symptom: "Tool Launch OK?" says NO
Corrective action: Select only target managed systems with a system type of Server. Insight Control for Linux tools only work with systems with a system type of Server.
23.26 Troubleshooting virtual machine installation and setup problems
Cause/Symptom: HP SIM did not properly identify the virtual host
Corrective action: Perform the appropriate actions:
• For Xen, use the uname command as follows to verify that the VM host is running a Xen kernel:
# uname -r
2.6.18-92.el5xen
The text string xen should be embedded in the output.
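The same check can be scripted. The following is a minimal sketch; the function name is_xen_kernel is an assumption made for illustration and is not a tool supplied with Insight Control for Linux.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given kernel release string
# contains the text "xen" anywhere in it.
is_xen_kernel() {
    case "$1" in
        *xen*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Check the kernel that is currently running on the VM host.
release=$(uname -r)
if is_xen_kernel "$release"; then
    echo "Xen kernel detected: $release"
else
    echo "Not a Xen kernel: $release"
fi
```

Run the script on the VM host itself; it reports on the kernel of the system where it executes.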
Cause/Symptom: VM Guests are not monitored
Corrective action: Check the {collection_name}_Servers subcollection to ensure that the VM guests to be monitored belong to that subcollection.
Cause/Symptom: Problems installing SLES 10 SP2 x86_64 Xen
Corrective action: See the HP Insight Control virtual machine management documentation for workarounds to this problem. This might occur on some hardware combinations.
Cause/Symptom: ESX 3.5 Kickstart installation hangs prompting for disk
Corrective action: When VMware ESX 3.
Cause/Symptom: Virtual media bare metal discovery fails
Corrective action: Perform the appropriate action:
• Verify that port 60002 is open on the CMS.
• Invoke the Network Configuration Editor and verify that the network parameters for the server have been specified correctly. For information on the Network Configuration Editor, see “Using the Network Configuration Editor” (page 31).
Cause/Symptom: Virtual media falls back to PXE boot.
24 Support and other resources
24.1 Contacting HP
24.1.1 Information to collect before contacting HP
Be sure to have the following information available before you contact HP:
• Software product name
• Hardware product model number
• Operating system type and version
• Applicable error message
• Third-party hardware or software
• Technical support registration number (if applicable)
24.1.
24.1.4 HP authorized resellers
For the name of the nearest HP authorized reseller, see the following sources:
• In the United States, see the HP U.S. service locator website at: http://www.hp.com/service_locator
• In other locations, see the Contact HP worldwide website at: http://welcome.hp.com/country/us/en/wwcontact.html
24.1.5 Documentation feedback
HP welcomes your feedback. To make comments and suggestions about product documentation, send a message to: docsfeedback@hp.
— Appendix B, Sample PSP dependency script
— Appendix C, Sample installation configuration files
— Appendix E, Alternative network configurations; this information was moved to the HP Insight Control for Linux Installation Guide.
24.3 Related information
24.3.
• HP ProLiant Servers
Information about HP ProLiant servers is available at the following websites:
— BL BladeSystem servers: http://www.hp.com/go/blades
— DL series servers: http://www.hp.com/servers/dl
— ML series servers: http://www.hp.com/servers/ml
24.3.2 Websites
HP ProLiant Support Pack
To find and download the HP ProLiant Support Pack (PSP) that is appropriate for your ProLiant server and Linux OS, follow these steps:
1. Open a browser to the following web address: http://www.hp.com
2. 3.
• http://www.linuxheadquarters.com
A website that provides Linux documents and tutorials. Documents contain instructions for installing and using applications for Linux, configuring hardware, and a variety of other topics.
• http://www.gnu.org
Home page for the GNU Project. This site provides online software and information for many programs and utilities that are commonly used on GNU/Linux systems. Online information includes guides for using the bash shell, emacs, make, cc, gdb, and more.
Term
A term or phrase that is defined in the body text of the document, not in a glossary.
User input
Indicates commands and text that you type exactly as shown.
Replaceable
The name of a placeholder that you replace with an actual value.
[]
In command syntax statements, these characters enclose optional content.
{}
In command syntax statements, these characters enclose required content.
|
The character that separates items in a linear list of choices.
...
A Sample SLES version 9 installation media copy session
This appendix provides a detailed example of copying SLES Version 9 installation media to a staging directory on a Linux workstation. Example A-1 (page 265) uses loopback mounting of ISO images, but the procedure is also valid for working with CD media. If you are working with physical CD media, replace the mount -o loop command with the following:
# mount /dev/cdrom
Example A-1 Copying SLES version 9 installation media
1.
3. Loop mount and copy the service pack contents into their separate directories
# mkdir -p SVRP3/CD{1,2,3}
# mount -o loop /tmp/SLES-9-SP-3-i386-CD1.iso cdmount
# cd cdmount
# find . -depth -print | cpio -pamVd ../SVRP3/CD1
# cd ..
# umount cdmount
# mount -o loop /tmp/SLES-9-SP-3-i386-CD2.iso cdmount
# cd cdmount
# find . -depth -print | cpio -pamVd ../SVRP3/CD2
# cd ..
# umount cdmount
# mount -o loop /tmp/SLES-9-SP-3-i386-CD3.iso cdmount
# cd cdmount
# find .
Glossary
A
AutoYaST file
A configuration file used to effect an unattended installation of SLES operating systems.
B
bare-metal
Describes a server that is not currently booted with a running operating system. This could be a brand new server with no OS installed on it, or it could be a server with an OS that is not booted.
C
central management server
See CMS.
certificate
An electronic document that contains a subject's public key and identifying information about the subject.
hypervisor
Computer software, specific to a hardware platform, that allows you to run multiple operating systems on a single host at the same time.
I
iLO
Integrated Lights Out. A self-contained hardware technology available on various hardware models that enables remote management of any node within a system. Subsequent generations of this technology are iLO 2 and iLO 3. For information on which servers offer iLO management processors, see the HP Insight Control for Linux Support Matrix.
ProLiant Support Pack
See PSP.
PSP
ProLiant Support Pack. A set of HP software components that have been bundled together by HP and verified to work with a particular operating system. An HP ProLiant Support Pack contains driver components, agent components, and application and utility components. All these are verified to install together.
PSP dependency script
An optional user-provided script that runs during a PSP deployment to a managed system.
PXE
Preboot Execution Environment.