ProLiant Clusters for SCO UnixWare 7 U/300
Quick Install Guide for the Compaq ProLiant DL380

First Edition (January 2001)
Part Number 221544-001
Compaq Computer Corporation

Compaq Confidential – Need to Know Required
Writer: Rachel Williams
Project: Compaq ProLiant Clusters for SCO UnixWare 7 U/300 Quick Install Guide for the Compaq ProLiant DL380
Comments:
Part Number: 221544-001
File Name: a-frnt.
Notice

© 2001 Compaq Computer Corporation

Compaq, the Compaq logo, NonStop, ProLiant, SmartStart, Compaq Insight Manager, ServerNet, and ROMPaq are registered in the U.S. Patent and Trademark Office. Microsoft, MS-DOS, Windows, and Windows NT are trademarks of Microsoft Corporation in the United States and other countries. Intel and Pentium are trademarks of Intel Corporation in the United States and other countries. UNIX is a trademark of The Open Group in the United States and other countries.
Contents

About This Guide
  Text Conventions
  Symbols in Text
  Symbols on Equipment
  Getting Help

Setting Up Cluster Hardware (continued)
  Cabling the Components
    Using Labeling Standards
    Cabling the ServerNet I Interconnect
    Cabling the Public LAN Connection

  NonStop Clusters Verification Utility
  UPS-Initiated Shutdown

Chapter 5
Troubleshooting
  Installation Problems
    Quick Install Error Messages
About This Guide

Use the Compaq ProLiant Clusters for SCO UnixWare 7 U/300 Quick Install Guide for the Compaq ProLiant DL380 as step-by-step instructions for installation and as a reference for cluster operation and troubleshooting.

Text Conventions

The following conventions distinguish elements of text:

Keys, Buttons: Keys and buttons appear in boldface. A plus sign (+) between two keys indicates that they should be pressed simultaneously.
Symbols in Text

These symbols may be found in the text of this guide. They have the following meanings.

WARNING: Text set off in this manner indicates that failure to follow directions in the warning can result in bodily harm or loss of life.

CAUTION: Text set off in this manner indicates that failure to follow directions can result in damage to equipment or loss of information.
This symbol, on an RJ-45 receptacle, indicates a network interface connection.

WARNING: To reduce the risk of electric shock, fire, or damage to the equipment, do not plug telephone or telecommunications connectors into this receptacle.

This symbol indicates the presence of a hot surface or hot component. If this surface is contacted, the potential for injury exists.

WARNING: To reduce the risk of injury from a hot component, allow the surface to cool before touching.
Getting Help

If you have a problem and have exhausted the information in this guide, you can obtain further information and other help in the following locations.

Compaq Technical Support

In North America, call the Compaq Technical Support Phone Center at 1-800-OK-COMPAQ. This service is available 24 hours a day, 7 days a week. For continuous quality improvement, calls may be recorded or monitored.
Compaq Authorized Reseller

For the name of your nearest Compaq authorized reseller:
■ In the United States, call 1-800-345-1518.
■ In Canada, call 1-800-263-5868.
■ Elsewhere, see the Compaq website for locations and telephone numbers.
Chapter 1
Clustering Overview

A Compaq ProLiant™ Cluster for UnixWare 7 is a collection of servers, storage, and software that allows independent storage and servers to act as a single system. The cluster presents a single-system image to clients. It also protects against hardware, operating system, middleware, and application failures and provides configuration options for load balancing.
Compaq ProLiant Clusters for SCO UnixWare 7 U/300

The Compaq ProLiant Clusters for SCO UnixWare 7 U/300 Quick Install Cluster Kit (U/300 kit) for the ProLiant DL380 server supports specific hardware components, enabling the cluster software to be installed in about an hour.
■ For clusters using Ethernet interconnect:
  - One Compaq NC3123 Fast Ethernet NIC (NC3123 NIC) PCI 10/100 Wake on LAN (WOL) installed into slot 1 of each server for public network access
  - One crossover cable for the cluster interconnect (provided in the cluster kit)
■ For clusters using ServerNet™ I interconnect:
  - One ServerNet I PCI adapter installed into slot 1 of each server
  - Two ServerNet I cables

Storage Components

The U/300 kit for the ProLiant DL380 server supports t
Cluster Integrity Serial Cable

The Cluster Integrity (CI) serial cable listed with the server components is required for the U/300 Quick Install cluster for the ProLiant DL380 server. This cable prevents the condition in which more than one node in a cluster acts and operates as the root node.
An Ethernet cluster interconnect uses the embedded NIC in each server, connected by one Ethernet crossover cable, as shown in Figure 1-2.

[Figure labels: Node 1, Node 2, CI Serial Cable, Dedicated ServerNet I Cables, RA4100]

Figure 1-2.
Cluster-related software provided with the ProLiant DL380 server includes:
■ Compaq SmartStart™ and Support Software CD
■ Compaq Management CD

Additionally, you must obtain software licenses.

NOTE: SCO UnixWare 7 (with Mirroring Option or Online Data Manager) and UnixWare 7 NonStop Clusters software licenses must be purchased through your SCO reseller or distributor.
Quick Install CDs for the ProLiant DL380 Server

The Quick Install CDs for the ProLiant DL380 server provide rapid and simplified cluster installation. These CDs contain all the necessary software already configured for immediate cluster boot. An installation wizard allows you to enter parameters and licenses specific to your configuration. The Quick Install CDs for the ProLiant DL380 server contain a readme.
Compaq ServerNet Verification Utility (SVU)

The Compaq ServerNet Verification Utility (SVU) verifies proper installation and cabling of the Compaq ServerNet I interconnect before a UnixWare software installation. The SVU is a utility run from bootable diskettes inserted into each cluster node.
Compaq Management CD

The Compaq Management CD shipped with ProLiant servers contains software for managing Compaq clusters. The Compaq Insight Manager is included on the CD along with Compaq Management Agents and Tools for Servers for SCO UnixWare 7 NonStop Clusters. The Quick Install process automatically installs the agents and tools.

■ Compaq Insight Manager
  Compaq Insight Manager is an easy-to-use Microsoft Win32 software utility for collecting server and cluster information.
Overview of Cluster Assembly and Software Installation Steps

Use the following general steps to set up your cluster hardware, initialize the hardware, and install the software. The specific procedures are found in the sections noted in these steps:

1. Set up the cluster hardware.
4. Upgrade controller firmware.
   Firmware provides an interface between hardware and software. It is important to use the latest firmware for full hardware functionality. Upgrade the controller firmware using the diskette created as part of server configuration. Refer to “Updating Controller Firmware” in Chapter 3.
5. Verify ServerNet I connections.
Other References

For more information about the RA4100 storage subsystem or RA4000 redundant array controller, refer to the following guides, either as included with your hardware or as found at the Compaq Support website at http://www.compaq.
SCO UnixWare 7 NonStop Clusters Documentation

The SCO UnixWare 7 NonStop Clusters software includes online documentation, which you can view after the cluster is installed. The main documentation set is called SCOhelp and contains information that can answer many administrative questions. SCOhelp is available locally from the UnixWare Desktop and remotely using a Web browser when your cluster is connected to the public network.
Chapter 2
Setting Up Cluster Hardware

Setting up a cluster includes setting up, cabling, and verifying hardware components. Use the following sections to set up the Compaq ProLiant Clusters for SCO UnixWare 7 U/300 for the Compaq ProLiant DL380 Quick Install Cluster:
■ Assembling the Rack
■ Setting Up the Cluster Nodes
■ Setting Up the External Storage Hardware
■ Cabling the Components

For specific information about individual components, see the documentation that comes with the component.
Assembling the Rack

In clusters that use racks, rack assembly requires careful attention to avoid problems. Evaluate the site where the cluster is to be installed by checking the path and setup area.
Stacking Components

Keep in mind the following considerations while stacking components in a rack:
■ Put the UPSs in the bottom of the rack.
■ Assemble other components into the rack from the bottom up.
■ Put the heaviest equipment per U of height in the bottom of the rack whenever possible.
■ Install non-flat-panel monitors toward the top of the rack.
■ Install components that require better cooling capacity toward the top of the rack.
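
As a planning aid, the first three considerations above can be sketched as a simple sort. This is only an illustration: the Component type, its field names, and the weights are hypothetical and do not come from this guide, and the monitor and cooling considerations are not modeled.

```python
from dataclasses import dataclass


@dataclass
class Component:
    name: str
    weight_kg: float      # hypothetical figure, not taken from this guide
    height_u: int         # rack units the component occupies
    is_ups: bool = False


def stacking_order(components):
    """Return components ordered from the bottom of the rack upward.

    UPSs sort first (bottom of the rack); all other components follow
    by weight per U of height, heaviest first.
    """
    return sorted(
        components,
        key=lambda c: (not c.is_ups, -(c.weight_kg / c.height_u)),
    )


if __name__ == "__main__":
    rack = [
        Component("node 1", 20.0, 3),
        Component("storage array", 27.0, 4),
        Component("UPS", 30.0, 3, is_ups=True),
    ]
    for position, component in enumerate(stacking_order(rack), start=1):
        print(position, component.name)
```

Always confirm the resulting order against the documentation that comes with the rack and with each component.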
Transporting Racks

Before transporting a filled rack, read the documentation that comes with the rack to determine the safety measures to take for successful transportation. Never transport a rack without first reviewing the documentation. Develop standard procedures for securing rack equipment depending on the rack and its components.
Setting Up the Cluster Nodes

Setting up the cluster nodes includes:
■ Installing the 64-bit Fibre Channel Host Bus Adapter (HBA) into slot 3 of each node and Gigabit Interface Converters Shortwave (GBIC-SW) into each adapter
■ Installing one 9.
Installing Internal Disk Drives

One 9.1-GB disk drive is required per node. The Quick Install automatically configures each internal drive with a 9.1-GB partition, even if the disk drive is larger than 9.1 GB. This UnixWare partition cannot be modified, and other UnixWare partitions cannot be added to this disk drive.
Setting Up the External Storage Hardware

IMPORTANT: The RA4100 is shipped with a single RAID controller. Each RA4100 array used in Compaq ProLiant Clusters for SCO UnixWare 7 requires an additional, redundant controller.

NOTE: The Quick Install automatically configures an RA4100 drive with a RAID 1 9.1-GB UnixWare partition, even if the disk drive is larger than 9.1 GB. This partition cannot be modified, and other UnixWare disk drive partitions cannot be added to this disk drive.
4. Ignore the chapters on the Array Configuration Utility and the Options ROMPaq in the documentation for the RA4100. These steps are part of the installation procedure in Chapter 3, “Installing Cluster Software.”
5. Install a 9.1-GB or larger disk drive into slot 0 of each array.
Cabling the ServerNet I Interconnect

ServerNet I adapters include X and Y connections for redundancy. Figure 2-1 shows the ServerNet I adapter connections.

[Figure labels: Port X Connector, Port Y Connector, PCI Bus Connector]

Figure 2-1. ServerNet I PCI adapter connections

IMPORTANT: Cable X and Y to their corresponding counterparts. Do not cable X connections to Y connections.
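
The X-to-X and Y-to-Y rule can be expressed as a short check on a written cabling plan before any cables are connected. This is a minimal sketch: the (node, port) tuple representation and the function name are assumptions made for illustration and are not part of any Compaq tool.

```python
def cross_cabled(cables):
    """Return the cables that join an X port to a Y port.

    Each cable is a pair of (node, port) endpoints, for example
    ((1, "X"), (2, "X")) for a cable between the X ports of nodes 1 and 2.
    """
    return [cable for cable in cables if cable[0][1] != cable[1][1]]


if __name__ == "__main__":
    planned = [((1, "X"), (2, "X")), ((1, "Y"), (2, "Y"))]
    mistakes = cross_cabled(planned)
    if mistakes:
        print("Cross-cabled connections:", mistakes)
    else:
        print("All cables join matching ports.")
```

A plan-level check of this kind does not replace verification of the installed cabling on the hardware itself.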
The ServerNet I cables directly connect the ServerNet I adapter in node 1 to the ServerNet I adapter in node 2, as shown in Figure 2-2.

[Figure labels: Node 1, Node 2, CI Serial Cable, X and Y Dedicated ServerNet I Cables, Public Network]

Figure 2-2. Example of cabling the cluster interconnect of a cluster that uses ServerNet I interconnect

NOTE: Cabling for the external storage is intentionally not shown.
Use the cabling suggestions illustrated in Figure 2-3 to label the ServerNet I cables.

Node Number    ServerNet I Switch Port Number    Cable Tie Color
1              0                                 Pink
2              1                                 Orange

X ServerNet I cables are identified with white cable ties. Red ties are used only during shipment and are to be removed during onsite installation.

[Figure labels: X/Y Fabric Identifier, Node Identifier]

Figure 2-3. ServerNet I cable labeling suggestion

To cable the ServerNet I interconnect, follow these steps:

1.
Cabling the Public LAN Connection

For interconnects using ServerNet I, connect the public LAN Ethernet cable to the embedded NIC of the servers. See Figure 2-2 earlier in this chapter.

For interconnects using Ethernet, connect the public LAN Ethernet cable to the NC3123 NIC in slot 1 of the servers. See Figure 2-4.

[Figure labels: Node 1, Node 2, Ethernet Crossover Cable, CI Serial Cable, Public Network]

Figure 2-4.
Cabling the Ethernet Interconnect

An Ethernet crossover cable is required for cluster interconnects using Ethernet. To cable the Ethernet interconnect, connect one end of the Ethernet crossover cable to the embedded NIC in node 1. Connect the other end of the Ethernet crossover cable to the embedded NIC in node 2. Figure 2-4 illustrates the proper cabling.

Cabling the CI Serial Cable

IMPORTANT: The CI serial cable is required.
Cabling the RA4100

To cable the RA4100 components, follow these steps:

1. Connect the Fibre Channel cabling between the arrays and nodes using the instructions in the user guide for the RA4100 and the documentation that comes with the Fibre Channel cables. See Figure 2-5 for a cabling illustration.
Fibre Channel Cable Precautions

Keep the following precautions in mind when installing, handling, moving, connecting, and disconnecting Fibre Channel cables:
■ Affix cable labels carefully, without over-tightening, to avoid breaking the glass fibers within the cables.
■ Do not bend the Fibre Channel cable into an arc tighter than the minimum allowable bend radius specified by the cable manufacturer.
Cabling the Keyboard, Monitor, and Mouse

To cable the keyboard, monitor, and mouse, refer to the documentation that comes with these devices.

UPS Power Management Cabling

Compaq ProLiant Clusters for SCO UnixWare 7 support serial data connections from UPS units to ProLiant server nodes in the cluster.
Chapter 3
Installing Cluster Software

Installing the SCO UnixWare 7 NonStop Clusters software on a ProLiant DL380 cluster with the Quick Install CDs for the ProLiant DL380 server involves several tasks.
Understanding Preinstallation Tasks and Considerations

Before you begin the software installation, assemble the hardware for the cluster, fill out the Quick Install planning worksheets in Appendix B of this guide, and have four formatted diskettes on hand. Read through this chapter to become familiar with the installation procedures as you fill out the worksheets.
Obtaining UnixWare 7 Licenses

Before installing the SCO UnixWare 7 NonStop Clusters software, obtain a UnixWare 7 license that includes either the Mirroring Option or an Online Data Manager (ODM) license. To locate a convenient SCO reseller or distributor to purchase licenses, see the SCO website at http://www.sco.
4. Click Yes when prompted to continue. Wait for the configuration to be erased.
5. If you used the Server Profile Diskette, remove it. Power down the RA4100 if you are erasing the configuration on node 1. When prompted, power down, and then power up only the server.

IMPORTANT: Do not turn the RA4100 back on at this time. Continue with the following procedure for server configuration.
13. Page down to the Embedded - Compaq Automated Server Recovery entry.
14. Verify that the following items are disabled:
    - Software Error Recovery
    - Standby Recovery Server
    - UPS Shutdown
    Use the arrow keys to select the options and the Enter key to modify them as necessary.
15. Page down to Embedded-Compaq Integrated Dual Channel Wide Ultra2 SCSI Controller (Port2).
    - Select Controller Order, and then press Enter.
    - Select First and press F10.
25. Create a firmware diskette at this time if you are completing this step on node 1. You will use this diskette later to upgrade the RA4000 controller firmware. If you are performing this procedure on node 2, skip this step and continue with the following section, “Updating Controller Firmware.”

    To create a firmware diskette, obtain one DOS-formatted diskette, and follow these steps:

    a.
Updating Controller Firmware

Controller firmware must be updated on both nodes. Use the following procedure to upgrade the controller firmware:

1. Turn on the RA4100 and wait about 90 seconds for the RA4000 controllers to complete their POSTs.
2. Insert the firmware upgrade diskette into the drive. (You made this diskette in the preceding procedure.)
3. Boot the node from the diskette, and then follow the prompts on the screen until the firmware is updated.
4.
3. From the options presented to you, select the following:
   - Your particular server
   - The appropriate model or All Models
   - SCO UnixWare 7 from the list of operating systems
4. Select the Softpaq for ServerNet Verification Utilities. At the download page, follow the directions for downloading the Softpaq and creating diskettes.
Verifying Node-to-Node Communication

Node-to-node communication tests include a link test for the cables and a loopback test for the adapters. Use the following steps to verify node-to-node communication on a directly connected ServerNet I two-node cluster:

1. Insert a ServerNet Utility Disk into node 1 and node 2, and then reboot the nodes. Wait for the DOS prompt on the nodes.
2. Type spaf 1 2 at the DOS prompt on node 1, and then press Enter. A title screen displays.
3.
Installing the Cluster Using Quick Install

Before beginning the software installation, be sure to have the Quick Install planning worksheets on hand and the following items available:
■ Cluster name and Cluster Virtual IP (CVIP) address
■ Node 1 hostname and IP address for the public network
■ Node 2 hostname and IP address for the public network
■ Netmask for the public network
■ For clusters
Installing Node 1

Before beginning the installation, select the set of Quick Install CDs for your cluster configuration. Choose the CDs for either the ServerNet I cluster interconnect or Ethernet cluster interconnect.

NOTE: To save time, you can install both nodes together. Be sure node 1 has rebooted before rebooting node 2. Insert the CDs into the servers, power up the servers, and follow the procedures for each node at the same time.
b. Date, Time, and Time Zone
   Modify the current date, time, and time zone as necessary. Only U.S. time zones are available during Quick Install. Time zone information can be changed after Quick Install by using the UnixWare SCOadmin system administration tools. For more information, see the “Understanding Preinstallation Tasks and Considerations” section in this chapter.
c.
6. Enter the node 1 UnixWare license, the node 2 UnixWare license, and the NonStop Clusters license. To complete this step, you must have either UnixWare licenses that include the mirroring license or an add-on license for either the ODM or mirroring. After you exit the license manager, the node continues booting.

NOTE: Node 2 cannot join the cluster until licensing information has been entered on node 1.
IMPORTANT: If you supply information here, this information must match the information that you supplied for the first node. See step 4e of "Installing Node 1" earlier in this chapter. The option to load or save this information to a diskette is not available.

   b. Wait for a message that indicates that the installation is complete.
3. Wait for node 1 to complete the installation and reboot.
4.
Registering the ProLiant Cluster for SCO UnixWare 7

After the cluster is verified, register it at the Compaq High Availability website. Compaq notifies registered users as software updates and additional support for SCO UnixWare 7 NonStop Clusters become available. To register the cluster, see the Compaq High Availability website at http://www.compaq.
Chapter 4
Managing Clusters

Compaq and SCO both provide a variety of software to simplify the management of ProLiant Clusters for SCO UnixWare 7.

SCO cluster management software includes:
■ Clusterized SCOadmin
■ Event Processor Subsystem
■ SCO UnixWare 7 NonStop Clusters Management Suite
■ Clusterized and cluster-specific command line utilities

Compaq provides management capabilities customized for use with ProLiant Clusters for SCO UnixWare 7.
SCO UnixWare 7 NonStop Clusters Management Software

The single-system image of the Compaq ProLiant cluster makes managing a cluster similar to managing a single-node, noncluster UnixWare 7 system. The standard SCO documentation is useful for performing the management tasks.
The SCOadmin software includes the following management applications:
■ Account Manager
■ Filesystem Manager
■ License Manager
■ Login Session Viewer
■ Mail Manager
■ Netscape Server Administrator
■ Print Job Manager
■ Printer Setup Manager
■ SCOadmin Setup Wizard
■ Task Scheduler
■ VERITAS Volume Manager
■ Virtual Domain User Manager

The following SCOadmin folders provide additional management tools:
■ Clustering
■ Compaq
■ Hardware
■ Networking
■ Softwar
NonStop Clusters Management Suite

The NonStop Clusters Management Suite (NCMS) is installed as part of SCO UnixWare 7 NonStop Clusters and includes:
■ Config Manager
■ ServerNet Manager
■ Keepalive Manager
■ Keepalive Configuration Manager
■ Samview

Start NCMS by entering the ncms command at a command line prompt or by selecting an application from the clustering entry in the SCOadmin manageme
Keepalive Configuration Manager

The SCO UnixWare 7 NonStop Clusters Keepalive Configuration Manager provides a graphical user interface to create and manage configuration file sets for applications to be monitored by the Keepalive subsystem.

Samview

The SCO UnixWare 7 NonStop Clusters System Availability Monitor (SAM) Viewer displays availability reports for the cluster, nodes, and other devices.
The SCO UnixWare commands that are clusterized in SCO UnixWare 7 NonStop Clusters include:
■ netstat, inetd, netcfg, rpcbind
■ fuser, df, fsck, mknod, sync
■ VERITAS commands, mount, umount, mountall, umountall
■ init, crash, cron
■ id tools, pdi commands
■ pmd, brand
■ SCOadmin commands
■ sar, ps, ipcs, prtconf
■ shutdown

The following SCO UnixWare commands interrogate the cluster nod
Compaq ProLiant Cluster Management Software for SCO UnixWare 7 NonStop Clusters

Compaq provides the following cluster management capabilities customized for use with Compaq ProLiant Cluster for SCO UnixWare 7 NonStop Clusters. These capabilities are available on the Compaq Management CD shipped with ProLiant servers.
Clusterized Compaq Management Agents

Compaq Management Agents running on a SCO UnixWare 7 NonStop Clusters system support the same client-server interface as a single-server SCO UnixWare 7 system. The client-server interface for Compaq Insight Manager is SNMP-based, which allows ProLiant servers and clusters to be managed by other network management client software.
The Quick Install procedure automatically installs the support needed for Compaq Insight Manager XE. On the Management CD, the package that provides this support is nscccm, which is part of the Compaq Management Agents and Tools for Servers for SCO UnixWare 7 NonStop Clusters portion of the CD. For additional information, refer to the Compaq Insight Manager XE User Guide included on the Management CD.
Configuring SCO UnixWare 7 NonStop Clusters for UPS-Initiated Shutdown

The UPS-initiated shutdown is configured by modifying the OS_SHUTDOWN, UPS_LOG_FILE, and UPS_SERIAL_PORT parameters within the /opt/compaq/etc/nscupsd.cfg configuration file. The OS_SHUTDOWN parameter specifies the battery backup power remaining when a cluster-wide shutdown is initiated.
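
Before editing the file, it can help to confirm which of the three parameters are already set. The following is a minimal sketch that assumes the file holds one KEY=VALUE setting per line and that lines starting with # are comments; both points are assumptions about the file format, and the function names are hypothetical. The sample values are placeholders, not recommended settings.

```python
REQUIRED = ("OS_SHUTDOWN", "UPS_LOG_FILE", "UPS_SERIAL_PORT")


def parse_ups_config(text):
    """Parse KEY=VALUE lines into a dict, skipping blank and '#' lines."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, separator, value = line.partition("=")
        if separator:
            settings[key.strip()] = value.strip()
    return settings


def missing_parameters(settings):
    """Return the required parameter names that are not set."""
    return [name for name in REQUIRED if name not in settings]


if __name__ == "__main__":
    # Placeholder content for illustration only.
    sample = "# example\nOS_SHUTDOWN=<value>\nUPS_LOG_FILE=<path>\n"
    print(missing_parameters(parse_ups_config(sample)))
```

To check a real system, read the text of /opt/compaq/etc/nscupsd.cfg and pass it to parse_ups_config in place of the sample string.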
Two-Node Cluster with a Single Power Supply in Each Node

When using a two-node cluster with two UPSs, as shown in Figure 4-1, configure the UPSs so that the cluster shuts down only if both UPSs are low on power. The loss of a single physical UPS results in the loss of one of the nodes but not the loss of the cluster. In this configuration, both UPSs are combined into a single logical UPS, which results in a UPS_SERIAL_PORT configuration of:

UPS_SERIAL_PORT=/dev/tty00.1:/dev/tty00.
Chapter 5
Troubleshooting

Carefully follow the detailed instructions provided in this guide to avoid unnecessary problems.
Installation Problems

This section addresses problems relating to installation of SCO UnixWare 7 or SCO UnixWare 7 NonStop Clusters.

Table 5-1
Solving Installation Problems

Problem: Server unit does not power up
Possible Cause: Power cord or power source
Action: Check all power cords to ensure that they are fully inserted into the power supply plug and the outlet.
Table 5-1
Solving Installation Problems (continued)

Problem: Error messages regarding the Cluster Integrity (CI) serial cable display
Possible Cause: The CI serial cable is not properly installed
Action: Install the CI serial cable between node 1 and node 2 using the serial port connector B on each node.
Quick Install Error Messages

This section addresses errors relating to Quick Install installation.

Table 5-2
Quick Install Error Messages

Error Message: No disks found
Possible Cause: No internal disk drive
Action: Add a 9.1-GB or larger disk drive and configure the system with SmartStart. See Chapter 2, “Setting Up Cluster Hardware,” and Chapter 3, “Installing Cluster Software,” of this guide.
Node-to-Node Communication Problems

This section addresses problems relating to node-to-node communication.

Table 5-3
Solving Node-to-Node Communication Problems

Problem: New node does not join the cluster
Possible Cause: Ethernet crossover cable is not correctly cabled or is defective
Action: Verify that the Ethernet crossover cable is connected as described in Chapter 2 of this guide.

Possible Cause: Embedded NIC is not functioning correctly
Action: Verify that the embedded NIC is correctly configured.
Table 5-3
Solving Node-to-Node Communication Problems (continued)

Problem: Existing node does not rejoin the cluster
Possible Cause: Node hardware failure
Action: Disconnect the node from the cluster. Diagnose and repair hardware failures as a stand-alone ProLiant server.
Table 5-3
Solving Node-to-Node Communication Problems (continued)

Problem: Alternating root node panics (RA4100 system)
Possible Cause: RA4100 storage subsystems or hubs are not powered up
Action: Apply power to the hubs and storage subsystems.

Possible Cause: Ethernet connection failed (and the CI serial cable is not used)
Action: Power down the cluster. Check that the Ethernet crossover cable is properly connected and is not crimped or compromised in any way.
Table 5-3
Solving Node-to-Node Communication Problems (continued)

Problem: Alternating root node panics
Possible Cause: ServerNet I cross-cabled in two-node cluster (and the CI serial cable is not used)
Action: Power down the cluster. Verify that ServerNet I is cabled between cluster nodes (X to X and Y to Y) as described in Chapter 2 of this guide. Correct the cabling and boot the cluster.
Table 5-3 Solving Node-to-Node Communication Problems continued

Problem: Bad packets or ServerNet I barrier errors reported
Possible Cause: SPA is defective
Action: The Cluster Membership Service (CLMS) master (the active root node) is unable to communicate with a node during startup or normal operation. If a node does not join the cluster, verify that the SPA is functioning on that node using the SVU as described in Chapter 3 of this guide.
Shared Storage Problems

This section addresses problems that can be encountered in clusters using the Compaq StorageWorks RAID Array 4100 (RA4100) storage system. It does not address problems specific to the RA4100 storage system itself. For those issues, see the user guide for the RA4100 and the Fibre Channel troubleshooting guide.
Table 5-4 Solving Shared Storage Problems continued

Problem: Unstable loop errors resulting in the adapter being taken offline
Possible Cause: GBIC-SW laser has malfunctioned
Action: Shut down the node containing the adapter, and then use that node to diagnose and isolate the problem, following the information in the user guide for the RA4100 and the Fibre Channel troubleshooting guide. Replace any defective component.
Client-to-Cluster Connectivity Problems

This section addresses problems relating to client-to-cluster connectivity.

Table 5-5 Solving Client-to-Cluster Connectivity Problems

Problem: Clients cannot communicate with a node (or nodes) over Ethernet
Possible Cause: Improper name resolution
Action: Verify that the /etc/resolv.conf file within the cluster lists the correct domain name servers.
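The name-resolution check in Table 5-5 can be partly scripted. The following is a minimal Python sketch, not part of the Compaq or SCO tooling. It reads resolv.conf-style text and reports how the `nameserver` entries differ from the servers you expect. The function names and the example addresses are illustrative assumptions, and the sketch assumes the usual `nameserver <address>` line syntax, treating `#` and `;` as comment markers.

```python
def parse_nameservers(resolv_conf_text):
    """Return the nameserver addresses found in resolv.conf-style text, in order."""
    servers = []
    for raw_line in resolv_conf_text.splitlines():
        # Drop anything after a comment marker, then surrounding whitespace.
        line = raw_line.split("#", 1)[0].split(";", 1)[0].strip()
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def check_nameservers(resolv_conf_text, expected):
    """Return (missing, unexpected) nameserver lists relative to `expected`."""
    configured = parse_nameservers(resolv_conf_text)
    missing = [s for s in expected if s not in configured]
    unexpected = [s for s in configured if s not in expected]
    return missing, unexpected


if __name__ == "__main__":
    # Hypothetical file contents and addresses, for illustration only.
    sample = "domain example.com\nnameserver 192.0.2.10\nnameserver 192.0.2.99\n"
    missing, unexpected = check_nameservers(sample, ["192.0.2.10", "192.0.2.11"])
    print("missing:", missing)
    print("unexpected:", unexpected)
```

To check a live node, pass the contents of /etc/resolv.conf to `check_nameservers` in place of the sample string. The sketch only compares the configured entries; it does not test whether the servers actually respond.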
Table 5-5 Solving Client-to-Cluster Connectivity Problems continued

Problem: CVIP is not accessible after a node failure
Possible Cause: Cluster virtual interface has no available public network interfaces on the same subnet
Action: Configure the cluster so that at least two public network NICs on two different nodes have IP addresses on the same subnet as the CVIP address.
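The CVIP rule in Table 5-5 (at least two public NICs, on two different nodes, on the same subnet as the CVIP address) can be checked against a planned address list before it is applied. The following Python sketch uses the standard `ipaddress` module. The function names, node names, addresses, and netmask are hypothetical values chosen for illustration; they do not come from this guide or from any cluster utility.

```python
import ipaddress


def nodes_on_cvip_subnet(cvip, netmask, public_nics):
    """Return the sorted node names that have a public NIC on the CVIP subnet.

    public_nics is a list of (node_name, ip_address) pairs.
    """
    # strict=False lets us build the network from a host address plus netmask.
    subnet = ipaddress.ip_network(f"{cvip}/{netmask}", strict=False)
    return sorted({node for node, addr in public_nics
                   if ipaddress.ip_address(addr) in subnet})


def cvip_can_fail_over(cvip, netmask, public_nics):
    """True when at least two different nodes have a NIC on the CVIP subnet."""
    return len(nodes_on_cvip_subnet(cvip, netmask, public_nics)) >= 2


if __name__ == "__main__":
    # Hypothetical plan: node2's public NIC is on a different subnet.
    plan = [("node1", "192.0.2.1"), ("node2", "198.51.100.2")]
    print(nodes_on_cvip_subnet("192.0.2.100", "255.255.255.0", plan))
    print(cvip_can_fail_over("192.0.2.100", "255.255.255.0", plan))
```

Counting distinct nodes rather than NICs matters here: two NICs on the same node would not keep the CVIP reachable if that node failed.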
Cluster Resource Problems

This section addresses problems relating to cluster resources.

Table 5-6 Solving Cluster Resource Problems

Problem: Device is not seen on all nodes in a cluster
Possible Cause: Mismatched kernels
Action: Ensure that all nodes are in the cluster, and then reboot node 2.
ServerNet I Messages

Use this section to interpret and respond to the following types of messages:

■ ServerNet I SAN Error Messages
■ ServerNet I Notice Messages
■ ServerNet I Warning Messages
■ ServerNet I Panic Messages
■ ServerNet I Continuation and Informative Messages

For information about ServerNet, see the NonStop Clusters for the SCO UnixWare 7 System Administrator’s Guide located in the SCOhelp online documentation set.
Table 5-7 lists the severity text strings, explains what they mean, and references the tables that contain the message details.
ServerNet I Notice Messages

This section addresses ServerNet I Notice Messages.

Table 5-9 ServerNet I Notice Messages

Message: Barrier failed on path:n snetID:0xF0nnn curpath:n
Description: These messages display when a new node attempts to join a cluster. The message indicates whether the new node is able to communicate with the target node over the given path (X/Y). If a path is cabled, a success message is expected.
Table 5-9 ServerNet I Notice Messages continued

Message: Link exception condition on path n has been resolved. Re-enabling path n
Description: Indicates that a link exception condition on a path is resolved and that link exception detection and processing is re-enabled for that path. The path becomes available for ServerNet I communications within the next minute.
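When many notice messages scroll past during a join attempt, it can help to extract the "Barrier failed" entries listed in Table 5-9 from saved console output. The following Python sketch does this with a regular expression built from the message template in the table. It assumes that each `n` placeholder is replaced by a single token without spaces and that `snetID` is a hexadecimal value; the real console layout may differ, and the names `BARRIER_FAILED` and `find_barrier_failures` are illustrative only.

```python
import re

# Pattern derived from the template "Barrier failed on path:n snetID:0xF0nnn curpath:n".
# The token shapes (\S+ and hex digits) are assumptions, not documented formats.
BARRIER_FAILED = re.compile(
    r"Barrier failed on path:\s*(?P<path>\S+)\s+"
    r"snetID:\s*(?P<snet_id>0x[0-9A-Fa-f]+)\s+"
    r"curpath:\s*(?P<curpath>\S+)"
)


def find_barrier_failures(log_text):
    """Return one dict per 'Barrier failed' message found in log_text."""
    return [match.groupdict() for match in BARRIER_FAILED.finditer(log_text)]


if __name__ == "__main__":
    # Hypothetical console text, for illustration only.
    sample = (
        "Barrier failed on path:0 snetID:0xF0001 curpath:1\n"
        "some unrelated console line\n"
    )
    for failure in find_barrier_failures(sample):
        print(failure)
```

Grouping the results by `path` shows whether failures are confined to one ServerNet I path, which narrows down which cable to inspect first.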
ServerNet I Warning Messages

The ServerNet I warning messages are listed in Table 5-10. Messages are listed in alphabetical order, except where a series of messages associated with a single fault condition is grouped together. These groups are alphabetized under the first message in the series. If you cannot find a particular message, look toward the end of the table, where multiple messages having the same description are grouped together and are not in alphabetical order.
Table 5-10 ServerNet I Warning Messages continued

Message: (0xF0nnn) Multiple link exceptions detected on path n
Description: This series of messages indicates that a burst of link exceptions was detected on a ServerNet I path. Link exception reporting must be enabled (see the spam -l on command) for these messages to be displayed.
User Action: Check the cabling at the local node on the indicated path.
Table 5-10 ServerNet I Warning Messages continued

Message: 0xF0nnn: rcvd spurious packet acknowledge, src=0xnnnnnnnn
Description: Indicates that an unexpected packet acknowledgment arrived. Usually this message can be linked with a [SNET] timeout message: the acknowledgment for the packet that timed out arrived late.
User Action: None. Watch for additional [SNET] timeout messages.
Table 5-10 ServerNet I Warning Messages continued

Message: (0xF0nnn) exception queue error
Description: These messages indicate that hardware error conditions were detected during interrupt processing. Queue overruns and transmitter/receiver overflows indicate a potential loss of a response due to buffer space exhaustion.
User Action: None.
ServerNet I Panic Messages

The ServerNet I panic messages are listed in Table 5-11. Most of the messages are in alphabetical order. However, if you cannot find a particular message, look toward the end of the table, where multiple messages having the same description are grouped together and are not in alphabetical order.
Table 5-11 ServerNet I Panic Messages continued

Message: ship_PCI_initialize: Found n ServerNet I PCI adapters— currently only one ServerNet I PCI adapter supported
Description: Indicates that during the discovery and initialization of the SPA, more than one SPA was found.
User Action: Ensure that only one SPA 1.5 revision E is installed in the local node.
Table 5-11 ServerNet I Panic Messages continued

Message: ship_init: Unsupported revision of the SAIL ASIC detected CIN=0xnnnnnnnn
Description: Indicates that an SPA was found, but the SAIL ASIC on it is not a recognized revision. The driver recognizes revisions A and B of the SAIL ASIC; however, B is the only revision supported by Compaq ProLiant Clusters for SCO UnixWare 7.
User Action: Replace the SPA with a version containing revision B of the SAIL ASIC (SPA 1.5 revision E).
Table 5-11 ServerNet I Panic Messages continued

Message: avt_define_q: invalid interrupt queue size: nnnn
Description: These messages are all SPAD (software) errors.
User Action: If possible, take a crash dump for analysis by product support personnel. Reboot the node into the cluster.
ServerNet I Continuation and Informative Messages

The ServerNet I continuation and informative messages are listed in alphabetical order in Table 5-12.

Table 5-12 ServerNet I Continuation and Informative Messages

Messages:
AVT entry 0xnnnnnnnn @ 0xnnnnnnnn: I/O Address 0xnnnnnnnn, Type = Data
AVT entry 0xnnnnnnnn @ 0xnnnnnnnn: I/O Address = 0xnnnnnnnn, Type = Interrupt
Description: These are two separate cases of continuation messages.
Table 5-12 ServerNet I Continuation and Informative Messages continued

Message: Dump of Exception Packet @ 0xnnnnnnnn
Description: This continuation message is followed by additional information from the packet in question, which was not expected.
Appendix A Software Versions

Software versions provided by the Quick Install CDs for the SCO UnixWare 7 NonStop Clusters include:

■ SCO UnixWare 7.1.1
■ SCO UnixWare 7 NonStop Clusters 7.1.1+IP, PTF nsc1011c, PTF nsc1013a
■ Compaq EFS 7.38a
■ Compaq Management Agents 4.90
■ System partition created from the Compaq SmartStart and Support Software CD 4.90

Additional software and versions needed include:

■ Compaq SmartStart and Support Software CD 4.
Appendix B Quick Install Planning Worksheets

The following worksheets help you gather and organize the information that you need for the SCO UnixWare 7 NonStop Clusters Quick Install procedures described in Chapter 3, “Installing Cluster Software,” for the Compaq ProLiant DL380 server. Fill out these worksheets before you begin the software installation, and use the data where needed in the procedures.
Table B-1 Quick Install Data continued

Screen: e (Not used for ServerNet I cluster)

Field: Node 1 hostname for the cluster interconnect
Your Information: node1-ic

Field: Node 1 IP address for the cluster interconnect
Your Information: 10.1.0.1

Field: Node 2 hostname for the cluster interconnect
Your Information: node2-ic

Field: Node 2 IP address for the cluster interconnect
Your Information: 10.1.0.2

Field: Netmask
Your Information: 255.255.255.
Table B-2 SCO UnixWare License Worksheet

Columns: Field, Your Information

■ Node 1 license number
■ Node 1 license code
■ Node 1 license data (if necessary)
■ NonStop Cluster Two-Node License
■ Node 2 license number
■ Node 2 license code
■ Node 2 license data (if necessary)
Glossary

CI Serial Cable
See Cluster Integrity Serial Cable

CLMS
See Cluster Membership Service

Cluster Integrity Serial Cable
The Cluster Integrity (CI) serial cable is a serial cable that connects to a serial port on each node in a two-node cluster. The cable prevents split-brain, a condition that results in both nodes in a two-node cluster trying to operate as the root node.
CVIP
See Cluster Virtual IP

Desktop Management Interface
Desktop Management Interface (DMI) is an industry framework for managing and keeping track of hardware and software components in a system of personal computers from a central location.

DMI
See Desktop Management Interface

Ethernet Crossover Cable
The Ethernet crossover cable provides the node-to-node communication data path for the cluster.
PCI
See Peripheral Component Interconnect

Peripheral Component Interconnect
Peripheral Component Interconnect (PCI) is an interconnection bus system that provides high-speed operation.

SAIL
See ServerNet Advanced Interface Logic

SAN
See Storage Area Network

ServerNet Advanced Interface Logic
ServerNet Advanced Interface Logic (SAIL) converts software requests into ServerNet operations.
SNMP
See Simple Network Management Protocol

SPA
See ServerNet PCI Adapter

Split-Brain
Split-brain is a condition that results in both nodes in a two-node cluster trying to operate as the root node. The use of the CI serial cable, which is included in this cluster kit, eliminates the possibility of split-brain.