Servers Troubleshooting Guide First Edition (November 1999) Part Number 161434-001 Compaq Computer Corporation Compaq Confidential – Need to Know Required Writer: Susana Weinstein Salazar Project: Compaq Servers Troubleshooting Guide Comments: Part Number: 161434-001 File Name: a-frnt.
Notice The information in this publication is subject to change without notice. COMPAQ COMPUTER CORPORATION SHALL NOT BE LIABLE FOR TECHNICAL OR EDITORIAL ERRORS OR OMISSIONS CONTAINED HEREIN, NOR FOR INCIDENTAL OR CONSEQUENTIAL DAMAGES RESULTING FROM THE FURNISHING, PERFORMANCE, OR USE OF THIS MATERIAL.
Contents

About This Guide
  Intended Audience  viii
  How to Use This Guide  ix
  Compaq Resources  x
  Contacting Compaq

Hardware Problems continued
  General Hardware Problems  2-5
    General Loose Connections  2-5
    Power Processor Module (PPM) Problems  2-5
    Integrated Management Display (IMD) Problems  2-6
    When the Self Tests Fail

Chapter 4 Diagnostic Tools
  Accessing Diagnostic Tools  4-4
    Run from System Partition  4-4
    Run from Diskette  4-5
    Run from Compaq SmartStart and Support Software CD  4-5
  Compaq Diagnostics Software

Chapter 5 ROMPaq Disaster Recovery
Chapter 6 Automatic Server Recovery
Chapter 7 Preventing Future Problems
  Preparing for Changes  7-1
    Minimizing the Impact of Changes Using Compaq Tools  7-2
  Use a Methodology  7-4
  Visually Check Your Server

List of Tables
  Location of Information  ix
  Compaq Resources  x
  Table 1-1 Collect the Facts  1-3
  Table 1-2 Action Observation
About This Guide This guide is provided as a tool to troubleshoot Compaq servers. IMPORTANT: The chapters in this guide provide general information for several Compaq servers. Some hardware or software information covered may not apply to your specific server. Some of the examples or procedures may need to be modified for your work environment.
About This Guide How to Use This Guide To find help for a specific problem you are troubleshooting, first refer to “Diagnosis Steps” in Chapter 1 and your server-specific troubleshooting guide. To find general information about troubleshooting topics, use the following chart to understand the organization of the book.
■ What status indicators your server provides and what they mean
■ How to interpret the LED definitions for your server
■ How to troubleshoot the latest Power-On Self-Test (POST) messages
■ How to troubleshoot array controller error information from the Compaq Array Diagnostic Utility (ADU)

Compaq Resources

Refer to the following additional information for help.
About This Guide Compaq Resources continued Resource What it is How to obtain Compaq support on commercial online networks A forum to post questions to Compaq technical support or other Compaq enthusiasts by using the Message Base Feature, a standard on Compaq support forums found on all three online networks. You can access Compaq utility files, drivers, software, and other Compaq related information. America Online Technical warranty and support information provided through a facsimile machine.
xii Compaq Servers Troubleshooting Guide Compaq Resources continued Resource What it is How to obtain SmartStart and Support Software CD The latest system software updates, in addition to an efficient way to deploy new systems. Run the SmartStart and Support Software CD shipped with your Compaq server. The CD is also included with Compaq Request, Replacement, and Subscription packs.
About This Guide Contacting Compaq If you have a problem and have exhausted the information in this guide, you can get further information and other help in the following locations. In the United States and Canada, call the Compaq Technical Support Center at 1-800-OK-COMPAQ (1-800-652-6672), where a technical support specialist will help you diagnose the problem. For continuous quality improvement, calls may be recorded or monitored.
xiv Compaq Servers Troubleshooting Guide Warning Information For a complete list of warnings associated with your server, refer to the user documentation provided with your server. WARNING: To reduce the risk of personal injury from hot surfaces, allow the internal system components to cool before touching. WARNING: This product is very heavy.
About This Guide Symbols on Equipment These icons may be located on equipment in areas where hazardous conditions may exist. Any surface or area of the equipment marked with these symbols indicates the presence of electrical shock hazards. Enclosed area contains no operator serviceable parts. WARNING: To reduce the risk of injury from electrical shock hazards, do not open this enclosure. Any RJ-45 receptacle marked with these symbols indicates a Network Interface Connection.
xvi Compaq Servers Troubleshooting Guide weight kg weight lb WARNING: Any product or assembly marked with these symbols indicates that the component exceeds the recommended weight for one individual to handle safely. WARNING: To reduce the risk of personal injury or damage to the equipment, observe local occupational health and safety requirements and guidelines for manual material handling.
About This Guide Text Conventions This document uses the following conventions to distinguish elements of text: Keys Keys appear in boldface. A plus sign (+) between two keys indicates that they should be pressed simultaneously. USER INPUT User input appears in a different typeface and in uppercase. FILENAMES File names appear in uppercase italics. Menu Options, Command Names, Dialog Box Names These elements appear in initial capital letters.
xviii Compaq Servers Troubleshooting Guide Symbols in Text These symbols may be found in the text of this guide. They have the following meanings. WARNING: Text set off in this manner indicates that failure to follow directions in the warning could result in bodily harm or loss of life. CAUTION: Text set off in this manner indicates that failure to follow directions could result in damage to equipment or loss of information.
Chapter 1 Diagnosing the Problem This chapter covers the steps you should take when an error occurs. Going through a structured set of tasks will help you to isolate the problem quickly. Use this chapter to help you: ■ Devise a troubleshooting plan. ■ Gather and record all of the facts before trying to troubleshoot. ■ Follow an orderly procedure for diagnosis.
1-2 Compaq Servers Troubleshooting Guide Developing a Troubleshooting Plan Evaluate all of the information and symptoms to: ■ Collect the facts. Use the following “Gathering Information” section. ■ Analyze the mode of failure. ■ Identify which components could cause the problem. ■ Identify any third-party components. This is a very important step. Find out exactly what the board is and which slot it is installed in. Some third-party PCI boards must be installed on the primary PCI bus.
Diagnosing the Problem Gathering Information If you encounter an error condition, gather the following information first. Having these details available will reduce your troubleshooting time. This information will also help the Compaq support person diagnose and solve your problem.
Table 1-1 Collect the Facts continued

Question: What are the symptoms, and under what conditions do they appear?
Examples:
  Did the server suddenly shut down?
  Does the server keep rebooting?
  Did the server ever boot?
  Do errors occur only when a specific application runs?
  Are there random errors?
  Are there intermittent problems?
Your Facts:

Question: Is there any failure information?
Examples:
  Are there messages: Record the FULL error message.
Your Facts:
Table 1-1 Collect the Facts continued

Question: What are the hardware components?
Examples:
  What is the system configuration:
    Memory?
    Processor(s)?
    Processor speed(s)?
    Cache memory?
    Controllers?
  Run the Inspect Utility. Boot the system and press F10 when you see the following message:
    “Press F10 for system partition utilities”
  Run the Compaq Survey Utility (for servers running Windows NT or NetWare).
  Run Compaq Insight Manager.
Your Facts:
Table 1-1 Collect the Facts continued

Question: Is the utilization rate/traffic appropriate?
Examples:
  What is the bus utilization shown in Compaq Insight Manager?
  What utilization information do your third-party tools provide?
  How does the current utilization differ from the history?
Your Facts:

Prepare Server for Diagnosis

After you record the facts, complete as many steps as possible to prepare the server for more detailed troubleshooting procedures, then see
Diagnosing the Problem Execute the Action Plan As you complete the following steps, carefully observe each step of the action plan as it is executed, and watch for the occurrence of new symptoms, or the elimination of existing ones. Some results are obvious, such as the introduction of informational or error messages, or of significant changes in functionality. Other changes may not be as obvious and may require checking system logs to see if any new event was recorded after the change was made.
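The log check described above can be made concrete by comparing event timestamps against the time the change was made. The following is a minimal Python sketch, not a Compaq tool: the (timestamp, message) record layout, the events_after name, and the sample entries are all assumptions made for illustration.

```python
from datetime import datetime

def events_after(events, change_time):
    """Return log events recorded strictly after the change was made.

    `events` is a list of (timestamp, message) tuples. This layout is an
    assumption for illustration, not an actual system log format.
    """
    return [(ts, msg) for ts, msg in events if ts > change_time]

# Hypothetical log entries, invented for this example.
log = [
    (datetime(1999, 11, 1, 9, 0), "Fan inserted"),
    (datetime(1999, 11, 1, 10, 30), "Corrected memory error"),
    (datetime(1999, 11, 1, 11, 15), "Network link down"),
]

# Time at which the action-plan step was executed.
change = datetime(1999, 11, 1, 10, 0)

new_events = events_after(log, change)
for ts, msg in new_events:
    print(ts.isoformat(), msg)
```

Recording the time of each change alongside the step itself keeps this comparison simple when several changes are made in one session.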
1-8 Compaq Servers Troubleshooting Guide Diagnosis Steps This section provides a quick path to help you locate detailed troubleshooting sections in the remainder of this book.
Table 1-3 Diagnosis Steps continued

Problem                                     Location of information
New hardware recently added                 General Hardware Problems in Chapter 2
Server Pack upgrade                         Software Problems in Chapter 3
Trouble accessing data on the hard drive    Hard Drive Problems in Chapter 2
You have corrected the problem and want     Chapter 7
to make sure you prevent future problems
None of the techniques attempted work       Additional Resources in preface
1-10 Compaq Servers Troubleshooting Guide Contacting Compaq Information You Will Need Before contacting Compaq, obtain the following: 1. All of the information from the “Gathering Information” section earlier in this chapter. 2. A printed copy of the system and operating environment information and a copy of any historical data that might be relevant. To obtain this information, run Inspect, found in the latest version of Compaq Diagnostics. 3.
Diagnosing the Problem Operating System Information Make sure you have the following operating system information available prior to contacting your service provider. If possible, gather this information about the last working version and the current version. IMPORTANT: This guide provides general information for several Compaq servers. Some hardware or software information covered may not apply to your specific server. Some of the examples or procedures may need to be modified for your work environment.
IBM OS/2 Operating System 2.x, IBM OS/2 Warp 3.0, IBM OS/2 Warp Server

Collect the following information:
■ Operating system version and printouts of:
  - IBMLAN.INI
  - PROTOCOL.INI
  - CONFIG.SYS
  - STARTUP.CMD
  - SYSLEVEL information in detail
  - A directory listing of:
      C:\
      C:\OS2
      C:\OS2\BOOT
      HPFS386.
Microsoft Windows NT Operating System and Windows NT Server

Collect the following information:
■ A current copy of these files:
  - WINMSD
  - BOOT.INI
  - Windows NT Event Log. If there are errors, note them.
Sun Solaris Operating System

Collect the following information:
■ Operating system version
■ If Compaq drivers are installed with Driver Updates:
  - Driver Update (DU) number
  - List of Compaq drivers in the DU
■ The drive subsystem and file system information:
  - Number of partitions and logical drives, and their sizes
  - File system on each logical drive
■ List of all third-party hardware and software installed and the versions, if possible
■ Detaile
SQL Server

If any of the above involve SQL Server for IBM OS/2 or Microsoft Windows NT, collect:
■ General Information:
  - Description of the database layout
  - Database activity prior to the problem
  - Description of how to reproduce the problem, if available
  - Names and functions of all stored procedures
  - All available information used to troubleshoot the problem at this point
■ SQL Server version
■ Master Database Configuration information
■ SQL Server Configuration
Chapter 2
Hardware Problems

Table 2-1 Hardware Information

For information about                          Look
Details about the specifics for your server    Your server’s specific troubleshooting guide
Audio problems                                 on page 18
CD-ROM problems                                on page 17
Diskette problems                              on page 16
Fan problems                                   on page 12
General hardware problems                      on page 5
Hard drive problems                            on page 19
Memory problems                                on page 13
Mouse/keyboard problems                        on page 22
Network problems                               on page 27
New hardware not recognized                    on page 8
Power pro
Power Problems

This section contains methods of diagnosis for all Compaq servers. See the server-specific troubleshooting guide for your server’s hardware components.

Power Source

Table 2-2 Power Source Checklist

Item: Power On/Standby switch
What to Check: Make sure the switch is on. If your server has a power on/standby switch that returns to its original position after being pressed, make sure you press it firmly.
Hardware Problems Detecting UPS Failures Check the software version for your UPS to ensure it is the current version. Use the Compaq Power Management software located on your Power Management CD. Also make sure the power cord is the correct one for the UPS and the country. Refer to the UPS reference guide for specifications. If both the software and power cord are correct, use the following table to further troubleshoot the problem.
2-4 Compaq Servers Troubleshooting Guide System Short When powering on the server, if the power status LED blinks once every 10-20 seconds, turns amber, or stays off, the system is trying to start, but may have a short. CAUTION: Never operate the server with an access panel removed for an extended period of time. Doing so may cause thermal damage to drives and components and could void your system warranty. Perform the following. 1.
Hardware Problems General Hardware Problems General Loose Connections Devices often will not work because they are not properly connected or are not supported. If the system does not complete Power-On Self-Test (POST) or start loading an operating system, complete these checks. 1. Verify that the power supply status LED is green. If it is not, the power supply has experienced a failure. 2. Is the power cord properly connected? If not, plug the power cord in firmly and correctly. 3.
2-6 Compaq Servers Troubleshooting Guide Integrated Management Display (IMD) Problems If the server has an Integrated Management Display (IMD): ■ Verify that the IMD backlight is on. ■ If the IMD backlight is not on, check the IMD cable to ensure it is not damaged, and that it is properly connected. ■ If the IMD backlight is on, check the IMD contrast. You can adjust the level of contrast on the Integrated Management Display by using the up and down arrows.
Hardware Problems When the Self Tests Fail If the Power-On Self-Test (POST) appears hung, one of the self-tests is unable to complete. Briefly remove the designated access panel to inspect, and perform each of the following. Restart the server after each action to see if the problem is resolved. 1. Verify all required switch settings are set as dictated by your server’s user documentation. 2. Verify that all expansion boards, drives, and processors are firmly seated, and all latches are firmly closed. 3.
2-8 Compaq Servers Troubleshooting Guide Supported Devices Refer to the documentation provided with the device. Make sure your server and your operating system support the device. Also verify that you have the latest drivers required for the device. See the “Maintaining Current Drivers” section in Chapter 3. New Hardware Not Recognized or Server Will Not Start After Adding Hardware Use the following steps to troubleshoot problems that occur after you add hardware to the server.
8. Make sure the boards are properly installed in the unit.
9. Check the software requirements. Install the latest drivers.

Removing the Device

If all of the above steps fail to correct the problem, uninstall the hardware and check the following:
1. Does the server work with the device removed?
2. For SCSI devices, does the device work if it is the only device on that bus?
3. If appropriate, test the board with all other boards removed.
4. Move the device to a different slot on the same bus.
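The isolation checks above amount to testing one board at a time and recording which ones fail on their own. The following minimal Python sketch shows that bookkeeping only. The works_alone callback is hypothetical: it stands in for the manual step of installing a single board, restarting the server, and noting the result, and the board names and results are invented.

```python
def find_suspect_boards(boards, works_alone):
    """Return the boards that fail when tested with all other boards removed.

    `works_alone(board)` is a hypothetical callback standing in for a manual
    test: install only that board, restart the server, record the outcome.
    """
    return [board for board in boards if not works_alone(board)]

# Invented example data: results recorded by hand for each board.
observed = {
    "network controller": True,
    "SCSI controller": False,
    "video board": True,
}

suspects = find_suspect_boards(list(observed), lambda board: observed[board])
print(suspects)  # ['SCSI controller']
```

A board that works alone but fails alongside others points to a conflict rather than a faulty board, which is why step 4 tries a different slot on the same bus.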
2-10 Compaq Servers Troubleshooting Guide Video Problems When you first start the server, the monitor should display the Compaq ProLiant logo. If there is nothing displayed within approximately 60 seconds, check the following: Table 2-4 Video Problems Problem Possible Cause Possible Solution Screen is blank. Monitor is not turned on and the monitor light is not on. Turn on the monitor and check that the monitor light is on. Verify the monitor power cord is plugged into a working grounded AC outlet.
Hardware Problems Table 2-4 Video Problems continued Problem Possible Cause Possible Solution The power-on password is enabled. Press any key or type your password and wait a few moments for the screen to activate. You can tell if the power-on password is enabled if a key icon appears on the screen when POST completes. If you do not have access to the password, you must disable the power-on password by using the Password Disable switch on the system board.
2-12 Compaq Servers Troubleshooting Guide Fan Problems 1. Be sure there is proper ventilation. Refer to your server’s user documentation for further requirements. IMPORTANT: For good airflow, keep all access panels closed whenever possible. 2. Check any Power-On Self-Test (POST) messages for temperature violation or fan failure information. Refer to your server’s user documentation for Temperature Requirements for your server. 3. If possible, access the Integrated Management Log.
Hot-Plug Fan LEDs

Use the following table to troubleshoot hot-plug fan LED problems.

Table 2-5 Hot-Plug Fan LEDs

LED      Fan Status                  Possible Solution
Green    Power to fan; fan is OK.    None required.
Amber    Fan problems.               Replace fan.
Off      No power to fan.            Make sure fan is properly seated.
                                     Make sure power to fan is good.
                                     Replace fan.

Memory Problems

Use the following steps to troubleshoot memory problems. Be sure the memory is inserted as required by your server.
2-14 Compaq Servers Troubleshooting Guide Memory Count Problem The memory modules may not be installed correctly. 1. Verify that the memory modules have been installed correctly. Refer to your server’s user documentation. 2. Make sure the memory modules are properly seated. 3. Verify your operating system’s error information. 4. Restart the server. If the Power-On Self-Test (POST) count is still wrong, replace the memory. Server Fails to Recognize New Memory 1. Check the Integrated Management Log. 2.
Hardware Problems Server Fails to Recognize Existing Memory Use the following information to troubleshoot memory problems. Table 2-6 Existing Memory Problems Problem Action Server fails to recognize complete amount of memory. 1. Reseat memory. 2. Run the System Configuration Utility. 3. If the server still fails to recognize the memory, replace the memory. 1. Verify that the memory modules have been installed correctly. 2. Make sure the memory modules are properly seated. 3.
Diskette Problems

Use the following table to troubleshoot diskette drive problems.

Table 2-7 Diskette Problems

Problem: Diskette drive light stays on.
  Possible Cause: Diskette is damaged.
  Possible Solution: Run CHKDSK on the diskette.

  Possible Cause: Diskette is incorrectly inserted.
  Possible Solution: Remove diskette and reinsert.

  Possible Cause: Software is corrupt.
  Possible Solution: Check the program diskettes or reinstall software from original media.

  Possible Cause: Drive cable is not properly connected.
  Possible Solution: Reconnect drive cable.
CD-ROM Problems

Use the following table to troubleshoot CD-ROM drive problems.

Table 2-8 CD-ROM Problems

Problem: System will not boot from CD-ROM drive.
  Possible Cause: The CD-ROM boot is not enabled through the Setup utility.
  Possible Solution: Run the Setup utility and set the drive priorities.

Problem: Data read from CD-ROM drive is inconsistent, or drive cannot read data.
  Possible Cause: The CD-ROM drive or the media inserted is dirty.
  Possible Solution: Clean the CD-ROM drive and media.
2-18 Compaq Servers Troubleshooting Guide Audio Problems Use the following table to troubleshoot audio problems. Table 2-9 Audio Problems Problem Possible Cause Possible Solution Server does not beep during the Power-On Self-Test (POST). If speaker has a cable, it is not properly attached. Ensure that speaker cable is connected. Refer to the maintenance and service guide for your server. You can access the guide from the Compaq website: http://www.compaq.
Hardware Problems Hard Drive Problems Use this section to troubleshoot hard drive problems. For more troubleshooting tips, also see “SCSI Device Problems” later in this chapter, and “Hard Drive LEDs” in your server’s specific troubleshooting guide. IMPORTANT: If the hard drive Fault LED is on, follow the proper troubleshooting procedures, find the cause of the problem, and fix it.
Table 2-10 Hard Drive Problems continued

Symptom: Hard drive not recognized by server.
  Possible Problem: Hard drive connection problem occurred.
  What to Check: Check the LEDs on the hard drive. See “Hard Drive LEDs” in your server’s specific troubleshooting guide. Try removing and replacing the hard drive. If you remove any hard drives, label the drive and its position, and make sure you install it back in its original position.

Symptom: Cannot access data.
Printer Problems

Use this table to troubleshoot printer problems.

Table 2-11 Printer Problems

Problem: Printer will not print.
  Possible Cause: Printer is not turned on and online.
  Possible Solution: Turn the printer on and make sure it is online.

  Possible Cause: The correct printer drivers for your application are not installed.
  Possible Solution: Install the correct printer drivers for your application.

  Possible Cause: Printer network connection not made.
  Possible Solution: Make the proper network connections to the printer.
Mouse/Keyboard Problems

Use the following table to troubleshoot mouse and keyboard problems.

Table 2-12 Mouse/Keyboard Problems

Problem: Mouse or keyboard does not respond to movement.
  Possible Cause: The mouse or keyboard is not firmly connected.
  Possible Solution: If this is a rack-mounted server, check the cables to the switch box. Make sure the cables are securely attached. Make sure the switch is set for the server in question.
Tape Drive Problems

Use this section to troubleshoot tape device problems.

Table 2-13 Tape Drive Problems

Problem: Server cannot write to tape.
  Possible Cause: Drive not clean.
  Possible Solution: Clean the drive. Refer to the instructions provided with the drive. DAT cleaning cartridges typically last for 30 passes. When a cartridge is used up, it will no longer be automatically ejected once the cleaning cycle has been completed. The tape must be manually ejected and disposed of.
2-24 Compaq Servers Troubleshooting Guide Table 2-13 Tape Drive Problems continued Problem DLT does not read tape. Server cannot see DLT. Possible Cause Possible Solution DLT is not firmly connected. Make sure the DLT is properly seated. Push the DLT in tightly. If that does not correct the problem, completely remove the DLT and reseat. Tape is write-protected. Remove write-protection. Tape not compatible with drive.
Hardware Problems Troubleshooting DAT Drives Use the following list to troubleshoot common DAT problems: ■ Upgrade drivers/software/firmware: One of the first steps to take when a problem exists is to upgrade to the latest revisions. ■ Clean drive: DAT drives are susceptible to particle contamination on the heads. When a problem occurs, clean the drive at least four times to ensure the heads are clean and to eliminate dirty heads as the cause of the failure. For more information, see Chapter 7.
SCSI Device Problems

If the SCSI device is not being recognized, make sure the SCSI device is configured correctly. Refer to the documentation provided with the SCSI device.

General Items to Check

Non-hot-plug Drives

Compaq ships non-hot-plug hard drives set to ID 0 and CD-ROM drives set to ID 5.
■ Make sure that each SCSI device connected to the same port on a SCSI backplane is set with a unique SCSI identification number.
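The uniqueness rule above is easy to check against a written inventory of devices. The following minimal Python sketch groups devices by port and SCSI ID and reports any ID used more than once on the same port. The (name, port, scsi_id) tuple layout, the find_scsi_id_conflicts name, and the sample inventory are assumptions for illustration; the sketch does not query any hardware.

```python
from collections import defaultdict

def find_scsi_id_conflicts(devices):
    """Return {(port, scsi_id): [device names]} for IDs used more than once on a port.

    `devices` is a list of (name, port, scsi_id) tuples. The layout is an
    assumption for illustration.
    """
    by_port_and_id = defaultdict(list)
    for name, port, scsi_id in devices:
        by_port_and_id[(port, scsi_id)].append(name)
    return {key: names for key, names in by_port_and_id.items() if len(names) > 1}

# Hypothetical inventory. ID 0 (hard drive) and ID 5 (CD-ROM) follow the
# shipping defaults named in the text; the second hard drive on port 1 is
# invented to show a conflict.
inventory = [
    ("hard drive A", 1, 0),
    ("CD-ROM", 1, 5),
    ("hard drive B", 1, 0),
    ("hard drive C", 2, 0),
]

conflicts = find_scsi_id_conflicts(inventory)
print(conflicts)  # {(1, 0): ['hard drive A', 'hard drive B']}
```

Hard drive C also uses ID 0 but sits on a different port, so it is not reported. A second drive added with the shipping default ID is the typical way such a conflict arises.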
Hardware Problems Network Controllers Use this section to troubleshoot common problems with network controllers. Before installing your network controllers, always check the support information on the Compaq website to verify that you are using the latest drivers and support files. For additional information concerning network controllers, refer to the Compaq website: http://www.compaq.
Table 2-14 Network Controller Troubleshooting continued

Problem: Network controller stopped working when an expansion board was added.
  Possible Cause: The cable is not securely connected.
  Possible Solution: Make sure that the cable is securely attached to the network connector and that the other end of the cable is securely attached to the correct device.

  Possible Cause: Network controller interrupt overlaps the interrupt of an expansion board.
Chapter 3 Software Problems The best sources of information for software problems are your operating system and software documentation. Your operating system and software documentation may also point to fault detection tools that can report errors and preserve your system configuration. Another useful resource is the Compaq Survey Utility. You can use it to track the software components within the machine (for servers running Microsoft Windows NT or Novell NetWare).
Troubleshooting Operating System or Application Software Symptoms

Use the following table for troubleshooting operating system or application software problems.

Table 3-1 Software Problems

Symptom: Errors after a software setting is changed
  Possible Solution: Check the system logs. Change settings back to original configuration.

Symptom: Errors after the system software is changed
  Possible Solution: Change settings back to original configuration.
Software Problems When to Reconfigure or Reload Software If all other options have not resolved your problem, consider reconfiguring the system. Before you take this step: 1. Weigh downtime versus time spent troubleshooting intermittent problems. It may be advantageous to simply start over by removing the problem software and reinstalling it. In some cases, it may be advantageous to use the Erase Utility and reinstall the server from the beginning.
3-4 Compaq Servers Troubleshooting Guide Maintaining Current Drivers Although basic device support is provided for all Compaq-supported devices, you should check the Compaq Support Software Diskette (SSD) for your operating system to see if updated drivers exist. Compaq enhances basic support of some devices to increase performance and to add specific Compaq support for features such as Compaq Insight Management Agents for Windows NT.
Software Problems Novell NetWare Drivers to support NetWare Programs from Compaq ProLiant Servers are located on the Support Software Diskette (SSD) for NetWare Programs from Compaq, and some of the drivers are contained on the NetWare Programs retail product. These drivers are also located on the Compaq SmartStart and Support Software CD. The drivers on the SSD may be newer versions because of new functionality, problem fixes, and so forth. For more information on these drivers, run the README.
3-6 Compaq Servers Troubleshooting Guide Operating System Updates Be careful when applying operating system updates. Check the bug fix list that comes with each update first. If there are no specific fixes that you require, it is recommended that you do NOT apply the updates. Some updates overwrite Compaq-specific files. If you decide to apply an operating system update: 1. Perform a full system backup. 2. Apply the operating system update, using the instructions provided with it. 3.
Chapter 4 Diagnostic Tools These utilities were developed to assist you in diagnosing problems, testing the hardware, and monitoring and managing your Compaq server hardware. Table 4-1 Diagnostic Tools Tool What is it? How do I run it? Compaq Diagnostics program Utility to assist you in testing and/or verifying the operation of your Compaq hardware. If problems are found, Compaq Diagnostics will isolate the failure(s) down to the replaceable part, whenever possible.
4-2 Compaq Servers Troubleshooting Guide Table 4-1 Diagnostic Tools continued Tool What is it? How do I run it? Array Diagnostic Utility (ADU) A Windows-based tool designed to run on all Compaq servers that support Compaq array controllers and are running SmartStart 4.10 or later. The two main functions of ADU are to collect all possible information about the array controllers in the system and generate a list of detected problems. This tool is available for all Compaq servers covered by this guide.
Table 4-1 Diagnostic Tools continued

Tool: Integrated Management Log (IML)
  What is it? A log that records system events, such as system failures or non-fatal error conditions. The Integrated Management Log requires Compaq operating system-dependent drivers. Refer to the Compaq SmartStart and Support Software CD for instructions on installing the appropriate drivers.
  How do I run it? You can view an event in the Integrated Management Log in the following ways:
4-4 Compaq Servers Troubleshooting Guide Accessing Diagnostic Tools The Compaq SmartStart and Support Software CD contains the SmartStart program and many of the Compaq utilities needed to maintain your system, including: ■ System Configuration Utility ■ Array Configuration Utility ■ Array Diagnostic Utility ■ ROMPaq Firmware Upgrade Utilities ■ Compaq Diagnostics CAUTION: Do not select the Erase Utility when running the SmartStart and Support Software CD.
Run from Diskette

You can also run these utilities from their individual diskettes. If you have a utility diskette newer than the version on the SmartStart and Support Software CD, use that diskette.

-Or-

You can create a diskette version of the utility from the SmartStart and Support Software CD.

To create diskette versions of the utilities from the CD:
1. Boot the Compaq SmartStart and Support Software CD.
2.
Compaq Diagnostics Software

When you select Diagnostics and Utilities from the System Configuration Utility main menu, the utility prompts you to Test, Inspect, Upgrade, or Diagnose the server. Diagnostics and Utilities are located on the system partition on the hard drive and must be accessed when a system configuration error is detected during the Power-On Self-Test (POST).
■ Hard drives
■ Processor power modules
■ Fans

If Power-On Self-Test (POST) finds an error in the system, an error condition is indicated by an audible and/or visual message. For the most recent POST error messages, descriptions, and corrective actions, refer to your server’s specific troubleshooting guide.

Steps for Compaq Diagnostics

In each case, the Recommended Action column lists the steps necessary to correct the problem.
100 - 199, Primary Processor Test Error Codes

The 100 series of diagnostic error codes identifies failures with processor and system board functions. Corrective action may require replacing system boards or processor assemblies.

Table 4-2 Primary Processor Test Error Codes

Error Code  Description                      Recommended Action
101-xx      CPU test failed.                 Replace the processor and retest.
103-xx      DMA page registers test failed.
Table 4-2 Primary Processor Test Error Codes continued

Error Code  Description                                Recommended Action
122-xx      Multiprocessor Dispatch test failed.       1. Check the system configuration and retest.
                                                       2. Replace the processor and retest.
123-xx      Interprocessor Communication test failed.  3. Replace the system board and retest.
199-xx      Installed devices test failed.             1. Check the system configuration and retest.
                                                       2. Verify cable connections and retest.
                                                       3.
200 - 299, Memory Test Error Codes

The 200 series of diagnostic error codes identifies failures with the memory subsystem. Corrective action may require replacement of the memory expansion board, the memory modules, or the processor assembly.

Table 4-3 Memory Test Error Codes

Error Code  Description                     Recommended Action
200-xx      Invalid memory configuration.   Reinsert memory modules in correct location and retest.
201-xx      Memory machine ID test failed.
Table 4-3 Memory Test Error Codes continued

Error Code  Description                  Recommended Action
210-xx      Random pattern test failed.  1. Replace the memory module and retest.
                                         2. Replace the processor and retest.
                                         3. Replace the memory expansion board and retest.

                                         1. Replace the memory module/board and retest.
                                         2. Replace the system board and retest.

                                         1. Replace the memory module/board and retest.
                                         2. Replace the system board and retest.
                                         3. Replace the memory expansion board and retest.
300 - 399, Keyboard Test Error Codes

The 300 series of diagnostic error codes identifies failures with keyboard and system board functions. Corrective action may require replacement of the keyboard or the system board assembly.

Table 4-4 Keyboard Test Error Codes

Error Code  Description                                   Recommended Action
301-xx      Keyboard short test, 8042 self-test failed.   The following steps apply to error codes 301-xx through 304-xx:
302-xx      Keyboard long test failed.                    1.
400 - 499, Parallel Printer Test Error Codes

The 400 series of diagnostic error codes identifies failures with parallel printer interface card or system board functions. Corrective action may require replacement of the serial/parallel interface board or the system board assembly.

Table 4-5 Parallel Printer Test Error Codes

Error Code  Description                       Recommended Action
401-xx      Printer failed or not connected.
500 - 599, Graphics Controller Unit Test Error Codes

The 500 series of diagnostic error codes identifies failures with graphics or system board functions. Corrective action may require replacement of the video board or the system board assembly.

Table 4-6 Graphics Controller Unit Test Error Codes

Error Code  Description                       Recommended Action
501-xx      Graphics controller test failed.
600 - 699, Diskette Drive Test Error Codes

The 600 series of diagnostic error codes identifies failures with diskette, diskette drive, or system board functions. Corrective action may require replacement of the diskette, the diskette drive, or the system board assembly.

Table 4-7 Diskette Drive Test Error Codes

Error Code  Description                            Recommended Action
600-xx      Diskette ID drive types test failed.
601-xx      Diskette format failed.
1100 - 1199, Serial Test Error Codes

The 1100 series of diagnostic error codes identifies failures with serial/parallel interface board or system board functions. Corrective action may require replacement of the serial/parallel interface board or the system board assembly.

Table 4-8 Serial Test Error Codes

Error Code  Description               Recommended Action
1101-xx     Serial port test failed.  The following steps apply to error codes 1101-xx and 1109-xx:
1109-xx                               1.
1200 - 1299, Modem Communications Test Error Codes

The 1200 series of diagnostic error codes identifies failures with the modem. Corrective action may require replacement of the modem.

Table 4-9 Modem Communications Test Error Codes

Error Code  Description                              Recommended Action
1201-xx     Modem internal loopback test failed.
1202-xx     Modem time-out test failed.              The following steps apply to error codes 1201-xx through 1210-xx:
1203-xx     Modem external termination test failed.
6000 - 6099, Compaq Network Interface Cards Test Error Codes

The 6000 series of diagnostic error codes identifies failures with various Compaq Network Interface Controllers. Corrective action may require replacement of the controller.

Table 4-10 Compaq Network Interface Cards Test Error Codes

Error Code  Description                 Recommended Action
6000-xx     Network card ID failed.
6001-xx     Network card setup failed.
6500 - 6599, SCSI Hard Drive Test Error Codes

The 6500 series of diagnostic error codes identifies failures with SCSI hard drives, SCSI hard drive controller boards, SCSI hard drive cabling, and system board functions. If the system uses a drive array controller, see the section for Array Diagnostic Utility (ADU).

Table 4-11 SCSI Fixed Disk Drive Test Error Codes

Error Code  Description                             Recommended Action
6500-xx     SCSI Disk ID drive types test failed.   1.
6600 - 6699, SCSI/IDE CD-ROM Drive Test Error Codes

The 6600 series of diagnostic error codes identifies failures with the CD-ROM cabling, CD-ROM drive, adapter board, or system board assembly. Corrective action may require replacement of the CD-ROM cabling, CD-ROM drive, adapter board, or system board assembly.

Table 4-12 SCSI/IDE CD-ROM Drive Test Error Codes

Error Code  Description
6600-xx     CD-ROM ID failed.
6700 - 6799, SCSI Tape Drive Test Error Codes

The 6700 series of diagnostic error codes identifies failures with the tape cartridge, tape drive, media changer, tape drive cabling, SCSI adapter, or system board assembly. Corrective action may require replacement of any of these parts.

Table 4-13 SCSI Tape Drive Test Error Codes

Error Code  Description                             Recommended Action
6700-xx     SCSI Tape ID drive types test failed.
7000 - 7099, Server Manager/R Board Test Error Codes

The 7000 series of diagnostic error codes identifies failures with the Server Manager/R board. Corrective action may require replacement of the board, the Integrated 2400-baud modem, voice ROM, or battery on the Server Manager/R board.

Table 4-14 Server Manager/R Board Test Error Codes

Error Code  Description                                           Recommended Action
7000-xx     Server Manager/R board identification test failed.
Table 4-14 Server Manager/R Board Test Error Codes continued

Error Code  Description                                                 Recommended Action
7024-xx     Server Manager/R board memory Refresh Alert test failed.    Replace the Server Manager/R board and retest.
7025-xx     Memory Increment Memory Random Data test failed.
7026-xx     Memory Disturb Address test failed.
7027-xx     Memory HBM test failed.
7028-xx     HBM IO test failed.
7033-xx     HBM BMIC test failed.
7034-xx     HBM Video test failed.
7035-xx     The ser_int test failed.
Table 4-14 Server Manager/R Board Test Error Codes continued

Error Code  Description                                 Recommended Action
7061-xx     Voice/DTMF Internal Loopback test failed.   Replace the Server Manager/R board Voice ROM.
7062-xx     Voice/DTMF Internal Loopback test failed.
7078-xx     Host ADC Measurements test failed.
7079-xx     Battery test failed.                        Replace the Server Manager/R board battery.
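
Because each block of error codes belongs to one test series, a code can be matched to its series with a simple range lookup. The following Python sketch is illustrative only and is not part of Compaq Diagnostics. The series ranges are copied from the section headings in this chapter; the names SERIES and series_for_code are assumptions made for this example.

```python
from typing import Optional

# Illustrative only: ranges are taken from the section headings in this
# chapter. SERIES and series_for_code are hypothetical names, not part of
# any Compaq utility.
SERIES = [
    (100, 199, "Primary Processor Test"),
    (200, 299, "Memory Test"),
    (300, 399, "Keyboard Test"),
    (400, 499, "Parallel Printer Test"),
    (500, 599, "Graphics Controller Unit Test"),
    (600, 699, "Diskette Drive Test"),
    (1100, 1199, "Serial Test"),
    (1200, 1299, "Modem Communications Test"),
    (6000, 6099, "Compaq Network Interface Cards Test"),
    (6500, 6599, "SCSI Hard Drive Test"),
    (6600, 6699, "SCSI/IDE CD-ROM Drive Test"),
    (6700, 6799, "SCSI Tape Drive Test"),
    (7000, 7099, "Server Manager/R Board Test"),
]


def series_for_code(code: str) -> Optional[str]:
    """Return the test series for a code such as '101-xx', or None."""
    # The part before the first hyphen is the numeric error code.
    prefix = code.split("-", 1)[0].strip()
    if not prefix.isdigit():
        return None
    number = int(prefix)
    for low, high, name in SERIES:
        if low <= number <= high:
            return name
    return None
```

For example, series_for_code("6500-xx") returns the SCSI Hard Drive Test series, which points you to Table 4-11. A code outside every listed range returns None.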
Array Diagnostic Utility (ADU)

The Array Diagnostic Utility (ADU) is a Windows-based tool designed to run on all Compaq servers that support Compaq array controllers and are running SmartStart 4.10 or later. The two main functions of ADU are to collect all possible information about the array controllers in the system and generate a list of detected problems. The error messages and codes listed include all codes generated by Compaq products.
Integrated Management Log

On servers supporting the Integrated Management Display, Compaq Integrated Management Display Log replaces the Critical Error Log and Correctable Memory Logs, recording system events and storing them in an easily viewable form. It marks each event with a time-stamp with one-minute granularity.
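
One consequence of one-minute granularity is that two events recorded within the same minute carry the same time-stamp, so the time-stamp alone cannot tell you which came first. The Python sketch below models this under an assumption: it simply drops seconds and smaller units. The guide does not describe how the log stores time internally, and the function name iml_timestamp is hypothetical.

```python
from datetime import datetime


def iml_timestamp(moment: datetime) -> datetime:
    """Model a one-minute time-stamp by dropping seconds and microseconds.

    Assumption for illustration: the real log format is not described in
    this guide.
    """
    return moment.replace(second=0, microsecond=0)


# Two events 40 seconds apart within the same minute get the same stamp.
first = iml_timestamp(datetime(1999, 11, 5, 14, 30, 10))
second = iml_timestamp(datetime(1999, 11, 5, 14, 30, 50))
same_stamp = first == second
```

When two entries share a time-stamp, rely on their order in the log and on the event descriptions to work out the sequence.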
Compaq Insight Manager

Compaq Insight Manager is a comprehensive management tool to monitor and control the operation of Compaq servers and clients. Compaq Insight Manager consists of two components: a Windows-based console application, and server- or client-based management data collection agents. Starting with Compaq Insight Manager 4.0, the agents for Windows NT and NetWare are also Web-enabled; that is, these agents enable Web browser access and monitoring of management information.
■ SNMP standards provide integration with other management products.
■ Flexible network connectivity supports multiple transport protocols, including IPX, TCP/IP, and PPP, to operate over LANs, WANs, and modems.
Compaq Survey Utility

The Compaq Survey Utility is a serviceability tool available for Windows NT and Novell NetWare that delivers online configuration capture and comparison to maximize server availability. It is delivered on the Compaq Management CD in the SmartStart package or is available on the Compaq website. Refer to the Compaq Management CD for information on installing and running the Survey Utility.
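
The idea of configuration capture and comparison can be pictured as taking two snapshots of the server settings and listing what differs. The Python sketch below is a minimal illustration of that idea only. It does not reproduce the Survey Utility or its file format; the function name compare_configs and the sample setting names are invented for this example.

```python
from typing import Any, Dict


def compare_configs(before: Dict[str, Any], after: Dict[str, Any]) -> Dict[str, dict]:
    """Report settings added, removed, or changed between two snapshots."""
    added = {key: after[key] for key in after.keys() - before.keys()}
    removed = {key: before[key] for key in before.keys() - after.keys()}
    changed = {
        key: (before[key], after[key])
        for key in before.keys() & after.keys()
        if before[key] != after[key]
    }
    return {"added": added, "removed": removed, "changed": changed}


# Hypothetical snapshots; the keys and values are made up for illustration.
last_week = {"system_rom": "version A", "memory_mb": 256}
today = {"system_rom": "version B", "memory_mb": 256, "new_board": "slot 3"}
report = compare_configs(last_week, today)
```

In this example, report shows one changed setting and one added setting, which narrows the search when a problem appears after a change.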
Table 4-16 Event Messages

Event Type          Event Message                           Action
Fan Failure         System Fan Failure (Fan X, Location)    Replace fan.
Fans Not Redundant  System Fans Not Redundant               Add fan.
Overheat Condition  System Overheating (Zone X, Location)   Check fans.
Table 4-16 Event Messages continued

Event Type            Event Message                                            Action
EISA Bus              EISA Expansion Bus Master Timeout (Slot X)               Power down server and replace EISA board.
                      EISA Expansion Bus Slave Timeout
                      EISA Expansion Board Error (Slot X)
                      EISA Expansion Bus Arbitration Error
PCI Bus Error         PCI Bus Error (Slot X, Bus X, Device X, Function X)      Power down PCI slot and replace board.
Power Supply Failure  System Power Supply Failure (Power Supply X)             Replace power supply.
Table 4-16 Event Messages continued

Event Type                 Event Message                  Action
Automatic Server Recovery  System Lockup
                           ASR Lockup Detected: Cause
Operating System           System Crash
                           Blue Screen Trap: Cause [NT]   Refer to the documentation for your operating system.
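
If you keep your own notes or scripts for responding to logged events, the event-to-action pairs in Table 4-16 can be held in a small lookup. The Python sketch below includes only rows whose action is stated in full in the table above. The names ACTIONS and action_for are hypothetical, and the fallback text is an assumption for this example.

```python
# Illustrative only: event types and actions are copied from Table 4-16.
# ACTIONS and action_for are hypothetical names.
ACTIONS = {
    "Fan Failure": "Replace fan.",
    "Fans Not Redundant": "Add fan.",
    "Overheat Condition": "Check fans.",
    "PCI Bus Error": "Power down PCI slot and replace board.",
    "Power Supply Failure": "Replace power supply.",
}


def action_for(event_type: str) -> str:
    """Return the listed action, or a pointer back to the table."""
    return ACTIONS.get(event_type, "Refer to Table 4-16.")
```

Event types not listed in the lookup fall back to the table itself, so the sketch never suggests an action that the guide does not give.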
Inspect Utility

The Inspect Utility provides configuration information such as the contents of the operating system startup files, the current memory configuration, and the ROM version. It operates with MS-DOS and in the MS-DOS emulation mode of MS OS/2.

To Run the Inspect Utility
1. Turn the server off and then back on again. Press F10 when the cursor appears in the upper right corner of the screen. At the main menu, select Diagnostics and Utilities.
2. Press Enter.
3.
Running the Compaq System Erase Utility

IMPORTANT: Perform a backup before running the Compaq System Erase Utility. All data and configuration information on your existing server is erased by the Erase Utility.

This utility sets the system to its original factory state, deleting the current hardware configuration information (including array setup and disk partitioning) and erasing all attached hard drives completely.
Chapter 5
ROMPaq Disaster Recovery

From time to time, it may be desirable to upgrade your current system ROM. Some reasons for this may be as follows:
■ Desire to upgrade ROM
■ Receive new SmartStart and Support Software CD
■ Desire to upgrade server processors
■ Request from Compaq

The process of upgrading your system ROM is referred to as flashing the ROM. Flashing consists of using software to replace the current ROM image with a new one through ROMPaq.
1. Build a fresh ROMPaq diskette, using the latest version for the server involved.
NOTE: If the ROM is corrupted by a ROMPaq interruption, the initial ROMPaq attempt may have affected the contents of the original diskette.
2. Power down the server.
3. Set configuration switch position 4 on the system configuration switch block to ON to enable disaster mode.
Chapter 6
Automatic Server Recovery

Compaq servers provide a recovery service that causes the system to reboot in the event of a catastrophic operating system error such as a blue screen trap, ABEND, or kernel panic. This system failsafe timer, called Automatic Server Recovery (ASR), is started when the health driver is loaded, and can be disabled from within the Compaq Insight Manager console.
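
A failsafe timer of this kind is commonly built as a watchdog: software must reset the timer at regular intervals, and if the resets stop because the operating system has hung, the timer runs out and the system reboots. The Python sketch below models that general pattern only. That the health driver performs periodic resets is an assumption, as is the 600-second timeout; neither is stated in this guide, and the class FailsafeTimer is hypothetical.

```python
class FailsafeTimer:
    """Simplified model of a watchdog-style failsafe timer.

    Assumptions for illustration: a driver resets the timer periodically,
    and the timeout value is arbitrary. Times are passed in explicitly
    (in seconds) so the model is easy to follow.
    """

    def __init__(self, timeout_s: float, now: float) -> None:
        self.timeout_s = timeout_s
        self.last_reset = now  # the timer starts when it is created
        self.enabled = True

    def reset(self, now: float) -> None:
        """Called while the operating system is still running normally."""
        self.last_reset = now

    def disable(self) -> None:
        """Models turning the recovery feature off from a console."""
        self.enabled = False

    def should_reboot(self, now: float) -> bool:
        """True when no reset has arrived within the timeout."""
        return self.enabled and (now - self.last_reset) >= self.timeout_s


timer = FailsafeTimer(timeout_s=600, now=0)
timer.reset(now=300)                        # system healthy at 300 seconds
hung = timer.should_reboot(now=900)         # no reset for 600 seconds
```

In the example, the last reset arrives at 300 seconds, so by 900 seconds the timer has expired and the model reports that a reboot is due.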
Chapter 7
Preventing Future Problems

This chapter provides information to help you avoid future problems. While most of the pointers provided are common-sense suggestions, these prevention tasks are too important to overlook.

Preparing for Changes

Most problems occur when something in the server system has been changed. Use these tips to prepare for any changes.
■ Back up your system often. Verify that the backups are good before making changes.
■ Check the Compaq resources, your software, and third-party product resources for information about potential problems. Websites are excellent places to find this information.
■ If possible, make changes in an incremental fashion. It is easier to troubleshoot one change rather than several at once.
■ Record the results of each change after it is executed, making sure to include any error messages or additional information collected.
Installing Servers Consistently through a Replicated Install

When initially setting up a server, SmartStart can access the Integration Server as the source of the system software instead of the system software CDs. SmartStart on the target server connects to the Integration Server during the Assisted Integration or Replicated Install interview process. SmartStart examines the Integration Server to determine which software is available for installation.
Use a Methodology

Following a consistent set of procedures can help prevent problems and, should problems occur, make troubleshooting easier.
■ Use uniform naming conventions for your servers. Consider using names that denote server location. A uniform server naming convention helps you recall the often-overlooked details that can hold the clue to resolving a crisis.
■ Use unique IDs or names for your devices.
■ Keep a trend analysis. You will know what to expect at certain points in time. For example, if the CPU utilization rate always increases by 50% during certain hours, you will know that increase is normal for the server you are tracking.
■ Create a problem resolution notebook. When problems do occur, keep a log of the actions you took to resolve them. This could help you solve the same problem more quickly in the future.
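
A trend analysis can be as simple as averaging past readings for each hour of the day and comparing new readings against that average. The Python sketch below is one possible way to do this, not a Compaq tool. The function names, the sample readings, and the 15-point tolerance are all assumptions chosen for illustration.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, Tuple


def hourly_baseline(samples: Iterable[Tuple[int, float]]) -> Dict[int, float]:
    """Average CPU utilization (percent) for each hour of the day."""
    by_hour = defaultdict(list)
    for hour, percent in samples:
        by_hour[hour].append(percent)
    return {hour: mean(values) for hour, values in by_hour.items()}


def is_unusual(baseline: Dict[int, float], hour: int, percent: float,
               tolerance: float = 15.0) -> bool:
    """True when a reading differs from that hour's average by more than
    the tolerance. Hours with no history are never flagged."""
    expected = baseline.get(hour)
    if expected is None:
        return False
    return abs(percent - expected) > tolerance


# Hypothetical history: (hour of day, CPU percent).
history = [(9, 30.0), (9, 34.0), (14, 80.0), (14, 84.0)]
baseline = hourly_baseline(history)
```

With this history, a reading of 85% at hour 14 is close to the usual level for that hour and is not flagged, while the same reading at hour 9 is flagged because it is far above what is normal for that time.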
Visually Check Your Server

Periodically you should look at the following items on your server. A visual check can prevent many problems.
■ Make sure systems and racks are not positioned tightly up against walls and that there is adequate space around them for proper airflow.
■ Move magnetized office items such as magnetized screwdrivers and telephones with electromagnetic ringers away from the system.
Power Problems Caused by Acts of Nature

Some power problems are caused by acts of nature, which can range from lightning and excessive heat, to ice, rain, and windstorms. Lightning can cause spikes and surges. (A spike is a quick impulse of undesirable high voltage on a power line, typically lasting only a fraction of a second. A surge is a sudden increase in line voltage of short duration.
Preventing Electrostatic Damage

Many electronic components are sensitive to electrostatic discharge (ESD). Circuitry design and structure determine the degree of sensitivity. Networks built into many integrated circuits provide some protection, but in many cases the discharge contains enough power to alter device parameters or melt silicon junctions.