HP ProLiant Servers Troubleshooting Guide

Abstract

This document describes common procedures and solutions for the many levels of troubleshooting for HP ProLiant G7 and earlier servers. This document is intended for the person who installs, administers, and troubleshoots servers or server blades. HP assumes you are qualified in the servicing of computer equipment and trained in recognizing hazards in products with hazardous energy levels.
© Copyright 2004, 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. The only warranties for HP products and services are set forth in the express warranty statements accompanying such products and services. Nothing herein should be construed as constituting an additional warranty. HP shall not be liable for technical or editorial errors or omissions contained herein. Microsoft®, Windows®, and Windows Server® are U.S.
Contents

Introduction (page 8)
What's new (page 8)
Revision history (page 8)
375445-403 (October 2011) ...
Third-party device problems (page 41)
Internal system problems (page 42)
Battery pack problems (page 42)
CD-ROM and DVD drive problems ...
Erase Utility (page 74)
HP Systems Insight Manager (page 74)
Redundant ROM support (page 74)
USB support ...
Operating system installation and configuration information (for factory-installed operating systems) (page 89)
Server configuration information (page 89)
Installation and configuration information for the server setup software (page 89)
Software installation and configuration of the server ...
Expansion board-related port 85 codes (page 182)
Miscellaneous port 85 codes (page 182)
Windows® Event Log processor error codes (page 183)
Message ID: 4137 ...
Introduction

What's new

The thirteenth edition of the HP ProLiant Servers Troubleshooting Guide, part number 375445-404, includes the following additions and updates:
• Added information about HP Service Pack for ProLiant (on page 78). SPP replaces older methods of updating firmware and system software on many of the servers supported by this document.
• Added a reference to the product page for HP Smart Update Manager (on page 79).
• Updated Option ROM Configuration for Arrays (on page 71)
• Updated Automatic Server Recovery (on page 72)
• Updated the following section in HP Smart Update Manager deployment (on page 82):
  o Online deployment ("Online deployment (if SPP is not supported)" on page 83)
• Added or updated multiple messages in Error messages (on page 91)
  o ADU version 8.0 through 8.
  o Insight Diagnostics processor error codes
Getting started

HP ProLiant 100 Series Server troubleshooting information

Use this guide for troubleshooting information on the HP ProLiant ML110 G7 Server and the HP ProLiant DL120 G7 Server. For troubleshooting information on HP ProLiant 100 Series Servers other than the HP ProLiant ML110 G7 Server and HP ProLiant DL120 G7 Server, see the respective server user guides.

How to use this guide

NOTE: For common troubleshooting procedures, the term "server" is used to mean servers and server blades.
When additional information becomes necessary, use this section to identify websites and supplemental documents that contain troubleshooting information.
This symbol indicates the presence of electric shock hazards. The area contains no user or field serviceable parts. Do not open for any reason. WARNING: To reduce the risk of injury from electric shock hazards, do not open this enclosure. This symbol on an RJ-45 receptacle indicates a network interface connection. WARNING: To reduce the risk of electric shock, fire, or damage to the equipment, do not plug telephone or telecommunications connectors into this receptacle.
[Symbol: weight in kg, weight in lb]
WARNING: To reduce the risk of personal injury or damage to the equipment:
• Observe local occupational health and safety requirements and guidelines for manual handling.
• Obtain adequate assistance to lift and stabilize the chassis during installation or removal.
• The server is unstable when not fastened to the rails.
• When mounting the server in a rack, remove the power supplies and any other removable module to reduce the overall weight of the product.
Symptom information Before troubleshooting a server problem, collect the following information: • What events preceded the failure? After which steps does the problem occur? • What has been changed since the time the server was working? • Did you recently add or remove hardware or software? If so, did you remember to change the appropriate settings in the server setup utility, if necessary? • How long has the server exhibited problem symptoms? • If the problem occurs randomly, what is the duration
Performing processor procedures in the troubleshooting process Because this document supports multiple generations of HP ProLiant server models, it also covers processes that include troubleshooting of various models and types of processors. Before performing any troubleshooting steps that involve processors, review the following guidelines: • Be sure that only authorized personnel perform the troubleshooting steps that involve installation, removal, or replacement of a processor.
CAUTION: Before removing or replacing any processors, be sure to follow the guidelines provided in "Performing processor procedures in the troubleshooting process (on page 16)." Failure to follow the recommended guidelines can cause damage to the system board, requiring replacement of the system board.
Common problem resolution Loose connections Action: • Be sure all power cords are securely connected. • Be sure all cables are properly aligned and securely connected for all external and internal components. • Remove and check all data and power cables for damage. Be sure no cables have bent pins or damaged connectors. • If a fixed cable tray is available for the server, be sure the cords and cables connected to the server are routed correctly through the tray.
HP offers a subscription service that can provide notification of firmware updates. For more information, see "Subscriber's Choice (on page 80)." For more information on updating firmware, see "Firmware maintenance (on page 80)." DIMM handling guidelines CAUTION: Failure to properly handle DIMMs can cause damage to DIMM components and the system board connector. When handling a DIMM, observe the following guidelines: • Avoid electrostatic discharge (on page 14).
• The system automatically sets all SCSI IDs. • If only one SCSI hard drive is used, install it in the bay with the lowest number. • Drives must be the same capacity to provide the greatest storage space efficiency when drives are grouped together into the same drive array.
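The capacity guideline above can be illustrated with a short calculation. This is a minimal sketch, assuming the array controller treats every member drive as if it were the size of the smallest drive in the array; that is typical RAID behavior but is not stated in this guide. The function name and the sample capacities are hypothetical, and fault-tolerance overhead (parity or mirroring) is ignored.

```python
def array_capacity_report(capacities_gb):
    """Return (usable_raw_gb, unused_gb) for drives grouped in one array.

    Assumption: each drive contributes only as much space as the
    smallest drive in the group; fault-tolerance overhead is ignored.
    """
    if not capacities_gb:
        raise ValueError("at least one drive is required")
    smallest = min(capacities_gb)
    usable = smallest * len(capacities_gb)
    unused = sum(capacities_gb) - usable
    return usable, unused


if __name__ == "__main__":
    # Matched drives: no capacity is left unused.
    print(array_capacity_report([146, 146, 146]))
    # Mixed drives: the extra space on the larger drives goes unused.
    print(array_capacity_report([72, 146, 300]))
```

Under this assumption, three matched 146 GB drives leave nothing unused, while mixing 72 GB, 146 GB, and 300 GB drives leaves most of the larger drives' space outside the array.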
Online/activity LED (green) | Fault/UID LED (amber/blue) | Interpretation
On, off, or flashing | Alternating amber and blue | The drive has failed, or a predictive failure alert has been received for this drive; it also has been selected by a management application.
On, off, or flashing | Steadily blue | The drive is operating normally, and it has been selected by a management application.
On | Amber, flashing regularly (1 Hz) | A predictive failure alert has been received for this drive.
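The LED combinations above can be expressed as a simple lookup, which may be useful when scripting a checklist. This is a minimal sketch that covers only the three combinations listed here; the function name and the exact state strings are hypothetical, and any combination not listed returns None.

```python
def interpret_drive_leds(online_activity, fault_uid):
    """Map the two hot-plug drive LEDs to the interpretations listed above.

    Only the three rows shown in this excerpt are covered; any other
    combination returns None.
    """
    online = online_activity.strip().lower()
    fault = fault_uid.strip().lower()
    # For the first two rows the online/activity LED may be on, off, or flashing.
    if fault == "alternating amber and blue":
        return ("The drive has failed, or a predictive failure alert has been "
                "received for this drive; it also has been selected by a "
                "management application.")
    if fault == "steadily blue":
        return ("The drive is operating normally, and it has been selected "
                "by a management application.")
    # The third row applies only when the online/activity LED is on.
    if fault == "amber, flashing regularly (1 hz)" and online == "on":
        return "A predictive failure alert has been received for this drive."
    return None


if __name__ == "__main__":
    print(interpret_drive_leds("On", "Amber, flashing regularly (1 Hz)"))
```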
Diagnostic flowcharts Troubleshooting flowcharts To effectively troubleshoot a problem, HP recommends that you start with the first flowchart in this section, "Start diagnosis flowchart (on page 24)," and follow the appropriate diagnostic path. If the other flowcharts do not provide a troubleshooting solution, follow the diagnostic steps in "General diagnosis flowchart (on page 24).
o HP BladeSystem c-Class Technical Documentation (http://www.hp.com/go/bladesystem/documentation) Select Support, Drivers and Manuals, and then select the product. Select Manuals, and then locate the link for the maintenance and service guide. 3. HP BladeSystem p-Class Support and Documents (http://www.hp.com/products/servers/proliant-bl/p-class/info) To locate the HP BladeSystem p-Class System Maintenance and Service Guide, select the product. Select Manuals (guides, supplements, addendums, etc).
Start diagnosis flowchart Use the following flowchart to start the diagnostic process.
The General diagnosis flowchart provides a generic approach to troubleshooting. If you are unsure of the problem, or if the other flowcharts do not fix the problem, use the following flowchart.
Power-on problems flowchart Server power-on problems flowchart Some servers have an internal health LED and an external health LED, while other servers have a single system health LED. The system health LED provides the same functionality as the two separate internal and external health LEDs. Depending on the model, the internal health LED and external health LED may either appear solid or they may flash. Both conditions represent the same symptom.
p-Class server blade power-on problems flowchart c-Class server blade power-on problems flowchart For the location of server LEDs and information on their statuses, see the server documentation on the HP website (http://www.hp.com/support).
Symptoms: • The server does not power on. • The system power LED is off or amber. • The health LED is red or amber.
POST problems flowchart Symptoms: • Server does not complete POST NOTE: The server has completed POST when the system attempts to access the boot device.
Server and p-Class server blade POST problems flowchart
c-Class server blade POST problems flowchart

Operating system boot problems flowchart

Symptoms:
• Server does not boot a previously installed OS
• Server does not boot SmartStart

Possible causes:
• Corrupted OS
• Hard drive subsystem problem
• Incorrect boot order setting in RBSU

There are two ways to use SmartStart when diagnosing OS boot problems on a server blade:
• Use iLO to remotely attach virtual devices to mount the SmartStart CD onto the server blade.
• Use a local I/O cable and drive to connect to the server blade, and then restart the server blade.
Server fault indications flowchart Symptoms: • Server boots, but a fault event is reported by Insight Management Agents • Server boots, but the internal health LED, external health LED, or component health LED is red or amber NOTE: For the location of server LEDs and information on their statuses, refer to the server documentation.
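The symptom lists in this section suggest a simple way to choose which flowchart to follow. The sketch below is one possible reading, not a reproduction of the Start diagnosis flowchart: it assumes the symptoms are checked in the order a server starts up (power, POST, operating system boot, then fault indications), and the function and parameter names are hypothetical.

```python
def select_flowchart(powers_on, completes_post, boots_os, fault_reported):
    """Pick a diagnostic flowchart from the symptoms described in this section.

    Assumption: checks run in startup order, so the earliest failing
    stage decides which flowchart applies.
    """
    if not powers_on:
        return "Power-on problems flowchart"
    if not completes_post:
        return "POST problems flowchart"
    if not boots_os:
        return "Operating system boot problems flowchart"
    if fault_reported:
        # Server boots, but an agent or a health LED reports a fault.
        return "Server fault indications flowchart"
    # No listed symptom matches; fall back to the generic approach.
    return "General diagnosis flowchart"


if __name__ == "__main__":
    print(select_flowchart(powers_on=True, completes_post=False,
                           boots_os=False, fault_reported=False))
```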
For the location of server LEDs and information on their statuses, see the server documentation on the HP website (http://www.hp.com/support).
c-Class server blade fault indications flowchart
Hardware problems Procedures for all ProLiant servers The procedures in this section are comprehensive and include steps about or references to hardware features that may not be supported by the server you are troubleshooting. CAUTION: Before removing or replacing any processors, be sure to follow the guidelines provided in "Performing processor procedures in the troubleshooting process (on page 16).
2. If the power supplies have LEDs, be sure they indicate that each power supply is working properly. If the LEDs indicate a problem with a power supply, replace the power supply. For more information, see the server documentation on the HP website (http://www.hp.com/support). 3. Be sure the system has enough power, particularly if you recently added hardware, such as hard drives. Additional power supplies may be required. Check the system information from the IML.
4. Be sure the power cord is the correct type for the UPS and the country in which the server is located. See the UPS reference guide for specifications. 5. Be sure the line cord is connected. 6. Be sure each circuit breaker is in the On position, or replace the fuse if needed. If this occurs repeatedly, contact an authorized service provider. 7. Check the UPS LEDs to be sure a battery or site wiring problem has not occurred. See the UPS documentation. 8.
o Installation of a SCSI device without termination or without proper ID settings
o Setting of an IDE device to Primary/Secondary when the other device is set to CS (cable select)
o Connection of the data cable, but not the power cable, of a new device
4. Be sure no memory, I/O, or interrupt conflicts exist.
5. Be sure no loose connections (on page 18) exist.
6. Be sure all cables are connected to the correct locations and are the correct lengths. For more information, see the server documentation.
7.
CAUTION: Only authorized technicians trained by HP should attempt to remove the system board. If you believe the system board requires replacement, contact HP Technical Support ("Contacting HP" on page 185, "Contacting HP technical support or an authorized reseller" on page 185) before proceeding. CAUTION: Before removing or replacing any processors, be sure to follow the guidelines provided in "Performing processor procedures in the troubleshooting process (on page 16).
CAUTION: Clearing NVRAM deletes the configuration information. Refer to the server documentation for complete instructions before performing this operation or data loss could occur. 5. Clearing NVRAM can resolve various problems. Clear the NVRAM, but do not use the backup .SCI file if prompted. Have available any .CFG, .OVL, or .PCF files that are required.
3. Be sure the inserted CD or DVD format is valid for the drive. For example, be sure you are not inserting a DVD into a drive that only supports CDs. Drive is not detected Action: 1. Be sure no loose connections (on page 18) exist. 2. Refer to the drive documentation to be sure cables are connected as required. 3. Be sure the cables are working properly. Replace with known functional cables to test whether the original cables were faulty. 4. Be sure the correct, current driver is installed.
2. Be sure the diskette is not write protected. If it is, use another diskette or remove the write protection. 3. Be sure you are attempting to write to the proper drive by checking the drive letter in the path statement. 4. Be sure enough space is available on the diskette.
contact HP support ("Contacting HP" on page 185, "Contacting HP technical support or an authorized reseller" on page 185). System completes POST but drive fails Action: 1. Be sure no loose connections (on page 18) exist. 2. Be sure no device conflict exists. 3. Be sure the hard drive is cabled properly and terminated if necessary. 4. Be sure the hard drive data cable is working by replacing it with a known functional cable. 5.
A new drive is not recognized Action: 1. Be sure the drive is supported. To determine drive support, see the server documentation or the HP website (http://www.hp.com/go/bizsupport). 2. Be sure the drive bay is not defective by installing the hard drive in another bay. 3. Run HP Insight Diagnostics (on page 75). Then, replace failed components as indicated. 4.
Fan problems General fan problems are occurring Action: 1. Be sure the fans are properly seated and working. a. Follow the procedures and warnings in the server documentation for removing the access panels and accessing and replacing fans. b. Unseat, and then reseat, each fan according to the proper procedures. c. Replace the access panels, and then attempt to restart the server. 2. Be sure the fan configuration meets the functional requirements of the server. Refer to the server documentation. 3.
All fans in an HP ProLiant G6 server are not spinning or are not spinning at the same speed Action: For all servers, access the IML. If no error messages are listed, then the fans are operating as designed. If an error message is listed in the IML, then perform the suggested procedure to correct the error. For all server blades, access more information from the Onboard Administrator or iLO 3.
o If you are unsure which DIMM has failed, test each bank of DIMMs by removing all other DIMMs. Then, isolate the failed DIMM by switching each DIMM in a bank with a known working DIMM.
o Remove any third-party memory.
• To test the memory, run HP Insight Diagnostics (on page 75).

Server is out of memory

Action:
1. Be sure the memory is configured properly. Refer to the application documentation to determine the memory configuration requirements.
2. Be sure no operating system errors are indicated.
6. Test the memory by installing the memory into a known working server. Be sure the memory meets the requirements of the new server on which you are testing the memory. 7. Replace the memory. See the server documentation. Server fails to boot, all DIMM LEDs illuminate amber, .... ...
Processor problems Action: 1. If applicable, check the processor LEDs to identify if a PPM failure occurred. For LED information, see the server documentation. 2. Be sure each processor is supported by the server and is installed properly. For processor requirements, see the server documentation. 3. Be sure the server ROM is current. If an "unsupported processor detected" message appears, see "Unsupported processor stepping with Intel® processors (on page 85)." 4.
To download HP StorageWorks Library and Tape Tools, see the HP website (http://www.hp.com/support/tapetools). For more information about common tasks, see the HP website (http://www.hp.com/support/lttfaq).

Stuck tape issue

Action:
1. Manually press the Eject button. Allow up to 10 minutes for the tape to rewind and eject.
2. Perform a forced eject:
   a. Press and hold the Eject button for at least 10 seconds.
   b. Allow up to 10 minutes for the tape to rewind and eject. The green Ready LED should flash.
3.
Media issue Action: 1. Verify that the correct media part number is being used. 2. Pull a support ticket using HP StorageWorks Library and Tape Tools. o Look for issues in the cartridge health section. o Look for issues in the drive health section. 3. Run the Media Assessment Test in HP StorageWorks Library and Tape Tools. 4. Check for media damage: 5.
1. Be sure the monitor power cord is plugged into a working grounded (earthed) AC outlet. 2. Power up the monitor and be sure the monitor light is on, indicating that the monitor is receiving power. 3. Be sure the monitor is cabled to the intended server or KVM connection. 4. Be sure no loose connections (on page 18) exist. o For rack-mounted servers, check the cables to the KVM switch and be sure the switch is correctly set for the server.
Mouse and keyboard problems Action: 1. Be sure no loose connections (on page 18) exist. If a KVM switching device is in use, be sure the server is properly connected to the switch. o For rack-mounted servers, check the cables to the switch box and be sure the switch is correctly set for the server. o For tower model servers, check the cable connection from the input device to the server. 2.
Cable problems Drive errors, retries, timeouts, and unwarranted drive failures when using an older Mini SAS cable Action: The Mini SAS connector life expectancy is 250 connect/disconnect cycles (for external, internal, and cable Mini SAS connectors). If using an older cable that could be near the life expectancy, replace the Mini SAS cable. Local I/O cable problems NOTE: The local I/O cable is used only with HP ProLiant p-Class server blades.
2. Be sure the software is set for the correct terminal emulation. a. Reconfigure the software correctly. b. Restart the server. c. Run the communications software, checking settings and making corrections where needed. d. Restart the server, and then reestablish the modem connection. Modem does not answer an incoming call Action: 1. Enable the auto-answer option in the communications software. 2. Be sure an answering machine is not answering the line before the modem is able to answer. a.
3. Be sure no line interference exists. Retry the connection by dialing the number several times. If conditions remain poor, contact the telephone company to have the line tested. 4. Be sure the modem is current and compliant with CCITT and Bell standards. Replace with a supported modem if needed. You are unable to connect to an online subscription service Action: 1. If the line you are accessing requires error control to be turned off, do so using the AT command AT&Q6%C0. 2.
Network controller has stopped working Action: 1. Check the network controller LEDs to see if any statuses indicate the source of the problem. For LED information, refer to the network controller documentation. 2. Be sure the correct network driver is installed for the controller and that the driver file is not corrupted. Reinstall the driver. 3. Be sure no loose connections (on page 18) exist. 4. Be sure the network cable is working by replacing it with a known functional cable. 5.
Software problems The best sources of information for software problems are the operating system and application software documentation, which may also point to fault detection tools that report errors and preserve the system configuration. Other useful resources include HP Insight Diagnostics (on page 75) and HP SIM ("HP Systems Insight Manager" on page 74). Use either utility to gather critical system hardware and software information and to help with problem diagnosis.
Errors are displayed in the error log Action: Follow the information provided in the error log, and then refer to the operating system documentation. Problems occur after the installation of a service pack Action: Follow the instructions for updating the operating system ("Operating system updates" on page 61). During installation of Oracle Solaris, the system locks up or a panic error occurs Action: Disable ACPI support in Oracle Solaris.
Restoring to a backed-up version If you recently upgraded the operating system or software and cannot resolve the problem, you can try restoring a previously saved version of the system. Before restoring the backup, make a backup of the current system. If restoring the previous system does not correct the problem, you can restore the current set to be sure you do not lose additional functionality. Refer to the documentation provided with the backup software.
o IBM OS/2—Power up the server from the startup diskettes. For more information, see the OS/2 documentation. o Linux—For more information, see the operating system documentation. Linux operating systems For troubleshooting information specific to Linux operating systems, refer to the Linux for ProLiant website (http://h18000.www1.hp.com/products/servers/linux). Application software problems Software locks up Action: 1.
ROM problems Remote ROM flash problems General remote ROM flash problems are occurring Action: Be sure you follow these requirements for using the Remote ROM flash utility: • A local administrative client system that is running the Microsoft® Windows NT® 4.0, Windows® 2000, or Windows Server™ 2003 operating system • One or more remote servers with system ROMs requiring upgrade • An administrative user account on each target system.
Failure occurs during ROM flash After the online flash preparation has been successfully completed, the system ROM is flashed offline. The flash cannot be interrupted during this process, or the ROM image is corrupted and the server does not start. The most likely reason for failure is a loss of power to the system during the flash process. Initiate ROMPaq disaster recovery procedures.
3. Remove the access panel.
4. Change positions 1, 5, and 6 of the system maintenance switch to on.
5. Install the access panel.
6. Install the server into the rack.
7. Power up the server.
8. After the system beeps, repeat steps 1 through 3.
9. Change positions 1, 5, and 6 of the system maintenance switch to off.
10. Repeat steps 5 and 6.

If both the current and backup versions of the ROM are corrupt, return the system board for a service replacement.
Software tools and solutions Configuration tools SmartStart software SmartStart is a collection of software that optimizes single-server setup, providing a simple and consistent way to deploy server configuration. SmartStart has been tested on many ProLiant server products, resulting in proven, reliable configurations.
• Displaying system information • Selecting the primary boot controller • Configuring memory options • Language selection For more information on RBSU, see the HP ROM-Based Setup Utility User Guide on the Documentation CD or the HP website (http://www.hp.com/support/smartstart/documentation). Using RBSU To use RBSU, use the following keys: • To access RBSU, press the F9 key during power-up when prompted. • To navigate the menu system, use the arrow keys.
By default, the auto-configuration process configures the system for the English language. To change any default settings in the auto-configuration process (such as the settings for language, operating system, and primary boot controller), execute RBSU by pressing the F9 key when prompted. After the settings are selected, exit RBSU and allow the server to reboot automatically. For more information on RBSU, see the HP ROM-Based Setup Utility User Guide on the Documentation CD or the HP website (http://www.
Array Configuration Utility ACU is a browser-based utility with the following features: • Runs as a local application or remote service • Supports online array capacity expansion, logical drive extension, assignment of online spares, and RAID or stripe size migration • Suggests the optimum configuration for an unconfigured system • Provides different operating modes, enabling faster configuration or greater control over the configuration options • Remains available any time that the server is on
• File system types, contents, or status • Partition types, sizes, or layout • Software RAID information • Operating system device names or mount points Option ROM Configuration for Arrays Before installing an operating system, you can use the ORCA utility to create the first logical drive, assign RAID levels, and establish online spare configurations.
• When re-entering the serial number and product ID on an HP ProLiant G6 server or later, use the following procedure: After you replace the system board, you must re-enter the server serial number and the product ID. 1. During the server startup sequence, press the F9 key to access RBSU. 2. Select the Advanced Options menu. 3. Select Service Options. 4. Select Serial Number.
For more information, go to the HP website (http://www.hp.com/go/hpsc) and click on Drivers, Software & Firmware. Then, enter your product name in the Find an HP product field and click Go. iLO and iLO 2 technology The iLO subsystem is a standard component of selected ProLiant servers that provides server health and remote server manageability. The iLO or iLO 2 subsystem includes an intelligent microprocessor, secure memory, and a dedicated network interface.
Erase Utility CAUTION: Perform a backup before running the System Erase Utility. The utility sets the system to its original factory state, deletes the current hardware configuration information, including array setup and disk partitioning, and erases all connected hard drives completely. Refer to the instructions for using this utility. Run the Erase Utility if you must erase the system for the following reasons: • You want to install a new operating system on a server with an existing operating system.
• POST • RBSU • Diagnostics • DOS • Operating environments which do not provide native USB support Diagnostic tools HP Insight Diagnostics HP Insight Diagnostics is a proactive server management tool, available in both offline and online versions, that provides diagnostics and troubleshooting capabilities to assist IT administrators who verify server installations, troubleshoot problems, and perform repair validation.
This functionality supports operating systems that may not be supported by the server. For operating systems supported by the server, see the HP website (http://www.hp.com/go/supportos). If a significant change occurs between data-gathering intervals, the survey function marks the previous information and overwrites the survey data files to reflect the latest changes in the configuration.
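The idea of comparing configuration data between data-gathering intervals can be sketched as follows. This is an illustration only, not how the survey function is implemented: the snapshot format (a flat dictionary), the function name, and the sample keys and values are all hypothetical.

```python
def diff_configuration(previous, current):
    """Compare two configuration snapshots and list what changed.

    Returns a dict mapping each changed key to (old_value, new_value).
    A key that exists in only one snapshot is reported with None on
    the other side.
    """
    changes = {}
    for key in sorted(set(previous) | set(current)):
        old = previous.get(key)
        new = current.get(key)
        if old != new:
            changes[key] = (old, new)
    return changes


if __name__ == "__main__":
    # Hypothetical snapshots taken at two data-gathering intervals.
    before = {"memory_mb": 4096, "nic_driver": "1.0"}
    after = {"memory_mb": 8192, "nic_driver": "1.0", "tape_drive": "present"}
    for key, (old, new) in diff_configuration(before, after).items():
        print(f"{key}: {old} -> {new}")
```

Reporting both the old and new value for each key keeps the previous information visible alongside the latest change, which mirrors the intent described above.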
Remote support and analysis tools HP Insight Remote Support software HP strongly recommends that you install HP Insight Remote Support software to complete the installation or upgrade of your product and to enable enhanced delivery of your HP Warranty, HP Care Pack Service, or HP contractual support agreement.
• VCRM manages the repository for Windows and Linux PSPs as well as online firmware. Administrators can browse a graphical view of the PSPs or configure VCRM to automatically update the repository with Internet downloads of the latest software from HP. • VCA compares installed software versions and available updates. Administrators can configure VCA to point to a repository managed by VCRM.
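Comparing installed versions against a repository can be sketched in a few lines. The example below is a simplified illustration, not VCA's actual logic, which this guide does not describe: the component names, the dotted numeric version format, and both function names are assumptions.

```python
def parse_version(text):
    """Turn a dotted numeric version string such as '1.10' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def find_updates(installed, available):
    """Return components whose repository version is newer than the installed one.

    Both arguments map a component name to a dotted numeric version string.
    The result maps each outdated component to (installed, available).
    """
    updates = {}
    for name, installed_version in installed.items():
        repo_version = available.get(name)
        if repo_version is None:
            continue  # Component is not in the repository; nothing to compare.
        if parse_version(repo_version) > parse_version(installed_version):
            updates[name] = (installed_version, repo_version)
    return updates


if __name__ == "__main__":
    # Hypothetical component names and versions.
    installed = {"nic_driver": "1.9", "array_firmware": "2.0"}
    repository = {"nic_driver": "1.10", "array_firmware": "2.0"}
    print(find_updates(installed, repository))
```

Parsing each version into a tuple of integers avoids the string-comparison pitfall in which "1.10" would sort before "1.9".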
SPP supports most HP ProLiant servers and HP BladeSystem products, but might not support older models. To determine if your product is supported by the SPP, see the latest server support guide on the HP website (http://www.hp.com/go/spp/documentation). HP Smart Update Manager HP SUM is included in many HP products for installing and updating firmware and software on HP ProLiant servers.
IMPORTANT: This utility supports operating systems that may not be supported by the server. For operating systems supported by the server, see the HP website (http://www.hp.com/support). • Integrates with other software maintenance, deployment, and operating system tools • Automatically checks for hardware, firmware, and operating system dependencies, and installs only the correct ROM upgrades required by each target server To download the tool and for more information, see the HP website (http://www.
A system reboot is required for a ROM upgrade to take effect. For disaster recovery or ROM downgrade purposes, backups of the most current ROM image are available in either redundant ROM or a ROM backup. ProLiant servers support either the redundant ROM feature or the Disaster Recovery feature. Both assist with the management of system ROM revisions and ensure the proper operation of the platform if a failure occurs during the firmware upgrade process.
NOTE: The Onboard Administrator and NIC firmware are only supported in online deployments.

Verifying firmware versions

To locate and verify the most current firmware versions, use the following tools:
• Insight Diagnostics Online Edition ("HP Insight Diagnostics" on page 75)
Access this tool from the System Management Homepage (http://h18013.www1.hp.com/products/servers/management/agents/index.html).
information located in the HP Smart Update Manager User Guide on the HP website (http://www.hp.com/go/hpsum/documentation). For more information about HP SUM, see "HP Smart Update Manager (on page 79)." Offline deployment (if SPP is not supported) Use the procedure in this section only if your product is not supported by the SPP. If your product is supported by the SPP, use the deployment information located in the HP Smart Update Manager User Guide on the HP website (http://www.hp.
If you are using a USB drive key with multiple images, navigate to the appropriate subfolder to launch autorun for the Firmware Maintenance CD or DVD, or Smart Update Firmware DVD. 2. Read the End-User License Agreement. If you agree to the terms of the license agreement, click Agree to continue. The firmware maintenance interface appears. 3. Click the Firmware Update tab. 4. Click Install Firmware. HP SUM is initiated. 5. Select and install the preferred components.
o Servers that exhibit an Unsupported processor state ("Unsupported Processor Detected System will ONLY boot ROMPAQ Utility. System Halted." on page 130)
o Servers that support the Disaster Recovery feature ("Disaster recovery support" on page 81)
For additional information, see the documentation contained in the Enhanced SoftPaq.
HP resources for troubleshooting
Online resources
HP Technical Support website
Troubleshooting tools and information, as well as the latest drivers and flash ROM images, are available on the HP website (http://www.hp.com/support).
HP Guided Troubleshooting website
HP Guided Troubleshooting is available for many products and components on the HP website (http://www.hp.com/support/gts).
Server documentation
Server documentation is the set of documents that ships with a server.
To create a profile and select notifications, refer to the HP website (http://www.hp.com/go/subscriberschoice). Change control and proactive notification HP offers Change Control and Proactive Notification to notify customers 30 to 60 days in advance of upcoming hardware and software changes on HP commercial products. For more information, refer to the HP website (http://www.hp.com/go/pcn).
• HP Support Center website (http://www.hp.com/go/hpsc) Teardown procedures, part numbers, specifications See the server maintenance and service guide, available in the following locations: • Documentation CD that ships with the server • Documentation CD that ships with the enclosure (for HP BladeSystem documentation) • HP Support Center website (http://www.hp.com/go/hpsc) Technical topics White papers are electronic documentation on complex technical topics.
Product configuration resources
Device driver information
Refer to driver information on the HP Software and Drivers website (http://www.hp.com/support).
DDR3 memory configuration
See the DDR3 Memory Configuration Tool on the HP website (http://www.hp.com/go/ddr3memory-configurator).
Operating System Version Support
For information about specific versions of a supported operating system, refer to the operating system support matrix (http://www.hp.com/go/supportos).
Management of the server Refer to the HP Systems Insight Manager Help Guide on the Management CD or DVD, or the HP website (http://www.hp.com/go/hpsim). Installation and configuration information for the server management system Refer to the HP Systems Insight Manager Installation and User Guide on the Management CD or DVD, or the HP website (http://www.hp.com/go/hpsim).
Error messages ADU error messages Introduction to ADU error messages This section contains a complete alphabetical list of all ADU ("Array diagnostic software" on page 76) error messages for ADU version 7.85.16.0 and earlier. IMPORTANT: This guide provides information for multiple servers. Some information may not apply to the server you are troubleshooting. Refer to the server documentation for information on procedures, hardware options, software tools, and operating systems supported by the server.
Accelerator Status: Cache was Automatically Configured During Last Controller Reset
Description: Cache board was replaced with one of a different size.
Action: No action is required.
Accelerator Status: Data in the Cache was Lost... ...due to some reason other than the battery being discharged.
Description: Data in cache was lost, but not because of the battery being discharged.
Action: Be sure the array accelerator is properly seated. If the error persists, you may need to replace the array accelerator.
Accelerator Status: Obsolete Data Detected Description: During reset initialization, obsolete data was found in the cache due to the drives being moved and written to by another controller. Action: No action is required. The controller either writes the data to the drives or discards the data completely. Accelerator Status: Obsolete Data was Discarded Description: During reset initialization, obsolete data was found in the cache, and was discarded (not written to the drives). Action: No action is required.
Accelerator Status: Warranty Alert Description: Catastrophic problem exists with array accelerator board. Refer to other messages on Diagnostics screen for exact meaning of this message. Action: Replace the array accelerator board. Adapter/NVRAM ID Mismatch Description: EISA NVRAM has an ID for a different controller from the one physically present in the slot. Action: Run the server setup utility. Array Accelerator Battery Pack X not Fully Charged Description: Battery is not fully charged.
Configuration Signature is Zero Description: ADU ("Array diagnostic software" on page 76) detected that NVRAM contains a configuration signature of zero. Old versions of the server setup utility could cause this. Action: Run the latest version of server setup utility to configure the controller and NVRAM. Configuration Signature Mismatch Description: The array accelerator board is configured for a different array controller board.
Controller Reported POST Error. Error Code: X Description: The controller returned an error from its internal POST. Action: Replace the controller. Controller Restarted with a Signature of Zero Description: ADU ("Array diagnostic software" on page 76) did not find a valid configuration signature to use to get the data. NVRAM may not be present (unconfigured) or the signature present in NVRAM may not match the signature on the controller.
Drive (Bay) X is a Replacement Drive Description: This drive has been replaced. This message is displayed if a drive is replaced in a fault-tolerant logical volume. Action: If the replacement was intentional, allow the drive to rebuild.
Drive Monitoring Features Are Unobtainable Description: ADU ("Array diagnostic software" on page 76) is unable to get monitor and performance data due to a fatal command problem (such as drive time-out), or is unable to get data due to these features not being supported on the controller. Action: Check for other errors such as time-outs. If no other errors occur, upgrade the firmware to a version that supports monitor and performance, if desired.
Identify Logical Drive Data did not Match with NVRAM Description: The identify unit data from the array controller does not match with the information stored in NVRAM. This can occur if new, previously configured drives have been placed in a system that has also been previously configured. Action: Run the server setup utility to configure the controller and NVRAM.
Otherwise, follow the procedures for correcting problems when an incorrect drive is replaced or a loose cable is detected. Logical Drive X Status = Interim Recovery (Volume Functional, but not Fault Tolerant) Description: A physical drive in this logical drive has failed. The logical drive is operational, but the loss of an additional drive may cause permanent data loss. Action: Replace the failed drive as soon as possible. Logical Drive X Status = Loose Cable Detected... ...
Logical Drive X Status = Wrong Drive Replaced
Description: A physical drive in this logical drive has failed. The incorrect drive was replaced.
Action:
1. Power down the server.
2. Replace the drive that was incorrectly replaced.
3. Replace the original drive that failed with a new drive.
CAUTION: Do not run the server setup utility and try to reconfigure, or data will be lost.
Other Controller Indicates Different Hardware Model Description: The other controller in the redundant controller configuration is a different hardware model. Action: Be sure both controllers are using the same hardware model. If they are, make sure the controllers are fully seated in their slots. Other Controller Indicates Different Firmware Version Description: The other controller in the redundant controller configuration is using a different firmware version.
RIS Copies Between Drives Do Not Match Description: The drives on this controller contain copies of the RIS that do not match. The hard drives in the array do not have matching configuration information. Action: 1. Resolve all other errors encountered. 2. Obtain the latest version of ADU, and then rerun ADU ("Array diagnostic software" on page 76). 3. If unconfigured drives were added, configure these drives using ACU. 4.
2. Reconnect the cable securely. 3. Restart the system. 4. If the problem persists, replace the cables and connectors as needed. SCSI Port X, Drive ID Y RIS Copies Within This Drive Do Not Match Description: The copies of RIS on the drive do not match. Action: Check for other errors. The drive may need to be replaced. SCSI Port X, Drive ID Y...S.M.A.R.T. Predictive Failure Errors Have Been Detected in the Factory Monitor and Performance Data... ...
Storage Enclosure on SCSI Bus X has a Cabling Error (Bus Disabled)... ...SOLUTION: The SCSI controller has an internal and external cable attached to the same bus. Please disconnect the internal or external cable from the controller. If this controller supports multiple buses, the cable disconnected can be reattached to an available bus. Description: The current cabling configuration is not supported. Action: Refer to the server documentation for cabling guidelines, and reconfigure as indicated.
Description: One or more fans in the external storage unit have failed. Action: Replace the failed fans. Storage Enclosure on SCSI Bus X Indicated that the Fan Module is Unplugged... ...SOLUTION: Make sure the fan module is properly connected. Description: A fan in the external storage unit is not connected properly. Action: Check and reseat all fan connections securely. Storage Enclosure on SCSI Bus X - Wide SCSI Transfer Failed... ...SOLUTION: This may indicate a bad SCSI cable on bus X.
Swapped Cables or Configuration Error Detected. An Unsupported Drive Arrangement Was Attempted... ...SOLUTION: Power down system then move drives back to their original location. Description: One or more physical drives were moved, causing a configuration that is not supported. Action: Move all drives to their original locations, and then refer to the server documentation for supported configurations. Swapped cables or configuration error detected. The cables appear to be interchanged... ...
System Board is Unable to Identify which Slots the Controllers are in Description: The slot indicator on the system board is not working correctly. Firmware recognizes both controllers as being installed in the same slot. Action: 1. Be sure both controllers are fully seated in their slots. If the problem persists, this might indicate a controller problem or a system board problem. CAUTION: Only authorized technicians trained by HP should attempt to remove the system board.
Unable to Retrieve Identify Controller Data. Controller May be Disabled or Failed ...SOLUTION: Power down the system. Verify that the controller is fully seated. Then power the system on and look for helpful error messages displayed by the controller. If this doesn’t help, contact your HP service provider. Description: ADU ("Array diagnostic software" on page 76) requested the identify controller data from the controller, but was unable to obtain it.
WARNING - Mixed Feature Processors Were Detected Description: Mixed feature processors were detected. The server will boot using the lowest featured processor. If you install supported processors with different features in the same system, this informational message is displayed. WARNING - Resetting Corrupted CMOS Description: This informational message displays when the ROM detects that CMOS is corrupted. The default values are restored.
Wrong Accelerator Description: This may mean that the board was replaced in the wrong slot or was placed in a system previously configured with another board type. Included with this message is a message indicating (1) the type of adapter sensed by ADU ("Array diagnostic software" on page 76), and (2) the type of adapter last configured in EISA NVRAM. Action: Check the diagnosis screen for other error messages. Run the server setup utility to update the system configuration. ADU version 8.0 through 8.
Array Accelerator: The cache is disabled because the restore operation from flash memory failed. Action: Reseat the controller cache module. If the problem persists, contact HP support ("Contacting HP" on page 185, "Contacting HP technical support or an authorized reseller" on page 185). Array Accelerator: The cache is disabled because the charge on the flash-memory capacitor is too low. Action: Replace the capacitor if the capacitor does not recharge within 10 minutes.
Array Status: The array has a spare drive assigned which is smaller than the smallest data drive in the array… …Some operations in the array will not be available. Action: Replace the spare drive with another drive at least the size of the smallest data drive in the array. Controller State: The array controller contains a volume that was created with a different version of controller firmware… …and is not backward-compatible with the current version of firmware.
Controller State: The array controller is connected to an expander card or an external enclosure… …and is operating without a memory board. If there are physical drives attached to the expander card or external enclosure, and those drives contain any logical drives, then making any configuration change will lead to potential data loss on those logical drives. Action: Install a cache memory module.
Controller State: The array controller has an unknown disabled configuration status message… …Any configuration command (e.g. logical drive creation, array expansion, etc.) or modification to the controller will result in the loss of all existing data on the disabled volume(s). Action: Contact HP support ("Contacting HP" on page 185, "Contacting HP technical support or an authorized reseller" on page 185).
Drive Offline due to Erase Operation: The physical drive is currently queued for erase. Action: No action is required. The logical drive containing this physical drive cannot be migrated or extended while the erase operation is in progress. Drive Offline due to Erase Operation: The physical drive is offline and currently being erased. Action: No action is required. The logical drive containing this physical drive cannot be migrated or extended while the erase operation is in progress.
Logical drive state: The current array controller is performing capacity expansion,... ...extension, or migration on this logical drive. Action: No action is required. Further configuration is disabled until the process completes. Logical drive state: The logical drive is disabled from a SCSI ID conflict. Action: Check all SCSI components to make sure they all have a unique SCSI ID. Logical drive state: The logical drive is not configured. Action: Refresh the system using the Array Configuration Utility.
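The check for unique SCSI IDs described in the SCSI ID conflict message above can be sketched as follows. The device names, the ID values, and the function `find_scsi_id_conflicts` are hypothetical. The sketch assumes that an inventory of (device, SCSI ID) pairs has already been collected by hand or by another tool.

```python
from collections import defaultdict


def find_scsi_id_conflicts(devices):
    """Return {scsi_id: [device names]} for every ID used by more than one device.

    `devices` is an iterable of (name, scsi_id) pairs on a single SCSI bus.
    """
    by_id = defaultdict(list)
    for name, scsi_id in devices:
        by_id[scsi_id].append(name)
    return {scsi_id: names for scsi_id, names in by_id.items() if len(names) > 1}


# Made-up inventory for one bus: the tape drive reuses the ID of drive-b.
devices = [("controller", 7), ("drive-a", 0), ("drive-b", 1), ("tape", 1)]

for scsi_id, names in find_scsi_id_conflicts(devices).items():
    print(f"SCSI ID {scsi_id} is shared by: {', '.join(names)}")
```

An empty result means every component in the inventory has a unique ID on that bus.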
NVRAM Error: Bootstrap NVRAM image failed checksum test... ...and could not be restored. This error may or may not be recoverable. A firmware update might be able to correct the error. Action: Update the controller firmware. If the update fails, contact HP support ("Contacting HP" on page 185, "Contacting HP technical support or an authorized reseller" on page 185). Physical Drive State: The data on the physical drive is being rebuilt. Action: No action is required.
Action: Replace the physical drive with a larger drive supported by the controller. Physical Drive State: This drive is unrecognizable... ...It is not supported for configuration and should be disconnected from this controller. Action: Replace the physical drive with a drive supported by the controller. Physical Drive State: This physical drive is part of a logical drive that is not supported by the current configuration... … Any configuration command (e.g. logical drive creation, array expansion, etc.
Redundant Path Failure: Warning: Redundant I/O modules of this storage box... ...are not cabled in a recommended configuration.
Action: To correctly connect the cables to the storage system, see the product user guide.
Smart SSD State: SSD has less than 2% of usage remaining before wearout.
Action: Monitor the drive frequently and replace the drive before wearout.
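For administrators who track SSD wear in their own scripts, the 2% condition in the message above reduces to a simple threshold test. This is a minimal sketch that assumes the remaining-usage percentage has already been read from a management tool. The constant and function names are hypothetical and do not correspond to any HP utility.

```python
# Threshold quoted in the Smart SSD State message; the names are illustrative.
WEAROUT_WARNING_PERCENT = 2.0


def needs_wearout_attention(usage_remaining_percent):
    """Return True when less than 2% of usage remains before wearout."""
    if not 0.0 <= usage_remaining_percent <= 100.0:
        raise ValueError("usage remaining must be between 0 and 100 percent")
    return usage_remaining_percent < WEAROUT_WARNING_PERCENT


# Made-up readings for three drives.
readings = {"bay-1": 41.0, "bay-2": 1.5, "bay-3": 2.0}
flagged = [bay for bay, pct in readings.items() if needs_wearout_attention(pct)]
print(flagged)
```

The comparison is strict, so a reading of exactly 2% is not flagged, matching the "less than 2%" wording of the message.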
Storage Enclosure: One or more fans have failed. Action: Replace the failed fan. Storage Enclosure: Warning: The enclosure is reporting a high temperature status. Action: Be sure that all fans are connected and operating properly. Replace any defective fans. For better airflow, remove any dust buildup from fans or other areas. If the problem persists, contact HP support ("Contacting HP" on page 185, "Contacting HP technical support or an authorized reseller" on page 185).
Advanced Memory Protection mode: Online spare with Advanced ECC ...Xxxx MB System memory and xxxx MB memory reserved for Online Spare. Audible Beeps: None Possible Cause: This message indicates Online Spare Memory is enabled and indicates the amount of memory reserved for this feature. Action: None. Advanced Memory Protection mode: Multi-board mirrored memory with Advanced ECC ...Xxxx MB System memory and xxxx MB memory reserved for Mirroring.
Fan Solution Not Sufficient Audible Beeps: Possible Cause: The minimum number of required fans is missing or failed. Action: Install fans or replace any failed fans. Fatal DMA Error Audible Beeps: None Possible Cause: The DMA controller has experienced a critical error that has caused an NMI. Action: Run Insight Diagnostics ("HP Insight Diagnostics" on page 75) and replace failed components as indicated.
Fatal Hub Link Error Audible Beeps: None Possible Cause: The hub link interface has experienced a critical failure that caused an NMI. Action: Run Insight Diagnostics ("HP Insight Diagnostics" on page 75) and replace failed components as indicated. FATAL ROM ERROR: The System ROM is not Properly Programmed. Audible Beeps: 1 long, 1 short Possible Cause: The System ROM is not properly programmed. Action: Replace the physical ROM part. Fibre Channel Mezzanine/Balcony Not Supported.
Internal CPU Check - Processor Audible Beeps: None Possible Cause: A processor has experienced an internal error. Action: 1. Run Insight Diagnostics ("HP Insight Diagnostics" on page 75). CAUTION: Before removing or replacing any processors, be sure to follow the guidelines provided in "Performing processor procedures in the troubleshooting process (on page 16)." Failure to follow the recommended guidelines can cause damage to the system board, requiring replacement of the system board. 2.
Mismatched power supplies not supported Audible Beeps: 1 long, 1 short Possible Cause: The power supplies installed in the server are not supported in the current configuration. The server does not support installing more than one type of power supply. Action: Install supported power supplies in a supported configuration. For supported power supply configurations, see the server documentation on the HP website (http://www.hp.com/support). Mixed processor speeds detected.
No Floppy Drive Present Audible Beeps: None Possible Cause: No diskette drive is installed or a diskette drive failure has occurred. Action: 1. Power down the server. 2. Replace a failed diskette drive. 3. Be sure a diskette drive is cabled properly, if a diskette drive exists. No Keyboard Present Audible Beeps: None Possible Cause: A keyboard is not connected to the server or a keyboard failure has occurred. Action: 1. Power down the server, and then reconnect the keyboard. 2.
Power Supply Solution Not Fully Redundant
Audible Beeps: None
Possible Cause: The minimum power supply requirement is installed, but a redundant power supply is missing or failed.
Action: Do one of the following:
• Install a power supply.
• Replace failed power supplies to complete redundancy.
Processor X Unsupported Wattage.
Audible Beeps: 1 long, 1 short
Possible Cause: Processor not supported by current server.
Audible Beeps: None Possible Cause: The primary system ROM is corrupt. The system is booting from the redundant ROM. Action: Run ROMPaq Utility to restore the system ROM to the correct version. Temperature violation detected - system Shutting Down in X seconds Audible Beeps: 1 long, 1 short Possible Cause: The system has reached a cautionary temperature level and is shutting down in X seconds. Action: Adjust the ambient temperature, install fans, or replace any failed fans.
Trusted Execution Error found: 0X Audible beeps: None Possible cause: Intel Trusted Execution Technology has indicated an error during the previous attempt at trusted boot. Action: Check the error code in the Intel documentation. For more information, see the Intel website (http://www.intel.com). Unsupported DIMM(s) found in system. - DIMM(s) may not be used Description: Unsupported memory types found in system.
Audible Beeps: None Possible Cause: A USB tape device that supports One Button Disaster Recovery (OBDR) is installed in the system. Action: 1. Press 1 or 2. o Pressing 2 exits the configuration. o Pressing 1 starts the configuration. The following message appears Attempting to enable OBDR for the attached USB tape drive... 2. Observe the configuration progress. The following error may appear: Error - USB tape drive not in Disaster Recovery mode. 3.
CAUTION: Before removing or replacing any processors, be sure to follow the guidelines provided in "Performing processor procedures in the troubleshooting process (on page 16)." Failure to follow the recommended guidelines can cause damage to the system board, requiring replacement of the system board. Correct the processor configuration. WARNING: ProLiant Demand Based Power Management cannot be supported with the following processor configuration. The system will run in Full Performance mode.
102-System Board Failure
Audible Beeps: None
Possible Cause: Failure of the 8237 DMA controllers, 8254 timers, or similar devices on the system board.
CAUTION: Only authorized technicians trained by HP should attempt to remove the system board. If you believe the system board requires replacement, contact HP Technical Support ("Contacting HP" on page 185, "Contacting HP technical support or an authorized reseller" on page 185) before proceeding.
Action: Replace the system board. Run the server setup utility.
104-ASR Timer Failure Audible Beeps: None Possible Cause: System board failure. CAUTION: Only authorized technicians trained by HP should attempt to remove the system board. If you believe the system board requires replacement, contact HP Technical Support ("Contacting HP" on page 185, "Contacting HP technical support or an authorized reseller" on page 185) before proceeding. Action: Run Insight Diagnostics ("HP Insight Diagnostics" on page 75) and replace failed components as indicated.
200 Series 201-Memory Error Audible Beeps: None Possible Cause: Memory failure detected. Action: Run Insight Diagnostics ("HP Insight Diagnostics" on page 75) and replace failed components as indicated. 203-Memory Address Error Audible Beeps: None Possible Cause: Memory failure detected. Action: Run Insight Diagnostics ("HP Insight Diagnostics" on page 75) and replace failed components as indicated. 207 - Invalid Memory Configuration Detected. DIMMs installed when no corresponding processor is detected.
207-Invalid Memory Configuration - DIMM Size Parameters Not Supported. Audible Beeps: 1 long, 1 short Possible Cause: Installed memory module is an unsupported size. Action: Install a memory module of a supported size. 207-Invalid Memory Configuration - Incomplete Bank Detected in Bank X Audible Beeps: 1 long, 1 short Possible Cause: Bank is missing one or more DIMMs. Action: Fully populate the memory bank.
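When POST messages are logged or catalogued in a script, the Audible Beeps field can be parsed into a structured pattern. The following is a minimal sketch that assumes the field uses only the forms shown in this guide ("None", "2 short", "1 long, 1 short"). The function name `parse_beeps` is hypothetical.

```python
import re


def parse_beeps(text):
    """Parse an Audible Beeps value into a list of (count, length) pairs.

    '1 long, 1 short' -> [(1, 'long'), (1, 'short')]; 'None' -> [].
    """
    text = text.strip()
    if text.lower() == "none":
        return []
    pattern = []
    for part in text.split(","):
        match = re.fullmatch(r"\s*(\d+)\s+(long|short)\s*", part, flags=re.IGNORECASE)
        if match is None:
            raise ValueError(f"unrecognized beep description: {part!r}")
        pattern.append((int(match.group(1)), match.group(2).lower()))
    return pattern


print(parse_beeps("1 long, 1 short"))
print(parse_beeps("None"))
```

An empty or unrecognized value raises ValueError rather than being guessed at, so entries with a missing beep field are not silently treated as "None".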
207-Invalid Memory Configuration - Single channel memory... ...mode supports a single DIMM installed in DIMM socket 1. Please remove all other DIMMs or install memory in valid pairs. System Halted. Audible Beeps: 1 long, 1 short Possible Cause: DIMMs are installed in pairs, but the server is in single channel memory mode. Action: Remove all other DIMMs or install memory in valid pairs and change the memory mode.
Possible Cause: Memory boards are not installed sequentially. Action: Install or reinstall memory boards sequentially. 209-Invalid Lockstep memory configuration Audible Beeps: 1 long, 1 short Possible Cause: The memory is not installed properly to support Lockstep mode. Action: See the server documentation for supported Lockstep memory configurations.
214-Processor PPM Failed, Module X
Audible Beeps: None
Possible Cause: Indicated PPM failed.
Action: Run Insight Diagnostics ("HP Insight Diagnostics" on page 75) and replace failed components as indicated.
300 Series
301-Keyboard Error
Audible Beeps: None
Possible Cause: Keyboard failure occurred.
Action:
1. Power down the server, and then reconnect the keyboard.
2. Be sure no keys are depressed or stuck.
3. If the failure recurs, replace the keyboard.
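The numbered POST messages in this section are grouped into series by hundreds. For example, 201 and 207 fall under the 200 Series and 301 falls under the 300 Series. The following sketch extracts the numeric code from a message and derives its series. The rounding rule is inferred from the section headings rather than stated by HP, and the function name `post_series` is hypothetical.

```python
import re


def post_series(message):
    """Return (code, series) for a numbered POST message.

    '207 - Invalid Memory Configuration Detected.' -> (207, 200)
    """
    match = re.match(r"\s*(\d{3,4})\b", message)
    if match is None:
        raise ValueError(f"no numeric POST code found in: {message!r}")
    code = int(match.group(1))
    # Assumption: each series heading covers one block of 100 codes.
    return code, (code // 100) * 100


for text in ("201-Memory Error", "207 - Invalid Memory Configuration Detected.",
             "301-Keyboard Error"):
    code, series = post_series(text)
    print(f"{code} belongs to the {series} Series")
```

Messages that do not begin with a numeric code, such as the text-only POST messages earlier in this chapter, raise ValueError.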
304-Keyboard or System Unit Error Audible Beeps: None Possible Cause: Keyboard, keyboard cable, mouse controller, or system board failure. Action: 1. Be sure the keyboard and mouse are connected. CAUTION: Only authorized technicians trained by HP should attempt to remove the system board. If you believe the system board requires replacement, contact HP Technical Support ("Contacting HP" on page 185, "Contacting HP technical support or an authorized reseller" on page 185) before proceeding. 2.
2. Replace the diskette drive, the cable, or both. 3. Run Insight Diagnostics ("HP Insight Diagnostics" on page 75) and replace failed components as indicated. 602-Diskette Boot Record Error Audible Beeps: None Possible Cause: The boot sector on the boot disk is corrupt. Action: 1. Remove the diskette from the diskette drive. 2. Replace the diskette in the drive. 3. Reformat the diskette. 605-Diskette Drive Type Error. Audible Beeps: 2 short Possible Cause: Mismatch in drive type occurred.
1100 Series 1151-Com Port 1 Address Assignment Conflict Audible Beeps: 2 short Possible Cause: Both external and internal serial ports are assigned to COM X. Action: Run the server setup utility and correct the configuration. 1600 Series 1609 - The server may have a failed system battery. Some... ...configuration settings may have been lost and restored to defaults. Refer to server documentation for more information. If you have just replaced the system battery, disregard this message.
Audible Beeps: None Possible Cause: Required fans are missing or not spinning. Action: 1. Check the fans to be sure they are installed and working. 2. Be sure the assembly is properly connected and each fan is properly seated. 3. If the problem persists, replace the failed fans. 4. If a known working replacement fan is not spinning, replace the assembly. 1611-CPU Zone Fan Assembly Failure Detected. Single fan... ...failure. Assembly will provide adequate cooling.
1611-Fan x Not Present (Fan Zone CPU) Audible Beeps: 2 short Possible Cause: Required fan is not installed or is not spinning. Action: 1. Check the fans to be sure they are working. 2. Be sure each fan cable is properly connected, if applicable, and each fan is properly seated. 3. If the problem persists, replace the failed fans. 1611-Fan x Not Present (Fan Zone I/O) Audible Beeps: 2 short Possible Cause: Required fan is not installed or is not spinning. Action: 1. Check the fans to be sure they are working. 2.
1611-Redundant Fan Failure (Fan Zone System)
Audible Beeps: None
Possible Cause: A redundant fan is not spinning.
Action: Replace the failed fan.
1612-Primary Power Supply Failure
Audible Beeps: 2 short
Possible Cause: Primary power supply has failed.
Action: Replace power supply.
1615-Power Supply Configuration Error
Audible Beeps: None
Possible Cause: The server configuration requires an additional power supply.
1700 Series 1700-Slot X Drive Array - Please replace Array Accelerator Battery... ...The Array Accelerator Cache will be enabled once the battery has been replaced and charged. Audible Beeps: None Possible Cause: The battery needs to be replaced and charged. Action: Replace and charge the Array Accelerator battery. 1701-Slot X Drive Array - Please install Array Accelerator Battery... ...The Array Accelerator Cache will be enabled once the battery is installed and charged.
Audible Beeps: None Possible Cause: An application has overwritten memory reserved by the Smart Array controller. Action: If this occurs when a particular application is loaded, check for an updated version of that application.
1711-Slot X Drive Array - RAID ADG logical drive(s) configured but Array Accelerator size <= 32 MB ...This configuration is not recommended. Consider migrating logical drive(s) to RAID 5 or upgrading the Array Accelerator module. Audible Beeps: None Possible Cause: This configuration is not recommended. Action: Migrate logical drives to RAID 5 or upgrade to a larger array accelerator module. 1711-Slot X Drive Array - Stripe size too large for RAID 5/6 logical drive(s) ...
1715-Slot X Drive Array Controller - Memory Error(s) Occurred... Warning: Corrected Memory Error(s) were detected during controller memory self-test. Upgrade to the latest firmware. If the problem persists, replace the Cache Module or Controller. Audible Beeps: None Possible Cause: The memory is beginning to fail. Action: • Upgrade to the latest firmware. • If this error persists, replace the cache module or controller.
1719-Slot X Drive Array - A controller failure event occurred prior to this power-up (previous lock-up code = 0x####) Audible Beeps: None Possible Cause: A controller failure event occurred before the server powered up. Action: Install the latest version of controller firmware. If the condition persists, then replace the controller. 1720-Slot X Drive Array - S.M.A.R.T. Hard Drive(s) Detect imminent failure: Port X Box Y Bay(s) Z... ...
1725-Slot X Drive Array-Optional SIMM (Memory Module) Problem Detected Audible Beeps: None Possible Cause: SIMM has been automatically disabled because of memory errors or because an unsupported SIMM type was installed. Action: Replace the SIMM memory module on the indicated controller. 1726-Slot X Drive Array - Cache Memory Size or Battery Presence Has Changed ...Array Accelerator configuration has automatically been updated.
1729-Slot X Drive Array - Performance Optimization Scan In Progress ...RAID 4/5/ADG performance may be higher after completion. Audible Beeps: None Possible Cause: One or more RAID 4/5/ADG parity drives are being initialized. Performance of the controller improves after the parity data has been initialized by ARM, an automatic process that runs in the background on the controller. Action: No action is required. 1729-Slot X Disk Performance Optimization Scan In Progress... (sometimes followed by:) ...
Audible Beeps: None Possible Cause: An incorrect enclosure firmware version is installed, or an enclosure firmware upgrade is needed. Action: • Upgrade the enclosure firmware and the controller firmware. • If the condition persists, then replace the enclosure components. For more information, see the HP BladeSystem c-Class Enclosure Troubleshooting Guide on the HP website (http://www.hp.com/support/BladeSystem_Enclosure_TSG_en).
1737-Slot X Drive Array - Redundant Cabling Configuration has excess Device Paths... ...Redundant I/O paths to some devices attached to the controller are exceeding per device limit by firmware. These excess paths are ignored. Audible Beeps: None Possible Cause: The redundant cabling configuration creates more redundant I/O paths than the firmware allows. Action: Update the firmware to the correct version. Verify the redundant cabling configuration.
1741-Fixed Disk 1 failed Set Block Mode Audible Beeps: None Possible Cause: Fixed drive error detected. Action: Run the server setup utility and correct the configuration. 1743-Slot X Drive Array - Logical Drive Erase Operation in Progress... ...Logical drives being erased are temporarily offline. Audible Beeps: None Possible Cause: The drives being erased are offline. Action: Do one of the following: • Wait for the erase process to complete before using the logical drive.
1746-Slot X Drive Array - Unsupported Storage Connection Detected... ...SAS connection via expander is not supported on this controller model. Access to all storage has been disabled until the expander and connections beyond it are detached or controller is upgraded. Audible Beeps: None Possible Cause: The controller or firmware version does not support the attached drive enclosure. Action: Upgrade the controller, or detach the expander-based storage connections.
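When many POST messages are collected from a console log, it can help to split each one into its numeric code, the controller slot (if any), and the message text before looking it up in this guide. The following is a minimal sketch in Python. The function name `parse_post_message` and the regular expression are hypothetical, inferred only from the message layout shown in this section (for example, "1746-Slot X Drive Array - ..."). They are not part of any HP tool.

```python
import re

# Hypothetical pattern inferred from the message layout in this guide:
# a four-digit code, a hyphen, an optional "Slot <id>" prefix, then the text.
POST_MESSAGE = re.compile(
    r"^(?P<code>\d{4})-"            # four-digit POST error code
    r"(?:Slot (?P<slot>\w+) )?"     # optional controller slot identifier
    r"(?P<text>.+)$"                # remainder of the message
)


def parse_post_message(line):
    """Return a dict with code, slot, and text, or None if the line does not match."""
    match = POST_MESSAGE.match(line.strip())
    if match is None:
        return None
    return {
        "code": int(match.group("code")),
        "slot": match.group("slot"),  # None when the message names no slot
        "text": match.group("text").strip(),
    }


if __name__ == "__main__":
    print(parse_post_message("1746-Slot 3 Drive Array - Unsupported Storage Connection Detected..."))
    print(parse_post_message("1741-Fixed Disk 1 failed Set Block Mode"))
```

Lines that do not begin with a four-digit code and a hyphen, such as "Audible Beeps: None", return None, so the helper can be run over a whole captured log.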
Audible Beeps: None Possible Cause: The current battery pack is not supported on this Array Accelerator module. Action: Install only supported battery packs with the correct part number. 1749-Slot X Drive Array - Array Accelerator Flash Memory being erased... ...Accelerator will be reenabled when flash memory erase has completed. Audible Beeps: None Possible Cause: The Array Accelerator flash memory is being erased. Action: No action is required.
• A defective system board or a controller is not fully seated in the PCI slot. Action: • If the controllers are different models, replace one of the controllers so they are both the same model. • Reseat the controllers. CAUTION: Only authorized technicians trained by HP should attempt to remove the system board.
1760-Fixed Disk 0 does not support Block Mode Audible Beeps: None Possible Cause: Fixed drive error detected. Action: Run the server setup utility and correct the configuration. 1761-Fixed Disk 1 Does Not Support Block Mode Audible Beeps: None Possible Cause: Fixed drive error detected. Action: Run the server setup utility and correct the configuration. 1762-Slot X Drive Array - Controller Firmware Upgrade Needed ...
Audible Beeps: None Possible Cause: The capacity expansion process has been temporarily disabled. Action: Follow the action that is displayed onscreen to resume the capacity expansion process. 1768-Slot X Drive Array - Resuming Logical Drive Expansion Process Audible Beeps: None Possible Cause: Power was lost, or a controller reset or power cycle occurred, while a logical drive expansion operation was in progress. Action: No action is required.
1771-Primary Disk port Address conflict Audible Beeps: None Possible Cause: Internal and external hard drive controllers are both assigned to the primary address. Action: Run the server setup utility and correct the configuration. 1772-Secondary Disk port Address conflict Audible Beeps: None Possible Cause: Address assignment conflict. Internal and external hard drive controllers are both assigned to the secondary address. Action: Run the server setup utility and correct the configuration.
2. Be sure the cables to the specified port are connected properly and securely ("Loose connections" on page 18). 3. Reconfigure the drives to different SCSI ports. 1776-Slot X Drive Array - Shared SAS Port Connection Conflict Detected - Ports 1I, 1E: Storage connections detected on both shared internal and external ports. ...Controller selects internal port until connection is removed from one of the ports.
• Be sure the internal plenum cooling fan in tower servers or storage systems is operational. If the fan is not operating, check for obstructions and check all internal connections. • Replace the unit side panel if removed. • Check the LEDs. If the ProLiant Storage System power LED is amber instead of green, this indicates a redundant power supply failure. • If the message indicates to check SCSI cables, do the following: a.
1778-Slot X Drive Array resuming Automatic Data Recovery (Rebuild) process Audible Beeps: None Possible Cause: A controller reset or power cycle occurred while Automatic Data Recovery was in progress. Action: No action is required. 1779-Slot X Drive Array - Replacement drive(s) detected OR previously failed drive(s) now appear to be operational:... ...Port X Box Y Bay(s) Z Restore data from backup if replacement drive(s) have been installed.
1781-Disk 1 Failure Audible Beeps: None Possible Cause: Hard drive or format error detected. Action: Run Insight Diagnostics ("HP Insight Diagnostics" on page 75) and replace failed components as indicated. 1782-Disk Controller Failure Audible Beeps: None Possible Cause: Hard drive circuitry error detected. Action: Run Insight Diagnostics ("HP Insight Diagnostics" on page 75) and replace failed components as indicated. 1783-Slot X Drive Array Controller Failure... (followed by one of the following:) ...
Audible Beeps: None Possible Cause: Defective drive or cables detected. Action: 1. Be sure all cables are connected properly and securely. 2. Be sure all drives are fully seated. 3. Replace defective cables, drive X, or both. 1784-Slot X Drive Array - Logical Drive Failure Audible Beeps: None Possible Cause: Defective drive or cables detected. Action: 1. Be sure all cables are connected properly and securely. 2. Be sure all drives are fully seated. 3. Replace defective cables, drive X, or both.
Audible Beeps: None
Possible Cause: A failed or replacement drive has not yet been rebuilt.
Action: Perform one of the following actions:
o Press the F1 key to continue with recovery of data to the drive. Data will be automatically restored to drive X when a failed drive has been replaced, or to the original drive if it is working again without errors.
o Press the F2 key to continue without recovery of data to the drive.
Action:
• If replacement drives are installed in the wrong bays, properly reinstall the drives as indicated and then do one of the following:
o Press the F1 key to restart the server with the drive array disabled.
o Press the F2 key to use the drives as configured and lose all the data on them.
• If a bad power cable connection exists:
a. Repair the connection and press the F2 key.
b. If the problem persists, run ADU ("Array diagnostic software" on page 76) to resolve.
1791-Disk 1 Error Audible Beeps: None Possible Cause: Hard drive error or wrong drive type detected. Action: 1. Run the server setup utility and correct the configuration. 2. If the problem persists, run Insight Diagnostics ("HP Insight Diagnostics" on page 75) and replace the failed assembly as indicated. 1792-Slot X Drive Array - Valid Data Found in Array Accelerator... ...Data will automatically be written to drive array.
Audible Beeps: None Possible Cause: Power was interrupted while data was in the array accelerator memory, or the data stored in the array accelerator does not correspond to this drive array. Action: Match the array accelerator to the correct drive array, or run ACU to clear the data in the array accelerator. 1796-Slot X Drive Array - Array Accelerator Not Responding... ...Array Accelerator is temporarily disabled. Audible Beeps: None Possible Cause: Array accelerator is defective or is missing.
1800 Series
1800-Slot X Drive Array - Array Accelerator Super-Cap is charging... ...The Array Accelerator Cache will be enabled once Super-Cap has been charged. No action is required.
Audible Beeps: None
Possible Cause: The Array Accelerator Super-Cap is charging.
Action: No action is required.
1801-Slot X Drive Array - Please install Array Accelerator Super-Cap... ...The Array Accelerator Cache will be enabled once Super-Cap is installed and charged.
A CPU Power Module (System Board, Socket X)... ...A CPU Power Module (Slot X, Socket Y) Failed Event Type: Power module failure Action: Replace the power module. In the case of an embedded power module, replace the system board. ASR Lockup Detected: Cause Event Type: System lockup Action: Examine the IML ("Integrated Management Log" on page 76) to determine the cause of the lockup.
EISA Expansion Bus Master Timeout (Slot X)... ...EISA Expansion Bus Slave Timeout EISA Expansion Board Error (Slot X) EISA Expansion Bus Arbitration Error Event Type: Expansion bus error Action: Power down the server, and then replace the EISA board. PCI Bus Error (Slot X, Bus Y, Device Z, Function X) Event Type: Expansion bus error Action: Replace the PCI board.
System AC Power Overload (Power Supply X) Event Type: Power supply overload Action: 1. Switch the voltage from 110 V to 220 V or add an additional power supply (if applicable to the system). 2. If the problem persists, remove some of the installed options. System AC Power Problem (Power Supply X) Event Type: AC voltage problem Action: Check for any power source problems. System Fan Failure (Fan X, Location) Event Type: Fan failure Action: Replace the fan.
CAUTION: Only authorized technicians trained by HP should attempt to remove the system board. If you believe the system board requires replacement, contact HP Technical Support ("Contacting HP" on page 185, "Contacting HP technical support or an authorized reseller" on page 185) before proceeding. Action: Replace the board on which the processor is installed. Uncorrectable Memory Error (Slot X, Memory Module Y)... ...
Location: Server blade management module Action: Perform the following steps to resolve the problem. Stop when the problem is resolved. 1. Press the server blade management module reset button. 2. Replace the server blade management module. Server blade management module signal backplane error codes LED code: 10-1, 10-2, or 10-3 Location: Server blade management backplane Action: Perform the following steps to resolve the problem. Stop when the problem is resolved. 1.
For more information, refer to the HP BladeSystem Maintenance and Service Guide on the HP website (http://www.hp.com/products/servers/proliant-bl/p-class/info). 3. Replace the interconnect device. For more information, refer to the HP BladeSystem Maintenance and Service Guide on the HP website (http://www.hp.com/products/servers/proliant-bl/p-class/info).
For more information, refer to the HP BladeSystem Maintenance and Service Guide on the HP website (http://www.hp.com/products/servers/proliant-bl/p-class/info). Interconnect Module B (10-Connector) Error Code LED code: 16-1 or 16-2 Location: Interconnect module - side B (10-connector) Action: Perform the following steps to resolve the problem. Stop when the problem is resolved. 1. Press the server blade management module reset button. 2. Reseat the interconnect module.
Location                 LED codes
Power Supply - Slot 3    3-1 or 3-2
Power Supply - Slot 4    4-1 or 4-2
Power Supply - Slot 5    5-1 or 5-2
Power Supply - Slot 6    6-1 or 6-2
Action: Perform the following steps to resolve the problem. Stop when the problem is resolved.
1. Reseat the power supply. For more information, refer to the HP BladeSystem Maintenance and Service Guide on the HP website (http://www.hp.com/products/servers/proliant-bl/p-class/info).
2. Reseat the power management module.
3.
For example, if the port 85 code displays "31h," see "Processor-related port 85 codes (on page 180)" for more information.
Port 85 code    Description
3xh             Port 85 codes in this format indicate processor-related errors. See "Processor-related port 85 codes (on page 180)" for more information.
4xh             Port 85 codes in this format indicate memory-related errors. See "Memory-related port 85 codes (on page 181)" for more information.
6xh             Port 85 codes in this format indicate expansion board-related errors.
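The port 85 code formats listed above can be read as a lookup on the first hexadecimal digit of the code. The following is a minimal sketch in Python. The function name `classify_port_85` and the "other" fallback are assumptions made for illustration. Only the 3xh, 4xh, and 6xh categories come from the table above, and any other leading digit is reported as not covered rather than guessed.

```python
# Categories transcribed from the port 85 table above (first hex digit of the code).
PORT_85_CATEGORIES = {
    "3": "processor",
    "4": "memory",
    "6": "expansion board",
}

HEX_DIGITS = "0123456789abcdef"


def classify_port_85(code):
    """Map a two-digit port 85 code such as "31h" to the category named in the table."""
    text = code.strip().lower()
    if text.endswith("h"):          # drop the trailing "h" hex marker if present
        text = text[:-1]
    if len(text) != 2 or any(ch not in HEX_DIGITS for ch in text):
        raise ValueError(f"not a two-digit hexadecimal port 85 code: {code!r}")
    # Assumption: codes outside the listed formats are simply labeled "other".
    return PORT_85_CATEGORIES.get(text[0], "other")


if __name__ == "__main__":
    for sample in ("31h", "4Ah", "6F", "70h"):
        print(sample, "->", classify_port_85(sample))
```

The category only indicates which troubleshooting procedure to open. The numbered steps in the matching port 85 section still apply.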
CAUTION: Before replacing or reseating any processors, be sure to follow the guidelines provided in "Performing processor procedures in the troubleshooting process (on page 16)." Failure to follow the recommended guidelines can cause damage to the system board requiring replacement of the system board. 4. Reseat the remaining processors, rebooting after each installation to identify any failed processors. IMPORTANT: Populate the processors in the following order: 1, 2, 4, 3.
5. Replace the DIMMs with a remaining bank of memory. 6. Replace the memory board, if applicable. 7. Replace the system board. IMPORTANT: If replacing the system board or clearing NVRAM, you must re-enter the server serial number through RBSU ("Re-entering the serial number and product ID" on page 71). Expansion board-related port 85 codes Expansion board-related port 85 codes display in the format 6xh. IMPORTANT: Reboot the server after completing each numbered step.
IMPORTANT: Reboot the server after completing each numbered step. If the error condition continues, proceed with the next step. 1. Bring the server to base configuration by removing all components that are not required by the server to complete POST. For more information, see "Breaking the server down to the minimum hardware configuration (on page 16).
Message ID: 4140 Severity: Warning Description: The system is operating with a heterogeneous processor environment. Action: None Message ID: 4141 Severity: Warning Description: Only X out of the X installed processors have been started by the operating system. The system will continue to operate. Action: Confirm that the license agreement in use supports all of the installed processors.
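When event log entries are reviewed in bulk, the message IDs above can be kept in a small lookup table so that each ID is paired with its severity and recommended action. The following is a minimal sketch in Python. The dictionary layout and the function name `describe_event` are hypothetical. The entries are transcribed from the two messages above, and no other message IDs are assumed.

```python
# Entries transcribed from the two messages above; the layout is illustrative only.
EVENT_MESSAGES = {
    4140: {
        "severity": "Warning",
        "description": "The system is operating with a heterogeneous processor environment.",
        "action": "None",
    },
    4141: {
        "severity": "Warning",
        "description": (
            "Only X out of the X installed processors have been started by the "
            "operating system. The system will continue to operate."
        ),
        "action": (
            "Confirm that the license agreement in use supports all of the "
            "installed processors."
        ),
    },
}


def describe_event(message_id):
    """Return a one-line summary for a listed message ID, or a placeholder otherwise."""
    entry = EVENT_MESSAGES.get(message_id)
    if entry is None:
        return f"Message ID {message_id}: not listed in this sketch"
    return (
        f"Message ID {message_id} [{entry['severity']}] "
        f"{entry['description']} Action: {entry['action']}"
    )


if __name__ == "__main__":
    for message_id in (4140, 4141, 9999):
        print(describe_event(message_id))
```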
Contacting HP Contacting HP technical support or an authorized reseller Before contacting HP, always attempt to resolve problems by completing the procedures in this guide. IMPORTANT: Collect the appropriate server information and operating system information ("Operating system information you need" on page 186) before contacting HP for support. For United States and worldwide contact information, see the Contact HP website (http://www.hp.com/go/assistance).
For more information on obtaining the Onboard Administrator SHOW ALL report, see the HP website (http://h20000.www2.hp.com/bizsupport/TechSupport/Document.jsp?lang=en&cc=us&objectID=c02843807).
o IRQ and I/O address information in text format
• An updated Emergency Repair Diskette
• If HP drivers are installed:
o Version of the PSP used
o List of drivers from the PSP
• The drive subsystem and file system information:
o Number and size of partitions and logical drives
o File system on each logical drive
• Current level of Microsoft® Windows® Service Packs and Hotfixes installed
• A list of each third-party hardware component installed, with the firmware revision
• A list of each
o List of drivers from the PSP (/var/log/hppldu.
o Output of /etc/ifconfig command o /etc/conf/cf.d/sdevice o /etc/inittab o /etc/conf/cf.d/stune o /etc/conf/cf.d/config.h o /etc/conf/cf.
• Warp Server version used and: o Whether Entry, Advanced, Advanced with SMP, or e-Business o All services running at the time the problem occurred • A list of each third-party hardware component installed, with the firmware revisions • A list of each third-party software component installed, with the versions • A detailed description of the problem and any associated error messages Oracle Solaris operating systems Collect the following information: • Operating system version number • Type of
Acronyms and abbreviations
ABEND   abnormal end
ACPI    Advanced Configuration and Power Interface
ACU     Array Configuration Utility
ADG     Advanced Data Guarding (also known as RAID 6)
ADU     Array Diagnostics Utility
AMP     Advanced Memory Protection
ASR     Automatic Server Recovery
BMC     baseboard management controller
CS      cable select
DMA     direct memory access
DU      driver update
EFS     Extended Feature Supplement
ESD     electrostatic discharge
FBDIMM  fully buffered DIMM
FDT     Firmware Deployment Tool
HP SIM  HP Systems Insight Manager
HP SUM  HP Smart Update Manager
IDE     integrated device electronics
iLO     Integrated Lights-Out
iLO 2   Integrated Lights-Out 2
iLO 3   Integrated Lights-Out 3
IMD     Integrated Management Display
IML     Integrated Management Log
IRQ     interrupt request
KVM     keyboard, video, and mouse
LVD     low-voltage differential
MMX     multimedia extensions
NMI     nonmaskable interrupt
NVRAM   nonvolatile memory
OBDR    One Button Disaster Recovery
ORCA    Option ROM Configuration for Arrays
PCI-X   peripheral component interconnect extended
POST    Power-On Self Test
PPM     processor power module
PSP     HP ProLiant Support Pack
PXE     preboot execution environment
RBSU    ROM-Based Setup Utility
RIS     reserve information sector
RPM     Red Hat Package Manager
SAS     serial attached SCSI
SATA    serial ATA
SIM     Systems Insight Manager
SIMM    single inline memory module
SP1     Service Pack 1
SPP     HP Service Pack for ProLiant
SSD     support software diskette
TPM     Trusted Platform Module
UPS     uninterruptible power system
USB     universal serial bus
VCA     Version Control Agent
VCRM    Version Control Repository Manager
Index A accelerator error log 90 accelerator status 91, 92, 93 ACPI support 60 ACU (Array Configuration Utility) 69 adapters 93, 98 additional information 86 ADG enabler dongle is broken or missing 93 ADU (Array Diagnostic Utility) 75 ADU error messages 90, 110 Advanced ECC support 120, 121 Advanced Memory Protection (AMP) 120, 121 advisories 85 application software problems 62 array accelerator 110 array accelerator battery pack 168 array accelerator board 90, 91, 93, 97, 98, 100, 108, 110, 147, 150, 168 a
critical error 121 CSR (customer self repair) 184 customer self repair (CSR) 184 D data loss 41 data recovery 41, 45 DDR3 memory configuration 88 deployment, offline 82 deployment, online 82 device driver information 88 diagnose tab, HP Insight Diagnostics 74 diagnosing problems 74 diagnostic tools 66, 71, 74 diagnostics 69 Diagnostics tasks 69 diagnostics utility 74 dial tone 55 DIMM installation guidelines 49 DIMMs 19, 124, 126, 128, 129, 134, 135, 136, 137 dirty data 91 disable command issued 95 disaste
graphics card option 52 guided troubleshooting 85 guidelines, cabling 19 H hard drive guidelines 19 hard drive LED combinations 20 hard drive LEDs 20, 43 hard drive problems, diagnosing 43 hard drive, failure of 44 hard drives, determining status of 20 hard drives, moving 44, 45 hardware features 36 hardware problems 36, 38 hardware supported 36 hardware troubleshooting 36, 38, 39, 41, 52 health driver 71 health LEDs 20 hot fixes 60 hot-plug PCI slot, power fault 126 how to use this guide 11 HP BladeSystem
memory, mirrored 68, 69, 137 memory, RAID 68, 137 memory-related port 85 codes 180 Microsoft operating systems 185 minimum hardware configuration 16 Mini-SAS cable 55 mirror data miscompare 100 mirrored memory 68, 69, 121 miscellaneous port 85 codes 181 modem problems 55 modems 55, 56 monitor 52, 53 mouse 54 mouse problems 54 N network connection problems 63 network controller problems 57 network controllers 57, 58 network interconnect blades 58 new hardware 38 NMI event 122, 123, 125 no dial tone 55 non-s
PPM (processor power module) 49, 50, 124 PPM failure LEDs 49, 50 PPM problems 49, 138 PPM slots 49 pre-diagnostic steps 12 predictive failure errors detected 103 preparation procedures 15 preparing the server for diagnosis 15 printer problems 54 printers 54 processor correctable error threshold passed 172 processor error codes 84, 182 processor failure LEDs 50 processor installation tool 16 Processor Power Module (PPM) 49, 50 processor problems 16, 50, 84, 124, 125, 137, 182, 183 processor stepping 84, 130 p
Service Packs 60, 77 shared ports 161 shared SAS port connection conflict 161 short circuits 37 Smart Array SCSI Diagnosis feature 43, 74 Smart Update Manager 77, 78, 81 SmartStart autorun menu 66 SmartStart Scripting Toolkit 66 SmartStart software 88 SmartStart, overview 66 software 59, 66, 88 software errors 62 software failure 62 software problems 59 software resources 66, 88 software troubleshooting 59, 62 specifications, option 86 specifications, server 86, 87 SPP 77 start diagnosis flowchart 24 static
W warning messages 108, 109 warnings 13, 86 website, HP 85, 86 websites, reference 22, 85 what's new 8 when to reconfigure or reload software 61 white papers 85, 87 Windows Event Log processor error codes 182