PCI Error Handling Product Note HP-UX Servers and Workstations HP Part Number: 5992-0539 Published: March 2007 Edition: Third Edition
© Copyright 2001-2007 Hewlett-Packard Development Company LP. All rights reserved Legal Notices The information in this document is subject to change without notice. Hewlett-Packard makes no warranty of any kind with regard to this manual, including, but not limited to, the implied warranties of merchantability and fitness for a particular purpose.
{}    The contents are required in formats and command descriptions. If the contents are a list separated by |, you must choose one of the items.
...   The preceding element may be repeated an arbitrary number of times.
|     Separates items in a list of choices.
Table of Contents
1 PCI Error Handling Product Note
    What is PCI Error Handling?
    Accessing and Installing the PCI Error Handling Feature
    Confirm PCI Error Handling is Supported
List of Tables
1-1 July 2008 Defect Fix on HP-UX 11i v2 OS
1 PCI Error Handling Product Note
What is PCI Error Handling?
The PCI Error Handling feature allows an HP-UX system to avoid a Machine Check Abort (MCA) or a High Priority Machine Check (HPMC) when a PCI error occurs (for example, a parity error). If a PCI error occurs on a bus without the PCI Error Handling feature installed, an MCA or an HPMC occurs and the system crashes.
| MP                     | 15.22 |      |      |      |
| ED                     | 3.13  |      |      |      |
| CLU                    | 15.2  | 15.2 | 15.2 | 15.2 |
| PM                     | 15.0  | 15.0 | 15.0 | 15.0 |
| CIO (bay 0, chassis 1) | 15.0  | 15.0 | 15.0 | 15.0 |
| CIO (bay 0, chassis 3) | 15.0  | 15.0 | 15.0 | 15.0 |
| CIO (bay 1, chassis 1) | 15.0  | 15.0 | 15.0 | 15.0 |
| CIO (bay 1, chassis 3) | 15.0  |      | 15.0 | 15.
Core IO
  Master : A.007.008    Event Dict. : 0.009
  Slave  : A.007.008    Event Dict. : 0.009
Cell 0 PDHC : A.003.027    Pri SFW : 23.001 (PA)    Sec SFW : 23.001 (PA)
Cell 1 PDHC : A.003.027    Pri SFW : 23.001 (PA)    Sec SFW : 23.001 (PA)
Cell 2 PDHC : A.003.027    Pri SFW : 23.001 (PA)    Sec SFW : 23.001 (PA)
Cell 3 PDHC : A.003.027    Pri SFW : 23.001 (PA)    Sec SFW : 23.
NOTE: The sysrev command output on some systems includes extra zeros in the system firmware version number. These zeros can be ignored. For example, 3.88 and 3.088 represent the same firmware version on HP Integrity systems, as do 23.1 and 23.001 on HP 9000 systems.
3. The system firmware is the main component of the firmware recipe required to support PCI Error Handling.
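The rule in the NOTE above (extra zeros in a firmware version number can be ignored) can be illustrated with a short Python sketch. This is only an illustration and not part of any HP tool. The function names are hypothetical, and the sketch assumes the version string is purely numeric and dot-separated, as in the examples 3.088 and 23.001. It does not handle versions with a letter prefix such as A.003.027.

```python
from typing import Tuple


def normalize_firmware_version(version: str) -> Tuple[int, ...]:
    """Convert a dotted numeric version string to a tuple of integers.

    Converting each component to int drops any extra leading zeros,
    so "3.088" and "3.88" both become (3, 88).
    Assumption: every component is a plain decimal number.
    """
    return tuple(int(part) for part in version.strip().split("."))


def same_firmware_version(a: str, b: str) -> bool:
    """Return True if two version strings differ only by extra zeros."""
    return normalize_firmware_version(a) == normalize_firmware_version(b)


if __name__ == "__main__":
    # The two pairs given in the NOTE
    print(same_firmware_version("3.88", "3.088"))
    print(same_firmware_version("23.1", "23.001"))
```

Comparing integer tuples rather than raw strings also gives a natural ordering, which may be convenient when checking a reported version against a required minimum.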
NOTE: In addition to installing the PCIErrorHandling bundle, the btlan, igelan, and iether drivers require patches to enable PCI Error Handling. Also, the latest versions of the fcd and mpt drivers must be installed to enable PCI Error Handling. The patch required for the btlan driver is included with the PCIErrorHandling bundle. The patches required for the igelan and iether drivers must be downloaded and installed separately from the IT Resource Center at http://www.itrc.hp.com.
New iether Driver Error Messages
The new error messages for the iether driver (Gigabit Networking) appear in the console log, as illustrated in the following examples:
-------------------100BT/Gigabit Ethernet LAN/9000 Networking---------------@#%
Thu Jan 24 MST 2008 21:50:49.540624 DISASTER Subsys:IETHER Loc:00000<1002>
1000Base-T in path 6/0/0/1/0 Was moved to DEAD state due to a PCI error.
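A message like the one above could be picked out of a saved console log with a small script. The following Python sketch is one possible approach, not an HP-supplied utility. The function name and the regular expression are assumptions based only on the single sample message shown here, so other drivers or message variants may need a different pattern.

```python
import re
from typing import Dict, Optional

# Pattern derived from the one sample message in this note (assumption:
# other PCI error messages follow the same wording).
PCI_ERROR_PATTERN = re.compile(
    r"Subsys:(?P<subsystem>\w+)"
    r".*?in path (?P<hw_path>[0-9/]+)"
    r"\s+Was moved to (?P<state>\w+) state due to a PCI error",
    re.DOTALL,  # the message may be wrapped across several log lines
)


def parse_pci_error_message(text: str) -> Optional[Dict[str, str]]:
    """Extract subsystem, hardware path, and state from a log message.

    Returns None if the text does not contain a matching message.
    """
    match = PCI_ERROR_PATTERN.search(text)
    return match.groupdict() if match else None


if __name__ == "__main__":
    sample = (
        "Thu Jan 24 MST 2008 21:50:49.540624 DISASTER Subsys:IETHER "
        "Loc:00000<1002>\n"
        "1000Base-T in path 6/0/0/1/0 Was moved to DEAD state due to a PCI error."
    )
    print(parse_pci_error_message(sample))
```

The extracted hardware path (6/0/0/1/0 in the sample) is the value to look for in the ioscan(1M) output when confirming the error state.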
4. and the same (or later) release version number, then repeat the Post Replace operation described in Step 2. If the Post Replace operation succeeds and the I/O card/slot recovers from the error, the software state of the components will be marked CLAIMED in the ioscan(1M) output. If you continue to experience errors on this slot, there is a high probability that the I/O card is bad.
C. Execute ioscan -kfnH on the iether driver to confirm the card is in error state:
Class  I  H/W Path   Driver  S/W State  H/W Type   Description
ba     0  6/0/0      lba     ERROR      BUS_NEXUS  Local PCI-X Bus Adapter(12ee)
lan    0  6/0/0/1/0  iether  ERROR      INTERFACE  HP A7012-60001 PCI/PCI-X 1000Base-T Dual-port Adapter
lan    1  6/0/0/1/1  iether  ERROR      INTERFACE  HP A7012-60001 PCI/PCI-X 1000Base-T Dual-port Adapter
D. To recover from the error, use the olrad -p off command to power off the slot:
# olrad -p off 0-1-1-0
E.
Activity : Target slot powered on, drivers resumed, OK to start using the card
Target slot : 0-1-1-0
G.
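The ioscan check in step C (confirming which components are in the ERROR software state) can also be done on captured output. The Python sketch below is a minimal illustration under stated assumptions: the function names are hypothetical, and it assumes each data row has the seven whitespace-separated columns shown in step C (Class, I, H/W Path, Driver, S/W State, H/W Type, Description), with only the description containing spaces. Rows that do not fit this shape, such as device file lines, are skipped.

```python
from typing import Dict, List

# Column names are labels chosen for this sketch, matching the header in step C.
IOSCAN_FIELDS = [
    "class", "instance", "hw_path", "driver",
    "sw_state", "hw_type", "description",
]


def parse_ioscan_rows(output: str) -> List[Dict[str, str]]:
    """Parse captured ioscan output into one dictionary per data row."""
    rows = []
    for line in output.splitlines():
        # Split into at most 7 parts so the description keeps its spaces.
        parts = line.split(None, 6)
        # Skip the header row and anything with fewer than 7 columns.
        if len(parts) < 7 or parts[0] == "Class":
            continue
        rows.append(dict(zip(IOSCAN_FIELDS, parts)))
    return rows


def paths_in_error(output: str) -> List[str]:
    """Return the hardware paths whose S/W State column reads ERROR."""
    return [
        row["hw_path"]
        for row in parse_ioscan_rows(output)
        if row["sw_state"] == "ERROR"
    ]


if __name__ == "__main__":
    sample = """\
Class I H/W Path Driver S/W State H/W Type Description
ba 0 6/0/0 lba ERROR BUS_NEXUS Local PCI-X Bus Adapter(12ee)
lan 0 6/0/0/1/0 iether ERROR INTERFACE HP A7012-60001 PCI/PCI-X 1000Base-T Dual-port Adapter
lan 1 6/0/0/1/1 iether ERROR INTERFACE HP A7012-60001 PCI/PCI-X 1000Base-T Dual-port Adapter
"""
    print(paths_in_error(sample))
```

After a successful recovery, the same check on fresh ioscan output would be expected to return an empty list, because the components are then marked CLAIMED.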
Table 1-1 July 2008 Defect Fix on HP-UX 11i v2 OS
Defect ID: QXCR1000812720
Description: The PCI Error Handling product bundle, delivered during 11.23 0712, as pcie_eh module supports error recovery on express slots. As this functionality is not applicable to PA-RISC, the module pcie_eh is delivered only for HP Integrity platforms.
Post Replace Operation - By issuing the olrad -R slot_id command after an I/O card is replaced, slot power is turned on, suspended drivers are resumed, driver scripts (post_replace) for the slot (slot_id) and affected slots (if any) are run, and the attention LED for the slot (slot_id) is set to OFF.