PCI / PCIe Error Recovery Product Note, September 2010
1 PCI / PCIe Error Recovery Product Note
The PCI / PCIe Error Recovery feature provides the ability to detect, isolate, and automatically
recover from a PCI / PCIe error, avoiding a system crash. PCI Error Recovery is included with
the HP-UX 11i v3 operating system, and it is enabled by default.
NOTE: PCI / PCIe Error Recovery is not supported on all platforms. To determine if PCI / PCIe
Error Recovery is supported on your system, see the PCI Error Recovery Support Matrix, available
at http://www.hp.com/go/hpux-networking-docs in the PCI Error Recovery section.
With the PCI / PCIe Error Recovery feature enabled, if an error occurs on a PCI bus containing
an I/O card that supports PCI Error Recovery:
• The PCI bus is quarantined to isolate the system from further I/O and prevent the error from
damaging the system.
• The PCI Error Recovery feature will attempt to recover from the error and reinitialize the
bus so I/O can resume.
If an error occurs during the automated error recovery process, the bus and I/O card will remain
quiesced. If the bus contains a card that supports online addition, replacement, or deletion (OL*)
and the card is in a hotpluggable slot, you can use the olrad command (or the attention button)
to manually recover from the error by replacing the card.
For information on OL* operations, see the Interface Card OL* Support Guide, available at:
http://www.hp.com/go/hpux-core-docs
To determine if OL* is supported, see the I/O card documentation or support matrix available at
http://www.hp.com/go/hpux-iocards-docs
If the PCI Error Recovery feature is disabled and an error occurs on a PCI bus, a Machine Check
Abort (MCA) or a High Priority Machine Check (HPMC) will occur, and the system will crash.
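The outcomes described above can be summarized as a small decision function. The following is only
an illustrative sketch of the documented behavior, not HP-UX code: the names Outcome and
outcome_of_pci_error and all parameter names are hypothetical, and the sketch assumes the I/O card
on the affected bus supports PCI Error Recovery.

```python
from enum import Enum


class Outcome(Enum):
    SYSTEM_CRASH = "MCA or HPMC occurs and the system crashes"
    RECOVERED = "bus is reinitialized and I/O resumes"
    MANUAL_REPLACEMENT = "bus and card remain quiesced; replace the card using OL*"
    REMAINS_QUIESCED = "bus and card remain quiesced"


def outcome_of_pci_error(recovery_enabled, auto_recovery_succeeds,
                         card_supports_olr, slot_hotpluggable):
    """Return the documented outcome of an error on a PCI bus.

    Hypothetical model only; assumes the card supports PCI Error Recovery.
    """
    if not recovery_enabled:
        # Feature disabled: the error leads to an MCA or HPMC.
        return Outcome.SYSTEM_CRASH
    # Feature enabled: the bus is quarantined, then recovery is attempted.
    if auto_recovery_succeeds:
        return Outcome.RECOVERED
    # Automated recovery failed: manual recovery requires both an OL*-capable
    # card and a hotpluggable slot.
    if card_supports_olr and slot_hotpluggable:
        return Outcome.MANUAL_REPLACEMENT
    return Outcome.REMAINS_QUIESCED


if __name__ == "__main__":
    print(outcome_of_pci_error(True, False, True, True).value)
```

The function returns REMAINS_QUIESCED when manual replacement is not possible, because the text
states only that the bus and card stay quiesced in that case.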
NOTE: PCI / PCIe Error Recovery is enabled by default. If you use HP Serviceguard, HP
recommends enabling the PCI Error Recovery feature only if your storage devices are configured
with multiple paths and you have not disabled HP-UX native multipathing. If PCI Error Recovery
is enabled, but your storage devices are configured with only a single path, HP Serviceguard
might not detect when connectivity is lost. If HP Serviceguard does not detect the loss of
connectivity, it does not initiate a failover. For instructions on using the pci_eh_enable tunable
to disable PCI Error Recovery, see “Tunable Kernel Parameters”.
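The Serviceguard recommendation in the preceding note reduces to a two-condition check. The sketch
below only restates that rule; the function name and its parameters are hypothetical and do not
correspond to any HP-UX or Serviceguard interface.

```python
def recovery_recommended_with_serviceguard(storage_multipathed,
                                           native_multipathing_disabled):
    """Return True if the note's conditions for leaving PCI Error Recovery
    enabled on a Serviceguard system are both met (hypothetical helper)."""
    # Both conditions must hold: storage devices have multiple paths, and
    # HP-UX native multipathing has not been disabled.
    return storage_multipathed and not native_multipathing_disabled


if __name__ == "__main__":
    # Single-path storage: Serviceguard might not detect lost connectivity,
    # so the note advises against leaving the feature enabled.
    print(recovery_recommended_with_serviceguard(False, False))
```

When the function returns False, the note points to the pci_eh_enable tunable as the way to
disable the feature.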
If a PCI error occurs on an I/O card very early in the boot process or during an OL* online
addition operation, the I/O card will not be claimed and the software state of the I/O card will be marked
as UNUSABLE in the ioscan(1) output. To recover I/O cards that are in the UNUSABLE state, a
system reboot is required.
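As a rough aid for spotting such cards, saved ioscan output could be filtered for the UNUSABLE
state. The sketch below is an assumption-laden example: it makes no claim about the ioscan column
layout and simply looks for UNUSABLE as a whitespace-delimited field. The function name and the
sample text are made-up placeholders, not real ioscan output.

```python
from typing import List


def unusable_lines(ioscan_text: str) -> List[str]:
    """Return the lines that contain UNUSABLE as a standalone field.

    Hypothetical helper; matches on the token only, not on a column position.
    """
    return [line for line in ioscan_text.splitlines()
            if "UNUSABLE" in line.split()]


if __name__ == "__main__":
    # Placeholder text for illustration only, not real ioscan output.
    sample = "card-A  CLAIMED\ncard-B  UNUSABLE\n"
    for line in unusable_lines(sample):
        print(line)
```

Any card reported this way still requires a system reboot to recover, as stated above.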