PCI / PCIe Error Recovery Product Note, September 2010

Automatic Recovery from a PCI Error
With the PCI Error Recovery feature enabled, if an error occurs on a PCI bus containing an I/O
card that supports PCI Error Recovery, the following sequence of events occurs during automatic
error recovery:
1. The PCI bus is isolated from further I/O
2. The I/O devices are quiesced
3. The error is cleared
4. The bus is reset
5. The devices are resumed
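The ordering of the five steps above can be sketched as a small state model. This is only an illustration of the sequence, not the kernel's implementation; the BusState names and the recover() function are hypothetical and do not correspond to any HP-UX interface.

```python
from enum import Enum


class BusState(Enum):
    """Hypothetical states, one per step in the list above."""
    ACTIVE = "active"
    ISOLATED = "isolated from further I/O"
    QUIESCED = "devices quiesced"
    ERROR_CLEARED = "error cleared"
    RESET = "bus reset"
    RESUMED = "devices resumed"


# Order of the automatic recovery steps as listed in the text.
RECOVERY_SEQUENCE = [
    BusState.ISOLATED,
    BusState.QUIESCED,
    BusState.ERROR_CLEARED,
    BusState.RESET,
    BusState.RESUMED,
]


def recover():
    """Walk a model bus through the recovery steps in order.

    Returns the final state and the list of (from, to) transitions.
    """
    state = BusState.ACTIVE
    transitions = []
    for next_state in RECOVERY_SEQUENCE:
        transitions.append((state, next_state))
        state = next_state
    return state, transitions


if __name__ == "__main__":
    final_state, transitions = recover()
    for step, (_, after) in enumerate(transitions, start=1):
        print(f"{step}. {after.value}")
```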
The following example illustrates what you can expect if automatic recovery from a PCI error
occurs:
1. Automatic recovery from a PCI error occurs on a PCI bus containing a LAN card at hardware
path 0/0/0, which is associated with the iether driver.
2. Error and recovery messages are displayed on the console as follows:
PCI Error reported at Hardware Path 0/0/0
Hardware path 0/0/0 Successfully recovered from PCI Error
3. The olrad -q command output will be normal after a PCI Error recovery. See path 0/0/0/1
in the following example:
# olrad -q
4. The ioscan -fnH command output will be normal after the PCI Error recovery, for example:
# ioscan -fnH 0/0/0
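If console output is captured to a file, the two messages shown in step 2 can be matched with a short script. The following is a minimal sketch that assumes the messages appear exactly in the form shown above; the recovery_status() helper is hypothetical and is not part of any HP-UX tool.

```python
import re

# Patterns taken from the two console messages shown in step 2.
REPORTED = re.compile(r"PCI Error reported at Hardware Path (\S+)")
RECOVERED = re.compile(
    r"Hardware path (\S+) Successfully recovered from PCI Error"
)


def recovery_status(lines):
    """Map each hardware path to its most recent state.

    The state is "reported" when only the error message has been seen
    and "recovered" once the recovery message follows it.
    """
    status = {}
    for line in lines:
        match = REPORTED.search(line)
        if match:
            status[match.group(1)] = "reported"
            continue
        match = RECOVERED.search(line)
        if match:
            status[match.group(1)] = "recovered"
    return status


if __name__ == "__main__":
    console = [
        "PCI Error reported at Hardware Path 0/0/0",
        "Hardware path 0/0/0 Successfully recovered from PCI Error",
    ]
    print(recovery_status(console))
```

A path that still maps to "reported" after the log has been read had an error with no matching recovery message.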
NOTE: If devices on a bus that supports PCI error recovery encounter further errors
within the time interval specified by the pci_error_tolerance_time tunable following
automatic error recovery, they remain quiesced. If this happens and the devices are in
hotpluggable slots, you can recover manually by using the olrad command or the attention
button to do an online replacement. You can also recover manually by using the olrad
command to do an online deletion.
For more information on the pci_error_tolerance_time tunable, see “Tunable Kernel
Parameters”.