PCI Error Recovery Product Note, 3rd Edition, March 2008

For more information on manual recovery from a PCI error, see “Manual Recovery from a PCI
Error” on page 13.
Manual Recovery from a PCI Error
If another PCI error is detected within the time interval specified by the pci_error_tolerance_time
tunable after a successful automatic PCI error recovery, the card in the I/O slot is suspended. A manual
PCI error recovery operation is then required to restore the card.
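The rule above can be sketched as a small shell function. This is only an illustration of the described behavior, not the kernel's implementation. The function name decide_action, the use of epoch seconds for timestamps, the assumption that the tunable is expressed in minutes, and the strict "less than" boundary are all assumptions made for the example.

```shell
#!/bin/sh
# Illustrative sketch only; decide_action is a hypothetical helper.
# Usage: decide_action LAST_RECOVERY_EPOCH NEW_ERROR_EPOCH TOLERANCE_MINUTES
# Prints "suspend" if the new error falls inside the tolerance window that
# starts at the last successful automatic recovery, else "auto-recover".
decide_action() {
    last_recovery=$1
    new_error=$2
    tolerance_min=$3

    # Seconds elapsed since the last successful automatic recovery.
    elapsed=$(( new_error - last_recovery ))

    # Assumption: the tunable is in minutes and the boundary is exclusive.
    if [ "$elapsed" -lt $(( tolerance_min * 60 )) ]; then
        echo "suspend"
    else
        echo "auto-recover"
    fi
}
```

For example, with a tolerance of 1 minute, an error 30 seconds after the last recovery yields "suspend", while an error 100 seconds later yields "auto-recover".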
The following are examples of the error messages displayed on the console when a PCI error is detected
within the time interval specified by the pci_error_tolerance_time tunable after an automatic PCI
error recovery:
PCI Error reported at Hardware Path 0/0/0
Multiple PCI Errors reported at Hardware path 0/0/0 within
pci error tolerance time limit of 1 minutes.
Refer to pci_error_tolerance_time(5) man page for details.
Automatic PCI Error Recovery Operation failed at Hardware path 0/0/0.
Path may be recovered using a Manual Error Recovery operation.
Refer to olrad(1M) man page for details.
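When console text like the above has been captured to a file, the affected hardware path can be pulled out with a short filter. The following is a minimal sketch, assuming the message wording matches the example exactly; the function name failed_paths is hypothetical, and where console messages are captured on a given system is not specified here.

```shell
#!/bin/sh
# Illustrative sketch only; failed_paths is a hypothetical helper.
# Reads console text on stdin and prints each hardware path for which
# automatic PCI error recovery was reported as failed (duplicates removed).
failed_paths() {
    # "|" is used as the sed delimiter because hardware paths contain "/".
    sed -n 's|^Automatic PCI Error Recovery Operation failed at Hardware path \([0-9/][0-9/]*\)\..*|\1|p' |
        sort -u
}

# Example run against the messages shown above.
failed_paths <<'EOF'
PCI Error reported at Hardware Path 0/0/0
Multiple PCI Errors reported at Hardware path 0/0/0 within
pci error tolerance time limit of 1 minutes.
Refer to pci_error_tolerance_time(5) man page for details.
Automatic PCI Error Recovery Operation failed at Hardware path 0/0/0.
Path may be recovered using a Manual Error Recovery operation.
Refer to olrad(1M) man page for details.
EOF
# prints: 0/0/0
```

The filter keys on the "failed" message only, since that is the line indicating that a manual recovery operation may be needed.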
A successful attempt at manual recovery will restore the card. A failed attempt at manual recovery will
confirm that there is a persistent error condition.
To recover from the PCI error manually, follow these steps:
1. Execute the olrad -q command to confirm that the card is suspended. In the following example, the
device in slot 0-0-1-0, path 0/0/0/1, is suspended:
# olrad -q
                                                   Driver(s)
                                                   Capable
Slot      Path      Bus   Max  Spd  Pwr  Occu Susp OLAR OLD  Max   Mode
                    Num   Spd                                Mode
0-0-1-0   0/0/0/1   0     133  133  On   Yes  Yes  Yes  Yes  PCI-X PCI-X
0-0-1-1   0/0/1/1   256   133  66   On   Yes  No   Yes  Yes  PCI-X PCI
0-0-1-8   0/0/12/1  2304  133  66   On   Yes  No   Yes  Yes  PCI-X PCI
0-0-1-9   0/0/10/1  2048  133  133  Off  No   N/A  N/A  N/A  PCI-X PCI-X
0-0-1-10  0/0/9/1   1792  133  33   On   Yes  No   Yes  Yes  PCI-X PCI
0-0-1-11  0/0/8/1   1536  133  133  Off  No   N/A  N/A  N/A  PCI-X PCI-X
PCI-Express Slots Information
-----------------------------