Interface Card Critical Resource Analysis (CRA) Whitepaper for HP-UX 11i v3

Scenario 23
System Configuration:
HP-UX OS running in a Service Guard (SG) cluster environment. The cluster is configured with a
common lock disk shared between the cluster nodes, and the lock disk has multiple lunpath(s)
through different cards, controlled by different I/O drivers.
Operation Performed:
# kcmodule <interface driver associated with card>=unused
Criticality reported by CRA:
No user impact
Explanation and user action required:
The above kcmodule operation on any one of the I/O drivers controlling a card that provides a
lunpath to the common lock disk will not have any user impact.
If the common lock disk has been configured using a path-dependent device special file, then the
above operation will report a CRA_WARNING and the user will be allowed to proceed with the operation.
No user action is required.
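The rule above can be illustrated with a minimal sketch that maps a lock disk device special file (DSF) name to the result CRA is described as reporting. The function name expected_cra_result and the name patterns are assumptions for illustration: the sketch treats DSFs under /dev/dsk and /dev/rdsk as path dependent and DSFs under /dev/disk and /dev/rdisk as path independent, following the usual HP-UX 11i v3 naming. It does not query CRA or the cluster configuration.

```shell
#!/bin/sh
# Sketch only: maps a lock disk DSF name to the CRA result described above.
# Assumption: legacy (path-dependent) DSFs live under /dev/dsk or /dev/rdsk,
# persistent (path-independent) DSFs under /dev/disk or /dev/rdisk.
# expected_cra_result is a hypothetical helper, not an HP-UX command.
expected_cra_result() {
    case "$1" in
        /dev/dsk/*|/dev/rdsk/*)   echo "CRA_WARNING" ;;
        /dev/disk/*|/dev/rdisk/*) echo "No user impact" ;;
        *)                        echo "unknown" ;;
    esac
}
```

For example, expected_cra_result /dev/dsk/c4t0d1 prints CRA_WARNING, where the device name is a placeholder.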
Cards in PCI Error and Suspended State Scenarios
Scenario 24
System Configuration:
HP-UX OS is running and an interface card in one of the PCI I/O slots is in PCI
ERROR state. That is, ioscan(1M) displays the nodes below the slot in ERROR state.
Operation Performed:
# olrad -C | -r | -d <slot id>
Criticality reported by CRA:
CRA_WARNING
Explanation and user action required:
The olrad(1M) command with the -C option reports the resource usage and its criticality.
Under normal conditions, when an I/O slot occupied by a card encounters a PCI error, the
automatic PCI error recovery feature of HP-UX 11i v3 clears the PCI error and makes the slot and
card functional again. No user intervention is needed to recover from this scenario.
However, under certain conditions (for example, when the card is faulty), the automatic PCI error
recovery may fail to clear the error. The slot is then left in ERROR state to prevent a system crash
and the resulting system downtime.
In such a scenario, the user can choose to perform one of the following operations:
The user can suspend the card using the olrad -r option. Then, a manual recovery operation
using the olrad -R option can be attempted to recover from the error. If the card is in a
recoverable state, the card should start functioning normally after the olrad -R operation.
However, if the card is faulty, the above manual recovery operation may also fail. In that
case, the user can remove the faulty card from the slot, replace it with a new card of the same
type, and then run the olrad -R operation.
Alternatively, the user can choose to delete the card using the olrad -d option. Once this
operation is complete, the user can physically add a new card to the slot and run olrad -A,
and thus recover from the PCI error scenario.
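The two recovery paths above can be summarized as a dry-run sketch. The functions only print the olrad command sequence for a given slot and do not execute anything. The function names and the slot ID used when calling them are placeholders chosen for illustration, and the sketch assumes each option takes the slot ID as its argument, as in the operation shown for this scenario.

```shell
#!/bin/sh
# Dry-run sketch: prints the olrad sequences described above instead of
# running them. Function names and slot IDs are hypothetical.

# Path 1: suspend the card, then attempt manual recovery.
plan_suspend_recover() {
    slot="$1"
    echo "olrad -C $slot"   # report resource usage and criticality
    echo "olrad -r $slot"   # suspend the card
    echo "olrad -R $slot"   # attempt manual recovery
    echo "# if recovery fails: replace the card with one of the same type, then rerun olrad -R $slot"
}

# Path 2: delete the card, physically install a new one, then add it.
plan_delete_add() {
    slot="$1"
    echo "olrad -C $slot"   # report resource usage and criticality
    echo "olrad -d $slot"   # delete the card
    echo "# physically add the new card to slot $slot"
    echo "olrad -A $slot"   # add the new card
}
```

Calling plan_suspend_recover with the real slot ID prints the first sequence for review before any command is run on the system.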