Reliability, Availability, and Serviceability
Intel® E8870 Scalable Node Controller (SNC) Datasheet
6.2.2 Server Management (SM)
SM provides “out-of-band” error handling features for high-RAS systems based on the Itanium
processor family:
• Error logging.
• Remote diagnostics.
• Re-configuration (graceful degradation). After a catastrophic event, SM can analyze and
isolate faulty components and assist the system in booting after a reset.
6.2.3 OS/System Software
• Resume from correctable errors.
• Recover:
— Re-configuration: Disable malfunctioning hardware components without crashing the
system.
— Hot plug: online repair and upgrade.
— Shoot down crashed processes/threads/applications.
— Communicate I/O errors to device drivers.
• Reboot after fatal hardware errors and uncorrectable hardware errors that system software
cannot recover from.
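The three responses listed above can be read as an escalation by error severity. The following Python sketch illustrates that reading only. The `Action` enum, the `choose_action` function, and its three boolean parameters are hypothetical names introduced here; they are not part of the E8870 datasheet or of any operating system interface.

```python
from enum import Enum, auto


class Action(Enum):
    """Hypothetical labels for the three OS/system software responses."""
    RESUME = auto()   # continue execution after a correctable error
    RECOVER = auto()  # isolate the fault without crashing the system
    REBOOT = auto()   # last resort when software cannot recover


def choose_action(fatal: bool, correctable: bool,
                  recoverable_by_software: bool) -> Action:
    """Map assumed error attributes to one of the three responses."""
    if fatal:
        # Fatal hardware errors always lead to a reboot.
        return Action.REBOOT
    if correctable:
        # Correctable errors are handled and execution resumes.
        return Action.RESUME
    if recoverable_by_software:
        # Uncorrectable, but software can contain the damage
        # (re-configuration, hot plug, process shoot-down, driver notice).
        return Action.RECOVER
    # Uncorrectable and not recoverable by system software.
    return Action.REBOOT
```

For example, `choose_action(fatal=False, correctable=False, recoverable_by_software=True)` selects `Action.RECOVER`, which corresponds to the re-configuration, hot plug, shoot-down, and driver notification items in the list.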
6.2.4 Device Driver
• Retry failed I/O transactions.
• Support for fail-over.
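A driver that combines the two items above might retry a failed transaction a bounded number of times on one path and then fail over to an alternate path. The Python sketch below shows that pattern under stated assumptions: `IOFailure`, `submit_with_failover`, the callable "paths", and the default retry count of 3 are all hypothetical and do not come from the datasheet.

```python
from typing import Callable, Optional, Sequence


class IOFailure(Exception):
    """Hypothetical exception raised when an I/O transaction fails."""


def submit_with_failover(paths: Sequence[Callable[[], bytes]],
                         max_retries: int = 3) -> bytes:
    """Try each path in order, retrying up to max_retries times per path."""
    last_error: Optional[IOFailure] = None
    for path in paths:
        for _attempt in range(max_retries):
            try:
                # A path is any callable that performs the transaction.
                return path()
            except IOFailure as exc:
                # Remember the failure and retry on the same path.
                last_error = exc
        # Retries exhausted on this path: fail over to the next one.
    raise IOFailure("all paths failed") from last_error
```

Calling `submit_with_failover([primary, backup])` with two hypothetical path callables returns the result of the first successful attempt, and raises `IOFailure` only after every path has used up its retries.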