Intel Server Management (ISM) Installation and User's Guide, Version 5.5.5 - HP Carrier-Grade Server cc3310

Client SSU (CSSU) Details
- Sets the Last Error Update value to During PIC Runtime, indicating the update occurred while the
  system was operational
The BIOS stops logging noncritical single-bit errors when the SBE error count reaches nine. This prevents
the errors from filling the SEL. Upon system reboot, the OS uses the SEL records, along with the results
from its own memory test, to map out bad memory by reducing the usable size of a memory bank to avoid
using the bad memory element(s). This elimination of hard errors is a precaution that prevents single-bit
errors from becoming multiple-bit errors after the system has booted, and also prevents single-bit errors
from being detected and logged each time the failed locations are accessed. Upon reboot, the single-bit error
count is set to zero in the SEL.
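The logging limit described above can be sketched as a small model. This is an illustration only, not the BIOS implementation: the class name `SbeLogger`, the `report_sbe` and `reboot` methods, and the device label "DIMM 1A" are hypothetical. The sketch assumes that nine errors are logged before logging stops, and that SEL records survive the reboot (the text says the OS reads them afterward).

```python
from dataclasses import dataclass, field

SBE_LOG_LIMIT = 9  # from the text: logging stops when the count reaches nine


@dataclass
class SbeLogger:
    """Hypothetical model of the BIOS single-bit error (SBE) log throttle."""
    sel: list = field(default_factory=list)  # stand-in for the System Event Log
    sbe_count: int = 0

    def report_sbe(self, device: str) -> bool:
        """Record one SBE; return True if it was written to the SEL."""
        if self.sbe_count >= SBE_LOG_LIMIT:
            return False  # limit reached: drop the event so the SEL does not fill
        self.sbe_count += 1
        self.sel.append(("SBE", device))
        return True

    def reboot(self) -> None:
        """Reset the SBE count to zero; SEL records are assumed to be kept."""
        self.sbe_count = 0


logger = SbeLogger()
results = [logger.report_sbe("DIMM 1A") for _ in range(12)]
logged = sum(results)  # number of the 12 reported errors that reached the SEL
logger.reboot()
```

In this model, errors reported after the limit are silently dropped until `reboot()` resets the count.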
Multiple-Bit Error (MBE) Handling
If a multiple-bit error occurs, the system generates a System Management Interrupt (SMI) that allows the
BIOS to log information about the error in the SEL, identifying the memory bank in which the error
occurred. However, on some systems, it is not possible to determine the exact memory device that caused a
multiple-bit error.
Because a multiple-bit error is a critical condition, upon logging the error the BIOS generates a
non-maskable interrupt (NMI) that halts the system. After the server reboots, this error is indicated as a
critical condition on the Memory Array and Memory Device in the health branch of PIC. The requested event
actions are carried out, and PIC:
- Increments the critical error count on the Sensor Settings tab
- Sets the Memory Device Error Type to MBE on the Sensor Information tab for the Memory Device
- Sets the Last Error Update value to Previous Boot, indicating the last update occurred during the last
  system boot
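The multiple-bit error sequence above (SMI, SEL entry, NMI halt, PIC update after reboot) can be sketched as follows. This is a rough model under stated assumptions, not PIC or BIOS code: `MemoryDeviceStatus`, `bios_handle_mbe`, `pic_update_after_reboot`, the record layout, the "Critical" health value, and the bank label "Bank 1" are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class MemoryDeviceStatus:
    """Hypothetical stand-in for the fields PIC shows for a memory device."""
    critical_error_count: int = 0
    error_type: str = "None"
    last_error_update: str = "None"
    health: str = "OK"


def bios_handle_mbe(sel: list, bank: str) -> str:
    """SMI handler sketch: log the failing bank to the SEL, then halt via NMI."""
    sel.append({"type": "MBE", "bank": bank})
    return "halted"  # the NMI stops the system; nothing else runs until reboot


def pic_update_after_reboot(sel: list, status: MemoryDeviceStatus) -> MemoryDeviceStatus:
    """After reboot, reflect each logged MBE in the PIC status fields."""
    for record in sel:
        if record["type"] == "MBE":
            status.critical_error_count += 1
            status.error_type = "MBE"
            status.last_error_update = "Previous Boot"
            status.health = "Critical"
    return status


sel = []
state = bios_handle_mbe(sel, bank="Bank 1")
status = pic_update_after_reboot(sel, MemoryDeviceStatus())
```

Note that the log record carries only the bank, matching the statement that the exact memory device cannot always be identified for a multiple-bit error.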
Comparison of Single-Bit Errors to Multiple-Bit Errors
Table 4-3 compares the steps taken with single-bit and multiple-bit errors.
Table 4-3 SBE and MBE Comparison
Memory Error Handling                    SBE                 MBE
---------------------------------------  ------------------  -----------------------------------
Generate SMI                             Yes                 Yes
Log information includes                 Exact SIMM or DIMM  Memory bank only
Action after SEL logging                 Continue operation  Stop the system
Indicated by PIC screen changes          Immediately         After the system reboots
Bad memory is mapped out at next reboot  Yes                 Yes (immediately after the failure)
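
For scripting purposes, the comparison in Table 4-3 could be held in a lookup structure such as the one below. The dictionary name `HANDLING`, its field names, and the `differences` helper are illustrative assumptions, not part of ISM or PIC; the values are copied from the table.

```python
# Table 4-3 encoded as a lookup; keys and field names are illustrative only.
HANDLING = {
    "SBE": {
        "generates_smi": True,
        "log_detail": "Exact SIMM or DIMM",
        "after_sel_logging": "Continue operation",
        "pic_indication": "Immediately",
        "mapped_out_at_next_reboot": True,
    },
    "MBE": {
        "generates_smi": True,
        "log_detail": "Memory bank only",
        "after_sel_logging": "Stop the system",
        "pic_indication": "After the system reboots",
        "mapped_out_at_next_reboot": True,
    },
}


def differences(a: str, b: str) -> list:
    """Return the names of the fields on which two error types differ."""
    return [key for key in HANDLING[a] if HANDLING[a][key] != HANDLING[b][key]]


diff = differences("SBE", "MBE")
```

Here `diff` lists the three rows on which the two error types differ: the log detail, the action after SEL logging, and when PIC shows the change.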
PCI Hot-Plug Device
This sensor screen displays information about each PCI hot-plug device installed in a PHP slot.
Power Supply and Power Unit
The Power Supply sensor screen shows information about each power supply.
The Power Unit represents power-supply redundancy. For systems that support it, PIC monitors the status
of the power supplies in the managed server. The Power Unit sensor screen displays information about, and
the status of, each power unit.