User Guide

Chapter 2 Sun Blade T6320 Server Module Diagnostics

Memory Fault Handling
The Sun Blade T6320 server module uses advanced ECC technology, also called chipkill,
that corrects up to 4 bits in error on nibble boundaries, as long as all of the bits are in
the same DRAM. If a DRAM fails, the DIMM continues to function.
Note: The chipkill function is supported only on DIMMs that use “x4” DRAMs.
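
The "same DRAM, nibble boundary" condition can be illustrated with a small check. This is only a sketch under a simplifying assumption that is not stated in this guide: each x4 DRAM is modeled as supplying one aligned 4-bit nibble of the ECC word. The function name same_nibble and the bit numbering are hypothetical, and the code does not perform any error correction.

```python
def same_nibble(error_bits, nibble_width=4):
    """Return True if every flipped bit position lies in one aligned nibble.

    Assumption (not from the guide): bit positions are numbered from 0 and
    each x4 DRAM supplies one aligned group of `nibble_width` bits.
    """
    if not error_bits:
        return True  # no bits in error
    # Integer division maps a bit position to the index of its nibble.
    nibbles = {bit // nibble_width for bit in error_bits}
    return len(nibbles) == 1


print(same_nibble([8, 9, 10, 11]))  # all four bits fall in nibble 2
print(same_nibble([7, 8]))          # bits straddle nibbles 1 and 2
```

Under this model, the first pattern is the kind of error the text describes as correctable, while the second spans two DRAMs and falls outside that description.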
The following server module features manage memory faults independently:
POST: Runs when the server module is powered on (based on configuration
variables) and thoroughly tests the memory subsystem.
If a memory fault is detected, POST displays the fault with the FRU name of the
faulty DIMMs, logs the fault, and disables the faulty DIMMs by placing them in
the Automatic System Recovery (ASR) blacklist. For a given memory fault, POST
disables half of the physical memory in the system. When this occurs, you must
replace the faulty DIMMs based on the fault message and enable the disabled
DIMMs with the ILOM command set /SYS/component component_state=enabled.
Solaris Predictive Self-Healing (PSH) technology: A feature of the Solaris OS that
uses the fault manager daemon (fmd) to watch for various kinds of faults. When
a fault occurs, the fault is assigned a unique fault ID (UUID) and logged. PSH
reports the fault and recommends proactive replacement of the
DIMMs associated with the fault.

Troubleshooting Memory Faults
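
The POST description earlier in this section states that replaced DIMMs must be re-enabled with the ILOM set command. The following is a minimal sketch of how those command lines could be generated for a list of DIMMs. It only formats text and does not connect to ILOM. The helper name enable_commands is hypothetical, and the example FRU names are taken from the FB-DIMM table in this chapter purely for illustration.

```python
def enable_commands(fru_names):
    """Build one ILOM 'set ... component_state=enabled' line per FRU name.

    Hypothetical helper: it produces command text only; the commands still
    have to be typed at the ILOM prompt.
    """
    commands = []
    for fru in fru_names:
        # FRU names in this guide all start with /SYS/.
        if not fru.startswith("/SYS/"):
            raise ValueError(f"not a /SYS FRU name: {fru!r}")
        commands.append(f"set {fru} component_state=enabled")
    return commands


# Example FRU names copied from the FB-DIMM table (illustrative only).
for line in enable_commands(["/SYS/MB/CMP0/BR3/CH1/D0",
                             "/SYS/MB/CMP0/BR3/CH1/D1"]):
    print(line)
```

Generating the lines from the FRU names reported in the fault message helps avoid typing errors in the long component paths.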
If you suspect that the server module has a memory problem, follow the flowchart
(FIGURE 2-1). Type the ILOM command show /SP/faultmgmt. The
faultmgmt command lists memory faults and lists the specific DIMMs that are

FB-DIMM Configuration and Installation (Continued)

Branch Name  Channel Name  FRU Name                 Connector  Order*  Pair†
                           /SYS/MB/CMP0/BR3/CH0/D1  J2501      3       H
             Channel 1     /SYS/MB/CMP0/BR3/CH1/D0  J2601      2       G
                           /SYS/MB/CMP0/BR3/CH1/D1  J2701      3       H

* Upgrade path: DIMMs should be added with each group populated in the order shown.
† Fault replacement path: Each pair is addressed as a unit, and each pair must be identical.
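
The FRU names in the table follow a regular pattern, so a fault message that names a DIMM can be broken into its parts mechanically. The sketch below assumes that the BR, CH, and D fields correspond to the branch, channel, and DIMM position, which is an inference from the table's Branch Name and Channel Name columns rather than something this guide states outright. The names FRU_PATTERN and parse_fru are hypothetical.

```python
import re

# Pattern inferred from the FRU names shown in the table (an assumption).
FRU_PATTERN = re.compile(r"^/SYS/MB/CMP(\d+)/BR(\d+)/CH(\d+)/D(\d+)$")


def parse_fru(name):
    """Split a DIMM FRU name into its numeric fields."""
    match = FRU_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized DIMM FRU name: {name!r}")
    cmp_id, branch, channel, dimm = (int(group) for group in match.groups())
    return {"cmp": cmp_id, "branch": branch, "channel": channel, "dimm": dimm}


fields = parse_fru("/SYS/MB/CMP0/BR3/CH1/D0")
print(fields["branch"], fields["channel"], fields["dimm"])
```

For the example name, the parsed fields are branch 3, channel 1, and DIMM 0, matching the Channel 1 row of the table.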