Specifications
5102ch04.fm Draft Document for Review May 12, 2014 12:46 pm
114 IBM Power System S822 Technical Overview and Introduction
Depending on the configuration of the system, the HMC IBM Service Focal Point™, the OS
Service Focal Point, or the service processor receives notification of the failed component
and triggers a service call.
4.2.4 Cache protection
POWER8 processor-based systems are designed with cache protection mechanisms,
including cache-line delete in both the L2 and L3 arrays, processor instruction retry and alternate
processor recovery protection on L1-I and L1-D, and redundant “repair” bits in the L1-I, L1-D, and
L2 caches and in the L2 and L3 directories.
L1 instruction and data array protection
The POWER8 processor instruction and data caches are protected against intermittent errors
by processor instruction retry and against permanent errors by alternate processor
recovery, both mentioned previously. The L1 cache is divided into sets. The POWER8 processor can
deallocate all but one set before doing a processor instruction retry.
In addition, faults in the Segment Lookaside Buffer (SLB) array are recoverable by the
POWER Hypervisor. The SLB is used in the core to do address translation calculations.
L2 and L3 array protection
The L2 and L3 caches in the POWER8 processor are protected with double-bit detect,
single-bit correct error correction code (ECC). Single-bit errors are corrected before the data is
forwarded to the processor, and the corrected data is subsequently written back to the L2 and L3 cache.
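The single-bit correct, double-bit detect behavior can be illustrated with a small sketch. The code below uses a textbook extended Hamming (8,4) code, which is an assumption chosen for brevity; the source does not describe the actual ECC layout or word width used in the POWER8 caches, and the function names `encode` and `decode` are hypothetical.

```python
def encode(nibble):
    """Encode 4 data bits into an 8-bit extended Hamming codeword (list of bits).

    Index 0 holds the overall parity bit; indexes 1, 2, and 4 hold the
    Hamming parity bits; indexes 3, 5, 6, and 7 hold the data bits.
    """
    code = [0] * 8
    code[3] = (nibble >> 0) & 1
    code[5] = (nibble >> 1) & 1
    code[6] = (nibble >> 2) & 1
    code[7] = (nibble >> 3) & 1
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    for bit in code[1:]:
        code[0] ^= bit  # overall parity makes the whole word even
    return code


def decode(code):
    """Return (status, nibble); nibble is None when the error is uncorrectable."""
    code = list(code)
    syndrome = 0
    overall = 0
    for position, bit in enumerate(code):
        overall ^= bit
        if bit:
            syndrome ^= position  # XOR of the positions of all set bits
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        # Odd overall parity: a single flipped bit at index `syndrome`
        # (index 0 is the overall parity bit itself). Correct it.
        code[syndrome] ^= 1
        status = "corrected"
    else:
        # Non-zero syndrome with even overall parity: two bits flipped.
        return "uncorrectable", None
    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return status, nibble
```

In this sketch, a word with one flipped bit decodes to the original data with status `corrected`, while a word with two flipped bits is reported as `uncorrectable` rather than silently miscorrected.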
In addition, the caches maintain a cache-line delete capability. When the number of correctable
errors detected on a cache line reaches a threshold, the data in the cache line can be purged
and the cache line removed from further operation without requiring a reboot. An
uncorrectable ECC error detected in the cache can also trigger a purge and delete of the cache
line. This results in no loss of operation when an unmodified copy of the data is held
in system memory, because the cache line can be reloaded from main memory. Modified data is handled
through Special Uncorrectable Error handling.
L2 and L3 deleted cache lines are marked for persistent deconfiguration on subsequent
system reboots until they can be replaced.
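The cache-line delete policy described above can be sketched as a small state model. This is a minimal illustration under stated assumptions, not the firmware's implementation: the threshold value `CE_THRESHOLD`, the class names, and the returned status strings are all hypothetical, because the source does not give the actual threshold or interfaces.

```python
from dataclasses import dataclass, field

# Hypothetical value; the source does not state the real threshold.
CE_THRESHOLD = 3


@dataclass
class CacheLineHealth:
    correctable_errors: int = 0
    deleted: bool = False


@dataclass
class CacheArray:
    lines: dict = field(default_factory=dict)           # line index -> CacheLineHealth
    persistent_deconfig: set = field(default_factory=set)  # survives reboots in this model

    def _delete(self, index):
        # Remove the line from further operation and remember it for later boots.
        self.lines[index].deleted = True
        self.persistent_deconfig.add(index)

    def report_error(self, index, correctable, modified=False):
        """Record an ECC event on a cache line and return the resulting action."""
        line = self.lines.setdefault(index, CacheLineHealth())
        if line.deleted:
            return "already-deleted"
        if correctable:
            line.correctable_errors += 1
            if line.correctable_errors < CE_THRESHOLD:
                return "corrected"
            self._delete(index)
            return "purged-and-deleted"
        # Uncorrectable error: the line is purged and deleted in either case.
        self._delete(index)
        # Unmodified data can be reloaded from main memory; modified data
        # goes through Special Uncorrectable Error handling instead.
        return "sue-handling" if modified else "reload-from-memory"
```

With the assumed threshold of 3, the third correctable error on the same line returns `purged-and-deleted`, and the line index then appears in `persistent_deconfig`.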
L4 cache protection
The POWER8 processor has an integrated memory buffer with L4 cache error protection
similar to the L3 cache error protection.
4.2.5 Special Uncorrectable Error handling
Although rare, an uncorrectable data error can occur in memory or cache. IBM POWER
processor-based systems attempt to limit the impact of an uncorrectable error to the least
possible disruption, using a well-defined strategy that first considers the data source.
Sometimes, an uncorrectable error is temporary in nature and occurs in data that can be
recovered from another repository, as in the following example:
Data in the instruction L1 cache is never modified within the cache itself. Therefore, an
uncorrectable error discovered in the cache is treated like an ordinary cache miss, and
correct data is loaded from the L2 cache.
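The instruction-cache case can be modeled in a few lines. The sketch below is a toy model, assuming a dictionary stands in for the L2 cache; the class `InstructionCache` and its methods are hypothetical names used only for illustration.

```python
class InstructionCache:
    """Toy L1 instruction cache backed by a read-only view of the L2 contents."""

    def __init__(self, l2):
        self.l2 = l2          # address -> instruction; stands in for the L2 cache
        self.lines = {}       # cached copies, never modified within the cache itself
        self.corrupt = set()  # addresses whose cached copy has an uncorrectable error

    def inject_uncorrectable_error(self, address):
        # Test hook: mark the cached copy at this address as unusable.
        if address in self.lines:
            self.lines[address] = None
            self.corrupt.add(address)

    def fetch(self, address):
        if address in self.lines and address not in self.corrupt:
            return self.lines[address], "hit"
        # An ordinary miss and an uncorrectable error take the same path:
        # discard the bad copy and load correct data from the L2 cache.
        self.corrupt.discard(address)
        self.lines[address] = self.l2[address]
        return self.lines[address], "reloaded"
```

Because the cached copy is never the only copy, the error path needs no recovery logic beyond the reload that a normal miss already performs.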