Memory Subsystem
5-8 Intel® E8870 Scalable Node Controller (SNC) Datasheet
The 18 main channel data bits cannot be evenly divided among the eight devices providing the
packet (which is why some symbols are 12 bits and others are 8 bits). Therefore, IDM defines a
rotating series of eight addresses over which each device drives each symbol.
Since the main channel request lines are duplicated for each of the four main channels, an address
error outside the SNC is highly likely to be detected prior to use. If an address error occurs on one
channel during a read, that channel will provide data from some other location. This will be
detected unless that data happens to form a valid codeword with the data in other channels. If a
fault occurs during a write, the error will be detected when that address is read.
The checkbits must be stored in memory so that a device that does not respond to a request
produces an illegal code. Since the main channels use open drain technology, the high termination
returns all 0s. Therefore, the SNC inverts all write data checkbits when they are submitted to
the RAC, and un-inverts read data checkbits when they are sampled.
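The effect of the checkbit inversion can be illustrated with a toy model. The sketch below assumes a linear code, for which all-zero data has all-zero checkbits. The 8-bit XOR-fold function toy_checkbits and the 8-bit checkbit width are hypothetical stand-ins chosen for illustration; they are not the ECC implemented by the SNC.

```python
CHECK_MASK = 0xFF  # toy width of 8 checkbits (assumption, not the SNC's real width)


def toy_checkbits(data: int) -> int:
    """Hypothetical linear check function: XOR of the bytes of the data word."""
    check = 0
    while data:
        check ^= data & 0xFF
        data >>= 8
    return check


def store(data: int) -> tuple:
    """Write path: the checkbits are inverted before they are submitted."""
    return data, ~toy_checkbits(data) & CHECK_MASK


def is_legal(data: int, stored_check: int) -> bool:
    """Read path: un-invert the sampled checkbits, then compare."""
    return (~stored_check & CHECK_MASK) == toy_checkbits(data)


# A device that does not respond returns all 0s for both data and checkbits.
silent_data, silent_check = 0, 0
```

In this model, is_legal(*store(x)) holds for any data word x, while is_legal(silent_data, silent_check) is false. Without the inversion, the all-zero response would match the all-zero checkbits of all-zero data and pass as a valid codeword.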
When an uncorrectable error is detected in memory write data, the memory ECC is poisoned. This
is achieved by inverting symbol g on RAC1 and RAC3.
5.2.5 Memory Device Failure Correction and Failure Isolation
In order to isolate all detectable errors to a Field Replaceable Unit (FRU), the data must be
partitioned into codewords so that each codeword comes from a single FRU. However, Memory
Device Failure Correction (MDFC) can only be achieved when the errors produced by a single
device are correctable.
In some cases there are as few as eight devices on an FRU (single-sided DIMM with X8 DRAMs).
If a codeword came entirely from one such FRU, and one of those DRAMs failed, one eighth of all the bits
in a codeword would be corrupted. This number of errors cannot be corrected with the number of
checkbits provided in standard 72-bit DIMMs. Therefore, both MDFC and isolation to the DIMM
have not been provided by E8870 chipsets for X8 devices.
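The checkbit shortfall can be checked with a short calculation. The sketch below is not taken from the E8870 documentation. It applies the Singleton bound, under which any code that corrects one symbol error needs at least two check symbols, and it assumes 8 checkbits per 72-bit transfer and a codeword built from a whole number of transfers ("beats") of one DIMM. The function names are hypothetical.

```python
def min_checkbits_single_symbol(symbol_bits: int) -> int:
    """Singleton bound: correcting one symbol error requires minimum
    distance >= 3, hence at least two check symbols."""
    return 2 * symbol_bits


def meets_singleton_bound(device_width: int, beats: int,
                          checkbits_per_beat: int = 8) -> bool:
    """Necessary (not sufficient) condition for correcting the failure of
    one device when the whole codeword comes from a single 72-bit DIMM."""
    symbol_bits = device_width * beats       # bits one device contributes
    available = checkbits_per_beat * beats   # checkbits in the codeword
    return available >= min_checkbits_single_symbol(symbol_bits)
```

For an X8 device the function returns False for any number of beats: the device contributes 8 bits per beat, so 16 checkbits per beat would be needed, and a 72-bit DIMM supplies only 8.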
5.2.5.1 FRU Isolation
The main channel data packet is provided by a DIMM Row, which consists of one DIMM side
on each of the four main channels. Isolation information is captured in the REDMEM register. The
channel field identifies one of the two DDR branches and the device field identifies the Chip Select
associated with the failed DIMM side on that DDR channel. For uncorrectable errors, only the
failed DIMM Row can be reliably isolated. However, if the error was correctable, the locator field
will identify the symbol in error.
By consulting Figure 5-1, the main channel on which the error occurred can be identified. This is
sufficient to identify the failed DIMM side.
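The isolation procedure can be sketched as follows. The sketch assumes that the channel, device, and locator fields have already been extracted from REDMEM; the register bit layout is not given here and is not modeled. The symbol_to_main_channel mapping is a placeholder for the information in Figure 5-1, and the names RedmemFields and isolate are hypothetical.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass
class RedmemFields:
    channel: int   # DDR branch (one of two)
    device: int    # Chip Select of the failed DIMM side on that branch
    locator: int   # symbol in error; meaningful for correctable errors only


def isolate(fields: RedmemFields, correctable: bool,
            symbol_to_main_channel: Mapping[int, int]) -> dict:
    """Return the narrowest unit the error can be isolated to."""
    result = {"ddr_branch": fields.channel, "chip_select": fields.device}
    if correctable:
        # The locator names the symbol in error; the mapping (a stand-in
        # for Figure 5-1) gives the main channel, and so the DIMM side.
        result["main_channel"] = symbol_to_main_channel[fields.locator]
    else:
        # Uncorrectable: only the DIMM Row is reliably known.
        result["main_channel"] = None
    return result
```

A result with main_channel set to None indicates that only the DIMM Row (DDR branch and Chip Select) was identified.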
5.2.6 Memory Test
The SNC can autonomously test memory while code is running to support fast boots. The test will
initialize memory to legal ECC values. Section 3.7.7, "MTS: Memory Test and Scrub Register,"
describes the register that controls memory testing. The MIRNUM field in the MTS register
defines the Memory Interleave Range to be tested, and the MTS.GO bit provides run control.
The engine tests memory prior to normal operation. During memory test, 32 data buffers are not
available for normal operation. The Local, Remote, and Total data buffer quotas must be reduced by
this amount. The test engine is local to the memory controller and cannot generate remote memory
accesses.