Intel® E8870 Scalable Node Controller (SNC) Datasheet 6-7
Reliability, Availability, and Serviceability
For reliable signaling of errors in the system, each component guarantees that the pin associated
with the error is asserted within four system clock cycles (200 MHz) after the error is detected by
the component. For example, if a multi-bit ECC error is detected at the SP interface in cycle x, the
uncorrectable error pin (ERR[1]#) is asserted in cycle x+3.
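The timing budget above can be worked through numerically. The following is a minimal sketch, not part of the SNC specification: the function names are hypothetical, and it assumes the detection cycle counts as the first of the four cycles, which is the reading consistent with the x to x+3 example.

```python
SYSTEM_CLOCK_HZ = 200_000_000   # 200 MHz system clock, from the text
MAX_ASSERT_CYCLES = 4           # error pin asserted within four system clocks


def clock_period_ns(clock_hz=SYSTEM_CLOCK_HZ):
    """Return the system clock period in nanoseconds."""
    return 1e9 / clock_hz


def latest_assert_cycle(detect_cycle):
    """Return the latest cycle in which the error pin is asserted.

    Assumption: the detection cycle is counted as the first of the four
    cycles, matching the x -> x+3 example in the text.
    """
    return detect_cycle + MAX_ASSERT_CYCLES - 1
```

At 200 MHz the clock period is 5 ns, so the four-cycle window corresponds to 20 ns.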
6.1.4 Interface Details
Major interfaces in the chipset can be enabled/disabled via software to aid fault isolation. Any
requests routed to a disabled interface will be master-aborted. Any responses will be absorbed.
That is, no issue is required on the disabled interface, but the disabled interface must not assert
internal flow control.
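The handling of traffic routed to a disabled interface can be summarized in a short model. This is an illustrative sketch only: the interface names, the `enabled` map, and the `route` function are hypothetical and do not correspond to SNC registers or signals.

```python
from enum import Enum


class Outcome(Enum):
    FORWARDED = "forwarded"
    MASTER_ABORTED = "master-aborted"
    ABSORBED = "absorbed"


def route(kind, interface, enabled):
    """Classify what happens to a message routed to `interface`.

    kind: "request" or "response".
    enabled: dict mapping a (hypothetical) interface name to True or False,
        standing in for the software-controlled enable state.
    """
    if enabled[interface]:
        return Outcome.FORWARDED
    if kind == "request":
        return Outcome.MASTER_ABORTED   # requests to a disabled interface
    return Outcome.ABSORBED             # responses to a disabled interface
```

For example, with `enabled = {"sp0": False}`, a request routed to `"sp0"` is classified as master-aborted and a response as absorbed.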
6.1.4.1 Processor Bus
ECC or parity will be checked at the input pins only when the processor is an initiator of the
processor bus transaction.
Parity protection is provided on the Itanium 2 processor address bus and control bus. ECC
protection is provided on the Itanium 2 processor data bus.
BERR# is asserted on the processor bus by the SNC when BERRIN# is asserted. BINIT# is
asserted on the processor bus by the SNC when BINITIN# is asserted. Both signals are
asynchronous inputs, and are sampled over four consecutive system clocks (200 MHz) to filter
glitches. When an assertion edge of these signals is observed, BERR# and BINIT# on the node
are asserted according to the protocol requirements of the host bus. Multiple assertion edges of
BERRIN# and BINITIN# are ignored until the host bus assertion of BERR# and/or BINIT#
has been completed.
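The sampling and edge handling described above can be modeled in software. This is a minimal sketch under stated assumptions, not the SNC's actual logic: it assumes an assertion is accepted only when the input is sampled asserted on four consecutive clocks, and it models the host bus assertion as lasting a fixed, hypothetical number of clocks (`HOLD_CLOCKS`). The function name and parameters are illustrative.

```python
FILTER_CLOCKS = 4   # from the text: sampled over four consecutive system clocks
HOLD_CLOCKS = 3     # hypothetical duration of the host bus assertion, in clocks


def model_error_input(samples, hold_clocks=HOLD_CLOCKS):
    """Return the clock indices at which a host bus assertion starts.

    samples: sequence of 0/1 values, one per system clock, where 1 means the
    input (for example BERRIN#) is sampled asserted.
    """
    starts = []
    run = 0            # consecutive asserted samples seen so far
    filtered = 0       # glitch-filtered view of the input
    busy_until = -1    # last clock of an in-progress host bus assertion
    for clock, sample in enumerate(samples):
        run = run + 1 if sample else 0
        previous = filtered
        filtered = 1 if run >= FILTER_CLOCKS else 0
        edge = filtered and not previous
        # Assertion edges observed while the host bus assertion is still in
        # progress are ignored.
        if edge and clock > busy_until:
            starts.append(clock)
            busy_until = clock + hold_clocks - 1
    return starts
```

In this model a two-clock glitch produces no assertion, while a second valid edge is acted on only if it arrives after the modeled host bus assertion has finished.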
The SNC may not detect errors in implicit write-back data for full-line writes. In this case the
implicit write-back data is completely overwritten, so its value does not affect operation. The
error is deliberately not reported, to avoid unnecessary corrective action by software that might
reduce reliability: because software cannot tell that the uncorrected data was discarded, it might
kill an application or reset the system.
6.1.4.2 Scalability Port (SP)
Data is protected by ECC. ECC is checked only on entry of packets.
Flit transfers are protected by parity.
The information contained in SP control and idle flits is protected by both parity
and duplication (each field is duplicated on different wires to enhance error detection).
Link level retry is supported on the SP. Link level retry is entered when parity errors are
detected on flits, or when phits within an idle flit have a duplication error.
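The two retry conditions can be expressed as a simple check. This is a sketch for illustration, not the SP link protocol: it assumes even parity over the flit payload and represents duplicated fields as pairs of received copies. The parity sense, field layout, and function names are assumptions that the text does not specify.

```python
def even_parity(value):
    """Return the parity bit that makes the total number of 1 bits even."""
    return bin(value).count("1") & 1


def needs_link_retry(payload, parity_bit, field_copies=None):
    """Decide whether a received flit should trigger link level retry.

    payload: flit contents as a non-negative integer.
    parity_bit: parity bit received with the flit (even parity assumed).
    field_copies: for an idle flit, a list of (copy_a, copy_b) pairs, one
        pair per duplicated field; None for flits without duplication.
    """
    if even_parity(payload) != parity_bit:
        return True                      # parity error on the flit
    if field_copies is not None:
        for copy_a, copy_b in field_copies:
            if copy_a != copy_b:
                return True              # duplication error
    return False
```

A flit with correct parity and matching copies passes; a parity mismatch or any pair of copies that disagree requests a retry.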
6.1.4.3 DRAM
The addresses that are logged in the error registers are the decoded memory addresses
(channel, devices, bank, row, column). Software can determine from the log which bit failed
on correctable errors and which DIMM failed for uncorrectable errors.
Note: In the event of a multiple error, it may not be possible to isolate the failed device when the MDFC
feature is enabled.
If MDFC is enabled, correctable ECC errors will be corrected as the data is read from memory.