NonStop NS-Series Planning Guide (H06.04+)

HP Integrity NonStop NS-Series Planning Guide 529567-005
° Allow each PE to individually and deterministically respond to asynchronous
  incoming interrupts and then to respond collectively as a single logical
  processor.
° Exchange software state information when performing operations that are
  distributed across PEs; for example, memory reintegration, error handling, and
  memory scrubbing.
Compare the output from each PE. If the outputs are identical, the output is
transmitted over the ServerNet fabrics. If the PE outputs are not the same,
appropriate actions occur to identify the errant PE and to recover from the failure.
Under some failure conditions, it might be necessary to stop normal operation of the
erring PE.
Memory Reintegration
Memory reintegration initiates processing in a PE whose operation has been stopped
because the NonStop Blade Element diverged or has been replaced. This reintegration
requires that all of the memory and processor states be copied from a functioning PE
to the target PE. Once the memory and processor state data is copied, rendezvous is
used to complete the reintegration. This entire reintegration operation is invisible to the
running applications.
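
The reintegration sequence described above (copy memory and processor state from a functioning PE, then rendezvous) can be illustrated with a minimal software sketch. This is only a model for planning discussions: the class ProcessorElement, the function reintegrate, and the way rendezvous is reduced to a state comparison are all assumptions made for this example, and the real operation is carried out by the NonStop hardware and system software, not by application code.

```python
from dataclasses import dataclass, field


@dataclass
class ProcessorElement:
    """Hypothetical stand-in for a PE: some memory, some processor state."""
    name: str
    memory: bytearray = field(default_factory=lambda: bytearray(16))
    registers: dict = field(default_factory=dict)
    running: bool = False


def reintegrate(source: ProcessorElement, target: ProcessorElement) -> None:
    """Copy all state from a functioning PE to the target PE, then resume it."""
    if not source.running:
        raise ValueError("source PE must be functioning")

    # Step 1: copy all memory and processor state to the target PE.
    target.memory = bytearray(source.memory)
    target.registers = dict(source.registers)

    # Step 2: rendezvous, modeled here as a check that both PEs hold
    # identical state before the target resumes processing.
    if target.memory != source.memory or target.registers != source.registers:
        raise RuntimeError("state mismatch; reintegration not complete")

    target.running = True
```

In this sketch the target PE only resumes after its state matches the source, which mirrors the statement that rendezvous completes the reintegration.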
Failure Recovery for Duplex Processor
Duplex processors have no single points of failure. Any single element of a duplex
processor might fail, but alternative paths exist for operation of user applications.
Failure of a complete NonStop Blade Element reduces the system to operation on the
running NonStop Blade Element. The failure of an LSU might take down the associated
logical processor, but in this event, the operating system activates the backup
processes in other logical processors. The system remains available to the
applications as if no failure occurred.
The errant processor is reset and then it is synchronized with the running one. If the
failure rate exceeds a predetermined threshold value within a period of time, the failing
processor is reset and held for repair action.
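
The reset-or-hold decision described above can be sketched as a sliding-window failure counter. This guide does not state the threshold or the length of the time period, so both are parameters here; the class name FailureMonitor, the method record_failure, and the two returned action strings are hypothetical names chosen for this example.

```python
from collections import deque


class FailureMonitor:
    """Counts failures inside a sliding time window (illustrative only)."""

    def __init__(self, threshold: int, window_seconds: float) -> None:
        self.threshold = threshold            # assumed value, not from the guide
        self.window_seconds = window_seconds  # assumed value, not from the guide
        self._failures = deque()

    def record_failure(self, now: float) -> str:
        """Record a failure at time `now` and return the recovery action."""
        self._failures.append(now)
        # Discard failures that fall outside the time window.
        while now - self._failures[0] > self.window_seconds:
            self._failures.popleft()
        # More failures than the threshold within the window: hold for repair.
        if len(self._failures) > self.threshold:
            return "reset_and_hold_for_repair"
        # Otherwise the processor is reset and synchronized with the running one.
        return "reset_and_resynchronize"
```

For example, with a threshold of 2 failures in 60 seconds, a third failure inside the same 60-second period returns the hold action, while failures spaced further apart than the window continue to return the resynchronize action.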
Failure Recovery for Triplex Processor
In triplex processors, each LSU has inputs from the three processor elements within a
logical processor. As with the duplex processor, the LSU keeps the three PEs in loose
lockstep. The LSU also checks the outputs from the three PEs. If the output from one of
the PEs differs from that of the other two, the errant result is ignored, and the result
from the other two PEs is sent to the ServerNet fabrics. Reintegration works the same
as in the duplex processor. The number of PEs in a reintegration depends on the
conditions of the failure and the configuration of the hardware.
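
The triplex output check described above amounts to a two-out-of-three majority vote. The following sketch shows that logic in software under stated assumptions: the function name vote, the PE labels, and the use of byte strings as outputs are hypothetical, and the actual comparison is performed by the LSU hardware.

```python
from collections import Counter


def vote(outputs: dict) -> tuple:
    """Return (agreed_output, errant_pe_names) for three PE outputs.

    `outputs` maps a PE label to the output that PE produced.
    """
    if len(outputs) != 3:
        raise ValueError("a triplex vote needs exactly three PE outputs")

    # Find the most common output value and how many PEs produced it.
    value, count = Counter(outputs.values()).most_common(1)[0]
    if count < 2:
        raise ValueError("no two PEs agree; no majority output exists")

    # Any PE whose output differs from the majority is treated as errant.
    errant = sorted(name for name, out in outputs.items() if out != value)
    return value, errant


# Usage: PE "C" disagrees, so its result is ignored and reported as errant.
agreed, errant = vote({"A": b"\x10", "B": b"\x10", "C": b"\x11"})
```

Here agreed holds the output shared by PEs A and B, which is the result that would be sent to the ServerNet fabrics, and errant lists the PE that would become a candidate for reintegration.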
The failure of a NonStop Blade Element in a triplex processor reduces processor
operation to duplex. When the failing unit is replaced, the reintegration function