NonStop NS14000 Series Planning Guide (H06.13+)
memory and processor states be copied from a functioning PE to the target PE. Once the memory
and processor state data is copied, rendezvous is used to complete the reintegration. This entire
reintegration operation is invisible to the running applications.
Failure Recovery for Duplex Processor
Duplex processors have no single points of failure. If any single element of a duplex processor
fails, alternative paths keep user applications running. Failure of a complete NonStop
Blade Element reduces the system to operation on the running NonStop Blade Element. The failure
of an LSU might take down the associated logical processor, but in this event, the operating system
activates the backup processes in other logical processors. The system remains available to the
applications as if no failure occurred.
The errant processor is reset and is then synchronized with the running one. If the failure rate
exceeds a predetermined threshold value within a period of time, the failing processor is reset and
held for repair action.
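
The reset-or-hold decision described above can be illustrated with a sliding-window failure counter. This is only a sketch of the general idea, not the actual NonStop implementation: the class name FailureMonitor, the parameters max_failures and window_seconds, and the returned strings are all hypothetical, and the guide does not state the real threshold or time period.

```python
from collections import deque


class FailureMonitor:
    """Hypothetical sliding-window failure counter for one processor element."""

    def __init__(self, max_failures: int, window_seconds: float) -> None:
        # Both values are assumptions; the guide gives no concrete numbers.
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        self._times = deque()  # timestamps of recent failures, oldest first

    def record_failure(self, now: float) -> str:
        """Record a failure at time `now` and return the recovery action."""
        self._times.append(now)
        # Discard failures that have aged out of the observation window.
        while self._times and now - self._times[0] > self.window_seconds:
            self._times.popleft()
        if len(self._times) > self.max_failures:
            # Too many failures in the window: reset and keep out of service.
            return "hold_for_repair"
        # Otherwise reset and synchronize with the running element(s).
        return "reset_and_reintegrate"


monitor = FailureMonitor(max_failures=2, window_seconds=60.0)
actions = [monitor.record_failure(t) for t in (0.0, 10.0, 20.0)]
```

With these illustrative settings, the first two failures lead to a reset and reintegration, and the third failure inside the same 60-second window causes the element to be held for repair.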
Failure Recovery for Triplex Processor
In triplex processors, each LSU has inputs from the three processor elements within a logical
processor. As with the duplex processor, the LSU keeps the three PEs in loose lockstep. The LSU
also checks the outputs from the three PEs. If the output from one PE differs from that of the
other two, the errant result is ignored, and the result from the other two PEs is sent to the ServerNet
fabrics. Reintegration works the same as in the duplex processor. The number of PEs in a
reintegration depends on the conditions of the failure and the configuration of the hardware.
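
The output comparison performed by the LSU in a triplex processor amounts to a two-out-of-three majority vote. The following is a minimal software sketch of that idea, not a description of the LSU hardware: the function name vote and its return shape are hypothetical, and the handling of the case where no two outputs agree is an assumption that the guide does not cover.

```python
from collections import Counter
from typing import Hashable, List, Optional, Sequence, Tuple


def vote(outputs: Sequence[Hashable]) -> Tuple[Optional[Hashable], List[int]]:
    """Return (agreed_output, errant_pe_indexes) for one set of PE outputs.

    If a strict majority of PEs agree, their output is forwarded and the
    indexes of the disagreeing PEs are reported. If there is no majority,
    None is returned and every PE is reported (an assumption for this sketch).
    """
    value, count = Counter(outputs).most_common(1)[0]
    if count * 2 <= len(outputs):
        return None, list(range(len(outputs)))
    errant = [i for i, out in enumerate(outputs) if out != value]
    return value, errant


# PE 2 disagrees with PEs 0 and 1, so its result is ignored.
forwarded, errant_pes = vote(["0xA5", "0xA5", "0x5A"])
```

In this example, forwarded holds the value that PEs 0 and 1 agree on, and errant_pes identifies PE 2 as the candidate for reset and reintegration.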
The failure of a NonStop Blade Element in a triplex processor reduces processor operation to
duplex. When the failing unit is replaced, the reintegration function restores the system to triplex
operation. If failure of an LSU takes down its associated logical processor, the operating system
activates the backup processes in other logical processors. The system runs user applications as if
no failure occurred.
As with a duplex processor, the errant processor is reset and is then synchronized with the running
processors. If the failure rate exceeds a predetermined threshold value within a period of time, the
failing processor is reset and held for repair action.
108 NonStop NS14000 Series System Architecture










