White Paper on Dynamic Processor Deallocation and Dynamic Processor Resilience

same type of cache error (I-Cache Data, I-Cache Tag, D-Cache Data and D-Cache
Tag), one of two actions will be taken:
1. If the processor IS NOT the monarch, the Dynamic Processor Deallocation
facility will be invoked to deallocate it. The monitor will then generate a
serious EMS event indicating that the processor was deallocated and should
be scheduled for replacement (see example in Figure 3). Thereafter, the
processor state is checked once every 24 hours and a warning EMS event
is generated if the processor is still deallocated. This warning serves as
a reminder that the processor must be scheduled for replacement.
If the system is iCOD enabled and there are reserve processors available, a
reserve processor will be immediately allocated to ensure that full processing
capacity is maintained.
2. If the processor IS the monarch processor, it cannot be deallocated. In this
case, a serious EMS event is generated, indicating that the processor is
experiencing a high cache error rate, that it was not possible to dynamically
deallocate it, and that it should be replaced as soon as possible before a
catastrophic failure occurs. The processor then continues to be monitored.
After a 24-hour period has elapsed, the monitor will reset its
counters and begin generating informational events again for each correctable
cache error that occurs. This time period, referred to as the “repeat
frequency”, is configurable. If the cache errors persist and the threshold is met
again, another serious event will be generated. Figure 1 depicts the sequence
of EMS events that could be generated for a processor that is consistently
experiencing cache errors over a period of 48 hours but cannot be deallocated.
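
The two cases above can be sketched in a few lines of Python. This is only an illustrative model, not the monitor's actual implementation. The function names, the action strings, and the `threshold` parameter are hypothetical. The sketch also assumes that the error which meets the threshold produces only the serious event, and that informational events are suppressed until the repeat frequency has elapsed.

```python
REPEAT_FREQUENCY_HOURS = 24  # configurable "repeat frequency" (24 hours in the text)


def handle_threshold(is_monarch, reserve_cpus=0):
    """Return (actions, reserve_cpus_left) once a CPU meets the error threshold.

    Action names are hypothetical labels, not real EMS event identifiers.
    """
    if is_monarch:
        # Case 2: the monarch cannot be deallocated, so only an event is raised.
        return ["serious_event:high_error_rate_not_deallocated"], reserve_cpus
    # Case 1: deallocate the processor and request its replacement.
    actions = ["deallocate", "serious_event:deallocated_schedule_replacement"]
    if reserve_cpus > 0:
        # iCOD enabled with reserves available: keep full processing capacity.
        actions.append("allocate_reserve")
        reserve_cpus -= 1
    return actions, reserve_cpus


def simulate_monarch(error_hours, threshold, repeat_hours=REPEAT_FREQUENCY_HOURS):
    """Return (hour, severity) events for a monarch with recurring cache errors.

    `threshold` is an assumed parameter: the number of errors of one type
    that triggers the serious event.
    """
    events = []
    count = 0
    reset_at = None  # hour at which the counters reset after a serious event
    for hour in sorted(error_hours):
        if reset_at is not None:
            if hour < reset_at:
                continue  # inside the repeat-frequency window: no new events
            count = 0  # window elapsed: reset counters, resume reporting
            reset_at = None
        count += 1
        if count >= threshold:
            events.append((hour, "SERIOUS"))
            reset_at = hour + repeat_hours
        else:
            events.append((hour, "INFORMATIONAL"))
    return events


if __name__ == "__main__":
    # One cache error every 4 hours for 48 hours, with an assumed threshold of 3.
    for hour, severity in simulate_monarch(range(0, 48, 4), threshold=3):
        print(f"hour {hour:2d}: {severity}")
```

With these assumed inputs, the sketch produces two serious events within the 48 hours, each preceded by informational events. That is the general pattern of a processor that keeps exceeding the threshold but cannot be deallocated.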