User's Manual

Vol. 3 15-39
MACHINE-CHECK ARCHITECTURE
When multiple recoverable errors are reported and no other fatal condition (e.g.,
an overflowed condition for an SRAR error) is found for the reported recoverable
errors, system software can recover from the multiple recoverable errors
by taking the necessary recovery action for each individual recoverable error.
However, system software can no longer expect a one-to-one relationship between
the error information recorded in the IA32_MCi_STATUS register and the states of
the RIPV and EIPV flags in the IA32_MCG_STATUS register, because the RIPV and
EIPV flags may reflect only the most severe error recorded on the processor.
System software is required to use the RIPV flag indication in the
IA32_MCG_STATUS register to make the final decision on recoverability of the
errors and to determine the restartability requirement after examining the
error information in each IA32_MCi_STATUS register in the MC banks.
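The decision procedure described above can be sketched in C as follows. This is
a minimal illustration, not a reference handler. It assumes a processor that
supports software error recovery (so that the S and AR flags in
IA32_MCi_STATUS are defined), and it takes register values as plain arguments
rather than reading MSRs. The names mc_assess, bank_is_fatal, and enum
mc_verdict are hypothetical. A bank is treated as fatal when PCC is set or when
an SRAR error has overflowed; otherwise the RIPV flag alone determines whether
the interrupted program can be restarted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* IA32_MCG_STATUS flag bits. */
#define MCG_STATUS_RIPV (1ULL << 0)   /* restart IP valid */
#define MCG_STATUS_EIPV (1ULL << 1)   /* error IP valid */

/* IA32_MCi_STATUS flag bits used by this sketch. */
#define MCI_STATUS_VAL  (1ULL << 63)  /* register contains a valid error */
#define MCI_STATUS_OVER (1ULL << 62)  /* an earlier error was overwritten */
#define MCI_STATUS_UC   (1ULL << 61)  /* error was not corrected */
#define MCI_STATUS_PCC  (1ULL << 57)  /* processor context corrupt */
#define MCI_STATUS_S    (1ULL << 56)  /* signaled via machine check */
#define MCI_STATUS_AR   (1ULL << 55)  /* recovery action required */

/* Hypothetical verdict type for this sketch. */
enum mc_verdict {
    MC_RESTARTABLE,      /* recover each error, then resume at the saved IP */
    MC_NOT_RESTARTABLE,  /* recover each error, but do not resume at saved IP */
    MC_FATAL             /* no software recovery possible */
};

/* Returns true if a single bank reports a condition this sketch treats as fatal. */
static bool bank_is_fatal(uint64_t status)
{
    if (!(status & MCI_STATUS_VAL))
        return false;                 /* nothing logged in this bank */
    if (!(status & MCI_STATUS_UC))
        return false;                 /* corrected error */
    if (status & MCI_STATUS_PCC)
        return true;                  /* processor context is corrupt */

    bool srar = (status & MCI_STATUS_S) && (status & MCI_STATUS_AR);
    return srar && (status & MCI_STATUS_OVER);  /* overflowed SRAR error */
}

/*
 * Examine every bank first, then use RIPV from IA32_MCG_STATUS for the final
 * restartability decision. RIPV is not matched to any one bank because it may
 * describe only the most severe error.
 */
enum mc_verdict mc_assess(uint64_t mcg_status,
                          const uint64_t *bank_status, size_t nbanks)
{
    assert(bank_status != NULL || nbanks == 0);

    for (size_t i = 0; i < nbanks; i++) {
        if (bank_is_fatal(bank_status[i]))
            return MC_FATAL;
    }
    return (mcg_status & MCG_STATUS_RIPV) ? MC_RESTARTABLE
                                          : MC_NOT_RESTARTABLE;
}
```

What system software does for MC_NOT_RESTARTABLE (for example, terminating the
affected task) is a policy choice outside the scope of this sketch.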
15.9.5 Machine-Check Error Codes Interpretation
Appendix E, “Interpreting Machine-Check Error Codes,” provides information
on interpreting the MCA error code, model-specific error code, and other
information error code fields. For P6 family processors, information has been
included on decoding external bus errors. For Pentium 4 and Intel Xeon
processors, information is included on external bus, internal timer, and cache
hierarchy errors.
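As a rough illustration of the first decoding step, the following C sketch
splits an IA32_MCi_STATUS value into its MCA error code (bits 15:0) and
model-specific error code (bits 31:16), and then sorts the MCA error code into
the three classes named above. The bit patterns used for classification are the
architectural compound and simple error code encodings as understood here
(bit 12 is ignored as the correction-report filtering bit); the type and
function names are hypothetical, and model-specific interpretation is left to
the appendix.

```c
#include <stdint.h>

/* Hypothetical container for the two code fields of IA32_MCi_STATUS. */
struct mc_codes {
    uint16_t mca_error_code;       /* bits 15:0  */
    uint16_t model_specific_code;  /* bits 31:16 */
};

/* Hypothetical coarse classes; everything else is reported as MC_CLASS_OTHER. */
enum mc_class {
    MC_CLASS_INTERNAL_TIMER,
    MC_CLASS_BUS_INTERCONNECT,
    MC_CLASS_CACHE_HIERARCHY,
    MC_CLASS_OTHER
};

/* Extract the MCA and model-specific error codes from a status value. */
struct mc_codes mc_split_codes(uint64_t status)
{
    struct mc_codes c;
    c.mca_error_code      = (uint16_t)(status & 0xFFFFu);
    c.model_specific_code = (uint16_t)((status >> 16) & 0xFFFFu);
    return c;
}

/* Classify an MCA error code by its leading bit pattern. */
enum mc_class mc_classify(uint16_t mca_code)
{
    if (mca_code == 0x0400u)                 /* 0000 0100 0000 0000 */
        return MC_CLASS_INTERNAL_TIMER;
    if ((mca_code & 0xE800u) == 0x0800u)     /* 000F 1PPT RRRR IILL */
        return MC_CLASS_BUS_INTERCONNECT;
    if ((mca_code & 0xEF00u) == 0x0100u)     /* 000F 0001 RRRR TTLL */
        return MC_CLASS_CACHE_HIERARCHY;
    return MC_CLASS_OTHER;
}
```

The sub-fields inside each compound code (for example, the level and request
type) would be decoded in a second step using the tables in the appendix.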
15.10 GUIDELINES FOR WRITING MACHINE-CHECK SOFTWARE
The machine-check architecture and error logging can be used in three
different ways:
• To detect machine errors during normal instruction execution, using the
  machine-check exception (#MC).
• To periodically check and log machine errors.
• To examine recoverable UCR errors, determine software recoverability, and
  perform recovery actions via a machine-check exception handler or a corrected
  machine-check interrupt handler.
To use the machine-check exception, the operating system or executive
software must provide a machine-check exception handler. This handler may
need to be designed specifically for each family of processors.
A special program or utility is required to log machine errors.
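A logging utility of this kind might be structured as in the C sketch below.
It is illustrative only: reading MSRs requires privileged code, so the sketch
takes a caller-supplied read function (the hypothetical rdmsr_fn type), and
fake_rdmsr is a stand-in that returns fixed values so the logic can be
exercised outside a kernel. The sketch reads the bank count from
IA32_MCG_CAP[7:0], walks each IA32_MCi_STATUS register, and logs every bank
whose VAL flag is set.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define MSR_IA32_MCG_CAP     0x179u
#define MSR_IA32_MC0_STATUS  0x401u       /* bank i is at 0x401 + 4*i */
#define MCG_CAP_COUNT_MASK   0xFFu        /* number of reporting banks */
#define MCI_STATUS_VAL       (1ULL << 63)

/* Hypothetical abstraction over the privileged RDMSR instruction. */
typedef uint64_t (*rdmsr_fn)(uint32_t msr);

/*
 * Scan all machine-check banks and log each valid error.
 * Returns the number of banks that held a valid error.
 */
unsigned mc_poll_banks(rdmsr_fn rdmsr, FILE *log)
{
    assert(rdmsr != NULL && log != NULL);

    unsigned nbanks = (unsigned)(rdmsr(MSR_IA32_MCG_CAP) & MCG_CAP_COUNT_MASK);
    unsigned found = 0;

    for (unsigned i = 0; i < nbanks; i++) {
        uint64_t status = rdmsr(MSR_IA32_MC0_STATUS + 4u * i);
        if (!(status & MCI_STATUS_VAL))
            continue;                      /* nothing recorded in this bank */
        fprintf(log, "bank %u: IA32_MC%u_STATUS=0x%016llx\n",
                i, i, (unsigned long long)status);
        found++;
    }
    return found;
}

/* Stand-in for demonstration: three banks, with one error logged in bank 1. */
uint64_t fake_rdmsr(uint32_t msr)
{
    if (msr == MSR_IA32_MCG_CAP)
        return 3;
    if (msr == MSR_IA32_MC0_STATUS + 4u)
        return MCI_STATUS_VAL | 0x0135u;
    return 0;
}
```

A call such as mc_poll_banks(fake_rdmsr, stdout) exercises the loop with the
stand-in values. A real utility would also record the address and
miscellaneous registers when they are flagged as valid, and would clear each
status register after logging it; both steps are omitted here for brevity.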