SR870BN4 Error Reference Guide Revision 1.
Revision History

Date | Revision Number | Modifications
01/2002 | 0.5 | Initial Release.
04/2002 | 0.6 | Update Machine Check Error Handling section, update SEL data tables.
10/2003 | 1.0 | Updated sensor and beep code tables.

Disclaimers

Information in this document is provided in connection with Intel® products. No license, express or implied, by estoppel or otherwise, to any intellectual property rights is granted by this document.
Table of Contents

1. Introduction
    1.1 Document Organization
    1.2 SEL Overview
2. EFI-Based SELViewer Utility
List of Figures

Figure 1. SEL Viewer Utility
List of Tables

Table 1. SAL 3.0 MCA Records
Table 2. SEL Event Logs for Machine Check Errors
Table 3: Onboard PCI Devices and Slots
Table 4. Error Code Classification
1. Introduction

This document is an error reference guide for the SR870BN4 server system.

1.1 Document Organization

Section 1: An introduction to the SEL.
Section 2: A brief introduction to the EFI-based SEL Viewer utility.
Section 3: SEL Data Tables.
Section 4: MCA Error Handling including SEL event format for machine check events.
Section 5: SR870BN4 PCI Device IDs.
Section 6: BIOS POST error codes and messages.
2. EFI-Based SELViewer Utility

The EFI-based SEL Viewer utility is used to view the SEL records from Itanium™-based servers. The SEL Viewer lets the user perform the following:

- Examine all SEL entries stored in the non-volatile storage area of the server in text form or in hexadecimal.
- Examine previously stored SEL entries from a file in text form or in hexadecimal.
- Save the SEL entries to a file.
3. SR870BN4 SEL Data Tables

The tables in this section provide information on the data provided by the SEL Viewer utility.

3.1 SR870BN4 Generator ID Codes

3.
Sensor Type | Sensor Number | Sensor Name

Sensor Numbers: 22h, 23h, 24h, 25h, 26h, 27h, 28h, 29h, 2Ah, 60h, 61h, 62h, 63h, 64h, 65h, 7Eh, 7Fh, A4h, A5h, A6h, A7h
Sensor Names: CPU Board +1.8V; CPU Board +3.3V SB; CPU Board +12V SB; IO Riser Board +12V SB; IO Riser Board +2.5V; IO Riser Board +1.5V SB; IO Riser Board +1.
Sensor Type | Sensor Number | Sensor Name
 | 77h | I/O Board 5V D2D 2
 | 78h | Processor Board 3.3V D2D 1
 | 79h | Processor Board 2.5V D2D 1
 | 7Ah | Processor Board 2.5V D2D 2
 | 7Bh | Memory Board 1 1.25V D2D
 | 7Ch | Memory Board 2 1.
Sensor Type | Sensor Number | Sensor Name
23 Watchdog | 03h | BMC Watchdog2
C7 OEM | 50h | Fan Boost Memory Board 1 Temp
C7 OEM | 51h | Fan Boost Memory Board 2 Temp
C7 OEM | 52h | Fan Boost IO Board Temp 1
C7 OEM | 53h | Fan Boost IO Board SIOH Temp
C7 OEM | 54h | Fan Boost IO Board Temp 3
C7 OEM | 55h | Fan Boost CPU Board Ambient Temp
C7 OEM | 56h | Fan Boost CPU Board SNC Temp
C7 OEM | A0h | Fan Boost Proc 1 Temp
C7 OEM | A1h | Fan Boost Proc 2 Temp
C7 OEM | A2h | Fan Boost Proc 3 Temp
C7 OEM | A3h | Fan Boost Proc 4 Temp
4. SR870BN4 Machine Check Error Handling

This section gives an overview of the implementation of machine check error handling on the SR870BN4 server system. For additional details about Itanium-based system error generation and error handling, refer to the Itanium™ Processor Family Error Handling Guide (document number: 249278-002) and the Itanium™ System Abstraction Layer Specification (document number: 245359-005).
There are two types of machine check events: local and global. A local MCA occurs when an individual processor enters machine check. Examples of local machine checks include a Data Translation Lookaside Buffer (DTLB) data parity error, or a processor consuming data with an uncorrectable error. A machine check is global when all processors enter machine check.
PCI Component | Critical Interrupt | PERR, SERR | PCI Bus, Device, Function info
Memory Device | Memory Error | Correctable, Uncorrectable | SMBIOS Type 16 0-based index, SMBIOS Type 17 0-based index
Other | Critical Interrupt | Bus Correctable error, Bus Uncorrectable error

4.3 Thresholding

MCA classifies errors into one of three categories: corrected, recoverable, and fatal.
Table 2.
Event Logging Disabled (Thresholding)
SBE Memory Logging Disabled | 0x31 | 0x4
Bus Correctable Logging Disabled | 0x31 | 0x4
Proc Correctable Logging Disabled | 0x31 | 0x4
PCI PERR Logging Disabled | 0x31 | 0x4
System Event (MCA Event Indicator) 0x31 0x4 Aux Log Entry 0x31 0x4 Aux Log Entry
5. SR870BN4 PCI Device IDs

The SR870BN4 server has the following PCI devices and slots on the I/O board:

Table 3: Onboard PCI Devices and Slots

Device Description PCI Bus Bus Number Device ID SNC FSB 0xFF 0x18 SIOH SNC 0xFF 0x1C MRH-D SNC 0xFF 0x018 1 Internal ICH4 Function Number 0,1.
6. BIOS POST Error Codes and Messages

The following error codes are relevant to the SR870BN4 server. The system BIOS displays POST error messages on the video screen and also logs them in the SEL. The SR870BN4 BIOS prompts the user to press a key in case of serious errors.

Error Code Classification

Red: Critical events that require user interaction. BIOS POST will pause with a message asking the user to press F1, F2, or ESC.
Error Code | Error Message | Pause on Boot | Recommended User Action
0145 | PCI ROM not found | |
0146 | Insufficient Memory to Shadow PCI ROM | Yes | This is due to lack of option ROM space in the BIOS. This error can be resolved by disabling all of the option ROMs on all devices except for the boot device.
8100 | Processor 01 failed BIST | Yes; user input required | Replace Processor 01.
8101 | Processor 02 failed BIST | Yes; user input required | Replace Processor 02.
Error Code | Error Message | Pause on Boot | Recommended User Action
8151 | Processor 02: failed initialization on last boot | Yes; user input required | Retest processor. If error persists, hardware failure. User should replace processor.
8152 | Processor 03: failed initialization on last boot | Yes; user input required | Retest processor. If error persists, hardware failure. User should replace processor.
Error Code | Error Message | Pause on Boot | Recommended User Action
8504 | Persistent Single-bit Error Detected Row1. Row 1 mapped out | |
8505 | Persistent Single-bit Error Detected Row2. Row 2 mapped out | Yes | Verify the affected memory and replace with correct memory.
8506 | Persistent Single-bit Error Detected Row3. Row 3 mapped out | Yes | Verify the affected memory and replace with correct memory.
8507 | Persistent Single-bit Error Detected Row4.
7. POST Codes

To indicate progress through BIOS POST, and in special cases where errors are encountered during BIOS POST, the SR870BN4 BIOS employs three common mechanisms. The first method is to display port 80/81 codes on an I2C* adapter connected to the processor baseboard. The second common method is the use of beep codes, encoded beep sequences emitted by the PC speaker when an error is encountered.
Secret Decoder: Bits 11:8 of the code indicate the execution environment. 0xF means stack-less code is being executed; 0xD-0x0 means memory is available.

Table 5.
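As a rough illustration of the bit-field rule above, the following Python sketch splits a 16-bit POST code into its four nibbles and reports what bits 11:8 indicate. The function name `decode_post_code` and the returned field names are illustrative assumptions, not part of the BIOS or of any Intel tool. The guide does not say what 0xE means in bits 11:8, so the sketch reports that value as unspecified.

```python
def decode_post_code(code: int) -> dict:
    """Split a 16-bit POST code into nibbles and classify bits 11:8.

    Hypothetical helper for illustration only; field names are assumptions.
    """
    if not 0 <= code <= 0xFFFF:
        raise ValueError("POST code must fit in 16 bits")

    # Nibbles in the order bits 15:12, 11:8, 7:4, 3:0.
    n15_12, n11_8, n7_4, n3_0 = [(code >> shift) & 0xF for shift in (12, 8, 4, 0)]

    if n11_8 == 0xF:
        phase = "stack-less code being executed"
    elif n11_8 <= 0xD:
        phase = "memory is available"
    else:
        # 0xE is not covered by the rule quoted in the text.
        phase = "unspecified"

    return {
        "bits_15_12": n15_12,
        "bits_11_8": n11_8,
        "bits_7_4": n7_4,
        "bits_3_0": n3_0,
        "phase": phase,
    }


if __name__ == "__main__":
    # Sample values taken from the POST code tables in this guide.
    for sample in (0x8F72, 0x87E5):
        print(hex(sample), decode_post_code(sample))
```

With the sample values, 0x8F72 has 0xF in bits 11:8 and is reported as stack-less, while 0x87E5 has 0x7 there and falls in the memory-available range.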
Code Value (bit 8 = 1, bits 11:4 shown below) | Module | Display
0xF2 | Memory Autoscan (stackless) | North
    Memory Autoscan sub-modules, bits 15:12 11:8 7:4 3:0:
    8 F 2 0 | Pass1 Entry
    8 F 2 1 | Process Auto Scan Input
    8 F 2 2 | Execute Auto scan (C-code)
    8 F 2 3 | Process Auto Scan Output | North
0xF1 | Recovery stackless | North
0xF0 | Reserved | North
0xEF-0xEE | Memory Autoscan C-code | North
0xED-E8 | Recovery C-Code
0xE8-0xE6 | HOB
0xE5-0xC1 | Reser
0x8F72 | Validate DIMMs (Mem_ValidateInstalledConfiguration()) | North
0x8F73 | Program MIRs/MITs (Mem_DoMirMitProgram()) | North
0x8F74 | Calculate CAS (Mem_CalcSysCas()) | North
0xCF74 | Calculate CAS Error Loop | North
0x8F75 | Program CAS (Mem_SetMrhdCasLatency()) | North
0x8F76 | Set Mrhd DIMM Geometry (Mem_SetMrhdDimmGeometry()) | North
0x8F77 | Perform SLEW rate calibration (Mem_DoSlewRateCalibration) | North
0x8F78 | Mem_InitDimmAndSetCasLatencyAndBurst() | North
And BSP+APs min_state_area for all CPUs (cpu_data_base+cpu_bspstore_base+cpu_health); cpu_data_base points to min state save area. TOM below and above 4G. Allocate sal_mp_info_table data and sal_efi stack area and legacy_stack (temp). Initialize legacy stack top and bottom for temporary use during POST only. INT_15 (FN# F788 in EM code) uses INT-8 timer tick for frequency calculation. (BSP+APs) Save ID, EID, Initialize BSPSTORE, SP.
0x07F3 | BSP only | Hang on ERROR. | South
*0x87F2 | BSP only | Initialize sal data top address Physical equals to virtual for runtime use and above 4G Load Call backs for byte/word checkpoint display entry and Address. SAL PMI address EFI to SAL call back address SAL procedure address SAL SST base and address SAL procedure entry base inside SST Buildtime address where SAL_PROC entry is stored Buildtime GP Runtime GP SAL SST size.
0x87E5 | BSP + APs | Set PMI entry point PAL Call (pal_pmi_entrypoint_20). | South
0x07E5 | BSP + APs | Hang if ERROR. | South
0x87E4 | BSP + APs | PAL Cache Summary by PAL Call (pal_cache_summary_04). | South
0x07E4 | BSP + APs | Hang if ERROR | South
0x87E3 | BSP + APs | PAL Cache Information set. PAL Call cache_info_02. | South
0x07E3 | BSP + APs | Hang, if ERROR. | South
0x87E2 | BSP + APs | pal_mc_register_mem_1b/find CPU min state pointer.
0x07BB | BSP | Hang on ERROR. | South
0x87BA | BSP | Feed system information (0x1) with call to SAL_C. | South
0x07BA | BSP | Hang on ERROR. | South
0x87B9 | BSP | Initialize MP table v1.4 (0x2) with call to SAL_C | South
0x07B9 | BSP | Hang on ERROR. | South
0x87B8 | BSP | Initialize IA-32 ACPI v1.1 (0x3) with call to SAL_C | South
0x07B8 | BSP | Hang on ERROR. | South
0x87B7 | BSP | Initialize IA64 ACPI v1.1 (0x4) with call to SAL_C | South
0x07B7 | BSP | Hang on ERROR.
Code Value | Module | Display
0x00D7 | Passing control to the interface module next. | South
0x00D8 | The main system BIOS runtime code will be decompressed next. | South
0x00D9 | Passing control to the main system BIOS in shadow RAM next. | South
0x0003 | Next, checking for a soft reset or a power on condition. | South
0x0005 | The BIOS stack has been built. Next, disabling cache memory. | South
0x0006 | Uncompressing the POST code next.
Code Value | Module | Display
after the video ROM had control.
0x002E | Complete post-video ROM test processing. If the EGA/VGA controller is not found, perform the display memory read/write test next. | South
0x0037 | The display mode is set. Display the power on message next. | South
0x0038 | Initialize the bus input, IPL, and general devices next, if present. | South
0x0039 | Late processor self test. Display bus initialization error messages.
Code Value | Module | Display
0x0099 | Configuring the timer data area and printer base address. | South
0x009B | Returned after setting the RS-232 base address. Performing any required initialization before the Coprocessor test next. | South
0x009E | Initialization after the Coprocessor test is complete. Checking the extended keyboard, keyboard ID, and Num Lock key next. Issuing the Keyboard ID command. | South
0x00A2 | Displaying any soft errors.
Code Value | Module | Display
0x85F8 | Print banner with entry address to make it easy to debug with symbols. Install any devices that are integrated system volume devices. | South
0x85F9 | System volumes installed. | South
0x85FA | Init Nv Var Store Mem Set EFIDebug based on NVRAM variable. Set default console environment variables if they are not already set. Install Console Splitter. Print Banner with entry address to make it easy to debug with symbols.
0xAFED | OS request for SAL Clear Processor/Platform Error/State Log in progress. | South
0xAFEE | SAL Platform OEM MCA Error Handler In Control. | South
0xAFEF | OS request for SAL Get Processor/Platform Error/State Log in progress. | South
0xAFF0 | SAL INIT Handler is in control. | South
0xAFF1 | Passing Control to IA-32 OS Init Handler. | South
0xAFF2 | Found valid OS_INIT Ep, Passing Control to EM OS Init Handler.
8. Beep Codes

During the course of executing POST, there are occasions where fatal problems happen before video is enabled. These fatal errors are conveyed through the speaker as encoded beeps, coupled with POST debug codes. Since the duration of the display-less POST execution is relatively short, there are fewer beep codes than displayed error codes.
8.1.2 Recovery Beep Codes

Table 16. Recovery Mode Beep Codes

Beeps | Description
1 short – medium tone | BIOS Flash Update Started
2 short – medium tone | BIOS Flash Update Complete
Repeating – low tone | BIOS Recovery Error Occurred

8.1.3 BMC Beep Code Generation

The BMC generates beep codes upon detection of the failure conditions listed in Table 17. Each digit in the code is represented by a sequence of beeps whose count is equal to the digit.
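The digit-to-beep rule can be sketched in a few lines of Python. This is only an illustration: the hyphen-separated code format (for example "1-5-2-1"), the function names, and the "(pause)" marker between digits are assumptions made here, since Table 17 and the BMC timing details are not reproduced in this excerpt.

```python
def beep_counts(code: str) -> list[int]:
    """Return the number of beeps for each digit of a beep code.

    Assumes a hypothetical hyphen-separated format such as "1-5-2-1".
    """
    counts = []
    for part in code.split("-"):
        # Each part must be a single digit 1-9; a digit of 0 would mean
        # zero beeps, which this sketch rejects as ambiguous.
        if len(part) != 1 or part not in "123456789":
            raise ValueError(f"invalid beep code digit: {part!r}")
        counts.append(int(part))
    return counts


def render_beeps(code: str) -> str:
    """Render a beep code as text, one group of beeps per digit."""
    groups = [" ".join(["beep"] * n) for n in beep_counts(code)]
    # The pause between digit groups is an assumption for readability.
    return " (pause) ".join(groups)


if __name__ == "__main__":
    print(beep_counts("1-5-2-1"))
    print(render_beeps("1-2"))
```

Each digit maps to one group of beeps, so a four-digit code yields four groups whose lengths match the digits.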
Appendix A: Glossary

Term | Definition
ACPI | Advanced Configuration and Power Interface.
ANSI | American National Standards Institute.
ASCII | American Standard Code for Information Interchange. An 8-level code (7 bits plus parity check) widely used in data processing and data communications systems.
ASIC | Application specific integrated circuit.
BERR | Bus Error Signal.
Term | Definition
GPIO | General Purpose I/O.
HSC | Hot-Swap Controller.
Hz | Hertz (1 cycle/second).
I2C | Inter-integrated circuit bus.
I2O | Intelligent I/O. An open architecture for the development of device drivers in network system environments.
IA | Intel® Architecture.
IBF | Input Buffer.
ICH | I/O Controller Hub.
ICMB | Intelligent Chassis Management Bus.
IERR | Internal Error.
IOP | I2O compliant-I/O Platforms.
Term | Definition
PEF | Platform Event Filtering.
PEP | Platform Event Paging.
PERR | Parity Error. A signal on the PCI bus that indicates a parity error on the bus.
PID | Programmable Interrupt Device. The PID is an interrupt controller that provides interrupt steering functions. The PID interfaces include a PCI bus, an APIC bus, and serial IRQ interfaces, and an interrupt input interface.
PIROM | Processor Information ROM.
Appendix B: Reference Documents

- Intelligent Platform Management Interface Specification v1.5, ©2001, Intel Corporation. http://developer.intel.com/design/servers/ipmi
- System Management BIOS Reference Specification v2.3. http://www.dmtf.org/
- Itanium™ Processor Family Error Handling Guide (Doc. Number: 249278-002). http://developer.intel.com/
- Itanium™ System Abstraction Layer Specification (Doc. Number: 245359-005). http://developer.intel.