5102ch04.fm Draft Document for Review May 12, 2014 12:46 pm

122 IBM Power System S822 Technical Overview and Introduction

which logs the error. I/O devices can also include specific exercisers that can be invoked by

the diagnostic facilities for problem recreation if required by service procedures.

4.3.3 Reporting

In the unlikely event that a system hardware or environmentally induced failure is diagnosed,

IBM Power Systems servers report the error through various mechanisms. The analysis

result is stored in system NVRAM. Error log analysis (ELA) can be used to display the failure

cause and the physical location of the failing hardware.

With the integrated service processor, the system can automatically send an alert through a

phone line to a pager, or call for service in the event of a critical system failure. A hardware

fault also illuminates the amber system fault LED, located on the system unit, to alert the user

of an internal hardware problem.

On POWER8 processor-based servers, hardware and software failures are recorded in the

system log. When a management console is attached, an ELA routine analyzes the error,

forwards the event to the Service Focal Point (SFP) application running on the management

console, and can notify the system administrator that it has isolated a likely

cause of the system problem. The service processor event log also records unrecoverable

checkstop conditions, forwards them to the SFP application, and notifies the system

administrator. After the information is logged in the SFP application, if the system is properly

configured, a call-home service request is initiated and the pertinent failure data with service

parts information and part locations is sent to the IBM service organization. This information

also contains the client contact information as defined in the IBM Electronic Service Agent

(ESA) guided setup wizard. With the new HMC V8R8.1.0, a Serviceable Event Manager is

available to manually block problems from being automatically transferred to IBM. For more

details, see “Service Event Manager” on page 137.
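
The reporting path described above can be pictured as a small pipeline. The following Python sketch is illustrative only and rests on assumptions: the names ServiceableEvent, ReportingPath, and blocked_codes are hypothetical and do not correspond to any HMC, SFP, or ESA interface, and the error codes are placeholders. It models only the order of the steps: record the failure, forward it to the SFP application, and call home unless call home is not configured or the event has been manually blocked.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceableEvent:
    """Hypothetical model of an event produced by error log analysis."""
    error_code: str   # placeholder value, not a real IBM reference code
    location: str     # placeholder for the physical location of the part


@dataclass
class ReportingPath:
    """Toy model of the flow: system log -> SFP -> optional call home."""
    call_home_configured: bool = True
    # Assumption: manual blocking is modeled as a set of error codes.
    blocked_codes: set = field(default_factory=set)
    system_log: list = field(default_factory=list)
    sfp_events: list = field(default_factory=list)
    sent_to_service: list = field(default_factory=list)

    def report(self, event: ServiceableEvent) -> str:
        # 1. The failure is always recorded in the system log.
        self.system_log.append(event)
        # 2. The analyzed event is forwarded to the SFP application.
        self.sfp_events.append(event)
        # 3. Call home happens only if configured and not manually blocked.
        if not self.call_home_configured:
            return "logged"
        if event.error_code in self.blocked_codes:
            return "blocked"
        self.sent_to_service.append(event)
        return "called-home"


if __name__ == "__main__":
    path = ReportingPath(blocked_codes={"EXAMPLE-0002"})
    for code in ("EXAMPLE-0001", "EXAMPLE-0002"):
        outcome = path.report(ServiceableEvent(code, "EXAMPLE-LOC"))
        print(code, outcome)
```

In this sketch a blocked event still reaches the system log and the SFP list; only the transfer to the service organization is suppressed, which is how the text above describes manual blocking.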

Error logging and analysis

When the root cause of an error is identified by a fault isolation component, an error log entry

is created with basic data such as the following examples:

• An error code that uniquely describes the error event

• The location of the failing component

• The part number of the component to be replaced, including pertinent data such as engineering and manufacturing levels

• Return codes

• Resource identifiers

• FFDC data

Data that contains information about the effect that the repair will have on the system is also

included. Error log routines in the operating system and FSP can then use this information

and decide whether the fault is a call-home candidate. If the fault requires support

intervention, a call is placed with service and support, and a notification is sent to the contact

that is defined in the ESA-guided setup wizard.
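
As a rough illustration of the error log entry and the call-home decision described above, the following Python sketch groups the listed fields into one record. Everything in it is an assumption made for illustration: the field names, the repair_impact vocabulary, the needs_support flag, and the handle_entry function are hypothetical, and the real criteria that the operating system and FSP error log routines apply are not given in this section.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ErrorLogEntry:
    """Hypothetical record holding the basic data listed above."""
    error_code: str                      # uniquely describes the error event
    location: str                        # location of the failing component
    part_number: str                     # part to be replaced
    return_codes: List[int] = field(default_factory=list)
    resource_ids: List[str] = field(default_factory=list)
    ffdc: bytes = b""                    # first-failure data capture payload
    # Assumed vocabulary for the effect of the repair on the system.
    repair_impact: str = "none"
    # Assumed flag: set when the fault requires support intervention.
    needs_support: bool = False


def handle_entry(entry: ErrorLogEntry, contact: str) -> List[str]:
    """Return the actions taken for an entry (placeholder decision rule)."""
    actions: List[str] = []
    if entry.needs_support:
        # The fault is treated as a call-home candidate.
        actions.append("place service call for " + entry.error_code)
        actions.append("notify " + contact)
    return actions


if __name__ == "__main__":
    entry = ErrorLogEntry(
        error_code="EXAMPLE-0001",
        location="EXAMPLE-LOC",
        part_number="EXAMPLE-PART",
        repair_impact="outage",
        needs_support=True,
    )
    for action in handle_entry(entry, "admin@example.com"):
        print(action)
```

An entry whose needs_support flag is not set produces no actions in this sketch, which corresponds to a fault that is logged but is not a call-home candidate.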

Remote support

The Remote Management and Control (RMC) subsystem is delivered as part of the base

operating system, including the operating system that runs on the Hardware Management

Console. RMC provides a secure transport mechanism across the LAN interface between the

operating system and the Hardware Management Console and is used by the operating