System Fault Management C.07.05.08.
© Copyright 2012 Hewlett-Packard Development Company, L.P.
Legal Notices
Confidential computer software. Valid license from HP required for possession, use or copying. Consistent with FAR 12.211 and 12.212, Commercial Computer Software, Computer Software Documentation, and Technical Data for Commercial Items are licensed to the U.S. Government under vendor’s standard commercial license.
Contents
1 Introduction (page 7)
Overview (page 7)
Features and benefits (page 7)
Components of SFM
Viewing FRU information (page 46)
Viewing information about Management Processor (page 47)
Viewing information about Firmware information (page 47)
Viewing information about Onboard Administrator
Viewing List of Low Level Logs (page 65)
Viewing List Of Low Level Logs using GUI (page 66)
Viewing List Of Low Level Logs using CLI (page 66)
Viewing Details of Low Level Logs
Temperature instances (page 105)
Voltage instances (page 106)
FRU Information instances (page 107)
Management Processor instances
1 Introduction
System Fault Management (SFM) supports HP Integrity Superdome 2 (HP Superdome 2), HP Integrity BL860c i2, BL870c i2 & BL890c i2 Server Blades, and rx2800 i2, in addition to other HP Integrity Servers. All the features supported on systems running the HP-UX 11i v3 operating system are available for HP Integrity Servers. This chapter introduces you to the System Fault Management (SFM) software and the tools that SFM includes.
• Enables you to view and administer WBEM indications.
• Provides the same features and benefits as those found in the EMS hardware monitors.
NOTE: SFM is the replacement for EMS hardware monitors.
Components of SFM
This section discusses the following topics:
• EVWEB
• Error Management Technology (EMT)
• CIMUtil
• IPMI Event Viewer
• Providers
EVWEB
EVWEB is a component of SFM that enables you to administer and view WBEM indications generated on the local system on which SFM is installed. For more information on EVWEB, see “Evweb overview” (page 49).
EMT
EMT is a component of SFM that enables you to view and administer information about errors that can occur on the server.
Table 1 Instance providers (continued)
Instance provider    Description
NOTE: The Blade instance provider is available on HP Integrity BL860c i2/BL870c i2/BL890c i2 Server Blades and HP Integrity Superdome 2.
Table 1 Instance providers (continued)
Instance provider       Description
                        For more information on the DAS provider, see the HP-UX WBEM Direct Attached Storage (DAS) provider Data Sheet and Release Notes at: http://www.hp.com/go/hpux-wbem-docs
Management Processor    Retrieves information about the management processor on the system.
Table 1 Instance providers (continued)
Instance provider    Description
Blade CPU Memory Environmental Firmware Revision Management Processor Enclosure Temperature Sensor
Indication providers
SFM includes four indication providers: the EMS Wrapper provider, the Event Manager Common Information Model (EVM CIM) provider, the SFMIndicationProvider, and the MCA indication provider. Table 2 describes the SFM indication providers.
Table 2 Indication providers (continued)
Indication provider    Description
1. Converts hardware, software, and kernel events generated by the EVM into WBEM indications.
2. Reports the WBEM indications to the CIMOM. Using a WBEM-based management application, such as HP SIM, you can subscribe to and receive EVM events generated on a remote system. On the system on which SFM is installed, you can use an SFM tool, called EVWEB, to view and administer events through the HP SMH interface.
3.
NOTE: The following apply to indication providers:
• The terms events and indications are used interchangeably.
• Although both the EMS Wrapper provider and the EVM CIM provider generate events related to system hardware, the nature of the events is different.
In addition, support for CPUIndicationProvider and MemoryIndicationProvider has been added, with additional support for hardware events on HP Integrity BL860c i2, BL870c i2 & BL890c i2 Server Blades.
1 Legacy: HP Integrity platforms supporting processors prior to Intel 9300.
Figure 1 Block Diagram of SFM
The following list describes the sequence of events when a request is made for information:
1. The CIMOM receives requests from the CMS for information about devices.
2. The CIMOM directs the requests to the appropriate SFM provider, for example, the CPU instance provider.
3. The SFM provider queries the associated hardware device for property information.
4. The SFM provider returns the query information to the CIMOM.
5.
4. The CIMOM directs these indications to EVWEB and to the CMS that has created subscriptions for indications. EVWEB then stores the indications in the Event Archive, in syslog, in your e-mail box, or in all of these, depending on your configuration. Indications can be viewed using HP SIM on the remote system and HP SMH on the local system.
5. The indications generated by the SFMIndicationProvider, and reported to the CIMOM, can also be directed to the EMS framework through the WBEM Wrapper Monitor.
2 Installing the SFM software
The System Fault Management (SFM) software is installed by default with the HP-UX 11i v3 Operating Environment (OE) media. However, at some point you may need to install the SFM software separately. This chapter describes how to install the SFM software as a standalone component on the HP-UX 11i v3 operating system.
NOTE:
• The listed versions of the software are the minimum supported requirements. Subsequent versions are compatible with this version of SFM unless otherwise noted.
• WBEM Services, Online Diagnostics, SysMgmtWeb, and HP SIM are available on the Operating Environment (OE) media and can be selected for installation during the SFM installation.
• HP System Management Homepage (SMH) is bundled in SysMgmtWeb.
Selecting these options automatically installs all the dependencies.
NOTE: The system selects some options by default. However, you must select the two options mentioned in step 5 to automatically install the prerequisites.
7. Click OK in the Note window to confirm the selection of dependencies.
8. In the SD Install - Software Selection window, select Actions->Install, as shown in the following figure.
When the SFM software installs, the Install window appears indicating that the SFM software is installed successfully, as shown in the following figure:
9. Unmount the CD. To unmount, enter the following command at the HP-UX prompt:
# umount /tmp/cdrom
10.
3. To install the SFM software and all the dependencies, enter the following command at the HP-UX prompt:
# swinstall -x autoselect_dependencies=true -x enforce_dependencies=true -s /tmp/cdrom SysFaultMgmt
4. Unmount the CD. To unmount, enter the following command at the HP-UX prompt:
# umount /tmp/cdrom
5.
Verifying the installation using the TUI To verify the SFM software installation, complete the following steps: 1. Log in to the system as a superuser. 2. Click Logfile in the Install window, as shown in the following figure: The Logfile, which includes details about the installation, is displayed. If there are no errors in the Logfile, the SFM software is installed properly. If the SFM software is not installed properly, you must repeat the installation procedure. 3.
Verifying the installation using the CLI To verify your installation using the CLI, complete the following steps: 1. Log in to the system as a superuser. 2. Enter the following command at the HP-UX prompt: # swjob If the output contains no errors, the SFM software is installed properly. Otherwise, you must install the SFM software again. A sample output is shown in the following figure: 3.
4. Select Actions->Mark for Remove in the SD Remove window, as shown in the following figure: 5.
6. Click OK in the Remove Analysis window to confirm the removal of the SFM software, as shown in the following figure: The following figure is a sample of the removal process in progress: 7.
8. To verify whether the SFM software is removed properly, enter the following command at the HP-UX prompt:
# swlist | grep SysFaultMgmt
If the SFM software is removed properly, SysFaultMgmt and the version number of the SFM software do not appear in the output. If the SFM software is not removed properly, you must repeat the removal procedure. For more information, see “Verifying removal of the SFM software” (page 27).
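The check in step 8 can be wrapped in a small helper for scripted verification. This is a minimal sketch, assuming only that the software listing mentions the product name SysFaultMgmt when SFM is installed, as the command above implies. The function name sfm_absent is hypothetical and is not part of SFM.

```shell
# sfm_absent: read a software listing on standard input and succeed
# (exit status 0) only if no line mentions SysFaultMgmt.
# Hypothetical helper for illustration; it is not shipped with SFM.
sfm_absent() {
    if grep 'SysFaultMgmt' >/dev/null; then
        return 1    # SFM is still listed
    fi
    return 0        # SFM is not listed
}

# Possible use on an HP-UX system (kept as a comment so that the
# sketch itself does not depend on swlist being present):
#   swlist | sfm_absent && echo "SFM removed"
```

Reading the listing from standard input keeps the helper independent of how the listing is produced, so it can be tried against saved swlist output as well.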
3. For information about errors related to the removal of SFM, enter the following command at the HP-UX prompt:
# swjob -a log <jobid> @ <system name>:/
The jobid is available in the Logfile.
Verifying removal using the CLI
To verify if the SFM software is removed successfully, complete the following steps:
1. Log in to the system as a superuser.
2. Enter the following command at the HP-UX prompt:
# swjob
If the output contains no errors, the SFM software is removed successfully.
3 Configuring indication providers
This chapter describes how to configure indication filters, error logging, and the SFMIndicationProvider.
Configuring indication filters
You must configure the indication filters to view desired indications. Use the Filter Metadata provider (FMD) to configure indication filters that deliver important or desired indications, for example, indications with a certain severity.
Filter Unique Identifier : 10002
Filter Query             : Select * from HP_AlertIndication where (PerceivedSeverity >= 4)
Filter Query Language    : WQL
Filter Source Namespace  : root/cimv2
Filter Description       : Admin Filter
Filter State             : Enabled
Filter Last Operation    : Filter State Add Filter
HP_AlertIndication is derived from CIM_AlertIndication and HP_DeviceIndication is derived from HP_HardwareIndication. HP_HardwareIndication is derived from HP_AlertIndication.
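The sample filter query selects indications whose PerceivedSeverity is 4 or higher. The following sketch shows which severities pass such a condition. The severity names for values 0 through 5 are taken from the DMTF CIM_AlertIndication value map and are an assumption here; the names for 6 and 7 match the WBEM severities listed in this guide. The function names are hypothetical.

```shell
# severity_name N: print the WBEM PerceivedSeverity name for value N.
# Names for 0-5 follow the DMTF CIM value map (assumed, not from this guide).
severity_name() {
    case "$1" in
        0) echo "Unknown" ;;
        1) echo "Other" ;;
        2) echo "Information" ;;
        3) echo "Degraded/Warning" ;;
        4) echo "Minor" ;;
        5) echo "Major" ;;
        6) echo "Critical" ;;
        7) echo "Fatal/Non-recoverable" ;;
        *) echo "Invalid"; return 1 ;;
    esac
}

# passes_admin_filter N: succeed if N satisfies (PerceivedSeverity >= 4).
passes_admin_filter() {
    [ "$1" -ge 4 ]
}

# list_selected: print every severity value that the condition selects.
list_selected() {
    n=0
    while [ "$n" -le 7 ]; do
        if passes_admin_filter "$n"; then
            echo "$n $(severity_name "$n")"
        fi
        n=$((n + 1))
    done
}
```

Calling list_selected prints one line for each of the values 4 through 7, which is the set of severities the sample filter would deliver.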
Sending test event for memory monitor.
NOTE: You can also send test events for other devices that the SFMIndicationProvider monitors. For information on the devices monitored by the SFMIndicationProvider, see Table 2 (page 12).
To view the list of events, enter the following command at the HP-UX prompt:
# evweb eventviewer -L
A list of events, along with details such as event number, severity, and event category, is displayed by querying the Event Archive.
4 Administering indications and instances using HP SIM
This chapter describes System Fault Management (SFM) administration on a remote system using HP Systems Insight Manager (HP SIM).
NOTE: You can perform similar tasks using other management applications that are compliant with version 2.7.2 (or later) of the Common Information Model (CIM) schema of the Distributed Management Task Force (DMTF). The terms events and indications are used interchangeably in this document.
2. To create subscriptions, select Options->Protocol Settings->Global Protocol Settings in the HP SIM Home page, as shown in Figure 2.
Figure 2 HP SIM Home Page
The Global Protocol Settings window is displayed, as shown in Figure 3.
Figure 3 Global protocol settings
3. In Figure 3, under default WBEM settings, select Enable WBEM. Click OK to save your settings.
4. Select Configure->Configure or Repair Agents, as shown in Figure 4.
Figure 4 Configuration
The Configure or Repair agents window is displayed, as shown in Figure 5.
Figure 5 Configure or Repair Agents
5. From the Add targets by selecting from: list in Figure 5, select All systems to view and select the systems.
the selected system. The list of systems is displayed in the Select Target Systems window, as shown in Figure 6.
Figure 6 Select Target Systems
6. To select all the systems in the network, select the Select “All Systems” itself check box, as shown in Figure 6. Click Apply. The Verify Target Systems window is displayed, as shown in Figure 7.
7. Select the appropriate check box to verify the target systems and click Next, as shown in Figure 7. The Enter credentials window is displayed, as shown in Figure 8.
Figure 8 Enter credentials
8. Enter your credentials in the given fields, as shown in Figure 8. Click Next. The Configure or Repair settings window is displayed, as shown in Figure 9.
Figure 9 Configure or Repair settings
9. On the Configure or Repair settings window, click Run Now.
Figure 10 Task Results
10. To obtain a printable report of the indication subscription details, click View Printable Report at the bottom of the window. The report is displayed, as shown in Figure 11.
Figure 11 Printable Report of the indication Subscription
NOTE: For more information, see the HP Systems Insight Manager 6.3 Installation and Configuration Guide for HP-UX at: http://www.hp.
1. Select All Events in the left pane of the HP SIM window. The list of events is displayed, as shown in Figure 12.
Figure 12 Events list
2. To view the details of an event, select the event. The details are displayed at the bottom of the same window, as shown in Figure 13.
3. To obtain the printable version of the event details, click View Printable Details at the bottom of the window. The printable report is displayed in a new window, as shown in Figure 14.
Table 5 EMS, WBEM and Evweb events severity values (continued)
EMS severity    WBEM severity              Evweb severity
4 Serious       6 Critical                 7 Critical
5 Critical      7 Fatal/Non-recoverable    7 Critical
NOTE:
• Perceived severities in Syslog are the same as WBEM severities.
• The WBEM severities are standard. Their numbers appear as the severity value of the actual events recorded in /var/opt/sfm/log/event.log. The Evweb severity numbering matches the HP SMH system status.
Table 7 Property Representation
EMS Hardware Monitors                 EMS wrapper provider / Native indication provider
Event Time                            EventTime
Severity                              PerceivedSeverity
Event                                 EventID
System                                SystemName
Summary                               Summary
Description of Error                  Description
Probable Cause/ Recommended Action    ProbableCauseDescription and RecommendedAction (these two are separate fields)
System Serial Number                  SystemSerialNumber
InquiryVendorID                       HWManufacturer
Physical Device Path                  HWLogicalLocation
InquiryProductID                      DeviceModel
Ph
NOTE: The Severity levels in Table 9 indicate EMS severity.
Table 10 (page 42) displays the default event destinations for SysFaultMgmt.
Table 10 Default monitoring requests for each monitor
Default notification method    Severity levels                           SysFaultMgmt
Textlog                        All                                       textlog: /var/opt/sfm/log/event.log
Syslog                         MAJOR, CRITICAL, FATAL/NON-RECOVERABLE    Available
E-MAIL                         None                                      Not Available
Evweb DB                       All                                       Available (evweb eventviewer -L)
NOTE: The Severity levels in Table 10 indicate WBEM severity.
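Table 10 can be read as a routing rule: every event goes to the text log and to the Evweb database, only MAJOR, CRITICAL, and FATAL/NON-RECOVERABLE events go to syslog, and nothing is sent by e-mail by default. A minimal sketch of that rule follows. The function name default_destinations and the destination labels it prints are hypothetical, and the numeric mapping MAJOR=5, CRITICAL=6, FATAL/NON-RECOVERABLE=7 assumes the standard WBEM PerceivedSeverity values.

```shell
# default_destinations N: print the default destinations for an event
# with WBEM severity N, following the defaults in Table 10.
# Hypothetical helper; the labels are illustrative, not SFM output.
default_destinations() {
    sev=$1
    echo "textlog"        # all severities: /var/opt/sfm/log/event.log
    if [ "$sev" -ge 5 ] && [ "$sev" -le 7 ]; then
        echo "syslog"     # MAJOR, CRITICAL, FATAL/NON-RECOVERABLE only
    fi
    echo "evwebdb"        # all severities: view with evweb eventviewer -L
    # E-mail is not a default destination, so it is never printed.
}
```

For example, default_destinations 2 prints textlog and evwebdb, while default_destinations 6 also prints syslog.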
The sfmconfig -a -r command changes the state of a subsystem. If the command does not work for a processor, check for errors. When a CPU has been deactivated on a system as an action taken against an error symptom, using the sfmconfig command to set the CPU state back to OK has no effect until the faulty processor is replaced or, on HP Superdome 2, acquitted from the Onboard Administrator.
5 Administering indications and instances using HP SMH
This chapter describes the SFM administration tasks that you can perform using HP SMH on a local system.
NOTE: Starting with the September 2009 release, you can use “The equivalent command line” option in the HP SMH GUI to view command line information about processors. For more information, see the cprop manpage (man cprop).
1. Select Show All under System on the HP SMH home page. The system page is displayed.
Figure 16 System Management Homepage
2. Select Processors under System on the HP SMH home page. Information about the processors is displayed.
3. To return to the HP SMH home page, click Home.
Viewing information about System Summary To obtain information about system summary, such as the model, role, UUID, UUID (Logical), Serial number, Serial number (Logical) and many more, complete the following steps: 1. Select System Summary under System on the HP SMH home page. System summary information is displayed. 2. To return to the HP SMH home page, click on Home.
Viewing information about Management Processor To obtain information about the Management Processor (MP), such as its IP address, status, and URL, complete the following steps: 1. Select Management Processor under System on the HP SMH home page. Information about the management processor is displayed. 2. To return to the HP SMH home page, click on Home.
1. Select Blade under System on the HP SMH home page. Information about the Blade is displayed. 2. To return to the HP SMH home page, click on Home. Viewing information about Cell Blade To obtain information about the Cell Blade, such as the Status, Hardware Path and OA partition Information of the enclosures, complete the following steps: 1. Select Cell Blade under System on the HP SMH home page. Information about the Cell Blade is displayed. 2. To return to the HP SMH home page, click on Home.
For more information, see HP System Management Homepage Online Help. In HP SMH, go to the Help menu. Administering indications using Evweb This section provides an overview of Evweb and describes how to use Evweb for administrative tasks, such as creating and managing subscriptions for indications.
HPUXStorageNativeProviderModule
Support for these native indication providers is available on HP Integrity Servers.
Launching Evweb for administration
You can launch Evweb either through the CLI or through the HP SMH GUI. To launch Evweb for administering event subscriptions using the CLI, enter the following command at the HP-UX prompt:
# evweb subscribe
To use the HP SMH GUI to launch Evweb for administering event subscriptions, complete the following steps:
1. Log in to HP SMH.
Creating an event subscription using the GUI To create a new event subscription, complete the following steps: 1. Repeat steps 1-5 from “Launching Evweb for administration” (page 50). 2. Select Create subscription in the action pane on the top right corner of the Event subscription administration page. The Create subscription page is displayed. 3. Provide appropriate information in the fields present in the Create subscription page. NOTE: 4. 5.
IMPORTANT: The subscription criteria are not copied when you copy an HP Advised event subscription. Therefore, ensure that you specify the subscription criteria in the Copy and create subscription page.
NOTE: It is mandatory to specify a unique name in the Subscription name field.
5. Select Create on the Copy and create subscription page. Evweb creates the event subscription and displays a confirmation message.
6. Click OK on the confirmation message window.
NOTE: There is no CLI equivalent for this action.
6. Select Modify in the Modify subscription page. Evweb modifies the event subscription and displays a confirmation message. 7. Click OK on the confirmation message window. For more information on modifying an event subscription using the HP SMH GUI, select Help on the action pane of the Modify event subscription page.
Example 1
# evweb subscribe -L
Subscription Name      HP Known Is Deprecated  Event Archive  Email    Syslog
====================== ======== ============== ============== ======== ========
HP_defaultSyslog       FALSE    FALSE          FALSE          FALSE    TRUE
test                   FALSE    FALSE          TRUE           FALSE    FALSE
HP_General Filter@1_V1 TRUE     FALSE          TRUE           FALSE    FALSE
# evweb subscribe -Mn test -r
The execution of 'evweb subscribe' command was successful.
IMPORTANT: The subscription criteria are not copied when you copy an HP Advised event subscription. Therefore, ensure that you specify the subscription criteria in the Copy and modify subscription page. For more information on copying and modifying an event subscription using the HP SMH GUI, select Help on the action pane of the Copy and Modify event Subscription page. Deleting Evweb event subscriptions You must periodically delete event subscriptions that are not required.
Configuring E-mail Consumer The E-mail Consumer is a component of Evweb that receives indications from the WBEM Services and redirects them to an SMTP server. Normally, the local system itself is the e-mail server. In such cases, you need not configure the E-mail Consumer. If the e-mail server is not on the local system, you must configure the E-mail Consumer. To configure the E-mail Consumer, complete the following steps: 1. Open the /var/opt/sfm/conf/evweb.conf file on the system. 2.
The Event subscription administration page displays a summary of the event subscriptions in a tabular format. In this document, this table is referred to as the event subscription summary table. For more information on viewing a summary of an evweb event subscription using the HP SMH GUI, select Help on the action pane of the event subscription page.
2. Select the event subscription from the event subscription Table. The details of the event subscription are displayed at the end of the event subscription Table. For more information on viewing the details of an Evweb event subscription using the HP SMH GUI, select Help on the action pane of the event subscription page.
For more information on viewing external event subscriptions using the HP SMH GUI, select Help on the action pane of the View external event subscription page.
Viewing external event subscriptions using the CLI
To view a summary of the external event subscriptions using the CLI, enter the following command at the HP-UX prompt:
# evweb subscribe -L -b external
Where:
-L is an option used to list all the event subscriptions.
-b is a switch used to display information about event subscriptions in brief.
NOTE: Evweb enables both administrators and non-administrators to search and view WBEM events. However, only administrators can delete WBEM events. The Event Viewer enables you to view, search, and delete WBEM events that are present in the Event Archive. It also enables you to view both detailed and summary information of WBEM events. You can also search for WBEM events stored in the Event Archive using the advanced search feature.
To filter the WBEM events based on their severity, complete the following steps: 1. Repeat steps 1-5 from “Launching Evweb for viewing WBEM indications” (page 60). 2. Select Critical to search for critical events. Similarly, select Major, Minor, Warning, Information, Other, Normal, or Unknown to search for the respective WBEM events. 3. To view all events, select All Events.
Viewing summary information using GUI To view summary information about WBEM events, repeat steps 1-5 from “Launching Evweb for viewing WBEM indications” (page 60). The List Events page is displayed. The List Events table displays summary information about the WBEM events that are stored in the Event Archive. For more information on viewing summary information of the WBEM events using the HP SMH GUI, select Help on the action pane of the List Events page.
1. Repeat steps 1-5 from “Launching Evweb for viewing WBEM indications” (page 60).
2. Select Delete Events in the action pane on the top right corner of the page. The Delete Events page is displayed.
3. Select the event you want to delete by selecting the appropriate check box. You can delete more than one WBEM event at a time.
4. Select Delete in the List Events page. Evweb deletes the event from the event archive database and displays a confirmation message.
5.
The Log Viewer enables you to view low level log information from the log database on a local HP-UX system. The low level log information includes Log Id, Log Index, Device Id, Device Type, and Time of occurrence. You can find the low level log information in both the Current Log Database and the Archive Log Database. By default, the most recent low level log information is stored in the current log database.
For information on searching the log database using GUI, select Help on the action pane of the Log Viewer page. Searching Low Level Logs using Advanced Search To search the log database for low level logs using Log Viewer, complete the following steps: 1. Repeat steps 1-5 from “Searching Low Level Logs using Simple Search” (page 64). 2. Select Advanced Search on the right pane of the Log Viewer page. The Advanced Search page is displayed. 3. 4.
Viewing List Of Low Level Logs using GUI To view a list of low level logs using Log Viewer, complete the following steps: 1. Repeat steps 1-5 from “Searching Low Level Logs using Simple Search” (page 64). 2. Depending on the information you have, perform either a simple search or an advanced search. Based on the search criteria, the log records are displayed in a tabular format. For information on viewing the low level logs using GUI, select Help on the action pane of the Log Viewer page.
Viewing Details Of Low Level Logs using GUI To view details of low level logs using Log Viewer, complete the following steps: 1. Repeat steps 1-5 from “Searching Low Level Logs using Simple Search” (page 64). 2. Depending on the information you have, perform either a simple search or an advanced search. Based on the search criteria, the log records are displayed in the Log Summary Table. 3. Select the desired low level log from the Log Summary Table.
• “Disabling Tracing using the Evweb GUI” (page 70) • “Disabling Tracing using the Evweb CLI” (page 71) Tracing is a feature that enables you to log and report errors. You can use tracing to log information about problems encountered while using Evweb. In this document, this information is referred to as trace records. The trace records are stored in the /var/opt/sfm/log/sfm.log file. The logs are written when SFM is running.
Table 16 Trace Levels (continued) Trace Level Description Example: A warning is generated when the default setting is modified. Both critical and error situations are logged at the Warning trace level. 4-Information The system logs situations that result in information messages. Example: An information message is generated when a non-administrator attempts to perform a task that can be performed only by an administrator. Critical, error, and warning situations are logged at the Information trace level.
# export EVWEB_TRACE_LEVEL=<trace value>
Tracing is now enabled. The trace value is the trace level that you have set.
Modifying Tracing using the Evweb GUI
To modify tracing, complete the following steps:
1. Log in to the System Management Homepage. To log in to HP SMH, enter http://<hostname>:2301 in the address bar of a Web browser. The HP SMH login screen is displayed.
2. Enter your user name and password in the appropriate text boxes.
3. Click Sign In on the login screen.
NOTE: The Disable Tracing option is not displayed if tracing is not enabled.
5. Select Disable Tracing available on the top right corner of the page. Tracing is disabled and a confirmation message is displayed.
6. Click OK on the confirmation message window.
For more information on disabling tracing using the HP SMH GUI, select Help on the action pane of either the Event Viewer or the Event Subscription Administration page.
http://www.hp.com/go/smh-docs
Launching EMT
You can launch EMT either through the CLI or the GUI. To launch EMT using the CLI, enter the following command at the HP-UX prompt:
# /opt/sfm/bin/emtui
To launch EMT using the HP SMH GUI, complete the following steps:
1. Log in to the HP SMH. To log in to the HP SMH, enter http://<hostname>:2301 in the address bar of the Web browser. The HP SMH login screen is displayed.
2. Enter your user name and password in the appropriate text boxes.
3.
Following are the match types:
• any (default) - Searches for at least one word specified in the query string.
• all - Searches for all words specified in the query string.
• phrase - Searches for the exact phrase specified in the query string.
-i is an option that enables you to specify the error number. Error number is a unique identification number for the errors present in CER.
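The difference between the any, all, and phrase match types described above can be illustrated with a small sketch. This is not how EMT implements its search; it only models the three definitions on plain, space-separated text. Case handling, punctuation, and the function name match_query are assumptions made for the illustration.

```shell
# match_query TYPE QUERY TEXT: succeed if TEXT matches QUERY under TYPE.
# Illustrative model of the any/all/phrase definitions; hypothetical helper.
match_query() {
    mtype=$1; query=$2; text=$3
    case "$mtype" in
        phrase)
            # The exact phrase must occur somewhere in the text.
            case "$text" in *"$query"*) return 0 ;; esac
            return 1 ;;
        all)
            # Every word of the query must occur as a word of the text.
            for w in $query; do
                case " $text " in *" $w "*) ;; *) return 1 ;; esac
            done
            return 0 ;;
        any|'')
            # At least one word of the query must occur (the default).
            for w in $query; do
                case " $text " in *" $w "*) return 0 ;; esac
            done
            return 1 ;;
    esac
    return 2    # unknown match type
}
```

With the text "a disk failed", the query "disk error" matches under any but not under all, and the query "disk failed" matches under phrase while "failed disk" does not.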
# emtui -b Where: -b is an option used to view information about events in brief. A list of events in CER is displayed. For more information on viewing summary information about an event in CER using the CLI, see emtui(1). Viewing Detailed Information using the GUI You can view detailed information about an event by clicking on any row in the Error Summary Table. To view detailed information about an event stored in CER using the HP SMH GUI, complete the following steps: 1.
3. Click on the desired event from the Event Summary Table. The details of the event are displayed in the Detailed Error Information (Administrative View) pane.
4. If there is no cause associated with the event, skip to Step 5. If a cause is associated with the event, select the cause for the event from the Detailed Error Information (Administrative View) pane. You can select multiple causes for an event.
5.
6. Click OK on the confirmation message window. The modified custom solution is displayed on the Detailed Error Information (Administrative View) pane. For more information about modifying a custom solution using the HP SMH GUI, select Help on the action pane of the Modify a Custom Solution page.
Tracing EMT This section provides an overview of tracing and information about the various trace levels in EMT. This section also describes how to enable, modify, and disable tracing.
For more information about enabling tracing using the HP SMH GUI, select Help on the action pane of the Enable Tracing page.
Enabling Tracing using the EMT CLI
To enable tracing using the EMT CLI, you must export the environment variable, EMTUI_TRACE. To export EMTUI_TRACE, enter the following command at the HP-UX prompt:
# export EMTUI_TRACE=<trace value>
Tracing is now enabled. The trace value is the trace level that you want to set.
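When the export is scripted, it can help to reject an obviously invalid value before setting the variable. The following is a minimal sketch, assuming that EMT trace levels are whole numbers in the same way as the numbered Evweb trace levels; the valid range is not checked because it is not stated here. The function name set_emt_trace is hypothetical.

```shell
# set_emt_trace LEVEL: export EMTUI_TRACE only if LEVEL is a whole number.
# Assumption: trace levels are numeric. Hypothetical helper, not part of EMT.
set_emt_trace() {
    case "$1" in
        ''|*[!0-9]*)
            echo "trace level must be a whole number" >&2
            return 1 ;;
    esac
    EMTUI_TRACE=$1
    export EMTUI_TRACE
}
```

Because the function runs in the current shell, the exported variable is visible to an emtui command started afterwards from the same session.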
6 Troubleshooting SFM
This chapter describes how to troubleshoot SFM providers and EVWEB. This chapter addresses the following topics:
• “Troubleshooting instance providers” (page 79)
• “Troubleshooting indication providers” (page 84)
• “Troubleshooting EVWEB” (page 89)
NOTE: For information on issues with oserrorlogd, see the Using PSB Components section of the ProviderSvcsBase administrator guide.
Table 18 Troubleshooting instance providers (continued)
Problem    Cause    Solution
by entering the following command at the HP-UX prompt:
On Itanium-based systems, enter:
# ln -s /opt/sfm/lib/libsfmproviders.1 \
/opt/wbem/providers/lib/libsfmproviders.so
On PA-RISC-based systems, enter:
# ln -s /opt/sfm/lib/libsfmproviders.1 \
/opt/wbem/providers/lib/libsfmproviders.sl
7.
Table 18 Troubleshooting instance providers (continued)
Problem    Cause    Solution
Alternatively, you can enter the following command at the HP-UX prompt to start SFMProviderModule:
# sh /opt/sfm/bin/restart_sfm.sh
Logs are written to /var/opt/sfm/log/state.log when SFM is in the Degraded state and is being re-enabled by the script.
NOTE: The script restarts SFM only if SFMProviderModule is in Degraded state.
3.
Table 20 Troubleshooting instance providers (continued)
Problem: Requests for instances do not return any value.
Causes    Solution
3. If the output displayed is different from this output, the provider module is not registered. To register the provider module, enter the following command at the HP-UX prompt:
# cimmof -nroot/PG_InterOp /opt/sfm/schemas/mof/SFMProvidersR.mof
4. If no errors are displayed, the provider module is registered successfully. Ignore this step and move to step 6.
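The registration step above can be scripted so that a failure is reported explicitly. This is a minimal sketch under two assumptions that this guide does not state: that cimmof returns a nonzero exit status when registration fails, and that checking the exit status is equivalent to checking that no errors are displayed. The function name register_sfm_providers and the CIMMOF override variable are hypothetical and exist only for illustration and testing.

```sh
#!/bin/sh
# Sketch only. CIMMOF may be overridden (for example, in a dry run);
# by default the cimmof command named in this guide is used.
CIMMOF=${CIMMOF:-cimmof}
SFM_MOF=/opt/sfm/schemas/mof/SFMProvidersR.mof

# Hypothetical helper: registers the SFM provider module and reports the result.
# Assumes cimmof exits with a nonzero status on failure.
register_sfm_providers() {
    if "$CIMMOF" -nroot/PG_InterOp "$SFM_MOF"; then
        echo "registered"
    else
        echo "registration failed" >&2
        return 1
    fi
}
```

If the helper reports a failure, continue with the remaining steps of the troubleshooting procedure rather than retrying blindly.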
1. Enter the following command at the HP-UX prompt to disable SFMProviderModule:
# cimprovider -d -m SFMProviderModule
2. Enter the following command at the HP-UX prompt to enable SFMProviderModule:
# cimprovider -e -m SFMProviderModule
Alternatively, you can enter the following command at the HP-UX prompt to start SFMProviderModule:
# sh /opt/sfm/bin/restart_sfm.
Table 22 Troubleshooting instance providers (continued)
Problem: Indications that fulfill the conditions defined in the HP-Known HP-Defined filters are not logged in the Event Archive.
Cause    Solution
To execute the file, enter the following command at the HP-UX prompt:
# wbemexec <full path>/EnumerateInstances.xml
The full path is the absolute path of the EnumerateInstances.xml file.
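Because the command expects the absolute path of the request file, it can help to resolve the path before calling wbemexec. The following is a minimal sketch: the helper names abs_path and run_request are hypothetical, and the only things taken from this guide are the wbemexec command and the EnumerateInstances.xml file name.

```sh
#!/bin/sh
# Sketch only: abs_path and run_request are hypothetical helpers.

# Prints the absolute path of the file given as the first argument.
# The directory that contains the file must exist.
abs_path() {
    dir=$(cd "$(dirname "$1")" && pwd -P) || return 1
    printf '%s/%s\n' "$dir" "$(basename "$1")"
}

# Resolves the request file to an absolute path and passes it to wbemexec.
run_request() {
    request=$(abs_path "$1") || return 1
    wbemexec "$request"
}
```

For example, running `run_request EnumerateInstances.xml` from the directory that contains the file passes the resolved absolute path to wbemexec.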
Table 24 Troubleshooting indication providers (continued)
Problem: Indications corresponding to events generated by the Event Monitoring Service (EMS) monitors are not logged in the Events List.
Causes    Solution
# cimserver
After the CIMOM restarts, enter the following command at the HP-UX prompt to register the provider module:
# cimmof -nroot/PG_InterOp /opt/sfm/schemas/mof/SFMProvidersR.
To enumerate instances, enter the following command at the HP-UX prompt:
# wbemexec /enumerateInstances_sub.
Cause 5
The indication providers are not loaded properly.
To check if the indication providers are loaded properly, complete the following steps:
1. Open the /var/opt/sfm/conf/FMLoggerConfig.xml file.
2.
If the status of SFMProviderModule is Degraded as displayed in the given output, SFMProviderModule is not running. To enable SFMProviderModule, complete the following steps:
1. Enter the following command at the HP-UX prompt to disable SFMProviderModule:
# cimprovider -d -m SFMProviderModule
2.
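The status check that precedes the disable and enable steps in this chapter can be sketched as a filter over a provider module listing. This is a sketch under stated assumptions: the listing is assumed to be a two-column MODULE and STATUS layout, and `cimprovider -l -s` is assumed to be the command that produces it. The helper names module_status and needs_restart are hypothetical.

```sh
#!/bin/sh
# Sketch only: both helpers read a two-column "MODULE STATUS" listing on stdin.
# The listing layout is an assumption, not taken from this guide.

# Prints the status of the module named in the first argument.
# Returns nonzero if the module is not in the listing.
module_status() {
    awk -v m="$1" '$1 == m { print $2; found = 1 } END { exit !found }'
}

# Succeeds only if the named module is reported as Degraded.
needs_restart() {
    [ "$(module_status "$1")" = "Degraded" ]
}

# Possible use on a managed system (not executed here):
#   if cimprovider -l -s | needs_restart SFMProviderModule; then
#       cimprovider -d -m SFMProviderModule
#       cimprovider -e -m SFMProviderModule
#   fi
```

Reading the listing from standard input keeps the check separate from the commands that change the module state, so the check can be tried against saved output first.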
Table 25 Troubleshooting EVWEB (continued)
Problem    Cause    Solution
Cause: The Event Archive Database service is not running properly.
Solution: If the Event Archive Database service is running and the problem persists, check for error logs in the /var/opt/sfm/log/sfm.log file. For further assistance, contact the HP support center.
Problem: WBEM Indications are not logged in to the Event Archive.
Cause 1: SFMProviderModule is not running.
IPProviderModule    OK
SFMProviderModule    Degraded
If the status of SFMProviderModule is Degraded as displayed in the given output, SFMProviderModule is not running. To enable SFMProviderModule, complete the following steps:
1. Enter the following command at the HP-UX prompt to disable SFMProviderModule:
# cimprovider -d -m SFMProviderModule
2.
Cause: Version C.06.00.07.01, September 2009 release, to provide a summary of event information of critical and serious events. The default subscription to
Solution:
1. Delete the monitoring request using monconfig. Enter the following command to delete the monitoring request:
# /etc/opt/resmon/lbin/monconfig
2. Select the delete option to delete the request.
(D)elete a monitoring request
3.
A EMT Message Definition
Following is a sample EMT Message file:
$ Descriptor Header begins
$ <> DescriptorID=0000023100000010800000AA006D2EA3
$ <> ProductName = myProduct
$ <> ProductID = ID
$ <> ProductEmailAlias=myproduct@abc.com
$ <> OrgName = myorg
$ <> OrgType = ISV
$ <> Subsystem={(Type=EMS, Name= dm_chassis),(Type=WBEM, Name=FileSystemProvider)}
$ <> ProductCategory=Kernel
$ <> MsgCat={ID=1,(Path=./ lvmcommonmessages.cat, Locale= ja_JP.
Table 26 EMT Message File Description (continued)
Tag    Description    Usage
ProductCategory
Description: Specify one or more of the following product categories that best describe your product:
• Hardware
• Network
• IO
• Kernel
• Commands
• Others
Usage: ProductCategory=Kernel,IO,Network
MsgCat
Description: Specify a list of message catalogs. The MsgCat tag has the following attribute: ID – A unique number used to identify an error message.
Usage: MsgCat={(ID=1, Path=../../../bin/cat/en_US.iso88591/module1.cat,LocaleName= en_US.
only one cause and one or more corrective actions for a given error message, the Action tags are associated with the Cause tag. In such a situation, the Cause_Action tag is not mandatory. For any error, a cause can be specified without specifying the corrective action. However, a corrective action cannot be specified without specifying a cause.
WBEMDetail    Specify WBEM-specific details of a message.
B Interpretation of HP SMH instances
This appendix describes the fields and enables you to interpret the instances in the HP SMH property pages.
Processor instances
This section describes the processor instances.
Figure 20 Sample Processors property page
Table 27 (page 97) describes the fields and enables you to interpret the values displayed in Figure 20 (page 97).
Table 27 Description of the Processors Fields and Values
Fields and Values    Description
Status    Indicates the status of the processors. An OK status indicates that all the processors are functioning properly. Click Events to see the details of the errors.
Memory instances
This section describes the memory instances.
Figure 21 Sample Memory property page
Table 28 (page 98) and Table 29 (page 99) describe the fields and enable you to interpret the values displayed in Figure 21 (page 98).
Table 28 Description of the Memory Slots Fields and Values
Fields and Values    Description
Status    Indicates the status of the memory module. An OK status indicates that all the modules are configured properly.
Part Number    Indicates the part number of the memory.
HashID    Identifies an instance of the device.
Table 29 Description of the Empty Slots Fields and Values
Fields and Values    Description
Location    Indicates the location of the memory. Attributes such as Cabinet Number, Cell Slot, and DIMM Slot help narrow down the location of the memory module.
Table 30 Description of the Memory Slots Fields and Values (continued)
Fields and Values    Description
Logical memory information
Physical memory information
Device Bay Information    Indicates the URL to launch the blade information page on the OA.
NOTE: Memory information displayed is as viewed from a hard partition (nPar).
System Summary instances
This section describes the system summary instances.
Figure 23 Sample System Summary property page
Table 31 (page 101), Table 32 (page 102), and Table 33 (page 102) describe the fields and enable you to interpret the values displayed in Figure 23 (page 101).
Table 31 Description of the General Information Fields and Values
Fields and Values    Description
Model    Describes the system model.
UUID    Universally Unique ID (UUID) indicates the asset number of the system.
UUID (Logical)    Indicates the UUID of the logical server. A logical server is a software configuration that can be applied to a server blade or a virtual machine. Also, you can move a logical server from one server blade or a virtual machine to another.
Cooling Device instances
This section describes the cooling device instances.
Figure 24 Sample Cooling device property page
Table 34 (page 103) describes the fields and enables you to interpret the values displayed in Figure 24 (page 103).
Table 34 Description of the Cooling Device Fields and Values
Fields and Values    Description
Status    Indicates the status of the fans. An OK status indicates that all the modules are configured properly.
Power supply instances
This section describes the power supply instances.
Figure 25 Sample Power property page
Table 35 (page 104) describes the fields and enables you to interpret the values displayed in Figure 25 (page 104).
Table 35 Description of the Power Supply Fields and Values
Fields and Values    Description
Status    Indicates the status of the power supply. An OK status indicates that the power supplies are configured properly.
Temperature instances
This section describes the temperature instances.
Figure 26 Sample Temperature property page
Table 36 (page 105) describes the fields and enables you to interpret the values displayed in Figure 26 (page 105).
Table 36 Description of the Temperature Fields and Values
Fields and Values    Description
Status    Indicates whether the sensor temperature in the system is normal or not. However, the status of the sensor temperature does not reflect the status of the cooling devices.
Voltage instances
This section describes the voltage instances.
Figure 27 Sample Voltage property page
Table 37 (page 106) describes the fields and enables you to interpret the values displayed in Figure 27 (page 106).
Table 37 Description of the Voltage Fields and Values
Fields and Values    Description
Status    Indicates whether the sensor voltage in the system is normal or not. An OK status indicates that the sensor voltage in the system is normal.
HashID    Identifies an instance of the device.
FRU Information instances
This section describes the FRU Information instances.
Figure 28 Sample FRU Information property page
Table 38 (page 107) describes the fields and enables you to interpret the values displayed in Figure 28 (page 107).
Table 38 Description of the MP Fields and Values
Fields and Values    Description
Name    Indicates the FRU Name of the Physical Element.
Serial Number    Indicates the serial number of the FRU.
HashID    Identifies an instance of the device.
Management Processor instances
This section describes the Management Processor (MP) instances.
Figure 29 Sample MP property page
Table 39 (page 108) describes the fields and enables you to interpret the values displayed in Figure 29 (page 108).
Table 39 Description of the MP Fields and Values
Fields and Values    Description
Status    Indicates whether the Management Processor (MP) is functioning properly or not. An OK status indicates that the MP is functioning properly.
Firmware Information instances
This section describes the Firmware Information instances.
Figure 30 Sample Firmware Information property page
Table 40 (page 109) describes the fields and enables you to interpret the values displayed in Figure 30 (page 109).
Table 40 Description of the Firmware Information Fields and Values
Fields and Values    Description
Name    Indicates the name of the entity, such as the system firmware, MP, or the system backplane cell, whose firmware information is displayed.
Enclosure Information instances
This section describes the Enclosure instances.
Figure 31 Sample Enclosure property page
Table 41 (page 110) describes the fields and enables you to interpret the values displayed in Figure 31 (page 110).
Table 41 Description of the Enclosure Information Fields and Values
Fields and Values    Description
Status    Indicates the status of the enclosure. An OK status indicates that the components of the enclosure are functioning properly.
Complex-wide Info instances
This section describes the Complex-wide Info instances.
Figure 32 Sample Complex-wide Info property page
Table 42 (page 112), Table 43 (page 112), and Table 44 (page 112) describe the fields and enable you to interpret the values displayed in Figure 32 (page 111).
Table 42 Description of the Complex Information Fields and Values
Fields and Values    Description
Complex Name    Describes the user-defined name for the complex.
Model    Defines the model identification string.
Serial Number    Indicates the serial number of the complex as assigned by the original manufacturer.
Revision    Displays the string for the revision number of the profile, consisting of the major and minor revision numbers concatenated with a period as a separator.
Cell Board instances
This section describes the Cell Board instances.
Figure 33 Sample Cell Board property page
Table 45 (page 113) describes the fields and enables you to interpret the values displayed in Figure 33 (page 113).
Table 45 Description of the Cabinet Fields and Values
Fields and Values    Description
Firmware Version    Displays the string for the firmware revision number, consisting of the major number separated from the minor number by a period.
Status    Indicates the status of the component.
Total Processor Slots    Indicates the number of processor module slots on the cell.
Total Empty Processor Slots    Indicates the number of all empty processor slots.
Processors Per Module    Indicates the number of processors per processor module on the cell.
Total Installed Processor Modules    Indicates the number of all installed processor modules in the cell.
Partition Information instances
This section describes the Partition Information instances.
Figure 34 Sample Partition Information property page
Table 46 (page 115) describes the fields and enables you to interpret the values displayed in Figure 34 (page 115).
Table 46 Description of the Partition Fields and Values
Fields and Values    Description
Partition Name    Describes the user-defined name with the numeric label for the partition.
nPartition ID    Indicates the ID of the nPartition in the complex.
Total Deconfigured Processor Modules    Indicates the number of all deconfigured processor modules in the partition.
Total Installed Memory    Displays the total amount of memory installed in the partition, in megabytes.
Total Installed Cells    Indicates the number of all cells installed in the partition.
Total Active Cells    Indicates the number of all active cells in the partition.
Blade instances
This section describes the Blade instances.
Figure 35 Sample Blade property page
Table 47 (page 117) describes the fields and enables you to interpret the values displayed in Figure 35 (page 117).
Table 47 Description of the Blade Fields and Values
Fields and Values    Description
Status    Indicates the status of the blade.
Hardware Path    Indicates the hardware path of the blade.
Serial Number    Indicates the serial number of the blade.
Cell Blade instances
This section describes the Cell Blade instances.
Figure 36 Sample Cell Blade property page
Table 48 (page 118) describes the fields and enables you to interpret the values displayed in Figure 36 (page 118).
Table 48 Description of the Cell Blade Fields and Values
Fields and Values    Description
Status    Indicates the status of the blade.
Hardware Path    Indicates the hardware path of the blade.
Launch the Onboard Administrator
To access the Onboard Administrator (OA) from the property pages, complete the following steps:
1. Click the Onboard Administrator link on the property page.
Figure 37 Onboard Administrator
2. The OA login page opens in a new browser window.
Figure 38 OA login page
3. Enter the Onboard Administrator User name and Password.
C Syslog property order
This appendix describes the order of the properties (IndicationIdentifier, EventID, PerceivedSeverity, ProviderName and Summary) in the event message that the HP_defaultSyslog subscription writes to syslog.
NOTE: The term legacy refers to HP Integrity Servers with Intel(R) Itanium(R) processors older than 9300. The term HP Integrity Servers refers to servers with Intel(R) Itanium(R) 9300 processors.
Glossary
A-B
Admin-defined event subscription    Subscriptions created by the administrator using the CLI. These subscriptions cannot be deleted.
Admin-defined filters    Filters that can be created, deleted, and modified to set the criteria for indications that must be logged.
C
Central Management Server (CMS)    The server monitoring the client systems in the network using SFM.
CIM client    An entity in WBEM architecture which sends CIM Operation requests and receives CIM Operation responses.
External subscriptions    These are subscriptions created by tools other than EVWEB.
H
HP System Management Homepage (HP SMH)    HP's management application installed on the local system that uses WBEM instrumentation on operating systems such as HP-UX, Linux, and Windows.
HP Systems Insight Manager (HP SIM)    HP's management application installed on the CMS that uses WBEM instrumentation on operating systems such as HP-UX, Linux, and Windows.
SysFaultMgmt    The name of the bundle that includes the SFM software.
T-V
Tracing    Tracing is an error-logging and reporting facility provided by EVWEB and EMT.
W-Z
WBEM (Web-Based Enterprise Management)    A collection of standards that aid large-scale systems management. WBEM allows management applications to monitor systems in a network.
7 Support and other resources
About this document
This document describes how to install, administer, and troubleshoot the System Fault Management (SFM) software and its components. Document updates may be issued between editions to correct errors or to document product changes. To ensure that you receive the updated or new editions, subscribe to the appropriate product support service. Contact your local HP sales representative for more information. This document can also be found online at: http://www.hp.
Chapter 5 Administering Indications and instances using HP SMH    Describes how to use the HP System Management Homepage (HP SMH) GUI to administer indications and view instances on the local system.
Chapter 6 Troubleshooting SFM    Describes how to troubleshoot SFM providers and EVWEB.
Appendix A    Describes the EMT message file.
Appendix B    Interpretation of HP SMH instances.
Appendix C    Describes the Syslog property order.
New and changed information in this edition
• Table 3 (page 14) lists the instance and indication providers supported on different platforms.
• A new appendix, “Syslog property order” (page 120), describes the order of the three properties (EventID, PerceivedSeverity and ProviderName) in the event message that the HP_defaultSyslog subscription writes to syslog.
Related information
Additional information about SFM is available at: http://www.hp.