NonStop S-Series Operations Guide (G06.27+)


HP NonStop S-Series

Operations Guide

Abstract

This guide describes how to perform routine system hardware operations for HP NonStop™ S-series servers. These tasks include monitoring the system, performing common operations tasks, and performing routine hardware maintenance. This guide is written for system operators.

Product Version

N.A.

Supported Release Version Updates (RVUs)

This guide supports G06.27 and all subsequent G-series RVUs until otherwise indicated by its replacement publication.

Part Number

522459-008

Published

September 2005

Summary of content (296 pages)

PAGE 2
Document History

Part Number   Product Version   Published
522459-008    N.A.              September 2005
522459-007    N.A.              September 2004
522459-005    N.A.              September 2003
522459-004    N.A.              August 2002
522459-003    N.A.
PAGE 3
HP NonStop S-Series Operations Guide
Index  Examples  Figures  Tables
What’s New in This Manual xiii
Manual Information xiii
New and Changed Information xiii
About This Guide xv
Who Should Use This Guide xv
What Is in This Guide xvi
Where to Get More Information xvii
Support and Service Library xviii
Notation Conventions xviii
1.
PAGE 4
Contents
1. Introduction to NonStop S-Series Operations (continued)
Launching OSM and TSM Applications 1-11
Troubleshooting OSM and TSM Sessions 1-11
Guided Procedures 1-13
2.
PAGE 5
Contents
4. Monitoring EMS Event Messages
When to Use This Section 4-1
What Is the Event Management Service (EMS)? 4-1
Tools for Monitoring EMS Event Messages 4-1
EMSDIST 4-1
OSM Event Viewer 4-2
TSM Event Viewer 4-2
ViewPoint 4-2
Related Reading 4-3
5.
PAGE 6
Contents
7. ServerNet/DA: Monitoring and Recovery
When to Use This Section 7-1
Overview of the ServerNet/DA 7-1
Monitoring the ServerNet/DA 7-1
Identifying Problems With the ServerNet/DA 7-2
Recovery Operations for the ServerNet/DA 7-3
Related Reading 7-3
8.
PAGE 7
Contents 10. Tape Drives: Monitoring and Recovery 10.
PAGE 8
Contents
11. Processors: Monitoring and Recovery (continued)
Backing Up a Processor Dump to Tape 11-25
Replacing Processor Memory or a PMF CRU 11-26
Submitting Information to Your Service Provider 11-26
Related Reading 11-29
12.
PAGE 9
Contents 15. Power Failures: Preparation and Recovery 15.
PAGE 10
16. Starting and Stopping the System (continued) Contents 16.
PAGE 11
B. Tools and Utilities for Operations (continued) Contents B.
PAGE 15
What’s New in This Manual
Manual Information
HP NonStop S-Series Operations Guide
PAGE 16
What’s New in This Manual

New and Changed Information
• Modified Figure 2-14, SCF LISTDEV Output, on page 2-29, changing the LISTDEV example to show blanks in the BPID field where it previously showed 0,0. A value of 0,0 is valid and means something different from a blank field.
• Added new and changed information to Section 9, Disk Drives: Monitoring and Recovery.
PAGE 17
About This Guide This guide describes how to perform routine system hardware operations for NonStop S-series servers. Note. S-series refers to the hardware that makes up the server. G-series refers to the software that runs on the server. The term NonStop Sxx000 represents the NonStop S70000, NonStop S72000, NonStop S74000, NonStop S76000, and NonStop S86000 servers. The term NonStop S7x00 represents the NonStop S7400 and higher numbered servers.
PAGE 18
What Is in This Guide

Section or Appendix   Section and Appendix Titles
Section 1             Introduction to NonStop S-Series Operations
Section 2             Determining Your System Configuration
Section 3             Overview of Monitoring and Recovery
Section 4             Monitoring EMS Event Messages
Section 5             Processes: Monitoring and Recovery
Section 6             Communications Subsystems: Monitoring and Recovery
Section 7             ServerNet/DA: Monitoring and Recovery
Section 8             Fibre Channel ServerNet Adapter:
PAGE 19
Where to Get More Information

Operations planning and operations management practices appear in these manuals:
• Introduction to NonStop Operations Management
• Availability Guide for Application Design
• Availability Guide for Change Management
• Availability Guide for Problem Management

For comprehensive information about performing operations tasks for a NonStop S-series server, you need both this guide and the Guardian User’s Guide.
PAGE 20
Notation Conventions About This Guide OSM is the required system management tool for servers that use 6780 switches in ServerNet clusters, but OSM also provides system management for earlier versions of ServerNet clusters. For other documentation related to operations tasks, refer to Appendix C, Related Reading.
PAGE 21
General Syntax Notation

UPPERCASE LETTERS. Uppercase letters indicate keywords and reserved words; enter these items exactly as shown. Items not enclosed in brackets are required. For example:
MAXATTACH

lowercase italic letters. Lowercase italic letters indicate variable items that you supply. Items not enclosed in brackets are required. For example:
file-name

computer type.
PAGE 22
… Ellipsis. An ellipsis immediately following a pair of brackets or braces indicates that you can repeat the enclosed sequence of syntax items any number of times. For example:
M address [ , new-value ]…
[ - ] {0|1|2|3|4|5|6|7|8|9}…

An ellipsis immediately following a single syntax item indicates that you can repeat that syntax item any number of times. For example:
"s-char…"

Punctuation.
PAGE 23
Notation for Messages

The user must press the Return key after typing the input.

Nonitalic text. Nonitalic letters, numbers, and punctuation indicate text that is displayed or returned exactly as shown. For example:
Backup Up.

lowercase italic letters. Lowercase italic letters indicate variable items whose values are displayed or returned. For example:
p-register
process-name

[ ] Brackets. Brackets enclose items that are sometimes, but not always, displayed.
PAGE 24
Change Bar Notation About This Guide Change Bar Notation Change bars are used to indicate substantive differences between this edition of the manual and the preceding edition. Change bars are vertical rules placed in the right margin of changed portions of text, figures, tables, examples, and so on. Change bars highlight new or revised information. For example: The message types specified in the REPORT clause are different in the COBOL85 environment and the Common Run-Time Environment (CRE).
PAGE 25
1 Introduction to NonStop S-Series Operations
When to Use This Section on page 1-1
Understanding the Operational Environment on page 1-2
What Are the Operator Tasks? on page 1-2
Monitoring the System and Performing Recovery Operations on page 1-2
Preparation and Recovery for Power Failures on page 1-3
Stopping and Powering Off the System on page 1-3
Powering On and Starting the System on page 1-3
Performing Preventive Maintenance on page 1-3
Operating Tape Drives on page 1-3
Responding to Spooler Problems o
PAGE 26
Understanding the Operational Environment

To understand the operational environment:
• If you are already familiar with other NonStop systems, see Appendix A, Operational Differences Between Systems Running D-Series and G-Series RVUs.
• For a brief introduction to the system organization and the location of system components in a NonStop S-series server, see Section 2, Determining Your System Configuration.
PAGE 27
• Section 13, Applications: Monitoring and Recovery
• Section 14, Printers and Terminals: Monitoring and Recovery

Recovery operations for a system console are not discussed in this guide. For recovery procedures for a system console and the applications installed on the system console, see the NonStop S-Series Hardware Installation and FastPath Guide.
PAGE 28
Determining the Cause of a Problem: A Systematic Approach

Continuous availability of your NonStop system is important to system users, and your problem-solving processes can help make such availability a reality. To determine the cause of a problem on your system, start by trying the easiest, least expensive possibilities. Move to more complex, expensive possibilities only if the easier solutions fail.
PAGE 29
A Problem-Solving Worksheet Introduction to NonStop S-Series Operations Table 1-1.
PAGE 30
Task 1: Get the Facts

The first step in solving any problem is to get the facts. Although it is tempting to speculate about causes, your time is better spent in first understanding the symptoms of the problem.

Task 1a: Determine the Facts About the Problem

To get a clear, complete description of problem symptoms, ask questions to determine the facts about the problem.
PAGE 31
Task 2: Find and Eliminate the Cause of the Problem

After you collect the facts, you are ready to begin considering the possible causes of a problem. Using these facts and relying on your knowledge and experience, begin to list possible causes of the problem.

Task 2a: Identify the Most Likely Cause

To evaluate the possible causes of any problem, you must compare each cause with the problem symptoms.
PAGE 32
Task 2b: Fix the Most Probable Cause of the Problem

For the example in the worksheet, the most likely cause of the hung terminal is a security problem. Ask yourself what would be the fastest, least expensive, safest, and surest way of verifying that this is the most probable cause of the problem. Once you have determined the most likely cause, try to fix it. Follow through and implement the appropriate solution.
PAGE 33
Task 4: Prevent Future Problems

Solving problems that occur with your system can be exciting because it is active and stimulating. Preventing problems is often less dramatic. But in the end, prevention is more productive than solving problems. The more work you do to prevent problems before they arise, the fewer problems will arise at potentially critical times.
PAGE 34
Introduction to NonStop S-Series Operations System Consoles System Consoles A system console is a personal computer approved by HP to run maintenance and diagnostic software for NonStop S-series servers. New system consoles are preconfigured with the required HP and third-party software. When upgrading to the latest RVU, software upgrades can be installed from the HP NonStop System Console Installer CD.
PAGE 35
a. In the Host name or IP address and port box, type the IP address, followed by a space and the port number. For example:
172.17.22.187 23
The port number is 23 for a TACL prompt and 301 for a Startup TACL prompt. In general, you should use port number 23 to perform operations tasks.
b. Click OK.
5. From the New Session Properties dialog box, click OK. A TACL window appears.
6. Log on to the TACL prompt.
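The host-and-port entry from step a can be sketched in a few lines of Python. This is only an illustration and not part of any HP tool: the function name `session_target` and the prompt labels `"tacl"` and `"startup-tacl"` are assumptions. Only the port numbers (23 and 301) and the space-separated format come from the guide.

```python
import ipaddress

# Port numbers as stated in the guide; the dictionary keys are hypothetical labels.
TACL_PORTS = {"tacl": 23, "startup-tacl": 301}


def session_target(ip: str, prompt: str = "tacl") -> str:
    """Return the 'address port' string typed into the host/port box."""
    ipaddress.ip_address(ip)  # raises ValueError for a malformed address
    return f"{ip} {TACL_PORTS[prompt]}"


print(session_target("172.17.22.187"))
```

Defaulting to the TACL prompt mirrors the guide's advice to use port 23 for general operations tasks.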
PAGE 36
• TSM Event Viewer

Troubleshooting OSM and TSM Sessions

A message is usually sent if you lose a TSM session. If you experience problems or errors during a TSM session, use the following procedures to help determine the problem.

Verifying the State of the TSM Connection

To use the TSM package to monitor system components, you must have a TSM session that connects the system console to a NonStop S-series server.
PAGE 37
Introduction to NonStop S-Series Operations • Guided Procedures Perform these steps: 1. Launch the TSM Low-Level Link. (You can cancel the logon dialog and proceed from a blank Management window.) 2. In the toolbar, click Status Log. 3. The Windows Event Viewer appears. • • From the Summary menu of the TSM Service Application, select Status Log. Select Start>Programs>Administrative Tools>Event Viewer.
PAGE 39
2 Determining Your System Configuration
When to Use This Section on page 2-1
System Organization on page 2-1
Terms Used to Describe System Hardware Components on page 2-2
Identifying System Enclosures in a NonStop S-Series Server on page 2-3
Locating System Components in an Enclosure on page 2-4
Recording Your System Configuration on page 2-15
Maintaining Hard-Copy Forms on page 2-15
Using OSM or TSM to Inventory Your System on page 2-24
Using SCF to Determine Your System Configuration on page 2-26
Displayi
PAGE 40
Terms Used to Describe System Hardware Components

The terms used to describe system-hardware components vary. These terms include:
• Device
• Resource
• Customer-replaceable unit (CRU)
• Field-replaceable unit (FRU)

Device

A device can be a physical device or a logical device.
PAGE 41
Identifying System Enclosures in a NonStop S-Series Server

There are three types of system enclosures:
• Processor enclosures contain processors and other system components.
• I/O enclosures are similar to processor enclosures but do not contain processors.
• IOAM enclosures contain I/O adapter modules.

For the specific types of system enclosures and the locations of system components, see Table 2-1.

Table 2-1.
PAGE 42
Locating System Components in an Enclosure

System components within an enclosure are identified by their physical location. To identify the location of a system component within an enclosure, you need to know:

Group number
The group number identifies the enclosure in which a system component is located. The group number of an enclosure is indicated by the group ID label on the enclosure.
PAGE 43
Figure 2-1. Identification Numbers and Labels [figure: group, module, and slot labels; example location Group 02, Module 01, Slot 03]
PAGE 44
Determining Your System Configuration Locating System Components in an Enclosure Table 2-2 lists slot numbers for each system component in a processor enclosure. Table 2-2.
PAGE 45
Figure 2-2 shows a diagram of a NonStop S7000 processor enclosure.

Figure 2-2. NonStop S7000 Processor Enclosure Organization [figure: appearance side (door open) and service side, labeled with group, module, and slot numbers]
PAGE 46
Locating System Components in an Enclosure Determining Your System Configuration Figure 2-3 shows a diagram of a NonStop Sxx000 or S7x00 processor enclosure. Some PMF CRUs look slightly different from those shown in the figure. Figure 2-3.
PAGE 47
Determining Your System Configuration Locating System Components in an Enclosure Table 2-3 lists the slot numbers for each system component in an I/O enclosure. Table 2-3.
PAGE 48
Locating System Components in an Enclosure Determining Your System Configuration Figure 2-4 shows an example of a NonStop S-series I/O enclosure. IOMF 2 CRUs look slightly different from the IOMF CRUs shown installed in slots 50 and 55 in the figure. Also, if IOMF 2 CRUs are present, power supplies are installed at the bottom of the enclosure in slots 31 and 32, below the fans. See Figure 2-3 on page 2-8. Figure 2-4.
PAGE 49
Locating System Components in an Enclosure Determining Your System Configuration Figure 2-5.
PAGE 50
Identifying the Location of a Processor

This table identifies the physical location of each processor:

Processor   Group Number   Module Number   Slot Number
0           01             01              50
1           01             01              55
2           02             01              50
3           02             01              55
4           03             01              50
5           03             01              55
6           04             01              50
7           04             01              55
8           05             01              50
9           05             01              55
10          06             01              50
11          06             01              55
12          07             01              50
13          07             01              55
14          08             01              50
15          08             01              55
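The pattern in the processor location table is regular enough to compute. The sketch below is illustrative only: the names `Location`, `processor_location`, and `location_label` are hypothetical, and the rule (two processors per group, module 01, slot 50 for even-numbered and slot 55 for odd-numbered processors) is inferred from the table rather than stated as a formula in the guide.

```python
from typing import NamedTuple


class Location(NamedTuple):
    group: int
    module: int
    slot: int


def processor_location(processor: int) -> Location:
    """Map a processor number (0-15) to its group, module, and slot."""
    if not 0 <= processor <= 15:
        raise ValueError("processor number must be in the range 0 through 15")
    slot = 50 if processor % 2 == 0 else 55  # even -> slot 50, odd -> slot 55
    return Location(group=processor // 2 + 1, module=1, slot=slot)


def location_label(loc: Location) -> str:
    """Format a location in the style used on the enclosure labels."""
    return f"Group {loc.group:02d}, Module {loc.module:02d}, Slot {loc.slot:02d}"


print(location_label(processor_location(9)))
```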
PAGE 51
Locating the Power-On Push Button

Figure 2-6 illustrates where to find the power-on push button on some models of a PMF CRU.

Figure 2-6. Locating the Power-On Push Button on a PMF CRU [figure: processor enclosure, service side, showing the amber service LED, green power-on LED, and power-on push button on the PMF CRUs for the even-numbered and odd-numbered processors (slots 50 and 55), and the group ID label on the cable support]
PAGE 52
Locating System Components in an Enclosure Determining Your System Configuration Locating the Group ID Switches Group identification for a system enclosure is set with two group ID switches, located on the inside of the enclosure, on the appearance side near the fans. See Figure 2-7. Both group ID switches in an enclosure must display the same value. The service processors (SPs) read the switches when the enclosure is powered on and monitor them for changes. Figure 2-7.
PAGE 53
Determining Your System Configuration Recording Your System Configuration Recording Your System Configuration As a system operator, you need to understand how your system is configured so you can confirm that the hardware and system software are operating normally. If problems do occur, knowing your configuration allows you to pinpoint problems more easily. If your system configuration is corrupted, documentation about your configuration is essential for recovery.
PAGE 54
Determining Your System Configuration Maintaining Hard-Copy Forms Table 2-4.
PAGE 55
Sample Forms for Recording Your System Configuration

Examples of some of the forms available for recording your system configuration are listed next. You are authorized by HP to reproduce these forms only for use in documenting a NonStop S-series system:
• Figure 2-8 on page 2-18 is a blank form for documenting a PMF CRU configuration.
• Figure 2-9 on page 2-19 is a blank form for documenting a PMF 2 CRU configuration.
PAGE 56
Figure 2-8. PMF CRU Configuration Form [blank form; shaded areas indicate nonconfigurable components. Fields: Date, System Name, Group, Module, Slot, Product Number, SCF Name, SCSI Cable, IP Address, Adapter Name, SAC Name, SAC Access List, PIF Name, LIF Name]
PAGE 57
Maintaining Hard-Copy Forms Determining Your System Configuration Figure 2-9.
PAGE 58
Maintaining Hard-Copy Forms Determining Your System Configuration Figure 2-10.
PAGE 59
Maintaining Hard-Copy Forms Determining Your System Configuration Figure 2-11.
PAGE 60
Maintaining Hard-Copy Forms Determining Your System Configuration Figure 2-12.
PAGE 61
Maintaining Hard-Copy Forms Determining Your System Configuration Figure 2-13.
PAGE 62
Determining Your System Configuration Using OSM or TSM to Inventory Your System Using OSM or TSM to Inventory Your System Both the OSM Service Connection and the TSM Service Application provide you with hierarchical, physical (graphical representation), and inventory views of your system and cluster resources.
PAGE 63
Using OSM or TSM to Inventory Your System Determining Your System Configuration Table 2-5. Naming Conventions for SCF Objects Object Convention Example Description WANBoot processes $ZWB $ZWBA9 WANBoot process associated with TCP/IP SWAN concentrators S S19 Twentieth SWAN concentrator SS7 Telco $C $C Telco process associated with SS7 protocol where is a 2-digit number that identifies the enclosure.
PAGE 64
is the slot number and port number mapped as:

Slot Number   Port Number   Mapped Value      Slot Number   Port Number   Mapped Value
51            0             0                 53            0             8
51            1             1                 53            1             9
51            2             2                 53            2             A
51            3             3                 53            3             B
52            0             4                 54            0             C
52            1             5                 54            1             D
52            2             6                 54            2             E
52            3             7                 54            3             F

is a 2-digit number in the range 00 through 99.
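The slot-and-port mapping above follows a simple rule: slots 51 through 54 each contribute four consecutive hexadecimal digits, one per port. A minimal Python sketch, assuming that reading of the table (the function name `slot_port_code` is hypothetical and does not come from SCF):

```python
def slot_port_code(slot: int, port: int) -> str:
    """Return the single hexadecimal character for a slot/port pair."""
    if slot not in range(51, 55):
        raise ValueError("slot must be 51, 52, 53, or 54")
    if port not in range(4):
        raise ValueError("port must be 0, 1, 2, or 3")
    # Four ports per slot, numbered consecutively starting at slot 51.
    return format((slot - 51) * 4 + port, "X")


print(slot_port_code(53, 2))
```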
PAGE 65
SCF Configuration Files

Your system is delivered with a standard set of configuration files:
• The $SYSTEM.SYSnn.CONFBASE file contains the minimal configuration required to load the system.
• The $SYSTEM.ZSYSCONF.CONFIG file contains a standard system configuration created by HP.
PAGE 66
if you type this command and then enter the INFO command without specifying an object, SCF displays only the information for the workstation called $L1.#TERM1:
> SCF ASSUME WS $L1.
PAGE 67
Using SCF to Determine Your System Configuration Determining Your System Configuration Figure 2-14.
PAGE 68
The columns in Figure 2-14 mean:

LDev      The logical device number
Name      The logical device name
PPID      The primary processor number and process identification number (PIN) of the specified device
BPID      The backup processor number and PIN of the specified device
Type      The device type and subtype
RSize     The record size the device is configured for
Pri       The priority level of the I/O process
Program   The fully qualifie
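Because a blank BPID field and a BPID of 0,0 mean different things, any script that reads LISTDEV output should keep the two cases apart. The sketch below is an assumption-laden illustration: it supposes that PPID and BPID values are written as `processor,PIN` (as the 0,0 value suggests), and the function name `parse_cpu_pin` is hypothetical.

```python
from typing import Optional, Tuple


def parse_cpu_pin(field: str) -> Optional[Tuple[int, int]]:
    """Parse a PPID or BPID field into (processor, PIN).

    A blank field returns None, which is deliberately distinct from (0, 0).
    """
    text = field.strip()
    if not text:
        return None
    cpu, pin = text.split(",")
    return int(cpu), int(pin)


print(parse_cpu_pin("0,0"))
print(parse_cpu_pin("   "))
```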
PAGE 69
Determining Your System Configuration Using SCF to Determine Your System Configuration To display information about a particular device: > SCF LISTDEV TYPE n where n is a number for the device type. For example, if n is 3, the device type is disks and tapes. For the \MS9 system, entering LISTDEV TYPE 3 would display information for $DATA10, $DATA04, $DATA02, and $DATA01.
PAGE 70
Determining Your System Configuration Using SCF to Determine Your System Configuration Kernel Subsystem Before using commands listed in Table 2-8, type this command to make the Kernel subsystem the default object: > SCF ASSUME PROCESS $ZZKRN Generic processes are part of the SCF Kernel subsystem. Generic processes can be created by the operating system or by a user.
PAGE 71
Determining Your System Configuration Using SCF to Determine Your System Configuration When displaying configuration files for disk and tape devices in the storage subsystem, you can use the OBEYFORM option with the INFO command to display currently defined attribute values in the format that you would use to set up a configuration file.
PAGE 72
Determining Your System Configuration Using SCF to Determine Your System Configuration configuration file. Each attribute appears as a syntactically correct system configuration command. For example: ADD ADAPTER $ZZLAN.
PAGE 73
Additional Subsystems Controlled by SCF

Table 2-12 lists the names associated with additional subsystems that can be controlled by SCF, along with their device types. You can use SCF commands to display the current attribute values for these objects. Some SCF commands are available only to some subsystems. The objects that each command affects and the attributes of those objects are subsystem specific.
PAGE 74
Determining Your System Configuration Using SCF to Determine Your System Configuration Table 2-12.
PAGE 75
Displaying Configuration Information—Examples

These examples show SCF commands that display subsystem configuration information, along with the information that is returned. These commands are not preceded by an ASSUME command.

To display all the processes running in the Kernel subsystem:
-> INFO PROC $ZZKRN.#*

The system displays a listing similar to:
-> INFO PROC $ZZKRN.#*
NONSTOP KERNEL - Info PROCESS \COMM.
PAGE 76
To display detailed information about the ATM subsystem manager:
-> INFO PROCESS $ZZKRN.#ZZATM, DETAIL

The system displays a listing similar to:
-> INFO PROC #ZZATM, DETAIL
NONSTOP KERNEL - Detailed Info PROCESS \COCO.$ZZKRN.#ZZATM
*AutoRestart...............10
*BackupCPU.................1
*CPU.......................Not Specified
*DefaultVolume.............$SYSTEM.SYSTEM
*ExtSwap...................Not Specified
*Highpin...
PAGE 77
To display detailed information about an ATM 3 ServerNet adapter:
-> INFO ADAPTER $ZZATM.$adapter-name, DETAIL
where $adapter-name is the logical process name for the adapter.

The system displays a listing similar to this example for the adapter $AM1:
-> info adapter $zzatm.$am1, detail
ATM Detailed Info ADAPTER \TAHITI.$AM1
LOCATION (grp,mod,slot).. 11 ,1 ,53
ACCESSLIST............... 0, 1, 2, 3
AMP Filename (in use)....
PAGE 78
To display a list of all SAC names with their associated owners and access lists:
-> info sac $zzlan.*

The system displays a listing similar to:
-> INFO SAC $ZZLAN.*
SLSA Info SAC
Name
$ZZLAN.E0353.0
$ZZLAN.E0353.1
$ZZLAN.E0354.0
$ZZLAN.E0354.1
$ZZLAN.E0553.0
$ZZLAN.E0553.1
$ZZLAN.E0554.0
$ZZLAN.E0554.1
$ZZLAN.FE1154.0
$ZZLAN.MIOE0.0
$ZZLAN.MIOE1.
PAGE 79
To display configuration attribute values for all the WAN subsystem configuration managers, TCP/IP processes, and WANBoot processes:
-> INFO PROCESS $ZZWAN.*

The system displays a listing similar to:
-> INFO PROCESS $ZZWAN.*
WAN MANAGER Detailed Info Process \COMM.$ZZWAN.#5
RecSize........... 0
*Type............. (50,00)
Preferred Cpu..... 5
Alternate Cpu..... 65535
*IOPOBJECT........ \COMM.$SYSTEM.SYS00.
PAGE 80
To display detailed information about an Expand line-handler process:
-> INFO LINE $line-name, DETAIL
where $line-name is the logical line-handler process name.

The system displays a listing similar to this example for the line $ATMBAT:
-> info line $atmbat, detail
EXPAND Detailed Info LINE $ATMBAT (LDEV 219)
L2Protocol Net^Atm TimeFactor... 570K *SpeedK.. NOT_SET
Framesize....... 132 -Rsize........... 3 -Speed........
PAGE 81
3 Overview of Monitoring and Recovery
When to Use This Section on page 3-1
Importance of Monitoring on page 3-2
Monitoring Tasks on page 3-2
Working With a Daily Checklist on page 3-2
Tools for Checking the Status of System Hardware on page 3-3
Additional Monitoring Tasks on page 3-7
Monitoring and Resolving Problems—An Approach on page 3-8
Using OSM or TSM to Monitor the System on page 3-8
Using the OSM Service Connection or TSM Service Application on page 3-8
Checking for Problems and Alarms on page 3-10
PAGE 82
Overview of Monitoring and Recovery Importance of Monitoring Importance of Monitoring You must monitor a system to ensure that it is operating properly and to recognize when corrective action is required.
PAGE 83
An example of a checklist you might use to standardize your routine daily monitoring tasks is:

Task                        Operator’s name   Date & time   Notes and questions
Check phone messages
Check faxes
Check e-mail
Check shift log
Check EMS event messages
Check status of terminals
Check comm.
PAGE 84
Tools for Checking the Status of System Hardware Overview of Monitoring and Recovery Table 3-1. Monitoring System Components (page 1 of 3) Monitored Using These Tools Resource Adapters for communications subsystems: ATM3SA CCSA OSM Service Connection or TSM Service Application SCF interface to various subsystems E4SA FESA See...
PAGE 85
Tools for Checking the Status of System Hardware Overview of Monitoring and Recovery Table 3-1. Monitoring System Components (page 2 of 3) Monitored Using These Tools Resource Disk drives, external, attached to ServerNet/DA or FCSA OSM Service Connection or TSM Service Application SCF interface to the storage subsystem DSAP See...
PAGE 86
Tools for Checking the Status of System Hardware Overview of Monitoring and Recovery Table 3-1. Monitoring System Components (page 3 of 3) Monitored Using These Tools Resource See...
PAGE 87
Additional Monitoring Tasks Overview of Monitoring and Recovery Additional Monitoring Tasks Table 3-2 provides an example of additional areas you should monitor daily. Table 3-2.
PAGE 88
Overview of Monitoring and Recovery Monitoring and Resolving Problems—An Approach Monitoring and Resolving Problems—An Approach A useful approach to identifying and resolving problems in your system is to first use OSM or TSM to locate the focal point of a hardware problem and then use SCF to gather all the related data from the subsystems that control or act on the hardware.
PAGE 89
Using the OSM Service Connection or TSM Service Application Overview of Monitoring and Recovery Overview Pane Displays a high-level view of system objects, such as internal fabrics, groups, and external devices (external disks and tapes), and of ServerNet Cluster objects, such as external fabrics, local nodes, and remote nodes.
PAGE 90
• The Alarms tab lists the alarms for the resource selected in the tree pane.

Figure 3-2. Attributes Tab of Management Window

Checking for Problems and Alarms

For most system components, you can use the OSM Service Connection or the TSM Service Application to quickly identify problems.
PAGE 91
Overview of Monitoring and Recovery Checking for Problems and Alarms In the details pane, if the Service State value is Service Required and shows a red triangle, the resource is not functioning. If the Service State value is Attention Required and shows a yellow triangle, the resource is not functioning normally. Table 3-3.
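As a rough illustration of the triage rule described above, this Python sketch maps a Service State value to the condition it signals. The function name and the returned labels are hypothetical; only the state names (Service Required, Attention Required, and the OK value mentioned elsewhere in this guide) come from the text. It is intended for a script that processes values copied out of OSM or TSM, not as part of either product.

```python
# Hypothetical helper: the state names come from this guide, but the
# function name and the returned labels are illustrative assumptions.
def triage_service_state(service_state: str) -> str:
    """Translate a Service State value into the condition it signals."""
    state = service_state.strip().lower()
    if state == "service required":      # shown with a red triangle
        return "not functioning"
    if state == "attention required":    # shown with a yellow triangle
        return "not functioning normally"
    if state == "ok":
        return "normal"
    return "unknown"                      # any value this sketch does not cover


for value in ("Service Required", "Attention Required", "OK"):
    print(value, "->", triage_service_state(value))
```

Returning "unknown" for unrecognized values keeps the sketch from guessing at states that are not described in this section.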
PAGE 92
Overview of Monitoring and Recovery Recovery Operations for Problems Detected by OSM or TSM
Monitoring Alarms
1. Log on to the OSM Service Connection or the TSM Service Application.
2. From the tree pane, locate and select the resource.
3. From the details pane, click the Alarms tab.
4. Double-click a specific alarm to display the Alarm Detail dialog box.
To get a summary of all outstanding alarms on the system:
1. In the OSM Management window, select Summary>Alarm.
PAGE 93
Using SCF to Monitor the System Overview of Monitoring and Recovery components called customer-replaceable units (CRUs). For more information, contact your service provider. Using SCF to Monitor the System Use the Subsystem Control Facility (SCF) to display information and current status for all the devices on your system known to SCF. Some SCF commands are available only to some subsystems. The objects that each command affects and the attributes of those objects are subsystem specific.
PAGE 94
Determining Device States Overview of Monitoring and Recovery Some other examples of the SCF STATUS command are:
-> STATUS LINE $LAM3
-> STATUS WS $LAM3.#WS1
-> STATUS WS $LAM3.*
-> STATUS WINDOW $LAM3.#WS1.*
-> STATUS WINDOW $LAM3.*, SEL STOPPED
The general format of the STATUS display follows. However, the format varies depending on the subsystem.
PAGE 95
Determining Device States Overview of Monitoring and Recovery SCF Object States Table 3-4 lists and explains the possible object states that the SCF STATUS command can report. Table 3-4. SCF Object States (page 1 of 2) State Substate Explanation ABORTING The object is being aborted. The object is responding to an ABORT command or some type of malfunction. In this state, no new links are allowed, and drastic measures might be underway to reach the STOPPED state. This state is irrevocable.
PAGE 96
Determining Device States Overview of Monitoring and Recovery Table 3-4. SCF Object States (page 2 of 2) State Substate Explanation STOPPING The object is in transition to the STOPPED state. No new links are allowed to or from the object. Existing links are in the process of being deleted. SUSPENDED The flow of information to and from the object is restricted. (It is typically prevented.
PAGE 97
Overview of Monitoring and Recovery Monitoring and Recovery—Example Monitoring and Recovery—Example This subsection describes a hypothetical situation in which you use the system tools available—event log, TSM applications, and SCF—to identify, analyze, and solve a hardware problem. Note. The cabling and topology diagrams in this subsection identify all ServerNet expansion boards as SEBs. Your system might instead have modular ServerNet expansion boards (MSEBs) in the slots designated for SEBs.
PAGE 98
A Problem Occurs Overview of Monitoring and Recovery Figure 3-3.
PAGE 99
A Problem Occurs Overview of Monitoring and Recovery Figure 3-4.
PAGE 100
Overview of Monitoring and Recovery Using OSM or TSM to Locate the Problem
• Four “port error” messages from group 03: three from the PMF CRU in slot 55, and one from the SEB in slot 52
• Two “domain deletion” error messages from processor group 01, both from the SEB in slot 52
• A “Path change on device $ZZLAN.E3153.0.
PAGE 101
Overview of Monitoring and Recovery Using SCF to Locate the Problem You decide to do some more research, this time using SCF.
PAGE 102
Using SCF to Locate the Problem Overview of Monitoring and Recovery The system displays: NONSTOP KERNEL X-FABRIC TO 0 1 FROM 00 UP UP 01 UP UP 02 UP UP 03 UP UP 04 UP UP 05 UP UP 06 UP UP 07 UP UP 08 <- DOWN 09 <- DOWN 10 <- DOWN 11 <- DOWN 12 <- DOWN 13 <- DOWN 14 <- DOWN 15 <- DOWN Y-FABRIC TO FROM 00 01 02 03 04 05 06 07 08 <09 <10 <11 <12 <13 <14 <15 <- 0 1 UP UP UP UP UP UP UP UP DN DN DN DN UP UP UP UP DOWN DOWN DOWN DOWN DOWN DOWN DOWN DOWN Status SERVERNET 2 3 4 5 6 7 8 9 UP UP UP UP U
PAGE 103
Overview of Monitoring and Recovery Using SCF to Locate the Problem processors in the system, you conclude that all the PMF CRUs are functioning normally. Next, to help eliminate more possibilities, check the physical cable between port 2 of the SEB in 01.5.52 and port 2 of the SEB in 03.1.52. Replace the cable and ensure that the new cable is attached securely. When you again issue the SCF STATUS SERVERNET command, the result is the same. Therefore, the cable is not causing the problem.
PAGE 104
Overview of Monitoring and Recovery Using SCF to Locate the Problem This partial listing shows some of the disk configuration information: STORAGE - Configuration Information Magnetic DISK \GATE8.$D3101 Common Disk Configuration Information: Primary Path Information: Adapter............................... $ZZSTO.#IOMF.GRP-31.MOD-1.SLOT-50 Disk Device ID........................ 4 Location (Group,Module,Slot).......... (31,1,1) SAC Name..........................$ZZSTO.#IOMF.SAC-2.GRP-31.MOD-1.
PAGE 105
Using SCF to Locate the Problem Overview of Monitoring and Recovery The system displays:
STORAGE - Status DISK \GATE8.$D3101-*
LDev  Path           Status    State    Substate  Primary PID  Backup PID
96    PRIMARY        ACTIVE    STARTED            4,263        5,263
96    BACKUP         INACTIVE  STARTED            4,263        5,263
96    MIRROR         INACTIVE  STARTED            4,263        5,263
96    MIRROR-BACKUP  ACTIVE    STARTED            4,263        5,263
The primary path is currently active.
PAGE 106
Calling the Service Provider Overview of Monitoring and Recovery Figure 3-5 shows the affected communications path. Figure 3-5.
PAGE 107
Overview of Monitoring and Recovery Automating Routine System Monitoring Automating Routine System Monitoring You can automate many of the monitoring procedures. Automation saves you time and helps you to perform many routine tasks more efficiently. Your operations environment might be using TACL macros, TACL routines, or command files to perform routine system monitoring and other tasks.
PAGE 108
Automating Routine System Monitoring Overview of Monitoring and Recovery Example 3-2. System Monitoring Output File (page 1 of 3) COMMENT THIS IS THE FILE SYSCHK COMMENT THIS CHECKS ALL DISKS: SCF STATUS DISK $* STORAGE - Status DISK \SHARK.$DATA12 LDev Primary Backup Mirror 52 *STARTED STARTED *STARTED STORAGE - Status DISK \SHARK.$DATA01 LDev Primary Backup Mirror 63 *STARTED STARTED *STARTED STORAGE - Status DISK \SHARK.
PAGE 109
Automating Routine System Monitoring Overview of Monitoring and Recovery Example 3-2. System Monitoring Output File (page 2 of 3)
COMMENT THIS CHECKS ALL SACS:
SCF STATUS SAC $*
SLSA Status SAC
Name              Owner  State
$ZZLAN.E4SA1.0    1      STARTED
$ZZLAN.E4SA1.1    0      STARTED
$ZZLAN.E4SA1.2    0      STARTED
$ZZLAN.E4SA1.3    1      STARTED
COMMENT THIS CHECKS ALL ADAPTERS
SCF STATUS ADAPTER $*
SLSA Status ADAPTER
Name $ZZLAN.MIOE0 $ZZLAN.E4SA0 $ZZLAN.MIOE1 $ZZLAN.
PAGE 110
Automating Routine System Monitoring Overview of Monitoring and Recovery Example 3-2.
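An output file like Example 3-2 can be long, so a small script can help pick out the lines that need attention. The Python sketch below is one possible approach, assuming the output file has been copied to a workstation as plain text with one status row per line. The function name, the sample text, and the choice of states to flag are assumptions for illustration; the STOPPED entry in the sample is invented to show a match.

```python
import re

# States other than STARTED that appear in this guide's examples and in
# Table 3-4. Assumption: any of these tokens on a line deserves a look.
ATTENTION_STATES = {"STOPPED", "STOPPING", "ABORTING", "SUSPENDED"}


def find_attention_lines(report_text):
    """Return (command, line) pairs for lines that mention a flagged state."""
    findings = []
    current_command = None
    for raw in report_text.splitlines():
        line = raw.strip()
        # Remember the most recent "SCF <command> ..." line for context.
        if re.match(r"SCF\s+[A-Za-z]+\b", line):
            current_command = line
            continue
        if line.upper().startswith("COMMENT"):
            continue
        # Drop any leading asterisk before comparing state tokens.
        tokens = {tok.lstrip("*") for tok in line.split()}
        if tokens & ATTENTION_STATES:
            findings.append((current_command, line))
    return findings


# Hypothetical excerpt modeled on Example 3-2; the STOPPED value is invented.
SAMPLE = r"""COMMENT THIS CHECKS ALL DISKS:
SCF STATUS DISK $*
STORAGE - Status DISK \SHARK.$DATA12
LDev Primary Backup Mirror
52 *STARTED STARTED *STOPPED
COMMENT THIS CHECKS ALL SACS:
SCF STATUS SAC $*
$ZZLAN.E4SA1.0 1 STARTED
"""

for command, line in find_attention_lines(SAMPLE):
    print(command, "|", line)
```

Keeping the originating command alongside each flagged line makes it easier to decide which subsystem to investigate next with SCF.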
PAGE 111
Using the Status LEDs to Monitor the System Overview of Monitoring and Recovery Using the Status LEDs to Monitor the System Status LEDs on the various enclosures and system components light during certain operations, such as when the system performs a series of power-on self-tests (POSTs) when a server is first powered on. Table 3-5 lists some of the status light-emitting diodes (LEDs) and their functions. Table 3-5.
PAGE 112
Using the Status LEDs to Monitor the System Overview of Monitoring and Recovery Table 3-5. Status LEDs and Their Functions (page 2 of 2) Location LED Name Color Function Gigabit Ethernet 4-port ServerNet adapter (G4SA) Power-on Green Lights when the adapter is receiving power. Service Amber Lights to indicate internal failure or service action required. Power-on Green Lights when the adapter is receiving power. Service Amber Lights to indicate internal failure or service action required.
PAGE 113
Related Reading Overview of Monitoring and Recovery Related Reading For more information about monitoring, see the documentation listed in Table 3-6. Table 3-6. Related Reading for Monitoring Task Tool For information, see...
PAGE 115
4 Monitoring EMS Event Messages When to Use This Section on page 4-1 What Is the Event Management Service (EMS)? on page 4-1 Tools for Monitoring EMS Event Messages on page 4-1 EMSDIST on page 4-1 OSM Event Viewer on page 4-2 TSM Event Viewer on page 4-2 ViewPoint on page 4-2 Related Reading on page 4-3 When to Use This Section Use this section for a brief description of the Event Management Service (EMS) and the tools used to monitor EMS event messages.
PAGE 116
Monitoring EMS Event Messages OSM Event Viewer OSM Event Viewer The OSM Event Viewer is a browser-based event viewer, more like Web ViewPoint than the TSM Event Viewer. The OSM Event Viewer allows you to retrieve and view events from any EMS-formatted log file ($0, $ZLOG, or an alternate collector) for rapid assessment of operating system problems. To access the OSM Event Viewer, refer to Launching OSM and TSM Applications on page 1-11.
PAGE 117
Related Reading Monitoring EMS Event Messages Related Reading For more information about monitoring EMS event messages, see the documentation in Table 4-1. Table 4-1. Related Reading for Monitoring EMS Event Messages Task Tool For information, see...
PAGE 119
5 Processes: Monitoring and Recovery When to Use This Section on page 5-1 Types of Processes on page 5-1 System Processes on page 5-1 I/O Processes (IOPs) on page 5-2 Generic Processes on page 5-2 Monitoring Processes on page 5-2 Monitoring System Processes on page 5-3 Monitoring IOPs on page 5-3 Monitoring Generic Processes on page 5-4 Recovery Operations for Processes on page 5-6 Related Reading on page 5-6 When to Use This Section This section provides basic information about the different types of proc
PAGE 120
Processes: Monitoring and Recovery I/O Processes (IOPs) I/O Processes (IOPs) An I/O process (IOP) is a system process that manages communications between a processor and I/O devices. IOPs are often configured as fault-tolerant process pairs, and they typically control one or more I/O devices or communications lines. Each IOP is configured in a maximum of two processors, typically a primary processor and a backup processor.
PAGE 121
Processes: Monitoring and Recovery Monitoring System Processes Monitoring System Processes Check that the system processes are up and running in the processors as you intended. At a TACL prompt: > STATUS * This example shows output produced by the TACL STATUS * command: $SYSTEM STARTUP 2> status * Process Pri PFR %WT Userid Program file 0,0 201 P R 000 255,255 $SYSTEM.SYS00.OSIMAGE 0,1 210 P 040 255,255 $SYSTEM.SYS00.OSIMAGE 0,2 210 P 051 255,255 $SYSTEM.SYS00.OSIMAGE 0,4 211 P 017 255,255 $SYSTEM.SYS00.
PAGE 122
Monitoring Generic Processes Processes: Monitoring and Recovery Monitoring Generic Processes Because generic processes are configured using the SCF interface to the Kernel subsystem, you specify the $ZZKRN Kernel subsystem manager process when monitoring a generic process.
PAGE 123
Monitoring Generic Processes Processes: Monitoring and Recovery This example shows the output produced by this command: 1-> STATUS PROCESS $ZZKRN.#* NONSTOP KERNEL - Status PROCESS \BACH.$ZZKRN Symbolic Name CEV-SERVER-MANAGER-P0 CEV-SERVER-MANAGER-P1 CLCI-TACL FOX MSGMON MSGMON MSGMON MSGMON MSGMON MSGMON MSGMON MSGMON MSGMON MSGMON MSGMON MSGMON MSGMON MSGMON MSGMON MSGMON OSM-APPSRVR OSM-CIMOM OSM-CONFLH-RD OSM-OEV QIOMON-0 QIOMON-1 QIOMON-10 QIOMON-11 QIOMON-12 . . .
PAGE 124
Processes: Monitoring and Recovery Recovery Operations for Processes The asterisks (*) indicate files that do not appear if only OSM (and not TSM) is installed. OSM renames some TSM-related files for use by both applications. For example, $TSMM0 and $TSMM1 become $OSMM0 and $OSMM1 after OSM is installed. You can still run TSM even though $TSMM0 and $TSMM1 no longer appear by those names. $ZLOG is another file that is used by both OSM and TSM. (The symbolic name no longer contains TSM.
PAGE 125
6 Communications Subsystems: Monitoring and Recovery When to Use This Section on page 6-1 Communications Subsystems on page 6-1 Local Area Networks (LANs) and Wide Area Networks (WANs) on page 6-2 Monitoring Communications Subsystems and Their Objects on page 6-3 Monitoring the SLSA Subsystem on page 6-4 Monitoring the WAN Subsystem on page 6-6 Monitoring the NonStop TCP/IP Subsystem on page 6-9 Monitoring Other Communications Subsystems on page 6-10 Monitoring Line-Handler Process Status on page 6-12 Traci
PAGE 126
Communications Subsystems: Monitoring and Recovery Local Area Networks (LANs) and Wide Area Networks (WANs) networks (LANs) or wide area networks (WANs), respectively. Similarly, multiple higher-level components can use the services of a single lower-level component. Local Area Networks (LANs) and Wide Area Networks (WANs) Two important communications interfaces for LANs and WANs on NonStop S-series servers are the SLSA subsystem and the WAN subsystem.
PAGE 127
Communications Subsystems: Monitoring and Recovery Monitoring Communications Subsystems and Their Objects Processes, user applications, and subsystems that use the SLSA subsystem and related LAN providers to connect to an Ethernet, a Fast Ethernet, a Gigabit Ethernet, a Gigabit 4-port Ethernet or a token-ring LAN attached to a NonStop S-series server are called LAN clients.
PAGE 128
Communications Subsystems: Monitoring and Recovery Monitoring the SLSA Subsystem Monitoring and recovery techniques for devices and processes related to communications subsystems are discussed in detail in the manuals for each subsystem. For more information, refer to Related Reading on page 6-15. This guide provides some basic commands you can use to identify and resolve common problems.
PAGE 129
Communications Subsystems: Monitoring and Recovery Monitoring the SLSA Subsystem 2. The SAC object corresponds directly to the hardware on an adapter. A SAC is a component of an adapter and can support one or more PIFs. To monitor the status of a SAC: > SCF STATUS SAC sac-name A listing similar to this example is sent to your home terminal: 1->STATUS SAC $ZZLAN.E0353.0 SLSA Status SAC Name $ZZLAN.E0353.0 Owner 1 State STARTED This example shows a listing of the status of all SACs on $ZZLAN.
PAGE 130
Communications Subsystems: Monitoring and Recovery Monitoring the WAN Subsystem 4. The LIF provides an interface to the PIF. The LIF object corresponds to logical processes that handle data transferred between the LAN and a system using the ServerNet architecture. To monitor the status of a LIF: > SCF STATUS LIF lif-name A listing similar to this example is sent to your home terminal: ->STATUS LIF $ZZLAN.L038 SLSA Status LIF Name $ZZLAN.
PAGE 131
Communications Subsystems: Monitoring and Recovery Monitoring the WAN Subsystem To display the status for all SWAN concentrators configured for your system: > SCF STATUS ADAPTER $ZZWAN.* The system displays a listing similar to: 1-> STATUS ADAPTER $ZZWAN.* WAN Manager STATUS ADAPTER for ADAPTER State........... STARTED \COMM.$ZZWAN.#SWAN1 Number of clips. 3 Clip 1 status : CONFIGURED Clip 2 status : CONFIGURED Clip 3 status : CONFIGURED WAN Manager STATUS ADAPTER for ADAPTER State...........
PAGE 132
Communications Subsystems: Monitoring and Recovery Monitoring the WAN Subsystem The system displays a listing similar to: -> STATUS PROCESS $ZZWAN.* WAN Manager STATUS PROCESS for PROCESS State :......... STARTED LDEV Number..... 66 PPIN............ 5 ,264 WAN Manager STATUS PROCESS for PROCESS State :......... STARTED LDEV Number..... 67 PPIN............ 4 \COMM.$ZZWAN.#ZTF00 \COMM.$ZZWAN.#SWB1 BPIN............ 5 ,302 \COMM.$ZZWAN.#ZTF01 ,340 WAN Manager STATUS PROCESS for PROCESS State :........
PAGE 133
Communications Subsystems: Monitoring and Recovery Monitoring the NonStop TCP/IP Subsystem The system displays a listing similar to:
-> status server $zzwan.#s01.1
WAN Manager STATUS SERVER for CLIP \COWBOY.$ZZWAN.#S01.1
STATE :..........STARTED
PATH A...........: CONFIGURED
PATH B...........: CONFIGURED
NUMBER of lines. 2
Line...............0
Line...............
PAGE 134
Communications Subsystems: Monitoring and Recovery Monitoring Other Communications Subsystems The system displays a listing similar to:
1-> Status Route $ZTC0.*
TCPIP Status ROUTE \SYSA.$ZTC0.*
Name    Status   RefCnt
#ROU11  STARTED  0
#ROU9   STARTED  0
#ROU12  STARTED  0
#ROU8   STARTED  1
#ROU3   STOPPED  0
Monitoring NonStop TCP/IP Subnets To obtain the status of a NonStop TCP/IP subnet:
> SCF STATUS SUBNET #SN2
The system displays a listing similar to:
1-> STATUS SUBNET #SN2
TCPIP Status SUBNET \SYSA.$ZTC0.
PAGE 135
Communications Subsystems: Monitoring and Recovery Monitoring Other Communications Subsystems The system displays a listing similar to: 1-> STATUS LBU $ZZFOX.#X,D FOX Detailed Status LBU \COMM.$ZZFOX.#X Summary State..... STARTED Physical State.... LOADED Controller Type... FXSA Serial Links: Quality Phy-Neighbor Left: 1.0000 8 Right: 1.0000 12 Bootcode ID: Firmware ID: At 09 Jul 2001, 19:24:27.959 Logical State..... STARTED System Number..... 116 Cluster Number....
PAGE 136
Communications Subsystems: Monitoring and Recovery Monitoring Line-Handler Process Status Monitoring Line-Handler Process Status A line-handler process is a component of a data communications subsystem. It is an I/O process that transmits and receives data on a communications line, either directly or by communicating with another I/O process. This subsection explains how to monitor the status of a line-handler process on your system or on another system in your network to which you have remote access.
PAGE 137
Communications Subsystems: Monitoring and Recovery Monitoring Line-Handler Process Status Examples To check the detailed status of line $LHCS6S: > SCF STATUS LINE $LHCS6S, DETAIL A listing such as this output is sent to your home terminal: -> STATUS LINE $LHCS6S, DETAIL PPID.................... ( 3, 24) BPID................ ( 2, 24) State................... STOPPED Path LDEV........... 50 Trace Status............ OFF Clip Status......... UNLOADED ConMgr-LDEV.............
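Detailed listings such as the one above use dotted leaders between each field name and its value. If you capture such a listing as text, a short script can turn it into name/value pairs. The Python sketch below is an assumption-laden illustration: the function name is hypothetical, the sample reuses only the complete fields from the example, and the two-fields-per-line layout is assumed, because the original line breaks are not preserved here.

```python
import re

# Matches "Field name....... value" pairs. The value is either a
# parenthesized pair such as "( 3, 24)" or a single token that does
# not begin with a dot (so a field with no value is skipped).
FIELD = re.compile(r"([A-Za-z][A-Za-z -]*?)\s*\.{3,}\s*(\([^)]*\)|[^\s.]\S*)")


def parse_detail_fields(listing):
    """Collect dotted-leader fields from a DETAIL listing into a dict."""
    fields = {}
    for line in listing.splitlines():
        for name, value in FIELD.findall(line):
            fields[name.strip()] = value
    return fields


# Field lines taken from the example above; the line breaks are assumed.
SAMPLE = """\
PPID.................... ( 3, 24)   BPID................ ( 2, 24)
State................... STOPPED    Path LDEV........... 50
Trace Status............ OFF        Clip Status......... UNLOADED
"""

fields = parse_detail_fields(SAMPLE)
if fields.get("State") != "STARTED":
    print("Line is not started; current state:", fields.get("State"))
```

Pass only the field lines to the parser. A command echo on the same line as a field would be absorbed into the field name.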
PAGE 138
Communications Subsystems: Monitoring and Recovery Tracing a Communications Line The system displays a listing similar to this output.
PAGE 139
Communications Subsystems: Monitoring and Recovery Recovery Operations for Communications Subsystems Recovery Operations for Communications Subsystems Some general troubleshooting guidelines are:
• Examine the contents of the event message log for the subsystem. For example, the WAN subsystem or Kernel subsystem might have issued an event message that provides information about the process failure.
PAGE 140
Communications Subsystems: Monitoring and Recovery Related Reading Table 6-1. Related Reading for Communications Lines and Devices (page 2 of 2) For Information About... Refer to...
PAGE 141
7 ServerNet/DA: Monitoring and Recovery When to Use This Section on page 7-1 Overview of the ServerNet/DA on page 7-1 Monitoring the ServerNet/DA on page 7-1 Identifying Problems With the ServerNet/DA on page 7-2 Recovery Operations for the ServerNet/DA on page 7-3 Related Reading on page 7-3 When to Use This Section Use this section for monitoring and recovery information for the 6760 ServerNet device adapter (ServerNet/DA).
PAGE 142
ServerNet/DA: Monitoring and Recovery Identifying Problems With the ServerNet/DA Identifying Problems With the ServerNet/DA When monitoring the ServerNet/DA using the OSM Service Connection or the TSM Service Application, the Power State and the Subcomponent State of the ServerNet/DA should indicate normal operation. Table 7-1 lists the possible states for the ServerNet/DA. Table 7-1.
PAGE 143
ServerNet/DA: Monitoring and Recovery Recovery Operations for the ServerNet/DA Recovery Operations for the ServerNet/DA Refer to the 6760 ServerNet/DA Manual.
PAGE 145
8 Fibre Channel ServerNet Adapter: Monitoring and Recovery When to Use This Section on page 8-1 Overview of the FCSA on page 8-1 Monitoring the FCSAs on page 8-1 Identifying Problems With FCSAs on page 8-2 Recovery Operations for the FCSA on page 8-2 Related Reading on page 8-2 When to Use This Section Use this section for monitoring and recovery information for the Fibre Channel ServerNet adapters (FCSAs).
PAGE 146
Fibre Channel ServerNet Adapter: Monitoring and Recovery Identifying Problems With FCSAs The SCF Reference Manual for the Storage Subsystem provides reference details and examples for using the SCF INFO and SCF STATUS commands. Identifying Problems With FCSAs When monitoring FCSAs using the OSM Service Connection, the Service State and the Subcomponent State of the FCSAs should indicate normal operation.
PAGE 147
9 Disk Drives: Monitoring and Recovery When to Use This Section on page 9-1 Overview of Disk Drives on page 9-2 External Disk Drives on page 9-2 Internal SCSI Disk Drives on page 9-2 M8xxx Fibre Channel Disk Drives on page 9-3 Enterprise Storage System (ESS) Disks on page 9-4 Monitoring Disk Drives on page 9-4 Monitoring Disk Drives With OSM on page 9-5 Monitoring Disk Drives With SCF on page 9-6 Monitoring the State of Disk Drives on page 9-10 Monitoring the Use of Space on a Disk Volume on page 9-10 Monit
PAGE 148
Overview of Disk Drives Disk Drives: Monitoring and Recovery Overview of Disk Drives The NonStop S-series server supports:
• External Disk Drives
• Internal SCSI Disk Drives
• M8xxx Fibre Channel Disk Drives
• Enterprise Storage System (ESS) Disks
A system enclosure can contain different types of disk drives. However, both disk drives in a mirrored volume must always be the same type of drive.
PAGE 149
M8xxx Fibre Channel Disk Drives Disk Drives: Monitoring and Recovery Each disk drive CRU has a part number/bar code label, a write-on label (for the logical device name), and these indicator light-emitting diodes (LEDs): • Green power-on LED When lit, the green power-on LED indicates that the disk drive is receiving power. • Yellow or amber activity LED When lit, the yellow or amber activity LED indicates that the disk drive is executing a read or write command. These disk drives are Class-1 CRUs.
PAGE 150
Enterprise Storage System (ESS) Disks Disk Drives: Monitoring and Recovery Fibre Channel disk drives are field-replaceable units (FRUs). Any physical action on a FRU, including installation and replacement, must be performed only by a qualified HP service provider.
PAGE 151
Monitoring Disk Drives With OSM Disk Drives: Monitoring and Recovery Monitoring Disk Drives With OSM Task See Monitor the status of disk drives • • OSM Service Connection OSM Event Viewer Inventory the entire system, including disk drives OSM Inventory View Use: OSM Online Help • • • OSM Service Connection OSM Event Viewer OSM Inventory View Determine: • • • You can save this view as a file in Microsoft Excel.
PAGE 152
Monitoring Disk Drives With SCF Disk Drives: Monitoring and Recovery Monitoring Disk Drives With SCF This subsection explains how to list the disk volumes and determine their status. 1. To list the status of all magnetic disk volumes on your system, issue this command from SCF: > SCF STATUS DISK $*, SUB MAGNETIC 1-> STATUS DISK $*, SUB MAGNETIC STORAGE - Status DISK \COMM.$SYSTEM LDev Primary Backup Mirror 6 *STARTED STARTED *STARTED STORAGE - Status DISK \COMM.
PAGE 153
Monitoring Disk Drives With SCF Disk Drives: Monitoring and Recovery 2. Get information about a disk with SCF STATUS DISK, DETAIL. For example: -> STATUS DISK $DATA09, DETAIL The output from this example shows that $DATA09 is in the STOPPED state, HARDDOWN substate. 65-> SCF STATUS DISK $DATA09, DETAIL SCF - T9082G02 - (30JUN97) (14MAY97) - 11/05/98 13:24:10 System \SHARK STORAGE - Detailed Status DISK \SHARK.
PAGE 154
Monitoring Disk Drives With SCF Disk Drives: Monitoring and Recovery To display the status of all disks: -> STATUS DISK $* 1-> STATUS DISK $* STORAGE - Status DISK \COMM.$SYSTEM LDev Primary Backup Mirror 6 *STARTED STARTED *STARTED MirrorBackup STARTED Primary PID 0,257 Backup PID 1,257 Primary PID 2,288 Backup PID 3,267 STORAGE - Status VIRTUAL DISK \COMM.$VIEWPT LDev State Primary Backup Type Subtype PID PID 147 STARTED 9,22 8,53 3 36 STORAGE - Status VIRTUAL DISK \COMM.
PAGE 155
Monitoring Disk Drives With SCF Disk Drives: Monitoring and Recovery To display the detailed status of the disk $DATA01:
-> STATUS $DATA01, DETAIL
35-> STATUS $DATA01, DETAIL
Disk Path Information:
LDev  Path           PathStatus  State    SubState  Primary PID  Backup PID
63    PRIMARY        ACTIVE      STARTED            0,267        1,266
63    BACKUP         INACTIVE    STARTED            0,267        1,266
63    MIRROR         ACTIVE      STARTED            0,267        1,266
63    MIRROR-BACKUP  INACTIVE    STARTED            0,267        1,266
General Disk Information:
Device Type........... 3 Device Subtype...........
PAGE 156
Disk Drives: Monitoring and Recovery Monitoring the State of Disk Drives Monitoring the State of Disk Drives Each disk drive is configured to have two paths, the primary and the backup. (Each M8xxx disk drive is forced to have two paths.) The two path states are represented separately. Table 9-1 lists possible values for the current state of a disk path. Table 9-1. States for Disk Drive Paths Path State Description Degraded This path of this disk drive has a state other than Up.
PAGE 157
Disk Drives: Monitoring and Recovery Monitoring Disk Configuration and Performance A report similar to this one is sent to your home terminal: $DATA.FILES.FILEA 10 Jul 1993, 14:05 ENSCRIBE TYPE U CODE 100 EXT ( 224 PAGES, 14 PAGES ) ODDUNSTR MAXEXTENTS 370 BUFFERSIZE 4096 OWNER 8,255 SECURITY (RWEP): NUNU, LICENSED DATA MODIF: 10 Jul 1994, 14:04 CREATION DATE: 10 Jan 1994, 14:04 LAST OPEN: 10 Jul 1994, 14:04 EOF 267022 (58.2% USED) FILE LABEL: 822 (20.
PAGE 158
Identifying Disk Drive Problems Disk Drives: Monitoring and Recovery Identifying Disk Drive Problems The most common disk drive problems on a NonStop S-series server include:
• Space problems such as full disks or free-space fragmentation
• Stopped disks
• Performance problems
• Defective tracks or sectors
Table 9-2 lists the most common disk drive problems and their possible symptoms. For recovery operations, refer to Recovery Operations for Disk Drives on page 9-13. Table 9-2.
PAGE 159
Disk Drives: Monitoring and Recovery M8xxx Fibre Channel Disk Drives M8xxx Fibre Channel Disk Drives The most common disk problems when Fibre Channel disk drives are connected through an IOAM are intm-errors-exceeded and slow-IOs-threshold-exceeded errors on the Fibre Channel loop. Such errors are often normal. However, if they cause problems on a Fibre Channel loop, power the affected disk down and up again. This procedure can solve the problem temporarily.
PAGE 160
Disk Drives: Monitoring and Recovery Recovery Operations for Disk Drives Table 9-3. Common Recovery Operations for Disk Drives (page 2 of 3) Problem Recovery Disk full
1. Use DSAP to identify large, old, and little-used files.
2. If you are authorized:
• Use the BACKUP utility to back up these disk files to tape and then purge them from the disk. Do not purge important system files.
• Move files to another disk. Do not move important system files.
• Ask users to purge files.
PAGE 161
Disk Drives: Monitoring and Recovery Recovery Operations for a Down Disk or Down Disk Path Table 9-3. Common Recovery Operations for Disk Drives (page 3 of 3) Problem Recovery Corrupt $SYSTEM disk If both halves of your mirrored system volume become corrupted, use an alternate system disk if one is available. For how to create an alternate system disk, see the DSM/SCM User’s Guide and the NonStop S-Series Hardware Installation and FastPath Guide.
PAGE 162
Recovery Operations for a Nearly Full Database File Disk Drives: Monitoring and Recovery
STORAGE - Status DISK \ALPHA12.$DATA06-*
LDev  Path           Status    State
116   PRIMARY        ACTIVE    STARTED
116   BACKUP         INACTIVE  STARTED
116   MIRROR         INACTIVE  STOPPED
116   MIRROR-BACKUP  INACTIVE  STOPPED
STORAGE - Status DISK \ALPHA12.$WD8-*
LDev  Path           Status    State
96    PRIMARY        ACTIVE    STARTED
96    BACKUP         INACTIVE  STARTED
96    MIRROR         INACTIVE  STOPPED
96    MIRROR-BACKUP  INACTIVE  STOPPED
STORAGE - Status DISK \ALPHA12.
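To decide which disk paths need recovery, it can help to reduce a path listing like the one above to the paths that are not in the STARTED state. The Python sketch below shows one way to do that on captured text. It assumes one path per row with the columns LDev, Path, Status, and State in that order; the class and function names are hypothetical and are not part of SCF.

```python
from dataclasses import dataclass


@dataclass
class DiskPath:
    ldev: int
    path: str     # PRIMARY, BACKUP, MIRROR, or MIRROR-BACKUP
    status: str   # ACTIVE or INACTIVE
    state: str    # for example STARTED or STOPPED


def parse_disk_paths(listing):
    """Parse rows that begin with an LDev number; skip headers and titles."""
    paths = []
    for line in listing.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[0].isdigit():
            paths.append(DiskPath(int(fields[0]), fields[1], fields[2], fields[3]))
    return paths


def paths_needing_recovery(paths):
    """Return the names of paths whose state is anything other than STARTED."""
    return [p.path for p in paths if p.state != "STARTED"]


# Values taken from the $DATA06 listing; the row layout is assumed.
LISTING = r"""STORAGE - Status DISK \ALPHA12.$DATA06-*
LDev  Path           Status    State
116   PRIMARY        ACTIVE    STARTED
116   BACKUP         INACTIVE  STARTED
116   MIRROR         INACTIVE  STOPPED
116   MIRROR-BACKUP  INACTIVE  STOPPED
"""

print(paths_needing_recovery(parse_disk_paths(LISTING)))
```

For this listing, the sketch reports the MIRROR and MIRROR-BACKUP paths, which matches the two STOPPED rows.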
PAGE 163
Related Reading Disk Drives: Monitoring and Recovery A report such as this one is sent to your home terminal: $DATA.DATA1.MEMOS 12 Jul 1993, 14:05 ENSCRIBE TYPE U CODE 101 EXT ( 2 PAGES, 2 PAGES ) ODDUNSTR MAXEXTENTS 20 BUFFERSIZE 4096 OWNER 8,255 SECURITY (RWEP): NUNU DATA MODIF: 12 Jul 1993, 14:04 CREATION DATE: 12 Jan 1993, 14:04 LAST OPEN: 12 Jul 1993, 14:24 EOF 567022 (78.5% USED) FILE LABEL: 649 (22.
PAGE 165
10 Tape Drives: Monitoring and Recovery When to Use This Section on page 10-1 Overview of Tape Drives on page 10-2 Monitoring Tape Drives on page 10-3 Monitoring Tape Drive Status on page 10-3 Monitoring the Status of Labeled-Tape Operations on page 10-9 Identifying Tape Drive Problems on page 10-9 Recovery Operations for Tape Drives on page 10-10 Recovery Operations Using SCF on page 10-10 Recovery Operations Using the OSM Service Connection on page 10-10 Recovery Operations Using the TSM Service Applicati
PAGE 166
Overview of Tape Drives Tape Drives: Monitoring and Recovery Overview of Tape Drives Tape drives are external devices that connect to a NonStop S-series server using one of these methods: • • Through a 6760 ServerNet device adapter (ServerNet/DA) for G06.01 and subsequent G-series RVUs. For more information about ServerNet/DA, refer to Section 7, ServerNet/DA: Monitoring and Recovery.
PAGE 167
Tape Drives: Monitoring and Recovery Monitoring Tape Drives Monitoring Tape Drives These tools are available to monitor tape drives:
• Use the SCF interface to the storage subsystem, the OSM Service Connection, or the TSM Service Application to monitor and get status information about tape drives.
• Use MEDIACOM to monitor the use of tape drives and to write tape labels.
Monitoring Tape Drive Status This subsection explains how to list the tape drives on your system and determine their status. Note.
PAGE 168
Tape Drives: Monitoring and Recovery Monitoring Tape Drive Status Figure 10-2. Monitoring Tape Drives With OSM VST811.
PAGE 169
Tape Drives: Monitoring and Recovery Monitoring Tape Drive Status Monitoring Tape Drive Status With TSM To check the status of all tape drives on your system with the TSM Service Application: 1. Log on to the TSM Service Application. 2. From the tree pane (Figure 10-3): a. Double-click Tape Drives. b. Click the tape drive whose status you want to check. 3. From the Attributes tab in the details pane: a. Check that the Service State is OK.
PAGE 170
Tape Drives: Monitoring and Recovery Monitoring Tape Drive Status Figure 10-3. Monitoring Tape Drives With TSM CDT 810.
PAGE 171
Monitoring Tape Drive Status Tape Drives: Monitoring and Recovery Monitoring Tape Drive Status With SCF To check the status of all tape drives on your system with SCF: > SCF STATUS TAPE $* A listing similar to this one is sent to your home terminal: STORAGE - Status TAPE \MINDEN.$XTAPE LDev State Primary Backup PID PID 93 STOPPED 1,287 0,279 STORAGE - Status TAPE \MINDEN.
PAGE 172
Monitoring Tape Drive Status Tape Drives: Monitoring and Recovery Monitoring Tape Drive Status With MEDIACOM The MEDIACOM command STATUS TAPEDRIVE displays the current status of a tape drive. Among other things, this command tells you whether a tape is mounted on the drive, the name of the DEFINE associated with the tape, and which volume catalog and pool owns it. Note. Manual unloading of a tape is not detected by a tape drive, so information from STATUS TAPEDRIVE can be out of date.
PAGE 173
Tape Drives: Monitoring and Recovery Monitoring the Status of Labeled-Tape Operations Monitoring the Status of Labeled-Tape Operations Use the MEDIACOM STATUS TAPEDRIVE and STATUS TAPEMOUNT commands to determine the current status of labeled-tape operations on your system.
PAGE 174
Tape Drives: Monitoring and Recovery Recovery Operations for Tape Drives Recovery Operations for Tape Drives You can perform recovery operations on tape drives using either the SCF interface to the storage subsystem, the OSM Service Connection, or the TSM Service Application.
PAGE 175
Related Reading Tape Drives: Monitoring and Recovery b. Right-click the tape drive. c. Select Actions from the menu. The Actions dialog box appears. d. You can select up or down, which correspond to the SCF commands START and STOP, or select various tests to perform on the tape drive. For information on recovery operations, refer to the TSM online help or suggested Repair Actions text (listed under Alarm Details) for specific tape-related alarms in the TSM Service Application.
PAGE 176
Tape Drives: Monitoring and Recovery Related Reading
PAGE 177
11 Processors: Monitoring and Recovery When to Use This Section on page 11-2 Monitoring and Maintaining Processors on page 11-2 Monitoring Processor Status Using OSM or TSM on page 11-2 Monitoring Event Messages on page 11-3 Monitoring the State of PMF CRUs on page 11-3 Monitoring Processor Performance Using ViewSys on page 11-4 Identifying Processor Problems on page 11-5 Hardware Error Freezes on page 11-5 Processor Hangs on page 11-5 Processor Halts on page 11-5 Recovery Operations for Processors on page
PAGE 178
When to Use This Section Processors: Monitoring and Recovery When to Use This Section Use this section to monitor processors and to perform recovery operations such as processor dumps. Monitoring and Maintaining Processors Use OSM, TSM, the ViewSys product, and other tools to monitor processors.
PAGE 179
Processors: Monitoring and Recovery Monitoring Event Messages A graph next to each processor name also shows the recent history of busy levels for that processor. You can save a history of processor busy percentages with the Save History to file button. The history period is determined in the Time Frame slider in the Processor History Options box. Monitoring Event Messages For more information, refer to Monitoring EMS Event Messages on page 4-1.
PAGE 180
Processors: Monitoring and Recovery Monitoring Processor Performance Using ViewSys Monitoring Processor Performance Using ViewSys Use the ViewSys product to view system resources online and to see information on system performance. ViewSys provides information about processor activity. Using ViewSys, you can list the processors on your system and determine their status. For more information, refer to ViewSys on page B-8.
PAGE 181
Processors: Monitoring and Recovery Identifying Processor Problems Identifying Processor Problems Abnormal processor states include hardware error freezes, system hangs, and processor halts. Hardware Error Freezes A hardware error freeze occurs when a processor cannot continue processing due to the risk of using corrupt data from a hardware error. Contact your service provider before dumping a frozen processor.
PAGE 182
Processors: Monitoring and Recovery Recovery Operations for Processors
If system freeze is enabled, the status for all other freeze-enabled processors becomes:
Frozen by other processor
The Processor Halt Codes Manual documents processor halt codes.
Note. Do not freeze-enable a processor unless instructed to do so by your service provider.
Recovery Operations for Processors
Processor halts can sometimes be confused with other types of errors.
PAGE 183
Processors: Monitoring and Recovery Halting One or More Processors Halting One or More Processors To place a selected processor or processors in a halt state and set the status and registers of the processor or processors to an initial state: 1. Log on to the OSM or TSM Low-Level Link. 2. On the toolbar, click Processor Status. 3. In the Processor Status dialog box, select the processor to be halted or select all the processors to halt all of them. 4. Select Processor Actions>Halt. 5.
PAGE 184
Processors: Monitoring and Recovery Recovery Operations for a Hardware Error Freeze 7. Reload the remaining processors. Note. After reloading the remaining processors, run your startup scripts if any. Send the dumps to your service provider. Recovery Operations for a Hardware Error Freeze Contact your service provider. Depending on the circumstances, a hardware error freeze might require the PMF CRU or a memory unit to be replaced. See Replacing Processor Memory or a PMF CRU on page 11-25.
PAGE 185
Processors: Monitoring and Recovery Recovery Operations for a Processor Halt 3. Dump (copy) the contents of its memory to disk or tape unless otherwise indicated. Dumping the contents of a halted processor (its registers and entire memory contents) can be a useful diagnostic tool for analyzing and resolving the problem.
PAGE 186
Processors: Monitoring and Recovery Dumping a Processor to Disk Dumping a Processor to Disk A processor dump to disk occurs while the system is running. The dump occurs over either the X or Y ServerNet fabric. When a processor is dumped to disk, the RCVDUMP utility begins copying the dump in a compressed format from the specified processor into a disk file called dumpfile. If dumpfile does not exist, the RCVDUMP utility creates it.
PAGE 187
Processors: Monitoring and Recovery Dumping a Processor to Disk You will need this information when you notify your system manager or service provider about this dump. Procedure to Dump a Processor to Disk Complete syntax and considerations for RECEIVEDUMP and RCVDUMP, as well as the error and informational messages that they generate, are described in the Guardian User’s Guide. For an explanation of the messages generated by RCVDUMP, refer to the TACL Reference Manual.
PAGE 188
Processors: Monitoring and Recovery Enabling or Disabling System Freeze Enabling or Disabling System Freeze The Enable System Freeze tool is for debugging purposes only. Its intent is for use only under the direction of a service provider. Upon activation of Enable System Freeze, when one freeze-enabled processor halts, all other freeze-enabled processors also halt. The default setting is Disable System Freeze. Caution. Do not Enable System Freeze if you are using the server in a production environment.
PAGE 189
Processors: Monitoring and Recovery Enabling or Disabling Freeze on a Processor 4. Click Perform Action. If you selected Disable System Freeze, the action begins immediately. If you selected Enable System Freeze, a message prompts you to confirm the action. 5. If you selected Enable System Freeze, click OK or Cancel. If you clicked OK, the status of the action appears in the Action Status box. Enabling or Disabling System Freeze After System Discovery 1.
PAGE 190
Processors: Monitoring and Recovery Enabling or Disabling Freeze on a Processor
Checking If Freeze Is Enabled or Disabled on One or More Processors Using the Processor Status Dialog Box
1. Log on to the OSM or TSM Low-Level Link.
2. Do one of the following:
• From the Summary menu, choose Processor Status.
• On the toolbar, click Processor Status.
The Processor Status dialog box appears. If “F” appears next to a processor, freeze is enabled on that processor.
PAGE 191
Processors: Monitoring and Recovery Freezing the System or Processor
6. In the Action status box, monitor the status of the Enable Freeze or Disable Freeze action:
• After the Enable Freeze action has successfully finished, a completed message appears, and an “F” appears next to the processor in the Processor Status dialog box.
• After the Disable Freeze action has successfully finished, a completed message appears, and the “F” next to the processor disappears from the Processor Status dialog box.
PAGE 192
Processors: Monitoring and Recovery Freezing the System or Processor Freezing a Processor 1. Check the Processor Freeze attribute for each processor in the system: a. In the tree pane, click the system tab. b. Select the processor. c. Click the Attributes tab in the details pane and check the value of the Processor Freeze attribute. If you want a processor to freeze, make sure its Processor Freeze attribute is Enabled.
PAGE 193
Processors: Monitoring and Recovery Dumping a Processor to Tape (Down System Only) Dumping a Processor to Tape (Down System Only) If the entire system is down (all processors are halted), you can perform a tape dump using the OSM or TSM Low-Level Link. Your service provider can use the memory dump to troubleshoot your system. For more information on determining processor problems, see Monitoring Processor Status Using OSM or TSM on page 11-2.
PAGE 194
Processors: Monitoring and Recovery Dumping a Processor to Tape (Down System Only) Before You Begin To prepare for a tape dump: 1. Log on to the OSM or TSM Low-Level Link. 2. On the toolbar, click Processor Status. The Processor Status dialog box appears. 3. Write down the status message displayed in the Processor Status dialog box (Figure 11-2) for the processor to be dumped. You will need this information when you notify your system manager or service provider about this dump. Figure 11-2.
PAGE 195
Processors: Monitoring and Recovery Dumping a Processor to Tape (Down System Only) example, you can select processors 2, 3, and 4, but not 2 and 4. To select processors 2 and 4, use the Ctrl key with the left mouse button. b. In the Processor Action menu, scroll to Halt. c. Click Perform action. 5. Mount a tape that is not write-protected into that tape drive. For open-reel tapes, check that the write-enable ring is present. 6. Position the tape at the load point and put the tape drive online.
PAGE 196
Processors: Monitoring and Recovery Dumping a Processor to Tape (Down System Only) 2. In the Dump Processor-n to Tape dialog box, type: a. The SCSI ID of the tape drive. The default value is 4 or 5, which is the current software requirement. b. The location to which the tape drive is connected. 3. Click Dump. 4. Monitor the tape dump. In the Processor Status dialog box, verify that the processor that was dumped to tape shows halt code %1154. 5.
PAGE 197
Processors: Monitoring and Recovery Dumping All Processors in a System Dumping All Processors in a System Dump an entire server when you want to examine the contents of all processors on a frozen server. You must be logged on to the OSM or TSM Low-Level Link to perform this task. Note. Normally you do not perform system dumps. System dumps are performed primarily in development environments. Dumping an Entire Server 1. Enable system freeze. See Enabling or Disabling System Freeze on page 11-12. 2.
PAGE 198
Processors: Monitoring and Recovery Reloading a Single Processor on a Running Server Reloading a Single Processor on a Running Server Sometimes one or more processors in a running server are not operating. For information on how to determine whether a processor is operating, see Monitoring Processor Status Using OSM or TSM on page 11-2. After you have determined that a processor is not operating, check that the processor is halted. Dump (copy) its memory to disk.
PAGE 199
Processors: Monitoring and Recovery Loading a Processor From Disk
12. Either:
• If you selected Reset, type reload n,prime.
• If you selected Prime for Reload, type reload n.
Note. n is the number of the processor you want to reload.
13. Check the OutsideView window for status messages, which will report successes or errors during the load. Monitor the state of the processor you are loading until it is executing the NonStop operating system.
14.
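Step 12 pairs each processor action with a different form of the reload command. As a quick reference, that mapping can be written as a small Python helper. This is a hypothetical sketch: the function name `reload_command` is not part of TACL, OSM, or TSM, and the strings it returns simply reproduce the two commands given in step 12.

```python
def reload_command(processor, action):
    """Return the command text from step 12 for the action chosen earlier.

    `processor` is the processor number (n in the procedure).
    `action` is the label of the action that was selected: "Reset" or
    "Prime for Reload".
    """
    if action == "Reset":
        # A processor that was only reset is primed by the reload command itself.
        return f"reload {processor},prime"
    if action == "Prime for Reload":
        # The processor is already primed, so the plain form is used.
        return f"reload {processor}"
    raise ValueError(f"unexpected action: {action!r}")


if __name__ == "__main__":
    print(reload_command(3, "Reset"))
    print(reload_command(3, "Prime for Reload"))
```

Raising an error for any other label keeps the helper from producing a command for an action the procedure does not cover.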
PAGE 200
Processors: Monitoring and Recovery Copying a Dump File From Tape to Disk 6. Type the RVU of the software you want to load in the SYSnn edit box. 7. Select the configuration file using the option buttons. 8. Click the CIIN disabled check box if you want to disable the CIIN file. 9. Type the disk information in the group, module, and slot boxes. The $SYSTEM-P disk is located in group 1, module 1, slot 11. The $SYSTEM-M disk is located in group 1, module 1, slot 12. 10.
PAGE 201
Processors: Monitoring and Recovery Backing Up a Processor Dump to Tape 2. Wait until this message appears in the terminal-emulation window: { $tape# | dumpfile } HAS BEEN COPIED (COMPRESSED) TO destfile For more information, refer to the Guardian User’s Guide. Backing Up a Processor Dump to Tape To back up a processor dump to tape, either: • Back up a processor dump to tape from the compressed disk file generated by the TACL RECEIVEDUMP command (or the RCVDUMP utility): 1.
PAGE 202
Processors: Monitoring and Recovery Submitting Information to Your Service Provider Submitting Tapes of Processor Dumps Use a separate tape for each processor dump.
PAGE 203
Submitting Information to Your Service Provider Processors: Monitoring and Recovery Additional Information Required by Your Service Provider In addition to the tapes previously discussed, submit the information listed in Table 11-3 to your service provider. Table 11-3.
PAGE 204
Related Reading Processors: Monitoring and Recovery Related Reading For more information about tools used to monitor and perform recovery operations on processors, refer to the documentation listed in Table 11-4. Table 11-4.
PAGE 205
12 ServerNet Fabrics: Monitoring and Recovery When to Use This Section on page 12-1 Monitoring the Status of the ServerNet Fabrics on page 12-2 Monitoring the ServerNet Fabrics Using OSM or TSM on page 12-2 Monitoring the ServerNet Fabrics Using SCF on page 12-3 Identifying ServerNet Fabric Problems on page 12-5 Recovery Operations for the ServerNet Fabrics on page 12-6 Recovery Operations for a Down Disk Due to a Fabric Failure on page 12-6 Recovery Operations for a Down Path Between Processors on page 12-
PAGE 206
ServerNet Fabrics: Monitoring and Recovery Monitoring the Status of the ServerNet Fabrics
To monitor the status of the ServerNet fabrics:
• Use the OSM Service Connection or the TSM Service Application to check the communication between processor enclosures, I/O enclosures, and systems.
• Use the Subsystem Control Facility (SCF) to check the status of interprocessor communication on the X and Y fabrics.
PAGE 207
Monitoring the ServerNet Fabrics Using SCF ServerNet Fabrics: Monitoring and Recovery Monitoring the ServerNet Fabrics Using SCF The SCF STATUS SERVERNET command displays a matrix for the ServerNet X fabric and a matrix for the ServerNet Y fabric. Each matrix shows the status of the paths between all pairs of processors. Use the SCF STATUS SERVERNET command to display current information about the ServerNet fabric.
PAGE 208
ServerNet Fabrics: Monitoring and Recovery Monitoring the ServerNet Fabrics Using SCF
° The status from processors 2 through 15 is displayed as down.
Normal ServerNet Fabric States
Normal states for a path on the ServerNet fabrics can be one of:
• UP
The path from the processor in the FROM row to the processor in the TO column is up. The status for all ServerNet connections between existing processors in a system should be UP.
PAGE 209
ServerNet Fabrics: Monitoring and Recovery Identifying ServerNet Fabric Problems Identifying ServerNet Fabric Problems Depending on how your system is configured, these states for a path on the ServerNet fabrics might indicate a problem: • DIS (disabled) The ServerNet fabric is down at the TO location.
PAGE 210
ServerNet Fabrics: Monitoring and Recovery Recovery Operations for the ServerNet Fabrics Recovery Operations for the ServerNet Fabrics For most recovery operations, refer to the SCF Reference Manual for the Kernel Subsystem.
PAGE 211
13 Applications: Monitoring and Recovery When to Use This Section on page 13-1 Monitoring TMF on page 13-1 Monitoring the Status of TMF on page 13-2 Monitoring Data Volumes on page 13-2 TMF States on page 13-4 Monitoring the Status of Pathway on page 13-5 PATHMON States on page 13-6 Related Reading on page 13-6 When to Use This Section This section explains how to monitor the status of the HP NonStop Transaction Management Facility (TMF) and Pathway transaction processing applications.
PAGE 212
Applications: Monitoring and Recovery Monitoring the Status of TMF Monitoring the Status of TMF To monitor TMF using TMFCOM: 1. At a TACL prompt: > TMFCOM 2. At the TMFCOM prompt: ~ STATUS TMF Note. The STATUS TMF command presents status information about the audit dump, audit trail, and catalog processes. Thus, in addition to the general TMF information, the STATUS TMF command combines information from the STATUS AUDITDUMP, STATUS AUDITTRAIL, and STATUS BEGINTRANS commands.
PAGE 213
Applications: Monitoring and Recovery Monitoring Data Volumes
For example, to check the status of all data volumes, at a TMFCOM prompt, type:
~ STATUS DATAVOLS
TMFCOM responds with output similar to:
            Audit    Recovery
Volume      Trail    Mode       State
---------------------------------------------------
$DATA1      MAT      Online     Started
$DATA2      MAT      Online     Started
$DATA3      MAT      Online     Recovering
$DATA4      MAT      Archive    Recovering
$DATA5      AUX01    Online     Started
$DATA6      AUX01    Online     Started
$DATA6      AUX01    Archive    Recovering
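When many data volumes are configured, a captured copy of the STATUS DATAVOLS output can be scanned for volumes that are not in the Started state. The following Python sketch is illustrative only: the function names `parse_datavols` and `volumes_not_started` are hypothetical, and the parser assumes the four-column layout shown in this guide (volume, audit trail, recovery mode, state). It does not run TMFCOM; it only reads text that has already been captured.

```python
# Sample rows in the layout shown in this guide (an assumption; real output may differ).
SAMPLE = """\
            Audit    Recovery
Volume      Trail    Mode       State
---------------------------------------------------
$DATA1      MAT      Online     Started
$DATA2      MAT      Online     Started
$DATA3      MAT      Online     Recovering
$DATA4      MAT      Archive    Recovering
$DATA5      AUX01    Online     Started
"""


def parse_datavols(text):
    """Return one dict per data-volume row in the captured listing."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows start with a volume name ($...) and have exactly four columns;
        # header and separator lines do not, so they are skipped.
        if len(fields) == 4 and fields[0].startswith("$"):
            volume, audit_trail, recovery_mode, state = fields
            rows.append({
                "volume": volume,
                "audit_trail": audit_trail,
                "recovery_mode": recovery_mode,
                "state": state,
            })
    return rows


def volumes_not_started(text):
    """Return the names of volumes whose State column is not 'Started'."""
    return [r["volume"] for r in parse_datavols(text) if r["state"] != "Started"]


if __name__ == "__main__":
    for name in volumes_not_started(SAMPLE):
        print(name)
```

A volume reported here is not necessarily in trouble; a Recovering state can be part of normal operation, so the result is a list of volumes to look at, not a list of faults.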
PAGE 214
TMF States Applications: Monitoring and Recovery TMF States The TMF subsystem can be in any of the states listed in Table 13-1. Table 13-1. TMF States State Meaning Configuring New Audit Trails The TMF subsystem has not yet been started with this configuration. Deleting The TMF subsystem is purging its current configuration, audit trails, and volume and file recovery information for the database in response to a DELETE TMF command.
PAGE 215
Monitoring the Status of Pathway Applications: Monitoring and Recovery Monitoring the Status of Pathway Pathway is a group of related software tools that enables businesses to develop, install, and manage online transaction processing applications. Several Pathway environments can exist for a system. As a system operator, you might check the status of Pathway in your routine system monitoring. This subsection explains how to check the status of the Pathway transaction processing applications. 1.
PAGE 216
PATHMON States Applications: Monitoring and Recovery
PATHCOM responds with output such as:
PATHMON -         STATE=RUNNING    CPUS 6:1
 PATHCTL (OPEN)   $GROG.VIEWPT.PATHCTL
 LOG1 SE (OPEN)   $0
 LOG2    (CLOSED)
 REQNUM  FILE     PID     PAID    WAIT
   1     PATHCOM  $Y622   8,001
   2     TCP      $Y898
PATHMON States
The status of the PATHMON process can be either STARTING or RUNNING:
• STARTING indicates that a system load or cool start has not finished.
• RUNNING indicates that a system load or cool start has finished.
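The STATE= field in the PATHCOM output is the item an operator usually wants from this display. The following Python sketch shows one way to pull it out of captured output. It is illustrative only: the function names `pathmon_state` and `pathmon_is_running` are hypothetical, and the only assumption about the output is that it contains a `STATE=` field as in the example above.

```python
import re

# First line of the example output in this guide (spacing is an assumption).
SAMPLE = "PATHMON -  STATE=RUNNING  CPUS 6:1"


def pathmon_state(text):
    """Return the value of the STATE= field, or None if the field is absent."""
    match = re.search(r"STATE=(\w+)", text)
    return match.group(1) if match else None


def pathmon_is_running(text):
    """True only when PATHMON reports RUNNING (system load or cool start finished)."""
    return pathmon_state(text) == "RUNNING"


if __name__ == "__main__":
    print(pathmon_state(SAMPLE))
    print(pathmon_is_running(SAMPLE))
```

Returning None when the field is missing lets a caller tell apart "PATHMON is still STARTING" from "the captured text was not a PATHMON status display".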
PAGE 217
14 Printers and Terminals: Monitoring and Recovery When to Use This Section on page 14-1 Overview of Printers and Terminals on page 14-1 Monitoring Printer and Collector Process Status on page 14-2 Monitoring Printer Status on page 14-2 Monitoring Collector Process Status on page 14-2 Recovery Operations for Printers and Terminals on page 14-3 Recovery Operations for a Full Collector Process on page 14-3 Related Reading on page 14-3 When to Use This Section This section provides a brief overview about moni
PAGE 218
Monitoring Printer and Collector Process Status Printers and Terminals: Monitoring and Recovery Monitoring Printer and Collector Process Status This subsection explains how to list the printers on your system and determine their status. It also explains how to check the status of the spooler subsystem collector processes, which accept output from applications and store the output on a disk.
PAGE 219
Printers and Terminals: Monitoring and Recovery Recovery Operations for Printers and Terminals This listing shows that the three collector processes, $S, $S1, and $S2, are active and none is approaching a full state.
PAGE 220
Printers and Terminals: Monitoring and Recovery Related Reading
PAGE 221
15 Power Failures: Preparation and Recovery When to Use This Section on page 15-1 How an Enclosure Responds to Power Failures on page 15-1 How External Devices Respond to Power Failures on page 15-2 With an Uninterruptible Power Supply (UPS) on page 15-2 Without an Uninterruptible Power Supply (UPS) on page 15-2 Preparing for Power Failure on page 15-3 Monitoring Power Supplies on page 15-3 Maintaining Batteries on page 15-3 Recharging Spare Batteries on page 15-3 Monitoring Batteries on page 15-3 Causes of
PAGE 222
Power Failures: Preparation and Recovery How External Devices Respond to Power Failures
• The power-fail delay time
The default power-fail delay time is 30 seconds, but this time can vary depending on how your system is configured. In some circumstances, the operating system might shorten the power-fail delay time. With a shorter power-fail delay time, the batteries might be able to provide power to the memory for longer than the normal 45 minutes.
PAGE 223
Power Failures: Preparation and Recovery Preparing for Power Failure Preparing for Power Failure To prepare for power failures, regularly monitor power supplies and batteries. Monitoring Power Supplies Monitor power-generating equipment and run regular checks on any backup generators to make sure that you can handle extended power outages. Maintaining Batteries Make sure that the batteries in each enclosure and all spare batteries are always fully charged.
PAGE 224
Power Failures: Preparation and Recovery Recharging Drained Batteries
• Hardware problems:
° A power monitor and control unit (PMCU) failure has occurred.
° A battery failure has occurred.
° A processor multifunction (PMF) CRU hardware failure has occurred.
The failed hardware component might need to be replaced. Refer to the NTL Support and Service Library.
Recharging Drained Batteries
Batteries are automatically recharged when the system is running.
PAGE 225
Power Failures: Preparation and Recovery Setting System Time 2. Log on to the OSM Service Connection or the TSM Service Application, and then: a. Check the state of the batteries as described in Monitoring Batteries on page 15-3. b. Check the status of all system components in the enclosures to make sure they are started. 3. Use SCF commands to check the status of external devices and, if necessary, to restart any external devices to bring them back online.
PAGE 226
Power Failures: Preparation and Recovery Related Reading
PAGE 227
16 Starting and Stopping the System When to Use This Section on page 16-2 Minimizing the Frequency of Planned Outages on page 16-2 Anticipating and Planning for Change on page 16-2 Performing a Change Online on page 16-3 Powering On the System on page 16-3 Before Powering On the System on page 16-3 System Power-On Procedure on page 16-4 Troubleshooting and Recovery Operations When Powering On the System on page 16-4 Starting the System on page 16-6 The System Startup Dialog Box on page 16-6 The Load Process
PAGE 228
Starting and Stopping the System When to Use This Section When to Use This Section You normally leave a system running. Therefore, powering the system on and off, or starting (performing a system load) and stopping the system, are not part of the daily operations routine. However, you do have to perform these procedures as part of some system operations.
PAGE 229
Starting and Stopping the System Performing a Change Online increase the maximum number of objects controlled by PATHMON objects without a system shutdown. Performing a Change Online You can perform many changes to a NonStop S-series system online. For information on hardware changes, application changes, and communications subsystem changes you can perform without shutting the system down, refer to the NonStop S-Series Planning and Configuration Guide and the Availability Guide for Change Management.
PAGE 230
Starting and Stopping the System System Power-On Procedure System Power-On Procedure To power on a system: 1. Locate the power-on push button above the handle on either processor multifunction (PMF) customer-replaceable unit (CRU) in group 01 (the group containing processors 0 and 1). Refer to Section 2, Determining Your System Configuration. 2. Press and hold down the power-on push button for at least one second. 3.
PAGE 231
Starting and Stopping the System Troubleshooting and Recovery Operations When Powering On the System Green LED Is Not Lit After POSTs Finish It can take several minutes for the green LEDs on all system components to light: 1. Check that fans are turning and that the AC power cords and power-on cables are properly connected. 2. Wait for the POSTs to finish. It might take as long as 10 minutes for all system components. 3. If the green LEDs still do not light: a.
PAGE 232
Starting the System Starting and Stopping the System Starting the System Starting a system involves loading the NonStop operating system into the memory of each processor in the server. Use the OSM or TSM Low-Level Link to start a system by either: • • Using the System Startup dialog box is the normal method for most circumstances, if you are performing a system load from the system disks located in slots 1.1.11 and 1.1.12.
PAGE 233
Starting and Stopping the System The System Startup Dialog Box first processor to be “Executing NonStop OS” after the system load finishes successfully. If the system load fails along all eight paths, refer to Troubleshooting and Recovery Operations When Starting the System on page 16-11. After the first processor is loaded, the initial TACL process automatically invokes the CIIN file unless the CIIN file is disabled.
PAGE 234
Starting and Stopping the System The System Startup Dialog Box
• Base (CONFBASE) is the most basic configuration required for system startup. You will probably never need to load the system from the CONFBASE file. However, if the current configuration file has become corrupted and there is no other configuration file from which you can load the system, use this option.
4. Make sure that the CIIN disabled check box is not selected if you want the command in the CIIN file to execute.
5.
PAGE 235
Starting and Stopping the System The Load Processor-n From Disk Dialog Box For example, if you load the \EAST system from the CONFBASE file (which specifies \NONAME as the system name), an INFO SUBSYS $ZZKRN command displays \EAST as the current system and \NONAME as a pending change. Enter an ALTER SUBSYS command to change the system name to \EAST and cause the pending change to disappear. It is not displayed when you enter INFO SUBSYS again.
PAGE 236
Starting and Stopping the System The Load Processor-n From Disk Dialog Box 4. Select File>Start Terminal Emulator>For Event Streams. Procedure to Use the Load Processor-n From Disk Dialog Box To perform a system load into a specified processor, perform these steps from the OSM or TSM Low-Level Link: 1. From the toolbar, click Processor Status. The Processor Status dialog box appears. 2. In the Processor Status dialog box: a. Select the processor you want to load. b.
PAGE 237
Starting and Stopping the System Troubleshooting and Recovery Operations When Starting the System 3. In the Load Processor-n From Disk dialog box: a. Type the current SYSnn. b. Select the current configuration file (CONFIG), or if you are unable to load using the CONFIG file, select a saved version (CONFxxyy). c. Check the CIIN disabled option if you plan to dump processors. d. Type the group, module, and slot numbers of the disk from which you want to load.
PAGE 238
Troubleshooting and Recovery Operations When Starting the System Starting and Stopping the System Startup Event Stream and Startup TACL Windows Do Not Appear Although these windows would probably not appear during system startup, you can use the OutsideView product to configure or open a startup event stream window or startup TACL window. To open startup event stream windows and startup TACL windows: 1. Log on to the OSM or TSM Low-Level Link. The Management window appears. 2.
PAGE 239
Starting and Stopping the System Troubleshooting and Recovery Operations When Starting the System 2. Select Session>New. The New Session Properties dialog box appears. 3. On the Session tab, in the Session Caption box, type a session caption name such as Startup Events or Startup TACL. 4. Click IO Properties. The TCP/IP Properties dialog box appears. 5.
PAGE 240
Starting and Stopping the System Troubleshooting and Recovery Operations When Starting the System
• The System Status and Detailed Status boxes in the System Startup dialog box
• OSM or TSM Event Viewer
2. Record any event messages or halt codes, and refer to the appropriate documentation for recovery information.
PAGE 241
Starting and Stopping the System Troubleshooting and Recovery Operations When Starting the System
• If possible, look up event messages in the EMS logs ($0 and $ZLOG), and refer to the OSM or TSM Event Viewer online help or the Operator Messages Manual for further information about the cause, effect, and recovery procedure for this event. (If you configured your processor to print event messages to a hard-copy printer, you might be able to retrieve messages sent while the system was going down.
PAGE 242
Starting and Stopping the System Troubleshooting and Recovery Operations When Starting the System 2. In the Configuration File box, select Base (CONFBASE) as the configuration file. 3. Click Start system. b. From the startup TACL prompt, issue this command for each of the processors to be reloaded: > RELOAD (nn), PRIME c. From the Startup TACL window, configure a tape drive. d. Restore a previously backed-up configuration file. e.
PAGE 243
Getting a Corrupt System Configuration File Analyzed Starting and Stopping the System Getting a Corrupt System Configuration File Analyzed If the current system configuration file is corrupted, you can send it to your service provider for an analysis. Follow these steps: 1. Return to a saved, stable configuration file by following the procedure outlined in Procedure to Use the System Startup Dialog Box on page 16-7. 2. Once the system is up and stable, copy to a backup tape the corrupt CONFSAVE file.
PAGE 244
Starting and Stopping the System Stopping the System You might be able to automate the startup of many processes, lines, devices, and applications. For example, you can use the SCF interface to the Kernel subsystem to add process names to the system configuration database, typically monitor or manager processes such as $ZEXP, the Expand manager process; $ZNET, the Subsystem Control Point (SCP) process; or $ZPMON, the OSS monitor process. Refer to the SCF Reference Manual for the Kernel Subsystem.
PAGE 245
Starting and Stopping the System Preparing to Stop the System Preparing to Stop the System You must stop all applications, devices, and processes in an orderly fashion before you stop a system. You might be able to automate the shutdown of lines, devices, and applications by including commands in one or more shutdown command files that you invoke from either a TACL prompt or another shutdown file. Shutdown command files contain a series of commands that automatically execute when the file is executed.
PAGE 246
Starting and Stopping the System Procedure to Stop the System Using OSM or TSM b. Stop Distributed Systems Management/Software Configuration Manager (DSM/SCM) if it is running. At a TACL prompt: 1. Type this VOLUME command: > VOLUME $DSMSCM.ZDSMSCM 2. Stop DSM/SCM: > RUN STOPSCM 4. Stop communications lines, such as Expand lines. 5. Identify and stop any remaining processes that should be stopped individually: a. Use the TACL PPD and STATUS commands to help you identify running processes. b.
PAGE 247
Starting and Stopping the System Powering Off the System 6. A message box asks whether you are sure you want to perform a halt on the selected processors. Click OK. Powering Off the System The system powers off by powering off all system components and finally shutting down the power supplies. In this state, you can power up the system only by pressing the power-on push button on either PMF CRU in group 01.
PAGE 248
Starting and Stopping the System Emergency Power-Off Procedure 3. A message box prompts you to confirm the power off system action. To power off, click OK. 4. Shut off AC power to all peripherals and subsystems. The system is powered off, and you are automatically logged off of the Low-Level Link. • To power off a stopped system after system discovery: 1. Select Display>Actions. 2. The Actions dialog box appears. In the Actions box, select Power Off. 3. Click Perform Action. 4.
PAGE 249
Starting and Stopping the System Reducing Shutdown Time Reducing Shutdown Time An important component of a planned outage is the time required to start and stop your applications, devices, and processes. These general techniques can help reduce the time required to start up and shut down these objects:
• Write efficient startup and shutdown command files.
• Use parallel processing to distribute startup and shutdown processes across multiple processors.
PAGE 250
Starting and Stopping the System Use Parallel Processing Multiple-line commands in a command file increase execution time. By using single-line commands, you can reduce the time required to execute a command file. Avoid Manual Intervention Write startup and shutdown files so that they execute correctly without requiring manual intervention. Any time an operator must intervene, startup and shutdown time increase.
PAGE 251
Investigate Product-Specific Techniques Starting and Stopping the System For additional suggestions on using parallel processing to enhance operational efficiency, refer to the Availability Guide for Change Management. Investigate Product-Specific Techniques Certain products provide commands that can reduce the time required to start up or shut down their services.
PAGE 252
Related Reading Starting and Stopping the System Table 16-1.
PAGE 253
17 Preventive Maintenance When to Use This Section on page 17-1 Monitoring Physical Facilities on page 17-1 Checking Air Temperature and Humidity on page 17-1 Checking Physical Security on page 17-1 Maintaining Order and Cleanliness on page 17-2 Checking Fire-Protection Systems on page 17-2 Cleaning System Components on page 17-2 Cleaning an Enclosure on page 17-2 Cleaning and Maintaining Printers on page 17-2 Cleaning Tape Drives on page 17-3 Handling and Storing Cartridge Tapes on page 17-3 When to Use
PAGE 254
Maintaining Order and Cleanliness Preventive Maintenance Maintaining Order and Cleanliness Clutter and debris can cause accidents and fires. Dust, smoke, and spilled liquids can damage system hardware components. Depending on your company’s policies, you might be asked to keep the computer room clean; inspect air filters; keep printer dust under control through periodic vacuuming; and enforce a ban on smoking, eating, and drinking in the computer room.
PAGE 255
Cleaning Tape Drives Preventive Maintenance Cleaning Tape Drives Clean tape drive heads and sensors frequently. For detailed information on cleaning tape drives, refer to the documentation shipped with your tape drive. How often you clean a tape drive or the tape path depends on use, operating environment, and tape quality. Cleaning supplies are available from HP.
PAGE 256
Preventive Maintenance Handling and Storing Cartridge Tapes
• Do not remove the leader block, pull out the tape, or press the reel lock.
• If the leader block is detached from the tape, contact the tape supplier for a leader block repair kit.
• When transporting cartridge tapes, do not stack the cartridges more than six high. Pack them carefully with the reel sides upright. The leader block edges can crack if they engage with each other.
PAGE 257
A Operational Differences Between Systems Running D-Series and G-Series RVUs Users familiar with systems running D-series RVUs will find several major differences in the operational environment of systems running G-series RVUs. Although many of the operations to be performed remain the same, the tools you use to execute these operations might differ significantly.
PAGE 258
Operational Differences Between Systems Running D-Series and G-Series RVUs HP NonStop S-Series Operations Guide—522459-008 A- 2
PAGE 259
B Tools and Utilities for Operations When to Use This Appendix on page B-1 BACKCOPY on page B-2 BACKUP on page B-2 Disk Compression Program (DCOM) on page B-2 Disk Space Analysis Program (DSAP) on page B-2 EMSDIST on page B-2 Event Management Service Analyzer (EMSA) on page B-2 File Utility Program (FUP) on page B-3 Measure on page B-3 MEDIACOM on page B-3 NSKCOM and the Kernel-Managed Swap Facility (KMSF) on page B-3 Object Monitoring Facility (OMF) on page B-3 OSM Package on page B-4 PATHCOM on page B-4 P
PAGE 260
Tools and Utilities for Operations BACKCOPY BACKCOPY Use the BACKCOPY utility to create one or two duplicate tapes for archive storage, distribution, or disaster recovery. You can also create one or two labeled (or unlabeled) tape sets from a labeled or unlabeled tape set. The BACKCOPY utility duplicates tapes that are made from a BACKUP utility file-mode operation, but it cannot duplicate tapes that are made from a BACKUP utility volume-mode operation.
PAGE 261
Tools and Utilities for Operations File Utility Program (FUP) File Utility Program (FUP) The File Utility Program (FUP) is a component of the standard software package for the NonStop operating system. FUP software is designed to help you manage disk files, nondisk devices (printers, terminals, and tape drives), and processes (running programs) on the NonStop S-series system. You can use FUP to create, display, and duplicate files; load data into files; alter file characteristics; and purge files.
PAGE 262
Tools and Utilities for Operations OSM Package OSM Package The HP Open System Management (OSM) product replaces TSM as the system management tool of choice for NonStop S-series systems. OSM applications perform all of the same functions that TSM does. However, OSM offers a browser-based interface that improves scalability and performance and overcomes other limitations that exist in TSM. TSM is still supported, but OSM is required to support new functionality in G06.21 and later. For G06.
PAGE 263
Tools and Utilities for Operations Subsystem Control Facility (SCF) Subsystem Control Facility (SCF) SCF configures and manages several subsystems that control system processes and hardware, including communications paths, disks, tapes, terminals, printers, and communications lines. You can run SCF from any workstation or terminal on the system after you are logged on.
PAGE 264
Tools and Utilities for Operations TSM Package TSM Package If you are running a version of TSM earlier than version 2000B, see the online documentation included with that version for a description of the TSM package. TSM provides troubleshooting, maintenance, and service tools for systems running G-series RVUs.
PAGE 265
Tools and Utilities for Operations TSM Service Application Notification Director also forwards information from NonStop S-series servers to the TSM Low-Level Link and the TSM Service Application to ensure that these applications display up-to-date information. TSM Service Application The TSM Service Application allows you to communicate with a NonStop S-series server when the operating system is running.
PAGE 266
Tools and Utilities for Operations ViewSys ViewSys ViewSys is a system resource monitor that displays processor performance statistics and resource consumption for a set polling period. It updates the numbers automatically at the end of each polling period, which allows you to evaluate the effects of changes as those changes are made. ViewSys indicates the current allocation of a given resource and the percentage of that resource used.
PAGE 267
C Related Reading For more information about tools and utilities used for system operations, refer to the documentation listed in Table C-1. Table C-1. Related Reading for Tools and Utilities (page 1 of 6) Tool Documentation Description BACKCOPY Guardian Disk and Tape Utilities Reference Manual This manual describes these disk and tape utilities: BACKCOPY, BACKUP, DCOM, DSAP, RESTORE, and TAPECOM. This manual supports both G-series and D-series RVUs; TAPECOM is not supported for G-series RVUs.
PAGE 268
Related Reading Table C-1. Related Reading for Tools and Utilities (page 2 of 6) Tool Documentation Description MEDIACOM (continued) Guardian User’s Guide This guide contains information explaining how to perform routine operations relating to the tapes and tape drives on your system. The guide explains the MEDIACOM utility and provides examples for using it.
PAGE 269
Related Reading Table C-1.
PAGE 270
Related Reading Table C-1. Related Reading for Tools and Utilities (page 4 of 6) Tool Documentation Description RESTORE Guardian Disk and Tape Utilities Reference Manual This manual describes these disk and tape utilities: BACKCOPY, BACKUP, DCOM, DSAP, RESTORE, and TAPECOM. This manual supports both G-series and D-series RVUs; TAPECOM is not supported for G-series RVUs.
PAGE 271
Related Reading Table C-1. Related Reading for Tools and Utilities (page 5 of 6) Tool Documentation Description SCF interface to the WAN subsystem WAN Subsystem Configuration and Management Manual This manual describes how to configure a ServerNet wide area network (SWAN) concentrator on a NonStop S-series server. It also describes how to monitor, modify, and control the WAN subsystem. It includes detailed descriptions of the SCF commands used with the WAN subsystem.
PAGE 272
Related Reading Table C-1. Related Reading for Tools and Utilities (page 6 of 6) Tool Documentation Description ViewPoint ViewPoint Manual This manual describes ViewPoint, a multifunction operations console application that allows the management of a network of systems. The manual contains information on installing, configuring, and starting ViewPoint for custom applications. It also describes the concepts underlying ViewPoint operation.
PAGE 273
D Converting Numbers When to Use This Appendix on page D-1 Overview of Numbering Systems on page D-2 Binary to Decimal on page D-3 Octal to Decimal on page D-4 Hexadecimal to Decimal on page D-5 Decimal to Binary on page D-7 Decimal to Octal on page D-8 Decimal to Hexadecimal on page D-9 When to Use This Appendix Refer to this appendix if you need to convert numbers from one numbering system to another.
PAGE 274
Overview of Numbering Systems Converting Numbers Overview of Numbering Systems Internally, a computer stores data as a series of off and on values represented symbolically by the binary digits, or bits, 0 and 1, respectively. Because numbers represented as strings of binary 0s and 1s are difficult to read, binary numbers are generally converted into octal, decimal, or hexadecimal form. Table D-1 describes the binary, octal, decimal, and hexadecimal number systems. Table D-1.
PAGE 275
Binary to Decimal Converting Numbers Binary to Decimal To convert a binary number to a decimal number: 1. Starting from the right, multiply the least significant (rightmost) binary digit by the first placeholder value. Moving towards the left, multiply each new binary digit by its corresponding placeholder value until the binary number is exhausted. To establish placeholder values, the first placeholder value (on the far right) is 1.
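The placeholder method described above can be sketched in a few lines of Python. This is an illustrative sketch only, not part of the original guide: the function name binary_to_decimal is hypothetical, and the rule that each placeholder value to the left is twice the previous one is the standard base-2 rule, stated here as an assumption because the excerpt above is cut off before it.

```python
def binary_to_decimal(bits: str) -> int:
    """Convert a string of binary digits to an integer using placeholder values.

    Hypothetical helper for illustration; not a NonStop utility.
    """
    total = 0
    placeholder = 1  # the first placeholder value (on the far right) is 1
    for digit in reversed(bits):  # start from the least significant digit
        if digit not in "01":
            raise ValueError(f"not a binary digit: {digit!r}")
        total += int(digit) * placeholder
        placeholder *= 2  # assumed rule: each placeholder to the left doubles
    return total


# 1*1 + 1*2 + 0*4 + 1*8
print(binary_to_decimal("1011"))  # prints 11
```

The same loop handles octal or hexadecimal input if the multiplier 2 is replaced by 8 or 16 and the digit check is widened accordingly.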
PAGE 276
Converting Numbers Octal to Decimal Octal to Decimal To convert an octal number to a decimal number: 1. Starting from the right, multiply the least significant (rightmost) octal digit by the first placeholder value. Moving towards the left, multiply each new octal digit by its corresponding placeholder value until the octal number is exhausted. To establish placeholder values, the first placeholder value on the far right is 1.
PAGE 277
Converting Numbers Hexadecimal to Decimal Hexadecimal to Decimal To convert a hexadecimal number to a decimal number: 1. Starting from the right, multiply the least significant (rightmost) hexadecimal digit by the first placeholder value. Moving towards the left, multiply each new hexadecimal digit by its corresponding placeholder value until the hexadecimal number is exhausted. To establish placeholder values, the first placeholder value (on the far right) is 1.
PAGE 278
Converting Numbers Hexadecimal to Decimal
Figure D-3. Hexadecimal to Decimal Conversion
Hexadecimal number: BA10
Placeholder values: ... 4096 256 16 1
0 * 1 = 0
1 * 16 = 16
10 (A) * 256 = 2560
11 (B) * 4096 = 45056
Total: 47632
1. Take the rightmost hexadecimal digit and multiply it by the rightmost placeholder value. 2. Moving to the left, take the next hexadecimal digit and multiply it by the next placeholder value. Continue to do this until the hexadecimal number has been exhausted.
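The worked example in Figure D-3 can be checked with a short Python sketch. The names HEX_DIGITS and hex_to_decimal are hypothetical and chosen only for illustration; the placeholder values 1, 16, 256, and 4096 come from the figure.

```python
HEX_DIGITS = "0123456789ABCDEF"  # position in this string = digit value


def hex_to_decimal(text: str) -> int:
    """Convert a hexadecimal string to an integer using placeholder values.

    Hypothetical helper for illustration; not a NonStop utility.
    """
    total = 0
    placeholder = 1  # rightmost placeholder value
    for char in reversed(text.upper()):
        value = HEX_DIGITS.index(char)  # raises ValueError for a non-hex character
        total += value * placeholder
        placeholder *= 16  # 1, 16, 256, 4096, ...
    return total


# 0*1 + 1*16 + 10*256 + 11*4096, as in Figure D-3
print(hex_to_decimal("BA10"))  # prints 47632
```

Python's built-in int("BA10", 16) returns the same value and is the more practical choice outside of a teaching example.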
PAGE 279
Converting Numbers Decimal to Binary Decimal to Binary To convert a decimal number to a binary number: 1. Divide the decimal number by 2. The remainder of this first division becomes the least significant (rightmost) digit of the binary value. 2. Divide the quotient from Step 1 by 2, and use the remainder of the next division as the next digit (to the left) of the binary value. Continue to divide the quotients by 2 until the decimal number is exhausted.
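The repeated-division procedure above might look like this in Python. The function name decimal_to_binary is hypothetical, and the sketch assumes a non-negative integer input, which the procedure implies but does not state.

```python
def decimal_to_binary(number: int) -> str:
    """Convert a non-negative integer to a binary string by repeated division by 2.

    Hypothetical helper for illustration; not a NonStop utility.
    """
    if number < 0:
        raise ValueError("expected a non-negative integer")
    if number == 0:
        return "0"
    digits = []
    while number > 0:
        # The remainder of each division is the next digit, right to left.
        number, remainder = divmod(number, 2)
        digits.append(str(remainder))
    # Remainders were collected least significant first, so reverse them.
    return "".join(reversed(digits))


# 11/2 = 5 r 1, 5/2 = 2 r 1, 2/2 = 1 r 0, 1/2 = 0 r 1
print(decimal_to_binary(11))  # prints 1011
```

Replacing the divisor 2 with 8 gives the decimal-to-octal procedure with no other changes.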
PAGE 280
Converting Numbers Decimal to Octal Decimal to Octal To convert a decimal number to an octal number: 1. Divide the decimal number by 8. The remainder of this first division becomes the least significant (rightmost) digit of the octal value. 2. Divide the quotient from Step 1 by 8, and use the remainder of the next division as the next digit (to the left) of the octal value. Continue to divide the quotients by 8 until the decimal number is exhausted.
PAGE 281
Converting Numbers Decimal to Hexadecimal Decimal to Hexadecimal To convert a decimal number to a hexadecimal number: 1. Divide the decimal number by 16. The remainder of this first division becomes the least significant (rightmost) digit of the hexadecimal value. If the remainder exceeds 9, convert the 2-digit remainder to its hexadecimal letter equivalent. Use this table for conversion. Decimal Hexadecimal 10 A 11 B 12 C 13 D 14 E 15 F 2.
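As a rough sketch, the division-by-16 procedure and the remainder table above can be expressed in Python as follows. The names HEX_LETTERS and decimal_to_hex are hypothetical; the letter mapping is taken directly from the table, and a non-negative integer input is assumed.

```python
# Remainders above 9 map to letters, as in the conversion table.
HEX_LETTERS = {10: "A", 11: "B", 12: "C", 13: "D", 14: "E", 15: "F"}


def decimal_to_hex(number: int) -> str:
    """Convert a non-negative integer to a hexadecimal string by repeated division by 16.

    Hypothetical helper for illustration; not a NonStop utility.
    """
    if number < 0:
        raise ValueError("expected a non-negative integer")
    if number == 0:
        return "0"
    digits = []
    while number > 0:
        number, remainder = divmod(number, 16)
        # Remainders 0 through 9 are used as-is; 10 through 15 become A through F.
        digits.append(HEX_LETTERS.get(remainder, str(remainder)))
    # The first remainder is the rightmost digit, so reverse the list.
    return "".join(reversed(digits))


# 47632/16 = 2977 r 0, 2977/16 = 186 r 1, 186/16 = 11 r 10 (A), 11/16 = 0 r 11 (B)
print(decimal_to_hex(47632))  # prints BA10
```

The result BA10 matches the hexadecimal number converted in the other direction in Figure D-3.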
PAGE 283
Safety and Compliance This section contains three types of required safety and compliance statements:
• Regulatory compliance
• Waste Electrical and Electronic Equipment (WEEE)
• Safety
Regulatory Compliance Statements The following regulatory compliance statements apply to the products documented by this manual. FCC Compliance This equipment has been tested and found to comply with the limits for a Class A digital device, pursuant to part 15 of the FCC Rules.
PAGE 284
Safety and Compliance Regulatory Compliance Statements Korea MIC Compliance Taiwan (BSMI) Compliance Japan (VCCI) Compliance This is a Class A product based on the standard of the Voluntary Control Council for Interference by Information Technology Equipment (VCCI). If this equipment is used in a domestic environment, radio disturbance may occur, in which case the user may be required to take corrective actions.
PAGE 285
Safety and Compliance Regulatory Compliance Statements European Union Notice Products with the CE Marking comply with both the EMC Directive (89/336/EEC) and the Low Voltage Directive (73/23/EEC) issued by the Commission of the European Community.
PAGE 286
Safety and Compliance SAFETY CAUTION SAFETY CAUTION The following icon or caution statements may be placed on equipment to indicate the presence of potentially hazardous conditions: DUAL POWER CORDS CAUTION: "THIS UNIT HAS MORE THAN ONE POWER SUPPLY CORD. DISCONNECT ALL POWER SUPPLY CORDS TO COMPLETELY REMOVE POWER FROM THIS UNIT." "ATTENTION: CET APPAREIL COMPORTE PLUS D'UN CORDON D'ALIMENTATION. DÉBRANCHER TOUS LES CORDONS D'ALIMENTATION AFIN DE COUPER COMPLÈTEMENT L'ALIMENTATION DE CET ÉQUIPEMENT."
PAGE 287
Safety and Compliance Waste Electrical and Electronic Equipment (WEEE) HIGH LEAKAGE CURRENT To reduce the risk of electric shock due to high leakage currents, a reliable grounded (earthed) connection should be checked before servicing the power distribution unit (PDU).
PAGE 288
Safety and Compliance Safety Safety Safety information can be accessed from the left navigation area of the NTL home page: select NonStop Computing>Important Safety Information. A document window containing a binder of safety information, in several languages, appears. In the document window, click a document title to open the safety information in another language. Local HP support can also help direct you to your safety information.
PAGE 289
Index A Asynchronous Terminal Process 6100 (ATP6100) 6-3 ATM 3 ServerNet adapter (ATM3SA) 6-2 ATM3SA 6-2 ATP6100 6-3 B BACKCOPY utility B-2 BACKUP utility backing up configuration and operations files 11-26 description of B-2 Batteries charging 15-3 maintaining 15-3 monitoring 15-3 recharging drained 15-4 Battery ride-through 16-22 Binary number system D-2 Binary to decimal conversion D-3 Bus dumps See Dumps C Cartridge tape, handling and storing 17-3 Cleaning enclosures 17-2 CMI, replaced by SCF A-1 Coll
PAGE 290
Index E Dump Processor-n to Tape dialog box dumping a processor to tape (down system only) 11-17 screen capture 11-19 Dumps completed message 11-10, 11-11 dump file checking with FUP 11-11 compressing 11-20/11-25 submitting to service provider 11-24/11-27 processor to disk 11-10/11-11 processor to tape 11-17/11-20 E E4SA 6-2 EMS Analyzer (EMSA) B-2 EMS event messages, monitoring 4-1/4-3 EMSA B-2 EMSDIST description of B-2 using to monitor EMS event messages 4-1 EMSLOG file 11-26 Enclosures cleaning 17-2
PAGE 291
Index G Freeze (continued) recovery operations for a hardware error freeze 11-8 system freeze 11-15 FRU 2-2 FUP See File Utility Program (FUP) I G K G4SA 2-10 GESA 6-2 Gigabit Ethernet 4-port adapter (G4SA) 6-2 Gigabit Ethernet 4-port ServerNet adapter 2-10 Gigabit Ethernet ServerNet adapter 6-2 Group numbering 2-3, 2-14 Group, in a system 2-1 Guided procedures, OSM 1-13 Guided procedures, TSM 1-13 G-series -xv Kernel-Managed Swap Facility (KMSF) B-3 KMSF B-3 H Halting processors 11-7 See also Proce
PAGE 292
Index N Monitoring (continued) overview 3-1/3-33 printers 14-1 processes 5-1/5-6 processors 11-2/11-6 ServerNet fabrics 12-1/12-4 ServerNet/DA 7-1 tape drives 10-1/10-9 terminals 14-1 MSP 0 or 1 16-13 Multifunction I/O board (MFIOB) 6-2 N NonStop IPX/SPX 6-2 NonStop S7000 processor enclosure 2-7 NonStop S7400 processor enclosure 2-8 NonStop S7x00 -xv NonStop Sxx000 -xv NonStop Sxx000 processor enclosure 2-8 NonStop TCP/IP 6-2 NSKCOM B-3 Number conversion binary to decimal D-3 decimal to binary D-7 decima
PAGE 293
Index R Power-on push button, locating 2-13 Printers monitoring 14-1 recovery operations for 14-2 Problems, common disk drive 9-12, 9-13 tape drive 10-9 Processes generic 5-2 I/O 5-2 monitoring 5-2/5-6 recovery operations for 5-6 system 5-1 Processor halts halt code = %nn message 11-5 recovery operations for 11-8 Processor multifunction (PMF) CRU status LEDs 3-31 Processor Status dialog box 11-18 Processors dumps See Dumps freeze See Freeze halt See Processor halts halting processors 11-7 hang 11-5 loadin
PAGE 294
Index T Setting system time 15-5 Slot 2-1 SNAX/APN 6-3 SPOOLCOM B-4 Starting the system 16-6/16-16 Stopping the system 16-18/16-21 Storing cartridge tapes 17-3 Subsystem Control Facility (SCF) See SCF Subsystems displaying configuration of 2-31 Kernel 2-32 SLSA 2-33, 6-2 storage 2-32 TCP/IP 2-31 WAN 2-34, 6-2 Sxx000 -xv System organization 2-1 performance 16-2 powering off 16-21/16-22 powering on 16-3/16-5 recording configuration of 2-15 starting 16-6/16-16 stopping 16-18/16-21 System console, recovery op
PAGE 295
Index W W Windows Event Viewer 1-12 Special Characters $SYSTEM, recovery operations for 16-15 (G4SA) 6-2