NonStop S-Series Operations Guide (G06.24+)


HP NonStop S-Series

Operations Guide

Abstract

This guide describes how to perform routine system hardware operations for HP NonStop™ S-series servers. These tasks include monitoring the system, performing common operations tasks, and performing routine hardware maintenance. This guide is written for system operators.

Product Version

N.A.

Supported Release Version Updates (RVUs)

This guide supports G06.24 and all subsequent G-series RVUs until otherwise indicated by its replacement publication.

Part Number Published

522459-007 September 2004

Summary of content (308 pages)

PAGE 2
Document History

Part Number    Product Version    Published
522459-002     N.A.               February 2002
522459-003     N.A.               May 2002
522459-004     N.A.               August 2002
522459-005     N.A.               September 2003
522459-007     N.A.
PAGE 3
HP NonStop S-Series Operations Guide Index Examples What’s New in This Guide xiii Guide Information xiii New and Changed Information Figures xiii About This Guide xv Who Should Use This Guide xv What Is in This Guide xvi Where to Get More Information xvii Notation Conventions xviii 1.
PAGE 4
2. Determining Your System Configuration Contents Guided Procedures 1-13 2.
PAGE 5
4. Monitoring EMS Event Messages Contents 4. Monitoring EMS Event Messages When to Use This Section 4-1 What Is the Event Management Service (EMS)? Tools for Monitoring EMS Event Messages 4-2 EMSDIST 4-2 OSM Event Viewer 4-2 TSM Event Viewer 4-2 ViewPoint 4-3 Related Reading 4-3 4-2 5.
PAGE 6
7. ServerNet/DA: Monitoring and Recovery Contents 7. ServerNet/DA: Monitoring and Recovery When to Use This Section 7-1 Overview of the ServerNet/DA 7-1 Monitoring the ServerNet/DA 7-1 Identifying Problems With the ServerNet/DA 7-2 Recovery Operations for the ServerNet/DA 7-3 Related Reading 7-3 8.
PAGE 7
10. Tape Drives: Monitoring and Recovery Contents Related Reading 9-16 10.
PAGE 8
12. ServerNet Fabrics: Monitoring and Recovery Contents Copying a Dump File From Tape to Disk 11-26 Backing Up a Processor Dump to Tape 11-26 Replacing Processor Memory or a PMF CRU 11-27 Submitting Information to Your Service Provider 11-27 Related Reading 11-30 12.
PAGE 9
Contents 15. Power Failures: Preparation and Recovery 15.
PAGE 10
17. Preventive Maintenance Contents Powering Off the System 16-22 System Power-Off Procedure Using SCF 16-22 System Power-Off Procedures Using OSM or TSM 16-22 Emergency Power-Off Procedure 16-23 Recovery Operations for Stopping or Powering Off the System 16-24 Reducing Shutdown Time 16-24 Write Efficient Startup and Shutdown Command Files 16-24 Use Parallel Processing 16-25 Investigate Product-Specific Techniques 16-26 Related Reading 16-27 17.
PAGE 11
C. Related Reading Contents OSM Package B-4 PATHCOM B-4 PEEK B-4 RESTORE B-4 SPOOLCOM B-4 Subsystem Control Facility (SCF) B-5 HP Tandem Advanced Command Language (TACL) TMFCOM B-5 TSM Package B-5 TSM Event Viewer B-6 TSM Low-Level Link B-6 TSM Notification Director B-6 TSM Service Application B-6 ViewPoint B-7 ViewSys B-7 Windows Event Viewer B-8 B-5 C. Related Reading D.
PAGE 12
Figures Contents Figures Figure 1-1. Figure 2-1. Figure 2-2. Figure 2-3. Figure 2-4. Figure 2-5. Figure 2-6. Figure 2-7. Figure 2-8. Figure 2-9. Figure 2-10. Figure 2-11. Figure 2-12. Figure 2-13. Figure 2-14. Figure 3-1. Figure 3-2. Figure 3-3. Figure 3-4. Figure 3-5. Figure 9-1. Figure 10-1. Figure 10-2. Figure 10-3. Figure 11-1. Figure 11-2. Figure 11-3. Figure 16-1. Figure 16-2. Figure 16-3. Figure D-1. Figure D-2. Figure D-3.
PAGE 13
Tables Contents Tables Table 1-1. Table 2-1. Table 2-2. Table 2-3. Table 2-4. Table 2-5. Table 2-6. Table 2-7. Table 2-8. Table 2-9. Table 2-10. Table 2-11. Table 2-12. Table 3-1. Table 3-2. Table 3-3. Table 3-4. Table 3-5. Table 3-6. Table 4-1. Table 6-1. Table 7-1. Table 9-1. Table 9-2. Table 10-1. Table 10-2. Table 11-1. Table 11-2. Table 11-3. Table 11-4. Table 13-1. Table 15-1. Table 16-1. Table C-1.
PAGE 14
Contents Table D-1.
PAGE 15
What’s New in This Guide: Guide Information
PAGE 16
New and Changed Information

The main technical changes to this manual are:

• A new section, Section 8, Fibre Channel ServerNet Adapter: Monitoring and Recovery, has been inserted following Section 7, thus incrementing section numbers that follow these two sections. This new section provides operator information for the FCSA.
PAGE 17
About This Guide

This guide describes how to perform routine system hardware operations for NonStop S-series servers running G06.24 and subsequent G-series release version updates (RVUs).

Note. S-series refers to the hardware that makes up the server. G-series refers to the software that runs on the server.

The term NonStop Sxx000 represents the NonStop S70000, NonStop S72000, NonStop S74000, NonStop S76000, and NonStop S86000 servers.
PAGE 18
What Is in This Guide

Section or Appendix    Section and Appendix Titles
Section 1              Introduction to NonStop S-Series Operations
Section 2              Determining Your System Configuration
Section 3              Overview of Monitoring and Recovery
Section 4              Monitoring EMS Event Messages
Section 5              Processes: Monitoring and Recovery
Section 6              Communications Subsystems: Monitoring and Recovery
Section 7              ServerNet/DA: Monitoring and Recovery
Section 8              Fibre Channel ServerNet Adapter:
PAGE 19
Where to Get More Information

Operations planning and operations management practices appear in these manuals:

• Introduction to NonStop Operations Management
• Availability Guide for Application Design
• Availability Guide for Change Management
• Availability Guide for Problem Management

For comprehensive information about performing operations tasks for a NonStop S-series server, you need both this guide and the Guardian User’s Guide.
PAGE 20
Notation Conventions About This Guide OSM is the required system management tool for servers that use 6780 switches in ServerNet clusters, but OSM also provides system management for earlier versions of ServerNet clusters. For other documentation related to operations tasks, refer to Appendix C, Related Reading. Notation Conventions Hypertext Links Blue underline is used to indicate a hypertext link within text. By clicking a passage of text with a blue underline, you are taken to the location described.
PAGE 21
General Syntax Notation

A group of items enclosed in brackets is a list from which you can choose one item or none. The items in the list may be arranged either vertically, with aligned brackets on each side of the list, or horizontally, enclosed in a pair of brackets and separated by vertical lines. For example:

   FC [ num  ]
      [ -num ]
      [ text ]

   K [ X | D ] address

{ } Braces. A group of items enclosed in braces is a list from which you are required to choose one item.
PAGE 22
If there is no space between two items, spaces are not permitted. In the following example, there are no spaces permitted between the period and any other items:

   $process-name.#su-name

Line Spacing. If the syntax of a command is too long to fit on a single line, each continuation line is indented three spaces and is separated from the preceding line by a blank line. This spacing distinguishes items in a continuation line from items in a vertical list of selections.
PAGE 23
Notation for Messages About This Guide { } Braces. A group of items enclosed in braces is a list of all possible items that can be displayed, of which one is actually displayed. The items in the list might be arranged either vertically, with aligned braces on each side of the list, or horizontally, enclosed in a pair of braces and separated by vertical lines.
PAGE 24
Change Bar Notation

Change bars are used to indicate substantive differences between this edition of the manual and the preceding edition. Change bars are vertical rules placed in the right margin of changed portions of text, figures, tables, examples, and so on. Change bars highlight new or revised information. For example:

The message types specified in the REPORT clause are different in the COBOL85 environment and the Common Run-Time Environment (CRE).
PAGE 25
1 Introduction to NonStop S-Series Operations

When to Use This Section 1-1
Understanding the Operational Environment 1-1
What Are the Operator Tasks? 1-2
Monitoring the System and Performing Recovery Operations 1-2
Preparation and Recovery for Power Failures 1-3
Stopping and Powering Off the System 1-3
Powering On and Starting the System 1-3
Performing Preventive Maintenance 1-3
Operating Tape Drives 1-3
Responding to Spooler Problems 1-3
Determining the Cause of a Problem: A Systematic Approach 1-4
A Probl
PAGE 26
• For a brief introduction to the system organization and the location of system components in a NonStop S-series server, see Section 2, Determining Your System Configuration.
• For information about various software tools and utilities you can use to perform system operations on a NonStop S-series server, see Appendix B, Tools and Utilities for Operations.
PAGE 27
Introduction to NonStop S-Series Operations Preparation and Recovery for Power Failures Recovery operations for a system console are not discussed in this guide. For recovery procedures for a system console and the applications installed on the system console, see the NonStop S-Series Hardware Installation and FastPath Guide.
PAGE 28
Determining the Cause of a Problem: A Systematic Approach

Continuous availability of your NonStop system is important to system users, and your problem-solving processes can help make such availability a reality. To determine the cause of a problem on your system, start by trying the easiest, least expensive possibilities. Move to more complex, expensive possibilities only if the easier solutions fail.
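The easiest-first strategy described above can be sketched as a small routine. This is only an illustration: the Check structure, its cost field, and the sample checks are hypothetical and do not come from this guide.

```python
from typing import Callable, List, NamedTuple, Optional


class Check(NamedTuple):
    """One possible cause to verify (hypothetical structure, not from the guide)."""
    name: str
    cost: int                      # relative effort or expense; lower is cheaper
    confirms: Callable[[], bool]   # returns True if this check explains the problem


def find_cause(checks: List[Check]) -> Optional[str]:
    """Try the cheapest checks first; move to costlier ones only if they fail."""
    for check in sorted(checks, key=lambda c: c.cost):
        if check.confirms():
            return check.name
    return None


# Illustrative data only; real checks depend on the problem being investigated.
checks = [
    Check("replace hardware", cost=9, confirms=lambda: True),
    Check("verify cabling", cost=1, confirms=lambda: False),
    Check("review security settings", cost=2, confirms=lambda: True),
]
print(find_cause(checks))
```

The expensive check is never run here because a cheaper one already explains the problem, which is the point of ordering by cost.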
PAGE 29
A Problem-Solving Worksheet Introduction to NonStop S-Series Operations Table 1-1.
PAGE 30
Task 1: Get the Facts

The first step in solving any problem is to get the facts. Although it is tempting to speculate about causes, your time is better spent in first understanding the symptoms of the problem.

Task 1a: Determine the Facts About the Problem

To get a clear, complete description of problem symptoms, ask questions to determine the facts about the problem.
PAGE 31
Introduction to NonStop S-Series Operations Task 2: Find and Eliminate the Cause of the Problem Task 2: Find and Eliminate the Cause of the Problem After you collect the facts, you are ready to begin considering the possible causes of a problem. Using these facts and relying on your knowledge and experience, begin to list possible causes of the problem. Task 2a: Identify the Most Likely Cause To evaluate the possible causes of any problem, you must compare each cause with the problem symptoms.
PAGE 32
Introduction to NonStop S-Series Operations Task 3: Escalate the Problem If Necessary Task 2b: Fix the Most Probable Cause of the Problem For the example in the worksheet, the most likely cause of the hung terminal is a security problem. Ask yourself what would be the fastest, least expensive, safest, and surest way of verifying that this is the most probable cause of the problem. Once you have determined the most likely cause, try to fix it. Follow through and implement the appropriate solution.
PAGE 33
Task 4: Prevent Future Problems

Solving problems that occur with your system can be exciting because it is active and stimulating. Preventing problems is often less dramatic. But in the end, prevention is more productive than solving problems. The more work you do to prevent problems before they arise, the fewer problems will arise at potentially critical times.
PAGE 34
Introduction to NonStop S-Series Operations Logging On to a NonStop S-Series Server Logging On to a NonStop S-Series Server Many operations and troubleshooting tasks are performed using the TACL command interpreter or one of the OSM or TSM applications. For example, the TACL command interpreter allows you to access SCF, which you use to configure, control, and collect information about objects within subsystems.
PAGE 35
Opening a TACL Window Directly From OutsideView

If you know the IP address of the NonStop server (not those of OSM or TSM), use this method:

1. Select Start>Programs>OutsideView32 7.1.
2. From the Session menu, select New. The New Session Properties dialog box appears.
3. From the New Session Properties dialog box, Session tab, click IO Properties. The TCP/IP Properties dialog box appears.
4.
PAGE 36
Introduction to NonStop S-Series Operations Troubleshooting OSM and TSM Sessions select the system of your choice from the list of bookmarks displayed in the left column of the page (available bookmarks include those that were user-created during previous sessions and those converted automatically from an existing TSM system list). If no bookmarks are available, the web page also contains instructions on how to access these applications by entering a system URL as an Internet Explorer address.
PAGE 37
Figure 1-1. Completed SP Responsive Test

Checking Windows Event Log for Error Messages

If error messages occur during a TSM Service Application or TSM Low-Level Link session, check the Windows Event Viewer. The Windows Event Viewer logs events from the TSM client software and the Windows environment. There are three different ways that you can access the Windows Event Viewer.
PAGE 39
2 Determining Your System Configuration

When to Use This Section 2-1
System Organization 2-1
Terms Used to Describe System Hardware Components 2-2
Identifying System Enclosures in a NonStop S-Series Server 2-3
Locating System Components in an Enclosure 2-4
Recording Your System Configuration 2-15
Maintaining Hard-Copy Forms 2-15
Using OSM or TSM to Inventory Your System 2-24
Using SCF to Determine Your System Configuration 2-26
Displaying Configuration Information—Examples 2-37

When to Use This Section

Thi
PAGE 40
Terms Used to Describe System Hardware Components

The terms used to describe system-hardware components vary. These terms include:

• Device
• Resource
• Customer-replaceable unit (CRU)
• Field-replaceable unit (FRU)

Device

A device can be a physical device or a logical device.
PAGE 41
Identifying System Enclosures in a NonStop S-Series Server

The three types of system enclosures are:

• Processor enclosures contain processors and other system components.
• I/O enclosures are similar to processor enclosures but do not contain processors.
• IOAM enclosures contain I/O adapter modules.

For the specific types of system enclosures and the locations of system components, see Table 2-1.

Table 2-1.
PAGE 42
Locating System Components in an Enclosure

System components within an enclosure are identified by their physical location. To identify the location of a system component within an enclosure, you need to know:

Group number    The group number identifies the enclosure in which a system component is located. The group number of an enclosure is indicated by the group ID label on the enclosure.
PAGE 43
Figure 2-1. Identification Numbers and Labels (example shown: Group 02, Module 01, Slot 03)
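A location written in the style of the Figure 2-1 example, "Group 02, Module 01, Slot 03", can be handled as a simple three-part record. The sketch below is illustrative only: the Location type and both helper functions are hypothetical names, and the label format is assumed from that single example.

```python
import re
from typing import NamedTuple


class Location(NamedTuple):
    """Physical location of a component (hypothetical helper type)."""
    group: int
    module: int
    slot: int


# Assumed label format, taken from the Figure 2-1 example text.
_LABEL = re.compile(r"Group\s+(\d+),\s*Module\s+(\d+),\s*Slot\s+(\d+)", re.IGNORECASE)


def parse_location(label: str) -> Location:
    """Parse a 'Group gg, Module mm, Slot ss' label into numeric fields."""
    match = _LABEL.fullmatch(label.strip())
    if match is None:
        raise ValueError(f"unrecognized location label: {label!r}")
    return Location(*(int(part) for part in match.groups()))


def format_location(loc: Location) -> str:
    """Render a location back into the two-digit label style."""
    return f"Group {loc.group:02d}, Module {loc.module:02d}, Slot {loc.slot:02d}"


loc = parse_location("Group 02, Module 01, Slot 03")
print(loc, format_location(loc))
```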
PAGE 44
Determining Your System Configuration Locating System Components in an Enclosure Table 2-2 lists slot numbers for each system component in a processor enclosure. Table 2-2.
PAGE 45
Figure 2-2 shows a diagram of a NonStop S7000 processor enclosure.

Figure 2-2. NonStop S7000 Processor Enclosure Organization (appearance side with door open, and service side, labeled with group, module, and slot numbers)
PAGE 46
Locating System Components in an Enclosure Determining Your System Configuration Figure 2-3 shows a diagram of a NonStop Sxx000 or S7x00 processor enclosure. Some PMF CRUs look slightly different from those shown in the figure. Figure 2-3.
PAGE 47
Determining Your System Configuration Locating System Components in an Enclosure Table 2-3 lists the slot numbers for each system component in an I/O enclosure. Table 2-3.
PAGE 48
Locating System Components in an Enclosure Determining Your System Configuration Figure 2-4 shows an example of a NonStop S-series I/O enclosure. IOMF 2 CRUs look slightly different from the IOMF CRUs shown installed in slots 50 and 55 in the figure. Also, if IOMF 2 CRUs are present, power supplies are installed at the bottom of the enclosure in slots 31 and 32, below the fans. See Figure 2-3 on page 2-8. Figure 2-4.
PAGE 49
Locating System Components in an Enclosure Determining Your System Configuration Figure 2-5.
PAGE 50
Identifying the Location of a Processor

This table identifies the physical location of each processor:

Processor    Group Number    Module Number    Slot Number
0            01              01               50
1            01              01               55
2            02              01               50
3            02              01               55
4            03              01               50
5            03              01               55
6            04              01               50
7            04              01               55
8            05              01               50
9            05              01               55
10           06              01               50
11           06              01               55
12           07              01               50
13           07              01               55
14           08              01               50
15           08              01               55
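The processor table follows a regular pattern: each group holds two processors in module 01, with the even-numbered processor in slot 50 and the odd-numbered processor in slot 55. A minimal sketch of that pattern, assuming it holds for processors 0 through 15 as the table suggests (the function and type names are hypothetical):

```python
from typing import NamedTuple


class Location(NamedTuple):
    """Group, module, and slot of a processor (hypothetical helper type)."""
    group: int
    module: int
    slot: int


def processor_location(processor: int) -> Location:
    """Return the location of a processor number, per the pattern in the table."""
    if not 0 <= processor <= 15:
        raise ValueError("processor number must be in the range 0 through 15")
    group = processor // 2 + 1                 # processors 0-1 -> group 01, 2-3 -> group 02, ...
    slot = 50 if processor % 2 == 0 else 55    # even -> slot 50, odd -> slot 55
    return Location(group, 1, slot)


for n in (0, 1, 6, 15):
    print(n, processor_location(n))
```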
PAGE 51
Locating the Power-On Push Button

Figure 2-6 illustrates where to find the power-on push button on some models of a PMF CRU.

Figure 2-6. Locating the Power-On Push Button on a PMF CRU (processor enclosure, service side; callouts: amber service LED, green power-on LED, power-on push button, even-numbered processor, odd-numbered processor, group ID label on cable support)
PAGE 52
Locating System Components in an Enclosure Determining Your System Configuration Locating the Group ID Switches Group identification for a system enclosure is set with two group ID switches, located on the inside of the enclosure, on the appearance side near the fans. See Figure 2-7. Both group ID switches in an enclosure must display the same value. The service processors (SPs) read the switches when the enclosure is powered on and monitor them for changes. Figure 2-7.
PAGE 53
Determining Your System Configuration Recording Your System Configuration Recording Your System Configuration As a system operator, you need to understand how your system is configured so you can confirm that the hardware and system software are operating normally. If problems do occur, knowing your configuration allows you to pinpoint problems more easily. If your system configuration is corrupted, documentation about your configuration is essential for recovery.
PAGE 54
Determining Your System Configuration Maintaining Hard-Copy Forms Table 2-4.
PAGE 55
Sample Forms for Recording Your System Configuration

Examples of some of the forms available for recording your system configuration are listed next. You are authorized by HP to reproduce these forms only for use in documenting a NonStop S-series system:

• Figure 2-8 on page 2-18 is a blank form for documenting a PMF CRU configuration.
• Figure 2-9 on page 2-19 is a blank form for documenting a PMF 2 CRU configuration.
PAGE 56
Figure 2-8. PMF CRU Configuration Form (blank form; shaded areas indicate nonconfigurable components. Fields include date, system name, group, module, slot, product number, SCF name, SCSI cable, IP address, adapter name, SAC name, SAC access list, PIF name, and LIF name.)
PAGE 57
Maintaining Hard-Copy Forms Determining Your System Configuration Figure 2-9.
PAGE 58
Maintaining Hard-Copy Forms Determining Your System Configuration Figure 2-10.
PAGE 59
Maintaining Hard-Copy Forms Determining Your System Configuration Figure 2-11.
PAGE 60
Maintaining Hard-Copy Forms Determining Your System Configuration Figure 2-12.
PAGE 61
Maintaining Hard-Copy Forms Determining Your System Configuration Figure 2-13.
PAGE 62
Determining Your System Configuration Using OSM or TSM to Inventory Your System Using OSM or TSM to Inventory Your System Both the OSM Service Connection and the TSM Service Application provide you with hierarchical, physical (graphical representation), and inventory views of your system and cluster resources.
PAGE 63
Using OSM or TSM to Inventory Your System Determining Your System Configuration Table 2-5. Naming Conventions for SCF Objects Object Convention Example Description WANBoot processes $ZWB $ZWBA9 WANBoot process associated with TCP/IP SWAN concentrators S S19 Twentieth SWAN concentrator SS7 Telco $C $C Telco process associated with SS7 protocol where is a 2-digit number that identifies the enclosure.
PAGE 64
is the slot number and port number mapped as:

Slot Number    Port Number    Value
51             0              0
51             1              1
51             2              2
51             3              3
52             0              4
52             1              5
52             2              6
52             3              7
53             0              8
53             1              9
53             2              A
53             3              B
54             0              C
54             1              D
54             2              E
54             3              F

is a 2-digit number in the range 00 through 99.
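The slot and port mapping above is a base-16 encoding: slots 51 through 54 each contribute four consecutive values, one per port. A small sketch of that arithmetic, where the function name is hypothetical and the valid ranges are assumed from the mapping as listed:

```python
def slot_port_digit(slot: int, port: int) -> str:
    """Return the single hexadecimal digit for a slot and port, per the mapping above."""
    if slot not in (51, 52, 53, 54):
        raise ValueError("slot must be 51, 52, 53, or 54")
    if port not in (0, 1, 2, 3):
        raise ValueError("port must be 0 through 3")
    # Each slot covers four values: slot 51 -> 0-3, 52 -> 4-7, 53 -> 8-B, 54 -> C-F.
    return format((slot - 51) * 4 + port, "X")


for slot, port in ((51, 0), (52, 1), (53, 2), (54, 3)):
    print(slot, port, slot_port_digit(slot, port))
```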
PAGE 65
SCF Configuration Files

Your system is delivered with a standard set of configuration files:

• The $SYSTEM.SYSnn.CONFBASE file contains the minimal configuration required to load the system.
• The $SYSTEM.ZSYSCONF.CONFIG file contains a standard system configuration created by HP.
PAGE 66
if you type this command and then enter the INFO command without specifying an object, SCF displays only the information for the workstation called $L1.#TERM1:

   > SCF ASSUME WS $L1.
PAGE 67
Using SCF to Determine Your System Configuration Determining Your System Configuration Figure 2-14.
PAGE 68
The columns in Figure 2-14 mean:

LDev       The logical device number
Name       The logical device name
PPID       The primary processor number and process identification number (PIN) of the specified device
BPID       The backup processor number and PIN of the specified device
Type       The device type and subtype
RSize      The record size the device is configured for
Pri        The priority level of the I/O process
Program    The fully qualifie
PAGE 69
To display information about a particular device:

   > SCF LISTDEV TYPE n

where n is a number for the device type. For example, if n is 3, the device type is disks and tapes. For the \MS9 system, entering LISTDEV TYPE 3 would display information for $DATA10, $DATA04, $DATA02, and $DATA01.
PAGE 70
Kernel Subsystem

Before using commands listed in Table 2-8, type this command to make the Kernel subsystem the default object:

   > SCF ASSUME PROCESS $ZZKRN

Generic processes are part of the SCF Kernel subsystem. Generic processes can be created by the operating system or by a user.
PAGE 71
Determining Your System Configuration Using SCF to Determine Your System Configuration When displaying configuration files for disk and tape devices in the storage subsystem, you can use the OBEYFORM option with the INFO command to display currently defined attribute values in the format that you would use to set up a configuration file.
PAGE 72
Determining Your System Configuration Using SCF to Determine Your System Configuration configuration file. Each attribute appears as a syntactically correct system configuration command. For example: ADD ADAPTER $ZZLAN.
PAGE 73
Additional Subsystems Controlled by SCF

Table 2-12 lists the names associated with additional subsystems that can be controlled by SCF, along with their device types. You can use SCF commands to display the current attribute values for these objects. Some SCF commands are available only to some subsystems. The objects that each command affects and the attributes of those objects are subsystem specific.
PAGE 74
Determining Your System Configuration Using SCF to Determine Your System Configuration Table 2-12.
PAGE 75
Displaying Configuration Information—Examples

These examples show SCF commands that display subsystem configuration information, along with the information that is returned. These commands are not preceded by an ASSUME command.

To display all the processes running in the Kernel subsystem:

   -> INFO PROC $ZZKRN.#*

The system displays a listing similar to:

   -> INFO PROC $ZZKRN.#*
   NONSTOP KERNEL - Info PROCESS \COMM.
PAGE 76
To display detailed information about the ATM subsystem manager:

   -> INFO PROCESS $ZZKRN.#ZZATM, DETAIL

The system displays a listing similar to:

   -> INFO PROC #ZZATM, DETAIL
   NONSTOP KERNEL - Detailed Info PROCESS \COCO.$ZZKRN.#ZZATM
   *AutoRestart...............10
   *BackupCPU.................1
   *CPU.......................Not Specified
   *DefaultVolume.............$SYSTEM.SYSTEM
   *ExtSwap...................Not Specified
   *Highpin...
PAGE 77
To display detailed information about an ATM 3 ServerNet adapter:

   -> INFO ADAPTER $ZZATM.$adapter-name, DETAIL

where $adapter-name is the logical process name for the adapter. The system displays a listing similar to this example for the adapter $AM1:

   -> info adapter $zzatm.$am1, detail
   ATM Detailed Info ADAPTER \TAHITI.$AM1
   LOCATION (grp,mod,slot).. 11 ,1 ,53
   ACCESSLIST............... 0, 1, 2, 3
   AMP Filename (in use)....
PAGE 78
To display a list of all SAC names with their associated owners and access lists:

   -> info sac $zzlan.*

The system displays a listing similar to:

   -> INFO SAC $ZZLAN.*
   SLSA Info SAC
   Name
   $ZZLAN.E0353.0
   $ZZLAN.E0353.1
   $ZZLAN.E0354.0
   $ZZLAN.E0354.1
   $ZZLAN.E0553.0
   $ZZLAN.E0553.1
   $ZZLAN.E0554.0
   $ZZLAN.E0554.1
   $ZZLAN.FE1154.0
   $ZZLAN.MIOE0.0
   $ZZLAN.MIOE1.
PAGE 79
To display configuration attribute values for all the WAN subsystem configuration managers, TCP/IP processes, and WANBoot processes:

   -> INFO PROCESS $ZZWAN.*

The system displays a listing similar to:

   -> INFO PROCESS $ZZWAN.*
   WAN MANAGER Detailed Info Process \COMM.$ZZWAN.#5
   RecSize........... 0
   *Type............. (50,00)
   Preferred Cpu..... 5
   Alternate Cpu..... 65535
   *IOPOBJECT........ \COMM.$SYSTEM.SYS00.
PAGE 80
To display detailed information about an Expand line-handler process:

   -> INFO LINE $line-name, DETAIL

where $line-name is the logical line-handler process name. The system displays a listing similar to this example for the line $ATMBAT:

   -> info line $atmbat, detail
   EXPAND Detailed Info LINE $ATMBAT (LDEV 219)
   L2Protocol Net^Atm TimeFactor... 570K *SpeedK.. NOT_SET
   Framesize....... 132 -Rsize........... 3 -Speed.......
PAGE 81
3 Overview of Monitoring and Recovery

When to Use This Section 3-1
Importance of Monitoring 3-2
Monitoring Tasks 3-2
Working With a Daily Checklist 3-2
Tools for Checking the Status of System Hardware 3-3
Additional Monitoring Tasks 3-7
Monitoring and Resolving Problems—An Approach 3-8
Using OSM or TSM to Monitor the System 3-8
Using the OSM Service Connection or TSM Service Application
Checking for Problems and Alarms 3-10
Recovery Operations for Problems Detected by TSM 3-12
Using SCF to Monitor the Syste
PAGE 82
Overview of Monitoring and Recovery Importance of Monitoring Importance of Monitoring You must monitor a system to ensure that it is operating properly and to recognize when corrective action is required.
PAGE 83
An example of a checklist you might use to standardize your routine daily monitoring tasks is:

Task                         Operator’s name    Date & time    Notes and questions
Check phone messages
Check faxes
Check e-mail
Check shift log
Check EMS event messages
Check status of terminals
Check comm.
PAGE 84
Tools for Checking the Status of System Hardware Overview of Monitoring and Recovery Table 3-1. Monitoring System Components (page 1 of 3) Monitored Using These Tools Resource Adapters for communications subsystems: ATM3SA CCSA OSM Service Connection or TSM Service Application SCF interface to various subsystems E4SA FESA See...
PAGE 85
Tools for Checking the Status of System Hardware Overview of Monitoring and Recovery Table 3-1. Monitoring System Components (page 2 of 3) Monitored Using These Tools Resource Disk drives, external, attached to ServerNet/DA or FCSA OSM Service Connection or TSM Service Application SCF interface to the storage subsystem DSAP See...
PAGE 86
Tools for Checking the Status of System Hardware Overview of Monitoring and Recovery Table 3-1. Monitoring System Components (page 3 of 3) Monitored Using These Tools Resource See...
PAGE 87
Additional Monitoring Tasks Overview of Monitoring and Recovery Additional Monitoring Tasks Table 3-2 provides an example of additional areas you should monitor daily. Table 3-2.
PAGE 88
Overview of Monitoring and Recovery Monitoring and Resolving Problems—An Approach Monitoring and Resolving Problems—An Approach A useful approach to identifying and resolving problems in your system is to first use OSM or TSM to locate the focal point of a hardware problem and then use SCF to gather all the related data from the subsystems that control or act on the hardware.
PAGE 89
Using the OSM Service Connection or TSM Service Application Overview of Monitoring and Recovery Overview Pane Displays a high-level view of system objects, such as internal fabrics, groups, and external devices (external disks and tapes), and of ServerNet Cluster objects, such as external fabrics, local nodes, and remote nodes.
PAGE 90
Checking for Problems and Alarms Overview of Monitoring and Recovery • The Alarms tab lists the alarms for the resource selected in the tree pane. Figure 3-2. Attributes Tab of Management Window VST716.vsd Checking for Problems and Alarms For most system components, you can use the OSM Service Connection or the TSM Service Application to quickly identify problems.
PAGE 91
Overview of Monitoring and Recovery Checking for Problems and Alarms In the details pane, if the Service State value is Service Required and shows a red triangle, the resource is not functioning. If the Service State value is Attention Required and shows a yellow triangle, the resource is not functioning normally. Table 3-3.
PAGE 92
Overview of Monitoring and Recovery Recovery Operations for Problems Detected by OSM or TSM

Monitoring Alarms

1. Log on to the OSM Service Connection or the TSM Service Application.
2. From the tree pane, locate and select the resource.
3. From the details pane, click the Alarms tab.
4. Double-click a specific alarm to display the Alarm Detail dialog box.

To get a summary of all outstanding alarms on the system:

1. In the OSM Management window, select Summary>Alarm.
PAGE 93
Using SCF to Monitor the System Overview of Monitoring and Recovery components called customer-replaceable units (CRUs). For more information, contact your service provider, or refer to the CSSI Web. Using SCF to Monitor the System Use the Subsystem Control Facility (SCF) to display information and current status for all the devices on your system known to SCF. Some SCF commands are available only to some subsystems.
PAGE 94
Determining Device States Overview of Monitoring and Recovery

Some other examples of the SCF STATUS command are:

-> STATUS LINE $LAM3
-> STATUS WS $LAM3.#WS1
-> STATUS WS $LAM3.*
-> STATUS WINDOW $LAM3.#WS1.*
-> STATUS WINDOW $LAM3.*, SEL STOPPED

The general format of the STATUS display follows. However, the format varies depending on the subsystem.
PAGE 95
Determining Device States Overview of Monitoring and Recovery SCF Object States Table 3-4 lists and explains the possible object states that the SCF STATUS command can report. Table 3-4. SCF Object States (page 1 of 2) State Substate Explanation ABORTING The object is being aborted. The object is responding to an ABORT command or some type of malfunction. In this state, no new links are allowed, and drastic measures might be underway to reach the STOPPED state. This state is irrevocable.
PAGE 96
Determining Device States Overview of Monitoring and Recovery Table 3-4. SCF Object States (page 2 of 2) State Substate Explanation STOPPING The object is in transition to the STOPPED state. No new links are allowed to or from the object. Existing links are in the process of being deleted. SUSPENDED The flow of information to and from the object is restricted. (It is typically prevented.
PAGE 97
Overview of Monitoring and Recovery Monitoring and Recovery—Example Monitoring and Recovery—Example This subsection describes a hypothetical situation in which you use the system tools available—event log, TSM applications, and SCF—to identify, analyze, and solve a hardware problem. Note. The cabling and topology diagrams in this subsection identify all ServerNet expansion boards as SEBs. Your system might instead have modular ServerNet expansion boards (MSEBs) in the slots designated for SEBs.
PAGE 98
A Problem Occurs Overview of Monitoring and Recovery Figure 3-3.
PAGE 99
A Problem Occurs Overview of Monitoring and Recovery Figure 3-4.
PAGE 100
Overview of Monitoring and Recovery Using OSM or TSM to Locate the Problem

• Four “port error” messages from group 03: three from the PMF CRU in slot 55, and one from the SEB in slot 52
• Two “domain deletion” error messages from processor group 01, both from the SEB in slot 52
• A “Path change on device $ZZLAN.E3153.0.
PAGE 101
Using SCF to Locate the Problem Overview of Monitoring and Recovery You decide to do some more research, this time using SCF.
PAGE 102
Overview of Monitoring and Recovery Using SCF to Locate the Problem The boldface output in the display shows that the Y-fabric connection between the processors in group 01 (processors 0 and 1) and the processors in group 03 (processors 4 and 5) is down. The X fabric is functioning normally. Because the UP values in the display show that all the processors are able to communicate with other processors in the system, you conclude that all the PMF CRUs are functioning normally.
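The reasoning above (one fabric down between two processor groups, yet every processor still reachable) can be sketched in a few lines of Python. This is only an illustration: the dictionary layout, the function names, and the idea of transcribing the display into a script are assumptions, not part of SCF or of this guide.

```python
# Hypothetical data layout: fabric name -> {(from_cpu, to_cpu): "UP" or "DOWN"},
# transcribed by hand (or by a site-specific script) from the status display.

def fabric_failures(status):
    """Return {fabric: [(from_cpu, to_cpu), ...]} for every path that is not UP."""
    down = {}
    for fabric, paths in status.items():
        pairs = sorted(pair for pair, state in paths.items() if state != "UP")
        if pairs:
            down[fabric] = pairs
    return down


def isolated_processors(status, processors):
    """Return processors that cannot reach some other processor on any fabric."""
    isolated = []
    for cpu in processors:
        for other in processors:
            if cpu == other:
                continue
            # A pair is reachable if at least one fabric reports the path UP.
            if not any(paths.get((cpu, other)) == "UP" for paths in status.values()):
                isolated.append(cpu)
                break
    return isolated
```

With the scenario described here (Y fabric down between processors 0 and 1 and processors 4 and 5, X fabric fully up), `fabric_failures` would report only Y-fabric pairs and `isolated_processors` would return an empty list, which matches the conclusion that the processors themselves are still communicating.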
PAGE 103
Using SCF to Locate the Problem Overview of Monitoring and Recovery

This partial listing shows some of the disk configuration information:

STORAGE - Configuration Information Magnetic DISK \GATE8.$D3101
Common Disk Configuration Information:
Primary Path Information:
Adapter............................... $ZZSTO.#IOMF.GRP-31.MOD-1.SLOT-50
Disk Device ID........................ 4
Location (Group,Module,Slot).......... (31,1,1)
SAC Name..........................$ZZSTO.#IOMF.SAC-2.GRP-31.MOD-1.
PAGE 104
Overview of Monitoring and Recovery Using SCF to Locate the Problem The primary path is currently active. Because the primary path is configured for the X fabric, you conclude that the disk first tried to use the Y fabric and failed before switching to the X fabric. Now you want to find out why the path switch was not transparent.
PAGE 105
Calling the Service Provider Overview of Monitoring and Recovery Figure 3-5 shows the affected communications path. Figure 3-5.
PAGE 106
Overview of Monitoring and Recovery Automating Routine System Monitoring Automating Routine System Monitoring You can automate many of the monitoring procedures. Automation saves you time and helps you to perform many routine tasks more efficiently. Your operations environment might be using TACL macros, TACL routines, or command files to perform routine system monitoring and other tasks.
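As one possible complement to TACL macros and command files, a captured monitoring output file could be scanned on a workstation for states that deserve attention. The following Python sketch assumes the SCF output has already been saved as plain text; the function name and the choice of which states to flag are illustrative assumptions, and the state names themselves are taken from this guide.

```python
import re

# State names that appear in this guide; any line mentioning one of them is
# flagged for review. STARTED and STARTING are deliberately not flagged.
PROBLEM_STATES = {"STOPPED", "STOPPING", "ABORTING", "SUSPENDED", "DIAGNOSING"}


def find_problem_lines(report_text):
    """Return (line_number, line) pairs whose text mentions a flagged state."""
    hits = []
    for number, line in enumerate(report_text.splitlines(), start=1):
        # Some states in the sample output carry a leading '*' (for example
        # *STARTED); strip it so the bare state name can be compared.
        words = {w.lstrip("*") for w in re.findall(r"\*?[A-Z]+", line)}
        if words & PROBLEM_STATES:
            hits.append((number, line.strip()))
    return hits
```

A script like this only narrows down where to look; the flagged lines still need to be interpreted with the SCF STATUS output for the subsystem concerned.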
PAGE 107
Automating Routine System Monitoring Overview of Monitoring and Recovery

Example 3-2. System Monitoring Output File (page 1 of 3)

COMMENT THIS IS THE FILE SYSCHK
COMMENT THIS CHECKS ALL DISKS:
SCF STATUS DISK $*
STORAGE - Status DISK \SHARK.$DATA12
LDev   Primary    Backup    Mirror
52     *STARTED   STARTED   *STARTED
STORAGE - Status DISK \SHARK.$DATA01
LDev   Primary    Backup    Mirror
63     *STARTED   STARTED   *STARTED
STORAGE - Status DISK \SHARK.
PAGE 108
Automating Routine System Monitoring Overview of Monitoring and Recovery

Example 3-2. System Monitoring Output File (page 2 of 3)

COMMENT THIS CHECKS ALL SACS:
SCF STATUS SAC $*
SLSA Status SAC
Name             Owner   State
$ZZLAN.E4SA1.0   1       STARTED
$ZZLAN.E4SA1.1   0       STARTED
$ZZLAN.E4SA1.2   0       STARTED
$ZZLAN.E4SA1.3   1       STARTED
COMMENT THIS CHECKS ALL ADAPTERS
SCF STATUS ADAPTER $*
SLSA Status ADAPTER
Name
$ZZLAN.MIOE0
$ZZLAN.E4SA0
$ZZLAN.MIOE1
$ZZLAN.
PAGE 109
Automating Routine System Monitoring Overview of Monitoring and Recovery Example 3-2.
PAGE 110
Using the Status LEDs to Monitor the System Overview of Monitoring and Recovery Using the Status LEDs to Monitor the System Status LEDs on the various enclosures and system components light during certain operations, such as when the system performs a series of power-on self-tests (POSTs) when a server is first powered on. Table 3-5 lists some of the status light-emitting diodes (LEDs) and their functions. Table 3-5.
PAGE 111
Related Reading Overview of Monitoring and Recovery Table 3-5. Status LEDs and Their Functions (page 2 of 2) Location LED Name Color Function Gigabit Ethernet 4-port ServerNet adapter (G4SA) Power-on Green Lights when the adapter is receiving power. Service Amber Lights to indicate internal failure or service action required. Power-on Green Lights when the adapter is receiving power. Service Amber Lights to indicate internal failure or service action required.
PAGE 112
Related Reading Overview of Monitoring and Recovery Table 3-6. Related Reading for Monitoring Task Tool For information, see...
PAGE 113
4 Monitoring EMS Event Messages

When to Use This Section 4-1
What Is the Event Management Service (EMS)? 4-2
Tools for Monitoring EMS Event Messages 4-2
EMSDIST 4-2
OSM Event Viewer 4-2
TSM Event Viewer 4-2
ViewPoint 4-3
Related Reading 4-3

When to Use This Section

Use this section for a brief description of the Event Management Service (EMS) and the tools used to monitor EMS event messages.
PAGE 114
Monitoring EMS Event Messages What Is the Event Management Service (EMS)? What Is the Event Management Service (EMS)? The Event Management Service (EMS) is a collection of processes, tools, and interfaces that support the reporting and retrieval of event information.
PAGE 115
ViewPoint Monitoring EMS Event Messages problems. You can view events from any EMS formatted log files ($0, $ZLOG, or an alternate collector), including event logs saved on the server. To access the TSM Event Viewer, refer to Launching OSM and TSM Applications on page 1-11. This guide does not describe using the TSM Event Viewer. For more information, refer to online help for the TSM Event Viewer.
PAGE 116
Monitoring EMS Event Messages Related Reading
PAGE 117
5 Processes: Monitoring and Recovery

When to Use This Section 5-1
Types of Processes 5-2
System Processes 5-2
I/O Processes (IOPs) 5-2
Generic Processes 5-2
Monitoring Processes 5-3
Monitoring System Processes 5-3
Monitoring IOPs 5-4
Monitoring Generic Processes 5-4
Recovery Operations for Processes 5-6
Related Reading 5-6

When to Use This Section

This section provides basic information about the different types of processes for NonStop S-series servers.
PAGE 118
Processes: Monitoring and Recovery Types of Processes

Types of Processes

Three types of processes are of major concern to a system operator of a NonStop S-series server:

• System processes
• I/O processes (IOPs)
• Generic processes

System Processes

A system process is a privileged process that is created during system load and exists continuously for a given configuration for as long as the processor remains operable.
PAGE 119
Processes: Monitoring and Recovery Monitoring Processes

° OSM server processes
° TSM server processes
° The $ZZLAN ServerNet LAN Systems Access (SLSA) subsystem manager process

Monitoring Processes

This subsection briefly provides examples of some of the tools available to monitor processes. For some processes, such as IOPs, monitoring is more fully discussed in other manuals. In general, use this method to monitor processes:

1.
PAGE 120
Monitoring IOPs Processes: Monitoring and Recovery Monitoring IOPs For a list of manuals that provide more information about monitoring I/O processes (IOPs), refer to Section 6, Communications Subsystems: Monitoring and Recovery. Monitoring Generic Processes Because generic processes are configured using the SCF interface to the Kernel subsystem, you specify the $ZZKRN Kernel subsystem manager process when monitoring a generic process.
PAGE 121
Monitoring Generic Processes Processes: Monitoring and Recovery This example shows the output produced by this command: 1-> STATUS PROCESS $ZZKRN.#* NONSTOP KERNEL - Status PROCESS \BACH.$ZZKRN Symbolic Name CEV-SERVER-MANAGER-P0 CEV-SERVER-MANAGER-P1 CLCI-TACL FOX MSGMON MSGMON MSGMON MSGMON MSGMON MSGMON MSGMON MSGMON MSGMON MSGMON MSGMON MSGMON MSGMON MSGMON MSGMON MSGMON OSM-APPSRVR OSM-CIMOM OSM-CONFLH-RD OSM-OEV QIOMON-0 QIOMON-1 QIOMON-10 QIOMON-11 QIOMON-12 . . .
PAGE 122
Processes: Monitoring and Recovery Recovery Operations for Processes The asterisks (*) indicate files that do not appear if only OSM (and not TSM) is installed. OSM renames some TSM-related files for use by both applications. For example, $TSMM0 and $TSMM1 become $OSMM0 and $OSMM1 after OSM is installed. You can still run TSM even though $TSMM0 and $TSMM1 no longer appear by those names. $ZLOG is another file that is used by both OSM and TSM. (The symbolic name no longer contains TSM.
PAGE 123
6 Communications Subsystems: Monitoring and Recovery When to Use This Section 6-1 Communications Subsystems 6-2 Local Area Networks (LANs) and Wide Area Networks (WANs) Monitoring Communications Subsystems and Their Objects 6-4 Monitoring the SLSA Subsystem 6-4 Monitoring the WAN Subsystem 6-7 Monitoring the NonStop TCP/IP Subsystem 6-9 Monitoring Other Communications Subsystems 6-11 Monitoring Line-Handler Process Status 6-12 Tracing a Communications Line 6-14 Recovery Operations for Communications Subsyst
PAGE 124
Communications Subsystems: Monitoring and Recovery Communications Subsystems Communications Subsystems The software that provides users of NonStop S-series systems with access to a set of communications services is called a communications subsystem. Because connectivity is an important part of online transaction processing (OLTP), HP offers a variety of communications products that support a wide range of applications.
PAGE 125
Communications Subsystems: Monitoring and Recovery Local Area Networks (LANs) and Wide Area Networks (WANs) Processes that use the SLSA subsystem to send and receive data on a LAN attached to a NonStop S-series server are called LAN service providers. Three service providers—the NonStop TCP/IP subsystem and Parallel Library TCP/IP subsystem, the Port Access Method (PAM), and NonStop IPX/SPX—are currently supported.
PAGE 126
Communications Subsystems: Monitoring and Recovery Monitoring Communications Subsystems and Their Objects Object Connectivity By Expand Subsystem network control process and line-handler processes Line-handler processes Line-handler processes X25AM ServerNet cluster (Expand-over-ServerNet) You can define these communications subsystem objects as WAN subsystem devices. Monitoring Communications Subsystems and Their Objects Monitoring and recovery operations for communications subsystems can be complex.
PAGE 127
Communications Subsystems: Monitoring and Recovery Monitoring the SLSA Subsystem

A listing similar to this example is sent to your home terminal:

->STATUS ADAPTER $ZZLAN.E0353
SLSA Status ADAPTER
Name            State
$ZZLAN.E0353    STARTED

This example shows the listing displayed when checking all adapters on $ZZLAN:

> SCF STATUS ADAPTER $ZZLAN.*

1->STATUS ADAPTER $ZZLAN.*
SLSA Status ADAPTER
Name
$ZZLAN.E0353
$ZZLAN.E0354
$ZZLAN.E0553
$ZZLAN.E0554
$ZZLAN.FE1154
$ZZLAN.MIOE0
$ZZLAN.
PAGE 128
Communications Subsystems: Monitoring and Recovery Monitoring the SLSA Subsystem

A listing similar to this example is sent to your home terminal:

->STATUS PIF $ZZLAN.E0353.0
SLSA Status PIF
Name                State
$ZZLAN.E0353.0.A    STARTED

This example shows a listing of the status of all PIFs on $ZZLAN.E0353:

> SCF STATUS PIF $ZZLAN.E0353.*

->STATUS PIF $ZZLAN.E0353.*
SLSA Status PIF
Name                State
$ZZLAN.E0353.0.A    STARTED
$ZZLAN.E0353.0.B    STARTED
$ZZLAN.E0353.1.A    STOPPED
$ZZLAN.E0353.1.B    STARTED

4.
PAGE 129
Communications Subsystems: Monitoring and Recovery Monitoring the WAN Subsystem Monitoring the WAN Subsystem This subsection describes how to obtain the status of SWAN concentrators, data communications devices, processes, and CLIPs. For more information on the WAN subsystem, see the WAN Subsystem Configuration and Management Manual. Monitoring Status for a SWAN Concentrator To display the current status for a SWAN concentrator: > SCF STATUS ADAPTER $ZZWAN.
PAGE 130
Communications Subsystems: Monitoring and Recovery Monitoring the WAN Subsystem

The system displays a listing similar to:

-> status DEVICE $zzwan.#IP01
WAN Manager STATUS DEVICE for DEVICE \COWBOY.$ZZWAN.#IP01
STATE ...........STARTED
LDEV number....173
PPIN...........2, 13
BPIN............3, 11

Monitoring WAN Processes

To display the status of all WAN subsystem processes (configuration managers, TCP/IP processes, WANBoot processes):

> SCF STATUS PROCESS $ZZWAN.
PAGE 131
Communications Subsystems: Monitoring and Recovery Monitoring the NonStop TCP/IP Subsystem

The system displays a listing similar to:

-> status PROCESS $ZZWAN.#ZB017
WAN Manager STATUS PROCESS for PROCESS \ICEBAT.$ZZWAN.#ZB017
STATE:...........STARTED
PPIN.............0 ,278
BPIN.............0, 282

Monitoring CLIPs

To display the current status for a CLIP:

> SCF STATUS SERVER $ZZWAN.#concentrator-name.clip-num

Values for the CLIP number are 1, 2, or 3.
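If CLIP status commands are generated by a script rather than typed, the CLIP number rule above is easy to enforce before the command is issued. This Python sketch only builds the command text shown in this subsection; the function name and the concentrator name used in the usage note are hypothetical, and how the resulting string reaches SCF is left to your site's tooling.

```python
VALID_CLIP_NUMBERS = (1, 2, 3)  # per this guide: CLIP numbers are 1, 2, or 3


def clip_status_command(concentrator_name, clip_num):
    """Build the SCF STATUS SERVER command text for one CLIP.

    concentrator_name is the SWAN concentrator name as configured on your
    system (without the leading '#').
    """
    if clip_num not in VALID_CLIP_NUMBERS:
        raise ValueError(f"CLIP number must be 1, 2, or 3, got {clip_num!r}")
    return f"SCF STATUS SERVER $ZZWAN.#{concentrator_name}.{clip_num}"
```

For a concentrator hypothetically named S01, `clip_status_command("S01", 2)` yields `SCF STATUS SERVER $ZZWAN.#S01.2`, and any other CLIP number is rejected before a command is built.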
PAGE 132
Communications Subsystems: Monitoring and Recovery Monitoring the NonStop TCP/IP Subsystem The system displays a listing similar to this output, which is for process $ZTCO: -> Status Process $ZTCO TCPIP Status PROCESS \SYSA.$ZTCO Status: STARTED PPID.................( 0,107) Proto TCP TCP TCP TCP State TIME-WAIT TIME-WAIT ESTAB TIME-WAIT Laddr 130.252.12.3 130.252.12.3 130.252.12.3 130.252.12.3 BPID.............. ( 1. 98) Lport ftp-data ftp-data ftp smtp Faddr 130.252.12.152 130.252.12.152 130.252.12.
PAGE 133
Communications Subsystems: Monitoring and Recovery Monitoring Other Communications Subsystems Monitoring Other Communications Subsystems Depending on how your system is configured, you might need to monitor additional key subsystems such as the ServerNet/FX adapter subsystem and the Asynchronous Transfer Mode (ATM) subsystem. Examples of commands for these follow.
PAGE 134
Communications Subsystems: Monitoring and Recovery Monitoring Line-Handler Process Status ATM Subsystem The next command displays detailed status for the ATM subsystem manager process. (The ATM subsystem manager process is defined as a generic process in the NonStop Kernel.) > SCF STATUS PROCESS $ZZKRN.#ZZATM, DETAIL The system displays a listing similar to: 1-> STATUS PROC $ZZKRN.#ZZATM,D NONSTOP KERNEL - Detailed Status PROCESS \COCO.$ZZKRN.#ZZATM Backup PID........ 0 , 21 Creation Time.....
PAGE 135
Communications Subsystems: Monitoring and Recovery Monitoring Line-Handler Process Status

The data shown in the report means:

Name          Specifies the name of the object
State         Indicates the summary state of the object, which is either STARTED, STARTING, DIAGNOSING (for SWAN concentrators only), or STOPPED
PPID          Specifies the primary process ID
BPID          Specifies the backup process ID
ConMgr-LDEV   Contains the LDEV of the concentrator manager process. This field applies only to SWAN concentrator lines.
PAGE 136
Communications Subsystems: Monitoring and Recovery Tracing a Communications Line The system displays a listing similar to this output.
PAGE 137
Communications Subsystems: Monitoring and Recovery Recovery Operations for Communications Subsystems

Recovery Operations for Communications Subsystems

Some general troubleshooting guidelines are:

• Examine the contents of the event message log for the subsystem. For example, the WAN subsystem or Kernel subsystem might have issued an event message that provides information about the process failure.
PAGE 138
Communications Subsystems: Monitoring and Recovery Related Reading Table 6-1. Related Reading for Communications Lines and Devices (page 2 of 2) For Information About... Refer to...
PAGE 139
7 ServerNet/DA: Monitoring and Recovery

When to Use This Section 7-1
Overview of the ServerNet/DA 7-1
Monitoring the ServerNet/DA 7-1
Identifying Problems With the ServerNet/DA 7-2
Recovery Operations for the ServerNet/DA 7-3
Related Reading 7-3

When to Use This Section

Use this section for monitoring and recovery information for the 6760 ServerNet device adapter (ServerNet/DA).
PAGE 140
ServerNet/DA: Monitoring and Recovery Identifying Problems With the ServerNet/DA Identifying Problems With the ServerNet/DA When monitoring the ServerNet/DA using the OSM Service Connection or the TSM Service Application, the Power State and the Subcomponent State of the ServerNet/DA should indicate normal operation. Table 7-1 lists the possible states for the ServerNet/DA. Table 7-1.
PAGE 141
ServerNet/DA: Monitoring and Recovery Recovery Operations for the ServerNet/DA Recovery Operations for the ServerNet/DA Refer to the 6760 ServerNet/DA Manual.
PAGE 142
ServerNet/DA: Monitoring and Recovery Related Reading
PAGE 143
8 Fibre Channel ServerNet Adapter: Monitoring and Recovery

When to Use This Section 8-1
Overview of the FCSA 8-1
Monitoring the FCSAs 8-1
Identifying Problems With FCSAs 8-2
Recovery Operations for the FCSA 8-2
Related Reading 8-2

When to Use This Section

Use this section for monitoring and recovery information for the Fibre Channel ServerNet adapters (FCSAs).
PAGE 144
Fibre Channel ServerNet Adapter: Monitoring and Recovery Identifying Problems With FCSAs The SCF Reference Manual for the Storage Subsystem provides reference details and examples for using the SCF INFO and SCF STATUS commands. Identifying Problems With FCSAs When monitoring FCSAs using the OSM Service Connection, the Service State and the Subcomponent State of the FCSAs should indicate normal operation.
PAGE 145
9 Disk Drives: Monitoring and Recovery When to Use This Section 9-1 Overview of Disk Drives 9-2 Monitoring Disk Drives 9-3 Monitoring Event Messages 9-3 Monitoring the Status of Disk Drives Using SCF 9-3 Monitoring Disk Drives Using the TSM Service Application or the OSM Service Connection 9-7 Monitoring the State of Disk Drives 9-8 Monitoring the Use of Space on a Disk Volume 9-8 Monitoring the Size of Database Files 9-9 Monitoring Disk Performance 9-10 Monitoring Disk Configuration Information 9-10 Identi
PAGE 146
Overview of Disk Drives Disk Drives: Monitoring and Recovery Overview of Disk Drives The NonStop S-series server supports both internal and external disk drives. A system enclosure can contain different types of disk drives. However, both disk drives in a mirrored volume must always be the same type of drive.
PAGE 147
Monitoring Disk Drives Disk Drives: Monitoring and Recovery Monitoring Disk Drives Several tools are available to monitor the current status, space usage, configuration, and performance of disk drives. For a description of the tools mentioned in this section, refer to Appendix B, Tools and Utilities for Operations. Monitoring Event Messages For information about displaying EMS events generated by storage devices and subsystems, refer to Section 4, Monitoring EMS Event Messages.
PAGE 148
Monitoring the Status of Disk Drives Using SCF Disk Drives: Monitoring and Recovery

2. To get more information about a specific disk, use the SCF STATUS DISK, DETAIL command. For example:

-> STATUS DISK $DATA09, DETAIL

The output from this example shows that $DATA09 is in the STOPPED state, HARDDOWN substate.

65-> SCF STATUS DISK $DATA09, DETAIL
SCF - T9082G02 - (30JUN97) (14MAY97) - 11/05/98 13:24:10 System \SHARK
STORAGE - Detailed Status DISK \SHARK.
PAGE 149
Monitoring the Status of Disk Drives Using SCF Disk Drives: Monitoring and Recovery • To display the status of all disks: -> STATUS DISK $* 1-> STATUS DISK $* STORAGE - Status DISK \COMM.$SYSTEM LDev Primary Backup Mirror 6 *STARTED STARTED *STARTED MirrorBackup STARTED Primary PID 0,257 Backup PID 1,257 Primary PID 2,288 Backup PID 3,267 STORAGE - Status VIRTUAL DISK \COMM.$VIEWPT LDev State Primary Backup Type Subtype PID PID 147 STARTED 9,22 8,53 3 36 STORAGE - Status VIRTUAL DISK \COMM.
PAGE 150
Monitoring the Status of Disk Drives Using SCF Disk Drives: Monitoring and Recovery • To display the detailed status of the disk $DATA01: -> STATUS $DATA01, DETAIL 35-> STATUS $DATA01, DETAIL STORAGE - Detailed Status DISK \SHARK.
PAGE 151
Disk Drives: Monitoring and Recovery Monitoring Disk Drives Using the TSM Service Application or the OSM Service Connection

The data shown in the output means:

LDev          The logical device number
Path          The disk path assignment
PathStatus    The status of the disk path; whether the disk path is the current path (ACTIVE) or not (INACTIVE)
State         The current SCF state of the disk path
SubState      The current SCF substate of the disk path
Primary PID   The primary processor number and process identification numbe
PAGE 152
Disk Drives: Monitoring and Recovery Monitoring the State of Disk Drives Monitoring the State of Disk Drives Each disk drive is configured to have two paths, the primary path and the backup path. Thus, for a disk drive, the states of the two disk paths are represented separately. Table 9-1 lists possible values for the current state of a disk path. Table 9-1. States for Disk Drive Paths Primary Path State or Backup Path State Description Degraded This path of this disk drive has a state other than Up.
PAGE 153
Disk Drives: Monitoring and Recovery Monitoring the Size of Database Files Monitoring the Size of Database Files This subsection explains how to monitor the size of critical database files to prevent a “file full” error (error 45) from occurring. To check the size of any file on your system: > FUP INFO filename, DETAIL A report similar to this one is sent to your home terminal: $DATA.FILES.
PAGE 154
Disk Drives: Monitoring and Recovery Monitoring Disk Performance

Monitoring Disk Performance

Monitoring disk performance is not discussed in this guide. See these manuals:

• SCF Reference Manual for the Storage Subsystem provides information about monitoring disk block and cache statistical information.
• The Measure User’s Guide is written for system analysts and system managers and describes how to use the Measure performance monitor to collect and examine system performance data.
PAGE 155
Identifying Disk Drive Problems Disk Drives: Monitoring and Recovery

Identifying Disk Drive Problems

The most common disk drive problems on a NonStop S-series server include:

• Space problems such as full disks or free-space fragmentation
• Stopped disks
• Performance problems
• Defective tracks or sectors

Table 9-2 lists the most common disk drive problems and their possible symptoms. For recovery operations, refer to Recovery Operations for Disk Drives on page 9-12. Table 9-2.
PAGE 156
Disk Drives: Monitoring and Recovery Recovery Operations for Disk Drives

Recovery Operations for Disk Drives

These SCF commands are available for controlling DISK objects:

SCF Command   Description
ABORT         Terminates the operation of a disk drive immediately, leaving it in the STOPPED state, HARDDOWN substate.
ALTER         Changes attribute values for a storage device.
CONTROL       Issues disk-specific commands.
PAGE 157
Disk Drives: Monitoring and Recovery Recovery Operations for Free-Space Fragmentation of a Disk Recovery Operations for Free-Space Fragmentation of a Disk Use the Disk Compression Program (DCOM) to consolidate disk space usage. For a description of DCOM, refer to Disk Compression Program (DCOM) on page B-2. Recovery Operations for a Full Disk To prevent or recover from problems caused by a full disk: 1. Use the Disk Space Analysis Program (DSAP) utility to identify large, old, and little used files. 2.
PAGE 158
Recovery Operations for Defective Sectors Disk Drives: Monitoring and Recovery

$SYSTEM SYS00 6> SCF STATUS DISK $*-*
SCF - T9082G02 - (29JUN98) (27MAY98) - 10/22/98 15:12:51 System \ALPHA12
STORAGE - Status DISK \ALPHA12.$DATA06-*
LDev   Path            Status     State
116    PRIMARY         ACTIVE     STARTED
116    BACKUP          INACTIVE   STARTED
116    MIRROR          INACTIVE   STOPPED
116    MIRROR-BACKUP   INACTIVE   STOPPED
STORAGE - Status DISK \ALPHA12.
PAGE 159
Disk Drives: Monitoring and Recovery Recovery Operations for a Nearly Full Database File Recovery Operations for a Nearly Full Database File When a database file is 90 percent full or more, you can modify the file extents dynamically with FUP or perform other procedures as determined by your local system policies. Note. The allocation of additional extents to any file causes that file to take up more disk space.
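The 90 percent threshold can be checked with a short calculation once the end-of-file value and extent settings have been read from a file's detailed information. The Python sketch below rests on assumptions that should be verified against the FUP documentation for your RVU: that extent sizes are expressed in 2048-byte pages, that the file is not partitioned, and that its maximum size is the primary extent plus (maximum extents minus 1) secondary extents. All function and parameter names are illustrative.

```python
PAGE_BYTES = 2048  # assumption: extent sizes are given in 2048-byte pages


def max_file_bytes(primary_pages, secondary_pages, max_extents):
    """Largest size an unpartitioned file can reach with these extent settings."""
    return (primary_pages + (max_extents - 1) * secondary_pages) * PAGE_BYTES


def percent_full(eof_bytes, primary_pages, secondary_pages, max_extents):
    """Current end-of-file as a percentage of the maximum file size."""
    limit = max_file_bytes(primary_pages, secondary_pages, max_extents)
    return 100.0 * eof_bytes / limit


def is_nearly_full(eof_bytes, primary_pages, secondary_pages, max_extents,
                   threshold=90.0):
    """True when the file has reached the threshold named in this guide."""
    return percent_full(eof_bytes, primary_pages, secondary_pages,
                        max_extents) >= threshold
```

For example, with 100-page primary and secondary extents and 16 maximum extents, the limit works out to 3,276,800 bytes, so an end-of-file of 2,949,120 bytes is exactly 90 percent full and would be flagged.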
PAGE 160
Disk Drives: Monitoring and Recovery Recovery Operations for Failed Disk Drives Guide. If you have an alternate system disk for emergency backup, you can minimize unplanned outage minutes. If an alternate system disk is not available and you are unable to load from the CONFBASE file, you might be able to perform a tape load from a system image tape (SIT) to restore the system image files to the $SYSTEM disk (SYSnn and CSSnn subvolumes) and then load that image into either processor 0 or 1.
PAGE 161
10 Tape Drives: Monitoring and Recovery When to Use This Section 10-1 Overview of Tape Drives 10-2 Monitoring Tape Drives 10-3 Monitoring Tape Drive Status 10-3 Monitoring the Status of Labeled-Tape Operations 10-9 Identifying Tape Drive Problems 10-9 Recovery Operations for Tape Drives 10-10 Recovery Operations Using SCF 10-10 Recovery Operations Using the OSM Service Connection 10-10 Recovery Operations Using the TSM Service Application 10-10 Related Reading 10-11 When to Use This Section This section pr
PAGE 162
Overview of Tape Drives Tape Drives: Monitoring and Recovery

Overview of Tape Drives

Tape drives are external devices that connect to a NonStop S-series server using one of these methods:

• Through a 6760 ServerNet device adapter (ServerNet/DA) for G06.01 and subsequent G-series RVUs. For more information about ServerNet/DA, refer to Section 7, ServerNet/DA: Monitoring and Recovery.
PAGE 163
Tape Drives: Monitoring and Recovery Monitoring Tape Drives

Monitoring Tape Drives

These tools are available to monitor tape drives:

• Use the SCF interface to the storage subsystem, the OSM Service Connection, or the TSM Service Application to monitor and get status information about tape drives.
• Use MEDIACOM to monitor the use of tape drives and to write tape labels.

Monitoring Tape Drive Status

This subsection explains how to list the tape drives on your system and determine their status. Note.
PAGE 164
Tape Drives: Monitoring and Recovery Monitoring Tape Drive Status Figure 10-2. Monitoring Tape Drives With OSM VST811.
PAGE 165
Tape Drives: Monitoring and Recovery Monitoring Tape Drive Status

Monitoring Tape Drive Status With TSM

To check the status of all tape drives on your system with the TSM Service Application:

1. Log on to the TSM Service Application.
2. From the tree pane (Figure 10-3):
   a. Double-click Tape Drives.
   b. Click the tape drive whose status you want to check.
3. From the Attributes tab in the details pane:
   a. Check that the Service State is OK.
PAGE 166
Tape Drives: Monitoring and Recovery Monitoring Tape Drive Status Figure 10-3. Monitoring Tape Drives With TSM CDT 810.
PAGE 167
Monitoring Tape Drive Status Tape Drives: Monitoring and Recovery

Monitoring Tape Drive Status With SCF

To check the status of all tape drives on your system with SCF:

> SCF STATUS TAPE $*

A listing similar to this one is sent to your home terminal:

STORAGE - Status TAPE \MINDEN.$XTAPE
LDev   State     Primary PID   Backup PID
93     STOPPED   1,287         0,279
STORAGE - Status TAPE \MINDEN.
PAGE 168
Monitoring Tape Drive Status Tape Drives: Monitoring and Recovery Monitoring Tape Drive Status With MEDIACOM The MEDIACOM command STATUS TAPEDRIVE displays the current status of a tape drive. Among other things, this command tells you whether a tape is mounted on the drive, the name of the DEFINE associated with the tape, and which volume catalog and pool owns it. Note. Manual unloading of a tape is not detected by a tape drive, so information from STATUS TAPEDRIVE can be out of date.
PAGE 169
Tape Drives: Monitoring and Recovery Monitoring the Status of Labeled-Tape Operations Monitoring the Status of Labeled-Tape Operations Use the MEDIACOM STATUS TAPEDRIVE and STATUS TAPEMOUNT commands to determine the current status of labeled-tape operations on your system.
PAGE 170
Tape Drives: Monitoring and Recovery Recovery Operations for Tape Drives Recovery Operations for Tape Drives You can perform recovery operations on tape drives using either the SCF interface to the storage subsystem, the OSM Service Connection, or the TSM Service Application.
PAGE 171
Related Reading Tape Drives: Monitoring and Recovery

   b. Right-click the tape drive.
   c. Select Actions from the menu. The Actions dialog box appears.
   d. You can select up or down, which correspond to the SCF commands START and STOP, or select various tests to perform on the tape drive.

For information on recovery operations, refer to the TSM online help or suggested Repair Actions text (listed under Alarm Details) for specific tape-related alarms in the TSM Service Application.
PAGE 172
Related Reading Tape Drives: Monitoring and Recovery Table 10-2. Related Reading for Tapes and Tape Drives (page 2 of 2) For Information About... Refer to...
PAGE 173
11 Processors: Monitoring and Recovery When to Use This Section 11-1 Monitoring and Maintaining Processors 11-2 Monitoring Processor Status Using OSM or TSM 11-2 Monitoring Event Messages 11-3 Monitoring the State of PMF CRUs 11-3 Monitoring Processor Performance Using ViewSys 11-4 Identifying Processor Problems 11-5 Hardware Error Freezes 11-5 Processor Hangs 11-5 Processor Halts 11-5 Recovery Operations for Processors 11-6 Halting One or More Processors 11-7 Recovery Operations for a System Hang 11-7 Reco
PAGE 174
Monitoring and Maintaining Processors Processors: Monitoring and Recovery Monitoring and Maintaining Processors Use OSM, TSM, the ViewSys product, and other tools to monitor processors.
PAGE 175
Monitoring Event Messages
For more information, refer to Monitoring EMS Event Messages on page 4-1.
Monitoring the State of PMF CRUs
Use the OSM Service Connection or the TSM Service Application to monitor the state of each processor multifunction (PMF) customer-replaceable unit (CRU). To monitor a PMF CRU and determine the cause of a problem:
1. In the Management window, check the Physical view of the PMF CRU.
PAGE 176
Processors: Monitoring and Recovery Monitoring Processor Performance Using ViewSys Monitoring Processor Performance Using ViewSys Use the ViewSys product to view system resources online and to see information on system performance. ViewSys provides information about processor activity. Using ViewSys, you can list the processors on your system and determine their status. For more information, refer to ViewSys on page B-7.
PAGE 177
Identifying Processor Problems
Abnormal processor states include hardware error freezes, system hangs, and processor halts.
Hardware Error Freezes
A hardware error freeze occurs when a processor cannot continue processing due to the risk of using corrupt data from a hardware error. Contact your service provider before dumping a frozen processor.
PAGE 178
Processors: Monitoring and Recovery Recovery Operations for Processors If system freeze is enabled, the status for all other freeze-enabled processors becomes: Frozen by other processor The Processor Halt Codes Manual documents processor halt codes. Note. Do not freeze-enable a processor unless instructed to do so by your service provider. Recovery Operations for Processors Processor halts can sometimes be confused with other types of errors.
PAGE 179
Halting One or More Processors
To place a selected processor or processors in a halt state and set the status and registers of the processor or processors to an initial state:
1. Log on to the OSM or TSM Low-Level Link.
2. On the toolbar, click Processor Status.
3. In the Processor Status dialog box, select the processor to be halted or select all the processors to halt all of them.
4. Select Processor Actions>Halt.
5.
PAGE 180
Processors: Monitoring and Recovery Recovery Operations for a Hardware Error Freeze 7. Reload the remaining processors. Note. After reloading the remaining processors, run your startup scripts if any. Send the dumps to your service provider. Recovery Operations for a Hardware Error Freeze Contact your service provider. Depending on the circumstances, a hardware error freeze might require the PMF CRU or a memory unit to be replaced. See Replacing Processor Memory or a PMF CRU on page 11-27.
PAGE 181
Processors: Monitoring and Recovery Recovery Operations for a Processor Halt 3. Dump (copy) the contents of its memory to disk or tape unless otherwise indicated. Dumping the contents of a halted processor (its registers and entire memory contents) can be a useful diagnostic tool for analyzing and resolving the problem.
PAGE 182
Processors: Monitoring and Recovery Dumping a Processor to Disk Dumping a Processor to Disk A processor dump to disk occurs while the system is running. The dump occurs over either the X or Y ServerNet fabric. When a processor is dumped to disk, the RCVDUMP utility begins copying the dump in a compressed format from the specified processor into a disk file called dumpfile. If dumpfile does not exist, the RCVDUMP utility creates it.
PAGE 183
Processors: Monitoring and Recovery Dumping a Processor to Disk You will need this information when you notify your system manager or service provider about this dump. Procedure to Dump a Processor to Disk Complete syntax and considerations for RECEIVEDUMP and RCVDUMP, as well as the error and informational messages that they generate, are described in the Guardian User’s Guide. For an explanation of the messages generated by RCVDUMP, refer to the TACL Reference Manual.
PAGE 184
Enabling or Disabling System Freeze
The Enable System Freeze tool is for debugging purposes only. It is intended for use only under the direction of a service provider. Upon activation of Enable System Freeze, when one freeze-enabled processor halts, all other freeze-enabled processors also halt. The default setting is Disable System Freeze.
Caution. Do not Enable System Freeze if you are using the server in a production environment.
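The effect of Enable System Freeze can be pictured with a small model. The following Python sketch is illustrative only and is not an OSM or TSM tool: the function `halted_after`, its parameters, and the processor numbers are assumptions made for the example. It also assumes that a halt on a processor that is not freeze-enabled does not affect the others.

```python
def halted_after(halting_cpu: int, freeze_enabled: set, system_freeze: bool) -> set:
    """Return the set of processors that stop when halting_cpu halts."""
    stopped = {halting_cpu}
    # As described above: with system freeze enabled, a halt on one
    # freeze-enabled processor also halts all other freeze-enabled processors.
    if system_freeze and halting_cpu in freeze_enabled:
        stopped |= freeze_enabled
    return stopped


# Hypothetical system in which processors 0, 1, and 2 are freeze-enabled.
print(sorted(halted_after(1, {0, 1, 2}, system_freeze=True)))
print(sorted(halted_after(1, {0, 1, 2}, system_freeze=False)))
```

The model shows why the setting is unsuitable for production use: with system freeze enabled, one halt takes every freeze-enabled processor out of service.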
PAGE 185
Processors: Monitoring and Recovery Enabling or Disabling Freeze on a Processor 4. Click Perform Action. If you selected Disable System Freeze, the action begins immediately. If you selected Enable System Freeze, a message prompts you to confirm the action. 5. If you selected Enable System Freeze, click OK or Cancel. If you clicked OK, the status of the action appears in the Action Status box. Enabling or Disabling System Freeze After System Discovery 1.
PAGE 186
Checking If Freeze Is Enabled or Disabled on One or More Processors Using the Processor Status Dialog Box
1. Log on to the OSM or TSM Low-Level Link.
2. Do one of the following:
• From the Summary menu, choose Processor Status.
• On the toolbar, click Processor Status.
The Processor Status dialog box appears. If “F” appears next to a processor, freeze is enabled on that processor.
PAGE 187
6. In the Action status box, monitor the status of the Enable Freeze or Disable Freeze action:
• After the Enable Freeze action has successfully finished, a completed message appears, and an “F” appears next to the processor in the Processor Status dialog box.
• After the Disable Freeze action has successfully finished, a completed message appears, and the “F” next to the processor disappears from the Processor Status dialog box.
PAGE 188
Processors: Monitoring and Recovery Freezing the System or Processor Freezing a Processor 1. Check the Processor Freeze attribute for each processor in the system: a. In the tree pane, click the system tab. b. Select the processor. c. Click the Attributes tab in the details pane and check the value of the Processor Freeze attribute. If you want a processor to freeze, make sure its Processor Freeze attribute is Enabled.
PAGE 189
Dumping a Processor to Tape (Down System Only)
If the entire system is down (all processors are halted), you can perform a tape dump using the OSM or TSM Low-Level Link. Your service provider can use the memory dump to troubleshoot your system. For more information on determining processor problems, see Monitoring Processor Status Using OSM or TSM on page 11-2.
PAGE 190
Processors: Monitoring and Recovery Dumping a Processor to Tape (Down System Only) Before You Begin To prepare for a tape dump: 1. Log on to the OSM or TSM Low-Level Link. 2. On the toolbar, click Processor Status. The Processor Status dialog box appears. 3. Write down the status message displayed in the Processor Status dialog box (Figure 11-2) for the processor to be dumped. You will need this information when you notify your system manager or service provider about this dump. Figure 11-2.
PAGE 191
Processors: Monitoring and Recovery Dumping a Processor to Tape (Down System Only) example, you can select processors 2, 3, and 4, but not 2 and 4. To select processors 2 and 4, use the Ctrl key with the left mouse button. b. In the Processor Action menu, scroll to Halt. c. Click Perform action. 5. Verify that a tape drive is connected to a PMF CRU in group 01. 6. Mount a tape that is not write-protected into that tape drive. For open-reel tapes, check that the write-enable ring is present. 7.
PAGE 192
Processors: Monitoring and Recovery Dumping a Processor to Tape (Down System Only) 2. In the Dump Processor-n to Tape dialog box, type: a. The SCSI ID of the tape drive. The default value is 4 or 5, which is the current software requirement. b. The location of the PMF CRU to which the tape drive is connected. Tape drives are connected to SCSI controllers on PMF CRUs.
PAGE 193
Troubleshooting and Recovery Operations for Tape Dumps
Table 11-1 lists the errors that can occur during a tape dump. Perform the recommended recovery operation.
Table 11-1. Common Errors That Occur During Tape Dumps
Status Message          Cause     Recovery
Dump operation failed   Various.  Check with your system manager or service provider.
Error during tape dump  Various.  If it is a tape error, try a new tape.
PAGE 194
Dumping All Processors in a System
Dump an entire server when you want to examine the contents of all processors on a frozen server. You must be logged on to the OSM or TSM Low-Level Link to perform this task.
Note. Normally you do not perform system dumps. System dumps are performed primarily in development environments.
Dumping an Entire Server
1. Enable system freeze. See Enabling or Disabling System Freeze on page 11-12.
2.
PAGE 195
Processors: Monitoring and Recovery Reloading a Single Processor on a Running Server Reloading a Single Processor on a Running Server Sometimes one or more processors in a running server are not operating. For information on how to determine whether a processor is operating, see Monitoring Processor Status Using OSM or TSM on page 11-2. After you have determined that a processor is not operating, check that the processor is halted. Dump (copy) its memory to disk.
PAGE 196
12. Either:
• If you selected Reset, type reload n,prime.
• If you selected Prime for Reload, type reload n.
Note. n is the number of the processor you want to reload.
13. Check the OutsideView window for status messages, which will report successes or errors during the load. Monitor the state of the processor you are loading until it is executing the NonStop Kernel operating system.
14.
PAGE 197
Processors: Monitoring and Recovery Loading a Processor From Disk 6. Type the RVU of the software you want to load in the SYSnn edit box. 7. Select the configuration file using the option buttons. 8. Click the CIIN disabled check box if you want to disable the CIIN file. 9. Type the disk information in the group, module, and slot boxes. The $SYSTEM-P disk is located in group 1, module 1, slot 11. The $SYSTEM-M disk is located in group 1, module 1, slot 12. 10.
PAGE 198
Copying a Dump File From Tape to Disk
To copy a dump file from tape to a disk file in compressed format, use the COPYDUMP utility. COPYDUMP automatically determines the size of the dump file. To make a compressed disk copy of a dump file:
1. At a TACL prompt:
COPYDUMP { $tape | dumpfile }, destfile
where:
• $tape is the name of the tape drive where the tape dump file is located.
PAGE 199
Replacing Processor Memory or a PMF CRU
Processor memory is field-replaceable only in NonStop S7000 and NonStop S70000 servers, depending on your service area. Check with your service provider on the procedure for your area. If memory units cannot be replaced, the entire PMF CRU must be replaced.
PAGE 200
Submitting Information to Your Service Provider
Table 11-2. Other Files to Submit to Your Service Provider
File                                                 Description
$SYSTEM.SYSnn.CONFLIST                               SYSGENR output file
$SYSTEM.ZLOGnn                                       EMS event log ($0 operator log files)
All files located in the $SYSTEM.ZSERVICE subvolume  Service event log ($ZLOG files)
To back up configuration and operations files:
1. For this backup operation, use any tape drive that is in a STARTED state and a READY substate.
PAGE 201
Additional Information Required by Your Service Provider
In addition to the tapes previously discussed, submit the information listed in Table 11-3 to your service provider.
Table 11-3.
PAGE 202
Related Reading
For more information about tools used to monitor and perform recovery operations on processors, refer to the documentation listed in Table 11-4.
Table 11-4.
PAGE 203
12 ServerNet Fabrics: Monitoring and Recovery When to Use This Section 12-1 Monitoring the Status of the ServerNet Fabrics 12-2 Monitoring the ServerNet Fabrics Using OSM or TSM 12-2 Monitoring the ServerNet Fabrics Using SCF 12-3 Identifying ServerNet Fabric Problems 12-5 Recovery Operations for the ServerNet Fabrics 12-6 Recovery Operations for a Down Disk Due to a Fabric Failure 12-6 Recovery Operations for a Down Path Between Processors 12-6 Recovery Operations for a Down Processor 12-6 Recovery Operati
PAGE 204
Monitoring the Status of the ServerNet Fabrics
To monitor the status of the ServerNet fabrics:
• Use the OSM Service Connection or the TSM Service Application to check the communication between processor enclosures, I/O enclosures, and systems.
• Use the Subsystem Control Facility (SCF) to check the status of interprocessor communication on the X and Y fabrics.
PAGE 205
Monitoring the ServerNet Fabrics Using SCF
The SCF STATUS SERVERNET command displays a matrix for the ServerNet X fabric and a matrix for the ServerNet Y fabric. Each matrix shows the status of the paths between all pairs of processors. Use the SCF STATUS SERVERNET command to display current information about the ServerNet fabric.
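One way to read such a matrix is to list every FROM/TO pair whose status is not UP and then review those paths. The following Python sketch is illustrative only: the function `paths_not_up`, the nested-list representation, and the sample two-processor data are assumptions, and the sketch does not parse real SCF output. It uses the UP and DIS states described in this section.

```python
def paths_not_up(matrix: list) -> list:
    """List (from_cpu, to_cpu) pairs whose status is anything other than UP."""
    return [
        (src, dst)
        for src, row in enumerate(matrix)   # row index = FROM processor
        for dst, status in enumerate(row)   # column index = TO processor
        if status != "UP"
    ]


# Hypothetical two-processor system: on the X fabric, the path from
# processor 1 to processor 0 is reported as DIS.
fabrics = {
    "X": [["UP", "UP"], ["DIS", "UP"]],
    "Y": [["UP", "UP"], ["UP", "UP"]],
}

for name, matrix in fabrics.items():
    print(name, paths_not_up(matrix))
```

Whether a flagged path is actually a problem depends on how the system is configured, so the result is a list to review rather than a list of faults.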
PAGE 206
° The status from processors 2 through 15 is displayed as down.
Normal ServerNet Fabric States
Normal states for a path on the ServerNet fabrics can be one of:
• UP
The path from the processor in the FROM row to the processor in the TO column is up. The status for all ServerNet connections between existing processors in a system should be UP.
PAGE 207
ServerNet Fabrics: Monitoring and Recovery Identifying ServerNet Fabric Problems Identifying ServerNet Fabric Problems Depending on how your system is configured, these states for a path on the ServerNet fabrics might indicate a problem: • DIS (disabled) The ServerNet fabric is down at the TO location.
PAGE 208
Recovery Operations for the ServerNet Fabrics
For most recovery operations, refer to the SCF Reference Manual for the Kernel Subsystem.
PAGE 209
13 Applications: Monitoring and Recovery When to Use This Section 13-1 Monitoring TMF 13-1 Monitoring the Status of TMF 13-2 Monitoring Data Volumes 13-2 TMF States 13-4 Monitoring the Status of Pathway 13-5 PATHMON States 13-6 Related Reading 13-6 When to Use This Section This section explains how to monitor the status of the HP NonStop Transaction Management Facility (TMF) and Pathway transaction processing applications.
PAGE 210
Monitoring the Status of TMF
To monitor TMF using TMFCOM:
1. At a TACL prompt:
> TMFCOM
2. At the TMFCOM prompt:
~ STATUS TMF
Note. The STATUS TMF command presents status information about the audit dump, audit trail, and catalog processes. Thus, in addition to the general TMF information, the STATUS TMF command combines information from the STATUS AUDITDUMP, STATUS AUDITTRAIL, and STATUS BEGINTRANS commands.
PAGE 211
Monitoring Data Volumes
For example, to check the status of all data volumes, at a TMFCOM prompt type:
~ STATUS DATAVOLS
TMFCOM responds with output similar to:
         Audit   Recovery
Volume   Trail   Mode      State
---------------------------------------------------
$DATA1   MAT     Online    Started
$DATA2   MAT     Online    Started
$DATA3   MAT     Online    Recovering
$DATA4   MAT     Archive   Recovering
$DATA5   AUX01   Online    Started
$DATA6   AUX01   Online    Started
$DATA6   AUX01   Archive   Recovering
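When many data volumes are configured, it can help to filter a captured listing for volumes that are not in the Started state. The following Python sketch is illustrative only and is not a TMFCOM feature: the function `volumes_not_started` is hypothetical, and it assumes that each data row has exactly four whitespace-separated fields (volume, audit trail, recovery mode, state) and that the volume name begins with `$`.

```python
def volumes_not_started(listing: str) -> list:
    """Return (volume, recovery mode, state) for rows whose state is not Started."""
    flagged = []
    for line in listing.splitlines():
        fields = line.split()
        # Data rows only: headings and the dashed rule do not start with "$".
        if len(fields) == 4 and fields[0].startswith("$"):
            volume, _trail, mode, state = fields
            if state != "Started":
                flagged.append((volume, mode, state))
    return flagged


# Sample rows taken from the listing shown in this guide.
SAMPLE = """\
$DATA1 MAT   Online  Started
$DATA3 MAT   Online  Recovering
$DATA4 MAT   Archive Recovering
$DATA5 AUX01 Online  Started
"""

for row in volumes_not_started(SAMPLE):
    print(row)
```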
PAGE 212
TMF States Applications: Monitoring and Recovery TMF States The TMF subsystem can be in any of the states listed in Table 13-1. Table 13-1. TMF States State Meaning Configuring New Audit Trails The TMF subsystem has not yet been started with this configuration. Deleting The TMF subsystem is purging its current configuration, audit trails, and volume and file recovery information for the database in response to a DELETE TMF command.
PAGE 213
Monitoring the Status of Pathway
Pathway is a group of related software tools that enables businesses to develop, install, and manage online transaction processing applications. Several Pathway environments can exist for a system. As a system operator, you might check the status of Pathway in your routine system monitoring. This subsection explains how to check the status of the Pathway transaction processing applications.
1.
PAGE 214
PATHCOM responds with output such as:
PATHMON - STATE=RUNNING           CPUS 6:1
 PATHCTL (OPEN)   $GROG.VIEWPT.PATHCTL
 LOG1 SE (OPEN)   $0
 LOG2    (CLOSED)
 REQNUM  FILE     PID    PAID   WAIT
 1       PATHCOM  $Y622  8,001
 2       TCP      $Y898
PATHMON States
The status of the PATHMON process can be either STARTING or RUNNING:
• STARTING indicates that a system load or cool start has not finished.
• RUNNING indicates that a system load or cool start has finished.
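A monitoring script could read the state from captured PATHCOM output by looking for the `STATE=` field. The following Python sketch is illustrative only and is not a PATHCOM feature: the function names `pathmon_state` and `load_finished` are hypothetical, and the sketch assumes the state appears as `STATE=` followed by a single word, as in the sample output.

```python
import re
from typing import Optional


def pathmon_state(status_text: str) -> Optional[str]:
    """Extract the word after STATE= from captured output, or None if absent."""
    match = re.search(r"STATE=(\w+)", status_text)
    return match.group(1) if match else None


def load_finished(status_text: str) -> bool:
    """RUNNING means the system load or cool start has finished."""
    return pathmon_state(status_text) == "RUNNING"


print(pathmon_state("PATHMON - STATE=RUNNING CPUS 6:1"))
print(load_finished("PATHMON - STATE=STARTING CPUS 6:1"))
```

Returning None when no `STATE=` field is found lets the caller tell "not yet running" apart from "output not recognized".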
PAGE 215
14 Printers and Terminals: Monitoring and Recovery When to Use This Section 14-1 Overview of Printers and Terminals 14-1 Monitoring Printer and Collector Process Status 14-2 Monitoring Printer Status 14-2 Monitoring Collector Process Status 14-2 Recovery Operations for Printers and Terminals 14-3 Recovery Operations for a Full Collector Process 14-3 Related Reading 14-3 When to Use This Section This section provides a brief overview about monitoring and recovery for printers and terminals.
PAGE 216
Monitoring Printer and Collector Process Status Printers and Terminals: Monitoring and Recovery Monitoring Printer and Collector Process Status This subsection explains how to list the printers on your system and determine their status. It also explains how to check the status of the spooler subsystem collector processes, which accept output from applications and store the output on a disk.
PAGE 217
Printers and Terminals: Monitoring and Recovery Recovery Operations for Printers and Terminals This listing shows that the three collector processes, $S, $S1, and $S2, are active and none is approaching a full state.
PAGE 218
Related Reading
For information about the spooler and SPOOLCOM:
• Guardian User’s Guide
• Spooler Utilities Reference Manual
PAGE 219
15 Power Failures: Preparation and Recovery When to Use This Section 15-1 How an Enclosure Responds to Power Failures 15-1 How External Devices Respond to Power Failures 15-2 With an Uninterruptible Power Supply (UPS) 15-2 Without an Uninterruptible Power Supply (UPS) 15-2 Preparing for Power Failure 15-3 Monitoring Power Supplies 15-3 Maintaining Batteries 15-3 Recharging Spare Batteries 15-3 Monitoring Batteries 15-3 Causes of Drained Batteries 15-3 Recharging Drained Batteries 15-4 Power Failure Recovery
PAGE 220
Power Failures: Preparation and Recovery How External Devices Respond to Power Failures The default power-fail delay time is 30 seconds, but this time can vary depending on how your system is configured. In some circumstances, the operating system might shorten the power-fail delay time. With a shorter power-fail delay time, the batteries might be able to provide power to the memory for longer than the normal 45 minutes.
PAGE 221
Preparing for Power Failure
To prepare for power failures, regularly monitor power supplies and batteries.
Monitoring Power Supplies
Monitor power-generating equipment and run regular checks on any backup generators to make sure that you can handle extended power outages.
Maintaining Batteries
Make sure that the batteries in each enclosure and all spare batteries are always fully charged.
PAGE 222
• Hardware problems:
° A power monitor and control unit (PMCU) failure has occurred.
° A battery failure has occurred.
° A processor multifunction (PMF) CRU hardware failure has occurred.
The failed hardware component might need to be replaced. Refer to the CSSI Web.
Recharging Drained Batteries
Batteries are automatically recharged when the system is running.
PAGE 223
Power Failures: Preparation and Recovery Setting System Time 2. Log on to the OSM Service Connection or the TSM Service Application, and then: a. Check the state of the batteries as described in Monitoring Batteries on page 15-3. b. Check the status of all system components in the enclosures to make sure they are started. 3. Use SCF commands to check the status of external devices and, if necessary, to restart any external devices to bring them back online.
PAGE 224
Related Reading
PAGE 225
16 Starting and Stopping the System When to Use This Section 16-2 Minimizing the Frequency of Planned Outages 16-2 Anticipating and Planning for Change 16-2 Performing a Change Online 16-3 Powering On the System 16-3 Before Powering On the System 16-3 System Power-On Procedure 16-4 Troubleshooting and Recovery Operations When Powering On the System 16-4 Starting the System 16-6 The System Startup Dialog Box 16-6 The Load Processor-n From Disk Dialog Box 16-9 Troubleshooting and Recovery Operations When Star
PAGE 226
Starting and Stopping the System When to Use This Section When to Use This Section You normally leave a system running. Therefore, powering the system on and off, or starting (performing a system load) and stopping the system, are not part of the daily operations routine. However, you do have to perform these procedures as part of some system operations.
PAGE 227
Starting and Stopping the System Performing a Change Online increase the maximum number of objects controlled by PATHMON objects without a system shutdown. Performing a Change Online You can perform many changes to a NonStop S-series system online. For information on hardware changes, application changes, and communications subsystem changes you can perform without shutting the system down, refer to the NonStop S-Series Planning and Configuration Guide and the Availability Guide for Change Management.
PAGE 228
Starting and Stopping the System System Power-On Procedure System Power-On Procedure To power on a system: 1. Locate the power-on push button above the handle on either processor multifunction (PMF) customer-replaceable unit (CRU) in group 01 (the group containing processors 0 and 1). Refer to Section 2, Determining Your System Configuration. 2. Press and hold down the power-on push button for at least one second. 3.
PAGE 229
Starting and Stopping the System Troubleshooting and Recovery Operations When Powering On the System Green LED Is Not Lit After POSTs Finish It can take several minutes for the green LEDs on all system components to light: 1. Check that fans are turning and that the AC power cords and power-on cables are properly connected. 2. Wait for the POSTs to finish. It might take as long as 10 minutes for all system components. 3. If the green LEDs still do not light: a.
PAGE 230
Starting the System Starting and Stopping the System Starting the System Starting a system involves loading the NonStop Kernel operating system into the memory of each processor in the server. Use the OSM or TSM Low-Level Link to start a system by either: • • Using the System Startup dialog box is the normal method for most circumstances, if you are performing a system load from the system disks located in slots 1.1.11 and 1.1.12.
PAGE 231
Starting and Stopping the System The System Startup Dialog Box first processor to be “Executing NonStop OS” after the system load finishes successfully. If the system load fails along all eight paths, refer to Troubleshooting and Recovery Operations When Starting the System on page 16-12. After the first processor is loaded, the initial TACL process automatically invokes the CIIN file unless the CIIN file is disabled.
PAGE 232
Starting and Stopping the System • The System Startup Dialog Box Base (CONFBASE) is the most basic configuration required for system startup. You will probably never need to load the system from the CONFBASE file. However, if the current configuration file has become corrupted and there is no other configuration file from which you can load the system, use this option. 4. Make sure that the CIIN disabled check box is not selected if you want the command in the CIIN file to execute. 5.
PAGE 233
Starting and Stopping the System The Load Processor-n From Disk Dialog Box For example, if you load the \EAST system from the CONFBASE file (which specifies \NONAME as the system name), an INFO SUBSYS $ZZKRN command displays \EAST as the current system and \NONAME as a pending change. Enter an ALTER SUBSYS command to change the system name to \EAST and cause the pending change to disappear. It is not displayed when you enter INFO SUBSYS again.
PAGE 234
Starting and Stopping the System The Load Processor-n From Disk Dialog Box 4. Select File>Start Terminal Emulator>For Event Streams. Procedure to Use the Load Processor-n From Disk Dialog Box To perform a system load into a specified processor, perform these steps from the OSM or TSM Low-Level Link: 1. From the toolbar, click Processor Status. The Processor Status dialog box appears. 2. In the Processor Status dialog box: a. Select the processor you want to load. b.
PAGE 235
Starting and Stopping the System The Load Processor-n From Disk Dialog Box 3. In the Load Processor-n From Disk dialog box: a. Type the current SYSnn. b. Select the current configuration file (CONFIG), or if you are unable to load using the CONFIG file, select a saved version (CONFxxyy). c. Check the CIIN disabled option if you plan to dump processors. d. Type the group, module, and slot numbers of the disk from which you want to load.
PAGE 236
Troubleshooting and Recovery Operations When Starting the System Starting and Stopping the System Troubleshooting and Recovery Operations When Starting the System Problems that might occur when you start a system are listed next.
PAGE 237
Starting and Stopping the System Troubleshooting and Recovery Operations When Starting the System Figure 16-3. OutsideView Buttons on the Windows Toolbar TIF719 4. Select File>Start Terminal Emulator>For Event Streams. 5. Two OutsideView windows appear, but one launches on top of the other. If you do not see the TACL prompt in one OutsideView window, you can check the other OutsideView window (see Figure 16-3).
PAGE 238
Starting and Stopping the System Troubleshooting and Recovery Operations When Starting the System CIIN File Is Not Invoked During System Startup The initial TACL process invokes the CIIN file automatically after the first processor is loaded if all these conditions are true: • • • The CONFTEXT configuration file located in the $SYSTEM.SYSnn subvolume has an INITIAL_COMMAND_FILE entry for the CIIN file. The CIIN file is available in the specified location.
PAGE 239
b. On the toolbar, select Processor Status.
c. In the Processor Status dialog box, select the processors to be reloaded.
d. Select Actions>Prime for Reload.
e. Click Perform Action.
f. Close the Processor Status dialog box.
g. Try the RELOAD (nn) command again at a TACL prompt.
6. If you continue to have problems, contact your service provider.
PAGE 240
Starting and Stopping the System Troubleshooting and Recovery Operations When Starting the System 3. If you still cannot load the system or if a CONFxxyy is not available, try one of these procedures: • Replace the current system disk with an alternate system disk if one is available: a. Replace the $SYSTEM disk. For replacement procedures, refer to the CSSI Web. b. Load the system as described in The System Startup Dialog Box on page 16-6.
PAGE 241
Starting and Stopping the System Getting a Corrupt System Configuration File Analyzed When you consider performing a tape load from a SIT: • • • • You must contact the GCSC for guidance in restoring your system disk from a SIT. You can configure the Distributed Systems Management/Software Configuration Manager (DSM/SCM) to create a SIT whenever a significant software update is performed, or you can request one each time a new SYSnn is created. For more information, see the DSM/SCM User’s Guide.
PAGE 242
Stopping the System Starting and Stopping the System You use SCF to start many system components. Refer to the SCF Reference Manual for G-Series RVUs as an overall reference.
PAGE 243
Alerts
Before stopping a system:
• Unless you stop a system in a careful and systematic manner, you can introduce abnormalities in the system state. Such abnormalities can affect disk file directories and can cause the processors to hang in an endless loop when you attempt to load your system.
• To maximize application availability, make stopping the system a planned event whenever possible.
PAGE 244
Starting and Stopping the System • • Preparing to Stop the System Following the SPOOLER DRAIN subcommand, the collectors allow current jobs to finish but reject new opens with a file-system error 66 (device downed). When you drain the spooler, each collector stops when it has no more open jobs. Each print process finishes printing any active jobs and then stops. After all collectors and print processes have stopped, the supervisor stops. The spooler enters the dormant state, ready to be warm started.
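The drain sequence described above can be pictured as a small state model. The following Python sketch is illustrative only and does not interact with the spooler: the `Collector` class, the `supervisor_can_stop` function, and the convention of returning 0 for a successful open are assumptions made for the example. Print processes are reduced to a count of active jobs.

```python
from dataclasses import dataclass

ERROR_DEVICE_DOWNED = 66  # file-system error returned for new opens during a drain


@dataclass
class Collector:
    open_jobs: int = 0
    draining: bool = False

    def open_job(self) -> int:
        """Return 0 on success (model convention), or error 66 while draining."""
        if self.draining:
            return ERROR_DEVICE_DOWNED
        self.open_jobs += 1
        return 0

    def finish_job(self) -> None:
        """Current jobs are still allowed to finish during a drain."""
        if self.open_jobs > 0:
            self.open_jobs -= 1

    @property
    def stopped(self) -> bool:
        # A draining collector stops once it has no more open jobs.
        return self.draining and self.open_jobs == 0


def supervisor_can_stop(collectors: list, active_print_jobs: list) -> bool:
    """The supervisor stops only after all collectors and print processes stop."""
    return all(c.stopped for c in collectors) and all(n == 0 for n in active_print_jobs)


collector = Collector(open_jobs=1)
collector.draining = True
print(collector.open_job())   # new open is rejected
collector.finish_job()        # the current job finishes
print(supervisor_can_stop([collector], active_print_jobs=[0]))
```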
PAGE 245
Starting and Stopping the System Procedure to Stop the System Using OSM or TSM 8. Refresh the disks to put them in an orderly state before shutdown. Use the SCF CONTROL DISK, REFRESH command: > SCF CONTROL DISK $*,REFRESH Procedure to Stop the System Using OSM or TSM To place all processors in a halt state and set the status and registers of the processors to an initial state: 1. Log on to the OSM or TSM Low-Level Link. 2. On the toolbar, click Processor Status. 3.
PAGE 246
Starting and Stopping the System Powering Off the System Powering Off the System The system powers off by powering off all system components and finally shutting down the power supplies. In this state, you can power up the system only by pressing the power-on push button on either PMF CRU in group 01. The method you use to power off a system depends on the state of the system: • Under normal circumstances, use SCF to power off a running system that has been brought to an orderly stop.
PAGE 247
Starting and Stopping the System Emergency Power-Off Procedure The system is powered off, and you are automatically logged off of the Low-Level Link. • To power off a stopped system after system discovery: 1. Select Display>Actions. 2. The Actions dialog box appears. In the Actions box, select Power Off. 3. Click Perform Action. 4. A message box prompts you to confirm the power off system action. To power off, click OK. 5. Shut off AC power to all peripherals and subsystems.
PAGE 248
Starting and Stopping the System Recovery Operations for Stopping or Powering Off the System Recovery Operations for Stopping or Powering Off the System If all processors in the system have been halted and you are unable to log off, press Alt-F4 to exit the OSM or TSM Low-Level Link. Reducing Shutdown Time An important component of a planned outage is the time required to start and stop your applications, devices, and processes.
PAGE 249
Starting and Stopping the System Use Parallel Processing This PATHCOM START command uses explicit names to start all of the TERM objects defined in the PATHMON configuration file: = START TERM (TERM1, TERM2, TERM3, TERM4, TERM5, TERM6) Note. When using explicit names, you must revise your command files whenever a configuration change occurs. Therefore, you should balance the time it takes to update configuration files against the savings in startup or shutdown time.
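One way to reduce the cost of maintaining explicit names is to generate the command text from a single list of TERM names. The following Python sketch is only an illustration and is not part of the original guide: the helper name `build_start_term_command` is hypothetical, and the sketch assumes the command has the same form as the START TERM example above. It produces the text of the command only; it does not run PATHCOM.

```python
def build_start_term_command(term_names):
    """Return a PATHCOM START TERM command listing each TERM explicitly.

    Hypothetical helper: it only formats text in the same form as the
    example in this guide and does not communicate with PATHMON.
    """
    if not term_names:
        raise ValueError("at least one TERM name is required")
    return "START TERM (" + ", ".join(term_names) + ")"


# Keep the TERM names in one list so that a configuration change
# needs only a single update.
terms = [f"TERM{i}" for i in range(1, 7)]
print(build_start_term_command(terms))
# START TERM (TERM1, TERM2, TERM3, TERM4, TERM5, TERM6)
```

Regenerating the command file from one list whenever the configuration changes keeps the explicit-name form without editing each command by hand.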
PAGE 250
Starting and Stopping the System Investigate Product-Specific Techniques process is started in whichever processor is running. Of course, if neither processor is up, the attempt to start the process fails.
PAGE 251
Related Reading Starting and Stopping the System Related Reading For more information about powering on and starting the system, refer to the documentation listed in Table 16-1. Table 16-1. Related Reading for Starting and Stopping a System For Information About Refer to Using SCF, customizing your configuration SCF Reference Manual for G-Series RVUs provides an overall reference for SCF, as well as information on customizing your configuration using command files.
PAGE 253
17 Preventive Maintenance When to Use This Section 17-1 Monitoring Physical Facilities 17-2 Checking Air Temperature and Humidity 17-2 Checking Physical Security 17-2 Maintaining Order and Cleanliness 17-2 Checking Fire-Protection Systems 17-2 Cleaning System Components 17-3 Cleaning an Enclosure 17-3 Cleaning and Maintaining Printers 17-3 Cleaning Tape Drives 17-3 Handling and Storing Cartridge Tapes 17-4 When to Use This Section This section describes routine maintenance tasks required for NonStop S-se
PAGE 254
Monitoring Physical Facilities Preventive Maintenance Monitoring Physical Facilities This subsection explains how to check the physical environment of your computer facility. You might be asked to monitor these aspects of your physical facility: • • • • Air temperature and humidity Physical security Order and cleanliness Fire-protection systems Checking Air Temperature and Humidity Check that the temperature and humidity are at the correct level established by management personnel.
PAGE 255
Cleaning System Components Preventive Maintenance Cleaning System Components This subsection contains basic information about cleaning enclosures, printers, and tape drives. Many companies have service-level agreements with HP that include regular preventive maintenance (PM) of their hardware components. If a Field Service Organization (FSO) representative handles cleaning and other preventive maintenance for your company, you need not be concerned with the cleaning tasks mentioned here.
PAGE 256
Handling and Storing Cartridge Tapes Preventive Maintenance For ordering information, see the operator’s guide shipped with the tape subsystem. Caution. These precautions are extremely important to prevent damage: • • • • • • Do not use cleaner solutions that contain lubricants. Lubricants deposit a film on the tape head and impair performance. Do not use aerosol cleaners, even if they contain isopropyl alcohol.
PAGE 257
A Operational Differences Between Systems Running D-Series and G-Series RVUs Users familiar with systems running D-series RVUs will find several major differences in the operational environment of systems running G-series RVUs. Although many of the operations to be performed remain the same, the tools you use to execute these operations might differ significantly.
PAGE 259
B Tools and Utilities for Operations When to Use This Appendix B-2 BACKCOPY B-2 BACKUP B-2 Disk Compression Program (DCOM) B-2 Disk Space Analysis Program (DSAP) B-2 EMSDIST B-2 Event Management Service Analyzer (EMSA) B-3 File Utility Program (FUP) B-3 Measure B-3 MEDIACOM B-3 NSKCOM and the Kernel-Managed Swap Facility (KMSF) Object Monitoring Facility (OMF) B-3 OSM Package B-4 PATHCOM B-4 PEEK B-4 RESTORE B-4 SPOOLCOM B-4 Subsystem Control Facility (SCF) B-5 HP Tandem Advanced Command Language (TACL) B-5
PAGE 260
Tools and Utilities for Operations When to Use This Appendix When to Use This Appendix This appendix briefly describes the tools and utilities that might be available on your system to assist you in performing the operations tasks for a NonStop S-series server. The use of some of these tools and utilities is discussed throughout this guide. For a list of other documentation that provides detailed information about these tools and utilities, refer to Appendix C, Related Reading.
PAGE 261
Tools and Utilities for Operations Event Management Service Analyzer (EMSA) Event Management Service Analyzer (EMSA) Use the Event Management Service Analyzer (EMSA) to extract specific types of event messages from EMS log files and to create an Enscribe database that you can query to analyze problem trends. File Utility Program (FUP) The File Utility Program (FUP) is a component of the standard software package for the NonStop Kernel operating system.
PAGE 262
Tools and Utilities for Operations OSM Package OSM Package The HP Open System Management (OSM) product replaces TSM as the system management tool of choice for NonStop S-series systems. OSM applications perform all of the same functions that TSM does. However, OSM offers a browser-based interface that improves scalability and performance and overcomes other limitations that exist in TSM. TSM is still supported, but OSM is required to support new functionality in G06.21 and later. For G06.
PAGE 263
Tools and Utilities for Operations Subsystem Control Facility (SCF) Subsystem Control Facility (SCF) SCF configures and manages several subsystems that control system processes and hardware, including communications paths, disks, tapes, terminals, printers, and communications lines. You can run SCF from any workstation or terminal on the system after you are logged on.
PAGE 264
Tools and Utilities for Operations TSM Event Viewer Diagnostic System (TMDS), and the Remote Maintenance Interface (RMI).
PAGE 265
Tools and Utilities for Operations ViewPoint you with the most complete view of the status of your NonStop S-series server and allows you to perform many service operations.
PAGE 266
Tools and Utilities for Operations Windows Event Viewer given resource and the percentage of that resource used. Thus, possible resource contention problems can be detected before they become serious. Viewing the resource allocations across processors on a running system allows you to balance the application load more evenly. It can help you decide when to move user processes to processors and disk files that are less busy or when to relocate partitions to disk volumes that are less busy.
PAGE 267
C Related Reading For more information about tools and utilities used for system operations, refer to the documentation listed in Table C-1. Table C-1. Related Reading for Tools and Utilities (page 1 of 6) Tool Documentation Description BACKCOPY Guardian Disk and Tape Utilities Reference Manual This manual describes these disk and tape utilities: BACKCOPY, BACKUP, DCOM, DSAP, RESTORE, and TAPECOM. This manual supports both G-series and D-series RVUs; TAPECOM is not supported for G-series RVUs.
PAGE 268
Related Reading Table C-1. Related Reading for Tools and Utilities (page 2 of 6) Tool Documentation Description MEDIACOM (continued) Guardian User’s Guide This guide contains information explaining how to perform routine operations relating to the tapes and tape drives on your system. The guide explains the MEDIACOM utility and provides examples for using it.
PAGE 269
Related Reading Table C-1.
PAGE 270
Related Reading Table C-1. Related Reading for Tools and Utilities (page 4 of 6) Tool Documentation Description RESTORE Guardian Disk and Tape Utilities Reference Manual This manual describes these disk and tape utilities: BACKCOPY, BACKUP, DCOM, DSAP, RESTORE, and TAPECOM. This manual supports both G-series and D-series RVUs; TAPECOM is not supported for G-series RVUs.
PAGE 271
Related Reading Table C-1. Related Reading for Tools and Utilities (page 5 of 6) Tool Documentation Description SCF interface to the WAN subsystem WAN Subsystem Configuration and Management Manual This manual describes how to configure a ServerNet wide area network (SWAN) concentrator on a NonStop S-series server. It also describes how to monitor, modify, and control the WAN subsystem. It includes detailed descriptions of the SCF commands used with the WAN subsystem.
PAGE 272
Related Reading Table C-1. Related Reading for Tools and Utilities (page 6 of 6) Tool Documentation Description ViewPoint ViewPoint Manual This manual describes ViewPoint, a multifunction operations console application that allows the management of a network of systems. The manual contains information on installing, configuring, and starting ViewPoint for custom applications. It also describes the concepts underlying ViewPoint operation.
PAGE 273
D Converting Numbers When to Use This Appendix D-1 Overview of Numbering Systems D-2 Binary to Decimal D-3 Octal to Decimal D-4 Hexadecimal to Decimal D-5 Decimal to Binary D-7 Decimal to Octal D-8 Decimal to Hexadecimal D-9 When to Use This Appendix Refer to this appendix if you need to convert numbers from one numbering system to another.
PAGE 274
Overview of Numbering Systems Converting Numbers Overview of Numbering Systems Internally, a computer stores data as a series of off and on values represented symbolically by the binary digits, or bits, 0 and 1, respectively. Because numbers represented as strings of binary 0s and 1s are difficult to read, binary numbers are generally converted into octal, decimal, or hexadecimal form. Table D-1 describes the binary, octal, decimal, and hexadecimal number systems. Table D-1.
PAGE 275
Binary to Decimal Converting Numbers Binary to Decimal To convert a binary number to a decimal number: 1. Starting from the right, multiply the least significant (rightmost) binary digit by the first placeholder value. Moving towards the left, multiply each new binary digit by its corresponding placeholder value until the binary number is exhausted. To establish placeholder values, the first placeholder value (on the far right) is 1.
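The placeholder method can be sketched in Python. This is a minimal illustration and not part of the original guide: the function name `binary_to_decimal` is hypothetical, and the sketch assumes that each placeholder value is twice the one to its right (1, 2, 4, 8, and so on).

```python
def binary_to_decimal(bits: str) -> int:
    """Convert a string of binary digits to a decimal integer."""
    total = 0
    placeholder = 1  # first placeholder value, on the far right
    for digit in reversed(bits):  # start from the rightmost digit
        if digit not in "01":
            raise ValueError(f"not a binary digit: {digit!r}")
        total += int(digit) * placeholder
        placeholder *= 2  # assumed rule: each placeholder doubles moving left
    return total


print(binary_to_decimal("1011"))  # 1*1 + 1*2 + 0*4 + 1*8 = 11
```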
PAGE 276
Octal to Decimal Converting Numbers Octal to Decimal To convert an octal number to a decimal number: 1. Starting from the right, multiply the least significant (rightmost) octal digit by the first placeholder value. Moving towards the left, multiply each new octal digit by its corresponding placeholder value until the octal number is exhausted. To establish placeholder values, the first placeholder value on the far right is 1.
PAGE 277
Hexadecimal to Decimal Converting Numbers Hexadecimal to Decimal To convert a hexadecimal number to a decimal number: 1. Starting from the right, multiply the least significant (rightmost) hexadecimal digit by the first placeholder value. Moving towards the left, multiply each new hexadecimal digit by its corresponding placeholder value until the hexadecimal number is exhausted. To establish placeholder values, the first placeholder value (on the far right) is 1.
PAGE 278
Hexadecimal to Decimal Converting Numbers Figure D-3. Hexadecimal to Decimal Conversion
Placeholder values: ... 4096 256 16 1
Hexadecimal number: ... B A 1 0
0 * 1 = 0
1 * 16 = 16
10 * 256 = 2560
11 * 4096 = 45056
Sum: 47632
1. Take the rightmost hexadecimal digit and multiply it by the rightmost placeholder value. 2. Moving to the left, take the next hexadecimal digit and multiply it by the next placeholder value. Continue to do this until the hexadecimal number has been exhausted.
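The conversion in Figure D-3 can be checked with a short Python sketch. It is an illustration only and not part of the original guide; the names `HEX_DIGITS` and `hex_to_decimal` are hypothetical. The function returns both the sum and the individual products so that they can be compared with the figure.

```python
HEX_DIGITS = "0123456789ABCDEF"  # position in this string = digit value


def hex_to_decimal(text: str):
    """Return (decimal value, list of per-digit products, rightmost first)."""
    partials = []
    placeholder = 1  # rightmost placeholder value
    for ch in reversed(text.upper()):
        value = HEX_DIGITS.index(ch)  # raises ValueError for a non-hex digit
        partials.append(value * placeholder)
        placeholder *= 16  # next placeholder to the left
    return sum(partials), partials


total, partials = hex_to_decimal("BA10")
print(partials)  # [0, 16, 2560, 45056]
print(total)     # 47632
```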
PAGE 279
Decimal to Binary Converting Numbers Decimal to Binary To convert a decimal number to a binary number: 1. Divide the decimal number by 2. The remainder of this first division becomes the least significant (rightmost) digit of the binary value. 2. Divide the quotient from Step 1 by 2, and use the remainder of the next division as the next digit (to the left) of the binary value. Continue to divide each quotient by 2 until the quotient is 0.
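The repeated-division steps can be sketched in Python. This is a minimal illustration and not part of the original guide; the function name `decimal_to_binary` is hypothetical, and the sketch assumes a non-negative whole number.

```python
def decimal_to_binary(number: int) -> str:
    """Convert a non-negative decimal integer to a string of binary digits."""
    if number < 0:
        raise ValueError("this sketch handles non-negative numbers only")
    if number == 0:
        return "0"
    digits = []  # remainders, least significant digit first
    while number > 0:
        number, remainder = divmod(number, 2)  # quotient feeds the next step
        digits.append(str(remainder))
    return "".join(reversed(digits))  # most significant digit on the left


print(decimal_to_binary(11))  # 1011
```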
PAGE 280
Decimal to Octal Converting Numbers Decimal to Octal To convert a decimal number to an octal number: 1. Divide the decimal number by 8. The remainder of this first division becomes the least significant (rightmost) digit of the octal value. 2. Divide the quotient from Step 1 by 8, and use the remainder of the next division as the next digit (to the left) of the octal value. Continue to divide each quotient by 8 until the quotient is 0.
PAGE 281
Decimal to Hexadecimal Converting Numbers Decimal to Hexadecimal To convert a decimal number to a hexadecimal number: 1. Divide the decimal number by 16. The remainder of this first division becomes the least significant (rightmost) digit of the hexadecimal value. If the remainder exceeds 9, convert the 2-digit remainder to its hexadecimal letter equivalent. Use this table for conversion.
Decimal  Hexadecimal
10       A
11       B
12       C
13       D
14       E
15       F
2.
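The division by 16 and the letter substitution can be sketched in Python. This is an illustration only and not part of the original guide; the names `HEX_LETTERS` and `decimal_to_hex` are hypothetical, and the sketch assumes a non-negative whole number. The lookup table holds the same decimal-to-letter pairs as the conversion table in this appendix.

```python
# Remainders above 9 map to letters, as in the conversion table.
HEX_LETTERS = {10: "A", 11: "B", 12: "C", 13: "D", 14: "E", 15: "F"}


def decimal_to_hex(number: int) -> str:
    """Convert a non-negative decimal integer to a hexadecimal string."""
    if number < 0:
        raise ValueError("this sketch handles non-negative numbers only")
    if number == 0:
        return "0"
    digits = []  # least significant digit first
    while number > 0:
        number, remainder = divmod(number, 16)
        digits.append(HEX_LETTERS.get(remainder, str(remainder)))
    return "".join(reversed(digits))


print(decimal_to_hex(47632))  # BA10
```

The value 47632 is the decimal result shown in Figure D-3, so converting it back yields the hexadecimal number BA10 used there.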
PAGE 283
Safety and Compliance Regulatory Compliance Statements The following warning and regulatory compliance statements apply to the products documented by this manual. FCC Compliance This equipment has been tested and found to comply with the limits for a Class A digital device, pursuant to part 15 of the FCC Rules. These limits are designed to provide reasonable protection against harmful interference when the equipment is operated in a commercial environment.
PAGE 284
Regulatory Compliance Statements Safety and Compliance Korea MIC Compliance Taiwan (BSMI) Compliance Japan (VCCI) Compliance This is a Class A product based on the standard of the Voluntary Control Council for Interference by Information Technology Equipment (VCCI). If this equipment is used in a domestic environment, radio disturbance may occur, in which case the user may be required to take corrective actions.
PAGE 285
Regulatory Compliance Statements Safety and Compliance DECLARATION OF CONFORMITY Supplier Name: HP COMPUTER CORPORATION Supplier Address: HP Computer Corporation, NonStop Enterprise Division 10333 Vallco Parkway Cupertino, CA 95014 USA Represented in the EU By: Hewlett Packard EMEA GmbH P.O.
PAGE 286
Consumer Safety Statements Safety and Compliance Consumer Safety Statements Customer Installation and Servicing of Equipment The following statements pertain to safety issues regarding customer installation and servicing of equipment described in this manual. • • Keep door closed for normal operation. Batteries must be disposed of in compliance with local ordinances. Caution.
PAGE 287
Safety and Compliance Consignes de sécurité à l'intention du client Consignes de sécurité à l'intention du client Installation et entretien du système par le client Les consignes de sécurité qui suivent concernent l'installation et l'entretien par le client du système décrit dans le présent manuel. • • Garder la porte fermée pendant le fonctionnement normal du système. Jeter les piles usagées conformément au règlement local en vigueur. Attention.
PAGE 288
Verbraucher-Sicherheitsangaben Safety and Compliance Verbraucher-Sicherheitsangaben Geräteinstallation und -wartung durch den Kunden Die folgenden Angaben betreffen Sicherheitsfragen in Hinsicht auf die Geräteinstallation und -wartung durch den Kunden, wie sie in diesem Handbuch beschrieben werden. • • Tür für normalen Betrieb geschlossen lassen. Batterien müssen in Übereinstimmung mit örtlichen Vorschriften beseitigt werden. Vorsicht.
PAGE 289
Safety and Compliance Declaraciones sobre la seguridad del consumidor Declaraciones sobre la seguridad del consumidor Instalación y servicio al equipo por el consumidor Las siguientes declaraciones tienen que ver con aspectos de seguridad relacionados con la instalación y servicio al equipo por el consumidor, y que se describen en este manual. • • Mantenga la puerta cerrada durante la operación normal del equipo. Las baterías (pilas) deben desecharse cumpliendo con los reglamentos locales. Precaución.
PAGE 290
Forbrugersikkerhedsmeddelelser Safety and Compliance Forbrugersikkerhedsmeddelelser Installation og service af udstyr der udføres af kunden De følgende meddelelser vedrører sikkerheden angående installation og service af udstyr, der udføres af kunden, som beskrives i denne brugerhåndbog. • • Hold lugen lukket under normal drift. Batterierne skal kasseres i overensstemmelse med lokale vedtægter.
PAGE 291
Veiligheidsinstructies voor de consument Safety and Compliance Veiligheidsinstructies voor de consument Installatie en onderhoud van apparatuur door de klant De volgende veiligheidsinstructies betreffen de installatie en het onderhoud door de klant van de in deze handleiding beschreven apparatuur. • • Houd bij normaal bedrijf de deur gesloten. Batterijen moeten overeenkomstig de plaatselijke voorschriften worden weggegooid. Opgelet.
PAGE 292
Käyttöturvaa koskevia huomautuksia Safety and Compliance Käyttöturvaa koskevia huomautuksia Asiakkaan suorittama laiteasennus ja huolto Seuraavat huomautukset koskevat turvallisuusnäkökohtia, jotka asiakkaan täytyy ottaa huomioon tässä käsikirjassa kuvattuja laiteasennuksia ja huoltotoimenpiteitä suoritettaessa. • • Kansi täytyy pitää suljettuna normaalin käytön aikana. Paristot täytyy hävittää paikallisten säädösten mukaisesti. Varoitus.
PAGE 294
Misure precauzionali per i clienti Safety and Compliance Misure precauzionali per i clienti Installazione e manutenzione del sistema da parte del cliente Le seguenti misure precauzionali riguardano l’installazione e la manutenzione da parte del cliente del sistema descritto nel presente manuale. • • Mantenere la porta chiusa durante il funzionamento normale del sistema. Lo smaltimento delle batterie usate deve essere effettuato secondo la normativa locale. Avvertenza.
PAGE 295
Safety and Compliance Informações de segurança para os consumidores Informações de segurança para os consumidores Instalação e manutenção do equipamento pelo cliente As seguintes informações se referem a questões de segurança relacionadas à instalação e manutenção, pelo cliente, do equipamento descrito neste manual. • • Para garantir o funcionamento normal, mantenha a porta fechada. As pilhas usadas devem ser descartadas de acordo com as leis locais. Cuidado.
PAGE 296
Safety and Compliance Informações de segurança para os consumidores Informações de segurança para os consumidores Instalação e manutenção do equipamento pelo cliente As seguintes informações referem-se a questões de segurança relacionadas à instalação e manutenção, pelo cliente, do equipamento descrito neste manual. • • Para garantir o funcionamento normal, mantenha a porta fechada. As pilhas usadas devem ser descartadas de acordo com as leis locais. Cuidado.
PAGE 297
Meddelanden beträffande konsumentsäkerhet Safety and Compliance Meddelanden beträffande konsumentsäkerhet Kundutförd installation och service De följande meddelandena beskriver säkerhetsföreskrifter för kundutförd installation och service av utrustning som beskrivs i denna manual: • • Dörren skall vara stängd under normal drift. Batterier måste kasseras i enlighet med lokala förordningar.
PAGE 301
Index A Asynchronous Terminal Process 6100 (ATP6100) 6-3 ATM 3 ServerNet adapter (ATM3SA) 6-2 ATM3SA 6-2 ATP6100 6-3 B BACKCOPY utility B-2 BACKUP utility backing up configuration and operations files 11-27 description of B-2 Batteries charging 15-3 maintaining 15-3 monitoring 15-3 recharging drained 15-4 Battery ride-through 16-24 Binary number system D-2 Binary to decimal conversion D-3 Bus dumps See Dumps C Cartridge tape, handling and storing 17-4 Cleaning enclosures 17-3 CMI, replaced by SCF A-1 Coll
PAGE 302
Index E Dump Processor-n to Tape dialog box dumping a processor to tape (down system only) 11-17 screen capture 11-19 Dumps completed message 11-10, 11-11 dump file checking with FUP 11-11 compressing 11-21/11-26 submitting to service provider 11-26/11-29 processor to disk 11-10/11-11 processor to tape 11-17/11-21 E E4SA 6-2 EMS Analyzer (EMSA) B-3 EMS event messages, monitoring 4-1/4-3 EMSA B-3 EMSDIST description of B-2 using to monitor EMS event messages 4-2 EMSLOG file 11-28 Enclosures cleaning 17-3
PAGE 303
Index G Freeze (continued) recovery operations for a hardware error freeze 11-8 system freeze 11-15 FRU 2-2 FUP See File Utility Program (FUP) I G Kernel-Managed Swap Facility (KMSF) B-3 KMSF B-3 G4SA 2-10 GESA 6-2 Gigabit Ethernet 4-port ServerNet adapter 2-10 Gigabit Ethernet ServerNet adapter 6-2 Group numbering 2-3, 2-14 Group, in a system 2-1 Guided procedures, OSM 1-13 Guided procedures, TSM 1-13 G-series xv H Halting processors 11-7 See also Processor halts Hang of processor 11-5 of system, re
PAGE 304
Index N Monitoring (continued) processors 11-2/11-6 ServerNet fabrics 12-1/12-4 ServerNet/DA 7-1 tape drives 10-1/10-9 terminals 14-1 MSP 0 or 1 16-13 N NonStop IPX/SPX 6-3 NonStop S7000 processor enclosure 2-7 NonStop S7400 processor enclosure 2-8 NonStop S7x00 xv NonStop Sxx000 xv NonStop Sxx000 processor enclosure 2-8 NonStop TCP/IP 6-3 NSKCOM B-3 Number conversion binary to decimal D-3 decimal to binary D-7 decimal to hexadecimal D-9 decimal to octal D-8 hexadecimal to decimal D-5 octal to decimal D-
PAGE 305
Index R Powering on the system 16-3/16-5 Power-on LED, disk drive 9-2 Power-on push button, locating 2-13 Printers monitoring 14-1 recovery operations for 14-2 Problems, common disk drive 9-11 tape drive 10-9 Processes generic 5-2 I/O 5-2 monitoring 5-3/5-6 recovery operations for 5-6 system 5-2 Processor halts halt code = %nn message 11-5 recovery operations for 11-8 Processor multifunction (PMF) CRU status LEDs 3-30 Processor Status dialog box 11-18 Processors dumps See Dumps freeze See Freeze halt See
PAGE 306
Index T ServerNet fabrics monitoring 12-1/12-4 recovery operations for 12-6 ServerNet switch board 2-10, 8-1 ServerNet/DA, monitoring 7-1 Setting system time 15-5 Slot 2-1 SNAX/APN 6-3 SPOOLCOM B-4 Starting the system 16-6/16-17 Stopping the system 16-18/16-21 Storing cartridge tapes 17-4 Subsystem Control Facility (SCF) See SCF Subsystems displaying configuration of 2-31 Kernel 2-32 SLSA 2-33, 6-2 storage 2-32 TCP/IP 2-31 WAN 2-34, 6-2 Sxx000 xv System organization 2-1 performance 16-2 powering off 16-22
PAGE 307
Index V V Verifying state of TSM connections 1-12 ViewPoint description of B-7 using to monitor EMS event messages 4-3 ViewSys utility 11-4, B-7 W Windows Event Viewer 1-13 Special Characters $SYSTEM, recovery operations for 16-15