HP Performance Agent for NonStop Server Monitoring Guide Part number: 519586-003 Third edition: 04/2012
Legal notices Hewlett-Packard makes no warranty of any kind with regard to this manual, including, but not limited to, the implied warranties of merchantability and fitness for a particular purpose. Hewlett-Packard shall not be held liable for errors contained herein or direct, indirect, special, incidental or consequential damages in connection with the furnishing, performance, or use of this material.
Contents Preface Before You Begin ........................................................................................................................................ 8 Audience ................................................................................................................................................... 8 Goals of Documentation .............................................................................................................................. 8 Organization................
3-7-1 3-7-2 3-8 3-8-1 3-8-2 3-9 3-9-1 3-9-2 3-10 3-10-1 3-10-2 3-11 3-11-1 3-11-2 3-12 3-12-1 3-12-2 3-13 3-13-1 3-13-2 3-14 3-14-1 3-14-2 3-15 3-15-1 3-15-2 3-16 3-16-1 3-16-2 3-17 3-17-1 3-17-2 3-18 3-18-1 3-18-2 3-19 3-19-1 3-19-2 3-20 3-20-1 3-20-2 3-21 3-21-1 3-21-2 3-22 3-22-1 3-22-2 3-23 3-23-1 3-23-2 3-24 3-24-1 3-24-2 3-25 3-25-1 DISKFILE Instance Configuration ............................................................................................. 34 DISKFILE Metrics ........................
4-1-1 4-1-2 Threshold Configuration Macro .............................................................................................. 53 Writing the Threshold Text File ............................................................................................... 54 5 Configuring 5-1 5-2 5-3 5-4 5-5 Loop Detection Introduction .......................................................................................................................... 57 CPU Loop Detection ...............................
System Alert ...................................................................................................................................... 106 Event Alert......................................................................................................................................... 107 Alert Tokens ...................................................................................................................................... 107 EMS Format ............................................
Figures Figure 1 Estimation of Database Size ...............................................................................................................
Preface HP Performance Agent for NonStop (OVNPM), formerly known as HP OpenView NonStop Server Performance Management, is an out-of-the-box solution that provides performance monitoring of HP NonStop servers. The software is a natural extension of HP Operations (formerly known as HP OpenView) and provides you with true end-to-end management of your NonStop environment. The various OVNPM Guides provides information about how to get started and use the product.
Organization Chapter No. Chapter Name Description Chapter 1. OVNPM Monitoring Overview Provides an overview of the OVNPM monitoring. Chapter 2. OVNPM Monitoring Configuration Provides information on how to configuration monitoring parameters and building the configuration on HP NonStop Server. Chapter 3. OVNPM Entities Details the instance configurations and entity configurations for OVNPM domains. Chapter 4.
OVNPM Documentation Map HP Performance Agent for NonStop (OVNPM) provides a set of manuals and online help that help you use the product and understand the concepts underlying the product. This section describes what information is available and where you can find it. Electronic Versions of the Manuals All manuals are available as Adobe Portable Document Format (PDF) files in the documentation directory on the OVNPM product CD-ROM. All manuals are also available in the HP web server directory.
1 OVNPM Monitoring Overview 1-1 Goals OVNPM monitors NonStop server and customers application along with the hardware and system software components. Installation procedure adds most of the information in the OVNPM monitoring configuration. Monitoring user specific applications requires more analysis. This guide provides the information to build a monitoring configuration adapted to the specific need of the production. 1-2 Definitions An instance is a system resource to be measured.
2 OVNPM Monitoring Configuration 2-1 OVNPM Collect and Database Parameters Installation process defines default values for collect and database parameters. This topic provides information for configuring these parameters after installation. This information is optional and is provided for advanced users. In this topic, configuration means OVNPM monitoring configuration. Most of these tasks can be performed using the VCONF utility. For details, refer Appendix B: VCONF Utility.
2-1-1-2 Blacklisted Measurements OVNPM can start up to five measurements. When USERDEF and BASE24DEF entities are declared in the monitoring configuration, OVNPM searches in all Measurements where these USERDEF are bumped. This search operation has a cost attached. The CPU and time cost can be reduced by excluding the measurements from the search scope. To achieve this, user edit BLACKLST file located in configuration subvolume.
System Report: Alphanumeric Data File Time Interval Default Retention Period in Days 32 minutes 7 Report time interval equals Collect interval (30 seconds or 15 Seconds) 3 14
2-2 Building the OVNPM Configuration on HP NonStop Server The OVNPM installation process creates a default configuration for out-of-the-box use. You will have to modify and build the OVNPM configuration in the following scenarios: • Hardware is added or removed • System Software added, removed or upgraded • User Application added, removed or upgraded • Change the monitoring status of any of the existing entity NOTE: Installation process doesn’t know user applications.
• PROGRAM • PROCESS • USERDEF • BASE24DEF • TERMINAL • SOCKETS • SQLPROC • SQLSTMT • TCP/IP The USERCFG file needs to be modified manually to change the monitoring parameters for the existing entities or to add user application information. You can edit the USERCFG file to remove units you do not want to monitor, change trend flag status, or add additional units. Use the following guidelines to edit the file: • Entities are stored in the USERCFG file by domain type.
Refer to Symbolic Path Names for details. You can use the syntax provided above to make the desired changes in the USERCFG file. In the following sample section of the USERCFG file, a new APPLIDEF entity named APPLICATION2 has been added, which will monitor processes $Pra, $PRb and $PRc. - section: APPLIDEF -- Syntax : APPLIDEF [,TREND=] -- =<$Process1>+<$Process2>+.. -- .processes=<$Process3> <$Process4>+..
2-2-3 Step 3: Building the Configuration Use the following commands to build the OVNPM configuration using the modified USERCFG and USERMT files. 1. 2. Change the subvolume to the OVNPM TACL subvolume. Load the OVNPM environment using the following command. run visload 3. Build the OVNPM configuration. sv_build_config 4. If the USERCFG file contains error, the build configuration fails. Syntax errors are displayed with line number. Also, duplicated instances are not allowed.
2-3 Symbolic Path Name SPN or the Symbolic Path Name provides OSS pathname support for PROGRAM, FILESIZE, and DISKFILE domains. SPN are declared in the USERCFG file. 2-3-1 Symbolic Path Name Declaration A symbolic pathname ID or the SPNID is a unique identifier that defines and refers an OSSPATH. For example, 0web, my_app, and 2004data are valid SPNIDs. The USERCFG syntax to declare a SPN is: SPN [] Where: is an eight character unique identifier.
2-3-2 Instance Name Using the SPN Instance name identifies an instance. The instance name length must be lower than 32 bytes. Instance name for in domain DISKFILE, PROGRAM and FILESIZE can use the OSS pathname if only the pathname length is lower than 32 bytes. If length is higher than 32 bytes the SPN must be used. The USERCFG Syntax of an instance using an SPN is: [spnId] [,trend=on|off] Where: domainName is one of FILESIZE, DISKFILE, and PROGRAM domains.
2-4 Monitoring User Application A user application is a set of instances belong domain PROCESS, PROGRAM, APPLIFILE, APPLIDEF, USERDEF and BASE24DEF. To monitor user application, its components (that is instances) must be found and added in the USERCFG file. In OVNPM version 1.2, this operation must be done manually. • Step 1: Get Instances Related Processes In the first step, we must locate the subvolume or the directory that contains the executable belonging the application.
ITERATION 0.94 Accum 1 BUSY 0.02 % Busy 0 PROCESSSTATES 0.14 Accum 0 PROCESSSTATES 0.15 Accum 1 To monitor those counters in OVNPM, we must locate and use: The process name in red The counter name in blue The counter index And the counter type to define the OVNPM “unit”. So Meascom output becomes in USERCFG file: userdef $vq0A.iteration.000.nb , TREND=ON userdef $vq0A.iteration.001.nb , TREND=ON userdef $vq0A.busy.000.% , TREND=ON Notice that accum unit “nb” gives value of the counter.
sv_discover_application This function should be run when most programs from an application are in the same subvolume. The application should be running. Syntax: sv_discover_application
sv_discover_userdef This function uses the Measure Counter File (MCF) to build the OVNPM Userdef instances. The Measure Counter File is the Meascom input file to define and start a USERDEF measurement. NOTE: sv_discover_userdef supports only MCF that uses process names. OSS and guardian program name are not supported. Syntax: sv_discover_userdef Where, : Measure Counter File that contains the definition of the userdef counter to start a Measurement.
3 OVNPM Entities This section explains how to configure instances in the USERCFG file. 3-1 APPLIDEF 3-1-1 APPLIDEF Instances Configuration The APPLIDEF domain is made to monitor a set of processes as an application. Domain APPLIDEF DescriptionDomain APPLIDEF represents application software that is executing on the system The application software is made of a maximum of 80 named processes. Syntax APPLIDEF ~[,TREND= [.process[es]]=<$Process1>{+<$Process2>} {[.
Description Sum-Accel-Busy %Percentage of time that the CPU was busy executing accelerated code for the APPLIDEF processes. Sum-Checkpoints.nb/s Number of checkpoints per second executed by the APPLIDEF processes. Sum-Comp-Traps.nb/s Number of times per second a compatibility trap occurred on the processor for the APPLIDEF processes. Sum-CPU-Busy %Percentage of time that the APPLIDEF processes were executing in the CPU. Sum-Dispatches.
3-2 APPLIFILE The APPLIFILE domain offers the capability to monitor logical I/Os made by a set of up to 32 individual processes on a particular data file (local or remote). A set of processes and file name defining an APPLIFILE unit, (like all units) is configured in the USERCFG file while the APPLIFILE metrics (like all metrics) are configured in the USERMT file.
Metric Description Access-Open.nb Number of opened sessions used by the processes belonging to the APPLIFILE set that access the file. Access-Process.nb Number of processes belonging to the set of processes that access the file. Allocated-Extent.nb Number of extents allocated for the file. For partitioned file, this number is returned for the partition with the smallest Free-Extent.nb. NOTE: This metric is not available if a remote file is defined in the applifile Delete-Writeread.
3-3 BASE24DEF 3-3-1 BASE24DEF Instance Configuration Domain BASE24DEF DescriptionDomain BASE24DEF represents the features of a user application Syntax BASE24DEF <$Process>...
Description Avrg-Mem-QLen Average number current processes waiting for page fault to be serviced. Avrg-Process-Load Average CPU-Load value of an active process on the current processor. Busy-Memory Percentage of Memory on the CPU that is not available for use by applications. Cache-Hits.nb/s Number of times per second that required block was found in cache during an I/O operation. Comp-Trap.nb/s Number of times per second compatibility trap occurred on the processor. Cpu-Busy.
Metric Description Process-Dispatches.nb/s Number of times per second that process dispatching occurred during the sample time Ptle-Curr.nb The number of used Process Time List Elements Ptle-Free.nb The number of free Process Time List Elements Send-Busy.% Percentage of time the CPU spent sending data to other CPUs. Swappable-Memory.% Percentage of swappable memory on the CPU. Term-Response.% Percentage of time that terminal processes on the CPU spent on terminal responses Tle-Curr.
Description Audit-Buf-Force-4096.nb Number of times that the disk process had to write to the audit trail volume before it could write a dirty 4096 cache block and thereby free that block for another use. Audit-Buf-Force-512.nb Number of times that the disk process had to write to the audit trail volume before it could write a dirty 512 cache block and thereby free that block for another use Avrg-Cache-Hits.% Average of the metric Cache-512-Hits.%, Cache-1024-Hits.%, Cache-2048-Hits.
Metric Description Write-Qbusy.% Percentage of time spent by write requests queued to the disk. This metric available only on G-Series replaces the metric Write-Busy.% on D-Series Write-Rate.Kb/s Total number of kilobytes written during the sampling period.
3-7 DISKFILE The DISKFILE domain represents the activity on a disk file. For D-series and G-series running Measure G07 and previous versions, an asterisk (*) indicates that metrics are not collected every 15 or 30 seconds, but every 1, 2, 4, 8, 16 or 32 minutes based on the IoCollectInterval=n parameter in the DCPRM file. 3-7-1 DISKFILE Instance Configuration Domain DISKFILE DescriptionDomain DISKFILE represents the activity on a disk file. Syntax DISKFILE <$Volume>..
Metric Description Read.nb (*) Number of read operations including cache hits. Requests-Blocked.nb (*) Number of times an I/O operation had to wait because the requested file or record was locked. Requests.nb (*) Number of requests for this file. sql-deletes.nb Number of row delete operations performed on an SQL table during last collect interval [meas] sql-inserts.nb Number of row insert operations performed on an SQL table during last collect interval [meas]. sql-updates.
Metric Description Disc-Ext-Warn.nb Number of times that the size of the next allocated extent (primary or secondary extent) of the file is larger than the largest extent size of the disk. • For a single file This number can be equal to 0 or 1 (0 means the file will not have a problem for the next extent allocation, 1 means that the next extent allocation will fail) • For a partitioned file This number may be greater than 1 due to multiple secondary partitions.
3-10 LINE The LINE domain represents communication lines that are attached to the system. 3-10-1 LINE Instance Configuration Domain LINE DescriptionDomain LINE represents the individual communication lines that are attached to the system Syntax LINE <$Device> [,TREND=] Example LINE $X25A Metric Description Cpu-Busy.% Percentage of CPU time used by the line handler for the current line. IO.nb/s Number of I/Os per second during a sampling period. IO-Data.
3-11 NETLINE The NETLINE domain represents communication lines that are being used by EXPAND, HP NonStop’s proprietary networking system. 3-11-1 NETLINE Instance Configuration Domain NETLINE DescriptionDomain NETLINE represents the individual EXPAND communication lines. Syntax NETLINE <$Device> [,TREND=] Example NETLINE $PATHSQ 3-11-2 NETLINE Metrics Metric Description Cpu-Busy.% Percentage of CPU time used by the netline handler for the current netline. IO.
Description Avrg-Disc-Service-Time.% Percentage of average disc service time (Disc-Busy). Avrg-Line-IO-Rate.Kb/s Average number of Kilobytes per second exchanged on a line. Avrg-NLine-IO-Rate.Kb/s Average number of Kilobytes per second exchanged on a Netline. Avrg-Process-Load Average CPU-Load of an active process. Comp-Trap.nb/s Number of times a compatibility trap occurred on all CPUs for the node during the time interval, presented as a single count. Cpu-Busy.
3-13 PATHWAY The PATHWAY domain represents the individual PATHWAY applications that are executing on the HP NonStop host. 3-13-1 PATHWAY Instance Configuration Domain PATHWAY DescriptionDomain PATHWAY represents the individual PATHWAY applications Syntax PATHWAY <$PathmonName> [,TREND=] Example PATHWAY $ZVPT 3-13-2 PATHWAY Metrics Metric Description Spi-Run.nb Number of SPI sessions currently executing against this PATHWAY system. SvrClass-Fpend.
3-14 PROCESS The Process domain represents application software that is executing on the system. 3-14-1 PROCESS Instance Configuration Domain PROCESS DescriptionDomain PROCESS represents application software that is executing on the syste Syntax PROCESS <$ProcessName> [,TREND=] Example PROCESS $CMON Metric Description Accel-Busy.% Percentage of time that the CPU was busy executing accelerated code for the current process. Busy-Ready.% Ratio between Ready-Time and CPU-Busy.
Metric Description ossns-wait-time.% Amount of time the process waits on requests to all OSS name servers during last collect interval [meas]. sent-qlen.nb Average sent queue length of the process during last collect interval. Max-Lcbs-Inuse.nb Maximum number of Link Control Blocks allocated to the process. Mem-Page.nb Number of memory pages that have been swapped in by the current process. Mem-QTime.% Percentage of time that the process was on the ready list and waiting on page fault. Msg-Rcvd.
3-15 PROGRAM The PROGRAM domain represents the activity of Processes executed by a program file. The process-activity is summarized from a user point-of-view. 3-15-1 PROGRAM Instance Configuration Domain PROGRAM DescriptionDomain PROGRAM represents the activity of processes executing a program file. Syntax PROGRAM<$Volume>.. [,TREND=] Example PROGRAM $APPLI.PATHWAY.SERVER1 PROGRAM $SYSTEM.SYS00.TACL 3-15-2 PROGRAM Metrics Metric Description Active-Process.
Domain SERVERNET Syntax SERVERNET [,TREND= Where ServernetName is: SAC name for LAN device (class NIOC) SERVERNET [,TREND= Where ServernetName is: SAC name for LAN device (class NIOC). SAC name for SCSI device (class SCSI). IPC name for process linker listner protocol (class IPC). SCF command “INFO SAC $ZZLAN.*” lists SAC names of LAN devices on the system. OVNPM name format for a SCSI SAC is: PMF.SAC-...
Metric Description Write-QLen-Max The maximum number of requests on the write request queue. Write-QTime.% The total time spent by write requests queued for this entity. 3-17 SOCKETS 3-17-1 SOCKETS Instances Configuration A Sockets instance provides the data traffic at the IP addresses and ports level. Basically, a sockets instance monitors a group of sockets (i.e.
Metric Description Recv-Qlen.nb Average receive queue length for all inet belong a sockets instance Writeread.nb/s Disabled, not collected, for future use. Messages.nb/s Number of message sent for all inet:socket belong a SOCKETS instance Read-Rate.KB/s Data read or received for all inet/socket belong a SOCKETS instance. Write-Rate.KB/s Data write or send for all inet/socket belong the SOCKETS instance. Io-Rate.KB/s Total data transferred (sum Write-Rate.KB/s and ead-Rate.
3-19 SQLSTMT The SQLSTMT entity type provides information about all SQL statements within a SQL process. 3-19-1 SQLSTMT Instance Configuration Domain SQLSTMT Description The SQLPROC entity type provides information about one or more SQL processes. Syntax SQLSTMT .[.#] Where: is the sql process name. is the procedure in that process executing sql statement. is the index of the statement to be measured. If omitted, OVNPM uses index 0.
3-20 TCP/IP The TCP/IP entity type provides TCP/IP statistics about the transport provider, UDP, and subnet traffic. NOTE: All providers can be added in the monitoring configuration. However, the TCP/IP domains do not collect data for the CIPSAM provider. 3-20-1 TCP/IP Instance Configuration Domain TCP/IP Description The TCPIP entity type provides TCPIP statistics.
3-20-2 TCPIP Metrics Metric Description Con_in.nb Number of connections received per second. Con_Out.nb Number of connections initiated by local host per second. Con_estab.nb Number of connections initiated by local and remote hosts per second. Con_Closed.nb Number of connections closed per second. Pkt_snd.nb/s Number of data and control packets sent per second. Pkt_rcv.nb/s Number of data and control packets received per second. Byte_snd.kb/s Number of bytes sent per second. Bytes_rcv.
3-22 TMF The TMF domain represents global activities associated with Transaction Management, such as Transaction-Rate. 3-22-1 TMF Instance Configuration Domain TMF DescriptionDomain TMF represents global activities associated with Transaction Management Syntax TMF [,TREND=] Example TMF 00 3-22-2 TMF Metrics Metric Description Aborted-Trans.tps Number of aborted transactions per second that involved this system. Homenet-Trans.
3-23-2 USER Metrics Metric Description Activ-Process.nb Number of processes with this USER-ID that had CPU-activity during the sample period. Cpu-Busy.% Percentage of time that processes with this USER-ID were executing in this node. Mem-Page.nb Number of memory pages that have been swapped in by the processes belonging to the group and are still resident Msg-Rcvd.nb Number of messages received by processes belonging to the user Msg-Sent.
3-25 SPN 3-25-1 SPN Instance Configuration Domain SPN DescriptionDomain SPN (Symbolic Path Name) is a way to identify the OSS path of a file Syntax 1. SPN [] Where: is an eight character unique identifier is the path name for an OSS entity like FILE or DIRECTORY. It must start by character “/”. If it contains space it must be quoted. The maximum length can be 248 characters.
4 Threshold Configuration This section how to manage threshold with the command line tools. 4-1 Threshold Command Line Interface OVNPM provides the macro written to address the need for small and efficient command line utilities to export threshold in text file or import thresholds from a text file. In the OVNPM environment, the thresholds are defined using the Display Agent GUI. Offering a command line configuration tool allows the user to build thresholds from a TACL script.
4-1-2 Writing the Threshold Text File This section explains the syntax to write a threshold text file. To understand the threshold syntax, export threshold configurations and compare the text file with the thresholds in the Display Agent Alert Configuration window. The threshold text file defines two kinds of objects: the Group and the Threshold itself.
is one of following comparison operator: ">" | “>=” ,” =” “<>”, “<=” and “<”. is digit number followed with optional sign “/”.
severity high; Sample 3: In the following sample, the inclusion and exclusion time define the monitoring period from 8h00 to 12h00 and from 14h00 to 17h00, respectively. threshold DISC[$SYSTEM].Cpu-Busy.
5-1 Introduction This feature detects and reports any process using excessive CPU or memory resources, conditions that typically indicate a runaway program loop. Runaway Loop Detection operates as part of the SystemAlert module. Runaway memory loops and CPU loops are two of the most severe problems that may occur during the life of a process. Ordinarily, these conditions are very difficult to detect. OVNPM can detect these situations and immediately reports them to avoid dramatic system degradation.
• The process acquired more incremental memory than acceptable during 3 collection intervals (MemIncNb threshold > 2) so OVNPM reports the process as a runaway loop with the event "MemLoop". • 13:32:30 this same process uses 3900 memory pages • This process continued to acquire more incremental memory than its threshold specifies during 5 collects (>2), so OVNPM continues to report the process with the event "Mem-Loop".
Description SendOVNPMAlertOut0• • • • Send the loop warning to output device0 [1 = Yes, 0 = No] Default = 1 SendOVNPMAlertOut1 • • • Send the loop warning to output device0 [1 = Yes, 0 = No] Default = 1 SendOVNPMAlertOut2• • • • Send the loop warning to output device2 [1 = Yes, 0 = No] Default = 0 SendOVNPMAlertOut3• • • • Send the loop warning to output device3 [1 = Yes, 0 = No] Default = 0 Configuring Loop Detection Parameter NOTE: According to the OS Version or Series, values of MemIncStep
6 Configuring File Scanning 6-1 Introduction The File Scanning module enables a large number of files to be monitored for size, and file presence in real time, without impacting the OVNPM domain size limit. The File Scanning module scans all the files defined in the [FileScan] section of the configuration file XCFGFIL, examines the full.% and the free-extent.nb metrics of each file, and sends an alert if the full.% or the free-extent.nb metrics reach the defined threshold.
6-3-2 FileScan Section The [FileScan] section defines the files that are monitored for size and extent, and any exceptions to the default thresholds. Files can be defined individually and as entire subvolume content. In the second case, it defines the files to be excluded from monitoring. Syntax: File|Subvol[:[Full.% Threshold][:[FreeExtent.nb Threshold]]] The ":" extension will override the default value. The specific threshold defined will override the default threshold $dev.
6-3-3 FileAvailability Section The [FileAvailability] section defines the files that must be present on the system. The files must be defined individually. OVNPM sends an alert each time that it detects that a file is missing. The following is an example of a file that generates an alert if the file HOSTS is missing: $SYSTEM.ZTCPIP.HOSTS syntax: <$vol.subvol.file> ---------------------------------------------------------------$system.ztcpip.hosts $system.ztcpip.
$system.sys??.zzsa*: > 3 $data07.pmexe.zzsa????: > 0 $data07.pmexe.zzs*: > 0 $OSS.KSAO33.zzsa*: > 3 $OSS.KSAO33.*com: < 2 Pattern Definition Valid file patterns start with $, which are followed by a colon. The file pattern can contain the asterisk symbol (*) and the question mark (?). When a file pattern represents a subvolume or a volume, you can use the subvolume or volume name. NOTE: To avoid excessive CPU usage, avoid pattern such as $*.*.*.
6-4 Editing the File Scanning Configuration To edit the file XCFGFIL, enter one of the following commands: SV_EDIT_XCFGFIL or SV_TEDIT_XCFGFIL If the configuration file XCFGFIL is modified, the next scan will automatically retrieve the contents of the new configuration file. The configuration file can contain comments. These comments start with the character # and finish at the end of the line.
7 Configuring Availability This chapter explains the syntax and output devices to which messages are sent. 7-1 Overview OVNPM can check availability of applications or devices, such as disks and lines. Each time a device or a process belonging to an application changes status, an alert message is sent to one of the following means of display: • To four output devices configured in the file VISENV (^alert^out0 to ^alert^out3 environmental variables).
-- APPLI or APPLICATION -- -- .comment= -- -- .process=<$Process1> <$Process2> ... -- -------------------------------------------------------. . --{section QA_OVNPM_application discovery. APPLI QA_OVNPM .process=$VVPM $VVDS $VVTCP $VVB00 $VVSN $VVSNI $VVTTS $VVX00 .process=$VVCM $VVDBS $VVB01 $VVXF $VVDBR $VVTRD --}section QA_OVNPM_application discovery. 75 RECORDS TRANSFERRED \VEGAS $DATTD1 TMCFG:v 25> sv_new_appcfg * notify new Application Configuration..
Appendix A OVNPM Metrics This section describes the various OVNPM metrics like SystemInsight, SystemTrend, SystemReport, SystemCloseup and SystemAccounting and the associated domains. Metrics OVNPM provides real-time monitoring and trend information on a number of HP NonStop system components. Data counters used to store the information for the components are called Metrics. Domains The metrics for the system are concatenated into a logical group called a Domain.
Domain NODE Node view contains a maximum of 16 lines (one for each CPU). Each line describes features of one CPU and the four processes that use the most CPU Time. Metric Description %Cpu-Busy Percentage of time the CPU was busy during the sample time.
Domain CPU SystemCloseup supplies metrics on CPU and on all processes on this CPU using more than 0.1% of CPU Time. Metrics on the current CPU: Metric Description AcclBusyTime.% Percentage of time that the CPU was busy executing accelerated code CacheHits.nb.s Number of times the required block was found in cache during I/O operations. CollectTime.s Duration of collect. CompTraps.nb.s Number of times a compatibility trap occurred on the processor.
Metric Description LcbMx (nb) (MaxLcbsInuse) Maximum Link Control Block on the queue. LcbUs (Qt.s) (LcbsInuseQtime) Time that Link Control Blocks have been allocated. MemQt (%%) (MemQtime) Time the process was on the ready list and waiting on a page fault. MRcvd (nbs10) (MessagesReceived) Number of messages received by the process. MSent (nbs10) (MessagesSent) Number of messages sent by the process. PgFlt (nbs10) (PageFaults) Number of page faults generated by the process.
Metric Description Read (KBytes) Number of KBytes read from the disk. ReadBusy (% x100) Time in percentage spent reading from the disk. Reads (nbs x100) Number of read operations performed by the disk process, in number per second with a scale equal to 100. Reqst (nbs x100) Number of I/O requests (read, write, FILEINFO) received by the disk process. ReqstQLenMx (nb) Maximum number of items on the Request queue.
Metric Description Filename The name of the active file LkBounc (nbs x100) Number bounced locks (FELOCKED=error 73) returned to File System. LkTmO (nbs x100) Number of time-outs on locks (error 40). LkWtm (% x10) Percentage of time in microseconds spent waiting for locks. MxLkT (% x10) Maximum wait time per lock. OpenQLenMax Maximum number of opens on this file. OpenQTime (%) Percentage of time in microseconds for all the opens of this file (including transient).
Metric Description Escal (Escalations) Number of times a lock escalated to a file-level lock FileBusyTime (ms) Time spent executing waited I/O requests. LockWaits Number of times a call waited for a lock request. Msg Number of messages sent. Msg (KByte) Number of message bytes sent and received. Occur Number logical MEASURE records for the current items Opener Process Opener process name Pid CPU, Pin of the opener process Program Filename Object filename executed by the opener process.
Metrics on PROCESSES that execute the current file (for file code 100 only): Metric Description AccBuTm (%%) (AccelBusyTime) Time that CPU was busy executing accelerated code. ChkPt (nbs10) (Checkpoints) Number of checkpoints. ComTr (nbs10) (CompTraps) Number of compatibility trap occurring during execution. CpuBu (%%) (CpuBusyTime) Time that the CPU spent executing the process. Dispa (nb.s) (Dispatches).
Domain PROCESS SystemCloseup provides metrics on the current PROCESS, on processes that have opened the current PROCESS, on processes opened by the current PROCESS, and on other units opened by the current process. For an SQL process, it supplies information related to SQL activity on the process (SQLPROC) and within the process (SQLSTMT). Metric Description Checkpoints.nb. Number of checkpoints CollectTime.s. Duration of collect CpuBusyTime.
Metrics on opener PROCESSES: Metric Description Escal (Escalations) Number of times a lock escalated to a file-level lock. FileBusyTime (ms) Time spent executing waited I/O requests. LockWaits Number of times a call waited for a lock reques. Msg Number of messages sent. Msg (Kbytes) Number of message bytes sent and received. Occur Number logical MEASURE records for the current items. Opener Process Opener process name.
Metric Description FileBusyTime (ms) Time spent executing waited I/O requests FileCode File code of the file. Full (%) Percentage of EOF according to the maximum file size. InfoCalls Number of FILEINFO and FILERECINFO operations made by the current process. LastModification Last modification date of the file LockWaits Number of times a call waited for a lock request. MaxExt (Maximu Extents number) Maximum extents number configured.
Metric Description TmOutCancl (TimeoutsOrCancels) Number of time-outs or cancels issued. UpdOrRepli Number of update or reply operations made by the current process. Writs (Writes) Number of write operations. WrRds (WriteReads) Number of write/read operations. Metrics on SQL process: Metric Description ObjRecomp Number of objects recompiled for the process. ObjRecompTime Percentage of time the process spent recompiling objects. StmtRecomp Number of statements recompiled for the process.
SystemAccounting SystemAccounting metrics return monthly DISKs and SPOOLERs statistics based on the information collected respectively by the programs XCDISC and XCSPOOL started daily at the configured time.
Appendix B VCONF Utility This section lists all the steps, in chronological order, to be followed to run the OVNPM Configuration Utility, VCONF It explains how to choose a default editor, set OVNPM environmental variables, change database retention parameters and edit the list of collected metrics. It describes how to enter the list of collected devices and product license information, how to build the configuration and to view an estimate of the database size.
Sample Configuration Utility Menu -------------------------------------------------------------------- OVNPM 1.5 configuration utility. -- ------------------------------------------------------------------1 Choose EDIT or TEDIT editor Current is edit. 2 Edit OVNPM environment edit $DATA07.OVPMTACL.VISENV 3 Configure distributed servers dsv and dsn 4 Edit collection parameters edit $DATA07.OVPMCFG.dcprm 5 Edit database retention parameters 6 Edit list of collected metrics edit $DATA07.
Editing the OVNPM Environment (Optional) OVNPM environmental variables are used to specify locations of program, data, and configuration files. They also store other operating information such as the name, priority, and launching CPU of the processor. These variables are created and initialized by the OVNPM TACL macro named VGEN. These variables are stored in the VISLOAD file located in the OVNPM TACL subvolume. To edit the environment, select the Edit OVNPM environment option by typing 2 as your choice.
Editing Database Retention Parameters (Optional) Collected data is stored in the OVNPM database. Each collection interval is kept in a separate data file. There is a separate alphanumeric data file for SystemReport values. Each collection interval can have a different retention period. The retention period is the number of days that data is kept in the file. NOTE: If this is your first OVNPM installation, accept the default settings.
Preparing an Initial List of Units The first time you use this dialog box, enter A (Automatic). OVNPM scans your system, builds an initial list of physical units, creates a USERCFG file, and saves the list in the file. This routine is normally used only once. If you select the automatic option and an existing USERCFG file is found, you are asked if you want to continue. If you answer yes, the existing file is renamed and a new USERCFG file is created excluding any previous changes.
Entering Accounting Parameters This procedure schedules OVNPM disk and spooler accounting data collection. You must schedule data capture capability for disk and/or spooler in order to run these specific usage reports. Data capture is performed by batch programs, which are run at a time you specify. In this procedure, you schedule the time data collection to execute each day as well as additional run-time controls. You also can modify the list of disks and spoolers for which data is captured.
Starting OVNPM Subsystem Use this function to start an OVNPM subsystem. If any problems occurred during the installation process, OVNPM displays the errors. When possible, it suggests possible causes of the error. If you have not yet installed a copy of the OVNPM Display Agent, install it now. Use the Display Agent to exercise all the monitoring and reporting functions of OVNPM. Viewing OVNPM Status Use this function to view the status of the OVNPM subsystem.
Appendix C USERMT File Appendix C This is an example of a USERMT file. ----------------------OVNPM Metrics configuration file V7.0a-------NODE: Accel-Busy.% , Collect = ON , Trend = OFF NODE: Activ-Disc-Process.nb , Collect = ON , Trend = OFF NODE: Activ-Line-Hdler.nb , Collect = ON , Trend = OFF NODE: Activ-NetLine-Hdler.nb , Collect = ON , Trend = OFF NODE: Activ-Process.nb , Collect = ON , Trend = ON NODE: Active-Cpu.nb , Collect = ON , Trend = ON NODE: Avrg-Cache-Hits.
NODE: NetLine-Cpu-Busy.% , Collect = ON , Trend = ON NODE: Netline-Io.nb/s , Collect = ON , Trend = OFF NODE: Send-Busy.% , Collect = ON , Trend = ON NODE: Tns-Busy.% , Collect = OFF, Trend = OFF NODE: Tnsr-Busy.% , Collect = OFF, Trend = OFF NODE: Trans-Total.tps , Collect = ON , Trend = ON CPU: Cpu-Busy.% , Collect = ON , Trend = ON CPU: Cpu-Load , Collect = ON , Trend = ON CPU: Cpu-Overhead.% , Collect = ON , Trend = ON CPU: Cpu-QLen-Max.
Disc-Cpu-Busy.% , Collect = ON , Trend = ON CPU: Disc-Io.nb/s , Collect = ON , Trend = ON CPU: Free-Memory.% , Collect = ON , Trend = ON CPU: High-Pcb-Curr.nb , Collect = ON , Trend = OFF CPU: High-Pcb-Free.nb , Collect = ON , Trend = OFF NODE: Lowest-Cpu-Busy.% , Collect = ON , Trend = ON NODE: Lowest-Memory-Free.% , Collect = ON , Trend = ON NODE: Memory-Free.% , Collect = ON , Trend = ON NODE: Memory-Swap.nb/s , Collect = ON , Trend = ON NODE: Memory-Wait.
NODE: Memory-Wait.us , Collect = ON , Trend = OFF NODE: NetLine-Cpu-Busy.% , Collect = ON , Trend = ON NODE: Netline-Io.nb/s , Collect = ON , Trend = OFF NODE: Send-Busy.% , Collect = ON , Trend = ON NODE: Tns-Busy.% , Collect = OFF, Trend = OFF NODE: Tnsr-Busy.% , Collect = OFF, Trend = OFF NODE: Trans-Total.tps , Collect = ON , Trend = ON CPU: Accel-busy.% , Collect = ON , Trend = OFF CPU: Activ-Disc-Process.nb , Collect = ON , Trend = OFF CPU: Activ-Line-Hdler.
Intr-Busy.% , Collect = ON , Trend = ON CPU: Line-Cpu-Busy.% , Collect = ON , Trend = ON CPU: Low-Pcb-Curr.nb , Collect = ON , Trend = OFF CPU: Low-Pcb-Free.nb , Collect = ON , Trend = OFF CPU: Mem-Queue-Len.nb , Collect = ON , Trend = OFF CPU: Memory-Swap.nb/s , Collect = ON , Trend = ON CPU: Memory-Wait.us , Collect = ON , Trend = OFF CPU: NetLine-Cpu-Busy.% , Collect = ON , Trend = ON CPU: Page-Fault.nb/s , Collect = ON , Trend = ON CPU: Process-Dispatches.
DISC: Io-Busy.% , Collect = ON , Trend = OFF DISC: Io-Len.B , Collect = ON , Trend = OFF DISC: Io-Long-Time.ms , Collect = ON , Trend = OFF DISC: Io-Rate.Kb/s , Collect = ON , Trend = OFF DISC: Io.nb/s , Collect = ON , Trend = ON DISC: Read-Busy.% , Collect = ON , Trend = OFF DISC: Read-Len.B , Collect = ON , Trend = OFF DISC: Read-Long-Time.ms , Collect = ON , Trend = OFF DISC: Read-Rate.Kb/s , Collect = ON , Trend = OFF DISC: Read.
Eof.KB , Collect = ON , Trend = OFF FILESIZE: Free-Extent.nb , Collect = ON , Trend = ON FILESIZE: Full.% , Collect = ON , Trend = ON FILESIZE: Disc-Ext-Warn.nb , Collect = ON , Trend = OFF PROGRAM: Active-Process.nb , Collect = ON , Trend = ON PROGRAM: Mem-Page.pg , Collect = ON , Trend = ON PROGRAM: Msg-Rcvd.nb , Collect = ON , Trend = ON PROGRAM: Msg-Sent.nb , Collect = ON , Trend = ON PROGRAM: Page-Fault.nb , Collect = ON , Trend = ON PROGRAM: Rcv-Qlen.
NETLINE: Read.nb/s , Collect = ON , Trend = OFF NETLINE: Write.nb/s , Collect = ON , Trend = OFF TERMINAL: Io-Rate.b/s , Collect = ON , Trend = ON TERMINAL: Query-Resp-Time.s , Collect = ON , Trend = OFF TERMINAL: Query.nb/s , Collect = ON , Trend = OFF USER: Activ-Process.nb , Collect = ON , Trend = ON USER: Cpu-Busy.% , Collect = ON , Trend = ON USER: Mem-Page.nb , Collect = ON , Trend = OFF USER: Msg-Rcvd.nb , Collect = ON , Trend = OFF USER: Msg-Sent.
oss-tty-read-rate.kb/s , Collect = ON, Trend=OFF PROCESS: oss-tty-write-rate.kb/s , Collect = ON, Trend=OFF PROCESS: ossns-message-rate.kb/s , Collect = ON, Trend=OFF PROCESS: oss-tty-reads.nb , Collect = ON, Trend=OFF PROCESS: oss-tty-write.nb , Collect = ON, Trend=OFF PROCESS: ossns-dd-calls.nb , Collect = ON, Trend=OFF PROCESS: ossns-requests.nb , Collect = ON, Trend=OFF PROCESS: ossns-redirects.nb , Collect = ON, Trend=OFF PROCESS: launches.
APPLIDEF: Avrg-Ready-Time.% , Collect = ON , Trend = OFF APPLIDEF: Avrg-Tns-Busy.% , Collect = OFF, Trend = OFF APPLIDEF: Sum-Accel-Busy.% , Collect = OFF, Trend = OFF APPLIDEF: Sum-Checkpoints.nb/s , Collect = ON , Trend = OFF APPLIDEF: Sum-Comp-Trap.nb/s , Collect = OFF, Trend = OFF APPLIDEF: Sum-Cpu-Busy.% , Collect = ON , Trend = ON APPLIDEF: Sum-Dispatches.nb/s , Collect = ON , Trend = OFF APPLIDEF: Sum-Ext-Segs-Max.nb , Collect = OFF, Trend = OFF APPLIDEF: Sum-Io-Rate.
USERDEF: User-Counter BASE24DEF: Base24-Counter , Collect = ON , Trend = ON , Collect = ON , Trend = ON Appendix C NOTE: Do not remove a line to set a metric to OFF, change the Collect flag to OFF instead.
Appendix D DCPRM File OVNPM supplies a file named DCPRM that contains parameters to customize and tune OVNPM to match your needs. Display Agent ShowAllInterval When set to 0 (default value), the Display Agent shows only the intervals that are saved to the database. (See Insight retention period). Data Collection Parameter Description CollectInterval SystemInsight and SystemReport collections use this parameter. It represents the collection metrics interval.
Description MeasExtMaxNb The maximum of extents used by a measurement file. The measurement files are located in LOG subvolume. The maximum eof of a measurement file is equal to MeasExtNb * MeasExtSz * 2048 bytes. The default value is 50. Previous name of this parameter was MeasMaxExt. Even that name is still supported, it should not be used any more. MeasSwapFull The MEASSWAPFULL parameter specifies when OVNPM should swap the measurement file.
Appendix E USERCFG File This section provides a sample of a USERCFG file.
LINE $LAN1 , TREND = ON LINE $X25A , TREND = ON ---------------------------------------------------------------- Section: NETLINE -- Syntax : NETLINE <$Device> [,TREND=] -- Example: NETLINE $PATHSQ --------------------------------------------------------------NETLINE $LEXT2 , TREND = OFF ---------------------------------------------------------------- Section: TMF -- Syntax : TMF [,TREND=] -- Example: TMF 00 --------------------------------------------------------------TMF
.Processes=$STK0+$STK1+$STK2 .File=$OLD.JMN60TST.
-- Example: DISKFILE $DATA.DATABASE.FILE1 -- Usage : This domain is made to measure physical accesses in the configured files -- Warning: Adding units in this domain will increase the overhead of the product. If you do not need to know physicalaccesses in the configured files, use the domain FILESIZE instead of this one. --------------------------------------------------------------DISKFILE $SWAP.DATA.
-- with = Q for QUEUE counter -- Example: USERDEF $XYZ.TRANSACTION-NUM.000.NB/I --------------------------------------------------------------USERDEF $ZMSB.MT1-ACCUM.000.NB ,TREND = ON USERDEF $ZMSB.MT1-ACCUM.000.NB/S ,TREND = ON USERDEF $ZMSB.MT1-ACCUM.000.NB/I ,TREND = ON USERDEF $ZMSB.MT4-BUSY.001.% ,TREND = ON USERDEF $ZMSB.MT5-QUEUE.000.Q ,TREND = ON USERDEF $ZMSB.MT5-QUEUE.000.QT ,TREND = ON USERDEF $ZMSB.MT5-QUEUE.000.
-- is an eight case sentive character unique identifier. -- only letters, digits and sign '_' are allowed. -- is identifier of an already define spn. -- is the path name for an OSS entity -- like FILE or DIRECTORY. It must start by character '/'. -- It can be quoted if it contains space. -- The maximum length can be 248 characters. -- The combined length of their corresponding OSS pathnames -- should not exceed 1024 characters.
Appendix F SystemAlert and Event Messages Event Management System (EMS) messages are generated by OVNPM, which can send these messages to SystemAlert or to third parties. System Alert The system creates alert when a thresholds violation occurs. System alert thresholds are defined on instances added in the USERCFG file. The alert number from system alert is “100 + domainId”. For example disc domain id is 1, so alls alert number from any disc thresholds will be 101. 04-06-01 11:07:28 \NODE.$TDSNODE.255.
Event Alert Event creates alert on process that match generic threshold, application and device from Navigator Configuration File APPCFG. Event alert numbers are in range 200-299. Current numbers used are: No. Event Name Description and Example 201 Evt-EventPrs 202 Evt-EventFile Filescan event, created by modifying XCFGFIL It can be used to check file size and number of free extents. 000202 FileEvent File=$DATA02.OW.LOG1 Full=100% [>=1%] FreeExtent.
EMS Format Each EMS message created by OVNPM is made up of one subsystem identifier (SSID) and at least one token map. A token map describes one SystemAlert or Event message.
Token Type Comment DiId zspi-ddl-int Unit Id Scale zspi-ddl-int Internal Scale Zoom zspi-ddl-byte Collect Interval Vtype zspi-ddl-byte Violation Type Thrtype zspi-ddl-byte Threshold Type Severity zspi-ddl-int Severity Thrvalue zspi-ddl-int2 Threshold value Thrlimit zspi-ddl-int2 Threshold limit Vdur zspi-ddl-char6 Violation duration Xdur zspi-ddl-char6 Exclusion duration NoticeDate zspi-ddl-char6 Notification date NoticeTime zspi-ddl-char6 Notification time Token Map f
Token Map for File Event Messages Token Type Comment EvtType zspi-ddl-inte FileEvt Typ EvtId zspi-ddl-int FileEvt Id EvtName zspi-ddl-char24 FileEvt Name dtId zspi-ddl-int DomainType Id (File). Fname zspi-ddl-char40 FileName Violation zspi-ddl-int One of OVNPM-evt-FileEvt* constant Full zspi-ddl-int File Full space % FullThr zspi-ddl-int Full space % threshold Free Ext zspi-ddl-int File Free extent number. FreeExtThr zspi-ddl-int Free extent number threshold.
Token Type Comment unit zspi-ddl-int Unit (Disc,Line,NEtLine..) state zspi-ddl-int SH (Warm,Up,Down) time zspi-ddl-timestamp Token Map for device Disc Event Messages for G Series Token Type Comment EvtType zspi-ddl-int Evt Type EvtId zspi-ddl-int Evt Id EvtName zspi-ddl-char24 Evt Name dtId zspi-ddl-int DomainType Id ().
Additional information about OVNPM messages Additional information about SystemAlert and Event messages is found in the files below. These files are contained in the distribution subvolume OVPMSRC. File Contents VDDL Data Definition Language (DDL) containing a complete description of SystemAlert and Event messages. VDDTACL, VDDC, VDDTAL TACL, C, and TAL definition from the VDDL file. F* files Sample of filter source code example of a filter for SystemAlert messages.
Document Feedback Form We would appreciate your comments on the OVNPM Server Monitoring Guide. After you have read the Guide and used the software, please take a few moments to complete this form and return it to appropriate HP product support service. Guide 1. The organization of the Guide makes it easy to locate topics quickly. Strongly Disagree 1 2. Strongly Agree 2 3 4 The appearance of the pages makes it easy to locate information quickly. Strongly Disagree 1 3.
9. The level of writing in this book is: Too Basic Just Right Too Technical (Please be specific.) 10.
A accounting parameters, 85 Alert Tokens, 107 APPLIFILE, 27 audience, 8 B BASE24DEF, 29 Blacklisted Measurements, 12, 13 Build configuration, 84 Building Configuration, 18 Building the Configuration, 18 C collection parameters, 12 Collector, 85 CPU domain, 29 Creating the OVNPM Environment, 82 Event alert, 107 Event Management System, 106 R F Range Definition, 63 File Scanning, 60 FileAvailability, 62 FileCount], 62 FILESIZE domain, 35 S group, 54 GROUP domain, 36 SERVERNET, 43 SPN, 19 SPN declara