HP NonStop TMF Operations and Recovery Guide

Abstract
This manual describes how to perform day-to-day operations with the HP NonStop™ Transaction Management Facility (TMF), how to dump and archive audit information and the database, and how to recover from error conditions. This manual is intended for system managers and operators. It covers the features and operations available with the TMF 3.3 product.

Product Version
TMF G08

Supported Releases
This publication supports G06.
Document History
Part Number   Product Version     Published
136589        NonStop TM/MP D46   May 1998
422843-001    NonStop TM/MP D46   March 2001
522417-001    TMF G07             August 2002
522417-002    TMF G07             April 2004
522417-003    TMF G08             April 2005
Contents
What’s New in This Manual
   Manual Information
   New and Changed Information
About This Manual
   Who Should Read This Manual
   How This Manual is Organized
   TMF Documentation
   Other Documentation
   Notation Conventions
C. Managing SQL Objects
   Audited Objects
   TMF Guidelines for SQL Objects
   Operations for SQL/MP Only
   Operations for Both SQL/MP and SQL/MX
   Impact of SQL Operations on Online Dumps
Index
Tables
   Table 1-1. Table 1-2. Table 1-3. Table 1-4. Table 2-1. Table 2-2. Table 2-3. Table 2-4. Table 2-5. Table 3-1. Table 4-1. Table 4-2. Table 7-1.
What’s New in This Manual
New and Changed Information
This is the eighth edition of the HP NonStop TMF Operations and Recovery Guide. This edition describes the features and operations provided by the new TMF 3.3 product. In addition, this edition incorporates the previously published document, TMF Supplement for Large Audit-Trail Files.
• Section 7, Recovery Methods, revises the following discussions and includes other minor corrections and clarifications.
   ° Displaying Backout Process Activity, to note that the STATUS TRANSACTION command shows the specific backout processes assigned to aborting or aborted processes.
   ° RECOVER FILES Command Specification, to present these new or changed elements:
      ° The new and NOT options.
About This Manual
This manual describes methods for maintaining the HP NonStop™ Transaction Management Facility (TMF) on HP NonStop servers. This manual also describes TMF recovery methods that can be used to maintain the integrity and consistency of databases for online transaction processing (OLTP) applications.
Section 2, Routine Maintenance, describes how to perform routine maintenance on TMF objects, including audit-trail volumes, data volumes, local and distributed transactions, and TMF operations.
Section 3, Occasional Operations, tells how to perform TMF operations that are needed only periodically and are not considered part of routine TMF maintenance.
Documentation / Description
• TMF Introduction
   Read this manual first. It provides a general overview of TMF concepts and capabilities for business professionals, application designers and programmers, and system managers and administrators.
• TMF Glossary
   Refer to this manual to look up technical terms used in the TMF documentation set.
• Distributed Systems Management/Software Configuration Manager (DSM/SCM)
• Event Management Service (EMS)
• File Utility Program (FUP)
• Measure Subsystem
• Pathway/iTS
• Surveyor Subsystem
• Various programming languages, including HP COBOL for NonStop Systems, Pathway SCREEN COBOL, FORTRAN, TAL, C, C++, and SQL (NonStop SQL implementation).

Notation Conventions
Hypertext Links
Blue underline is used to indicate a hypertext link within text.
each side of the list, or horizontally, enclosed in a pair of brackets and separated by vertical lines. For example:
   FC [ num ]
      [ -num ]
      [ text ]
   K [ X | D ] address
{ } Braces. A group of items enclosed in braces is a list from which you are required to choose one item. The items in the list can be arranged either vertically, with aligned braces on each side of the list, or horizontally, enclosed in a pair of braces and separated by vertical lines.
Line Spacing. If the syntax of a command is too long to fit on a single line, each continuation line is indented three spaces and is separated from the preceding line by a blank line. This spacing distinguishes items in a continuation line from items in a vertical list of selections. For example:
   ALTER [ / OUT file-spec / ] LINE

      [ , attribute-spec ]…
Examples. In the examples of commands, command entries and other user input appear in bold-face type.
1 Overview of TMF Maintenance and Recovery
This section provides an overview of the maintenance tasks performed for TMF. It also describes the TMF STATUS and TMF INFO displays. These two displays let you monitor the status and attributes of TMF, and contain information vital for smooth TMF operation.
• Back up the TMF catalog and configuration files (Section 2, Routine Maintenance and Section 7, Recovery Methods).
• Create Disk Space Analysis Program (DSAP) reports for audit-trail volumes and volumes that contain audited files (Section 7, Recovery Methods).

Corrective Tasks
Take the following actions only when there is a deficiency or problem on your system:
• Reconfigure the audit dump configuration (Section 4, Audit Dumps).
Viewing Your TMF System
As the TMF operator, you should be familiar with the TMF configuration at your node. The TMFCOM STATUS and TMFCOM INFO commands provide this information. For instructions on using TMFCOM, see the TMF Reference Manual.

Displaying TMF Status
Use the STATUS TMF command to view the status of your TMF system.
To recover from this situation, increase the audit-trail capacity as described in Responding to Audit-Trail Overflow on page 3-5. It is important that you determine the cause of the capacity problem so you can correct it and make sure it does not recur. The information in the STATUS TMF display can notify you of a potential problem on the system, which you can usually prevent by taking some action.
Table 1-2 lists the possible TMF states reported by the STATUS TMF command.

Table 1-2. TMF States
State / Meaning
Empty audit-trail configuration
   TMF has been brought up for the first time on this node and no configuration exists for it, or a DELETE TMF command was executed. To start TMF, you must use the ADD AUDITTRAIL command to add at least one audit trail.
Table 1-3 lists the possible catalog states reported by the STATUS TMF command.

Table 1-3. Catalog States
State     Meaning
up        The catalog is up and available.
down      The catalog is down and unavailable.
active    TMF is working on the catalog, perhaps adding, altering, or deleting entries; the catalog is temporarily busy.
waiting   TMF is being brought up; the catalog is not yet available.
   RMOpenPerCpu 32
   BranchesPerRM 128

Table 1-4 describes the information in the INFO TMF display. For detailed information about TMF configuration attributes, see the TMF Reference Manual.

Note. The GoRemote, PioBuffer, and TransactionProtocol items can appear in this display only if your system is running the HP NonStop™ Remote Database Facility subsystem that guarantees Zero Lost Transactions (RDF/ZLT).
Table 1-4. Understanding the INFO TMF Display (page 2 of 3)
Heading/Attribute / Meaning
overflowthreshold
   The percentage of the active-audit volumes’ capacity that can be used before TMF begins copying the oldest audit-trail files to the overflow volumes.
begintransdisable
   The percentage of the active-audit volumes’ capacity that can be used before new transactions cannot start.
Table 1-4. Understanding the INFO TMF Display (page 3 of 3)
Heading/Attribute / Meaning
Last Reel
   Name (identifier) of tape reel, if TMF has been started and a tape has been assigned. Otherwise, this field displays the following information:
   • “None” if TMF has been started but no tape has been assigned.
   • “TMF not started” if TMF is stopping or stopped.
Note. The RECRMCOUNT, RMOPENPERCPU, and BRANCHESPERRM fields shown at the end of Table 1-4 apply to resource managers used in heterogeneous transactions supported by HP products such as the NonStop TUXEDO system. Heterogeneous transactions involve multiple transaction managers running on multiple platforms.
2 Routine Maintenance
This section describes how to perform routine maintenance on your TMF system’s components.
Understanding Audit-Trail File Names
An audit-trail file name consists of three parts: volume, subvolume, and file ID:
$active-audit-volume
   The system manager uses the ADD AUDITTRAIL or ALTER AUDITTRAIL command to specify from 1 to 16 active-audit volumes for the audit trail.
ZTMFAT
   All TMF audit trails are stored in the ZTMFAT subvolume.
AA | BB...
Auxiliary01:
   Active audit trail capacity used: 54%
   First pinned file: $AUX1.ZTMFAT.BB000003
      Reason: Required by the MAT.
   Files:
      $AUX1.ZTMFAT.BB000001( available )
      $AUX1.ZTMFAT.BB000002( available )
      $AUX1.ZTMFAT.BB000003( active, dumped )
      $AUX1.ZTMFAT.BB000004( active, dumped )
      $AUX1.ZTMFAT.BB000005( active, dumped )
      $AUX1.ZTMFAT.BB000006( active, current )
      $AUX1.ZTMFAT.ZTMF0001( preallocated )
Note.
Table 2-1. Understanding the STATUS AUDITTRAIL Display (page 1 of 2)
Audit-Trail Status / Meaning
Master | Auxiliarynn
   The audit-trail name. If the name is followed by the message “Overflow space in use,” the active-audit volumes for this audit trail have exceeded the overflow threshold. See Responding to Audit-Trail Overflow on page 3-5 for information on how to proceed.
Active-audit trail capacity used
   The percentage of the audit trail that is full.
Table 2-1. Understanding the STATUS AUDITTRAIL Display (page 2 of 2)
Audit-Trail Status / Meaning
   (Dumped): The file has been copied to archive media by the audit dump process. An available file is not marked as dumped because it must have been dumped to become available.
   (Dumping): The file is being copied to archive media by the audit dump process.
   (Not Dumped): The file has not yet been copied to archive media.
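When a STATUS AUDITTRAIL display has been captured as text, the file entries and their states can be pulled out for reporting. The following Python sketch is illustrative only. It assumes the entry layout seen in the sample display (a volume, the ZTMFAT subvolume, an eight-character file ID, and a parenthesized, comma-separated state list). The names `AuditFile`, `parse_audit_files`, and `files_in_state` are hypothetical and are not part of TMF.

```python
import re
from typing import List, NamedTuple, Tuple

# Layout assumed from the sample display, for example:
#   $AUX1.ZTMFAT.BB000003( active, dumped )
FILE_ENTRY = re.compile(r"(\$[A-Z0-9]+)\.ZTMFAT\.([A-Z0-9]{8})\(\s*([^)]*?)\s*\)")


class AuditFile(NamedTuple):
    volume: str                # for example "$AUX1"
    file_id: str               # for example "BB000003"
    states: Tuple[str, ...]    # for example ("active", "dumped")


def parse_audit_files(display: str) -> List[AuditFile]:
    """Return every audit-trail file entry found in captured display text."""
    files = []
    for volume, file_id, state_text in FILE_ENTRY.findall(display):
        states = tuple(s.strip().lower() for s in state_text.split(",") if s.strip())
        files.append(AuditFile(volume, file_id, states))
    return files


def files_in_state(files: List[AuditFile], state: str) -> List[AuditFile]:
    """Select the files whose state list contains the given state word."""
    wanted = state.strip().lower()
    return [f for f in files if wanted in f.states]
```

For example, `files_in_state(parse_audit_files(text), "active")` lists the files that are still part of the active audit trail.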
Reason. Might be needed to recover datavol volume-name volume
Meaning. This audit-trail file contains audit records that may be needed by the volume recovery process to reapply transactions to the specified data volume.
Reason: Required by RDF environment
Meaning. The NonStop Remote Database Facility (RDF) requested the pinning of the file on disk, and eventually will release the disk. The environment field indicates the RDF environment that made the request, identified by the system (node) and RDF configuration ID.
Action. If RDF will be down for a long time, you may want to use the RDFCOM UNPINAUDIT command to release the pinned audit-trail file.
Reason: Volume is down
Meaning. The volume on which the audit-trail file resides is down, so TMF cannot perform the operation needed to release the file.
Action. Perform the maintenance needed to bring the volume back online.

Initiating an Audit-Trail File Rollover
TMF normally begins writing audit records to a new active audit-trail file when the current file becomes full. This activity is known as a “rollover.
Maintaining Data Volumes
When TMF starts, it automatically starts all data volumes that are enabled for transaction processing. You monitor the status of enabled data volumes to make sure that they are operating and configured properly. This section describes how to monitor the data volume status display, and explains what action may be needed based on the information in that display.
Table 2-2. Understanding the STATUS DATAVOLS Display
Heading / Meaning
Volume
   The data volume name, as specified in an ADD DATAVOLS command.
Audit Trail
   The audit trail that will contain the audit records from the data volume, as specified in an ADD DATAVOLS command.
Recovery Mode
   Whether the audit-trail files associated with this data volume remain on disk until they are no longer required for volume recovery.
State: Unknown
Meaning. A data volume can be in the unknown state only when TMF is in the process of stopping. The volume is not configured as a data volume or it is configured but has been shut down. TMF has no information about the volume.
Action. Use the STATUS DATAVOLS command again when TMF is running.

State: Unconfigured
Meaning. The volume is not configured as a data volume or an audit-trail volume.
State: Waiting for transaction resolution
Meaning. There may be transactions in progress that affect the data volume. This state normally lasts a short time; it may last longer if there is a long-running transaction on the system, or if a homogeneous or heterogeneous distributed transaction cannot complete because of a network partition.
SCF PRIMARY command is discussed in the SCF Reference Manual for G-Series RVUs, and the ALTER TMF, SWITCHPROCESS TMP command is explained in the TMF Reference Manual.

State: Started
Meaning. The volume is ready to process transactions. If a data volume in this state stops because of a physical or disk process problem, it is automatically returned to the started state the next time it is operational.
Action.
Action. When the volume is ready to be used for transaction processing, use the ENABLE DATAVOLS command.

State: Down
Meaning. The volume would be enabled under normal conditions, but is currently inaccessible. This can happen because of a physical disk or disk process problem, or because TMF is stopped.
Action. Repair the disk, if necessary, and bring it back online.
Action. If the wrong volume is mounted, mount the correct one. If you want to use the mounted volume, you must delete the volume and then re-add it to the TMF configuration. To do this, and thereby omit this volume from volume recovery, use the DISABLE DATAVOLS, DELETE DATAVOLS, and ADD DATAVOLS commands with suitable options. For more details, see the descriptions of these commands in the TMF Reference Manual.
node-number
   specifies the number of the node from which the transaction originates. The default is the node on which the TMFSERVE process communicating with your TMFCOM is running.
tm-flags
   specifies a number representing flags used internally by TMF. If this number is zero, it does not appear in displays of the transaction identifier presented by the STATUS TRANSACTIONS command.
cpu
   is the number of the processor from which the transaction originates.
In this example, the STATUS TRANSACTIONS command was issued at \SYS1. Transaction \SYS1.0.40302 was started on \SYS1 and has accessed \SYS2 and \SYS3, which appear in the output as children. The same command issued on \SYS2 would show the following information:
   Transaction Identifier
   ---------------------
   \SYS1.0.
Table 2-3. Understanding the STATUS TRANSACTIONS Display
Heading / Meaning
Transaction Identifier
   The transaction identifier, as described in the previous section, consisting of the node name or number, CPU number, and sequence number. In addition, if the tm-flags value used internally by TMF is not zero, this value appears in parentheses following the node name or number.
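The parts of a transaction identifier can be separated mechanically when identifiers are copied out of a display. The following Python sketch is a rough illustration under stated assumptions: identifiers look like the examples in this manual (\SYS1.0.40302, or 4.5769602 when the node is defaulted), and a nonzero tm-flags value appears as digits in parentheses directly after the node. The exact spelling of the tm-flags form is an assumption, and `TransactionId` and `parse_transaction_id` are hypothetical names.

```python
import re
from typing import NamedTuple, Optional

# Optional node (name such as \SYS1, or a number), optional "(flags)",
# then cpu and sequence number separated by periods.
TRANS_ID = re.compile(
    r"^(?:(?P<node>\\[A-Za-z0-9]+|\d+)(?:\((?P<flags>\d+)\))?\.)?"
    r"(?P<cpu>\d+)\.(?P<seq>\d+)$"
)


class TransactionId(NamedTuple):
    node: Optional[str]   # None when the node is defaulted
    tm_flags: int         # 0 when no flags value is displayed
    cpu: int
    sequence: int


def parse_transaction_id(text: str) -> TransactionId:
    """Split a displayed transaction identifier into its parts."""
    match = TRANS_ID.match(text.strip())
    if match is None:
        raise ValueError(f"not a recognizable transaction identifier: {text!r}")
    return TransactionId(
        node=match.group("node"),
        tm_flags=int(match.group("flags") or 0),
        cpu=int(match.group("cpu")),
        sequence=int(match.group("seq")),
    )
```

An identifier with only two numeric parts is treated as cpu and sequence number, which matches the form used when the node defaults to the local system.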
Table 2-4. Transaction States
State / Meaning
Committed
   The transaction has completed, and locks held by the transaction are being released. The transaction stays in this state until all nodes affected by the transaction have been informed of the committed changes.
Hung
   The transaction is aborting, but the backout process has failed to undo the transaction.
Reason: The TRANSCOUNTTHRESH was exceeded.
Meaning. The number of current transactions in the system is at or above the configured maximum, which causes TMF to disallow new transactions. New transactions can start when the transaction count decreases to an allowable number (see Table 1-4 on page 1-7 for more information).
Action.
Reason: The available TMP memory in CPU number is running low.
Meaning. The percentage of extended segment memory allocated by the TMP is at or above the level that causes TMF to disallow new transactions. New transactions can start when the memory used decreases to an allowable percentage (see Table 1-4 on page 1-7 for more information).
Action.
When a heterogeneous transaction in your local TMF system requires access to a table or file in a database managed by a foreign transaction manager, TMF and the foreign transaction manager cooperate in achieving this access. The system in which a distributed transaction originates is referred to as the parent node while the secondary system, whose resources the transaction requires, is referred to as the child node.
Table 2-5. Homogeneous Distributed Transaction States (page 2 of 2)
Transaction States in the…
Parent Node   Child Node   Activity
COMMITTED     PREPARED     After the transaction is in the prepared state in all involved child nodes, the parent-node TMP changes the transaction from active to committed and sends a confirmation message to the child-node TMP.
Homogeneous Transactions
When a communications link goes down, homogeneous distributed transactions in the active state are automatically aborted both in the parent and in all child nodes. Homogeneous distributed transactions in the committed or prepared state, however, cannot change states until the link is restored and the involved TMPs are once again communicating with one another.
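The rule above can be restated as a small lookup, which may help when triaging a list of distributed transactions after a link failure. This Python sketch covers only the three states the paragraph mentions. The function name, the returned field names, and the lowercase state spellings are illustrative assumptions, not TMF output.

```python
def effect_of_link_down(state: str) -> dict:
    """Describe what a link failure does to a homogeneous distributed
    transaction in the given state, per the rule stated in the text."""
    normalized = state.strip().lower()
    if normalized == "active":
        # Active transactions are aborted in the parent and all child nodes.
        return {"outcome": "aborted", "waits_for_link": False}
    if normalized in ("committed", "prepared"):
        # These transactions keep their state until the TMPs can talk again.
        return {"outcome": normalized, "waits_for_link": True}
    # Other states are not covered by the passage, so refuse to guess.
    raise ValueError(f"state not covered by this rule: {state!r}")
```

Transactions reported with `waits_for_link` set to true are the ones to watch until the link is restored.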
Displaying Status of Distributed Transactions
The STATUS TRANSACTIONS command displays information about transactions of all types, including homogeneous and heterogeneous distributed transactions. If you enter this command without the DETAIL option and a transaction has a resource manager as either a parent or a child, the Parent and/or Children columns in the display will show the notation “Foreign” for each transaction branch.
Maintaining TMF Operations Routine Maintenance State # Attempts to Abort Starting MAT Seq. No : prepared Identifier Start Time Process Parent Child State # Attempts to Abort Starting MAT Seq. No : : : : : : Transaction Identifier Start Time Process Parent Child Child Child State # Attempts to Abort Starting MAT Seq. No : 0 : 809 \PRUNE.5.2343 21-Jan-2002 13:22:17 \PRUNE TUX-HP@ESSG.1 prepared : 0 : 809 : : : : : : : : \TSII.3.93283 21-Jan-2002 13:02:09 $APPL3 TUX-HP@ESSG.1 TUX-HP@ESSG.
Here is an example of the STATUS OPERATIONS command showing the status of all operations in progress:
   TMF 52> STATUS OPERATIONS *, STATE INPROGRESS
         Type        State         Begin Time             End Time
         ------------------------------------------      --------------
   [16]  DumpFiles   Finished      13-Jan-2002 09:04:07   13-Jan-2002...
   [24]  StartTMF    Finished      13-Jan-2002 09:24:13   13-Jan-2002...
   [30]  DumpFiles   InProgress    13-Jan-2002 09:47:01
   [31]  DumpFiles   In Progress   13-Jan-2002 09:55:38
Viewing EMS Messages
TMF sends event messages to the Event Management Service (EMS). The EMS collects event messages from reporting processes and subsystems and then selectively distributes those messages to various destinations. Such destinations range from a local operator console to a management application running on a remote system. Read these messages as part of your routine TMF maintenance to get a view of the system as reported by each TMF operation.
You can also view TMF EMS messages by using the TMFCOM DISPLAY OPERATIONS command. In this command, you must specify the number of the operation whose EMS messages you want to see. To get the operation number, use the STATUS OPERATIONS command, as described in Maintaining TMF Operations on page 2-26.
Keeping Current System Information
You can easily keep track of your TMF system and other system components by maintaining a command file that contains a copy of your present configuration, and by printing copies of the output from TMF status and information commands and other system reports you routinely run.
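One way to make the routine status snapshot repeatable is to generate the list of commands from a single place. The Python sketch below only assembles text. It uses status and information commands named in this manual, and it assumes a layout of one command per line, which may need adjusting for how command files are fed to TMFCOM at your site. The names `SNAPSHOT_COMMANDS` and `build_snapshot_commands` are hypothetical.

```python
from typing import Iterable

# Status and information commands mentioned in this manual.
SNAPSHOT_COMMANDS = (
    "STATUS TMF",
    "INFO TMF",
    "STATUS AUDITTRAIL",
    "STATUS DATAVOLS",
    "STATUS TRANSACTIONS",
    "STATUS AUDITDUMP",
)


def build_snapshot_commands(commands: Iterable[str] = SNAPSHOT_COMMANDS) -> str:
    """Return command-file text with one command per line (assumed layout).

    Blank entries are skipped and surrounding whitespace is trimmed so the
    resulting text is stable from run to run.
    """
    cleaned = [c.strip() for c in commands if c and c.strip()]
    return "\n".join(cleaned) + "\n"
```

Keeping the list in one place makes it easier to produce the same set of reports each time and to compare printed output across dates.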
3 Occasional Operations
This section describes how to perform TMF operations that are needed only periodically; they are not considered part of routine TMF maintenance.
If you issue a START TMF command immediately after a STOP TMF, ABRUPT command, there is a momentary period when the new TMP attempts to create a new backout process (as a result of START TMF) before the old TMF backout process terminates (as a result of STOP TMF, ABRUPT). Very soon thereafter, the backout process terminates but is not restarted by the new TMP. In this event, aborting transactions are hung.
When you need to shut down your TMF system, use the STOP TMF command; you must be a member of the super user group to issue this command.
whether their transactions were committed or aborted. Aborted transactions are undone by the backout process when TMF is started again.

Caution. The STOP TMF, ABRUPT command can disrupt customer applications. If the audit trails are not disturbed while TMF is down, any transactions aborted when the STOP TMF, ABRUPT operation occurs should be properly backed out when TMF is restarted, and the database should remain consistent.
Responding to Audit-Trail Overflow
The overflow threshold is the capacity to which an audit trail can fill before TMF copies the oldest audit-trail file to an overflow-audit volume. Audit trails with an overflow threshold of 80% (default) or slightly higher should rarely exceed that threshold.
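The relationship between the capacity figure reported by STATUS AUDITTRAIL and the configured thresholds can be expressed as a simple comparison. The Python sketch below is a minimal illustration, not TMF logic. It uses the 80% default overflow threshold stated above, takes the begintransdisable percentage (described in Table 1-4) as an optional input because its default is not given here, and assumes that a condition applies once usage is at or above the threshold. The function and label names are hypothetical.

```python
from typing import List, Optional


def audit_trail_conditions(
    capacity_used_pct: float,
    overflow_threshold_pct: float = 80.0,
    begintrans_disable_pct: Optional[float] = None,
) -> List[str]:
    """Classify audit-trail usage against the configured thresholds.

    The at-or-above comparison is an assumption made for this sketch.
    """
    if not 0 <= capacity_used_pct <= 100:
        raise ValueError("capacity_used_pct must be between 0 and 100")
    conditions = []
    if capacity_used_pct >= overflow_threshold_pct:
        # TMF would begin copying the oldest files to the overflow volumes.
        conditions.append("overflow")
    if begintrans_disable_pct is not None and capacity_used_pct >= begintrans_disable_pct:
        # New transactions would not be allowed to start.
        conditions.append("begintrans-disabled")
    return conditions or ["normal"]
```

A capacity figure that repeatedly lands in the "overflow" range is a prompt to investigate the cause, as the reasons that follow describe.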
Reason: Long-running transaction
Meaning. All audit-trail files affected by a transaction must remain on the active or overflow volumes until the transaction is committed or aborted. Normally, TMF automatically aborts transactions that run longer than two hours; however, if the autoabort function has been turned off, a transaction can run an excessively long time, causing the audit trail to fill.
Caution.
Recovery Mode on page 3-12 for more information on using archive recovery mode. If, however, there has been a sudden burst of audit information generated, or if there is a permanent increase in the audit generation rate, consider increasing the size of the audit trail or increasing the overflow threshold.

Responding to Capacity-Based Transaction Aborts
Meaning.
In the following example, the files per active-audit volume are increased to 6 in the MAT configuration:
   TMF 21> ALTER AUDITTRAIL MAT, FILESPERVOLUME 6

Adding Another Active-Audit Volume
Adding an active-audit volume increases the audit-trail capacity by the number of files configured to reside on each volume of an audit trail; if there is not enough disk space for these files, the volume cannot be added.
Note.
if such an event occurs. In addition, you can specify how long you want this deferral to continue. This capability greatly enhances the ability to achieve full ZLT.

Note. The COMMITHOLDMODE and COMMITHOLDTIMER options are only available on systems running RDF/ZLT. These options are enabled when RDF/ZLT is installed.
The data volumes you specify in the DISABLE DATAVOLS command are shut down for TMF transaction processing, but they continue to operate for other purposes. (To shut down the volume completely, name the volume in an SCF STOP DISK command after you disable it.) The volumes remain in the disabled state, through starts and stops of TMF, until they are named in an ENABLE DATAVOLS or DELETE DATAVOLS command.
Changing the Data Volume Configuration
Data volume configuration changes can have various effects on your TMF system. Before you change the configuration, refer to system reports about typical audit-trail capacity usage, transactions generated during normal and peak periods, and disk space on your system. Keep such reports in a notebook for easy reference (see Keeping Current System Information on page 2-30).
By default, all configured data volumes are automatically started:
• When you start TMF
• When the volume comes up after a volume failure (if TMF is started at the time)
Data volumes automatically stop processing audit information when you stop TMF.

Specifying Recovery Mode
When you add a data volume, you can specify its recovery mode.
If you want to remove a data volume from your TMF configuration permanently, the proper way to do so is as follows:
1. Issue a DISABLE DATAVOL command to shut the volume down cleanly within the TMF environment.
2. Issue a DELETE DATAVOL command to delete the volume from the TMF environment.
6. Issue a RECOVER FILES command of the following form:
   TMF 41> RECOVER FILES $volname.*.*
7. Make online dumps of all database tables or files on the added volume.

Moving Data Volumes to Another System
If you need to physically move a configured data volume from one TMF system to another, the proper way to do so is as follows:
1. Issue a DISABLE DATAVOL command to shut the volume down cleanly within the TMF environment.
Associating a Data Volume with a Different Audit Trail
Most applications, even very large ones, use only a master audit trail. If you have configured one or more auxiliary audit trails, however, and you need to associate a configured data volume with a different audit trail, the proper way to do so is as follows:
1. Issue a DISABLE DATAVOL command to shut the volume down cleanly.
2.
Moving Audited Files to a Different Data Volume
If you need to move protected files to a different data volume, follow these steps:
1. Stop any applications that access the files to be moved.
2. Using the FUP DUP command, move the files to the new data volume.
3. Using the FUP ALTER command, set the audit attribute of the new files so that the files are audited.
4.
To start transaction processing later, use the ENABLE BEGINTRANS command.

Enabling New Transactions
On most systems, transaction processing should run constantly, unless a maintenance operation or system failure causes it to stop. This section describes the situations that can prevent new transactions from starting, and contains suggestions for recovery.
Reason: There are too many active transactions in progress on the system
Meaning. The maximum number of transactions allowed to run concurrently, as configured on your system, has been exceeded.
Action. After ruling out application errors by checking the system and application logs for error messages, work with your system manager to alter the transaction count threshold, as described in the TMF Planning and Configuration Guide.
Changing the Autoabort Configuration
The TMF autoabort function automatically aborts transactions if they run longer than a set amount of time. This feature is useful in an online transaction processing environment where the transactions are usually short, and long-running transactions are assumed to be “runaways.
Changing the Autoabort Threshold
You can alter the autoabort threshold while TMF is running. When the autoabort threshold is changed, new transactions become subject to the new autoabort threshold. However, currently active transactions remain subject to the previously set autoabort threshold.
Caution.
Controlling Individual Transactions
In certain rare cases, it may be necessary for you to manually cause a transaction to complete. For transactions on local systems, you use the ABORT TRANSACTION command to abort a transaction; for transactions on remote systems, you use the RESOLVE TRANSACTION command to force a transaction to commit or abort. This section discusses when you may need to take these actions.
The AVOIDHANGING Option
Use the AVOIDHANGING option when you want to remove a hung transaction without compromising data integrity. Use this option, for example, when data integrity is more important than the availability of a particular set of files. When you specify AVOIDHANGING, however, the files affected by a transaction that cannot be undone are marked undo-needed.
Table 3-1. Resolving Distributed Transactions
Transaction State at Home Node   Action to Take at Remote Node
Aborted                          Issue the RESOLVE TRANSACTION command with the STATE ABORTED option.
Aborting                         Issue the RESOLVE TRANSACTION command with the STATE ABORTED option.
Committed                        Issue the RESOLVE TRANSACTION command with the STATE COMMITTED option.
Nonexistent                      Issue the RESOLVE TRANSACTION command with the STATE ABORTED option.
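Table 3-1 amounts to a lookup from the state observed at the home node to the STATE option to supply at the remote node. The Python sketch below encodes only the rows shown in the table and deliberately rejects any other state rather than guessing. `RESOLVE_STATE_BY_HOME_STATE` and `resolve_state_for` are hypothetical names, and the sketch returns just the option value instead of composing a full command line.

```python
# Rows of Table 3-1: home-node state -> STATE option for RESOLVE TRANSACTION.
RESOLVE_STATE_BY_HOME_STATE = {
    "aborted": "ABORTED",
    "aborting": "ABORTED",
    "committed": "COMMITTED",
    "nonexistent": "ABORTED",
}


def resolve_state_for(home_node_state: str) -> str:
    """Return the STATE option value that Table 3-1 lists for a home-node state."""
    key = home_node_state.strip().lower()
    try:
        return RESOLVE_STATE_BY_HOME_STATE[key]
    except KeyError:
        # States outside the table need a decision by the operator.
        raise ValueError(f"no Table 3-1 entry for state: {home_node_state!r}") from None
```

Only a home-node state of Committed maps to COMMITTED; every other row in the table resolves the remote transaction as ABORTED.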
The following command removes transaction number 5769602 that originated in CPU 4 of the system in which the command is issued:

TMF 50> DELETE TRANSACTION 4.5769602

Caution. The DELETE TRANSACTION command can leave your database in an inconsistent state that cannot be corrected by the TMF recovery process. Contact the Global Customer Support Center (GCSC) or your service provider before issuing this command.
• TMF configuration files (including the TMF catalog), whose location is listed in the INFO TMF command display
• All pertinent notifications and reports from the HP Tandem Service Management (TSM) Notification Director Application
• If the policies at your site permit:
  • Copies of any audit-trail files you believe might be related to the problem (for example, those for which audit-reading errors are reported)
  • Relevant information
4 Audit Dumps

Audit dumps preserve audit-trail files for use by the file recovery process. The audit dump process copies audit-trail files from disk to tape, or from disk to disk, when the files are full and no longer needed by outstanding transactions or volume recovery.
The following example shows a STATUS AUDITDUMP command display:

TMF 5> STATUS AUDITDUMP
AuditDump Status:
  Master: State: enabled, Status: active, Process: $X824
    File: $MAT1.ZTMFAT.AA000012
  AUXILIARY01: Not configured for dumping
  AUXILIARY02: State: DISABLED, Status: inactive

Table 4-1 describes the status and state values that can appear in an audit dump status display.

Table 4-1.
Making Audit Dumps

For a dump to tape, the audit dump process locates the audit-trail file that is ready to be dumped and gets the name of a scratch tape from the TMF catalog. If the tape is not mounted, the labeled-tape process generates an event message like the following:

$ZSVR: 0033 MOUNT TMF018 WITH RING "TMF Audit-Dump ($X025) of $MAT.ZTMFAT.AA000047. Tape #1.
No operator intervention is required to process a disk dump. If an audit dump to disk fails, the audit dump’s temporary file is not written to disk and the process is tried again.
Pausing and Resuming Audit Dumps

You can use the DISABLE AUDITDUMP and ENABLE AUDITDUMP commands to control the audit dump process:

• DISABLE AUDITDUMP makes the audit dump process unavailable. This command does not affect an audit dump in progress.
• ENABLE AUDITDUMP makes the audit dump process available.

The DISABLE AUDITDUMP and ENABLE AUDITDUMP commands can be issued only by members of the super user group.
Dumping to Remote Systems

Dumping audit-trail files to a remote system is not considered a TMF distributed operation. A remote system does not have to be running TMF to receive audit dumps. You can dump audit trails to either a tape drive or a disk on a remote system.

Tape Dumping

Use the ALTER AUDITDUMP command to identify a remote system for dumping audit-trail files to tape.
5 Online Dumps

Online dumps are copies of audited files on tape or disk. These dumps preserve audited files for use by the file recovery process. If your database is damaged, the file recovery process restores the online dump files to disk and uses audit-trail files to reconstruct the database.
Making Online Dumps

You can make online dumps only when TMF is running. Use the DUMP FILES command to make online dumps, specifying the names of the audited files to be dumped. The online dump process copies each file to tape or disk and makes an entry in the TMF catalog for each file.
Selecting Dump Options

Use the DUMP FILES command to specify which audited files are dumped. Files are selected or not based on the file names specified in the DUMP FILES command. File names are entered in the DUMP FILES command syntax as follows:

DUMP FILES {fileset }
           {(fileset [,fileset ] ... ) }
           { }
           [,NOT {file-set }
                 {(file-set [,file-set]...
The file-name pattern syntax is:

[[$pattern.]pattern.]pattern

In this syntax, pattern consists of one or more characters. Allowable characters are letters, digits, asterisks (*), and question marks (?). The maximum length of a pattern is 8 characters, including wild-card characters. Wild-card characters can appear in any portion of a name, for as many times as there can be characters in that portion.
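The pattern rules above can be checked mechanically before a command is issued. The following Python sketch is illustrative only and is not TMFCOM's implementation; is_valid_pattern and matches are hypothetical names. It handles one portion of a name at a time, without the leading $ of a volume portion. Two assumptions are made that the text above does not state: an asterisk stands for zero or more letters or digits, a question mark stands for exactly one, and comparison ignores case.

```python
import re

# One name portion: 1 to 8 characters drawn from letters, digits, '*' and '?'.
_ALLOWED = re.compile(r"[A-Za-z0-9*?]{1,8}")


def is_valid_pattern(pattern: str) -> bool:
    """Check that a single name portion obeys the stated character and length rules."""
    return _ALLOWED.fullmatch(pattern) is not None


def matches(pattern: str, name: str) -> bool:
    """Match one name portion against a pattern, ignoring case.

    Assumption: '*' matches zero or more letters or digits and '?' matches
    exactly one letter or digit.
    """
    if not is_valid_pattern(pattern):
        raise ValueError(f"invalid pattern: {pattern!r}")
    regex = "".join(
        "[A-Za-z0-9]*" if ch == "*" else "[A-Za-z0-9]" if ch == "?" else re.escape(ch)
        for ch in pattern
    )
    return re.fullmatch(regex, name, flags=re.IGNORECASE) is not None
```

Under these assumptions, matches("F?", "F3") is true and matches("F?", "F") is false, while a nine-character pattern is rejected by is_valid_pattern.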
In the file, patterns that do not indicate fully-qualified file names are expanded using the default node, volume, and subvolume names established in TMFCOM. (For commands entered directly through SPI rather than TMFCOM, however, you must specify the defaults to be used.)

NOT fileset

The NOT option specifies which files in the file set list are not to be dumped. If you specify the same file in this option and in file-set, the file is not dumped.
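The precedence of the NOT option can be pictured as a set difference: a file named in both lists is excluded. The following Python sketch is illustrative only; select_files is a hypothetical helper, it works on already-expanded, fully qualified names rather than patterns, and it assumes names compare without regard to case.

```python
def select_files(candidates, included, excluded):
    """Return the candidates that are named in included and not in excluded.

    Names are compared case-insensitively (an assumption of this sketch).
    A name that appears in both lists is left out: exclusion always wins.
    """
    include = {name.upper() for name in included}
    exclude = {name.upper() for name in excluded}
    return [
        name for name in candidates
        if name.upper() in include and name.upper() not in exclude
    ]
```

For example, if $DATA.SALES.B appears in both the file-set list and the NOT list, only the other listed files are returned.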
The attributes for online dumps, which are set with the DUMP FILES command, are similar to those for audit dumps, which are set with the ALTER AUDITDUMP command, with the following exceptions:

• The COPIES option specifies the number of parallel copies to be made of each dump; online dumps cannot be made serially. To make two copies of an online dump, therefore, requires two tape drives.
All files on the $DATA volume will be dumped to tape, which is the default. Dumping an entire volume ensures that all audited files get dumped.

The following example shows a DUMP FILES command that also uses the wild card:

27> TMFCOM
TMF 1> DUMP FILES *.DATABASE.*

All of the files in a partitioned database will be dumped to tape.
Files explicitly specified in the DUMP FILES command are dumped. The NOT option of the DUMP FILES command, however, specifies the files that are not to be dumped, whether or not they are specified elsewhere in the same DUMP FILES command.

If you must perform file recovery and have omitted alternate-key files from your online dumps, you can use the following procedure to recover them:

1. Use the FUP ALTER command to change the audited alternate-key file to nonaudited.
2.
Tape Dumping

Use the SYSTEM option of the DUMP FILES command to identify a remote system for dumping audited files to tape. The remote system need not be running TMF, but it must be configured for labeled-tape processing. Tapes used for a remote dump must be registered in the local TMF catalog.

Here is a DUMP FILES command example for making an online dump to tape on a remote system:

TMF 2> DUMP FILES $DATA.*.
6 The TMF Catalog

The TMF catalog records audit and online dump media: it specifies where all dumped files reside and which dump media are available for reuse. This section describes how the TMF catalog functions and the operations you perform to maintain it.
6. When all of the online dump files associated with a particular dump serial number have been changed to released status, all entries for that serial number are deleted. For dumps on tape, the associated tape media are changed from assigned status to scratch status (or to released status, if you specified RELEASED ON in an ALTER CATALOG command). For dumps on disk, the status is changed from assigned to purged.
7.
• The names of the tape or disk volumes on which the file resides
• The names of the oldest audit-trail files (MASTER and AUXILIARY) necessary to recover the file

You initiate online dumps with the DUMP FILES command, as described in Section 5, Online Dumps. A single online dump can include many audited files that share a unique dump serial number.

Displaying the Catalog

Use the INFO DUMPS command to see dump media information recorded in the TMF catalog.
The following example shows an INFO DUMPS command that produces a report of currently assigned audit dumps to disk:

28> TMFCOM/ OUT $S.#DUMP1/
TMF 6> INFO DUMPS *.*.
Note. Use the INFO DUMPS command after each online dump to get a detailed report of dumps currently recorded in the TMF catalog. Save your reports in a disk file stored in a safe location, sorted by date. You can also manually track online dump entries by using the forms provided in Appendix A, Dump Tracking Forms.
The following example demonstrates how a dump could accidentally be deleted if you change the status of the last assigned entry in the dump to invalid or released:

TMF 10> ALTER DUMPS $DATA.TEMP.F3,SERIAL 100,INVALID ON
TMF 11> INFO DUMPS,DETAIL,SERIAL 100

[INFO DUMPS display. Column headings: File Name, Dump Serial, Dump Date-Time, Dump Type, Dump Status, Master, Data, Media Type, Media Status, Media Name, Part, Copy]

$DATA.
Determining the Status of a Tape Volume Upon Release

When all of the dumps contained on a tape volume are released, the status of the tape volume normally changes from assigned to scratch. If you want to change that status to released instead, specify RELEASED ON in an ALTER CATALOG command, as follows:

TMF 25> ALTER CATALOG, RELEASED ON

When a tape’s status is scratch, the tape can be used for receiving new dumps.
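The status rule above can be summarized in a few lines. The following Python sketch is a simplified model for illustration, not TMF code; tape_status_after_release is a hypothetical function, and it considers only dumps whose status is assigned or released.

```python
def tape_status_after_release(dump_statuses, released_on=False):
    """Return the tape volume status implied by the dumps it holds.

    dump_statuses: one or more strings, each "assigned" or "released".
    released_on:   models the effect of ALTER CATALOG, RELEASED ON.
    """
    statuses = [status.lower() for status in dump_statuses]
    if not statuses or any(s not in ("assigned", "released") for s in statuses):
        raise ValueError("expected one or more 'assigned' or 'released' statuses")
    if "assigned" in statuses:
        # At least one dump on the tape is still assigned, so the tape stays assigned.
        return "assigned"
    # Every dump on the tape has been released.
    return "released" if released_on else "scratch"
```

With the default setting, a tape whose dumps are all released becomes scratch; with released_on=True, the same tape becomes released instead.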
Reentering Deleted Dumps

You use the ADD DUMPS command to reenter online dump and audit dump entries that were mistakenly deleted from the TMF catalog or to recover a destroyed TMF catalog.

Note. The ADD DUMPS command only re-creates entries in the TMF catalog; it does not re-create lost or damaged dumps themselves. The dump entries are only effective if the dumps they describe actually exist on available media.
TMF catalog at any time through the TMFCOM OBEY command. More information about this approach, along with a full example, appears in the ADD DUMPS and INFO DUMPS command descriptions in the TMF Reference Manual.

The following example shows how to use the OBEYFORM option to format INFO DUMPS command output as command file text that can be executed by TMFCOM:

TMF 26> INFO /OUT DMPINFO/ DUMPS *.*.
To reenter the audit dump $ZTMF.ZTMFAT.AA000503 into the TMF catalog, issue the ADD DUMPS command, with the TIME option, as follows:

TMF 36> ADD DUMPS $ZTMF.ZTMFAT.AA000503, SERIAL 1582, &
>>>TYPE AUDITDUMP, TAPEMEDIA TMF236:1:1, &
>>>TIME 23 Jan 2002, 03:10:31

Next, assume the INFO DUMPS, DETAIL command has displayed the following information for the online dump $DATA.PANDADA.
then you can retry the DELETE DUMPS command when the problem has been corrected. There are many reasons why you might not be able to purge a disk file: the disk’s file directory could contain a damaged sector, for example, or another process might have the file open.

As long as other valid ASSIGNED dumps exist on a medium, that medium retains its ASSIGNED status.
rack. Using this scheme, tape ABC12 occupies the twelfth slot in the third rack (C) of cabinet AB.

You can specify reel identifiers using uppercase and lowercase letters; TMF001 is the same as tmf001. If you have more than one system in the same area, you might want to use one character of the reel identifier to indicate which system the tape belongs to.
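The ABC12 example suggests how such a reel identifier could be decoded by a site script. The following Python sketch is illustrative only; parse_reel_id is a hypothetical helper, and the layout it assumes (two cabinet letters, one rack letter, two slot digits) is inferred from the single example above rather than defined by TMF.

```python
def parse_reel_id(reel_id: str) -> dict:
    """Decode a reel identifier of the assumed form CCRNN (for example, ABC12).

    CC = cabinet letters, R = rack letter (A is rack 1), NN = slot number.
    Case is ignored, so abc12 and ABC12 decode identically.
    """
    normalized = reel_id.strip().upper()
    prefix, digits = normalized[:3], normalized[3:]
    if (len(normalized) != 5 or not prefix.isascii() or not prefix.isalpha()
            or not digits.isascii() or not digits.isdigit()):
        raise ValueError(f"expected two cabinet letters, a rack letter, and two digits: {reel_id!r}")
    return {
        "cabinet": prefix[:2],
        "rack": ord(prefix[2]) - ord("A") + 1,  # C -> 3
        "slot": int(digits),
    }
```

Under this assumed layout, parse_reel_id("abc12") yields cabinet AB, rack 3, slot 12. A site that reserves one character for the owning system would need a different layout.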
This INFO TAPEMEDIA command displays a list of scratch tape volumes:

28> TMFCOM
TMF 1> INFO TAPEMEDIA, STATUS SCRATCH

Media Name    Media Type    Media Status
------------------------------------
ATM007        tape          scratch
ATM008        tape          scratch
ATM009        tape          scratch

Adding Tape Media to the Catalog

Use the ADD TAPEMEDIA command to add tape volumes to the TMF catalog. TMF tape volumes are labeled tapes, created by HP system labeled-tape processing.
• Released: not to be used for file recovery or for receiving new dumps
• Bad: physically damaged, not usable for file recovery or receiving new dumps

Initially, you define a tape as a scratch volume when you add it to the TMF catalog using the ADD TAPEMEDIA command. The tape automatically becomes assigned when it receives an audit dump or online dump.
7 Recovery Methods

TMF automatically protects your database with its backout and volume recovery processes. You can perform file recovery on your database, if necessary, as long as you have routinely done online dumps and audit dumps. This section discusses how TMF recovery processes work and gives examples of how to recover from or prevent a variety of situations that could damage your database.
• Communication is lost between two nodes accessed by the transaction.
• The primary processor for the disk process of a volume accessed by the transaction fails while the transaction is active.
• Both the primary and the backup processors for the disk process of any data volume fail while the transaction is active on that system.
recovery operation with its current memory allocation. If this happens, use the ALTER PROCESS command to increase the EXTENDEDSEGSIZE attribute of the backout process and then stop the process. TMF automatically restarts the process with the larger extended segment allocation. Refer to the TMF Reference Manual for instructions on using the ALTER PROCESS command and to the TMF Planning and Configuration Guide for specific recommendations.
Volume recovery repairs the audited files on a volume by:

1. Redoing the changes written to cache but not to disk at the time the volume went down: this redo action ensures that the files reflect all the work recorded in the audit trail. During this step, the volume recovery process generates EMS Message 401: Phase 1 of recovery completed at atseqno atseqno RBA rba.
2.
Volume Recovery Failures

When the volume recovery process fails to recover a volume, you can usually correct the problems that caused the failure. Consider the following guidelines:

• A volume must be up before it can be repaired by the volume recovery process. Use the SCF STATUS DISK command to view the state of the volume. Use the SCF START DISK command to change the state, if necessary.
Volume Recovery Example

The following volume recovery example applies to situations in which TMF is in the started state. If you need to restart several data volumes that are down, and you know that volume recovery will take a long time to run (because there was significant transaction activity or audit record generation during or after the time these volumes went down), consider using the DISABLE DATAVOLS command to disable the volumes before bringing them up.
File Recovery

The file recovery process reconstructs audited files when the copy on disk is not usable. A file could become unusable for one or more of the following reasons:

• A disk media failure occurs.
• The volume recovery process recovers a data volume but is unable to recover one or more of the audited files that reside there.
• An audited file is mistakenly purged.
• An application program incorrectly changes the database.
5. Reads the active-audit trail to find the incomplete transactions that affected files specified in the RECOVER FILES command. During this step, the file recovery process produces EMS Message 402: Phase 2 of file recovery completed at atseqno atseqno RBA rba.
6. Backs out any partial transactions remaining in the files. During this step, the file recovery process generates EMS Message 403: Phase 3 of recovery completed at atseqno atseqno RBA rba.
Checking File-Recovery Status

The file recovery process generates event messages while it runs. Check the EMS log to monitor the process. Alternatively, you can use the STATUS OPERATIONS command (see Maintaining TMF Operations on page 2-26) to check the status of the file recovery operation.

When the file recovery process completes, perform an online dump of the volume on which you recovered files and return to storage all dump tapes used during the recovery.
RECOVER FILES Command Specification

You specify which audited files are to be recovered by using the RECOVER FILES command. Files are selected or not based on the RECOVER FILES file-set list or specifier and the NOT attribute. File names are entered in the RECOVER FILES syntax as follows:

RECOVER FILES {file-set }
              {(file-set [,file-set]...)}
              { }
              [,NOT {file-set }
                    {(file-set [,file-set]...
Enter file-set in the following format:

[[volume.]subvolume.]file-id

Note. If you plan to refer to SQL/MX objects in a file-set list, you must use the Guardian names of the underlying files in all TMFCOM commands. You can run the MXGNAMES utility to convert one or more objects’ ANSI names to their underlying Guardian file names. You can then use the Guardian file names in the file-set list.
identifies a file that contains one or more file-name patterns. These patterns, in turn, designate files to be recovered. The file can be an EDIT format file (file code 101) or a C data file (file code 180).

Note. You must include the angle brackets (“<” and “>”) exactly as shown in the syntax, embedding the file name within these. For example:

RECOVER FILES <$BOULDER.SNOW.
exceed the size of the SPI message buffer that supports communication between TMFCOM and TMFSERVE, which is 28 KB. If this limit is exceeded, TMFCOM displays Error Message 1050.

Note. If you plan to refer to SQL/MX objects in a NOT file-set list, you must use the Guardian names of the underlying files in all TMFCOM commands. You can run the MXGNAMES utility to convert one or more objects’ ANSI names to their underlying Guardian file names.
NOWAIT

requests that when TMFCOM accepts the RECOVER FILES command, it suppresses display of the EMS events for the file-recovery operation and immediately issues a prompt for another command. You can check the status of the recovery operation later by issuing a STATUS OPERATIONS command. If you do not specify NOWAIT, TMFCOM lists the EMS events on your terminal or writes them to the OUT file (if you specify that) as the recovery progresses.
If you use the TIME option, the FROMARCHIVE option is automatically selected.

TOMATPOSITION (atseqno, rba)

applies the FROMARCHIVE option (whether it is specified or not), recovering all files requested from the relevant online and audit dumps, and directs that the file recovery process redo all transactions that were committed up to a specified location in the master audit trail (MAT).
RECOVERVDPPHASE1

recovers the underlying catalog for the virtual disk process (VDP). If you omit this option, the files specified in file-set are recovered, but the catalog is not.

Caution. The RECOVERVDPPHASE1 option is intended for use with SMF files only. Before you issue a RECOVER FILES command with this option, it is vital that you read “Options for the SMF Product” in the RECOVER FILES command description in the TMF Reference Manual.
than one renaming pattern in this option and there is a conflict, TMF uses the first applicable pattern.

Within the context of MAP NAMES, the following syntax restrictions apply:

Wild-card Characters in old-fileset-list

In old-fileset-list, you can use wild-card characters in the volume, subvolume, and file ID fields as follows:

*    An asterisk matches from 0 through 8 letters, digits, or a combination of these, in the position where it appears.
3. Make sure that the data volumes on which the files to be recovered reside are in the started state. You can use the STATUS DATAVOLS command to make sure that all needed data volumes are started. If necessary, issue the SCF START DISK command, the ENABLE DATAVOLS command, or both to start data volumes for TMF processing. The SCF START DISK command is described in the SCF Reference Manual for G-Series RVUs.
4.
If an entire disk volume is destroyed, you can rely on file recovery to determine which files to recover (as long as there are online dumps of the files). For example, to recover all audited files on $DATA, issue the command:

TMF 32> RECOVER FILES $DATA.*.*, FROMARCHIVE

Specify any additional RECOVER FILES options, as described previously in this section, in the discussion of the RECOVER FILES command.

8.
The following RECOVER FILES command uses the MAP NAMES option to recover the file T32, originally stored on $TMF.TMF01, to $DATA1.TMF01.T32:

TMF 92> RECOVER FILES $TMF.TMF01.T32, &
>>>MAP NAMES ($TMF.*.* TO $DATA1.*.*)
.
.
.

By using wild-card characters in the MAP NAMES option, you can gain great efficiency and flexibility in specifying source files and target files.
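The effect of the mapping in the RECOVER FILES command shown above can be modeled in a few lines. The following Python sketch is illustrative only and is not how TMF implements MAP NAMES; map_name is a hypothetical function. It supports only the simplest case, in which each of the three name portions is either a literal or a lone asterisk, and it assumes that an asterisk in the new pattern carries the corresponding source portion across unchanged.

```python
def map_name(file_name: str, old_pattern: str, new_pattern: str):
    """Return the mapped target name, or None if file_name does not match old_pattern.

    All three arguments are volume.subvolume.file-id names. Each portion of a
    pattern must be either a literal or a lone '*' (a limitation of this sketch).
    """
    source = file_name.upper().split(".")
    old = old_pattern.upper().split(".")
    new = new_pattern.upper().split(".")
    if not (len(source) == len(old) == len(new) == 3):
        raise ValueError("expected volume.subvolume.file-id in every argument")
    target = []
    for src_part, old_part, new_part in zip(source, old, new):
        if old_part != "*" and old_part != src_part:
            return None  # this mapping does not apply to the file
        # Assumption: '*' in the new pattern keeps the source portion unchanged.
        target.append(src_part if new_part == "*" else new_part)
    return ".".join(target)
```

Under these assumptions, mapping $TMF.TMF01.T32 with ($TMF.*.* TO $DATA1.*.*) gives $DATA1.TMF01.T32, and a file on any other volume is left unmapped.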
volume to free disk space for other files. When TMF is conducting multiple recovery operations, recovery processes often must wait for TMF to copy the same audit-trail files from tape repeatedly to support each recovery operation. TMF allows you to override this standard restore-and-purge mechanism to keep the restored audit-trail files on the restore-audit volume indefinitely.
Preventive Maintenance

Recovering from certain problems on your TMF system can be greatly simplified if you perform the following tasks:

• Make online dumps to tape at least once a week; make two copies (if the only copy becomes unusable, file recovery may be impossible).
• Make audit dumps to tape as needed; make two copies (if the only copy becomes unusable, file recovery may be impossible).
• Do not delete or move audit-trail files that are in the ZTMFAT subvolume. Audit-trail files have a file code of 134, which you can view by using the TACL FILEINFO command.

Responding to Incorrect Updates to the Database

Use the RECOVER FILES command to correct certain problems that have caused database inconsistencies, such as:

• An audited file has been mistakenly purged (see next subsection).
about the recovery steps, see SQL/MP Installation and Management Guide and SQL/MX Installation and Management Guide.

Recovering a File Damaged After the AUDIT Attribute is Turned Off

If a file is damaged after its AUDIT attribute has been turned off, recovery might still be possible if the online and audit dumps are still available. To attempt the recovery, follow these steps:

1.
generated when the file recovery process encounters the file label modification record in the master audit trail (MAT). To complete the recovery successfully, you must use the SNOOP utility to locate the file label modification record. To do so, use the READAUDIT command in SNOOP to scan the MAT with the select criteria specifying the file name, the file type, and the flags 000021.
to place the logical recovered files. Furthermore, with the WHEREPHYSVOLIS option, you can recover all the logical files on a specified physical volume.

Note. The TOPHYSVOL option does not work when direct file names appear in file-set (in other words, with files not managed by the SMF product).
Responding to Accidental Loss of an Active-Audit Volume

Loss of an active-audit volume causes TMF to abruptly halt. If the primary volume of a mirrored disk becomes unusable, the system uses the mirror volume. If the active-audit volume is not mirrored, or if both mirrors of the volume are unusable, contact the Global Customer Support Center (GCSC) or your service provider for assistance.
Restoring the ZTMFCONF Subvolume

If the TMF configuration subvolume, ZTMFCONF, becomes damaged or corrupted, TMF could shut down abruptly and not be able to restart. Particularly significant to the TMF environment, all audit-trail configuration information is lost. Because the ZTMFCONF subvolume typically resides on $SYSTEM, it is likely that the $SYSTEM volume is also damaged or corrupted, which requires a complete system reload.
6. If you altered the TMF PROCESS configuration in Step 5 for any of the TMFMON, TMFMON2, or TMP processes, issue another STOP TMF, ABRUPT command.
7. Issue the START TMF command.
8. Redo any changes you made to TMF DUMPS or TAPEMEDIA, and reenter any ALTER AUDITTRAIL, AUDITDUMP OFF commands issued after the last backup of ZTMFCONF.
9.
If the cause of the crash is not immediately obvious, collect the following information that may be useful in determining the state of the system at the time of the incident, for analysis by your service provider:

• Processor dumps.
• TMP saveabend files.
• EMS and TSM event logs.
• Copies of the TMF audit-trail files currently on disk.
Restarting TMF in a Heterogeneous Distributed Transaction Environment

When TMF is restarted in a heterogeneous distributed transaction environment, it first attempts to resolve any pending heterogeneous transactions. When possible, it communicates with the foreign transaction managers through resource manager gateway processes to determine the outcome of such transactions.
now have unresolved transaction information from the perspective of both nodes. Use this information and the instructions in Table 7-1, Resolving Distributed Transactions After Recovering a Database, to resolve the distributed transactions.

Table 7-1. Resolving Distributed Transactions After Recovering a Database

The failing node is a parent node.
state exist, TMF startup is not affected. Nevertheless, these exported transactions must be completed to clean up data structures for the transaction.

Integrating Multiple Configurations

After recovering the database to the remote node, you can manually integrate data volumes from the remote node’s configuration into the primary node’s configuration.
A Dump Tracking Forms

This appendix contains forms for you to keep track of audit dumps and online dumps. Use the forms in this appendix as hardcopy records of audit dumps and online dumps; you can make photocopies of these forms to use at your site.
Form for Tracking Audit Dumps to Disk

System Name: ____________________________

Columns: Audit Trail File Dumped; Date and Time Dump Begun; Names of Dump Files
Form for Tracking Audit Dumps to Tape

System Name: ____________________________

Columns: Audit Trail File Dumped; Date and Time Dump Begun; Tape Drive Used; Tape Volume ID; Part Number; Copy Number; Offsite Tapes (Date Sent, Date Ret’d)
Form for Tracking Online Dumps to Disk

System Name: ____________________________

Columns: Dump Serial Number; Files Dumped; Date and Time Dump Begun; Master and Data Audit Trail Sequence Number; Names of Dump Files
Form for Tracking Online Dumps to Tape

System Name: ____________________________

Columns: Dump Serial Number; Files Dumped; Date and Time Dump Begun; Master and Data Audit Trail Sequence Number; Tape Drive Used; Tape Volume ID; Part No.; Copy No.; Offsite Tapes (Date Sent, Date Ret’d)
B Managing Enscribe Files

This appendix summarizes how to create and manage audited Enscribe database files on a TMF system. It includes these topics:

• Creating Audited Files
• Altering the Audit Attribute
• Determining if a File is Audited
• Purging an Audited File
• Using Format 2 Files
• FUP Command Guidelines

Creating Audited Files

You create audited Enscribe database files interactively by issuing a FUP CREATE command.
When working with Enscribe files, you should understand both formats and when they are used. To learn more about them, see Using Format 2 Files on page B-4.

Altering the Audit Attribute

You can alter the audit attribute of a file interactively by issuing a FUP ALTER command. You can alter the audit attribute of a file only when TMF is running and the disk volume on which the file resides is a configured data volume enabled for transaction processing.
Changing a File from Audited to Nonaudited

If you alter the audit attribute of a file from audited to nonaudited, the following results occur:

• All TMF protection for the file is eliminated. Backout, volume recovery, and file recovery no longer protect the file.
• All online dump entries for the file are marked INVALID ON and RELEASED ON in the TMF catalog.
The FUP INFO output also identifies crash-open files and files that require file recovery. Crash-open files are identified by a question mark (?) beside the file name; files requiring recovery are identified by the letter R beside the file name (if “redo” is required, file recovery must be used to recover the file; if only “undo” is required, volume recovery might be able to recover the file):

128> FUP INFO $DATA.SALES.*
CODE $DATA.
Through a FUP SET command, you can explicitly request format 2 for a newly created file. As an example, the following FUP commands create an audited, key-sequenced format 2 file named ORDERS in the subvolume FILES on the volume $APPL.

48> FUP
- VOLUME $APPL.
The DUP Command

You can use the FUP DUP command to duplicate an audited file, but the destination file is created as a nonaudited file. As with the FUP COPY command, you can issue a FUP ALTER command once the file has been duplicated to change the audit attribute of the destination file to audited.

Caution. Do not use FUP DUP commands or the BACKUP/RESTORE utilities on audit-trail files except when instructed to do so by your service provider.
The PURGEDATA Command

You can use the FUP PURGEDATA command to delete the contents of an audited file. If TMF file recovery is later initiated on this file, it restores the existing online dump and then redoes the PURGEDATA operation when it encounters that operation in the TMF audit trail. If you mistakenly issue a FUP PURGEDATA command, you can recover the file by issuing a RECOVER FILES command with the TIME option set to a time before the data was purged.
C Managing SQL Objects
This appendix summarizes how to create and manage audited HP NonStop SQL/MP and HP NonStop SQL/MX objects on a TMF system. It includes these topics:
Topic                                        Page
Audited Objects                              C-1
Impact of SQL Operations on Online Dumps     C-8
Audited Objects
SQL objects consist of audited components that must reside on TMF data volumes. For this reason, TMF is required for all systems on which SQL/MP and SQL/MX objects are defined and managed.
Operations for SQL/MP Only
In SQL/MP, when you create a table, you can set the AUDIT file attribute to determine whether the table is audited. At any time thereafter, you can alter the AUDIT attribute setting or display its current value. The following discussion explains how to perform these and other SQL/MP operations.
Note. Remember: the detailed operations described in this subsection apply to SQL/MP only, and not to SQL/MX.
In the following example, the AUDIT attribute of TABLE1 is altered through SQLCI, so the table, indexes, and dependent views are subsequently audited:
>> ALTER TABLE TABLE1 AUDIT;
In the following example, the AUDIT attribute of TABLE1 is altered through SQLCI, so the table, indexes, and dependent views are subsequently nonaudited:
>> ALTER TABLE TABLE1 NO AUDIT;
You can alter the AUDIT attribute of a table only when TMF is active, the disk volume that c
If you mistakenly alter the AUDIT attribute from audited to nonaudited and the table has not been modified since it was altered to nonaudited, do the following to restore the table to its original condition:
• Change the AUDIT attribute back to audited.
• Make a new online dump of the table.
You can also query the SQL catalog table, FILES, to obtain the AUDIT attribute of an object, or list of objects, in the catalog. The following command, for example, queries the FILES table of the catalog $DATA.SALESCAT to generate a list of all audited objects in that catalog:
25> SQLCI
>> SELECT FILENAME, AUDIT FROM FILES;
FILENAME
---------------------------------. . .
\SIERRA.$DATA.SALES.CUSTOMER
\SIERRA.$DATA.SALES.ODETAIL
\SIERRA.$DATA.SALES.
The following example copies data to a nonaudited table:
25> SQLCI
>> CREATE TABLE $V1.SPECIAL.EMPLOYEE LIKE $V1.PERSNL.EMPLOYEE;
>> ALTER TABLE $V1.SPECIAL.EMPLOYEE NO AUDIT;
>> COPY $V1.PERSNL.EMPLOYEE, $V1.SPECIAL.EMPLOYEE;
>> ALTER TABLE $V1.SPECIAL.EMPLOYEE AUDIT;
>> EXIT
For more information about TMF considerations for the COPY command, see the SQL/MP Reference Manual.
on page C-2, while SQL/MP objects can be created as either audited objects or nonaudited objects, SQL/MX objects are always audited.
You can create a table only when TMF is active, the disk volume that contains the table is enabled for TMF transaction processing, and the disk volume that contains the SQL catalog (in SQL/MP) or the catalog and schema (in SQL/MX) in which the table is registered is enabled for TMF transaction processing.
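The three preconditions for creating a table can be written as a simple check. The following Python sketch is a hypothetical helper, not a TMF or SQL interface: the function name and parameters are assumptions, and the caller is expected to determine each condition by other means (for example, from TMF status displays).

```python
from typing import List


def table_creation_blockers(tmf_active: bool,
                            table_volume_enabled: bool,
                            catalog_volume_enabled: bool) -> List[str]:
    """Return the unmet preconditions for creating a table.
    An empty list means all three documented conditions hold.
    `catalog_volume_enabled` refers to the volume holding the SQL/MP
    catalog, or the SQL/MX catalog and schema, that registers the table."""
    blockers = []
    if not tmf_active:
        blockers.append("TMF is not active")
    if not table_volume_enabled:
        blockers.append("table volume is not enabled for TMF transaction processing")
    if not catalog_volume_enabled:
        blockers.append("catalog volume is not enabled for TMF transaction processing")
    return blockers


# Example: TMF is running, but the catalog volume has been disabled.
for reason in table_creation_blockers(True, True, False):
    print(reason)
```

Returning the list of unmet conditions, rather than a single true or false value, tells the operator which volume or service to check first.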
contains the SQL catalog in which the object is registered is enabled for TMF transaction processing.
You cannot purge a SQL/MP nonaudited object within a user-defined transaction.
When a table is dropped in SQL/MP, all dependent views and indexes are also dropped.
type of a recovery must include not only the tables or indexes directly affected, but also all partitions of each table or index and all logically related objects in the database.
Caution. If a full recovery of a table is needed and the catalog is not going to be recovered, then the timestamps can cause inconsistencies that leave the table unusable.
Index
A
ABORT TRANSACTION command
  AVOIDHANGING option 3-22
  IGNOREDATAERRORS ON option 3-22
  use of 3-6, 3-21/3-22, 7-3
Aborted state 2-18
Aborting state 2-18
ABRUPT option 3-3
Active state 2-18
Active-audit volumes
  accidental loss of 7-27
  adding 3-8
  configuring 2-2
  files per 3-7
  pinning audit-trail files 2-8
ADD AUDITTRAIL command 2-2
ADD DATAVOLS command 2-15, 3-11
ADD DUMPS command
  example 6-10
  reentering dumps in the TMF catalog 6-8
  TIME attribute 6-8
ADD TAPEMEDIA command
  adding tape volumes to the TMF c
Audit dumps (continued)
  displaying
    configuration attributes 1-6, 4-1
    media information 4-2
    status 1-3/1-5
  dumping to remote systems 4-6
  enabling and disabling 4-5
  for file recovery 7-7
  operator tasks for making 4-3
  pausing and resuming 4-5
  pinning audit-trail file 2-6
  reentering in the TMF catalog 6-8
  solving problems 4-4
  state and status values 4-2
  understanding entries in the TMF catalog 6-2
Audit files, retaining on restore-audit volumes 7-20
Audit trail volumes
  see Active audit volumes
Audit t
Commands
  see individual commands
Committed state 2-18
Communication line failures 2-24
Configuration attributes
  changing 3-11/3-15
  displaying 1-6, 4-1, 6-2
Configuration subvolume, restoring 7-28
Configuration volume
  backing up 2-30
  displaying 1-6
COPIES option 5-6
CPU
  allocating memory 2-21
  disabling data volumes during reload 7-6
  for TMF processes 7-2
D
Data volumes
  adding 3-11
  associating with a different audit trail 3-15
  changing the configuration 3-11/3-15
  clean 2-14
  dirty 2-14
  disabling and
DUMP FILES command (continued)
  dumping to a remote system 5-9
  example 5-6
  SYSTEM option 5-9
  use of 5-2/5-7
Dumps
  recovering only files from particular 7-19
  see also Online Dumps
  see Audit Dumps
E
EMS messages
  audit dump errors 4-4
  failed transaction backout 7-3
  file recovery 7-9
  rollover errors 2-8
  TMF crash 7-29
  viewing 2-28/2-29
ENABLE AUDITDUMP command 4-5
ENABLE BEGINTRANS command 3-17
ENABLE DATAVOLS command
  recovering multiple volumes 7-4
  use of 3-10
Enabling data volumes 3-9/3-10
Enscribe
INFO TMF command
  general use of 1-6
  viewing audit dump attributes 4-1
  viewing the catalog configuration 6-2
L
LAST REEL attribute 1-9
Location, recovering files to a new 7-19
Logical file names 7-25
Long-running transactions
  aborted by autoabort function 3-19
  causing audit trail to fill 3-6
M
Maintenance tasks
  see Operator tasks
MAP NAMES parameter, RECOVER FILES command 7-16/7-17, 7-19/7-20
Master audit-trail file (MAT) 2-2, 2-7
MAXRETAINEDATFILES attribute 1-8
Migrating between software releas
Operator tasks (continued)
  pausing and resuming audit dumps 4-5
  performing file recovery 7-7/7-19
  preventative maintenance 7-22
  responding to audit-trail overflow 3-5/3-8
  solving audit dump problems 4-4
  starting TMF 3-1
  stopping TMF 3-2
  summary 1-1
  tracking system information 2-30
  viewing EMS messages 2-28/2-29
Overflow-audit volumes 1-8, 2-5/2-8
P
Parent nodes 2-21, 2-22
Partitioned databases, dumping files 5-7
Pattern matching in file-sets 5-3, 7-11
Physical file names 7-25
Preallocated audit-t
Remote nodes
  audit dumps to 4-6
  online dumps to 5-8
RESOLVE TRANSACTION command 3-2
Resource managers 2-26
Restore-audit volume 1-8, 2-4, 7-7
Restore-audit volumes 7-20
RETAINDEPTH attribute 1-8, 6-6
Rollforward
  see File recovery
Rollover
  see Audit trail rollover
ROUNDROBIN attribute 1-8, 6-7
ROUNDROBIN parameter, ALTER CATALOG command 6-7
S
SCF PRIMARY command 2-11, 2-12, 2-14
SCF RESET DISK command 2-13
SCF STOP DISK command 2-13
Search method, TMF catalog 6-7
SERIAL parameter, RECOVER FILES co
SYSTEM option
  sending audit dumps to a remote system 4-6
  sending online dumps to a remote system 5-9
SYSTEM parameter of RECOVER FILES command 7-14
T
Tape volumes, TMF catalog 6-11/6-14
TAPEMEDIA option 5-6
TIME parameter of RECOVER FILES command 7-14
TMF
  crash 7-29
  displaying status 1-3/1-5
  recovering from a complete system failure 7-31
  restarting in a distributed transaction environment 7-30
  starting 3-1
  states 1-5
  stopping 3-2/3-4
  tracking information 2-30
TMF catalog
  adding tape volumes 6-13
TRANSACTIONPROTOCOL option of ALTER TMF command 7-22
Transactions
  abnormal activity 7-27
  aborting 3-21
  backout of 7-1
  controlling individual 3-21
  deleting 3-23
  displaying activity 2-16/2-17
  displaying begin-transaction status 2-19
  distributed, resolving 3-22/3-23
  identifiers 2-15
  long-running 3-6
  operator tasks for maintaining 2-15/2-21
  pinning audit-trail files 2-5
  reasons new transactions cannot start 2-19/2-21, 3-17/3-18
  see also Distributed transactions
  see also Local transactions
  states 2-18