RDF System Management Manual for J-series and H-series RVUs (RDF Update 13)


HP NonStop RDF System Management Manual for J-series and H-series RVUs (RDF Update 13)

HP Part Number: 529826-012

Published: September 2013

Edition: J06.03 and subsequent J-series RVUs and H06.09 and subsequent H-series RVUs


PAGE 2
© Copyright 2012, 2013 Hewlett-Packard Development Company, L.P. Confidential computer software. Valid license from HP required for possession, use or copying. Consistent with FAR 12.211 and 12.212, Commercial Computer Software, Computer Software Documentation, and Technical Data for Commercial Items are licensed to the U.S. Government under vendor’s standard commercial license. The information contained herein is subject to change without notice.
PAGE 3
Contents
About this Document .... 13
Supported Release Version Updates (RVUs) .... 13
Intended Audience .... 13
New and Changed Information in this Edition
PAGE 4
Zero Lost Transactions (ZLT) .... 47
Monitoring RDF Entities With ASAP .... 47
2 Preparing the RDF Environment .... 48
Configuring Hardware for RDF Operations .... 48
Primary System Configuration
PAGE 5
Using RDFCOM Non-interactively (without an IN File) .... 91
Using RDFCOM From a Command File (IN file) .... 91
Using Scripts for Easy and Fast RDF Initialization and Configuration .... 92
Managing Multiple RDF Environments from One RDFCOM Session
PAGE 6
Reading the Backup Database (BROWSE versus STABLE Access) .... 138
Near Real Time Read Access to Updates on the Primary System .... 138
Access to Backup Databases with Stable Access .... 139
Stopping TMF on the Primary System .... 139
Using the STOP RDF, DRAIN Command
PAGE 7
Process File Names .... 180
RDFCOM Commands .... 180
ADD .... 181
ALTER
PAGE 8
How Triple Contingency Works .... 264
Hardware Requirements .... 265
Software Requirements .... 265
The RETAINCOUNT Configuration Parameter
PAGE 9
Takeover Phase 2 – File Undo .... 289
Takeover Phase 3 – Network Undo .... 289
Takeover Phase 3 Performance .... 290
Communication Failures During Phase 3 Takeover Processing .... 290
Takeover Delays and Purger Restarts
PAGE 10
Primary and Backup ANSI Schema Names Are Not the Same .... 325
Schema Subvolume Names Are Not the Same .... 325
Guardian Filename Is Incorrect for Partition .... 325
Consideration for Creating Backup Tables .... 325
Restoring to a Specific Location
PAGE 11
SET PURGER .... 343
SET RDF .... 344
SET RDFNET .... 344
SET RECEIVER
PAGE 12
RDF Metrics Reported by ASAP .... 494
Index
PAGE 13
About this Document The Remote Database Facility (RDF) subsystem enables users at a local (primary) system to maintain a current, online copy of their database on one or more remote (backup) systems, protecting stored information from damage that might occur at the primary system. RDF accomplishes this by sending audit trail information, generated at the primary system by the NonStop Transaction Management Facility (TMF) product, over the network to the backup system.
PAGE 14
• Added section “RDF Transactions” (page 125).
• Modified the syntax of “SET VOLUME” (page 228) command.
• Added INCLUDEPURGE and EXCLUDEPURGE descriptions to “SET VOLUME” (page 228) command.
• For message 883, modified the physical volume size from 15 to 21 in the section “RDF Messages” (page 356).
• Added message 935 to “RDF Messages” (page 356).
Changes to 529826–011 Manual
• Added an architecture diagram for the chapter “Process-Lockstep Operation” (page 299).
PAGE 15
• Updated information in “Examples” (page 191)
• Updated information in “INFO * Command” (page 196)
• Updated information in “INFO RDF Command” (page 198)
• Added new section “LIST” (page 206)
• Updated information in “SET RDF” (page 220)
• Updated information in “SET VOLUME” (page 228)
• Updated information in “SHOW RDF Command” (page 233)
• Updated information in “Error” (page 241)
• Updated information in “Subvolume-Level and File-Level Replication” (page 271)
• Added a Note in “Within
PAGE 16
• Chapter 3 (page 58) explains how to install and configure RDF, including how to copy databases and files from the primary system to the backup system before starting RDF.
• Chapter 4 (page 88) discusses how to operate RDF, including how to issue RDFCOM and RDFSCAN commands and how to display RDF configuration parameters and operating statistics, change configuration parameters, and interpret log files.
PAGE 17
Computer Type
Computer type letters indicate:
• C and Open System Services (OSS) keywords, commands, and reserved words. Type these items exactly as shown. Items not enclosed in brackets are required. For example:
Use the cextdecs.h header file.
• Text displayed by the computer. For example:
Last Logon: 14 May 2006, 08:02:23
• A listing of computer code.
PAGE 18
… Ellipsis
An ellipsis immediately following a pair of brackets or braces indicates that you can repeat the enclosed sequence of syntax items any number of times. For example:
M address [ , new-value ]…
[ - ] {0|1|2|3|4|5|6|7|8|9}…
An ellipsis immediately following a single syntax item indicates that you can repeat that syntax item any number of times. For example:
"s-char…"
Punctuation
Parentheses, commas, semicolons, and other symbols not previously described must be typed as shown.
PAGE 19
!i:i
In procedure calls, the !i:i notation follows an input string parameter that has a corresponding parameter specifying the length of the string in bytes. For example:
error := FILENAME_COMPARE_ ( filename1:length       !i:i
                           , filename2:length ) ;   !i:i
!o:i
In procedure calls, the !o:i notation follows an output buffer parameter that has a corresponding input parameter specifying the maximum length of the output buffer in bytes.
PAGE 20
process-name State changed from old-objstate to objstate
{ Operator Request. }
{ Unknown. }
| Vertical Line
A vertical line separates alternatives in a horizontal list that is enclosed in brackets or braces. For example:
Transfer status: { OK | Failed }
% Percent Sign
A percent sign precedes a number that is not in decimal notation. The % notation precedes an octal number. The %B notation precedes a binary number. The %H notation precedes a hexadecimal number.
PAGE 21
• TACL Reference Manual, which discusses operations available in the HP Tandem Advanced Command Language (TACL), the standard command interface to the NonStop operating system. This is the interface through which you run RDFCOM and RDFSCAN and manage files used by them.
• File Utility Program (FUP) Reference Manual, which describes the command syntax and error messages for the File Utility Program (FUP).
• Operator Messages Manual, which describes various error codes.
PAGE 22
1 Introducing RDF
This manual describes the Remote Database Facility (RDF) subsystem as implemented in version 1, update 13 of the HP NonStop RDF/IMP, IMPX, and update 11 of the RDF/ZLT independent products. Customers who install the above update can use existing RDF configuration scripts provided the scripts are not making use of new functionality.
PAGE 23
uses elements like before-images, after-images, and control records. In addition, you should also understand the TMF processes that perform backout, volume recovery, and file recovery. If you are not familiar with this information, you should read TMF Introduction.
PAGE 24
associated with volumes $D1 through $D10 to the RDF master receiver process on the backup system. Audit records for volumes $D11 through $D15 are sent to the auxiliary audit trail. The RDF auxiliary extractor process reads the auxiliary audit trail and sends audit records associated with volumes $D11 through $D15 to the RDF auxiliary receiver process on the backup system. The master receiver writes transaction status information to the master image trail.
PAGE 25
Table 1 Audit Records at the Time of a Primary System Failure

Primary database updates                  Updates sent to the backup
(Sequence in master audit trail file)     (Sequence in image trail file)
TRANS100—Update 1                         TRANS100—Update 1
TRANS100—Update 2                         TRANS100—Update 2
. . .                                     . . .
PAGE 26
Features
In providing backup protection for online databases, RDF offers many advantages:
• Continuous Availability
RDF maintains an online copy of your production database on one or more backup systems. If the primary system should go down, the backup database(s) will be consistent and you can resume your business processing on a backup system with minimal interruption and data loss.
• Fault tolerance
You can restart RDF after a system failure. Single processor failures do not bring the subsystem down.
PAGE 27
Figure 2 RDF Topologies: Chain, Simplex, Multiple Duplicate Sites, Ring, Reciprocal, Centralized, Multiple Configurations, Loopback
• Supports master and auxiliary audit trail protection; RDF can protect all tables and files that are being audited by TMF, whether they are associated with the Master Audit Trail (MAT) or an auxiliary audit trail.
• Subvolume and file replication
In addition to volume replication, the RDF/IMP and IMPX products support replication of selected subvolumes and files.
PAGE 28
the cost of an updater process replicating an update operation is typically 15-25% of the original cost to do the operation on the primary system. On the primary system RDF uses just one process (the extractor) per audit trail to read and transmit audit records to the backup system. The extractor process automatically filters out any audit records not relevant to the backup database. On the backup system RDF stores and applies all audit records without using any primary system resources.
PAGE 29
You can peruse messages in the EMS log on your terminal screen by using Viewpoint or whatever other tool you normally use for monitoring $0. When you do that, you are dealing with the entire EMS log (not just RDF messages). To isolate RDF messages from the rest of the EMS log, you can use the supplied EMS filter RDFFLTO with an EMS printing distributor to produce an intermediate entry-sequenced file that you then can scan using the RDFSCAN utility.
PAGE 30
Figure 3 RDF Tasks to Maintain a Copy of a Database
Primary system: captures audit trail records; the extractor filters and transmits audit trail data to the backup system.
Network communication (Expand network).
Secondary system: the receiver receives and writes audit trail data to the image file; the updater reads the image file and issues REDO requests to the disk process, supplying image records for the REDO operation; the disk process performs the requested REDO operation, updating the backup database.
RDF Processes
To accomplish its four major tasks, RDF runs
PAGE 31
Figure 4 RDF Subsystem Processes (diagram labels: Application Processes, RDFCOM, Purger $PRG, Monitor $MON1, Audited Database, Master Audit Trail, TMF Product, Master Image Trail, Secondary Image Trail, Updaters $U01-$U0n, Receiver $RCV, Replicated Database, Extractor $EXT; Primary, Secondary)
Primary System Processes
On the primary system:
• The monitor process coordinates most RDFCOM commands involving the main RDF processes (for example, start and stop).
PAGE 32
When updating is enabled, the RDF processes maintain a current, online copy of the primary database on the backup system. By default, the subsystem starts with updating enabled, and the RDF processes continue their updating activities until updating is explicitly disabled or the subsystem is shut down. When updating is disabled, the extractor process still transmits the TMF audit records from the audit trails to the backup system, but no changes are applied to the backup database.
PAGE 33
Figure 5 Extractor Process Operation (diagram labels: Master Audit Trail, Audit File, 56 KB Read, Extractor (includes filter), Unwanted Data Discarded, Expand Lines, 56 KB Buffers, To receiver, 56 KB per message, Primary Node)
Reading large amounts of data from the MAT, the extractor process stores the following records for subsequent transmission to the backup system:
• TMF control records
◦ All transaction state records
◦ TMP control point records
◦ TMF shutdown records
◦ File-incomplete records
◦ File-complete re
PAGE 34
◦ ALTER MAXEXTENTS
◦ PURGE (if REPLICATEPURGE is enabled)
• Filelabel modifications for the following NonStop SQL operation
◦ PURGEDATA
NOTE: Except for PURGEDATA, RDF does not replicate NonStop SQL DDL operations on any SQL objects. For more information about NonStop SQL DDL operations and databases on a system protected by the RDF product, see Chapter 6 (page 147) and Chapter 16 (page 316).
The extractor filters out all other records and does not send them to the receiver.
PAGE 35
Receiver Process
A receiver process is a process pair that runs on the backup system. There is one receiver for each configured extractor. A receiver process accepts audit records from its extractor, sorts them, and then writes them to the appropriate RDF image trail, as shown in Figure 6. (The restartability of a receiver ensures the receiver's correctness at process takeover or under any conditions requiring resynchronization with its extractor.
PAGE 36
Figure 6 Receiver Process Operation (diagram labels: Disk 1 (1st Image Trail), Disk 2 (2nd Image Trail), and Disk n (nth Image Trail), each with Updater, 56 KB Buffers, Image File, and Database Volumes; Audit data stored by destination volume, transmitted by 56 KB writes; AAnnnnn; Master Image File; Receiver; AAnnnnn and BBnnnnn; Expand Lines; From Extractor; 56 KB Buffers; 56 KB per message; Backup Node)
With sorted image trails, the act
PAGE 37
Image trails can be added only after RDF has been initialized but before it has been started.
RDF Control Points
When the extractor has no information to send from the audit trail, it transmits a buffer containing no audit images (an empty buffer) to the receiver. When the receiver process receives an empty buffer, it generates an RDF control-point record in each image trail.
PAGE 38
CREATE, PURGE (if REPLICATEPURGE is enabled), PURGEDATA, ALTER MAXEXTENTS (used only for increasing MAXEXTENTS).
• For NonStop SQL files only, performs the following DDL operation: PURGEDATA
An updater cannot always respond immediately to the STOP UPDATE and STOP RDF commands. If an updater has audit records queued for the disk process, the updater must wait until all of that information is processed before it can shut down. You specify the primary and backup CPUs for each updater.
PAGE 39
it continues processing. For the SMF ramifications of this file limit, see the note in “Using SMF with RDF” (page 56).
REDO Pass
Updaters perform REDO operations during all normal processing. The updater applies each audit record as a redo operation, regardless of whether the transaction associated with that audit record committed, aborted, or is still in progress on the primary system.
PAGE 40
NOTE: You must be sure that volumes on the primary system containing alternate key files and indexes are protected by RDF. It is not sufficient to protect just the associated data file or table (particularly in the case of alternate keys). Likewise, if primary partitions reside on volumes protected by RDF, you must ensure that the secondary partitions are also configured for protection.
File System Errors Involving Data Files
File system errors can occur when:
• A file is created.
• A file is opened.
PAGE 41
Second, because considerable checking must be done across all trails to determine what files can be purged based on what transactions might be represented in the various files on the various image trails, the purger process performs this task. The purger process is a restartable process pair that runs on the backup system (it is started during START RDF and runs even when the updaters are stopped; image files are purged, however, only when updating is enabled).
PAGE 42
Example 2 Chain Replication
System \A                 System \B                 System \C
        RDF Subsystem 1
Primary DB 1 ---------> Backup DB 1
                          Primary DB 2 ----------> Backup DB 2
                                  RDF Subsystem 2
Thus, system \B is both the backup system in RDF subsystem 1 and the primary system in RDF subsystem 2.
PAGE 43
committed update of your application. Additionally, Primary DB 1 and Backup DB 1 are no longer in sync. Even though the updater on \B had its transaction aborted, that updater will re-apply the application update to Backup DB 1. When done, Primary DB 1 no longer has the update, but Backup DB 1 does. Although this example describes a reciprocal configuration, the same basic problem can happen with chain replication.
PAGE 44
monitor process. In this way, Expand problems affecting one configuration might not necessarily affect the others (depending on the configuration).
RDF Control Subvolume
The INITIALIZE RDF command includes a control subvolume suffix parameter (SUFFIX char), where char is an alphanumeric character. If you include this parameter, the RDF control subvolume on $SYSTEM will be the local (primary) system name without the backslash and with the specified character appended to it.
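The naming rule above can be illustrated with a short Python sketch. It is only an illustration of the rule as stated in the text; the function name and the system name \DALLAS are hypothetical, and the restriction to a single ASCII alphanumeric character is an assumption drawn from the phrase "an alphanumeric character".

```python
def control_subvolume(primary_system: str, suffix: str) -> str:
    """Derive the control subvolume name from a primary system name and suffix.

    Sketch of the rule in the text: system name without the backslash,
    with the SUFFIX character appended. Names here are hypothetical.
    """
    # Assumption: the suffix is exactly one ASCII letter or digit.
    if not (len(suffix) == 1 and suffix.isascii() and suffix.isalnum()):
        raise ValueError("suffix must be a single alphanumeric character")
    # Drop the leading backslash from the system name, then append the suffix.
    return primary_system.lstrip("\\") + suffix


# Hypothetical primary system \DALLAS initialized with SUFFIX A.
print("$SYSTEM." + control_subvolume("\\DALLAS", "A"))
```

A distinct suffix per configuration is what lets several RDF environments on the same primary system keep separate control subvolumes.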
PAGE 45
set of disks can be replicated to another set of target disks to provide a copy of the live database. There are two operational considerations unique to this environment:
• The updaters operate in transaction mode, which means you should not stop TMF before stopping RDF.
• The RDF takeover operation cannot be performed unless you manually stop the monitor and extractor processes before issuing the TAKEOVER command or include the ! option in the TAKEOVER command.
PAGE 46
Shared Access DDL Operations
RDF supports the following new events that enable optimal performance of the NonStop SQL/MP shared access DDL operations on the backup system:
• Event 905
• Event 908
• Event 932
For more information, see “Performing Shared Access DDL Operations” (page 141).
Configurable Software Location
By default, RDF software resides on $SYSTEM.RDF. You can, however, override this location when you configure RDF.
PAGE 47
For information about this capability, see Chapter 15 (page 299).
Support for Network Transactions
The RDF/IMPX and ZLT products support network transactions: transactions that update data residing on more than one RDF primary system.
PAGE 48
2 Preparing the RDF Environment
Before RDF can be run on a NonStop system, the system configurations and user applications must meet certain RDF requirements. This chapter explains how to prepare each system for RDF installation and operation, ensuring that all these requirements are met and that you understand the RDF product’s restrictions.
PAGE 49
Sizing the RDF configuration is a complex task that is best carried out by HP personnel. Those personnel can assist you in configuring and sizing your RDF environment using tools and utilities designed and developed as part of the RDF Professional Service. Contact your service provider for further details.
PAGE 50
1. Enter a FUP INFO command for the current TMF MAT and record the end-of-file (EOF) value; for example:
FUP INFO $AUDIT.ZTMFAT.*
               CODE        EOF   LAST MODIF   OWNER   RWEP   TYPE   REC   BLOCK
$AUDIT.ZTMFAT
AA000003        134   11292672        10:05      -1   GGGG
2. Enter a FUP INFO command for the current MAT 5 minutes later and record the EOF value; for example:
FUP INFO $AUDIT.ZTMFAT.*
               CODE        EOF   LAST MODIF   OWNER   RWEP   TYPE   REC   BLOCK
$AUDIT.ZTMFAT
AA000003        134   11653120        10:10      -1   GGGG
3.
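As a rough illustration, the two EOF readings shown above can be turned into an approximate audit generation rate. The arithmetic below is a sketch, not the manual's own procedure: it assumes both readings come from the same audit trail file (AA000003 in the example) with no rollover in between, and the function name is hypothetical.

```python
def audit_rate_bytes_per_second(eof_start: int, eof_end: int, minutes: float) -> float:
    """Approximate audit generation rate from two EOF readings of one file."""
    if eof_end < eof_start:
        # Assumption: a smaller second reading means the trail rolled over
        # to a new file, so the two values are not comparable.
        raise ValueError("EOF decreased; readings are from different files")
    return (eof_end - eof_start) / (minutes * 60)


# EOF values taken from the two FUP INFO displays, 5 minutes apart.
growth = 11653120 - 11292672
rate = audit_rate_bytes_per_second(11292672, 11653120, 5)
print(f"growth: {growth} bytes, rate: {rate:.1f} bytes/second")
```

Sampling during peak transaction load gives a more conservative figure than sampling during a quiet period.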
PAGE 51
Table 3 Software Requirements (continued)

Software            Requirement
Communications      The RDF/IMP, IMPX, and ZLT products use Expand software to connect the primary system to the backup system.
Operating System    On the primary and backup systems, the installed release version update (RVU) of the operating system must be supported.
TMF Subsystem       On both the primary and backup systems, the installed RVU of the TMF subsystem must be compatible with the installed RVU of the operating system.
PAGE 52
if you stop RDF. Audit trail pinning is lost if you stop TMF. See also the description of the UNPINAUDIT command in Chapter 8 (page 175). You can control when TMF dumps an audit trail by configuring TMF for dump to tape. For example, when configured with a tape dump process, TMF issues a prompt for the operator to mount a tape when TMF is ready to dump and purge an old audit trail file.
PAGE 53
• Copies of NonStop SQL views on the backup systems
• Placement of partitioned Enscribe files and NonStop SQL tables
Audited Files Per Volume on Primary System
The RDF updater process has a limit on the number of database files it can have open concurrently on a volume: 3,000. Therefore, when you set up your database on your primary system for RDF protection, you should ensure that you do not have more than 3,000 audited files on any single volume that you want replicated.
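A simple planning check for this limit might look like the following sketch. The volume names and file counts are hypothetical placeholders; how you obtain the count of audited files per volume is outside the scope of the example.

```python
# Limit stated in the text: an updater can have at most 3,000 files open
# concurrently on a volume.
MAX_AUDITED_FILES_PER_VOLUME = 3000


def volumes_over_limit(audited_file_counts: dict) -> list:
    """Return the volumes whose audited file count exceeds the limit."""
    return sorted(
        volume
        for volume, count in audited_file_counts.items()
        if count > MAX_AUDITED_FILES_PER_VOLUME
    )


# Hypothetical counts gathered for the primary system.
counts = {"$DATA01": 2500, "$DATA02": 3001, "$DATA03": 3000}
print(volumes_over_limit(counts))
```

A volume at exactly 3,000 files passes the check, since the text only rules out having more than 3,000.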
PAGE 54
you might want to replicate $CAT.DSMCAT.* on the primary system to $DATA.DSMCAT.* on the backup system. In that way replication of the DSM Tape Catalog and related files from the primary to the backup system does not affect the DSM Tape Catalog and related files in $CAT.DSMCAT.* on the backup system.
PAGE 55
Replicating Database Operations
Database administrators preparing to work with RDF should be aware of considerations concerning:
• NonStop SQL Data Definition Language (DDL) operations
• NonStop SQL DDL operations with Shared Access
• Enscribe file-label modifications
• Purge operations
• Partitioned files
• Temporary disk files
NonStop SQL DDL Operations
Although RDF replicates NonStop SQL Data Manipulation Language (DML) operations, it does not replicate NonStop SQL Data Definition Language (D
PAGE 56
Temporary Disk Files
File creation, modification, and updates are not replicated for audited temporary disk files. All audit data is filtered out by the extractor on the primary system for file names of the form $volume.#nnnnnnn. A filename that begins with # (pound sign) indicates a temporary disk file; this type of file name is returned when only the volume name is specified in a call to the file-system CREATE procedure or FILE_CREATE_ procedure.
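The name form described above can be recognized with a small pattern match. This is an illustrative sketch only: the regular expression assumes a volume name of `$` followed by one to seven alphanumeric characters starting with a letter, and a file part of `#` followed by digits; the function name is hypothetical and does not reflect how the extractor itself tests names.

```python
import re

# Assumed shape of a temporary disk file name: $volume.#nnnnnnn
TEMP_FILE_PATTERN = re.compile(r"\$[A-Z][A-Z0-9]{0,6}\.#[0-9]+", re.IGNORECASE)


def is_temporary_disk_file(filename: str) -> bool:
    """Return True if the name has the $volume.#nnnnnnn form."""
    return TEMP_FILE_PATTERN.fullmatch(filename) is not None


for name in ("$DATA01.#0001234", "$DATA01.SALES.ORDERS"):
    print(name, is_temporary_disk_file(name))
```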
PAGE 57
Configuring an SMF Environment on the Backup RDF System
RDF supports replication to SMF logical volumes on the backup system, with the following restrictions:
• When replicating to an SMF logical volume, the logical volume must belong to an SMF pool that contains 21 or fewer physical volumes; hence, each updater can apply audit to up to 21 physical disks.
• The RDF/IMP product limits the total number of physical or virtual UPDATE volumes to 255.
PAGE 58
3 Installing and Configuring RDF
After preparing your system configurations and user applications to meet RDF requirements, you are ready to install and configure RDF. This chapter, which is intended for system managers, system analysts, and database administrators, describes how to do these tasks.
PAGE 59
Preparing the Tables and Files
Now prepare your tables and files.
Separating NonStop SQL Tables
It is recommended that you avoid registering NonStop SQL tables protected by RDF in the same catalogs as tables that are not protected by RDF. Separating protected tables from unprotected ones simplifies the comparison of primary system catalogs with backup system catalogs.
PAGE 60
• All views and indexes dependent on base tables protected by RDF
• All program files for applications that use any base tables protected by RDF if you want the applications to run at the backup site after an RDF takeover operation
The backup system should also have copies of the following files in case an RDF takeover operation is necessary:
• OBEY command files and TACL scripts containing NonStop SQL/MP or NonStop SQL/MX DDL commands that define the database
• SQLCI or MXCI report definitions
To m
PAGE 61
1. Place the database creation commands in either an EDIT (command) file or TACL macro or routine. See the TACL Reference Manual for more information.
2. Through the TACL command interpreter, issue an OBEY filename command or run the macro to create the primary database.
3. Copy the command file or TACL macro to the backup system.
Now do the following on the backup system:
• Change any system references in the command file or TACL macro from the primary system name to the backup system name.
PAGE 62
5. Enter CREATE CONSTRAINT commands for any constraints that values in particular columns of the table must satisfy:
CREATE CONSTRAINT EMPNUM_CONSTRNT ON =EMPLOYEE CHECK EMPNUM BETWEEN 1 AND 99999;
6. Create the index for the NonStop SQL/MP table on the primary system:
CREATE INDEX =EMPLNAME ON =EMPLOYEE( LAST_NAME, FIRST_NAME );
7.
PAGE 63
The next examples of BACKUP and RESTORE commands show how to copy all files from the primary system volumes $DATA01, $DATA02, $DATA03, and $DATA04 to the magnetic tape device named $TAPE and how to restore these files to volumes of the same name on the backup system. You must include the AUDITED parameter in both the BACKUP and RESTORE commands.
BACKUP $TAPE,($DATA01.*.*,$DATA02.*.*,$DATA03.*.*, $DATA04.*.*), AUDITED
RESTORE $TAPE,($DATA01.*.*,$DATA02.*.*,$DATA03.*.*, $DATA04.*.
PAGE 64
Cache for RDF IMAGETRAILS and UPDATER UPDATEVOLUMES
When you have determined the volumes you wish to use for Imagetrails and Updatevolumes, you should configure several thousand 4k blocks of cache for each volume. This will considerably increase the performance of the receiver and updaters.
Installing RDF
The RDF/IMP, IMPX, or ZLT software, and all related documentation, is distributed on three independent product release compact disks (CDs).
PAGE 65
RDF/ZLT (T0618) Product Components
The release CD includes the following components for the RDF/ZLT product:
RDF/ZLT    The RDF/ZLT enabler module
Readme     The software documentation file
To use the RDF/ZLT product, you must purchase both RDF/IMPX and RDF/ZLT (two separate CDs), install RDF/IMPX, and then install RDF/ZLT.
PAGE 66
Table 4 RDF Process and Program Security Attributes (continued)

Program Name   Run Under a Specific Logon?   LICENSE Required for Object File?
MD5SRVO        NO                            NO
RDFCOM         YES; 255,nnn +                YES
RDFEXTO        YES ++                        YES
RDFMONO        YES ++                        YES
RDFNETO        YES ++                        YES
RDFPRGO        YES ++                        YES
RDFRCVO        YES ++                        YES
RDFSCAN        NO ++++                       NO
RDFSNOOP       YES +++                       YES
RDFUPDO        YES ++                        YES
READLIST       NO                            NO
RDIMAGE        YES ++                        YES

+ RDFCOM operational commands require super ID group access; however, INFO and STATUS commands can be issu
PAGE 67
• RDFRCVO. The RDF receiver program opens the image files in privileged mode and must be licensed with FUP or by running the RDFINST macro. RDFRCVO can be owned by any user ID.
• RDFSCAN. The RDFSCAN program contains no privileged calls or privileged code and need not be licensed. RDFSCAN can be owned and run by any user ID.
• RDFSNOOP. The RDFSNOOP program opens the image files in privileged mode and must be licensed with FUP or by running the RDFINST macro. RDFSNOOP can be owned by any user ID.
PAGE 68
If TMF was not running previously on the backup system, after you have installed TMF you must use TMFCOM to issue a START TMF command and one or more ADD DATAVOLS commands to add to the TMF configuration all disk volumes to be used by the RDF updater processes.
PAGE 69
For complete information about the INITIALIZE RDF command, see the description of the INITIALIZE RDF command in Chapter 8 (page 175).
Initializing RDF To a TMF Shutdown Timestamp
If TMF was running previously on the primary system and did not need to be initialized and configured, you can initialize RDF to a timestamp that reflects the time of the last TMF shutdown. This initialization is typically used when one stops TMF in order to initialize RDF to that TMF stop location.
PAGE 70
Determining a Valid inittime Value
When using the INITTIME parameter without the NOW clause, it is important that you specify a valid inittime value. To do so, first issue a STATUS RDF command and take note of the highest updater RTD time. Then round that RTD time up to the next higher minute (0:43 becomes 1:00, 1:27 becomes 2:00, 3:04 becomes 4:00, and so forth). Finally, subtract that rounded-up time from the current system time shown in the status display.
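The rounding and subtraction described above can be sketched in Python as follows. The function names and the sample status time are hypothetical, and the sketch assumes that an RTD already on a whole minute is left unchanged; the source text does not say how that case is handled.

```python
import math
from datetime import datetime, timedelta


def round_up_to_minute(rtd: timedelta) -> timedelta:
    """Round an RTD up to whole minutes (0:43 -> 1:00, 3:04 -> 4:00)."""
    # Assumption: an RTD that is already a whole minute stays as it is.
    return timedelta(minutes=math.ceil(rtd.total_seconds() / 60))


def inittime(status_time: datetime, highest_rtd: timedelta) -> datetime:
    """Subtract the rounded-up RTD from the time shown in the status display."""
    return status_time - round_up_to_minute(highest_rtd)


# Hypothetical STATUS RDF reading: system time 10:30:00, highest updater RTD 1:27.
status_time = datetime(2013, 9, 1, 10, 30, 0)
highest_rtd = timedelta(minutes=1, seconds=27)
print(inittime(status_time, highest_rtd))
```

Rounding up rather than to the nearest minute errs toward an earlier timestamp, which is the safer direction for an initialization point.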
PAGE 71
database. In this particular case, the database is not corrupted, but data corruption could happen for other NonStop SQL/MP or NonStop SQL/MX DDL SHARED ACCESS operations.
PAGE 72
5. Issue the STOP UPDATE command. This command stops the updaters but allows the extractor and receiver to continue shipping and storing audit, respectively.
6. Install the new RDF software in a different volume.subvolume from the one that houses the currently running version of RDF. For example, if you are upgrading to T0346ABS, you might specify $system.rdfabs.
7. Run $system.rdfabs.
PAGE 73
For RDF network environments, you should subtract an additional 15 minutes from the timestamp you calculated in Step 4.
PAGE 74
Setting Global Attributes
The SET RDF command establishes values for global attributes that apply either to the entire RDF system or to all updater processes. These attributes and their default values are:
• LOGFILE $0
• UPDATERDELAY 10 (seconds)
• UPDATERFOPENTHRESHOLD 2400
• UPDATERNSASUSPEND OFF
• UPDATERTXTIME 60 (seconds)
• UPDATERRTDWARNING 60 (seconds)
• UPDATEROPEN PROTECTED
• SOFTWARELOC $SYSTEM.
PAGE 75
UPDATERFOPENTHRESHOLD Attribute The UPDATERFOPENTHRESHOLD attribute enables you to set a limit on the number of files that can be opened by an RDF Updater process. The default value for this attribute is 80% of the value of Maximum Number of Concurrent File Opens for an RDF Updater process. The value of Maximum Number of Concurrent File Opens is 3000. Therefore, the default threshold value of UPDATERFOPENTHRESHOLD is 2400. Set the threshold value by using the SET RDF command when RDF is not running.
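The default value follows directly from the two figures given above. The following Python sketch only reproduces that arithmetic; the constant and function names are hypothetical and do not correspond to anything in the RDF software.

```python
MAX_CONCURRENT_FILE_OPENS = 3000  # per-updater limit stated in the manual
DEFAULT_THRESHOLD_PERCENT = 80    # default percentage stated in the manual


def default_fopen_threshold(max_opens=MAX_CONCURRENT_FILE_OPENS,
                            percent=DEFAULT_THRESHOLD_PERCENT):
    """Return the default UPDATERFOPENTHRESHOLD as a whole number of files."""
    return max_opens * percent // 100
```

With the stated values, 3000 multiplied by 80 and divided by 100 gives 2400.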
PAGE 76
user applications to open backup database files for read access but not for write access while the updater process has the file open. PROTECTED mode, however, is incompatible with taking online dumps and RELOAD operations. Therefore, if you want to perform one of these two operations, you need to change UPDATEROPEN from PROTECTED to SHARED. When you have finished the operation, you should set UPDATEROPEN back to PROTECTED. Previously you had to stop the updaters before you could change the UPDATEROPEN mode.
PAGE 77
REPLICATEPURGE Attribute The REPLICATEPURGE attribute specifies whether Enscribe purge operations on the primary system are to be replicated on the backup system. When set to OFF (the default value), Enscribe purge operations are not replicated. You should use the default (OFF) for all RDF configurations unless you have a specific need for replicating Enscribe purge operations.
PAGE 78
To create secondary image trails, use the ADD IMAGETRAIL command. Later, when you configure your individual updater processes, you assign each of these processes to a specific image trail. By spreading updaters across secondary image trails, you reduce the number of updaters contending for a specific trail. ATINDEX specifies which receiver will write to that trail; 0 is the default. Each secondary image trail contains the audit records needed by the associated updater processes.
PAGE 79
Use SET TRIGGER and ADD TRIGGER commands to configure the following trigger attributes: • PROGRAM • INFILE • OUTFILE • CPUS • PRIORITY • WAIT or NOWAIT The PROGRAM parameter specifies the name of a Guardian object file that is executed once RDF has reached a particular state, that is, after a STOP RDF, REVERSE, or TAKEOVER operation. The INFILE attribute specifies the name of an edit file that will be passed as the IN file to the trigger process when it is created.
PAGE 80
Thus, within its configuration file, the network master has all necessary information about every system in the RDF network (whereas the other systems have only a pointer enabling them to obtain information about other systems in the network). PRIMARYSYSTEM Attribute The PRIMARYSYSTEM attribute specifies the name of a primary system. There is no default value. Each primary system within an RDF network must be unique within the network.
PAGE 81
The CPUS attribute in the following form specifies the primary and backup processors in which the monitor will run: CPUS primary-CPU:backup-CPU If the primary processor is not available when RDF is started, the monitor executes in the specified backup processor without benefit of a backup process. When the primary processor is brought back online, the monitor creates its own backup process in the primary processor and then switches control to that monitor process.
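The placement rule for the monitor at startup can be modeled as follows. This is a hedged Python sketch of the behavior described above, not RDF code: the function names are hypothetical, and the handling of an unavailable backup CPU (no backup process, or an error if neither CPU is up) is an assumption that the text does not spell out.

```python
def parse_cpus(spec):
    """Split a 'primary-CPU:backup-CPU' string into two integers."""
    primary, backup = spec.split(":")
    return int(primary), int(backup)


def monitor_placement(spec, available_cpus):
    """Return (running_cpu, backup_process_cpu) at RDF start time."""
    primary, backup = parse_cpus(spec)
    if primary in available_cpus:
        # Normal case: run in the primary CPU. A backup process is shown
        # only if the backup CPU is also up (assumption).
        return primary, (backup if backup in available_cpus else None)
    if backup in available_cpus:
        # Primary CPU is down: run in the backup CPU with no backup process.
        return backup, None
    raise RuntimeError("neither configured CPU is available")  # assumption
```

For example, with CPUS 0:2 and only CPU 2 available, the sketch places the monitor in CPU 2 and reports no backup process.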
PAGE 82
to which protected data volumes are configured. You use a SET EXTRACTOR VOLUME statement for each individual volume. You do not need to specify whether the volume is an active volume, restore volume, or overflow volume; you merely specify the volume name. For information about the ZLT capability, see Chapter 17 (page 329).
PAGE 83
audit to the backup database. One can typically observe updater RTD times in the range of 1-20 seconds, although it may take an updater only a fraction of one second to apply 20 seconds' worth of audit. With FASTUPDATEMODE ON, as a receiver receives an extractor message, it buffers all the audit sent in that message by the extractor, writes those buffers immediately to the image trails, and then makes that data immediately available to the updaters.
PAGE 84
]SET PURGER PROCESS $PURG
]SET PURGER CPUS 0:2
]SET PURGER PRIORITY 185
]SET PURGER RETAINCOUNT 6
]SET PURGER PURGETIME 30
]ADD PURGER
You cannot start RDF until you have configured a purger process. You can issue ADD PURGER commands only when RDF is stopped.
PAGE 85
If the backup volume names are not identical to the corresponding primary volume names, then you will have to update every partitioned file and every file that has alternate keys on the backup system so that each points to the correct volume name. You can use INCLUDE and EXCLUDE lists to specify which files must be protected by RDF. The ALTER command supports only INCLUDE/EXCLUDE of files and does not support INCLUDEPURGE/EXCLUDEPURGE.
PAGE 86
If that happens, you should make sure you are using the correct RDFCOM. If you are using the correct version and you get this message, then you must reinitialize RDF. If RDFCOM cannot determine the configuration file version, it prints the following message to the home terminal and aborts the command: RDFCOM version (version) does not match the config file version unknown If that happens, you should make sure you are using the correct RDFCOM.
PAGE 87
If TMF BEGINTRANS is disabled, RDF issues an error message. Unless you explicitly specify otherwise, RDF always starts with updating enabled: all updater processes immediately begin updating their volumes by reading audit images from the RDF image files and applying the appropriate changes to the backup database files.
PAGE 88
4 Operating and Monitoring RDF To operate and monitor RDF, you enter commands through two online utilities: the RDFCOM and RDFSCAN interactive command interpreters. Through these utilities, you initiate communication with RDF, request various RDF operations or information displays, and terminate communication with the subsystem.
PAGE 89
RDFCOM is an implicit RUN command, instructing the TACL command interpreter to run the RDFCOM utility program. IN command-file specifies a command file from which RDFCOM commands are to be read. RDFCOM reads 132-byte records from the specified file until it encounters either the end-of-file mark or an EXIT command. If you do not specify the IN option, TACL automatically supplies the name of its current default input file—usually the terminal from which you issued the RDFCOM command.
PAGE 90
Starting a Session To start an interactive RDFCOM session, enter the RDFCOM keyword at your TACL prompt, followed optionally by the name of the RDF control subvolume: >RDFCOM [control-subvolume] For example, to start a session on a primary system named SANFRAN, you would enter the following command (assuming that no suffix character was specified in the INITIALIZE RDF command): >RDFCOM SANFRAN If the suffix character “3” was specified in the INITIALIZE RDF command, then you would enter the following comma
PAGE 91
Interrupting Command Processing You can interrupt RDFCOM processing by pressing the BREAK key at your terminal. RDFCOM responds as follows: • If you press BREAK at the RDFCOM input prompt (]), RDFCOM returns control of the terminal to RDFCOM’s parent process (typically, TACL) but continues execution. You can resume communication with RDFCOM by entering the operating system command PAUSE at the TACL prompt.
PAGE 92
SET RDF UPDATERTXTIME 60
SET RDF UPDATERRTDWARNING 60
SET RDF UPDATEROPEN PROTECTED
SET RDF NETWORK ON
SET RDF NETWORKMASTER ON
SET RDF UPDATEREXCEPTION OFF
ADD RDF
To run RDFCOM and execute the commands in this file, supply the command file name in the IN option of the command to start RDFCOM:
4> RDFCOM /IN RDFSET/ control-subvolume
When it uses a command file in this way, RDFCOM works in batch mode: RDFCOM begins the session, reads and executes each command from the command file, and displays the asso
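A command file of this shape can also be generated from a table of attribute values. The Python sketch below is an illustration only: the function name is hypothetical, and it assumes nothing beyond the pattern visible in the sample file (one SET RDF line per attribute, followed by ADD RDF).

```python
def render_rdf_command_file(attributes):
    """Build command-file text: one SET RDF line per attribute, then ADD RDF."""
    lines = ["SET RDF {} {}".format(name, value)
             for name, value in attributes.items()]
    # ADD RDF follows the SET commands, as in the sample file.
    lines.append("ADD RDF")
    return "\n".join(lines) + "\n"


sample = render_rdf_command_file({
    "UPDATERTXTIME": 60,
    "UPDATERRTDWARNING": 60,
    "UPDATEROPEN": "PROTECTED",
})
print(sample, end="")
```

The resulting text would still have to be placed in an edit file on the NonStop system and passed to RDFCOM through the IN option.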
PAGE 93
rdfcom; initialize rdf,backupsystem \SF!
rdfcom /in $system.boston.rdfcfg/
rdfcom; start rdf
You would execute these commands as an OBEY file at your TACL prompt. For this example, assume you have been running an RDF subsystem where \Boston is your primary system and \SF is your backup system. You have stopped TMF and RDF, you have reinitialized and reconfigured TMF, and you want to reinitialize, reconfigure, and restart RDF.
PAGE 94
Controlling Multiple RDF Environments Running on Different Nodes with a Single Obey File If you have multiple RDF subsystems running on different nodes, you can control those subsystems from a single obey file. Consider the following environment.
PAGE 95
Table 5 RDFCOM Configuration Commands (continued) Command Object Function RESET RDF; MONITOR; EXTRACTOR; RECEIVER; VOLUME; IMAGETRAIL; PURGER; RDFNET; NETWORK; TRIGGER; Resets all option values in the configuration memory table to their default values for the specified process. SET RDF; MONITOR; EXTRACTOR; RECEIVER; VOLUME; IMAGETRAIL; PURGER; RDFNET; NETWORK; TRIGGER; Adds option values to the configuration memory table for the specified process.
PAGE 96
Table 6 RDFCOM Operational Commands (continued) Command Object Function TAKEOVER - -; Initiates an RDF takeover operation on the backup system. UNPINAUDIT - -; Unpins TMF audit trail files on the primary system. VALIDATE CONFIGURATION; Validates the current attribute values in the RDF configuration file. Utility Commands RDFCOM utility commands and their functions are listed in Table 4-3. All of these commands can be issued by anyone.
PAGE 97
Requesting Online Help Through the RDFCOM HELP command, you can display brief descriptions of: • RDFCOM command syntax (including syntax for the HELP command itself) • Numbered RDF messages (such as 700, 705, and so forth) The HELP text is intended as a reminder, not as a substitute for this manual. Help for Command Syntax To obtain syntax information for an individual command, enter HELP followed by the command name.
PAGE 98
FC HELP HISTORY OBEY OPEN OUT RDF Concepts: Abbreviations RDF error messages: error-number E.g., "help 700" prints an explanation for the RDF error message 700 Help for RDF Error Messages For information about a particular error message (its cause, effect, and recommended recovery steps), enter HELP followed by the message number.
PAGE 99
RDFSCAN has two operational restrictions: • RDFSCAN does not use command files; you must enter all RDFSCAN commands from the terminal. • RDFSCAN accepts only one command per prompt. Starting a Session To start an interactive session with RDFSCAN, enter the RDFSCAN keyword at your TACL prompt, followed by the name of the entry-sequenced file you want to open. For example, to begin an RDFSCAN session and open the file $SYSTEM.RDF.MSGLOG for scanning, enter: >RDFSCAN $SYSTEM.RDF.
PAGE 100
Table 8 RDFSCAN Commands (continued) Command Object Function LIST number Beginning at the current record, examines subsequent messages in the message file and displays those that contain the current match pattern. The operation terminates when the specified number of matches has been found or the last record is encountered, whichever happens first. LOG log-file Copies any log messages subsequently displayed on the screen by LIST commands to the specified file.
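The stopping rule of the LIST command (stop at the requested number of matches or at the last record, whichever comes first) can be modeled with a short Python sketch. It is not RDFSCAN code: the function name, the zero-based record indexes, and the plain substring match are all assumptions made for illustration.

```python
def list_matches(records, current, number, pattern=None):
    """Return up to `number` (index, text) pairs starting at `current`."""
    if number <= 0:
        return []
    shown = []
    for index in range(current, len(records)):
        text = records[index]
        # With no pattern set, every record counts as a match (assumption).
        if pattern is None or pattern in text:
            shown.append((index, text))
            if len(shown) == number:
                break  # requested number of matches found
    return shown  # the loop also ends when the last record is reached
```

Calling the function with a pattern of "700" and a count of 2 returns the first two records at or after the current position that contain "700", or fewer if the file ends first.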
PAGE 101
Valid RDFSCAN commands are:
At - Sets the current-record pointer to the value given.
Display - Turns ON/OFF the display of line number with each line listed.
Exit - Exits.
File - Changes the RDF message file being scanned.
Help - Help.
List - Displays "n" lines of the RDFLOG with optional pattern matching.
LOG - Sets and Opens the message file for echoing of Listed lines.
Match - Sets the pattern to be matched (or turns it off.)
NOLOG - Closes the log file.
Scan
PAGE 102
command. This can also be obtained if you are using ASAP (see Appendix E (page 492)). The display returned by the RDFCOM STATUS RDF command is as follows: ]STATUS RDF In response, RDF displays: RDFCOM - T0346H09 – 11AUG08 (C)2008 Hewlett-Packard Development Company, L.P. Status of \RDF04 -> \RDF05 RDF 2008/08/11 05:26:49.082 Control Subvol: $SYSTEM.
PAGE 103
Table 9 RDF States (continued) Status Description UPDATERNSASUSPEND attribute). You must consult the RDF LOG for either the RDF event 905 or 908 to determine if it is safe for you to perform the DDL operation on the backup system. * Monitor Unavailable * The monitor is either stopped or is running but unable to respond. The latter situation can happen for several reasons, such as the STATUS RDF command having been issued from the backup system when the Expand connection to the primary system is down.
PAGE 104
become obsolete, but it continues to be displayed for long standing continuity with older RDF releases. The RTD value reported for each updater process is the difference between the “last modified time” of the latest file in the audit trail to which the volume protected by the updater is configured and the timestamp from the latest image record that the updater has read. The RTD value reflects, in the most general sense, the amount of time by which the backup database is behind the primary database.
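The RTD definition above is a simple difference of two timestamps. The Python sketch below illustrates it with hypothetical function names; the minutes:seconds formatting is an assumption chosen for readability, not a statement about how RDFCOM formats the value.

```python
from datetime import datetime


def updater_rtd(audit_file_last_modified, last_image_record_time):
    """RTD = audit trail file's last-modified time minus image record timestamp."""
    return audit_file_last_modified - last_image_record_time


def format_rtd(rtd):
    """Format a timedelta as minutes:seconds, for example 1:27."""
    total_seconds = int(rtd.total_seconds())
    return "{}:{:02d}".format(total_seconds // 60, total_seconds % 60)


rtd = updater_rtd(datetime(2008, 8, 11, 5, 26, 49),
                  datetime(2008, 8, 11, 5, 25, 22))
print(format_rtd(rtd))  # prints 1:27
```

Because the first timestamp is the last-modified time of the audit trail file rather than the current time, the value is only a general indication of how far the backup database lags behind the primary.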
PAGE 105
column for any RDF process, you should examine the messages in the RDF log file or on the RDF log device to determine what is happening and what corrective action to take. Except for updaters, asterisks in the Error column continue to appear in every STATUS RDF display until the error condition has been corrected.
PAGE 106
These are the only configuration attributes that can be altered while RDF is running. To change any other configuration attributes, you must first stop RDF or UPDATING as directed in “Restarting RDF” (page 125). To change any of the attribute values listed, you start RDFCOM and use the ALTER command. ALTER is a restricted command; it can be issued only by members of the super ID group. See the description of the ALTER command in Chapter 8 (page 175).
PAGE 107
PURGETIME The PURGETIME purger process configuration attribute specifies the number of minutes the purger process waits between attempts to purge redundant image trail files. Altering this attribute causes the purger to perform a purge pass immediately. UPDATERDELAY The UPDATERDELAY global configuration attribute specifies how many seconds the updater processes should delay upon reaching the logical EOF in the image trail before attempting a new read.
PAGE 108
Recovery: Consult the description of the PROCESS_CREATE_ procedure in the Guardian Procedure Calls Reference Manual to determine the cause of the failure. Once the underlying cause is corrected, RDF can be restarted. To isolate RDF messages from the rest of the EMS log, you can use the standard EMS filter RDFFLTO to produce an intermediate entry-sequenced file which you then can scan using the RDFSCAN utility.
PAGE 109
the next line. User input appears in boldface type. Notice also that record numbers, which do not appear in the previous display, have been enabled for this one. >RDFSCAN RDFSCAN - T0346A06 - 14MAR04 (C)1988 Tandem (C)2004 Hewlett Packard Development Company, L.P. File: $SYSTEM.RDF.RDFLOG, current record: 891, last record: 903 Enter HELP ALL for instructions Enter the RDFSCAN function you want: AT 750 File: $SYSTEM.RDF.
PAGE 110
5 Critical Operations, Special Situations, and Error Conditions When running RDF, there are a number of critical operations and situations that need careful consideration. Understanding all aspects of these operations and situations is essential. Understanding critical operations ensures that you perform said operations correctly, quickly, and efficiently. Understanding critical situations and error conditions ensures that you achieve resolution as quickly as possible.
PAGE 111
To analyze a file-system error, see the appropriate table in this discussion, reading about any corrective action specific to RDF. Then, for further information about the error (its cause, effect, and general recovery procedures), see the file-system information in the Guardian Procedure Errors and Messages Manual. Some errors involving one or more updaters might require you to resynchronize certain files; see the EMS event log for further information.
PAGE 112
Table 10 Recovery From File Modification Failures (RDF Event 700) (continued) File System Error Recovery Action 130 through 139 Repair the device or clear the condition. 157 Check file integrity. 190 Repair the device or clear the condition. 200 through 231 Repair the device or clear the condition. 707 Enable the volume for TMF transaction processing. Table 11 lists the file-system error numbers and recovery actions for RDF event 705, which reports file-opening failures.
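An operator script could encode such a table as a range lookup. The Python sketch below covers only the rows of Table 10 that appear in this excerpt (the earlier part of the table is not reproduced here), and the names used are hypothetical.

```python
# Rows visible in the continued part of Table 10 (RDF event 700).
EVENT_700_RECOVERY = [
    (range(130, 140), "Repair the device or clear the condition."),
    (range(157, 158), "Check file integrity."),
    (range(190, 191), "Repair the device or clear the condition."),
    (range(200, 232), "Repair the device or clear the condition."),
    (range(707, 708), "Enable the volume for TMF transaction processing."),
]


def recovery_action(file_system_error):
    """Return the listed recovery action, or None if the error is not covered."""
    for error_numbers, action in EVENT_700_RECOVERY:
        if file_system_error in error_numbers:
            return action
    return None
```

A result of None means only that the error number is outside the rows shown here, not that no recovery action exists.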
PAGE 113
Table 12 Recovery From File Creation Failures (RDF Event 739) (continued) File System Error Recovery Action 50 through 58 Repair the device or clear the condition. 59 Check file integrity. 60 through 66 Repair the device or clear the condition. 100 Repair the device or clear the condition. 103 Repair the device or clear the condition. 120 through 122 Repair the device or clear the condition. 130 through 139 Repair the device or clear the condition. 157 Check file integrity.
PAGE 114
Exceeding the UPDATERFOPENTHRESHOLD Value The UPDATERFOPENTHRESHOLD attribute enables RDF users to set a threshold value on the number of files that can be opened by an RDF Updater process. The default value for this attribute is 80% of the value of the Maximum Number of Concurrent File Opens for an RDF Updater process. The value of Maximum Number of Concurrent File Opens is 3000. Therefore, the default threshold value of UPDATERFOPENTHRESHOLD is 2400.
PAGE 115
4. After one minute, STOP RDF again.
5. Restart RDF with UPDATE OFF; this causes the receiver to roll over to a new image file on each image trail.
6. On the image trail for the updater generating the 813 events, move the next file in sequence (the one after the file identified in step 1) to a different subvolume. For example, if the updater is reading file AA000100, then move AA000101.
7. START UPDATE.
8.
PAGE 116
NOTE: If you issue a STOP RDF command on the primary or backup system while the network is down, you must also issue a STOP RDF command on the other system while the network is still down. If you have an RDF network running and the Network Master's RDFNET process encounters a communications line failure when attempting to perform a network transaction on another primary node in the RDF network, then it can lead to an increase in work to be performed during an RDF Takeover operation.
PAGE 117
arriving out-of-order: if a message arrives out-of-order, the receiver simply directs the extractor to restart. When the CPU that failed comes back up, RDF switches the extractor to run on the reactivated primary CPU. If both the primary and backup CPUs of the extractor process fail, RDF aborts. Receiver Failure If the primary CPU of the receiver process fails, the receiver process in the backup CPU takes over and resynchronizes with the extractor process.
PAGE 118
If a state transition failure occurs during execution of a STOP UPDATE command and the operation appears to be stalled, manually stop all of the RDF updaters by issuing the following command on the backup system: STATUS *, PROG RDF-software-loc.RDFUPDO, STOP CAUTION: Issuing this command in this situation is only safe, however, if this is the backup system for a single RDF environment.
PAGE 119
3. From the backup system, restart TMF on the backup system by entering this command through TMFCOM: ~START TMF 4. From the primary system, resume updating of the backup database by entering this command through RDFCOM: ]START UPDATE Volume Recovery Processing RDF handles volume recovery automatically. Volume Recovery Failure RDF cannot recover from a TMF subsystem failure if TMF cannot successfully perform volume recovery.
PAGE 120
You should not perform a file recovery to a timestamp, first purge, or TOMATPOSITION on your backup system if the location occurs prior to an RDF takeover location. Those file recovery operations normally are used to recover a database that has been corrupted. Under normal circumstances, the best way to recover the backup database is to resynchronize it with your primary database.
PAGE 121
When the extractor pins an audit trail file, it does this by sending a message to TMF, asking TMF to keep the file pinned until the extractor no longer needs it, so in reality it is TMF who actually pins the file on behalf of RDF. When the extractor rolls over from one audit trail file to the next, it unpins the earlier file and pins the next file. NOTE: TMF keeps the extractor's audit trail file pinned even if you stop RDF.
PAGE 122
Stopping RDF leaves the backup database in an inconsistent state and also leaves the audit trail file last opened by the extractor pinned. CAUTION: If the primary system crashes, RDF processes on the backup system remain running. If you do not execute a takeover and are able to bring the primary system back up, you must stop the RDF processes on the backup system before you restart RDF on the primary system. While the primary system is down, issue STOP RDF on the backup system.
PAGE 123
the extractor can transmit data to the receiver on the backup system. If the extractor is not reading the MAT, it cannot encounter the TMF shutdown message. Two situations could arise: • If the communications lines come back up before you restart TMF, RDF encounters the TMFCOM STOP TMF record in the MAT and then stops processing.
PAGE 124
If the communications lines between the two systems are down when you issue the STOP RDF command, the monitor tells the extractor to stop and writes an error message for every process running on the backup system that the monitor could not access; the monitor then stops itself. If this situation occurs, you must use RDFCOM on the backup system to stop the remaining RDF processes before you can restart RDF.
PAGE 125
not had to stop TMF to get into this state. When you are ready to restart RDF, just enter the START RDF command and it will resume where it left off last. CAUTION: If you do not stop the application that is updating your RDF protected database until after you have issued the STOP RDF, DRAIN command, then the backup database has a low probability of being logically identical to the primary database after RDF shuts down.
PAGE 126
Below, the general steps involved in coordinating a switchover of business operations from the primary to backup system and back are provided. These only address the aspects of the switchover itself with regard to RDF operation and general business operations.
PAGE 127
When the extractor receives notice of the operation, it notes where it is in the audit trail and shuts down, and the updaters shut down as soon as they have reached the equivalent location. This is identical to the DRAIN command. Next, RDF automatically executes the REVERSE trigger that you have configured.
PAGE 128
1. On system \B, stop RDF subsystem #2. Note the local system time; you will need it later.
2. On system \A, stop the business applications that access the primary database (Applications #1).
3. On system \A, stop TMF (or if you do not want to stop TMF, use the STOP RDF, DRAIN command).
4. Wait for RDF subsystem #1 on \A to shut down.
5. On system \B, restart Applications #1.
PAGE 129
takeover operation. There are special considerations that pertain to the Takeover command in a ZLT environment. See Chapter 17 for details. With the RDF/IMPX product, it is possible that some transactions that committed on the primary system might be lost due to an unplanned outage. How many committed transactions are lost depends entirely on whether the extractor was keeping up at the time of the outage or whether the extractor had fallen behind for some reason.
PAGE 130
running, then RDFCOM aborts the command immediately. If you include the ! option, then RDFCOM does not try to reach the monitor and extractor on the primary system. The ! option also determines whether or not you are prompted to confirm your intention of performing the Takeover operation, a topic discussed a little further below. It is highly recommended that you do not use the ! option because it prevents the Takeover command from getting started if the primary system is still running.
PAGE 131
3. To proceed with the takeover operation, enter Y or YES. To abort the takeover operation, enter N or NO. After you enter your response, RDFCOM returns its prompt. Once the Takeover operation is underway, you can use the STATUS RDF command to determine the progress of the takeover operation. If the takeover operation is still in progress, RDF displays the current state as “TAKEOVER IN PROGRESS.
PAGE 132
When all of the updater processes have stopped, the purger logs either the RDF event number 724 or 725 before stopping. Event 724 indicates that the takeover completed successfully. Event 725 indicates that it did not, and you should reissue the TAKEOVER command. Event 724 is always followed by event 735, which indicates the last MAT position seen by the receiver process. The 735 event is used primarily for triple contingency. These events will be followed by either RDF event 888 or 858.
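A monitoring script could classify the outcome of a takeover from the event numbers found in the RDF log. The Python sketch below is an assumption-laden illustration: the function name and return strings are hypothetical, and treating the most recent 724 or 725 event as decisive is a design choice made here, not something the manual states.

```python
def takeover_outcome(logged_events):
    """Interpret purger events, given event numbers in the order logged."""
    # Use the most recent completion event, in case TAKEOVER was reissued.
    for event in reversed(list(logged_events)):
        if event == 724:
            return "takeover completed successfully"
        if event == 725:
            return "takeover did not complete; reissue the TAKEOVER command"
    return "no completion event logged yet"  # assumption: still in progress
```

For example, a log containing 725 followed later by 724 and 735 is reported as a successful takeover, because the later attempt succeeded.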
PAGE 133
audit to be undone by an updater is large (for example, thousands of records), then logging an exception record for each record undone could slow down the takeover work of each updater. You can choose whether you want an exception record for each audit record undone during the takeover operation when you configure the RDF UPDATEREXCEPTION attribute. If you set it ON, the updater logs an exception record for each audit record on which it executes undo.
PAGE 134
Therefore, taking an online dump before resuming business operations is important, but when do you do it? If you wait until after the RDF takeover operation has completed, then it could take many hours before the online dumps complete, and only then would it be safe to resume business operations. Thus, not taking regular online dumps of your backup database can lead to a significant length of time before you can safely resume business operations on your backup system.
PAGE 135
c. d. If you use the same application for query processing as well as read/write access, and you are already performing query processing on your backup database, you will need to have the application close all files currently open for read-only access, and then reopen them for read/write access after the RDF takeover. In this sense, it may be advantageous to have one application that performs query processing and another that does read/write operations.
PAGE 136
12. Test Your Switchover/Takeover Procedures You may not know whether you have everything you need on your backup system to move business operations from your primary system to your backup until you perform that task. If you wait until you actually encounter a disaster and must move business operations to the backup system, you may find that you are missing important items that you need.
PAGE 137
Restoring the Primary System After you initiate a takeover, it is possible that the last committed transactions on the primary system did not make it to the backup system (meaning that the backup and primary databases are not synchronized). When the failed primary system is restored to operable condition you have two methods of resynchronizing your primary database with your backup database where your applications are now running. One method is online, and the other is offline.
PAGE 138
8. Turn on updating.
9. When RDF has caught up, do a planned switchover from \B to \A (as described earlier).
If you have an RDF Network, there are some situations where File Recovery with the TOMATPOSITION option is not possible. If that is the case, RDF logs an RDF Event 858 at the end of the takeover operation.
PAGE 139
For a complete discussion of FASTUPDATEMODE, see the description for this attribute under the SET RECEIVER command in “Entering RDFCOM Commands” (page 175). While having FASTUPDATEMODE turned on does give you read access to data freshly committed on the primary system as soon as possible, please note that the option still only provides you with BROWSE access.
PAGE 140
Because the updater may have applied some audit for transactions that had not yet committed at the specified timestamp, it then executes an undo pass to undo those specific records. For the undo pass, the purger builds an undo list based on those transactions that the updaters need to undo, the updaters read this list, and they read backwards in the image trail, performing logical undo operations on those records that need to be backed out.
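The undo pass can be pictured as a backward scan of the image trail filtered by the undo list. The Python sketch below is a simplified model only: image records are represented as dictionaries with a hypothetical "tx" field, and the actual logical undo of each record is not modeled.

```python
def undo_pass(image_trail, undo_list):
    """Return the records to back out, in the order they would be undone."""
    transactions_to_undo = set(undo_list)
    undone = []
    for record in reversed(image_trail):  # read the trail backwards
        if record["tx"] in transactions_to_undo:
            undone.append(record)  # stand-in for the logical undo operation
    return undone
```

Given a trail with records for T4, T5, and T4 again, and an undo list containing only T4, the sketch returns the later T4 record first and then the earlier one, leaving T5 untouched.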
PAGE 141
The only operations that must be performed WITH SHARED ACCESS are merge partitions and move boundaries. It is recommended that you perform all other operations with nonshared access. NOTE: When you make DDL changes to your primary database, you can use the NonStop SQL DDL Replicator product to replicate NonStop SQL/MP DDL changes to your backup database automatically, instead of you having to perform those changes manually on the backup system.
PAGE 142
generates the RDF event 908, you can perform steps 2 and 3. The output of the RDFCOM STATUS RDF command when the updaters are shut down is as follows: ] status rdf Status of \POS03 -> \METIS RDF 2011/08/18 9:57:24.070 Control Subvol: $SYSTEM.POS03 Current State : Updated NSA Stopped Reason: NSA operation done on \METIS.$FC0500.DB.
PAGE 143
Purger $TPURP $DATA01->$FC0000 $MU 164 165 1: 0 $SWAP01 12 1044480 1: 0 susp CAUTION: While the NonStop SQL products allow a DDL change with Shared Access where the target is located on a different node, RDF does not support this occurrence. For example, you assigned a Table X on your RDF primary system \A and you want to create a new partition for the table on \B.
PAGE 144
1. 2. 3. Execute a process that opens the image trail file with shared read access. This can be a simple process that you supply to perform only this operation. When the purger determines that all updaters are finished with this image trail file (named, say, AA000007), and have moved on to the next image trail file (named, say, AA000010), then it might try to purge AA000007. The purge operation will fail, however, because your process still has AA000007 open.
PAGE 145
Of course, if you are taking online dumps of your backup database, you must also configure TMF to perform audit dumping either to tape or disk. Doing FUP RELOAD Operations With Updaters Running Because the backup database is audited by TMF, you cannot do FUP RELOAD operations on it unless you have altered the RDF UPDATEROPEN attribute to SHARED. Previously you needed to stop the updaters before you could alter this attribute, but RDF now allows you to do this online.
PAGE 146
NOTE: If you enter the SCF PRIMARY DISK command for an updater's UPDATEVOLUME, the affected updater might report a number of RDF 700 events with the file-system errors 10, 11, and 71. If these errors occur, they will be reported immediately following the disk primary event. In this situation, these errors can be expected and they do not indicate that the backup database has become inconsistent with the primary database.
PAGE 147
6 Maintaining the Databases A vital task in working with RDF is to keep the backup and primary databases synchronized with each other.
PAGE 148
being logically identical is the fraction of a second it takes the extractor and updaters to catch up with the MAT.
Figure 8 Synchronized Databases During RDF Operations [diagram labels: Database, MAT, Extractor, Image File, Updater, Primary, Backup]
Figure 9 shows synchronized databases where the application is running on \PRIMARY and the transaction data for the three new transactions has been applied to the backup database.
PAGE 149
Figure 10 shows synchronized databases where TMF has just been shut down. The databases are synchronized because RDF applies all audit generated on \PRIMARY to the backup database before the subsystem reads the TMF shutdown record and subsequently shuts down (the databases are not, however, logically identical until RDF has actually shut down).
PAGE 150
Figure 11 Unsynchronized Databases [diagram: primary database and MAT, image file, updater, backup database] Making Changes to Database Structures When you change the structure of a database on the primary system, you also need to change the structure on the backup system.
PAGE 151
The following guidelines apply to creating catalogs: • If a catalog exists on a volume protected by RDF, this catalog should also be present on the corresponding volume on the backup system. • To avoid errors, create a catalog on the backup system before creating it on the primary system. If audit data is generated for a primary catalog before the corresponding backup catalog exists, every audit record for the catalog causes a file open error.
PAGE 152
Adding a New Column This is an operation that cannot be performed With Shared Access. To minimize application downtime, you can coordinate the operation as follows. Stop the RDF updaters with a simple STOP UPDATE command. When the updaters have stopped, add the column to your backup database and then restart update. Note, at this point, the new column is in the backup database but not yet on the primary.
PAGE 153
Each NonStop SQL/MP index is assigned a unique key specifier that is stored as part of the key for that index. You can explicitly define the key specifier by including the KEYTAG clause in the CREATE INDEX command. If you do not do so, then the CREATE INDEX operation assigns a numeric value based on the order of index creation (1, 2, 3, and so forth). Because the key specifier is part of the key of every index row created on an RDF primary system, it also becomes part of the associated TMF audit record.
PAGE 154
Partition Key Changes If you change a key for any partition on the primary system, you must also change the key for the corresponding partition on the backup system. Table Purges If you use the SQLCI PURGE command to purge a protected table from the primary system, you must also purge the corresponding table from the backup system. You should not purge a table on the backup system until you are sure RDF has completed all processing on the table.
PAGE 155
4. When the purger has logged RDF event 852, perform the same DDL operation on the backup system.
5. START RDF on the primary system.
6. Start application processing on the primary system.
Resynchronizing Databases There are two ways of resynchronizing your primary and backup databases: offline and online. With offline resynchronization you must first stop your applications and TMF on the primary system.
PAGE 156
To purge a NonStop SQL/MP or NonStop SQL/MX database, use the SQLCI/MXCI PURGE utility and DROP command, as explained in the SQL/MP Installation and Management Guide and the SQL/MX Installation and Management Guide. To recopy a database to the backup system, follow the instructions in “Synchronizing the Primary and Backup Databases” (page 60).
PAGE 157
7 Online Database Synchronization With RDF/IMP, IMPX, or ZLT you can synchronize entire databases or selected volumes, files, tables or even partitions while your applications continue to run. For information about NonStop SQL/MX databases, see Chapter 16 (page 316). Overview The RDF online database synchronization protocol consists of the following general steps (the details of which are discussed later in this chapter): • Initialize the RDF configuration with the SYNCHDBTIME option.
PAGE 158
NOTE: RDF does not replicate NonStop SQL/MP and NonStop SQL/MX catalogs. Therefore, if you are synchronizing NonStop SQL/MP and NonStop SQL/MX tables, you might need to create NonStop SQL/MP and NonStop SQL/MX catalogs manually on the backup system if they do not already exist. Synchronizing Entire Databases Online To synchronize an entire RDF backup database to the primary database online: 1. If RDF is currently running, issue a STOP RDF command on the primary system. 2.
PAGE 159
of this command is to enable RDF to determine when the synchronization operation has completed and the backup database is synchronized with the primary database. When the extractor completes its role in the online synchronization operation, it generates the RDF Event 782 and then resumes normal operations. For more detailed information, see “Phases of Online Database Synchronization” (page 172). 6.
PAGE 160
Duration and Preparation Issues As indicated in the steps described, getting a complete copy of your entire database and placing it on the backup system can take quite a bit of time, and you cannot start the updaters until the database is fully prepared on the backup system. This leads to an issue that you must consider. While you are making a copy of the database and then getting it prepared on the backup system, you must run RDF with UPDATE OFF.
PAGE 161
If you then initialize the RDF subsystem to a point in the MAT prior to the Stop-RDF-Updater record associated with the partition boundary change and an updater encounters audit records associated with keys N through Z, the updater will report an error because it will try to apply the audit records to tableA (which used to contain those keys, but now does not), and the audit records will not be applied to the backup database.
PAGE 162
Special Consideration for Enscribe Files If you create empty Enscribe files on your primary system, you should create them with the audit attribute set off. This is particularly important if you create them on volumes protected by RDF. If you create them as audited files on database volumes that are being protected by RDF, the updaters also create them on the backup system. You then must purge the files on your backup system before copying the loaded files from the primary system.
PAGE 163
You could also copy the empty file to the backup system, insert a record into the file on the backup system, and delete the inserted record: 1. Start a transaction, do a WRITE to the empty queue file, and commit the transaction. 2. Start a new transaction, do a READUPDATELOCK on the record, and commit the transaction. This procedure pops the inserted record from the file, but leaves the special “dummy” record in the 0th position. You must do this operation before you start RDF updating.
PAGE 164
FUP ALTER $DATA1.TEST.PART0100 ALTFILE (0,\backup.$DATA.TEST.ALTF0100) • If you use the SQLCI DUP or MXCI DUP command to move duplicate partitioned NonStop SQL tables, you must use the MAP NAMES option to specify the backup system name. • If you use the SQLCI DUP or MXCI DUP command to move NonStop SQL tables with index tables, you must use the MAP NAMES option to specify the backup system name.
PAGE 165
create $data3.test.part0200
set altfile (0, $data3.test.altf0201 )
create $data3.test.part0201
5. After using a VOLUME command to specify the primary database volume from which you want to extract the data, load the empty duplicate files:
volume $data0.test
load part0100, $data2.test.part0100, share, sorted
load part0101, $data2.test.part0101, share, sorted
load altf0100, $data2.test.altf0100, share, sorted
load altf0101, $data2.test.altf0101, share, sorted
volume $data1.test
load part0200, $data3.
PAGE 166
Example #1 – Staged Synchronization of an Entire Database Suppose you are synchronizing your entire database by synchronizing selected portions first. Suppose your database is on ten volumes and you want to synchronize two volumes at a time. You would start by synchronizing your first two volumes, following the guidelines for synchronizing an entire database. When this operation has completed and the RDF updaters are fully caught up, you stop the NonStop RDF product.
PAGE 167
synchronized and load duplicate copies of the files or tables to be synchronized. Also, when determining what timestamp to specify with the SYNCHDBTIME attribute, you should follow the guidelines for the INITTIME option. There are a variety of considerations when synchronizing portions of a database. Read the following carefully.
PAGE 168
Relative Files with Create/Load (Step 4, Method 1) First create a non-audited duplicate file on the primary system. You must create the entire file with all its partitions. Unlike key-sequenced files, you must load the entire file. For example, assume the file has two partitions: $DATA1.TEST.PART0100 (primary) and $DATA2.TEST.PART0100 (secondary). Issue the following command: FUP CREATE $DATA1.TEMP.PART0100, LIKE $DATA1.TEST.PART0100, NO AUDIT That command creates the two files $DATA1.TEMP.
PAGE 169
and RESTORE 2) when restoring them on the backup system in order to specify the correct system name. You cannot, however, include both the MAP NAMES (LOCATION option with BACKUP and RESTORE 2) and PARTONLY options in the RESTORE operation. Therefore, because you must use MAP NAMES or LOCATION, you cannot restore only a single partition. Described below is a set of steps that can be used to synchronize individual partitions of NonStop SQL/MP tables (either primary or secondary partitions).
PAGE 170
1. If RDF is currently running, issue a STOP RDF command on the primary system.
2. Purge the RDF control subvolume and then issue an INITIALIZE RDF command of the following form on the primary system: INITIALIZE RDF, BACKUPSYSTEM \system, SYNCHDBTIME ddmmmyyyy hh:mm For the timestamp, follow the guidelines for the INITTIME option.
3. Configure RDF and then issue a START RDF, UPDATE OFF command on the primary system.
4.
PAGE 171
14. Use the RESTORE utility with the PARTONLY option to put the loaded primary partition of the duplicate table into the correct location. MAP NAMES is not required because the loaded partition now has the correct name on tape and can be restored directly. 15. When the extractor has logged the message indicating it has completed its role in the online synchronization operation, issue the RDFCOM START UPDATE command on your primary system.
PAGE 172
7. If you created the duplicate table on the primary system, then use the BACKUP utility to put the entire duplicate table with all partitions onto tape. If you created the duplicate table directly on the backup system, skip this step. 8. If you created the duplicate table on the primary system, then use the RESTORE utility to put the entire duplicate table with all its partitions onto disk on the backup system. You must use MAP NAMES to correct the system name. $DATA.DUP.
PAGE 173
after the load or backup operation completed. At this time, the extractor begins building a list of all transactions that might have been started during the create/load or backup operation. Upon completion of phase 1, part 1, the extractor logs message 766. Phase 1, Part 2 The extractor has reached the next TMP control point record in the audit trail and now has a list of all transactions that might have been started during the create/load or backup operation.
PAGE 174
functions. Where it resumes, however, depends upon where it was when the restart condition occurred. • If the restart condition occurs prior to the start of phase 1, the extractor resumes wherever the receiver tells it to. • If the restart condition occurs after phase 1 has begun, the extractor might choose to resume at an earlier position than the receiver tells it to. It does this to ensure that it has handled all committed and aborted transactions correctly.
PAGE 175
8 Entering RDFCOM Commands To manage, operate, and control RDF and its environment, you enter commands through the RDFCOM online utility. This chapter, directed to system managers and operators, describes the RDFCOM commands and their attributes.
PAGE 176
command, the default security requirements appear under the heading “Security Restrictions.” In general, the default security restrictions for RDFCOM commands are: • The EXIT, FC, HELP, HISTORY, INFO, LIST, OBEY, OPEN, OUT, SHOW, and STATUS commands can be used by all users. • The START RDF and TAKEOVER commands can only be used by the member of the super ID group who initialized RDF. • The other RDFCOM commands can be used only by members of the super ID group.
PAGE 177
Table 13 Systems for RDFCOM Commands (continued) Extractor Image Monitor RDF Receiver Purger Trail Update Volume RDFNET Network Trigger RESET P P P P P P P P P P SET P P P P P P P P P P SHOW P E E E E E E E E E E E START STATUS P E E STOP Other Objects P E E E E** P P* TAKEOVER B UNPINAUDIT P VALIDATE P Legend P = Primary only B = Backup only E = Either * = SYNCH ** = RTDWARNING Table 14 Default User Security for RDFCOM Commands Extractor Image Monitor
PAGE 178
Table 14 Default User Security for RDFCOM Commands (continued) Extractor Image Monitor RDF Receiver Purger Trail Update Volume RDFNET Network Trigger HISTORY INFO Other Objects A A A A INITIALIZE A A A A A A A A* O* LIST A OBEY X OPEN A OUT A RESET S S S S S S S S S S SET S S S S S S S S S S SHOW A A A A A A A A A A A* A A START STATUS O* A STOP A* S* S* A* A* S* A** S*** TAKEOVER O UNPINAUDIT S VALIDATE S Legend: A = All users S = Su
PAGE 179
RDFCOM-Related Filenames and Process Identifiers File names and process identifiers sometimes appear as attributes in RDFCOM commands. These names typically identify objects such as disk files, log devices, and processes. Errors can result from improperly specifying these names in RDFCOM commands. In almost all commands, these names are governed by the common syntax rules described in the following paragraphs. Where exceptions to these rules occur, they are noted in the individual command descriptions.
PAGE 180
[system.]ldev-number system specifies the name of the system on which the device resides. A system name consists of a backslash (\) followed by one to seven alphanumeric characters; the first alphanumeric character must be a letter. device-name specifies the name of a device. A device name consists of a dollar sign ($) followed by one to seven alphanumeric characters; the first alphanumeric character must be a letter. qualifier is an optional qualifier.
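The two naming rules quoted above can be expressed as a small validation sketch. This is only an illustration in Python, not part of RDFCOM: the function names are hypothetical, and the patterns encode just the two rules stated here (a leading `\` or `$`, then one to seven alphanumeric characters starting with a letter). Exceptions noted in individual command descriptions are not modeled.

```python
import re

# "\" followed by 1-7 alphanumerics, the first of which is a letter.
SYSTEM_NAME = re.compile(r"\\[A-Za-z][A-Za-z0-9]{0,6}")
# "$" followed by 1-7 alphanumerics, the first of which is a letter.
DEVICE_NAME = re.compile(r"\$[A-Za-z][A-Za-z0-9]{0,6}")


def is_valid_system_name(name: str) -> bool:
    """Return True if name satisfies the system-name rule quoted above."""
    return SYSTEM_NAME.fullmatch(name) is not None


def is_valid_device_name(name: str) -> bool:
    """Return True if name satisfies the device-name rule quoted above."""
    return DEVICE_NAME.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("\\LONDON", "\\4RDF", "$DATA01", "$1DATA"):
        ok = is_valid_system_name(candidate) or is_valid_device_name(candidate)
        print(candidate, "valid" if ok else "invalid")
```

A check like this could be run over names in a command file before it is passed to RDFCOM, so that malformed names are caught early.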
PAGE 181
session—when you start a new session, the values that existed at the end of the last session survive into the new session. Descriptions of all RDFCOM commands follow in alphabetical order. ADD The ADD command applies configuration parameter values for the specified process or other object from the RDF configuration memory table to the RDF configuration file.
PAGE 182
RDF State Requirement Except for the ADD VOLUME and ADD TRIGGER commands, you can issue any ADD command only after initializing RDF but before entering the first START RDF command. After RDF is initialized, you can issue an ADD VOLUME or ADD TRIGGER command anytime RDF is stopped.
PAGE 183
To define $SYSTEM.RDFIMP as the location of the RDF software, enter the following commands:
]SET RDF SOFTWARELOC $SYSTEM.RDFIMP
]ADD RDF
When the preceding command sequence is executed, all of the other RDF global parameters are set to their default values: (In this list, \LONDON is the system at which you issued the command sequence.
PAGE 184
global-option, extractor-option, monitor-option, receiver-option, purger-option, netsynch-option, trigger-option and updater-option are described under the SET RDF, SET EXTRACTOR, SET MONITOR, SET RECEIVER, SET PURGER, SET RDFNET, SET TRIGGER, and SET VOLUME commands, respectively. trigger-type is REVERSE or TAKEOVER. This command parameter alters a trigger that has already been added to the RDF configuration.
PAGE 185
execution of the command. Only when the user confirms the command execution will the command include or exclude the file specified. NOTE: The RDF administrator must ensure that the data of the files included or excluded is consistent. The ALTER command neither ensures data consistency of the files being included or excluded nor reports or corrects any data inconsistency that results from executing the ALTER command.
PAGE 186
NOTE: The RDF administrator must ensure that the data of the files included or excluded is consistent. The ALTER command neither ensures data consistency of the files being included or excluded nor reports or corrects any data inconsistency that results from executing the ALTER command. In an emergency, if RDF initialization is not possible, HP recommends that you use the ALTER command to modify the INCLUDE/EXCLUDE clause while RDF is running.
PAGE 187
Assume you have lost the original primary system (\A), you have successfully completed a takeover on both backup systems (\B and \C), and the MAT positions displayed by the respective 735 messages are:
\B: 735 LAST MAT POSITION: Sno 10, RBA 100500000
\C: 735 LAST MAT POSITION: Sno 10, RBA 100000000
500 kilobytes of audit records is missing at \C.
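The 500-kilobyte figure follows directly from the two RBAs in the 735 messages. As a minimal sketch in Python, assuming that Sno is the MAT file sequence number and RBA is a byte offset within that file (the `MatPosition` type and `missing_bytes` function are hypothetical names used only for illustration):

```python
from typing import NamedTuple


class MatPosition(NamedTuple):
    sno: int  # assumed: sequence number of the MAT file
    rba: int  # assumed: relative byte address within that file


def missing_bytes(ahead: MatPosition, behind: MatPosition) -> int:
    """Bytes of audit present at `ahead` but not at `behind`.

    Only handles the simple case in which both positions fall in the
    same MAT file, as in the example above.
    """
    if ahead.sno != behind.sno:
        raise ValueError("positions are in different MAT files")
    return ahead.rba - behind.rba


if __name__ == "__main__":
    b = MatPosition(sno=10, rba=100500000)  # position reported on \B
    c = MatPosition(sno=10, rba=100000000)  # position reported on \C
    print(missing_bytes(b, c))  # 500000 bytes, that is, 500 kilobytes
```

If the two positions were in different MAT files, the gap could not be computed from the RBAs alone, which is why the sketch rejects that case.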
PAGE 188
If RDFCOM encounters network problems during any other phase of COPYAUDIT execution, it does not abend. Instead, it logs a message to the home terminal and aborts the COPYAUDIT command. Example Assume you have established two RDF configurations to provide triple contingency protection (\A to \B and \A to \C) and that the RDF control subvolume of the \A to \B configuration is A1 and the RDF control subvolume of the \A to \C configuration is A2.
PAGE 189
Usage Guidelines For the DELETE command to have any effect, a configuration record must already exist for the secondary image trail or updater process associated with the volume name supplied (that is, someone must have previously issued an ADD IMAGETRAIL or ADD VOLUME command for the volume). When you issue a DELETE VOLUME command, RDF responds: • The extractor process stops sending image data for the volume specified in the DELETE VOLUME command.
PAGE 190
Security Restrictions None; anyone can enter the EXIT command. RDF State Requirement You can issue the EXIT command at any time, whether or not RDF has been started. Usage Guidelines If you issue the EXIT command in your current RDFCOM session, RDFCOM terminates the session and returns control to the operating system. If the EXIT command appears in a command file, RDFCOM stops reading the command file and ignores any commands in the file that follow the EXIT command.
PAGE 191
RDF State Requirement You can enter the FC command at any time, whether or not RDF has been started. Usage Guidelines When you issue an FC command, the requested command appears, followed by a subcommand prompt (.). At the prompt, you can enter the subcommands R, I, or D to respectively replace, insert, or delete characters in the command line. As a simpler alternative to the R subcommand, you can simply enter the replacement character directly under the character you want to replace.
PAGE 192
.
RDF SOFTWARELOC $SYSTEM.RDF
RDF LOGFILE $0
RDF PRIMARYSYSTEM \MICKEY
RDF UPDATERDELAY 10
RDF UPDATERNSASUSPEND ON
RDF UPDATERFOPENTHRESHOLD 2700
RDF UPDATERTXTIME 60
RDF UPDATERRTDWARNING 60
RDF UPDATEROPEN PROTECTED
RDF NETWORK OFF
RDF NETWORKMASTER OFF
RDF UPDATEREXCEPTION ON
RDF REPLICATEPURGE OFF
RDF OWNER SUPER.RDF
HELP The HELP command displays explanatory text about RDFCOM commands and RDF messages.
PAGE 193
{ RDFNET { NETWORK { TRIGGER { VOLUME $volume { $volume Cannot be performed with RDF } } } } } running. Only a user in the SUPER group can execute this command.
PAGE 194
Cause: The primary process of a NonStop process pair has stopped. This probably was the result of an operator inadvertently issuing a STOP command from TACL. Effect: The backup process takes over, but not in fault-tolerant mode, until the primary process can be re-created. Recovery: This is an informational message; no recovery is required. HISTORY The HISTORY command displays the ten most recently issued RDFCOM commands (including the HISTORY command itself).
PAGE 195
INFO {* {IMAGETRAIL {RDF {MONITOR {EXTRACTOR {RECEIVER {RDFNET {NETWORK {PURGER {TRIGGER trigger-type {VOLUME * {[VOLUME] $volume } [ATINDEX audittrail-index-num] } [,OBEYFORM] } } } } } } } } } } * displays the current configuration parameter values for the RDF global options, for all updater volumes, and for all RDF processes. [ATINDEX audittrail-index-num] is an integer value from 0 through 15 identifying the TMF audit trail on the primary system with which the particular RDF object is associated.
PAGE 196
RECEIVER displays the current configuration parameter values for the receiver process. RDFNET displays the current configuration parameter values for the RDFNET process. NETWORK displays the current configuration parameter values for an RDF network. PURGER displays the current configuration parameter values for the purger process. TRIGGER trigger-type displays the current configuration parameter values for the specified trigger type (REVERSE, TAKEOVER, or * ).
PAGE 197
RDF UPDATERFOPENTHRESHOLD 2700 RDF UPDATERTXTIME 60 RDF UPDATERRTDWARNING 60 RDF UPDATEROPEN PROTECTED RDF NETWORK OFF RDF NETWORKMASTER OFF RDF UPDATEREXCEPTION ON RDF REPLICATEPURGE OFF RDF OWNER SUPER.
PAGE 198
TRIGGER PROGRAM $SYSTEM.RDF.RDFCOM
TRIGGER INFILE $DATA01.RDF.RDFCONF
TRIGGER OUTFILE $DATA01.RDF.OUTFILE
TRIGGER CPUS 0:1
TRIGGER PRIORITY 150
TRIGGER NOWAIT
TRIGGER REVERSE
The primary system name is set implicitly and the backup system name is set in the INITIALIZE RDF command.
PAGE 199
RDF SOFTWARELOC $SYSTEM.RDF
RDF LOGFILE $0
RDF PRIMARYSYSTEM \MICKEY
RDF UPDATERDELAY 10
RDF UPDATERNSASUSPEND ON
RDF UPDATERFOPENTHRESHOLD 2700
RDF UPDATERTXTIME 60
RDF UPDATERRTDWARNING 60
RDF UPDATEROPEN PROTECTED
RDF NETWORK OFF
RDF NETWORKMASTER OFF
RDF UPDATEREXCEPTION ON
RDF REPLICATEPURGE OFF
RDF OWNER SUPER.RDF
The primary system name is set implicitly and the backup system name is set using the INITIALIZE RDF command.
PAGE 200
VOLUME $DATA03
VOLUME ATINDEX 0
VOLUME CPUS 4:5
VOLUME IMAGEVOLUME $SECIT2
VOLUME PRIORITY 160
VOLUME PROCESS $UP03
VOLUME UPDATEVOLUME $DATA3
INFO PURGER Command To display the current configuration parameters for the purger process, enter the following command:
]INFO PURGER
The output shows that the purger is configured with the following parameter values: running in CPUs 3 and 2, with a priority of 165, a retaincount of 50, a purgetime of 60, and the process name $PURG: PURGER PURGER PURGER PURGER PUR
PAGE 201
RDF displays the following:
RDFNET PROCESS $MNET
RDFNET CPUS 0:1
RDFNET PRIORITY 180
INFO NETWORK Command To display the current RDF network configuration parameters, enter the following command:
]INFO NETWORK
RDF displays the following:
NETWORK PRIMARYSYSTEM \RDF04
NETWORK BACKUPSYSTEM \RDF06
NETWORK RCSV RDF04
NETWORK PNETTXVOLUME $DATA07
INITIALIZE RDF The INITIALIZE RDF command creates the RDF configuration and context files for establishment of a new RDF configuration.
PAGE 202
month is the first three letters of the month, such as JAN, FEB, MAR. year is a four-digit number greater than 1996. hour is a number from 0 to 23. min is a number from 00 to 59. min must be preceded by a colon (:). INITTIME : | NOW is a timestamp used for online product initialization. It has the same format as the timestamp parameter described. NOW causes RDF to be initialized at the current date and time.
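The field rules above can be checked mechanically before a timestamp is typed into an INITIALIZE RDF command. The following Python sketch is an illustration only: `parse_timestamp` is a hypothetical helper, the layout `ddmmmyyyy hh:mm` is taken from the examples in this manual (such as 12JAN2004 14:30), and accepting one-digit days and hours is an assumption.

```python
import re
from datetime import datetime

MONTHS = ("JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC")

# day, three-letter month, four-digit year, then hour ":" minute
_TIMESTAMP = re.compile(r"(\d{1,2})([A-Za-z]{3})(\d{4})\s+(\d{1,2}):(\d{2})")


def parse_timestamp(text: str) -> datetime:
    """Parse a timestamp such as '12JAN2004 14:30' using the rules above."""
    match = _TIMESTAMP.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not in ddmmmyyyy hh:mm form: {text!r}")
    day, month, year, hour, minute = match.groups()
    month = month.upper()
    if month not in MONTHS:
        raise ValueError(f"unknown month: {month}")
    if int(year) <= 1996:
        raise ValueError("year must be greater than 1996")
    # datetime() itself rejects hours above 23, minutes above 59,
    # and days that do not exist in the given month.
    return datetime(int(year), MONTHS.index(month) + 1,
                    int(day), int(hour), int(minute))


if __name__ == "__main__":
    print(parse_timestamp("12JAN2004 14:30"))
```

The sketch validates format only; it says nothing about whether the chosen time is a suitable initialization point in the MAT.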
PAGE 203
Please wait while RDF searches for the specified timestamp. TMF shutdown at 12JAN2004 14:30 has been found. RDF will start at RBA: 376275 MAT file: $AUDIT.ZTMFAT.AA000414 Do you still wish to start at this point? [Y/N] Enter Y or YES to proceed; enter N or NO to cancel the command. If you include the INITTIME option without the ! option, RDFCOM displays: Do you wish to proceed? [Y/N] Enter Y or YES to proceed; enter N or NO to cancel the command.
PAGE 204
Usage Guidelines If your RDF subsystem is running and you do not include the TIMESTAMP, INITTIME, or SYNCHDBTIME options in the INITIALIZE RDF command, then you must stop, delete, and reconfigure TMF before entering the INITIALIZE RDF command. Before issuing the INITIALIZE RDF command within an existing RDF configuration, you must first purge all files from the control subvolume on both the primary and backup systems or you must use the # option that authorizes RDFCOM to purge all of the files.
PAGE 205
the INITTIME or SYNCHDBTIME option, RDFCOM searches backwards in the MAT for the first commit or abort record whose timestamp is less than the specified timestamp. When it finds the shutdown record or commit/abort record, RDFCOM sets the context of the extractor to the record following that record. • When RDF is initialized, the contexts of the receiver and all updaters are initialized to the beginning of the first image file (AA000001).
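The backward search described for the INITTIME and SYNCHDBTIME options can be pictured with a small model. This Python sketch is a simplification under stated assumptions: the MAT is modeled as an in-memory list, and `AuditRecord`, its fields, and `extractor_start_index` are hypothetical names, not RDF structures. It shows only the rule "find the last commit or abort record older than the requested time, then start at the record after it".

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AuditRecord:
    kind: str       # for example "commit", "abort", or "update"
    timestamp: int  # any monotonically increasing time value


def extractor_start_index(mat: List[AuditRecord], target: int) -> Optional[int]:
    """Scan backwards for the first commit/abort record with timestamp < target.

    Returns the index of the record that follows it (where the extractor
    context would be set in this model), or None if no such record exists.
    """
    for i in range(len(mat) - 1, -1, -1):
        record = mat[i]
        if record.kind in ("commit", "abort") and record.timestamp < target:
            return i + 1
    return None


if __name__ == "__main__":
    mat = [
        AuditRecord("commit", 100),
        AuditRecord("update", 105),
        AuditRecord("abort", 110),
        AuditRecord("update", 115),
        AuditRecord("commit", 130),
    ]
    # The abort at time 110 is the first qualifying record found when
    # scanning backwards from the end, so the start index is 3.
    print(extractor_start_index(mat, target=120))
```

A returned index equal to the length of the list means the qualifying record is the most recent one, so there is nothing after it yet in this model.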
PAGE 206
The following INITIALIZE RDF command, issued on the primary system \LON after TMF was stopped, deleted, and reconfigured, initializes RDF at once, without prompting you to confirm your request: ]INITIALIZE RDF, BACKUPSYSTEM \CHI, SUFFIX 2 ! In the first example, the RDF control subvolume is implicitly named LON while in the second example it is explicitly named LON2.
PAGE 207
] LIST UPDATERFILEOPENS $UPD, REPORT FILE a. In the example, $UPD is an Updater in the backup system, ‘REPORT’ is the option being passed to the command to generate a report and ‘file’ is the file to which the report needs to be redirected. • When a fully qualified nonexistent file is specified, the command creates the file in the specified path. ] LIST UPDATERFILEOPENS $UPD, REPORT \YOSQA2.$SYSTEM.SUBVOL.FILE In this case, the file \YOSQA2.$SYSTEM.SUBVOL.
PAGE 208
Lines 9 – 16 display the file statistics for a single Updater. The file statistics display the following: • The ATINDEX of the Updater. • Mapping between the primary RDF protected volumes and the backup updater volumes. • The count of files currently open by the Updater. • The count of files that were closed in the last close cycle by the Updater. • The count of process restarts that occurred due to the Updater exceeding the Maximum Number of Concurrent File Opens.
PAGE 209
OBEY The OBEY command executes a series of commands entered in a command file. OBEY [\system.][$volume.][subvolume.]file system identifies the system on which the command file is stored. volume identifies the disk volume on which the command file is stored. subvolume identifies the subvolume on which the command file is stored. file identifies the command file, which contains one or more valid RDFCOM commands. Where Issued Primary or backup system.
PAGE 210
control-subvol is the name of the RDF control subvolume on both primary and backup systems. The control subvolume name consists of the primary system name of the RDF configuration (without the backslash) plus the optional character suffix if you included one in the INITIALIZE RDF command. Where Issued Primary or backup system. Security Restrictions None; anyone can enter the OPEN command. RDF State Requirement Before you can enter the OPEN command, RDF must have been initialized.
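The naming rule for the control subvolume is simple enough to state as a one-line function. The Python sketch below is illustrative only; `control_subvolume` is a hypothetical helper, and treating the suffix as at most one character is an assumption based on the phrase "optional character suffix".

```python
def control_subvolume(primary_system: str, suffix: str = "") -> str:
    """Derive the RDF control subvolume name from the primary system name.

    primary_system includes the leading backslash (for example "\\LON");
    suffix is the optional SUFFIX character from INITIALIZE RDF.
    """
    if not primary_system.startswith("\\"):
        raise ValueError("system name must start with a backslash")
    if len(suffix) > 1:
        raise ValueError("suffix is assumed to be a single character")
    return primary_system[1:] + suffix


if __name__ == "__main__":
    print(control_subvolume("\\LON"))       # LON
    print(control_subvolume("\\LON", "2"))  # LON2
```

These two results match the INITIALIZE RDF examples elsewhere in this manual, where the control subvolume for primary system \LON is LON without a suffix and LON2 with SUFFIX 2.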
PAGE 211
In the two OPEN commands, you do not include a backslash (\) because you are specifying the RDF control subvolume name (not a system name). OUT The OUT command redirects the output of the current RDFCOM session to the specified device or file. OUT [\system.][$volume.][subvolume.][file] system identifies the system on which the output file is stored. volume identifies the disk volume on which the output file is stored. subvolume identifies the subvolume on which the output file is stored.
PAGE 212
The next OUT command establishes the destination of the text produced by the OBEYFORM option in the subsequent INFO RDF command as a command file named CONFY. The second OUT command in this sequence redirects later output back to your terminal: ]OUT CONFY ]INFO RDF, OBEYFORM ]OUT RESET The RESET command resets all configuration parameters for the specified entity to their default values within the RDF configuration memory table.
PAGE 213
Security Restrictions You can issue the RESET command if you are a member of the super ID group. RDF State Requirement You can enter the RESET command at any time, whether or not RDF has been started. Certain constraints, however, apply to the subsequent ADD commands that apply the RESET values to the configuration file. For further information, see the ADD command description.
PAGE 214
PROCESS process-name identifies the process name for the extractor process; process-name is any unique, valid process name of up to six characters; the first character must be a dollar sign ($). You cannot specify any of the reserved process names listed in the Guardian Procedure Calls Reference Manual. This parameter is not optional. You must explicitly name the extractor process.
PAGE 215
Furthermore, RDF objects with a particular ATINDEX value greater than 0 must together constitute a complete set: • If there is an extractor with an ATINDEX value of 1, there must also be a receiver with an ATINDEX value of 1. • If there is a receiver with an ATINDEX value of 1, there must also be a secondary image trail with an ATINDEX of 1.
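The "complete set" requirement can be checked against a planned configuration before any ADD commands are issued. The Python sketch below covers only the two rules stated above (extractor implies receiver, receiver implies secondary image trail, for ATINDEX values greater than 0); the function name and the use of plain integer sets to describe a configuration are assumptions made for illustration.

```python
from typing import List, Set


def missing_atindex_objects(extractors: Set[int],
                            receivers: Set[int],
                            imagetrails: Set[int]) -> List[str]:
    """List the objects still needed to complete each ATINDEX set.

    Each argument is the set of ATINDEX values configured for that
    object type. ATINDEX 0 (the MAT) is not subject to these rules here.
    """
    problems = []
    for index in sorted(extractors):
        if index > 0 and index not in receivers:
            problems.append(f"receiver with ATINDEX {index}")
    for index in sorted(receivers):
        if index > 0 and index not in imagetrails:
            problems.append(f"secondary image trail with ATINDEX {index}")
    return problems


if __name__ == "__main__":
    # Extractor configured for auxiliary audit trail 1, but no matching receiver.
    print(missing_atindex_objects({0, 1}, {0}, set()))
```

An empty result means the two rules shown here are satisfied; it does not cover any further rules in the full list.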
PAGE 216
Usage Guidelines For ATINDEX values greater than 0, the specified value must match the audit trail number of a configured auxiliary audit trail. If you specify SET IMAGETRAIL ATINDEX 2, for example, there must be a configured auxiliary audit trail AUX02. Furthermore, RDF objects with a particular ATINDEX value greater than 0 must together constitute a complete set: • If there is an extractor with an ATINDEX value of 1, there must also be a receiver with an ATINDEX value of 1.
PAGE 217
values do not affect the subsystem until they are applied to the RDF configuration file with the ADD command. Example To configure a monitor process named $MON1 to run in CPUs 0 and 1 at a priority of 180, issue the following commands after RDF has been initialized:
]SET MONITOR PROCESS $MON1
]SET MONITOR CPUS 0:1
]SET MONITOR PRIORITY 180
]ADD MONITOR
SET NETWORK The SET NETWORK command sets RDF network configuration parameters within the RDF configuration memory table.
PAGE 218
Example To configure the primary system \RDF04 and backup system \RDF06, issue the following commands after RDF has been initialized:
SET NETWORK PRIMARYSYSTEM \RDF04
SET NETWORK BACKUPSYSTEM \RDF06
SET NETWORK REMOTECONTROLSUBVOLUME RDF04
SET NETWORK PNETTXVOLUME $DATA07
ADD NETWORK
SET PURGER The SET PURGER command sets purger process configuration parameters within the RDF configuration memory table.
PAGE 219
Suppose that the image trail files are relatively small, such that the audit record at MAT 10, 100000010 was placed at the start of image trail file AA000025 on \B. If the purger on \B is allowed to purge AA000025 before the takeovers occur, the triple contingency protocol will fail because \C is missing some of the purged audit records (Sno 10, RBA 100000010 through Sno 10, RBA 100500000).
PAGE 220
]SET PURGER PROCESS $PRG
]SET PURGER CPUS 0:1
]SET PURGER RETAINCOUNT 8
]ADD PURGER
By default, in this example the purger process will run at a priority of 165 and the purger purgetime is set to 60 minutes. SET RDF The SET RDF command sets RDF global configuration parameters within the RDF configuration memory table. The supplied values are not applied to the RDF configuration file, however, until you issue an ADD RDF command.
PAGE 221
UPDATERNSASUSPEND {ON | OFF} Specifies whether Updaters will shut down completely or suspend during a NonStop Shared Access Operation on the primary when they encounter a Stop-RDF-Updater record. When this attribute is OFF (the default value), Updater processes on the backup system will shut down completely after completing their current transaction during a NonStop Shared Access DDL operation.
PAGE 222
NETWORK {ON | OFF} specifies whether or not you are configuring an RDF network. When set to OFF (the default value), RDF takeover operations execute and database consistency is not guaranteed for transactions spanning more than one RDF backup database. When set to ON, the RDF subsystem guarantees database consistency across multiple RDF backup systems configured within an RDF network.
PAGE 223
This parameter specifies the user ID under which all RDF processes will always run. This global configuration parameter provides functionality whereby any super ID group user ID can start and stop RDF. Once the OWNER attribute is set, you must limit EXECUTE access to the RDFCOM object so that only those super group users authorized to manage RDF can run RDFCOM. Failure to do so is a serious security risk because, thereafter, all RDF objects run as the user ID of the RDF OWNER.
PAGE 224
PRIORITY priority identifies the execution priority for the RDFNET process; priority is the execution priority, from 10 through 199. The default priority is 165. PROCESS process-name identifies the process name for the RDFNET process; process-name is any unique, valid process name of up to six characters; the first character must be a dollar sign ($). You cannot specify any of the reserved process names listed in the Guardian Procedure Calls Reference Manual. This parameter is not optional.
PAGE 225
CPUS primary-CPU : backup-CPU identifies the CPUs in which the receiver process is to run as a process pair on the backup system; primary-CPU is the primary CPU; backup-CPU is the backup CPU. Values range from 0 through 15. The default is 0:1. EXTENTS ( primary-extent-size , secondary-extent-size ) specifies the extent sizes to be used for the RDF image files on the backup system; primary-extent-size is the primary extent size in pages; secondary-extent-size is the size of each secondary extent in pages.
PAGE 226
Where Issued Primary system only. Security Restrictions None. RDF State Requirements None. Usage Guidelines The SET RECEIVER command enters the parameter values specified for the receiver in this command into the RDF configuration table in memory. This table serves as an input buffer only, and so these values do not affect the subsystem until they are applied to the RDF configuration file with the ADD command.
PAGE 227
SET TRIGGER trigger-option
where trigger-option is:
{PROGRAM program-file }
{INFILE infile }
{OUTFILE outfile }
{CPUS primary-CPU : alternate-CPU }
{PRIORITY priority }
{WAIT | NOWAIT }
program-file is the name of any Guardian object file. This object file is run once RDF has reached a particular state, either after a STOP RDF, REVERSE, or TAKEOVER operation. program-file must be a properly-formed Guardian disk file name. The file does not have to exist. This parameter is mandatory.
PAGE 228
do not affect the subsystem until they are applied to the RDF configuration file through the ADD command. Example In the following example, you are configuring an RDF environment to run from \Boston to \London. You start by initializing RDF to run from \Boston to \London. ] INITIALIZE RDF, BACKUPSYSTEM \LONDON ! Now assume that you have configured an extractor, receiver, purger, a set of updaters, and now you want to configure a Takeover trigger. For this trigger, you have a TACL script file $SYSTEM.RDF.
PAGE 229
ATINDEX audittrail-index-number is an integer value from 0 through 15 specifying the audit trail on the primary system to which the data volume being protected is mapped. 0 specifies the MAT. 1 through 15 specifies auxiliary audit trails AUX01 through AUX15, respectively. The default is 0. CPUS primary-CPU : backup-CPU identifies the CPUs in which the updater process is to run as a process pair on the backup system; primary-CPU is the primary CPU; backup-CPU is the backup CPU.
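The ATINDEX numbering described above can be sketched as a small lookup. This is only an illustration of the stated mapping (0 for the MAT, 1 through 15 for AUX01 through AUX15); the function name audit_trail_name is hypothetical and is not part of RDFCOM.

```python
def audit_trail_name(atindex: int) -> str:
    """Return the audit trail designated by an ATINDEX value.

    Hypothetical helper: 0 designates the MAT, and 1 through 15
    designate auxiliary audit trails AUX01 through AUX15.
    """
    if not 0 <= atindex <= 15:
        raise ValueError("ATINDEX must be an integer from 0 through 15")
    return "MAT" if atindex == 0 else f"AUX{atindex:02d}"


for index in (0, 1, 15):
    print(index, audit_trail_name(index))
```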
PAGE 230
Only files are supported when this updater option is used with the ALTER VOLUME command. Subvolumes and wildcards in filenames are not supported. For more information on the usage of this option with the ALTER command, see “ALTER” (page 183). EXCLUDEPURGE subvol.file specifies the subvolume(s) and file(s) on the primary system data volume for which Enscribe purge operations are not to be replicated by the updater process on the backup system. subvol.
PAGE 231
NOTE: The RDF administrator must ensure that the data of the files included or excluded is consistent. The ALTER command neither ensures data consistency of the files being included or excluded, nor reports or corrects any data inconsistency that results from executing the ALTER command. In an emergency, if RDF initialization is not possible, HP recommends that you use the ALTER command to modify the INCLUDE/EXCLUDE clause while RDF is running.
PAGE 232
SHOW
The SHOW command displays the current parameter values contained in the RDF configuration memory table for the specified process. With this command, you can confirm the parameter values before issuing the ADD command that actually applies them to the configuration file.
SHOW {RDF}
     {MONITOR}
     {EXTRACTOR}
     {RECEIVER}
     {IMAGETRAIL}
     {TRIGGER}
     {VOLUME}
     {PURGER}
     {RDFNET}
     {NETWORK}
RDF displays the current configuration parameter values for the RDF global options.
PAGE 233
If you want to see what parameter values are already set in the configuration file, use the INFO command. Output Displayed The parameters displayed for the RDF global options and the individual processes are explained under the SET EXTRACTOR, SET IMAGETRAIL, SET MONITOR, SET NETWORK, SET PURGER, SET RDF, SET RDFNET, SET RECEIVER, SET TRIGGER, and SET VOLUME command descriptions.
PAGE 234
SHOW PURGER Command Suppose that a series of SET PURGER commands specifies that a purger process named $PURG is to run in CPUs 3 and 2 at priority 165 with a RETAINCOUNT of 50.
PAGE 235
To display the values specified by those SET NETWORK commands, enter:
]SHOW NETWORK
RDF displays:
NETWORK PRIMARYSYSTEM \RDF04
NETWORK BACKUPSYSTEM \RDF06
NETWORK RCSV RDF04
NETWORK PNETTXVOLUME $DATA07
SHOW TRIGGER Command
If you have entered a series of SET TRIGGER commands and you want to review them before issuing the ADD TRIGGER command, type:
]SHOW TRIGGER
RDF displays a list like this:
TRIGGER PROGRAM $SYSTEM.RDF.RDFCOM
TRIGGER INFILE $DATA01.RDF.
PAGE 236
TMF must be started and transactions enabled on both primary and backup systems before you issue the START RDF command. When RDF starts, it automatically executes an implicit VALIDATE CONFIGURATION command with these results: • If any parameter value in the RDF configuration file is invalid, RDFCOM displays an error message, and the START RDF operation fails.
PAGE 237
STATUS
The STATUS command displays current configuration information and operational statistics for the RDF environment, or specified portions thereof. All forms of the STATUS command, except STATUS RTDWARNING, automatically include information and statistics for the monitor process.
STATUS {MONITOR}
       {RDF}
       {EXTRACTOR}
       {RECEIVER}
       {PURGER}
       {PROCESS procname}
       {VOLUME}
       {RTDWARNING}
       {RDFNET}
       [, PERIOD seconds[, COUNT repeat]]
MONITOR requests information and statistics for the monitor process.
PAGE 238
RDF State Requirement You can enter a STATUS command at any time after RDF has been initialized. Usage Guidelines The STATUS command provides you with the most current information about RDF and its operational status, presenting data for the specified RDF processes. STATUS RDF Command Output Display The output of the STATUS RDF command shows all critical information about each configured RDF entity. Here are two examples: RDFCOM - T0346H09 – 11AUG08 (C)2008 Hewlett-Packard Development Company, L.P.
PAGE 239
• Stop Update, Timestamp Pending
• * STOP RDF In Progress *
• * TMF STOP In Progress *
• * TAKEOVER In Progress *
• WRONG PROGRAM VERSION
• NSA Stop Update Pending
• Update NSA Stopped
• *Monitor Unavailable*
The rest of the display provides current information about each RDF process configured. For extractors, receivers, and image trails, the configured ATINDEX value is displayed in parentheses following the object name.
PAGE 240
the application program each process is running. Please note that an RTD time is not a precise indication of how far an RDF process is behind. An RTD time is only relative and is an approximation. The more accurate RTD time is that of the extractor. An updater's RTD is even more relative because it may show 20 seconds one instance and then show 0 seconds in the next instance.
PAGE 241
• The image trail entries reflect the names of the secondary image trail files to which each receiver is writing ($RRCV0 writes image records for updater $RUPD1 to $IMAGE0; $RRCV1 writes image records for updaters $RUPD2 and $RUPD3 to $IMAGEA1 in this example).
• Each updater entry reflects the name of the secondary image file from which it is reading ($DATA03.RDF04.AA000020 for $RU01, $DATA04.RDF04.AA000003 for $RU02, and $DATA05.RDF04.AA000003 for $RU03 in this example).
PAGE 242
undo During a takeover or stop-update-to-time operation, after the updater has finished its forward-moving redo pass, it reports "undo" when it performs its reverse-moving undo pass in order to back out any updates for transactions that need to be undone. In a takeover operation for an RDF network, "undo" indicates the first undo pass when local transactions are undone because their outcomes are unknown.
PAGE 243
DRAIN causes the following actions:
• All TMF audit records up to the time the command is entered are stored in the image trails on the backup node.
• The RDF processes shut down in a manner similar to when a stop TMF record is encountered in the audit trail.
• Each updater shuts down after it has applied all audit records up to the stop point.
• The purger process reports event 852, indicating that all updaters have stopped and the drain has completed.
PAGE 244
the applications updating the RDF-protected database, and when you are certain they have completed, then issue the STOP RDF, DRAIN command. Because your applications have stopped, the RDF-protected database on the primary system is closed and is in the same state as if TMF were stopped. See “Critical Operations, Special Situations, and Error Conditions” (page 110) for a discussion on how this operation may be of value to you.
PAGE 245
RDF must be running in the Normal state (with Update On) to issue a STOP RDF, DRAIN or STOP RDF, REVERSE command.
PAGE 246
NOTE: The timestamp you specify must be at least 5 minutes later than the current time at your primary system. If you specify an earlier time, an error message appears. Additionally, all transactions that committed prior to the timestamp are applied and retained in the backup database. Any transactions that committed at or after the specified timestamp are backed out of the backup database.
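The timestamp rules in this note can be sketched as two small checks. This is a minimal illustration, not RDF code: the names check_stop_timestamp and is_retained are hypothetical, and the sketch assumes that "at least 5 minutes later" allows a timestamp exactly 5 minutes ahead.

```python
from datetime import datetime, timedelta

MIN_LEAD = timedelta(minutes=5)  # minimum lead time stated in the note


def check_stop_timestamp(stop_at: datetime, primary_now: datetime) -> None:
    """Reject a timestamp less than 5 minutes past the primary system's time."""
    if stop_at - primary_now < MIN_LEAD:
        raise ValueError("timestamp must be at least 5 minutes in the future")


def is_retained(commit_time: datetime, stop_at: datetime) -> bool:
    """True if the transaction stays in the backup database.

    Commits strictly before the timestamp are retained; commits at or
    after the timestamp are backed out.
    """
    return commit_time < stop_at


now = datetime(2013, 9, 1, 12, 0, 0)
stop_at = now + timedelta(minutes=10)
check_stop_timestamp(stop_at, now)
print(is_retained(now + timedelta(minutes=9), stop_at))
print(is_retained(stop_at, stop_at))
```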
PAGE 247
by the specified timestamp. Any transactions backed out are reapplied when you issue the next START UPDATE command. If you issue the STOP UPDATE command without the TIMESTAMP option, the RDFCOM prompt is not returned until all updaters have stopped. If you include the TIMESTAMP option, then the RDFCOM prompt is returned immediately since the stoppage is required to be at least 5 minutes in the future.
PAGE 248
Usage Guidelines The TAKEOVER command is customarily issued when the primary system fails or otherwise becomes unavailable, and you want to make the backup database your new database of record for your applications. CAUTION: The TAKEOVER command is not a normal operational command. Operators should never issue this command strictly on their own initiative. Issue this command only when specifically told to do so by someone in high authority.
PAGE 249
updaters; if UPDATEREXCEPTION is ON, then each update of the batch needs to be undone and an exception record written.
• Auxiliary Audit and a Comm Problem If your RDF environment includes extractor-receiver pairs associated with auxiliary audit trails, and one extractor-receiver pair has fallen far behind because of a communications problem, then all affected transactions must be undone by all affected updaters, and this can lead to a lot of audit being undone with exception records.
PAGE 250
After you enter your response, RDFCOM prompts you for your next command. 4. Having initiated the RDF TAKEOVER operation, you can then use a STATUS RDF command to determine the status of the TAKEOVER operation. If the TAKEOVER operation is still in progress when you enter the STATUS RDF command, the subsystem displays the current state as “TAKEOVER IN PROGRESS.
PAGE 251
VALIDATE CONFIGURATION
The VALIDATE CONFIGURATION command validates the parameters in the RDF configuration file and optionally generates a report on the status of validations.
VALIDATE CONFIGURATION [, REPORT [filename]]
where REPORT [filename] is an optional parameter used to generate a report on the result of validations. Validation reports for the following scenarios are generated:
• If a filename is not provided, the report is generated on the screen.
PAGE 252
• The volumes for the image files (specified by the RDFVOLUME option of a SET RECEIVER command and any ADD IMAGETRAIL commands) are valid and exist on the backup system.
• The volumes for the image files have enough room for two more image files (for an RDF restart).
• The primary volumes associated with the updater processes are valid and are being audited to the TMF audit trail.
PAGE 253
Receiver Purger Updater Updater Program File License Check.........Passed CPU UP Check.......................Both CPUs Up - N:$MRCV PF:\TESTSYS2.$SYSTEM.BKPRDF.RDFRCVO Program File License Check.........Passed CPU UP Check.......................Both CPUs Up - N:$MR04 PF:\TESTSYS2.$SYSTEM.BKPRDF.RDFPRGO Program File License Check.........Passed CPU UP Check.......................Both CPUs Up - N:$MR10 PF:\TESTSYS2.$SYSTEM.BKPRDF.RDFUPDO Program File License Check.........Passed CPU UP Check..............
PAGE 254
9 Entering RDFSCAN Commands All RDF messages are directed to an EMS event log (collector). To examine that log without looking at all events for the entire system, you first use the standard EMS filter RDFFLTO to create an intermediate entry-sequenced file copy of the RDF log, and then enter commands through the RDFSCAN online utility. This chapter, which is written for system managers and operators, describes the RDFSCAN commands and their attributes.
PAGE 255
In addition, this element is included only if applicable: • Output Displayed: Only two RDFSCAN commands (LIST and SCAN) produce output, although others influence its content and destination. For information about the other elements, see “Command Description Elements” in Chapter 8 (page 175). Except for the LOG and NOLOG commands, you can abbreviate the command name by entering only the first character (such as L for LIST) or any number of the leading characters (such as DIS for DISPLAY).
PAGE 256
OFF disables the display of record numbers. Usage Guidelines The DISPLAY function is automatically enabled if pattern matching is enabled and is automatically disabled if pattern matching is disabled. For information about enabling and disabling pattern matching, see the MATCH command description in “MATCH” (page 260). Examples Suppose that $SYSTEM.SANFRAN.
PAGE 257
Examples
If you issue an EXIT command in response to the RDFSCAN prompt, RDFSCAN terminates the session and displays a logoff message:
Enter the next RDFscan function you want: EXIT
Thank you for using RDFscan
If you press Ctrl-Y in response to the RDFSCAN prompt, RDFSCAN terminates the session and displays an end-of-file indication followed by the logoff message:
Enter the next RDFscan function you want: Ctrl-Y
EOF!
Thank you for using RDFscan
FILE
The FILE command selects a file generated by the RDFF
PAGE 258
HELP The HELP command displays the syntax of RDFSCAN commands or introductory information about the RDFSCAN utility. HELP [ ALL ] [ INTRO ] [ command ] ALL displays the syntax of all RDFSCAN commands. INTRO displays information on how to use the RDFSCAN utility. command displays the syntax of the RDFSCAN command indicated by command.
PAGE 259
If pattern matching is disabled, the LIST command displays the specified number of messages starting at the current record. This behavior is identical to using the SCAN command with pattern matching disabled. For information about enabling and disabling pattern matching, see the MATCH command description in “MATCH” (page 260).
PAGE 260
Usage Guidelines The LIST command always transmits its output to the standard output device for RDFSCAN, which is normally your terminal. When you specify a destination file in the LOG command, RDFSCAN directs subsequent LIST command output to that destination file as well as producing it on the standard output device. That is, with the LOG command, LIST output goes both to your terminal and the file specified in LOG.
PAGE 261
To disable pattern matching, merely press the RETURN key at the prompt without entering a pattern. When entering a match pattern, you can use asterisks (*) and question marks (?) as wild-card characters. When pattern matching is enabled, the DISPLAY function is automatically enabled; when pattern matching is disabled, the DISPLAY function is automatically disabled. Table 15 shows the symbols RDFSCAN uses in pattern matching.
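The wildcard behavior described here can be approximated with a short sketch. It models only the two wildcards named above, treating the asterisk as any run of characters and the question mark as any single character, and it assumes a message matches when it contains the pattern anywhere and that matching is case-sensitive. The other symbols in Table 15 are not modeled, and the function names are hypothetical rather than part of RDFSCAN.

```python
import re


def pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate '*' and '?' wildcards into a regular expression.

    Every other character is matched literally.
    """
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")   # any run of characters, possibly empty
        elif ch == "?":
            parts.append(".")    # exactly one character
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))


def matching_messages(messages, pattern):
    """Return the messages that contain the pattern (assumed semantics)."""
    rx = pattern_to_regex(pattern)
    return [m for m in messages if rx.search(m)]


log = [
    "790 Backup Process Created in Processor 03",
    "718 Switched to original Primary Processor",
]
print(matching_messages(log, "Processor 0?"))
```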
PAGE 262
Examples
This command disables the copying of LIST command output:
Enter the next RDFSCAN function you want: NOLOG
File: $SYSTEM.SANFRAN.RDFLOG, current record: 9454, last record: 9466
Enter the next RDFSCAN function you want:
SCAN
The SCAN command scans a specific number of messages in the file and displays all of those in that range that contain the current match pattern.
SCAN number
number is the number of messages to scan within the log file.
PAGE 263
Record number: 1011 2004/06/08 04:13:49 \LAB1 $AU02 790 Backup Process Created in Processor 03
Record number: 1342 2004/06/08 04:13:49 \LAB1 $AU02 718 Switched to original Primary Processor
Record number: 1792 2004/06/08 05:01:35 \LAB1 $AU02 790 Backup Process Created in Processor 03
Record number: 1933 2004/06/08 05:01:35 \LAB1 $AU02 718 Switched to original Primary Processor
File: $SYSTEM.SANFRAN.
PAGE 264
10 Triple Contingency The triple contingency feature makes it possible for your applications to resume running with full RDF protection within minutes after loss of your primary system. NOTE: Replication of network transactions is not supported in conjunction with the triple contingency feature, nor is the replication of auxiliary audit trails.
PAGE 265
(that is, which system had received the least amount of audit data from the extractor by the time the primary system was lost).
• On the backup system that was further behind (had the least amount of audit data), issue the COPYAUDIT command specifying the name of the other backup system and its RDF control subvolume. That command copies over all missing audit records from the designated system.
• Upon successful completion of the COPYAUDIT operation, do a second takeover on that system.
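The decision about where to issue COPYAUDIT can be sketched as a comparison of how far each backup got in the MAT. This is an illustration under stated assumptions: each backup's progress is modeled as a (sequence number, RBA) pair, the function name plan_copyaudit is hypothetical, the positions shown are illustrative values, and the case where both backups hold the same audit is treated as needing no copy.

```python
def plan_copyaudit(positions):
    """Return (behind, ahead) for two backup systems, or None if they match.

    positions maps a system name to the (sequence number, RBA) of the last
    MAT audit record that backup received. Tuples compare sequence number
    first, then RBA.
    """
    if len(positions) != 2:
        raise ValueError("triple contingency involves exactly two backup systems")
    (sys1, pos1), (sys2, pos2) = positions.items()
    if pos1 == pos2:
        return None  # assumption: nothing to copy when both hold the same audit
    return (sys1, sys2) if pos1 < pos2 else (sys2, sys1)


plan = plan_copyaudit({"\\B": (10, 100500000), "\\C": (10, 100000010)})
if plan is not None:
    behind, ahead = plan
    print(f"Issue COPYAUDIT on {behind}, naming {ahead} as the other backup system")
```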
PAGE 266
The RETAINCOUNT Configuration Parameter The purger RETAINCOUNT parameter specifies how many image trail files (including the one currently in use) must be retained on disk for each image trail. The default value for this parameter is two. This parameter is important because if you lose the primary system, the triple contingency protocol will work only if one of the backup systems has retained all of the audit records that the other is missing.
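The effect of RETAINCOUNT on the triple contingency protocol can be sketched as follows. This is a worst-case illustration, not purger logic: it assumes the purger keeps only the newest RETAINCOUNT files of an image trail, and the file names other than AA000025, the starting positions after the first, and the function name are hypothetical.

```python
def can_supply_missing_audit(files, retaincount, first_missing):
    """Check whether the retained image files still cover the missing audit.

    files is a list of (file name, first MAT position in the file), ordered
    oldest to newest. A position is a (sequence number, RBA) pair.
    first_missing is the first position the other backup system lacks.
    """
    if not 2 <= retaincount <= 5000:
        raise ValueError("RETAINCOUNT must be in the range 2 through 5000")
    kept = files[-retaincount:]  # worst case: only the newest files survive
    return bool(kept) and kept[0][1] <= first_missing


image_trail = [
    ("AA000025", (10, 100000010)),
    ("AA000026", (10, 100200000)),  # hypothetical
    ("AA000027", (10, 100400000)),  # hypothetical
]
first_missing_on_other_backup = (10, 100000010)
for count in (2, 3):
    ok = can_supply_missing_audit(image_trail, count, first_missing_on_other_backup)
    print(f"RETAINCOUNT {count}: missing audit still on disk = {ok}")
```

With the default of two, the oldest file in this sketch is gone before the takeovers, which mirrors the failure case the manual describes for AA000025.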
PAGE 267
where num is once again within the range 2 through 5000. (Before entering this command, however, you must first stop RDF.) The COPYAUDIT Command If the primary system fails, you must execute two takeovers: one on each backup system.
PAGE 268
If the takeover completes successfully (the receiver logs an RDF message 724 followed by a 735 message containing the same detail as in the 735 message associated with the takeover on \B), the two databases are logically identical. At that point you can initialize, configure, and start RDF on both systems and then resume application processing on the new primary system with full RDF protection. COPYAUDIT Restartability The COPYAUDIT command is restartable.
PAGE 269
When the takeover operations are complete, the databases on systems \B and \C are logically identical to one another, and you have not lost any committed data regardless of the number of auxiliary audit trails involved.
PAGE 270
3. On the system with the least amount of audit records, issue a COPYAUDIT command specifying the name of the other backup system and its RDF control subvolume.
4. When the COPYAUDIT command has completed successfully, issue a second TAKEOVER command on that same system.
5. Initialize, configure, and start RDF on whichever system you want to be the primary in the new configuration.
6. Start application processing on the new primary system.
PAGE 271
11 Subvolume-Level and File-Level Replication By default, RDF provides volume-level protection, wherein changes to all audited files and tables on each protected primary system data volume are replicated to an associated backup system data volume. RDF/IMP, IMPX, and ZLT also support subvolume-level and file-level replication. To use this capability, you supply INCLUDE and EXCLUDE clauses when configuring updaters to identify specific subvolumes and files you want either replicated or not replicated.
PAGE 272
In this example, changes to all audited files and tables on $DATA01 are replicated, except MMTEST10.CONC0826:
SET VOLUME CPUS 1:2
SET VOLUME IMAGEVOLUME $IMAGE
SET VOLUME PRIORITY 185
SET VOLUME PROCESS $MM01
SET VOLUME UPDATEVOLUME $DATA01
SET VOLUME EXCLUDE MMTEST10.CONC0826
ADD VOLUME $DATA01
Wildcard Character (*)
The asterisk (*) can be used as a wildcard character in both subvolume and file names.
Within Subvolume Names
When used to designate subvolume names, the * must always be used as a suffix.
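The suffix rule for subvolume names can be sketched as a one-line check. This is an illustration only: the function name is hypothetical, and the sketch tests nothing beyond the rule stated above, so it says nothing about other Guardian naming rules or about the rules for file names.

```python
def subvol_wildcard_ok(subvol: str) -> bool:
    """True if any asterisk in the subvolume name is its last character."""
    return "*" not in subvol[:-1]


for name in ("MMTEST*", "DATA*", "MMTEST10", "*TEST", "MM*TEST"):
    print(name, subvol_wildcard_ok(name))
```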
PAGE 273
With this set of configuration commands, both updaters have the same file-sets included.
PAGE 274
INCLUDEPURGE and EXCLUDEPURGE These updater attributes work exactly the same as INCLUDE and EXCLUDE, with the same wildcard functionality and the same performance ramifications. There is one additional consideration. The total number of INCLUDE, EXCLUDE, INCLUDEPURGE, and EXCLUDEPURGE clauses that you can have for one updater is 100. This means, for example, that you can have 25 of each of these clauses, but not one more.
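The combined limit can be sketched as a simple count across the four clause lists of one updater. The function name and the list-based representation are hypothetical; only the limit of 100 comes from the text above.

```python
MAX_CLAUSES_PER_UPDATER = 100  # combined limit stated above


def check_clause_total(include, exclude, includepurge, excludepurge):
    """Return the combined clause count, or raise if it exceeds the limit."""
    total = len(include) + len(exclude) + len(includepurge) + len(excludepurge)
    if total > MAX_CLAUSES_PER_UPDATER:
        raise ValueError(
            f"{total} clauses configured; one updater allows at most "
            f"{MAX_CLAUSES_PER_UPDATER}"
        )
    return total


clauses = [f"SUBV{i:02d}.*" for i in range(25)]  # hypothetical clause values
print(check_clause_total(clauses, clauses, clauses, clauses))
```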
PAGE 275
Replicating purge operations with INCLUDE, EXCLUDE, INCLUDEPURGE, and EXCLUDEPURGE lists You can combine INCLUDE, EXCLUDE, INCLUDEPURGE, and EXCLUDEPURGE settings for a volume. The RDF extractor decides the effect of these clauses on a purge operation (file/fileset) for an RDF protected volume. Table 16 lists the possible combinations and the effect of combining the clauses. NOTE: Only logical combinations that are frequently used are listed.
PAGE 276
SET VOLUME PROCESS $MM01
SET VOLUME UPDATEVOLUME $DATA01
SET VOLUME INCLUDE MMTEST10.*
SET VOLUME EXCLUDE MMTEST10.CONC0826
SET VOLUME INCLUDE DATA*.*
SET VOLUME EXCLUDE DATA*.C*
SET VOLUME INCLUDEPURGE MMTEST10.RR*
SET VOLUME EXCLUDEPURGE MMTEST10.RR1234
There is still one updater responsible for replicating changes from $DATA01 on the primary system to $DATA01 on the backup system, but the INCLUDE and EXCLUDE clauses explicitly identify which subvolumes and files on \PRIMARY.
PAGE 277
12 Subvolume Name Mapping RDF allows users to replicate data from primary system source subvolumes to differently named destination subvolumes on the backup system. However, the recommended configuration is still one-to-one mapping between source subvolumes on the primary system and their corresponding destination subvolumes on the backup system. One-to-one mapping ensures that each partition of a partitioned file or table is mapped to the correct backup subvolume.
PAGE 278
• Node names are not allowed in mapping strings.
• Volume names are not allowed in mapping strings. If the updater detects a $ character, it logs an error.
• Reserved names are not allowed in mapping strings. See the examples of invalid mapping strings listed below.
• When two or more mapping rules are present in a mapfile, the rule listed first always takes precedence if it fits. For example, assume these two mapping strings are present:
MAP NAMES SUBVOL1.* TO SUBVOL2.*
MAP NAMES SUBVOL*.
PAGE 279
How an Updater Manages Filename Collisions If you inadvertently map two subvolumes on the primary system to the same subvolume on the backup system for an updater, the updater detects the filename collision, logs EMS event 927, and abends. This approach prevents possible data corruption or disk failure. To illustrate how a filename collision might occur, assume that the mapping string for the updater that replicates from $DATA01 on the primary system to $DATA01 on the backup system is: MAP NAMES TEST1.
PAGE 280
In this example, $DATA01 is the name of the volume on the primary system, and MAPLOG is the keyword. Because MAPLOG is followed by end of line, it indicates that the maplog file on the backup system be turned off. You can also alter the maplog file to a different path. For example: ALTER VOLUME $DATA01 MAPLOG $DATA05.NAPCONFG.MAPLOG2 If a maplog is not properly constructed or formatted, the updater generates errors.
PAGE 281
To illustrate this problem scenario, assume these circumstances:
• You create an audited, partitioned, key-sequenced file (Enscribe, SQL/MP, or SQL/MX) on the primary system where the primary and secondary partitions are on the same subvolume at $DATA01.SVOL.FILE and $DATA02.SVOL.FILE.
• One updater replicates the changes for the primary partition $DATA01.SVOL.FILE on the primary system to $DATA11.SVOL1.FILE on the backup system using this mapping string: MAP NAMES SVOL.* TO SVOL1.
PAGE 282
13 Auxiliary Audit Trails In addition to the Master Audit Trail (MAT), RDF/IMPX and ZLT support protection of up to 15 auxiliary audit trails. If you want to protect data volumes associated with an auxiliary audit trail, you must configure an auxiliary extractor and an auxiliary receiver for that trail. Thus, for each auxiliary audit trail, there will be one auxiliary extractor-receiver pair. Auxiliary Extractor An auxiliary extractor can only be configured to a single auxiliary audit trail.
PAGE 283
• It is an error if the specified atindex does not correspond to a valid index of a configured auxiliary audit trail. That is, if you have configured two TMF auxiliary audit trails with the respective audit trail numbers of 1 and 2, you cannot configure an auxiliary extractor with an atindex value of 3.
• It is an error to specify two extractors or two receivers with the same atindex value.
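The two error conditions above can be sketched as a validation pass over the atindex values configured for one kind of process. This is an illustration, not RDFCOM behavior: the function name and message wording are hypothetical, and the sketch assumes that atindex 0 (the MAT) is always valid.

```python
def atindex_errors(kind, atindexes, configured_aux):
    """List configuration errors for the extractors or receivers of one kind.

    kind is a label such as "extractor" or "receiver"; atindexes holds the
    atindex of each configured process of that kind; configured_aux is the
    set of auxiliary audit trail numbers configured in TMF.
    """
    errors = []
    seen = set()
    for idx in atindexes:
        if idx != 0 and idx not in configured_aux:
            errors.append(f"{kind} atindex {idx}: no such configured auxiliary audit trail")
        if idx in seen:
            errors.append(f"{kind} atindex {idx}: already used by another {kind}")
        seen.add(idx)
    return errors


# Two auxiliary audit trails (1 and 2) are configured, as in the example above.
print(atindex_errors("extractor", [0, 1, 3], {1, 2}))
print(atindex_errors("receiver", [0, 1, 1], {1, 2}))
```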
PAGE 284
the TMF shutdown operation proceeds. The same can happen to the updaters when a stop-update-to-time operation enters the RDF subsystem: the updaters configured to an auxiliary audit trail may take longer to shut down if the auxiliary extractor has fallen behind. When the extractor finally catches up, the affected updaters are able to shut down.
PAGE 285
For more information about Expand multi-CPU paths, see the Expand Configuration and Management Manual.
PAGE 286
14 Network Transactions The RDF/IMPX and RDF/ZLT products are able to guarantee backup database consistency for transactions that update data residing on more than one RDF primary system. RDF/IMPX and RDF/ZLT can map the volumes being protected to both the MAT and auxiliary audit trails. NOTE: Network transaction processing is currently not supported in configurations that use the triple contingency feature. You must use RDF/IMPX or RDF/ZLT to protect all databases open to network transactions.
PAGE 287
NETWORKMASTER Attribute This attribute, located in the RDF configuration record, specifies whether or not the particular system is the master of the RDF network. Each RDF network has one, and only one, network master. To set this attribute, use this RDFCOM command: SET RDF NETWORKMASTER {ON | OFF} When this attribute is set to OFF (the default value), the particular system is not the network master. When this attribute is set to ON, the particular system is the network master of the RDF network.
PAGE 288
REMOTECONTROLSUBVOL (RCSV) Network Attribute The remote control subvolume (RCSV) is the name of the control subvolume used by the RDF subsystem configured for the specified primary and backup systems. It is set by this RDFCOM command. SET NETWORK REMOTECONTROLSUBVOL subvolume-name There is no default value. PNETTXVOLUME Network Attribute You only use this attribute when configuring the network master.
PAGE 289
(except the RDFNET process within the network master primary system) interacts with any other RDF subsystem in the RDF network. Therefore, the performance of an individual RDF subsystem is unaffected by its inclusion within an RDF network. RDF Takeovers Within a Network Environment With RDF/ZLT, no committed data from any primary system in the RDF network is lost. The discussions that follow regarding loss of data in a network takeover only apply to non-RDF/ZLT environments.
PAGE 290
The purger of the network master determines what network transactions are incomplete across the different backup systems, and it produces the master network undo list. Each purger then uses this master list to ascertain the transaction data that must be undone on its backup database. For example, if a network transaction involved only four of the ten primary systems in an RDF network, then that transaction only needs to be undone on the backup databases where that data was replicated.
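The relationship between the master network undo list and each backup system's own undo work can be sketched as follows. This is a simplified model under stated assumptions: a network transaction is treated as incomplete when at least one participating backup system has not received its commit, and the function names, data layout, and sample transactions are hypothetical.

```python
def master_undo_list(participants, commits_received):
    """Return the network transactions that are incomplete across backups.

    participants maps a transaction ID to the set of systems it updated.
    commits_received maps a system to the set of transaction IDs whose
    commit reached that system's backup.
    """
    return {
        tx
        for tx, systems in participants.items()
        if any(tx not in commits_received.get(s, set()) for s in systems)
    }


def local_undo_list(system, master_undo, participants):
    """Return the entries of the master list that touch this system's data."""
    return sorted(tx for tx in master_undo if system in participants[tx])


participants = {
    "T10": {"\\A", "\\B"},
    "T13": {"\\A", "\\B"},
    "T36": {"\\B", "\\C"},
}
commits_received = {
    "\\A": {"T10", "T13"},
    "\\B": {"T10", "T36"},  # the commit for T13 never reached this backup
    "\\C": {"T36"},
}
master = master_undo_list(participants, commits_received)
for system in ("\\A", "\\B", "\\C"):
    print(system, local_undo_list(system, master, participants))
```

In this sketch the backup for \C has nothing to undo because it never held data for the incomplete transaction, which matches the four-of-ten example above.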
PAGE 291
operation on the primary system. If, however, “kept-commits” have been encountered during phase 2 processing, a File Recovery position is not available; this is reported in RDF event 858. This last situation will never occur in an RDF/ZLT environment because a File Recovery position is always available with RDF/ZLT. If an RDF event 888 is reported, then the specified File Recovery position is based on both phase 1 and phase 3 processing. Each system logs its own File Recovery position.
PAGE 292
7. T13 (network transaction started on \B)
8. T13 commit
9. T14 (non-network transaction)
10. T15 (network transaction started on \A)
11. T14 commit
12. T15 commit
At approximately the same time system \B executes:
1. T10 (network transaction started on \A)
2. T20 (non-network transaction)
3. T12 (network transaction started on \A)
4. T13 (network transaction started on \B)
5. T21 (non-network transaction)
6. T22 (non-network transaction)
7. T36 (network transaction started on \C)
8. T21 commit
9.
PAGE 293
that is where T102 originated. Thus, on \M, the sequence of commit records on the audit trail will likely be T101 followed by T102, whereas on \N it will likely be T102 followed by T101. For these two reasons, we can be certain T101 and T102 did not alter the same data: • Transaction record locking would have prevented these transactions from altering the same data.
PAGE 294
5. When all other RDF subsystems in the RDF network list 0:00 extractor RTD times, issue the RDFCOM START RDF command on the system where you had stopped RDF.
6. Perform your shared access NonStop SQL/MP DDL operation on your primary system.
7. Follow the normal method for replicating shared access NonStop SQL/MP DDL operations on your backup system.
PAGE 295
special file on its PNETTXVOLUME volume. If the communications line to one of those primary systems is down, and you then issue a STOP RDF command on the network master’s primary system, the STOP RDF command could appear to hang. The reason for this is that the RDFNET process might be trying to open a file for the system whose path is down. In such a case, the RDFNET process waits until either the line comes back up or the Expand level-4 timer expires.
PAGE 296
SET SET ADD SET SET SET ADD RDF NETWORKMASTER ON RDF UPDATEREXCEPTION OFF RDF MONITOR CPUS 1:2 MONITOR PRIORITY 185 MONITOR PROCESS $MMON MONITOR SET SET SET SET SET ADD EXTRACTOR EXTRACTOR EXTRACTOR EXTRACTOR EXTRACTOR EXTRACTOR SET SET SET SET SET SET SET ADD RECEIVER RECEIVER RECEIVER RECEIVER RECEIVER RECEIVER RECEIVER RECEIVER SET SET SET SET SET ADD PURGER PURGER PURGER PURGER PURGER PURGER ATINDEX 0 CPUS 1:2 PRIORITY 185 PROCESS $MEX1 RTDWARNING 60 ATINDEX 0 CPUS 3:2 EXTENTS (100,100) PRIORI
PAGE 297
SET SET SET SET SET SET SET SET SET SET SET ADD RDF RDF RDF RDF RDF RDF RDF RDF RDF RDF RDF RDF SOFTWARELOC $SYSTEM.
PAGE 298
RDFCOM - T0346H09 – 11AUG08 (C)2008 Hewlett-Packard Development Company, L.P. Status of \RDF04 -> \RDF05 RDF 2008/08/11 05:26:49.082 Control Subvol: $SYSTEM.
PAGE 299
15 Process-Lockstep Operation The RDF/IMPX products include the process-lockstep operation, which is process-based. That is, when a process invokes the lockstep operation for a business transaction, the process must wait until all audit records associated with that business transaction are safely stored in image trails on the backup system before continuing. Process-lockstep is not needed with RDF/ZLT because ZLT functionality provides means whereby no committed data is ever lost during an unplanned outage.
PAGE 300
While the process waits until DoLockstep completes, other processes can view and modify the just-changed records, and this must be understood and taken into consideration by the application designer. NOTE: The lockstep capability cannot be used in an RDF network environment. Furthermore, you can have only one RDF subsystem configured for lockstep on a given node because the gateway can only be configured to a single extractor process.
PAGE 301
After recompiling your program, you must then decide whether you want to bind the object explicitly into your program or treat the object as a user library. Typically you should explicitly bind the object into your program. The object file (LSLIBTO) is very small, and there are no benefits to treating it as a user library. To bind LSLIBTO into your program, issue this statement: Select Search $vol.subvol.LSLIBTO where $vol.subvol is the location where you have placed the lockstep library file.
PAGE 302
LockStepNotDone (value is 31426) The RDF gateway process cannot be started. This status has the same ramifications as LockstepDisabled, and what your application does next is your decision. The Lockstep Transaction Remember, you must commit your business transaction before you call DoLockstep.
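The ordering rule above (commit first, then call DoLockstep, then decide what to do with the returned status) can be sketched as follows. This is only an illustration in Python: `commit` and `do_lockstep` are hypothetical stand-ins for the application's own TMF commit and DoLockstep calls, and `ok_status=0` is an assumed success value. Only the LockStepNotDone value (31426) comes from the text.

```python
from typing import Callable

# Status value quoted in the text for LockStepNotDone.
LOCKSTEP_NOT_DONE = 31426


def commit_then_lockstep(
    commit: Callable[[], bool],
    do_lockstep: Callable[[], int],
    ok_status: int = 0,
) -> str:
    """Commit the business transaction first, then request lockstep.

    `commit` and `do_lockstep` are hypothetical callables supplied by the
    caller. `ok_status` is an assumed success value; the real status codes
    are defined by the lockstep library, not by this sketch.
    """
    if not commit():
        # Never request lockstep for a transaction that did not commit.
        return "commit-failed"
    status = do_lockstep()
    if status == ok_status:
        return "lockstep-done"
    if status == LOCKSTEP_NOT_DONE:
        # The gateway could not be started; the application decides what next.
        return "lockstep-not-done"
    return "lockstep-other-status"
```

The function returns a label instead of raising an error because, as the text notes, what the application does after a failed lockstep is the application's decision.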
PAGE 303
MAT and associated auxiliary audit trails, when the lockstep transaction begins. Thus, when the lockstep audit data is committed on the backup system, all audit data generated in the MAT prior to that data is also guaranteed to be committed on the backup system. Lockstep and Auxiliary Audit Trails From T1226AAC, lockstep gateway supports transactions involving RDF protected auxiliary audit trail volumes. T1226AAC must be used only with T0346H11ACS and later versions of RDF/IMPX product.
PAGE 304
3. Run the changed SCF script. When SCF restarts the gateway, lockstep processing is disabled. Thus, if your application calls DOLOCKSTEP, the gateway will return control immediately to the application without doing lockstep processing. Disabling lockstep processing is a very useful feature. Suppose that the communications lines from your RDF primary to your RDF backup systems are down.
PAGE 305
Lockstep Gateway Event Messages Lockstep gateway messages are sent to the configured EMS event log (collector). You specify the EMS event log by using the SET RDF command described in Chapter 8 (page 175). 1 I/O completed on an unknown file number. Cause While reading $RECEIVE, the lockstep gateway received an I/O completion on an unknown file number. Effect The lockstep gateway stops. Recovery This is an internal error, but the gateway is restarted.
PAGE 306
indexnum is the audit trail index number of the specified extractor. Cause The RDF extractor is no longer responding, and it might be stopped. Effect The lockstep gateway stops. Recovery Determine why the RDF subsystem stopped, correct the problem, and then restart the subsystem. 5 The lockstep gateway received error errnum from the RDF extractor procname with ATINDEX indexnum. errnum is a file-system error number. procname is the name of the process that is in use.
PAGE 307
Recovery Correct the error condition and restart the lockstep gateway. 7 Open error errnum on filename for RDF extractor procname with ATINDEX indexnum. errnum is a file-system error number. filename is the name of a lockstep file. procname is the name of an RDF extractor process. indexnum is the audit trail index number of the specified extractor. Cause The lockstep gateway received the specified error while attempting to open the specified lockstep file. Effect The lockstep gateway stops.
PAGE 308
filename is the name of a lockstep file. procname is the name of an RDF extractor process. indexnum is the audit trail index number of the specified extractor. Cause The lockstep gateway received the specified error while attempting to update the specified lockstep file. Effect The lockstep gateway stops. Recovery Correct the error condition and restart the lockstep gateway. 10 The RDF lockstep file filename for RDF extractor procname with ATINDEX indexnum has an incorrect file code.
PAGE 309
Recovery Either the file was not created by the lockstep process, or the audit attribute was erroneously turned off. Purge the file and restart the lockstep gateway. 12 Invalid message-id msgid returned from RDF extractor procname with ATINDEX indexnum. msgid is the invalid message id. procname is the name of an RDF extractor process. indexnum is the audit trail index number of the specified extractor.
PAGE 310
Recovery This is an informational error, unless the gateway stops. If it stops, correct the condition that caused the error and then restart the gateway. 15 Read error errnum on $RECEIVE. errnum is a file-system error number. Cause The lockstep gateway received the specified error when reading $RECEIVE. Effect The lockstep gateway stops. Recovery This is an internal error, but the gateway is restarted.
PAGE 311
procname is the name of an RDF extractor process. indexnum is the audit trail index number of the specified extractor. Cause The lockstep gateway received the specified error while attempting to format the lockstep filename. Effect The lockstep gateway stops. Recovery This is an internal error, but the gateway is restarted. If the problem persists, contact the Global Mission Critical Solution Center (GMCSC) or your support representative. 19 Invalid process name procname for lockstep gateway.
PAGE 312
Recovery You must remove the node name for the extractor from your SCF script. 22 Open error errnum on RDF master extractor procname with ATINDEX indexnum. errnum is a file-system error number. procname is the name of an extractor process. indexnum is the audit trail index number of the specified extractor. Cause The specified error was returned when the lockstep gateway attempted to open the RDF extractor.
PAGE 313
filename is the name of a lockstep file. procname is the name of an extractor process. indexnum is the audit trail index number of the specified extractor. Cause The specified error was returned when the lockstep gateway attempted to lock the specified file. Effect The lockstep gateway stops. Recovery SCF automatically restarts the gateway. If the problem persists and the autorestart count is exhausted, correct the condition that caused the error and then restart the gateway.
PAGE 314
Cause The lockstep gateway is started. Effect The lockstep gateway continues its initialization activity. Recovery This is an informational message; no recovery is required. 28 RDF extractor procname with ATINDEX indexnum responded with error errnum to lockstep request. procname is the name of an extractor process. errnum is a file-system error number. indexnum is the audit trail index number of the specified extractor.
PAGE 315
Effect If the error is retryable, the lockstep gateway starts a new transaction. If the error is unexpected, the gateway stops. Recovery This is an informational error, unless the gateway stops. If it stops, correct the condition that caused the error and then restart the gateway. 31 Invalid process name: procname for the RDF master extractor. procname is the invalid process name.
PAGE 316
16 NonStop SQL/MX and RDF RDF supports replication of NonStop SQL/MX user tables (file code 550) and indexes (file code 552). These operations are supported in much the same way as they are with NonStop SQL/MP, and the same types of data and DDL operations are replicated.
PAGE 317
CREATE CATALOG BCAT LOCATION $DATA01; 3. If you want each catalog to be seen from both systems, register your primary and backup catalogs. To register the primary catalog on the backup system, issue a REGISTER CATALOG command on the primary system. To register the backup catalog on the primary system, issue a REGISTER CATALOG command on the backup system. The format of the REGISTER CATALOG command is: REGISTER CATALOG catalog ON node.
PAGE 318
schema, you must use the same subvolume here. If you did not specify the LOCATION clause when creating the primary system's schema, you must query the primary system to obtain the Guardian subvolume name, and you must use the Guardian subvolume name with the LOCATION clause here. For example, if issued on the backup system, this command creates a schema on the backup system called SCH in catalog BCAT using subvolume ZSDXYZ3A: CREATE SCHEMA BCAT.SCH LOCATION ZSDXYZ3A; 6.
PAGE 319
name but the Guardian filename will not normally be the same, even if the partitions are on different volumes. 7. Create each object on the backup system. The ANSI name of the object must be constructed as follows: • catalog name: use the name of the backup catalog you created in Step 2. • schema name: use the name you used in Steps 4 and 5. • table or index name: must match on the primary and backup systems. This command creates a table called TAB1 in the schema BCAT.
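The naming rule in this step can be sketched as a small helper. It assumes simple three-part ANSI names with no delimited identifiers, and it assumes the backup schema keeps the primary schema's name, as in the BCAT examples in this chapter. The function name is hypothetical and is not part of any NonStop SQL/MX tool.

```python
def backup_ansi_name(primary_name: str, backup_catalog: str) -> str:
    """Build the backup ANSI name for a primary catalog.schema.object name.

    Only the catalog part changes; the schema and object parts are kept,
    because the table or index name must match on both systems.
    """
    parts = primary_name.split(".")
    if len(parts) != 3 or not all(parts):
        raise ValueError("expected catalog.schema.object: %r" % primary_name)
    _catalog, schema, obj = parts
    return ".".join([backup_catalog, schema, obj])
```

For example, `backup_ansi_name("PCAT.MYSCHEMA.MYTABLE1", "BCAT")` yields `BCAT.MYSCHEMA.MYTABLE1`.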
PAGE 320
CREATE SCHEMA BCAT.SCH LOCATION ZSDXYZ3A; You must use the LOCATION clause. If you specified the LOCATION clause when creating the primary system's schema, you must use the same subvolume here.
PAGE 321
6. At the backup system, use the RESTORE utility to place the objects on the backup system, specifying the ANSI names for the backup system. Use the LOCATION clauses to have RESTORE place the objects in the correct Guardian locations. See “Restoring to a Specific Location” for general restore syntax for NonStop SQL/MX databases. For example, assume you have the objects on your primary system that have these fully qualified Guardian names: \pnode.$DATA01.ZSDABCDEF.FILE100 \pnode.$DATA02.ZSDABCDEF.
PAGE 322
described in “Creating a NonStop SQL/MX Backup Database From an Existing Primary Database” (page 319) to generate the LOCATION clauses for the temporary objects, modifying the volume names as necessary and using the primary node name for the -node option. Alternatively, you can use the SHOWDDL command to obtain the fully qualified filenames of the objects you want replicated and specify the same Guardian subvol.filenames in the corresponding LOCATION clauses when creating the temporary objects.
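The rule that the Guardian subvolume and file name stay the same while the node (and, where necessary, the volume) changes can be sketched as follows. The helper name, the `\bnode` node name used in the usage note, and the optional volume map are assumptions made for illustration. The sketch only rewrites names and does not generate LOCATION clauses.

```python
from typing import Dict, Optional


def backup_guardian_name(
    primary_file: str,
    backup_node: str,
    volume_map: Optional[Dict[str, str]] = None,
) -> str:
    """Map a fully qualified primary Guardian name to its backup name.

    The subvolume and file name are copied unchanged. The volume is kept
    unless `volume_map` (keyed by upper-case volume name) renames it.
    """
    parts = primary_file.split(".")
    if len(parts) != 4:
        raise ValueError("expected node.volume.subvolume.file: %r" % primary_file)
    node, volume, subvol, filename = parts
    if not node.startswith("\\") or not volume.startswith("$"):
        raise ValueError("not a fully qualified Guardian name: %r" % primary_file)
    new_volume = (volume_map or {}).get(volume.upper(), volume)
    return ".".join([backup_node, new_volume, subvol, filename])
```

With a hypothetical backup node `\bnode`, the name `\pnode.$DATA01.ZSDABCDEF.FILE100` maps to `\bnode.$DATA01.ZSDABCDEF.FILE100`.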
PAGE 323
The backup database is now ready for RDF replication, and you can drop the temporary catalog.schema.objects on your primary system. Creating the Fuzzy Copy on the Backup System The advantage of this method is that it eliminates the use of temporary objects as well as tape handling because you create your backup objects directly on the backup system.
PAGE 324
Indirectly From the Primary to the Backup By Way of a Temporary File If the number of rows to load over the network is too great, you can use a temporary file on the primary system: 1. Create a temporary catalog on your primary system to correspond to your regular catalog on your primary system whose objects you want RDF to replicate. 2. Create a temporary schema for your temporary catalog. Follow the instructions given in “Creating a NonStop SQL/MX Backup Database From an Existing Primary Database”.
PAGE 325
and then load the data into the backup partition using the INSERT statement also described in that topic. Correcting Incorrect NonStop SQL/MX Name Mapping Primary and Backup ANSI Catalog Are the Same If you created the primary and backup catalogs and used the same name for both, you cannot use the REGISTER CATALOG command to make either catalog visible on the other system.
PAGE 326
necessarily correspond to the sequence in which the user specifies columns on CREATE TABLE.
PAGE 327
You are restoring four tables from two different schemas in catalog PCAT.
Schema information:
Primary schema name   Schema subvolume   Backup schema name
PCAT.MYSCHEMA         ZSDAAAAA           BCAT.MYSCHEMA
PCAT.MYSCHEMAX        ZSDBBBBB           BCAT.MYSCHEMAX
Table and Index information:
Table or Index Name: PCAT.MYSCHEMA.MYTABLE1 PCAT.MYSCHEMA.MYINDEX1
Guardian Names for partitions and indexes: \P.$data01.ZSDAAAAA.HEBFRW00 \P.$data02.ZSDAAAAA.HEBFRX00 \P.$data03.ZSDAAAAA.HEBFRY00 \P.$data02.ZSDAAAAA.YREWPO00
PCAT.
PAGE 328
Comparing NonStop SQL/MX Tables While the unsupported RDFCHEK utility program can be used to compare Enscribe files or NonStop SQL/MP tables, it cannot be used to compare NonStop SQL/MX tables. If you need to compare a NonStop SQL/MX table on your primary against a NonStop SQL/MX table on your backup system, for example, one method of doing so is as follows: 1. Use the NonStop SQL/MX Select statement to select all rows in the primary table, and then store them in an Enscribe entry-sequenced file. 2.
PAGE 329
17 Zero Lost Transactions (ZLT) Zero Lost Transactions (ZLT), functionality that is available only with the RDF/ZLT product, ensures that no transactions that commit on the primary system are lost on the RDF backup system if that primary system is downed by an unplanned outage. RDF achieves this through the use of remote mirroring for the relevant TMF audit trail volume(s).
PAGE 330
Figure 14 ZLT Configuration With a Single Standby/Backup System
[Figure labels: System A (RDF primary); System B (RDF backup and standby); RDF traffic (Expand lines); X fabric; Y fabric; Audit-Trail Disk (limited by disk technology)]
Figure 15 shows the configuration where a single system serves as both the standby and backup systems, and the remote mirror is located at an intermediate site.
PAGE 331
Figure 16 shows the configuration where individual standby and backup systems are located at separate sites.
Figure 16 ZLT Configuration With Standby and Backup Systems Located at Separate Sites
[Figure labels: System A (RDF primary); System B (RDF backup); RDF traffic (Expand lines); X fabric; Y fabric; Audit-Trail Disk (limited by disk technology); ZLT recovery traffic (Expand lines)]
PAGE 332
If you lose your primary system due to an unplanned outage, you connect the remote mirrors to the standby system, and then initiate a takeover operation on the backup system. Before performing the takeover, RDF reads the remaining audit records from the remote mirrors, and processes those audit records. Thus, RDF can read absolutely all of the audit records that were generated on the primary system prior to the system failure, and no committed data is lost.
PAGE 333
NOTE: Because the remote mirrors will be connected to your standby system in the event of an unplanned takeover, you should choose disk names that will not conflict with disks already connected to the standby system. ZLT is currently only supported with an HP StorageWorks XP disk array.
PAGE 334
NOTE: The hardware configuration for use of a remote mirror on the audit trail is not part of the RDF configuration, nor is it part of any RDF validation. You must have placed the remote mirror in a location where you can connect it to the standby system at the time of a takeover. If you fail to do this, the RDF takeover operation fails until you have connected the remote mirror to the standby system or turned off remote mirroring. See “ALTER RDF Remote Mirror Configuration” (page 334).
PAGE 335
NOTE: Before issuing the TAKEOVER command, you must have connected the remote mirrors to the standby system. When the remote mirrors are connected to the standby system, the audit records on the remote mirrors have no relationship to the audit trail on the standby system. The remote mirrors are not part of the TMF configuration of the standby system. Phase 1 (ZLT Processing) RDFCOM stops all RDF processes on the backup system.
PAGE 336
If an extractor cannot find an audit file it needs because the disk has not yet been mounted, the extractor abends and the takeover operation aborts. If you have not yet mounted the disk the extractor needs, you must mount it before reissuing the TAKEOVER command. If the remote mirror cannot be mounted and you want to do the takeover without the ZLT guarantee, you can alter the RDF REMOTE MIRROR attribute on the backup system to off.
PAGE 337
1. Determine which disks (the local disk on the primary system or the remote mirror on the standby system) for all audit trails in the RDF configuration received the most audit records. The example that follows shows how to do so for the MAT. If your RDF configuration includes one or more auxiliary audit trails, you must do the same for each auxiliary audit trail.
PAGE 338
e. Start TMF. When startup is complete, the database on the primary system contains the same data that the database on the backup system had at the conclusion of the RDF takeover operation. CommitHoldMode OFF or Disabled If any local disk (the MAT or any auxiliary audit trail) has more audit records than the corresponding remote mirror (this can only happen if CommitHold was not configured or was configured but disabled on the primary system when the outage occurred): 1.
PAGE 339
A RDF Commands Quick Reference The syntax rules for the RDFCOM and RDFSCAN commands, explained in detail in Chapter 8 (page 175) and Chapter 9 (page 254), are summarized in this appendix. This appendix, which is written for system managers and operators, summarizes the syntax descriptions for: • The command to run RDFCOM from the Guardian user interface to the NonStop operating system. See “RDFCOM Run Syntax”. • The RDFCOM commands, listed in alphabetical order, beginning with the ADD command.
PAGE 340
{RECEIVER receiver-option }
{PURGER purger-option }
{RDFNET netsync-option }
{TRIGGER {trigger-type } {trigger-option } }
{VOLUME updater-option }
COPYAUDIT
The COPYAUDIT command copies missing audit records from the backup system that has the most to the backup system that has the least. This command is only for use with the triple contingency feature.
Where Issued: Backup system only (the backup system with the least amount of audit records).
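The direction of a COPYAUDIT operation (from the backup system with the most audit to the one with the least, issued on the latter) can be sketched as below. Representing "amount of audit" as a (sequence number, relative byte address) pair is an assumption made for this sketch; the function name and the system names in the usage note are hypothetical.

```python
from typing import Dict, Tuple

# Assumed representation of how far a backup system has received audit:
# (audit trail sequence number, relative byte address).
AuditPosition = Tuple[int, int]


def copyaudit_roles(positions: Dict[str, AuditPosition]) -> Tuple[str, str]:
    """Return (source_system, issuing_system) for two backup systems.

    The source is the backup system that is further ahead; the issuing
    system is the one that is behind and needs the missing audit.
    """
    if len(positions) != 2:
        raise ValueError("exactly two backup systems are expected")
    behind, ahead = sorted(positions, key=lambda name: positions[name])
    if positions[behind] == positions[ahead]:
        raise ValueError("both systems hold the same audit; nothing to copy")
    return ahead, behind
```

For two hypothetical backup systems at positions (12, 500) and (12, 900), the second is the source and the command would be issued on the first.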
PAGE 341
Security: Any user. HISTORY INFO The INFO command displays the current configuration parameter values from the configuration file for the specified process or other object. Where Issued: Primary or backup system. Security: Any user.
PAGE 342
filename can be a spooler location or a file. In case of a file, it must be an EDIT file. seconds must be from 1 to 32767. repeat must be from 1 to 32767. OBEY The OBEY command executes a series of commands entered in an OBEY command file. Where Issued: Primary or backup system. Security: Any user. OBEY [\system.][$volume.][subvolume.]file OPEN The OPEN command specifies the RDF control subvolume to which subsequent commands in this RDFCOM session apply. Where Issued: Primary or backup system.
PAGE 343
{PRIORITY priority }
{PROCESS process-name }
{ATINDEX audittrail-index-number }
{RTDWARNING rtd-time }
{VOLUME volume-name }
SET IMAGETRAIL
The SET IMAGETRAIL command associates an image trail with a specific audit trail on the primary system. The supplied value is not applied to the RDF configuration file, however, until you issue an ADD IMAGETRAIL command.
Where Issued: Primary system only.
Security: Super-user group member.
PAGE 344
SET RDF The SET RDF command sets the designated RDF global configuration parameters to the supplied values within the RDF configuration memory table. The supplied values are not applied to the RDF configuration file, however, until you issue an ADD command. Where Issued: Primary system only. Security: Super-user group member.
PAGE 345
{ATINDEX atindex } {CPUS primary-CPU : backup-CPU } {EXTENTS (primary-extent-size,secondary-extent-size)} {PRIORITY priority-number } {PROCESS process-name } {RDFVOLUME volume } {FASTUPDATEMODE on-off value } SET TRIGGER The SET TRIGGER command sets trigger parameters within the RDF configuration memory table. The supplied values are not applied to the RDF configuration file, however, until you issue an ADD TRIGGER command. The trigger type (REVERSE or TAKEOVER) is specified in the ADD TRIGGER command.
PAGE 346
{IMAGETRAIL }
{TRIGGER }
{VOLUME }
{PURGER }
{RDFNET }
{NETWORK }
START RDF
The START RDF command starts the RDF subsystem.
Where Issued: Primary system only.
Security: Super-user group member with remote password from the primary system to the backup.
START RDF [,UPDATE {ON | OFF}]
START UPDATE
The START UPDATE command starts all updater processes on the backup system.
Where Issued: Primary system only.
Security: Super-user group member with remote password from the primary system to the backup.
PAGE 347
STOP UPDATE The STOP UPDATE command suspends updating of the backup database and stops all updater processes. Where Issued: Primary system only. Security: Super-user group member with remote password from the primary system to the backup. STOP UPDATE [, TIMESTAMP timestamp ] TAKEOVER The TAKEOVER command causes the backup database to become the database of record. Where Issued: Backup system only. Security: Super-user group member.
PAGE 348
FILE The FILE command selects the RDF log file to which subsequent RDFSCAN commands apply. FILE [\system.][$volume.][subvolume.]file HELP The HELP command displays the syntax of RDFSCAN commands or introductory information about the RDFSCAN utility. HELP [ ALL ] [ INTRO ] [ command ] LIST The LIST command displays a specified number of log messages that contain the current match pattern.
PAGE 349
[\system.][$volume.]temp-file Nondisk Device Names The syntax for a file name that identifies a nondisk device is: [\system.]device-name[.qualifier] or [\system.]ldev-number Process File Names RDFCOM commands can refer to (and display information about) named processes. In these commands, process names can include no more than six characters: a dollar sign followed by one letter followed by one to four alphanumeric characters.
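The process-name rule quoted above translates directly into a pattern check. The sketch below follows the rule exactly as stated (a dollar sign, one letter, then one to four alphanumeric characters), so a two-character name such as `$M` is rejected. The function name is hypothetical, and reserved-name checks are not included.

```python
import re

# "$" + one letter + one to four alphanumeric characters (3 to 6 characters).
_PROCESS_NAME = re.compile(r"\$[A-Za-z][A-Za-z0-9]{1,4}")


def is_valid_process_name(name: str) -> bool:
    """Check a process name against the rule as stated in the text."""
    return _PROCESS_NAME.fullmatch(name) is not None
```

Names used elsewhere in this manual, such as `$MEX1`, `$MMON`, and `$MRECV`, pass this check.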
PAGE 350
B Additional Reference Information This appendix provides additional reference information about: • “Default Configuration Parameters” (page 350) • “Sample Configuration File” (page 351) • “RDFSNOOP Utility” (page 353) • “RDF System Files” (page 354) • “RDF File Codes” (page 355) Process names are also reserved: $X* , $Y* , and $Z*. Certain keywords in the NonStop SQL/MP product are reserved words in SQL commands. Those reserved words are listed in the SQL/MP Reference Manual.
PAGE 351
Parameter                 Default Value(s)   MIN    MAX
RECEIVER EXTENTS          (100,100)          10     65500
RECEIVER PRIORITY         165                10     199
RECEIVER RDFVOLUME        $SYSTEM            n.a.   n.a.
RECEIVER FASTUPDATEMODE   off                n.a.   n.a.
TRIGGER CPUS              0:1                0      15
TRIGGER PRIORITY          150                10     199
TRIGGER WAIT              WAIT               n.a.   n.a.
TRIGGER NOWAIT            WAIT               n.a.   n.a.
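The MIN and MAX columns for the numeric parameters listed here can be used for a simple pre-check of values before they are placed in a configuration command file. The sketch assumes that, for paired values such as EXTENTS and CPUS, the limits apply to each component separately; the dictionary and function names are hypothetical.

```python
# (MIN, MAX) limits for the numeric parameters listed in the table.
LIMITS = {
    "RECEIVER EXTENTS": (10, 65500),
    "RECEIVER PRIORITY": (10, 199),
    "TRIGGER CPUS": (0, 15),
    "TRIGGER PRIORITY": (10, 199),
}


def in_range(parameter: str, *values: int) -> bool:
    """Return True if every supplied value lies within MIN..MAX inclusive.

    Assumption: for two-part values (extent sizes, CPU pairs) the limits
    are applied to each part on its own.
    """
    if not values:
        raise ValueError("at least one value is required")
    low, high = LIMITS[parameter]
    return all(low <= value <= high for value in values)
```

For example, `in_range("RECEIVER EXTENTS", 100, 100)` accepts the default extents, while `in_range("RECEIVER PRIORITY", 200)` rejects a priority above 199.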
PAGE 352
| ***
| *** Set the receiver parameters.
| *** $REC is the name of the receiver process.
| ***
SET RECEIVER CPUS 1:2
SET RECEIVER EXTENTS (1000,1000)
SET RECEIVER PRIORITY 165
SET RECEIVER RDFVOLUME $GOLD
SET RECEIVER FASTUPDATEMODE ON
SET RECEIVER PROCESS $MRECV
| ***
| *** Add the receiver parameters to the
| *** RDF configuration file.
| ***
ADD RECEIVER
| ***
| *** Add secondary image trails.
PAGE 353
| ***
| *** Set the updater parameters for the third
| *** volume to be protected by the RDF product.
| *** $U03 is the name of this updater. Volume
| *** $DB3 on the backup node corresponds to
| *** the volume $DB03 on the primary node.
| *** Note that the IMAGEVOLUME parameter is omitted;
| *** it defaults to $SECIT2 because it was not reset
| *** after the previous ADD VOLUME command.
PAGE 354
RDF System Files The following files are created by the RDF subsystem and used by RDF processes: • Configuration file This is a key-sequenced file with record length 4062. The configuration file contains an internal representation of the configuration parameters that are set through RDFCOM commands. The configuration file resides on both the primary and backup node; on both nodes, the configuration file is named: $SYSTEM.control-subvolume.
PAGE 355
because the volumes were down. A record for each transaction and file is stored in the ZFILEINC file. If a volume is re-enabled on the primary system and TMF Backout is able to undo the audit data it could not previously undo, then the corresponding records are removed from the ZFILEINC file. The ZFILEINC file resides on the backup node and is named $SYSTEM.control-subvolume.ZFILEINC. • RDFTKOVR file This file records whether an RDF Takeover operation has completed successfully.
PAGE 356
C Messages This appendix describes the messages generated by RDF.
PAGE 357
on the sending system. (3) The name of the system on which the particular RDF process is running. (4) The name or process ID of the RDF process that issued this message. (5) The message number. (6) The message text that explains the log entry. If the EMS event log is $0 (the default collector), only items (3), (4), (5) and (6) are logged because of file-length restrictions. The pages that follow list all the RDF messages that RDF produces. The messages appear in ascending order by message number.
PAGE 358
Effect Variable, depending on the process and the particular error: the operation might be retried, or the error might cause RDF to abort. Recovery Correct the error reported in the message and, if RDF aborted, restart RDF. 702 Program version is inconsistent program expected-expected received-received program is the name of the program file that RDF tried to execute. expected is the expected version number of the program. received is the actual version number of the program, as reflected by the program file.
PAGE 359
705 File Open Error error on [ANSI-object-type ANSI-name, Partition partition-id,] file filename error is the file-system error number that identifies the specific error. ANSI-object-type is the ANSI object type (for example, table, index, and so on). ANSI-name is the ANSI name of the SQL/MX object that encountered the error. partition-id is the partition ID of the SQL/MX object that encountered the error. filename is the Guardian file name of the file that encountered the error.
PAGE 360
707 TMF is not yet started Cause The extractor detected that TMF has not been started yet. Effect RDF cannot run if TMF is not also running. Normally RDFCOM will recognize that TMF has not been started and will prevent RDF from starting. In the case of an RDF 707 event, TMF was running when RDFCOM verified that TMF was started, but TMF was then stopped before the extractor was started. If the extractor detects that TMF is not started, it aborts itself, and the monitor aborts the receiver and itself.
PAGE 361
Cause The updater failed to obtain process information about itself. This is a fatal error. Effect The updater process abends. Recovery This is an internal error. Contact your service provider. 712 Process creation error nnn nnn, file filename nnn nnn are the upper and lower bytes, respectively, of the status code reported by the NEWPROCESS procedure. filename is the name of the program file that was to be executed. Cause The monitor encountered an error while attempting to create an RDF process.
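Message 712 reports the process-creation status as two numbers, the upper and lower bytes of the status code. If you have the status as a single value, the two numbers can be derived as in this sketch, which assumes a 16-bit status word. The function name is hypothetical, and the meaning of each byte is defined by the NEWPROCESS documentation, not by this example.

```python
from typing import Tuple


def split_status(status: int) -> Tuple[int, int]:
    """Split a 16-bit status word into (upper byte, lower byte)."""
    if not 0 <= status <= 0xFFFF:
        raise ValueError("status must fit in 16 bits: %r" % status)
    # Upper byte is bits 8..15; lower byte is bits 0..7.
    return (status >> 8) & 0xFF, status & 0xFF
```

Under that assumption, a status word of 0x0302 would be reported in the message as the pair 3 and 2.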
PAGE 362
error is the file-system error number that identifies the specific error. Cause A call to the checkpoint procedure failed, and the backup process of a NonStop process pair is still running. The message includes the number of the file-system error that was encountered when the primary process was trying to communicate with the backup process. Effect The backup process is stopped, and a new one is created after about 15 seconds.
PAGE 363
Cause The original backup process of a NonStop process pair has successfully created a backup process in the configured primary processor and has successfully switched processing to that process. Effect The NonStop process pair is switching primary and backup roles so that the primary process is now running in the CPU configured as the primary processor. Recovery This is an informational message; no recovery is required.
PAGE 364
Tell your database administrator that this error occurred. The database administrator should consider checking the synchronization of the primary and backup databases. 722 Waiting for audit trail file restoration, SNO sno sno is the sequence number of the audit trail for which the extractor is waiting. Cause The extractor has requested that the specified audit trail file be restored. Effect The extractor waits until the audit trail file is restored.
PAGE 365
failure or a STOP command entered manually from TACL. Because the updater might not have processed all image audit, the RDF TAKEOVER operation cannot be considered complete. Scan the EMS event log for RDF message 726: this message identifies the updater process that did not complete TAKEOVER processing. Effect Normal purger shutdown processing continues. Recovery If UPDATE was OFF at the time of the RDF TAKEOVER, then a second RDF TAKEOVER operation is automatically started, and no recovery is required.
PAGE 366
priority is the priority requested for the process. Cause An attempt to alter the priority of an RDF process to the indicated priority has failed. Effect The process continues to run at its current priority. Recovery This is an informational message; no recovery is required. Reissue the ALTER command. 730 Process priority altered priority priority is the priority requested for the process. Cause The operator successfully changed the priority of an RDF process to priority.
PAGE 367
filename is the Guardian file name of the file that is affected by the DDL operation. Cause The updater has found a Stop-RDF-Updater record in the image trail. This special record is generated in the TMF audit trail on the primary system when an SQL DDL operation WITH SHARED ACCESS involving the specified file has completed. Each updater will stop when it reaches this record in the image trail. Effect The updaters stop.
PAGE 368
Cause The purger logs this message after the successful completion of an RDF takeover operation. • If all data volumes on the primary system are configured to the MAT, then the reported position is the end of the last record received from the extractor. • If any data volumes on the primary system are configured to auxiliary audit trails, then the reported position is the end of the last commit or abort record received from the extractor for which no data from any auxiliary audit trail is missing.
PAGE 369
738 RDF extractor synch established SNO sno RBA rba sno is the sequence number of the TMF Master Audit Trail (MAT) for which the synchronization point was established. rba is the relative byte address of the synchronization point. Cause This message indicates that the receiver has sent the extractor a starting position in the TMF audit trail, and that the extractor has thereby become synchronized with the receiver. Effect Extractor is synchronized with the receiver.
PAGE 370
Either the file must be redefined on the primary node, or the other volume must be made protected by RDF. In the latter case, the backup file must then be resynchronized with the primary file. 741 RDF extractor message out of order Cause The receiver has received a message from the extractor that is out of order. When this event occurs, the extractor automatically reestablishes synchronization with the receiver.
PAGE 371
filename is the Guardian file name of the file that encountered the error. Cause The updater was previously delayed in obtaining information about the specified object. See RDF error 736. The information has now been obtained. Effect The updater continues processing. Recovery This is an informational message; no recovery is required. 745 Audit record conversion error error error is the error number.
PAGE 372
748 Internal error - RDF extractor abending Cause The extractor has detected an audit record of an unknown version. Effect The extractor process abends. Recovery This is an internal error. Contact your service provider. 749 Old audit record format encountered Cause The extractor has detected an audit record generated by an unsupported version of TMF. Effect The extractor abends. Recovery Reinitialize RDF. You might need to resynchronize the primary and backup databases.
PAGE 373
Recovery See the description of the FILE_OPEN_CHKPT_ procedure in the Guardian Procedure Calls Reference Manual to determine the cause of the failure. If possible, correct the underlying cause to avoid its reoccurrence. 752 Audit block RBN out of sequence. File filename RBN rbn RBA rba filename is the name of the audit trail file that contained the error. rbn is the relative block number of the block where the error occurred in the audit trail file.
PAGE 374
Effect Processing continues from the point at which the network failed. Recovery This is an informational message; no recovery is required. 755 CHECKMONITOR failure - backup abended Cause The primary process of a process pair stopped after creating its backup process, but before completing the backup initialization. Effect This is a catastrophic error; the process abends, and RDF stops. Recovery Restart the RDF product and report the error to your service provider.
PAGE 375
759 Secondary partition on unknown node filename filename is the name of the affected file. Cause An updater has encountered an audit record associated with either an Enscribe create, an increase of MAXEXTENTS for an Enscribe file, or a PURGEDATA operation for an Enscribe file, and the file on the primary system has secondary partitions that are located on different systems in the network. Effect The updater skips this record.
PAGE 376
Recovery Purge all existing context and configuration files on the primary and backup system. Then initialize the RDF subsystem. 763 Process incompatible with local system Cause The process reporting the error has determined that it has been installed on the wrong operating system. Effect The process abends. Recovery Install the version of the RDF product that is compatible with the installed release of the operating system.
PAGE 377
Effect The extractor continues with processing the second part of phase one. Recovery This is an informational message; no recovery is required. 767 Phase one part 2 database synchronization complete Cause The second part of phase one of a database synchronization operation has completed. Effect The extractor continues with the third part of phase one. Recovery This is an informational message; no recovery is required.
PAGE 378
Cause The receiver has successfully completed its initialization. Effect The receiver is prepared to receive data from the extractor. Recovery This is an informational message; no recovery is required. 772 TMF is not running on the remote system Cause The receiver has determined that TMF is not started on the RDF backup system. Effect The receiver abends. Recovery Start TMF on the backup system and then restart RDF.
PAGE 379
Recovery This is an informational message; no recovery is required. 776 Remote RDF receiver shutdown complete Cause The receiver has terminated normal processing as the result of a STOP TMF, STOP RDF, or TAKEOVER command. Effect Normal RDF shutdown processing continues. Recovery This is an informational message; no recovery is required. 777 Unexpected STOP SYNCH message received Cause The extractor has received a STOP SYNCH message, but it is not involved in a database synchronization operation.
PAGE 380
filename is the name of the exception file. Cause An updater is unable to apply image records for some transactions because a TAKEOVER command was executed and the commit, abort, or data records for the transactions were not sent to the backup system. The message includes the name of the exception file containing information about the image records that were not applied. Effect If all the records for a transaction are not received on the backup node, the transaction is treated as if it aborted.
PAGE 381
Effect If the monitor or extractor process receives a file-system error 14 (process does not exist), RDF will shut down on the primary node. Recovery If RDF was stopped on the remote node by a STOP RDF command while the communications lines were down, simply restart RDF by issuing a START RDF command. 784 Shutdown pending STOP UPDATE, TIMESTAMP timestamp timestamp is the specified timestamp. Cause The process has received notice that an RDFCOM STOP UPDATE, TIMESTAMP command was executed.
PAGE 382
filename is the name of the image trail file that contained the error. sno is the sequence number where the error occurred. rba is the relative byte address where the error occurred. Cause The receiver or an updater has encountered the indicated error while attempting to position into an image file. Effect The process abends. Recovery Correct the problem that caused the error and then restart RDF. 788 ALLOCATESEGMENT failure.
PAGE 383
Effect The primary process will now run in fault-tolerant mode. Recovery This is an informational message; no recovery is required. 796 Image file creation error error on filename error is the file-system error number that identifies the specific error. filename is the name of the image file associated with the error. Cause The receiver or purger process could not create the specified file due to the specified file-system error. Effect This is a catastrophic error; the process abends and RDF stops.
PAGE 384
798 Image trail file open error error on filename error is the file-system error number that identifies the specific error. filename is the name of the image file associated with the error. Cause An RDF process encountered the specified file-system error while attempting to open the specified file. Effect The process abends, and RDF stops. The exception to this is an error 12 (file in use) issued when either the receiver or purger attempts to open the file.
PAGE 385
Effect The message includes the error number returned by the WRITE system procedure followed by the file name. For error 43 (unable to obtain disk space for extent), the receiver retries the write operation. All other errors are fatal; the receiver abends, and RDF stops. Recovery The only recovery from an error 43 condition is to free some disk space.
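The error handling described above can be sketched as a small decision function. This is an illustration only, not RDF code: the function name `receiver_write_action` and the returned strings are hypothetical, and only the special treatment of error 43 is taken from the text.

```python
# Illustrative sketch only; the names below are hypothetical, not part of RDF.
DISK_SPACE_ERROR = 43  # file-system error 43: unable to obtain disk space for extent


def receiver_write_action(error: int) -> str:
    """Return the documented reaction to a WRITE error on an image file."""
    if error == DISK_SPACE_ERROR:
        return "retry"  # the receiver retries the write operation
    return "abend"      # all other errors are fatal; the receiver abends and RDF stops


print(receiver_write_action(43))
print(receiver_write_action(48))
```

Freeing disk space remains the operator's task; the sketch only shows which errors lead to a retry and which are fatal.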
PAGE 386
Recovery You should determine the cause of the error and take appropriate corrective action. 804 READUPDATELOCK error error on filename error is the file-system error number that identifies the specific error. filename is the name of the file on which the error occurred. Cause The RDFNET process has encountered the specified error on the specified file. Effect The RDFNET process aborts its current transaction, posts a timer, and waits for that timer to expire before attempting a new transaction.
PAGE 387
Recovery This is an informational message; no recovery is required. 808 Update mode has been set OFF Cause The operator issued a STOP UPDATE command. Effect RDF stops updating the backup database. Recovery This is an informational message; no recovery is required. 809 Shutting down in response to STOP RDF Cause The operator issued a STOP RDF command. Effect The RDF process stops normally. Recovery This is an informational message; no recovery is required.
PAGE 388
Cause The updater encountered a file-system error while attempting to communicate with the receiver or purger. The file-system error number and the name of the receiver or purger are included in the message. Effect This is a catastrophic error; the updater abends, and RDF will abort. Recovery Determine the cause of the error. If the receiver or purger did not abend, correct the condition, and restart RDF.
PAGE 389
816 Image trail file SETMODE error error on filename error is a file-system error number. filename is the name of the image file associated with the error. Cause The receiver or purger process has encountered an error while attempting to perform a setmode operation on the specified file. Effect The process abends. Recovery Correct the problem that led to the error and restart RDF.
PAGE 390
Recovery A subsequent warm start of RDF might be possible, but the success of the restart depends on the nature of the failure that caused the original process to stop. If the message is issued during ZLT processing, no recovery is required. 820 RDF receiver stopped unexpectedly, receiver receiver is the name of the receiver process that stopped. Cause The receiver has stopped unexpectedly. The message includes the name of the stopped process. Effect This message is issued by the RDF monitor.
PAGE 391
transaction marked for undo on a different node in the RDF network. Note, however, that this transaction could still be undone during final checking for business consistency across all backup nodes. Effect This is an internal event. There is no effect. Recovery This is an informational message; no recovery is required. 824 Missing RDF extractor config record, ATINDEX audit-trail-index Cause The RDF monitor was unable to find an extractor configuration record when performing a START RDF command.
PAGE 392
828 Killing backup process ... Cause The primary process of a process pair has detected a problem in communicating with the backup process. An earlier message will have indicated the communications problem. Effect Because of the severity of the problem, the primary process attempts to stop the backup process. If the backup process stops, the primary process then attempts to create a new backup process. Recovery This is an informational message; no recovery is required.
PAGE 393
You might be able to correct the underlying problem and restart RDF. Otherwise it might be necessary to reinitialize RDF. 832 Open error error on filename error is the file-system error number that identifies the specific error. filename is the name of the affected file. Cause The RDFNET process obtained the specified error in attempting to open the specified file. Effect The RDFNET process restarts. Recovery You should determine the nature of the error and take corrective action.
PAGE 394
835 RDFCOM csv command-text [ issued by userid ] csv specifies the RDF control subvolume of the affected RDF environment. command-text is the text of the command that was issued. userid if present, is the Guardian userid (group.user) of the user who issued the command. Cause RDFCOM logs this message whenever you issue any of these commands: ALTER, INITIALIZE RDF, START RDF, START UPDATE, STOP RDF, STOP UPDATE, or TAKEOVER. command-text is the command text.
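For readers who post-process event logs, the message 835 template can be split into its fields roughly as follows. This is a minimal sketch under assumptions: the exact spacing of the logged text, the sample control subvolume `RDF01`, and the sample userid `SUPER.OPER` are hypothetical and are not taken from the manual.

```python
import re

# Assumed layout: "RDFCOM <csv> <command-text> [issued by <group.user>]".
# The literal spacing and the userid pattern are assumptions for illustration.
PATTERN = re.compile(
    r"^RDFCOM\s+(?P<csv>\S+)\s+(?P<command>.+?)(?:\s+issued by\s+(?P<userid>\S+))?$"
)


def parse_rdfcom_event(text):
    """Split an event 835 text into csv, command, and optional userid."""
    match = PATTERN.match(text.strip())
    if match is None:
        return None
    return match.groupdict()


sample = "RDFCOM RDF01 STOP UPDATE issued by SUPER.OPER"
print(parse_rdfcom_event(sample))
```

When the optional `issued by` clause is absent, the `userid` field comes back as `None`.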
PAGE 395
an updater restart, file-system errors 10, 11, and 71 are not reported by the updater because they probably represent database operations that have already been performed. Recovery Perform any corrective actions suggested by the preceding messages (actions such as reloading the appropriate CPU, correcting the underlying file error condition). 838 RDFNET process has terminated unexpectedly Cause The RDFNET process has terminated unexpectedly. Effect This message is issued by the RDF monitor.
PAGE 396
841 Error - Unable to complete STOP UPDATE. Error error error is a file-system error number. Cause The monitor was unable to send a shutdown message to an updater because of the indicated file-system error. Effect The monitor terminates the attempt to send STOP UPDATE messages to any other updaters. It then sends an ABORT RDF message to all the other RDF processes and waits for them to stop.
PAGE 397
845 Initialization synchronization completed Cause The updater that generated this message has completed its synchronization work for RDF initialization to an initialization timestamp. Effect The updater resumes its normal processing. Recovery This is an informational message; no recovery is required. 846 RDF TAKEOVER during database synchronization Cause When the updater completed its RDF Takeover operation, it had not yet completed its role in an online database synchronization.
PAGE 398
Cause TMF was stopped during an RDF online database synchronization operation, before the extractor had completed its phase one processing. Effect The extractor abends because the database synchronization operation can no longer succeed. Recovery You must reinitialize the RDF product and restart the online database synchronization operation.
PAGE 399
Recovery Contact your service provider for assistance with recovering from this situation. 854 ZTXUNDO file cannot be opened Cause While attempting to write the list of transactions that need to be undone to the ZTXUNDO file, that file could not be opened. Effect RDF aborts. Recovery If the operation involves an RDF takeover, take corrective action to enable the file to be opened and then reissue the TAKEOVER command.
PAGE 400
858 A safe File Recovery position does not exist Cause A network takeover operation has completed, but, for this particular node in the RDF network, there is no safe MAT position with which you can issue a File Recovery operation on your primary system should that node become available again. Effect There is no effect. Recovery This is an informational message; no recovery is required. 859 Error error on BEGINTRANSACTION encountered error is an error number.
PAGE 401
Cause The extractor has fallen behind the configured RTDWARNING threshold specified in the RDF configuration. Effect The extractor continues normal processing. Recovery This is an informational message. You should, however, try to determine why the extractor has fallen behind and take corrective action if necessary. 862 Updater processname RTD (rtd) exceeds RTD warning threshold (threshold#) processname is an updater process name. rtd is an RTD value. threshold# is an RTDWARNING warning threshold value.
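The threshold comparison behind message 862 can be illustrated with a short sketch. It assumes that the RTD and the RTDWARNING threshold are both expressed in seconds and that "exceeds" means strictly greater than; the function name `rtd_warning` and the process name `$RUPD1` are hypothetical.

```python
def rtd_warning(process_name, rtd_seconds, threshold_seconds):
    """Return a warning text when the RTD exceeds the threshold, else None.

    Assumption: both values are in seconds; this is not the RDF implementation.
    """
    if rtd_seconds > threshold_seconds:
        return (
            f"Updater {process_name} RTD ({rtd_seconds}) exceeds "
            f"RTD warning threshold ({threshold_seconds})"
        )
    return None


print(rtd_warning("$RUPD1", 90, 60))
```

An RTD equal to the threshold produces no warning in this sketch.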
PAGE 402
865 Missing purger config record Cause The purger configuration record is not in the RDF configuration file. Effect The reporting process abends and RDF will abort. Recovery Restart RDF. If the problem persists, contact your service provider. 866 RDF purger stopped unexpectedly Cause The purger process has terminated unexpectedly. Effect RDF aborts. Recovery Determine why the purger stopped, and then restart RDF. If the problem persists, contact your service provider.
PAGE 403
Recovery This is an internal error. Report this to your service provider. 872 Warning; Lockstep operation is denied Cause An application has attempted an RDF lockstep operation but you have not configured RDF for lockstep operations. Effect No lockstep operations can take place, and the lockstep gateway process abends. Recovery This is a warning. If you want lockstep operations, you must stop RDF and create a new RDF configuration with the RDF LOCKSTEPVOL attribute set.
PAGE 404
Effect The process abends and RDF will abort. Recovery The file must be altered or recreated with the correct file format and then RDF can be restarted. 876 Imagetrail safe position: SNO sno RBA rba sno is the sequence number. rba is the relative byte address. Cause This is an imagetrail safe position. Effect This is an internal event. There is no effect. Recovery This is an informational message for historical purposes about a pending undo pass. No recovery action is required.
PAGE 405
Effect The purger process will not process any more files in that particular subvolume. The operation will be attempted again after PURGETIME minutes. Recovery The file or table being reported was not created by RDF and is not part of the RDF environment. It must be manually purged or renamed out of the image subvolume before the purger process can continue normal processing.
PAGE 406
Cause The named RDF process' current transaction has been aborted by TMF and the disk process. Effect The process restarts. Recovery This is an informational message. You must examine the event log to determine why the process is restarting and if any recovery action is required. 883 Physical volumes in pool exceed the limit of 21 Cause The updater is configured to a virtual SMF disk that consists of more than 21 physical disks. This configuration is not supported by the RDF product.
PAGE 407
887 Process trapped. Signal sig-num sig-num is the signal number associated with the trap. Cause The process reporting the event experienced an internal error and has trapped. The message indicates the signal number associated with the trap. Effect The process abends. Recovery This is an internal error. Preserve the saveabend file created and report the incident to your service provider. 888 MAT position for File Recovery: SNO num RBA num Cause A successful takeover has completed.
PAGE 408
891 First network transaction to be undone: %identifier identifier is the transaction identifier. Cause This is the first transaction that requires network undo with respect to business consistency across all backup nodes. All transactions that committed after this transaction are undone on this node except those transactions that can be safely kept because they actually committed before this transaction on one or more different primary nodes in the RDF network.
PAGE 409
Cause The auxiliary receiver has detected information about an SQL SHARED ACCESS DDL operation associated with the specified SNO and RBA in the Master Audit Trail (MAT) on your primary system. Effect The auxiliary receiver coordinates stopping its updaters at the correct location. Recovery This is an informational message. When all updaters on all trails have shut down for the SQL NSA operation, you can execute the same DDL operation on your backup system.
PAGE 410
Recovery If this happens during a takeover operation, reissue the TAKEOVER command. When the updater restarts the table will automatically be resized to accommodate the required number of transactions. If this happens in a stop-update-to-time operation, restart RDF and issue a new STOP UPDATE, TIMESTAMP command at a time when there are fewer transactions active.
PAGE 411
Effect If this message comes from the master receiver, then normal RDF TAKEOVER processing is ready to proceed. If it comes from an auxiliary receiver, then normal RDF TAKEOVER processing does not start until the master receiver has finished ZLT processing. Recovery This is an informational message; no recovery is required. 904 Auxiliary receiver is catching up Cause The master receiver has found a TMF shutdown record.
PAGE 412
907 Backup process creation error nnn nnn, file filename nnn nnn are the error number and error detail returned by the PROCESS_CREATE_ system procedure. filename is the name of the program file that was to be executed. Cause The primary process of an RDF NonStop process pair encountered an error while attempting to create its backup process.
PAGE 413
Cause The purger logs this event whenever all updaters have stopped following a STOP UPDATE command. Effect The updater processes are stopped. Recovery This is an informational message; no recovery is required. 911 Updaters stopped before STOP RDF, DRAIN has completed Cause The purger has detected that all the updaters have stopped, but at least one updater stopped prematurely and did not drain all audit. Effect The STOP RDF, DRAIN is not complete.
PAGE 414
915 Drain operation complete but a primary volume is down Cause A STOP RDF, DRAIN or STOP RDF, REVERSE command has completed but RDF has detected that a volume on the primary node is down. Effect Any transactions that touched the affected volume that were active when the volume went down are unresolved (on both the primary and backup systems). If this event is the result of a STOP RDF, REVERSE, the REVERSE trigger is not executed.
PAGE 415
Recovery Restart RDF and the STOP TMF record will be processed as normal. Once RDF has stopped as a result of the STOP TMF record, RDF can be restarted and a new STOP RDF, DRAIN (or REVERSE) command can be issued.
PAGE 416
index is the string index at which the mapping string is invalid. filename is the name of the updater mapfile specified in the updater configuration. cause identifies the reason for the mapping string to be invalid. Cause The updater has detected that the specified mapping string is invalid. The position, if included in the event, indicates the string index at which the mapping string is invalid. The event also includes a description of the reason for the string to be invalid.
PAGE 417
mapping-string identifies the invalid mapping string. index is the string index at which the $ character is detected. filename is the name of the updater mapfile specified in the updater configuration. Cause The updater has detected the $ character in the specified mapping string. The volume name is not allowed in the mapping string. The position indicates the string index at which the $ character is found. Effect The updater stops and RDF aborts.
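A mapfile can be checked for this condition before RDF is started. The sketch below is an assumption-laden illustration: the sample mapping strings are hypothetical, and the index is 0-based, which may differ from the index that the updater reports.

```python
def find_volume_marker(mapping_string):
    """Return the 0-based index of the first '$' character, or None if absent."""
    index = mapping_string.find("$")
    return None if index == -1 else index


# Hypothetical mapping strings, for illustration only.
bad = "MAP NAMES $DATA.SRCSV.* TO DSTSV.*"
good = "MAP NAMES SRCSV.* TO DSTSV.*"
print(find_volume_marker(bad), find_volume_marker(good))
```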
PAGE 418
ANSI-object-type is the ANSI object type (for example, table or index). ANSI-name is the ANSI name of the affected SQL/MX object on the backup system. partition-id is the partition ID of the affected SQL/MX object on the backup system. name is the Guardian filename of the affected object on the backup system. Cause The updater has found a Stop-RDF-Updater record in the image trail.
PAGE 419
935 The number of image trail files on image trail is greater than the threshold value of 25. is the number of image trail files. is the respective image trail volume. Cause After completing a purge pass for the specified image trail volume, the purger found that the specified number of image trail files still remained in the image trail volume. Effect Purger continues normal execution and the purge pass terminates as expected.
PAGE 420
Effect The command fails. Recovery See the Guardian Procedure Errors and Messages Manual for a description of and recovery actions for the file-system error. Correct the error indicated by error#. This might necessitate altering the IMAGETRAIL configuration parameter to specify a new volume for the secondary image trail. Then, reenter the START RDF or VALIDATE command. ALTER Failed: error# error# is the file-system error number that identifies the specific error. Cause An ALTER command failed.
PAGE 421
Cause RDFCOM could not resolve an ambiguous timestamp after a change to Local Civil Time (LCT) when the LCT was changed relative to Greenwich Mean Time (GMT). In this situation, the same LCT corresponds to two different GMT values, and it is impossible to determine the intended time. This problem typically occurs in autumn, when the clock is set back from Daylight Saving Time. If you specify an RDF initialization timestamp between 1:00 AM and 2:00 AM on that day, the intended time is ambiguous.
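The ambiguity can be demonstrated with a short sketch: a wall-clock reading inside the repeated hour can denote two different GMT instants. The fixed offsets (UTC-4 for daylight time, UTC-5 for standard time) and the date are example values chosen for illustration, and the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Example offsets for illustration: daylight time UTC-4, standard time UTC-5.
DAYLIGHT = timezone(timedelta(hours=-4))
STANDARD = timezone(timedelta(hours=-5))


def candidate_utc_instants(local_naive):
    """Return the UTC instants that a wall-clock reading could denote on fall-back day."""
    return [
        local_naive.replace(tzinfo=tz).astimezone(timezone.utc)
        for tz in (DAYLIGHT, STANDARD)
    ]


wall_clock = datetime(2013, 11, 3, 1, 30)  # 1:30 AM on a day the clock is set back
first, second = candidate_utc_instants(wall_clock)
print(first.isoformat(), second.isoformat())
```

The two candidates are one hour apart, which is why a timestamp in that window cannot be resolved without further information.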
PAGE 422
Recovery You must execute the RDF TAKEOVER operation on both backup systems before you can use the COPYAUDIT command of the triple contingency protocol. An RDF TAKEOVER has completed Cause The operator issued a TAKEOVER command at the backup node and the TAKEOVER operation has completed. Effect The backup database becomes the database of record. Recovery This is an informational message; no recovery is required.
PAGE 423
Cause You tried to add an updater with the specified ATINDEX, but there is no receiver with that value. Effect The ADD command fails. Recovery Reenter the ADD command specifying a correct ATINDEX. Backup node in network master record is incorrect primary system Cause The network master network record does not have the specified backup system name for the local RDF subsystem. Effect Validation fails. Recovery You must reconfigure your network master.
PAGE 424
Messages Manual. If possible, correct the error and reenter the command that encountered the error. Otherwise, see your system manager. Cannot delete NETWORK PNETTXVOLUME volume volume is the name of an RDF data volume. Cause You have attempted to delete an updater that protects the specified PNETTXVOLUME. This is not allowed because it would break your RDF network. Effect Delete fails.
PAGE 425
Messages Manual. If possible, correct the error and reenter the command that encountered the error. Otherwise, see your system manager. Couldn't create or clear the PRIMARYSYSTEM CONTEXT (error#) error# is the file-system error number that identifies the specific error. Cause The context file data cannot be created or cleared while START RDF processing is performed after INITIALIZE RDF. Effect START RDF processing is aborted. Recovery See the Operator Messages Manual for a description of the error code.
PAGE 426
Effect The command fails. Recovery Wait until both CPUs become available, and reenter the START RDF command. If necessary, see your system manager. cpu:cpu CPUS are not SYSGEN’d cpu:cpu are the primary and backup CPUs, respectively. Cause A START RDF command failed because the specified CPUs do not exist (they were not configured during SYSGEN). Effect The command fails. Recovery Reconfigure RDF to use other CPUs or, if you must use the specified CPUs, see your system manager.
PAGE 427
Cause You tried to add a volume to the RDF configuration and use the online database synchronization feature, but RDF/IMP or IMPX is not installed. Effect The ADD command fails. Recovery If you want to perform online database synchronization, RDF/IMP or IMPX must be installed on both the primary and backup systems.
PAGE 428
error# is the file-system error number that identifies the specific error. filename is the name of an RDF configuration file on the control subvolume. Cause While RDF was attempting to check if an RDF control file existed in $SYSTEM.control-subvolume on the backup system, file-system error error# was returned. Effect The INITIALIZE RDF command aborts. Recovery See the Operator Messages Manual for a description of the error code.
PAGE 429
Effect The command fails. Recovery See the Guardian Procedure Errors and Messages Manual for a description of and recovery actions for the file-system error. Correct the error indicated by error#, and then reenter the command. Error error# obtaining FILECODE and EOF of the MAPLOG filename error# is the file-system error number that identifies the specific error. filename is the name of the updater maplog specified in the updater configuration.
PAGE 430
filename is the name of the image trail file associated with the error. Cause The COPYAUDIT command encountered the specified error while attempting to create the specified image file on the local image trail volume. Effect The COPYAUDIT command aborts. Recovery See the Operator Messages Manual for a description of the error code. For additional details about understanding and correcting file-system errors, see the Guardian Procedure Errors and Messages Manual.
PAGE 431
filename is the name of the updater maplog specified in the updater configuration. Cause A PURGEDATA operation returned an error when RDFCOM tried to clear the updater maplog file while an ADD VOLUME, ALTER VOLUME, or START RDF command was being executed. Effect The command fails. Recovery See the Guardian Procedure Errors and Messages Manual for a description of the recovery actions for the file-system error. Correct the error indicated by error#, then reenter the command.
PAGE 432
Messages Manual. Take appropriate corrective action, and then reissue the COPYAUDIT command. Error error# on setmode for large transfers error# is the file-system error number that identifies the specific error. Cause The COPYAUDIT command encountered the specified error while attempting to perform a SET MODE operation to enable large transfers of data. Effect The COPYAUDIT command aborts. Recovery See the Operator Messages Manual for a description of the error code.
PAGE 433
Effect The command fails. Recovery Add the keyword MAP in the mapping string, then reenter the command. Expected NAMES in the mapping string mapping-string in the MAPFILE filename mapping-string is the erroneous mapping string specified in the mapfile. filename is the name of the updater mapfile specified in the updater configuration.
PAGE 434
Recovery Check the syntax rules for the command you entered. Perhaps you misspelled a keyword parameter or misplaced a delimiter. Expecting 'Yes' or 'No' response. Cause You have entered an unexpected response to an RDFCOM prompt that requires only either YES (or Y) or NO (or N) as verification to proceed with your request. Effect The requested operation does not take place. Recovery Reenter your request, this time specifying either YES, Y, NO, or N to the prompt. Expecting 'Yes' or 'No' response.
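The accepted responses to such a prompt can be sketched as a small validator. The function name is hypothetical, and treating the input as case-insensitive is an assumption; the text itself only lists YES, Y, NO, and N.

```python
def parse_confirmation(response):
    """Map a prompt response to True (yes), False (no), or None (unexpected)."""
    answer = response.strip().upper()  # case-insensitivity is an assumption
    if answer in ("YES", "Y"):
        return True
    if answer in ("NO", "N"):
        return False
    return None  # the requested operation would not take place


print(parse_confirmation("y"), parse_confirmation("NO"), parse_confirmation("maybe"))
```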
PAGE 435
Cause An ADD EXTRACTOR command was issued when the configuration file already contained an extractor record. Effect The command fails. Recovery No recovery is required if you want to use the existing extractor process as configured. If you want to change any of the extractor’s configuration options, however, enter an ALTER EXTRACTOR command that specifies those changes. EXTRACTOR record NOT found Cause The INFO command could not find an extractor record in the configuration file. Effect The command fails.
PAGE 436
File is not an edit file filename. filename is the name of the user-specified file for generating the report. Cause When a VALIDATE CONFIGURATION or LIST UPDATERFILEOPENS command was executed, RDFCOM determined that a user-specified file name for report generation already exists and is not an edit file. Effect The command fails. Recovery Provide an EDIT file and reenter the command.
PAGE 437
Effect The command fails. Recovery When issuing the INITIALIZE RDF command through an RDFCOM IN file with the "#" operator, also specify the "!" operator. Illegal File Format: error# error# is the NEWPROCESS error number that identifies the specific error. Cause A NEWPROCESS error occurred during START RDF or TAKEOVER processing. Effect The START RDF or TAKEOVER operation is aborted. Recovery See the Operator Messages Manual for a description of the error code.
PAGE 438
Recovery Add the image trail. Then add the updater. IMAGETRAIL for IMAGEVOLUME vol-name does not exist or the atindex of the IMAGEVOLUME does not match the updater’s atindex vol-name is the image trail volume. Cause You tried to add an updater for a particular ATINDEX, but there is no imagetrail configuration for that value. Either you have not yet added the imagetrail or you added it with a different ATINDEX. Effect The ADD command fails. Recovery Review and revise your RDF configuration.
PAGE 439
Cause While validating your configuration, RDFCOM determined that the image trail on the volume volume-name is not referenced by any updater processes. Effect The validation operation aborts. Recovery Delete this image trail. IMAGETRAIL volume-name record not found; DELETE aborted volume-name is the name of the secondary image trail that was specified in the command. Cause You tried to delete a secondary image trail that does not exist. Effect The command fails.
PAGE 440
Recovery Reenter the command, using a correct volume name. Inconsistent network options are not allowed Cause You have attempted to add the RDF configuration record with the RDF NetworkMaster attribute on but the Network attribute off. Effect The configuration command fails. Recovery You must determine whether you are in an RDF network or not. Initialization point for timestamp has been found timestamp is an INITTIME timestamp specified previously by an operator in an RDFCOM INITIALIZE RDF command.
PAGE 441
Cause This message follows a previous error message that indicates why the INITIALIZE RDF command failed. Effect RDF is not initialized. Recovery Examine the error message immediately preceding this one, correct the condition reported, and reenter the INITIALIZE RDF command. Internal consistency error on Network records Cause RDFCOM has detected an internal error that indicates inconsistency. Effect The configuration command fails.
PAGE 442
Invalid spooler location specified filename. filename is the user-specified spooler location for generating the report. Cause When a VALIDATE CONFIGURATION or LIST UPDATERFILEOPENS command was executed, RDFCOM determined that the user-specified spooler location for report generation is not a valid spooler location. Effect The command fails. Recovery Reenter the command with a valid filename or spooler location.
PAGE 443
Effect The START RDF or TAKEOVER operation is aborted. Recovery See the Operator Messages Manual for a description of the error code. For additional information about understanding and correcting process errors, see the Guardian Procedure Errors and Messages Manual. Correct the error and reenter the START RDF or TAKEOVER command. LIST not allowed when RDF is not running. Cause The LIST UPDATERFILEOPENS command was executed when RDF was not running. Effect The LIST UPDATERFILEOPENS command fails.
PAGE 444
subvolume-name is the erroneous subvolume specified in the mapping string. Cause RDFCOM expected * in the filename portion of the subvolume indicated by subvolume-name when an ADD VOLUME, ALTER VOLUME, START RDF, START UPDATE, or VALIDATE CONFIGURATION command was being executed. Effect The command fails. Recovery Correct the mapping string, then reenter the command.
PAGE 445
mapping-string is the erroneous mapping string specified in the MAPFILE. index is the erroneous position in the mapping string. filename is the name of the updater MAPFILE specified in the updater configuration. Cause RDFCOM detected that the mapping string is invalid at the position indicated by index when an ADD VOLUME, ALTER VOLUME, START RDF, START UPDATE, or VALIDATE CONFIGURATION command was being executed. Effect The command fails. Recovery Correct the mapping string, then reenter the command.
PAGE 446
MONITOR Record exists, use ALTER MONITOR Cause An ADD MONITOR command was issued when the configuration file already contained a monitor record. Effect The command fails. Recovery No recovery is required if you want to use the existing monitor process as configured. If you want to change any of the monitor’s configuration options, however, enter an ALTER MONITOR command that specifies those changes. MONITOR record NOT found Cause The INFO command could not find a monitor record in the configuration file.
PAGE 447
Effect RDF cannot be started. Recovery Correct the VOLUME INCLUDE associated with the PNETTXVOLUME so that the file $volume.control-subvolume.ZRDFNETX is included. Network synch file ZRDFNETX must not be EXCLUDED Cause An EXCLUDE pattern has been specified that will cause audit associated with the NetSynch data file to be filtered out. Effect RDF cannot be started. Recovery Correct the VOLUME EXCLUDE associated with the PNETTXVOLUME so that the EXCLUDE pattern does not exclude the file $volume.
PAGE 448
Effect The command fails. Recovery See the Guardian Procedure Errors and Messages Manual for a description and recovery actions for the file-system error. Correct the error indicated by error#, then reenter the command. No image files could be found in the imagetrail on volume-name.subvolume-name volume-name is the name of the image trail’s volume. subvolume-name is the name of the image trail’s subvolume. Cause The COPYAUDIT command could not find any image files on the remote image trail.
PAGE 449
Cause The RDF configuration file is invalid. Effect The configuration validation fails. Recovery Check the updater process parameters in the configuration file for invalid values and correct any errors found. No VOLUMES are configured for ATINDEX atindex Cause You added an extractor and receiver with the specified ATINDEX, but there are no updaters with that value. Effect The validation fails. Recovery Add at least one updater with the same ATINDEX value or delete the particular extractor-receiver pair.
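The "No VOLUMES are configured for ATINDEX atindex" condition above can be checked ahead of validation by comparing the ATINDEX values of the configured extractor-receiver pairs with those of the updaters. The sketch below assumes the values have already been collected into two plain lists. The function name and inputs are illustrative only.

```python
def atindexes_without_updaters(extractor_receiver_atindexes, updater_atindexes):
    """Return the sorted ATINDEX values that have an extractor-receiver
    pair configured but no updater with the same value.

    Hypothetical helper: both arguments are plain iterables of integers
    gathered by the caller, not something RDFCOM returns in this form.
    """
    # Set difference keeps only the indexes that no updater covers.
    return sorted(set(extractor_receiver_atindexes) - set(updater_atindexes))
```

Each value returned corresponds to one instance of the message: either add an updater with that ATINDEX or delete the extractor-receiver pair, as the Recovery text states.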
PAGE 450
Recovery Select another command, or ask an authorized person in the super ID group to issue the command for you. Open error error# on filename error# is the file-system error number that identifies the specific error. filename is the name of the TMF Master Audit Trail (MAT). Cause An open error occurred on the TMF Master Audit Trail (MAT). Effect The file is not opened. Recovery See the Guardian Procedure Errors and Messages Manual for a description of the recovery actions for the file-system error.
PAGE 451
Cause The command can be issued only at the backup system. Effect The command fails. Recovery Enter another command. Operation can only be performed on the PRIMARYSYSTEM \primary primary is the name of the primary node that can perform the operation. Cause The command can be issued only at the primary node. Effect The command fails. Recovery Enter another command. Operation is NOT allowed when RDF is running Cause The command is not allowed while RDF is running. Effect The command fails.
PAGE 452
Recovery This is an informational message; no recovery is required. You can terminate the operation at any time by pressing the BREAK (or equivalent) key. PNETTXVOLUME volume for ctrl-subvol must be protected by a MAT based-updater volume is the name of an RDF data volume. ctrl-subvol is the name of an RDF subsystem control subvolume. Cause The specified volume for the RDF subsystem with the specified control subvolume is not configured to the Master Audit Trail (MAT). Effect Validation fails.
PAGE 453
Cause The COPYAUDIT command encountered the specified error while attempting to position into a local image file on the local image trail. Effect The COPYAUDIT command aborts. Recovery See the Operator Messages Manual for a description of the error code. For additional details about understanding and correcting file-system errors, see the Guardian Procedure Errors and Messages Manual. If possible, correct the error and reenter the COPYAUDIT command. Otherwise, contact your service provider.
PAGE 454
Effect Report generation fails. Recovery See the Guardian Procedure Errors and Messages Manual for a description of the recovery actions for the file-system error. Correct the error indicated by error#, then reenter the command. Primary node in network master record is incorrect primary-system primary-system is an RDF system name. Cause The network master network record does not have the specified primary system name. Effect Validation fails. Recovery You must reconfigure your network master.
PAGE 455
Effect The command fails. Recovery Assign a new name to the process. Then, reenter the START RDF or VALIDATE command. Process procname already used in local system. procname is the name of the RDF process you specified in the current configuration. Cause When a START RDF command or a VALIDATE CONFIGURATION command is executed, RDFCOM determined that a user specified process name is already in use in a local system. Effect The command fails. Recovery Assign a new name to the process.
PAGE 456
RDF already in TAKEOVER processing Cause The operator issued a TAKEOVER command at the backup node while RDF was performing a TAKEOVER operation. Effect The last TAKEOVER command is ignored. Recovery This is an informational message; no recovery is required. RDF configuration file is not open, use OPEN command Cause The configuration file must be open before any RDFCOM commands other than OPEN or OBEY can be executed. Effect The attempted operation is aborted.
PAGE 457
Recovery Validate all your non network master subsystems and then validate your local subsystem. RDF (\primary -> \backup) is NOT running Cause A STATUS RDF or STOP RDF command was issued while RDF was stopped. Effect The command fails. Recovery Reissue the command after RDF is started. RDF record exists, use ALTER RDF Cause An ADD RDF command was issued when the configuration file already contained an RDF global record. Effect The command fails.
PAGE 458
ctrl-subvol is the name of an RDF subsystem control subvolume. Cause The RDF subsystem that you specified as your network master has not been configured as a network master. Effect Validation fails. Recovery You need to reconfigure your local subsystem and specify the control subvolume of your network master. You might also need to reconfigure your network master. RDF subsystem ctrl-subvol stopped. TMF audit trails remain pinned. ctrl-subvol is the name of an RDF subsystem control subvolume.
PAGE 459
Effect The configuration command fails. Recovery Do not add this record. RDFVOLUME in network master record is incorrect rdf-vol. rdf-vol is the RDFVOLUME specified in the network record of the network master. Cause The RDFVOLUME of the current RDF configuration does not match the value specified in the network record of the network master. Effect Validation fails. Recovery You must reconfigure your network master and possibly your local configuration. RDFVOLUME in network master record is invalid.
PAGE 460
Messages Manual. If possible, correct the underlying error and reenter the COPYAUDIT command. Otherwise, contact your service provider. Read error error# on remote image file error# is the file-system error number that identifies the specific error. Cause The COPYAUDIT command encountered the specified error while attempting to read data from a remote image file on the remote image trail. Effect The COPYAUDIT command aborts. Recovery See the Operator Messages Manual for a description of the error code.
PAGE 461
Messages Manual. Make sufficient space available on disk, and then reenter the START RDF command. RECEIVER RDFVOL error error# on creation error# is the file-system error number that identifies the specific error. Cause During execution of a START RDF command or a VALIDATE CONFIGURATION command, RDFCOM determined that sufficient disk storage for image files did not exist on the RDFVOLUME.
PAGE 462
Recovery Add the receiver’s record. Then, add the secondary image trail. RECEIVER record exists, use ALTER RECEIVER Cause An ADD RECEIVER command was issued when the configuration file already contained a receiver record. Effect The command fails. Recovery No recovery is required if you want to use the existing receiver process as it is configured. If you want to change any of the receiver’s configuration options, however, enter an ALTER RECEIVER command that specifies those changes.
PAGE 463
Effect The COPYAUDIT command aborts. Recovery When the remote system becomes available, reissue the COPYAUDIT command. Remote system for Triple Contingency CopyAudit command is unknown: remote-system remote-system is the name of the RDF backup system that received the most audit. Cause You entered a COPYAUDIT command, but remote-system is unknown to RDF. Effect The COPYAUDIT command aborts. Recovery Find the correct node name for the other RDF backup system, and reissue the COPYAUDIT command.
PAGE 464
volume is one of the RDF image trail volumes on the remote system named in the COPYAUDIT command. Cause The COPYAUDIT command is about to search for missing audit; this audit reached the specified image trail on the remote system but did not reach the local system before the original primary system was lost. Effect The COPYAUDIT command begins the search. Recovery This is an informational message; no recovery is required.
PAGE 465
Recovery See the Operator Messages Manual for a description of the error code. For additional details about understanding and correcting errors, see the Guardian Procedure Errors and Messages Manual. If possible, correct the error and reenter the command that encountered the error. Otherwise, see your system manager. Specified network backup system name is not defined. Cause The specified backup system does not exist. Effect The configuration command fails.
PAGE 466
Recovery If all files of the MAT are currently on disk (for instance, the files from AA000001 to the current audit file), then the specified timestamp is earlier than the last time TMF was initialized. To recover, you need to reexamine the EMS log or the OPRLOG for a later TMF shutdown point or stop TMF and use that shutdown point.
PAGE 467
Cause You are attempting to execute the RDFCOM STOP SYNCH command, but either the RDF product is not running or another critical operation is already in progress. Effect The RDFCOM STOP SYNCH command aborts. Recovery Correct the situation and then reissue the command. STOP SYNCH command is aborted because database synchronization is not in progress. Cause You are attempting to execute an RDFCOM STOP SYNCH command, but online database synchronization is not in progress.
PAGE 468
SUFFIX must be a single alphanumeric character Cause You specified an incorrect value for the SUFFIX option of the INITIALIZE RDF command. Effect The INITIALIZE RDF command is aborted. Recovery Reenter the command, specifying a single alphanumeric character for the suffix character. Swap File Error: error# error# is the file-system error number that identifies the specific error. Cause A NEWPROCESS error occurred during START RDF or TAKEOVER processing. Effect The command fails.
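The SUFFIX rule quoted above (a single alphanumeric character) is simple to test before the INITIALIZE RDF command is entered. The following is a minimal sketch, assuming "alphanumeric" means an ASCII letter or digit. That reading, and the function name, are assumptions rather than anything stated in the message text.

```python
def is_valid_suffix(suffix: str) -> bool:
    """Return True if suffix is exactly one ASCII letter or digit.

    Assumption: "alphanumeric" is read as ASCII A-Z, a-z, 0-9.
    """
    # Length check first, then restrict to ASCII so that accented
    # letters and other Unicode alphanumerics are not accepted.
    return len(suffix) == 1 and suffix.isascii() and suffix.isalnum()
```

Values such as "A" or "7" pass this check, while an empty string, "AB", or "$" do not.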
PAGE 469
Cause A takeover operation is underway. Effect The takeover operation continues. Recovery This is an informational message; no recovery is required. TAKEOVER command is not allowed in an OBEY/IN file without the bang (!) option. Cause A TAKEOVER command was issued through an OBEY/IN file without the bang (!) option. Effect The TAKEOVER operation is aborted. Recovery Specify the bang (!) option along with the TAKEOVER command in the OBEY/IN file, or issue the TAKEOVER command from the RDFCOM command prompt.
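Because a TAKEOVER command in an OBEY/IN file is rejected without the bang (!) option, it can help to scan such a file before it is used. The sketch below is only an illustration. It assumes one command per line and treats any "!" on a TAKEOVER line as the option being present, which is a simplification of the real command syntax. The function name is hypothetical.

```python
def takeover_lines_missing_bang(obey_text: str) -> list:
    """Return 1-based line numbers of TAKEOVER commands that lack '!'.

    Assumptions: one command per line, and a '!' anywhere on the line
    counts as the bang option. RDFCOM does the authoritative parsing.
    """
    missing = []
    for number, line in enumerate(obey_text.splitlines(), start=1):
        command = line.strip().upper()
        # Flag only lines that begin with TAKEOVER and carry no bang.
        if command.startswith("TAKEOVER") and "!" not in command:
            missing.append(number)
    return missing
```

Any line number returned points at a command that would trigger the message above if the file were executed as written.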
PAGE 470
Cause You must specify a process name for the extractor process before issuing an ADD command. Effect The start command fails. Recovery You must reconfigure RDF with a named extractor process. The last record in the local imagetrail on volume-name.subvolume-name could not be found in the remote trail volume-name is the name of the image trail’s volume. subvolume-name is the name of the image trail’s subvolume.
PAGE 471
error# is the file-system error number that identifies the specific error. Cause Create operation returned an error when RDFCOM tried to create the updater maplog file when an ADD VOLUME, ALTER VOLUME, START UPDATE, or START RDF command was being executed. Effect The command fails. Recovery See the Guardian Procedure Errors and Messages Manual for a description of the recovery actions for the file-system error. Correct the error indicated by error#, then reenter the command.
PAGE 472
The number of physical UPDATEVOLUMES exceeds 255. RDF/IMPX is required for this many volumes. Cause RDF has detected that the user is running RDF/IMP and the total number of physical volumes for all UPDATEVOLUMEs exceeds 255. This can happen where some, or all, of the UPDATEVOLUMEs are SMF virtual disks. Effect RDF will not start.
PAGE 473
Cause You tried to execute an INITIALIZE RDF command, but the RDF control files (such as CONFIG or CONTEXT) already exist on the remote control subvolume. If these files are on the backup system, then that name is specified. Effect The INITIALIZE RDF command aborts. Recovery You must purge \$SYSTEM.subvol.* on the backup systems before you can retry the INITIALIZE RDF command. Before doing so, however, be sure that the existing files do not belong to a different RDF configuration that is still valid.
PAGE 474
Recovery This is an informational message; no recovery is required. The total number of items in the INCLUDE, EXCLUDE, INCLUDEPURGE, and EXCLUDEPURGE lists has exceeded 100. Cause The total number of items in the INCLUDE, EXCLUDE, INCLUDEPURGE, and EXCLUDEPURGE lists is more than 100. Effect The volume is not added for RDF protection. Recovery Reduce the total number of items in the INCLUDE, EXCLUDE, INCLUDEPURGE, and EXCLUDEPURGE lists to 100 or fewer.
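The limit above applies to the combined size of the four lists, not to each list separately. A minimal sketch of that arithmetic follows, assuming the list entries for one volume have been gathered into Python sequences. The function and constant names are hypothetical.

```python
MAX_LIST_ITEMS = 100  # combined limit quoted in the message text above


def total_list_items(include=(), exclude=(), includepurge=(), excludepurge=()):
    """Return the combined item count across the four lists."""
    return len(include) + len(exclude) + len(includepurge) + len(excludepurge)


def within_list_limit(include=(), exclude=(), includepurge=(), excludepurge=()):
    """Return True if the combined count is 100 or fewer."""
    total = total_list_items(include, exclude, includepurge, excludepurge)
    return total <= MAX_LIST_ITEMS
```

For example, 60 INCLUDE items and 41 EXCLUDE items together exceed the limit even though neither list does so alone.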
PAGE 475
Cause An illegal command was encountered within an OBEY command file. Effect The command fails. Recovery Remove the command from the OBEY command file, and reenter the command directly from your terminal. This RDF subsystem is not configured in the network master subsystem Cause Your current RDF subsystem is not listed in the configuration of your network master. Effect Validation fails. Recovery You must reconfigure your network master and possibly your local configuration. TMF is having trouble.
PAGE 476
Recovery Check the contents of the RDF configuration file, issue a VALIDATE RDF command to verify the configuration, and reissue your request for the RDFCOM operation you originally wanted to perform. TMF NAT table is full. Cause There is a problem with TMF. Effect The configuration validation fails. Recovery Check the status of TMF. When TMF is operational, reenter the command. TMF Shutdown at timestamp has been found.
PAGE 477
max is 64 for RDF, and 255 for the RDF/MP, MPX, IMP, or IMPX Cause The maximum number of volumes that can be protected on a node (64 for RDF, 255 for RDF/MP, MPX, IMP, or IMPX) has been exceeded. Effect The configuration validation fails. Recovery Delete some of the volumes. Too many volumes are specified in this ALTER command Cause You specified too many volumes in the command parameter list of an ALTER command. Effect The ALTER command aborts.
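The per-node volume limits quoted at the start of this page (64 for RDF, 255 for RDF/MP, MPX, IMP, or IMPX) can be turned into a quick calculation of how many volumes would have to be deleted to pass validation. The sketch below is an illustration only. The dictionary keys are labels chosen for this example, not RDFCOM keywords.

```python
# Limits taken from the message text; keys are illustrative labels only.
MAX_PROTECTED_VOLUMES = {
    "RDF": 64,
    "RDF/MP": 255,
    "RDF/MPX": 255,
    "RDF/IMP": 255,
    "RDF/IMPX": 255,
}


def volumes_over_limit(product: str, volume_count: int) -> int:
    """Return how many volumes exceed the limit for the product (0 if none)."""
    # A count at or below the limit yields 0 rather than a negative number.
    return max(0, volume_count - MAX_PROTECTED_VOLUMES[product])
```

A nonzero result is the minimum number of volumes to delete, in line with the Recovery text for this message.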
PAGE 478
seq-num is the sequence number of the MAT file. Cause RDFCOM could not obtain the fully qualified name of the audit trail file with the specified sequence number from the TMP. Effect RDFCOM terminates the search for a TMF shutdown timestamp and then its attempt to initialize RDF. Recovery Check to see if TMF is started:
• If it is not, start TMF before you again attempt to initialize RDF.
• If it is running and this error occurs, this is an internal error.
PAGE 479
Undefined Externals Cause A NEWPROCESS error occurred during START RDF or TAKEOVER processing. Effect The operation is aborted. Recovery This is an internal error. Contact the Global Mission Critical Solution Center (GMCSC) or your service provider. Unidentifiable newprocess error: newproc0:7:newproc8:15 newproc# identifies the new process error. Cause A NEWPROCESS error occurred during START RDF or TAKEOVER processing. Effect The command fails.
PAGE 480
Effect The command fails. Recovery Scan the EMS event log to determine why the command could not be performed. Correct the error condition, if possible, and request the update operation again. VOLUME device is NOT a disk volume device is the non-disk device assigned to the updater. Cause The RDF configuration file is invalid. Effect RDF will not start. Recovery Change the RDF configuration to reflect a valid disk volume.
PAGE 481
VOLUME volume does NOT exist volume is the volume on the primary node for which the updater is responsible. Cause The RDF configuration file is invalid. Effect RDF will not start. Recovery Bring the volume up or delete it from the RDF configuration. VOLUME volume is NOT configured within TMF volume is the volume on the primary node for which the updater is responsible. Cause The RDF configuration file is invalid. Effect RDF will not start.
PAGE 482
Recovery Alter the particular updater’s ATINDEX value to match the appropriate audit trail number or delete the updater. VOLUME volume UPDATEVOLUME does NOT exist volume is the volume on the primary node for which the updater is responsible. Cause The RDF configuration file is invalid. Effect RDF will not start. Recovery Bring the volume up, or delete it from the RDF configuration. VOLUME volume record NOT found volume is the volume on the primary node for which the updater is responsible.
PAGE 483
WARNING: No backup cpu has been configured for the procname procname is the RDF process without a backup CPU, which is one of: EXTRACTOR, MONITOR, RECEIVER, or $volume UPDATER. Cause RDF is started without a backup process for the process identified in this message. Effect RDF is started. Recovery Stop RDF, reconfigure it to include a backup CPU for the RDF process, and start the subsystem once again. * * * WARNING * * * NSA SQL DDL operation encountered in the audit trail.
PAGE 484
Cause You are attempting to initialize RDF in conjunction with a complete database synchronization. Effect If an audit record can be found whose timestamp is less than the specified timestamp, RDF is initialized to that record. Recovery This is an informational message; no recovery is required. * * * WARNING * * * RDF will start at the first record in the TMF master audit trail whose timestamp is less than the specified timestamp. The timestamp you specified must follow the documented guidelines.
PAGE 485
Recovery Online modification of INCLUDE/EXCLUDE lists does not support wildcard characters. Reissue the ALTER command without the wildcard characters. Write error error# on attempt to reach the extractor, STOP SYNCH command aborted. error# is the file-system error number that identifies the specific error. Cause You are attempting to execute an RDFCOM STOP SYNCH command, but RDFCOM encountered the specified error while sending the message to the extractor. Effect The STOP SYNCH command aborts.
PAGE 486
Cause The COPYAUDIT command encountered the specified error while attempting to write data into the specified image file on the local image trail volume. Effect The COPYAUDIT command aborts. Recovery See the Operator Messages Manual for a description of the error code. For additional details about understanding and correcting file-system errors, see the Guardian Procedure Errors and Messages Manual. If possible, correct the error and reenter the COPYAUDIT command. Otherwise, see your system manager.
PAGE 487
Recovery Respond “yes” or “no” to the prompt. You are attempting a TAKEOVER operation immediately after the receiver has crashed. Please contact your HP analyst before proceeding with the TAKEOVER operation. Cause RDFCOM has detected that the receiver stopped prematurely the last time it was running. Because you are attempting the takeover operation before RDF has been allowed to restart, the probability is high that your database is already inconsistent with respect to transaction boundaries.
PAGE 488
Cause During execution of an ALTER VOLUME command on the backup system, RDFCOM determined that the primary system is accessible. Effect The command fails. Recovery Reenter the command on the primary system. You cannot START RDF after an RDF Takeover operation. Cause You tried to start RDF after an RDF takeover operation has been performed. Effect The START RDF command fails. Recovery You must initialize RDF on the primary system. You should also ensure that the takeover operation completed successfully.
PAGE 489
Recovery Reissue the command, specifying a four-digit year in the timestamp. RDFSCAN Messages The following RDFSCAN messages (listed alphabetically by text) can appear on your terminal screen during an RDFSCAN session. Beyond eof! Cause The AT position specified is beyond the end-of-file mark in the current log file. Effect The AT command fails. Recovery Reenter the AT command, this time with a record-number parameter that indicates a position before the end-of-file mark.
PAGE 490
Effect The command fails. Recovery See the Operator Messages Manual for a description of the error code. For additional details about understanding and correcting file-system errors, see the Guardian Procedure Errors and Messages Manual. If possible, correct the error and reenter the command that encountered the error. Otherwise, see your system manager. Filename given could not be opened, old file still in use Cause A command tried to access a file that was in use. Effect The command fails.
PAGE 491
D Operational Limits
Table 17 Operational Limits for RDF/IMP, IMPX, and ZLT
Limit Description                                                          Maximum Value
Number of volumes being protected                                          255
Number of volumes in an SMF pool on backup system                          21
Number of auxiliary image trails                                           255
Number of files per updater                                                3000
Number of RDF configurations with the same primary system                  37
Number of systems that can contribute audit to a primary system            255
Maximum number of image trail file primary and secondary extents           65,500
Maximum number of primary
PAGE 492
E Using ASAP ASAP (Availability Statistics and Performance) allows many subsystem entities to be monitored across a network of NonStop servers. The status and statistics for the entities are collected on a single system, and are then monitored either through the ASAP command interface or through the ASAP graphical user interface (GUI) PC client.
PAGE 493
Figure 17 The RDF/ASAP Environment [Diagram labels: RDF SGP, RDF SGP, Updaters, Monitor, Extractor, Purger, Receiver, Image Trail, Primary, Backup]
Installation The RDF SGP is packaged with the RDF/IMP and IMPX products and, by default, is installed on $SYSTEM.RDF. You might, however, place this object file wherever you want. If you install the SGP object file somewhere other than $SYSTEM.
PAGE 494
DOMEE is the control subvolume and TANDA is the RDF Backup System without '\'. Adding and Removing RDF Environments The RDF SGP performs the auto detection and processing of the RDF environments added through the MONITOR command when the process starts. If RDF environments are added or removed while the RDF SGP is running, ASAP does not monitor them until the next time the RDF SGP is stopped and restarted.
PAGE 495
Table 18 RDF Metrics Reported by ASAP (continued) Information Passed to ASAP Monitor Extractor Receiver Imagetrail Purger RDFNET Updater RTD Time X X X — — — X 1 Primary CPU X X X — X X X Backup CPU X X X — X X1 X Priority X X — X X1 X 1 2 X Only in an RDF Network environment Only reported by the master receiver where the master image trail (MIT) volume is reported
PAGE 496
Index * wildcard character, 261 900, File code, 53 ? wildcard character, 261 ] prompt, 88 views, 54 volume names, 49 BACKUPSWAP parameter, 220 BACKUPSYSTEM network attribute, 287 BACKUPSYSTEM parameter, 201 Bracket prompt (]), 88 A C Abbreviations, 192, 340 ADD command, 181, 339 ADD EXTRACTOR command, 77, 81 ADD MONITOR command, 80 ADD RECEIVER command, 82, 83 ALTER command, 183, 339 ALTER command, FUP, 63 Altering TMF configuration, 67 ASAP, Using with RDF, 47, 492 Asterisk wildcard character, 261 AT c
PAGE 497
VALIDATE CONFIGURATION, 251, 347 RDFSCAN AT, 255, 347 DISPLAY, 255 EXIT, 256, 347 FILE, 257, 348 HELP, 258, 348 LIST, 258, 348 LOG, 259, 348 MATCH, 260, 348 NOLOG, 260, 261, 348 SCAN, 262, 348 RDFSCAN commands DISPLAY, 347 STATUS RDF, 102 Communications estimating required resources, 49 RDF requirements, 48, 50 Communications line failure, 115 Comparing SQL/MX tables, 328 CONFIG file description, 354 Configuration backup system, 48 command file, creating for RDF, 85 extractor process, 77, 81 monitor process
PAGE 498
Error messages, file system, 357 Error recovery create operation, 112 modify operation, 111 open operation, 111 RDF error 700, modify operation, 111 RDF error 705, open operation, 111 RDF error 739, create operation, 112 Event log, scanning messages in EMS, 29, 107 Exception files description, 354 examining, 353 records, 28 EXCLUDE clauses, 271 EXIT command, 189, 256, 340, 347 Expand estimating required resources, 49 multi-CPU paths (superpaths), 284 EXPAND line failure, 115 Extractor process, 32 attributes
PAGE 499
TMF, 67 INITTIME parameter, 70, 202 Installing the RDF subsystem, 64 K Keywords, 92 L Label modifications, file, 55 Licensed programs, 65 Line failure, 115 LIST command, 258, 348 LOCATION option, 326 Lockstep gateway messages, 305 Lockstep operation, 46 LOG command, 259, 348 Log device, messages sent to, 356 Log file, 254 $0, 357 description, 28 example, 108, 356 messages in, 107 scanning messages in, 29 specifying in RDFSCAN, 257, 348 Log, EMS event, 107 LOGFILE parameter, 220 M Managing the RDF subsyst
PAGE 500
OBEYFORM option, 195 of INFO command, 85 OBEYVOL command, 209 ODBC catalog changes, 151 Offline synchronization for a single partition, 323 Online database synchronization, 157 phases of, 172 Online help RDFCOM, 97 RDFSCAN, 100 Online initialization, 69 OPEN command, 209, 342 Open operation error recovery, 111 file-system errors, 111 RDF errors, 111 Operating system RDF requirements, 50 security, 65 Operating the RDF subsystem, 88 Operations, RDF subsystem, 31 OUT command, 85, 211, 342 Outages planned, 25 u
PAGE 501
log file, 28, 254 log file example, 356 managing, 88 messages, 356, 357 messages, scanning, 29, 107 network transactions, 286 NonStop process pairs, 31 operating, 88 operations, 31 parameters, 220 BACKUPSWAP, 220 BACKUPSYSTEM, 201 CPUS, 213, 215, 216, 217, 218, 223, 224, 226, 228, 343 EXTENTS, 218, 224 image trail, 77 IMAGETRAIL, 188 IMAGEVOLUME, 228 LOGFILE, 220 monitor process, 80 PRIMARYSWAP, 220 PRIORITY, 213, 215, 216, 217, 218, 223, 224, 226, 228, 343 PROCESS, 213, 215, 216, 217, 218, 223, 224, 226, 2
PAGE 502
RDFRCVO licensed program, 65 receiver object file, 64 security requirements, 66, 67 RDFSCAN description, 254 description of use, 29, 107 ending a session, 99 help text file, 64 messages, 489 object code file, 64 online help, 100 running, 98 security requirements, 67 starting a session, 99 wildcard characters in match patterns, 261 RDFSCAN commands AT, 255, 347 DISPLAY, 255, 347 EXIT, 256, 347 FILE, 257, 348 HELP, 258, 348 LIST, 258, 348 LOG, 259, 348 MATCH, 260, 348 NOLOG, 260, 261, 348 quick reference, 347
PAGE 503
restoring, 326 START RDF command, 86, 235, 346 START TMF command, TMFCOM, 86 START TRANSACTION command, TMFCOM, 63, 67 START UPDATE command, 87, 236, 346 Starting the RDF subsystem, 86 Starting the TMF subsystem, 86 STATUS RDF command, 102, 237, 346 STOP RDF command, 242, 346 STOP SYNCH command, 245, 346 STOP UPDATE command, 139, 245, 347 stop-update-to-time, 139, 245, 409 Stopping the backup system after a primary system failure, 122 Stopping the RDF subsystem by stopping the TMF subsystem, 122 from the ba
PAGE 504
RDF errors, 110 restart point, 39 restart points, error recovery, 110 Updater, failure, 117 UPDATERDELAY parameter, 220 UPDATEVOLUME parameter, 228 User interfaces, RDF subsystem, 28 V VALIDATE CONFIGURATION command, 251, 347 Views, NonStop SQL/MP, 54, 60 Volume audited on backup system, 59 configuration, 49 failure, TMF, 118 limit, 49 mapping, 49 mapping primary to backup, 59 names, 179 names, different on primary and backup, 59 VOLUME command, 209 VOLUME parameter, 188 W Wildcard characters in match pat