HP NonStop RDF System Management Manual for J-series and H-series RVUs (RDF 1.9) HP Part Number: 529826-006 Published: June 2009 Edition: J06.03 and subsequent J-series RVUs and H06.
© Copyright 2009 Hewlett-Packard Development Company, L.P. Legal Notice Confidential computer software. Valid license from HP required for possession, use or copying. Consistent with FAR 12.211 and 12.212, Commercial Computer Software, Computer Software Documentation, and Technical Data for Commercial Items are licensed to the U.S. Government under vendor’s standard commercial license. The information contained herein is subject to change without notice.
About This Document The Remote Database Facility (RDF) subsystem enables users at a local (primary) system to maintain a current, online copy of their database on one or more remote (backup) systems, protecting stored information from damage that might occur at the primary system. RDF accomplishes this by sending audit trail information, generated at the primary system by the NonStop Transaction Management Facility (TMF) product, over the network to the backup system.
• Added information on running a TAKEOVER command using an OBEY file/IN File in “Issuing the TAKEOVER Command in an Obey File” (page 142) and “TAKEOVER” (page 255).
• Added information about FASTUPDATEMODE in “Near Real Time Read Access to Updates on the Primary System” (page 149) and “SET RECEIVER” (page 232).
• Added information on support for long filenames in “Process File Names” (page 358).
• Added the figure for Triple contingency under “Using ZLT to Achieve the same Protection” (page 276).
where to look for the information you need, based upon the responsibility you have or the kind of tasks you perform at your site:

Responsibility           Chapter/Appendix
System manager           All
System operator          1, 2, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 15, 16, 17, A, C, D, E
Database administrator   1, 2, 3, 5, 6, 7, 10, 11, 12, 13, 14, 15, 16, 17, A, B, C, D, E
System analyst           1, 2, 3, 10, 11, 12, 13, 14, 15, 16, 17
Application designer     1, 2, 10, 11, 12, 14, 15, 17, A, B, C, D

The chapters and appendixes contain t
Notation Conventions General Syntax Notation This list summarizes the notation conventions for syntax presentation in this manual. UPPERCASE LETTERS Uppercase letters indicate keywords and reserved words. Type these items exactly as shown. Items not enclosed in brackets are required. For example: MAXATTACH Italic Letters Italic letters, regardless of font, indicate variable items that you supply. Items not enclosed in brackets are required.
{ } Braces
A group of items enclosed in braces is a list from which you are required to choose one item. The items in the list can be arranged either vertically, with aligned braces on each side of the list, or horizontally, enclosed in a pair of braces and separated by vertical lines. For example:
   LISTOPENS PROCESS { $appl-mgr-name }
                     { $process-name  }
   ALLOWSU { ON | OFF }
| Vertical Line
A vertical line separates alternatives in a horizontal list that is enclosed in brackets or braces.
!i and !o
In procedure calls, the !i notation follows an input parameter (one that passes data to the called procedure); the !o notation follows an output parameter (one that returns data to the calling program). For example:
   CALL CHECKRESIZESEGMENT ( segment-id      !i
                           , error ) ;       !o
!i,o
In procedure calls, the !i,o notation follows an input/output parameter (one that both passes data to the called procedure and returns data to the calling program).
A group of items enclosed in brackets is a list of all possible items that can be displayed, of which one or none might actually be displayed. The items in the list can be arranged either vertically, with aligned brackets on each side of the list, or horizontally, enclosed in a pair of brackets and separated by vertical lines.
Manuals for other software products that contain information helpful to RDF users include:
• SQL/MX Installation and Management Guide and the SQL/MP Installation and Management Guide, which explain how to install the NonStop SQL/MX and SQL/MP relational database management systems and how to plan, create, and manage SQL/MX and SQL/MP databases and applications.
1 Introducing RDF This manual describes the Remote Database Facility (RDF) subsystem as implemented in version 1, update 9 of the HP NonStop RDF/IMP, IMPX, and ZLT independent products. Customers who install RDF 1.9 can use existing RDF configuration scripts provided the scripts are not making use of new functionality.
operates, such as transactions, audit trails, and audit volumes. You should understand how TMF software uses elements like before-images, after-images, and control records. In addition, you should also understand the TMF processes that perform backout, volume recovery, and file recovery. If you are not familiar with this information, you should read TMF Introduction.
Figure 1-1 Basic RDF Configuration In Figure 1-1, there are 20 audited volumes on the primary system ($D1 through $D20). Only volumes $D1 through $D15, however, are configured for RDF protection. Audit records for volumes $D1 through $D10 and $D16 through $D20 are sent to the master audit trail (MAT). The RDF master extractor process reads the MAT and sends audit records associated with volumes $D1 through $D10 to the RDF master receiver process on the backup system.
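The volume arithmetic in Figure 1-1 can be checked with a short sketch. This is an illustration in Python, not RDF code: the helper `volumes()` and the set names are hypothetical, and the only assumption is that the master extractor forwards audit for volumes that are both configured for RDF protection and audited to the MAT.

```python
def volumes(first, last):
    """Return the set of volume names $D<first> through $D<last>, inclusive."""
    return {f"$D{n}" for n in range(first, last + 1)}

# Volumes configured for RDF protection in Figure 1-1.
rdf_protected = volumes(1, 15)

# Volumes whose audit records are sent to the master audit trail (MAT).
mat_audited = volumes(1, 10) | volumes(16, 20)

# Assumption: the master extractor forwards only audit that is in the MAT
# and belongs to an RDF-protected volume.
sent_by_master_extractor = rdf_protected & mat_audited

# Protected volumes whose audit is not in the MAT; the master extractor
# never sees audit for these.
protected_outside_mat = rdf_protected - mat_audited

print(sorted(sent_by_master_extractor, key=lambda v: int(v[2:])))
print(sorted(protected_outside_mat, key=lambda v: int(v[2:])))
```

The intersection is $D1 through $D10, which matches the volumes the master extractor sends in the figure; $D16 through $D20 are in the MAT but are filtered out because they are not protected.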
An unplanned outage typically occurs as the result of a sudden disaster that prevents the database on the primary system from being used. The classic purpose of RDF is to make rapid recovery from an unplanned outage possible by maintaining a replicated database on a backup system. When the primary system is unexpectedly affected by a disaster, you can shift operations to the replicated database on the backup system after having the RDF updaters bring the backup database to a consistent state.
101, a single update was logged in the MAT and sent to the backup system, but the primary system was brought down before the transaction was completed. When the command for a takeover is issued, the updater processes treat all transactions whose outcomes are not known as aborted transactions. In this scenario, only the changes related to transactions known with certainty to have been committed on the primary system are left in the backup database.
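The takeover rule described above (transactions whose outcome is unknown are treated as aborted) can be sketched as follows. This is a simplified model only, not how the updaters are implemented: the `ImageRecord` type, its field names, and the "update"/"commit"/"abort" labels are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ImageRecord:
    tx_id: int          # transaction that produced the record
    kind: str           # "update", "commit", or "abort" (hypothetical labels)
    payload: str = ""   # description of the change, used for updates only

def surviving_updates(records):
    """Return the updates that belong to transactions known to have committed.

    Updates of aborted transactions, and of transactions with no commit or
    abort record at all (outcome unknown), are dropped.
    """
    committed = {r.tx_id for r in records if r.kind == "commit"}
    return [r for r in records if r.kind == "update" and r.tx_id in committed]

records = [
    ImageRecord(100, "update", "row A"),
    ImageRecord(100, "commit"),
    ImageRecord(101, "update", "row B"),  # primary failed before any outcome was sent
    ImageRecord(102, "update", "row C"),
    ImageRecord(102, "abort"),
]

kept = surviving_updates(records)
print([(r.tx_id, r.payload) for r in kept])
```

In this model only the update from transaction 100 remains; transaction 101 is handled exactly like the explicitly aborted transaction 102.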
  - A single primary system whose database changes are replicated to databases on multiple backup systems. Such an environment makes possible simultaneous read-only access to all of the backup databases (this is desirable for query-intensive applications such as telephone directory assistance).
  - Triple contingency, a special instance of the database replication feature whereby a single primary system is protected by two identical backup systems.
Figure 1-2 RDF Topologies
• Supports master and auxiliary audit trail protection
RDF can protect all tables and files that are being audited by TMF, whether they are associated with the Master Audit Trail (MAT) or an auxiliary audit trail.
• Subvolume and file replication
In addition to volume replication, the RDF/IMP and IMPX products support replication of selected subvolumes and files.
• Economical processing RDF conserves resources at both sites. The extractor typically uses 1% of the resources used by the application on the primary and 4% of the Expand resources. On the backup system the cost of an updater process replicating an update operation is typically 15-25% of the original cost to do the operation on the primary system. On the primary system RDF uses just one process (the extractor) per audit trail to read and transmit audit records to the backup system.
You can peruse messages in the EMS log on your terminal screen by using Viewpoint or whatever other tool you normally use for monitoring $0. When you do that, you are dealing with the entire EMS log (not just RDF messages). To isolate RDF messages from the rest of the EMS log, you can use the supplied EMS filter RDFFLTO with an EMS printing distributor to produce an intermediate entry-sequenced file that you then can scan using the RDFSCAN utility.
Figure 1-3 RDF Tasks to Maintain a Copy of a Database RDF Processes To accomplish its four major tasks, RDF runs different processes on the primary system and the backup system. These processes (the monitor and extractor on the primary system and the receiver, updaters, and purger on the backup system) divide these tasks as summarized in the following pages. The relationship of these processes to one another is illustrated in Figure 1-4.
Figure 1-4 RDF Subsystem Processes
Primary System Processes
On the primary system:
• The monitor process coordinates most RDFCOM commands involving the main RDF processes (for example, start and stop).
• Each extractor process reads an audit trail (the MAT or a particular auxiliary audit trail), filters out audit records not relevant to the backup database, transforms the remaining audit records into image records, and then transmits the image records to an associated receiver process on the backup system.
Backup System Processes
On the backup system:
• There is one receiver process for each configured extractor process. A receiver accepts the image records from its extractor, sorts them, and then writes them to the appropriate RDF image trail.
• There is one updater process for each primary system volume being protected by RDF. Updater processes read image records from their RDF image trails and pass them to the disk process so that the disk process can perform the logical REDO operations.
NOTE: The discussion and figure that follow are both oriented to the extractor associated with the MAT. For information about protecting auxiliary audit trails, see Chapter 13 (page 291).
explicitly excluded by INCLUDE/EXCLUDE lists), most of the physical audit records generated either for block splits or during FUP RELOAD operations, and all audits generated by the RDF updaters. The extractor always tries to fill the buffer to be sent to the receiver.
Sorted Image Trails RDF maintains its image data on disk volumes specified during RDF configuration. On each of these volumes, the collection of files that contains image data is known as an image trail; that is, there is one image trail per individual image trail volume. The standard image trail used by RDF, called the master image trail, contains the transaction status records that hold key information about whether a transaction has committed or aborted.
With sorted image trails, the activity of any one image file typically remains so low that it can be stored on the same disk volumes as the main database with no significant I/O impact. This approach is not recommended, however, if you require very high RDF performance or if RDF is running with the UPDATE option turned off (in which case the image trails could eventually fill the volume). In such cases, it is best to have volumes exclusively dedicated to the image trails.
• Issues a logical REDO request to the disk process (during the normal forward pass over the image trail) for each update associated with its volume.
• Issues logical UNDO requests to the disk process when backing out changes associated with transactions that need to be undone during RDF takeover or stop-update-to-timestamp operations.
• Bundles the REDO and UNDO requests into batch TMF transactions, the duration of which is specified by the UPDATERTXTIME configuration parameter.
Each updater maintains a file status table to keep track of the files it has open. An updater closes any database file that has not been updated recently. Updaters also close database files when a STOP RDF or STOP UPDATE command is issued, or when the updater restarts because of error conditions. Additionally, if you alter the updater's OPENMODE while UPDATE is ON, the updater closes all its files and then reopens them with the new OPENMODE.
partition, regardless of whether it is a primary or secondary partition. RDF does not use the file system for partition mapping. Furthermore, because updates to the backup database are applied by logical REDO/UNDO operations, alternate key files and NonStop SQL indexes are not affected by an update to a file or table. Alternate key files or NonStop SQL indexes are updated independently as a consequence of the individual audit records generated on the primary system by TMF software.
Second, because considerable checking must be done across all trails to determine what files can be purged based on what transactions might be represented in the various files on the various image trails, the purger process performs this task. The purger process is a restartable process pair that runs on the backup system (it is started during START RDF and runs even when the updaters are stopped; image files are purged, however, only when updating is enabled).
Example 1-2 Chain Replication
   System \A               System \B               System \C
             RDF Subsystem 1
   Primary DB 1 ---------> Backup DB 1
                           Primary DB 2 ---------> Backup DB 2
                                     RDF Subsystem 2
Thus, system \B is both the backup system in RDF subsystem 1 and the primary system in RDF subsystem 2.
when the updater for RDF Subsystem 2 on \A applies this record to Primary DB 1, it thereby backs out the committed update of your application. Additionally, Primary DB 1 and Backup DB 1 are no longer in synch. Even though the updater on \B had its transaction aborted, that updater will re-apply the application update to Backup DB 1. When done, Primary DB 1 no longer has the update, but Backup DB 1 does.
In the preceding examples, each RDF configuration operates entirely independently of the other RDF configuration primaried on the same node; that is, each RDF system has its own extractor and monitor process. In this way, Expand problems affecting one configuration might not necessarily affect the others (depending on the configuration). RDF Control Subvolume The INITIALIZE RDF command includes a control subvolume suffix parameter (SUFFIX char), where char is an alphanumeric character.
One set of disks can be replicated to another set of target disks to provide a copy of the live database. There are two operational considerations unique to this environment:
• The updaters operate in transaction mode, which means you should not stop TMF before stopping RDF.
• The RDF takeover operation cannot be performed unless you manually stop the monitor and extractor processes before issuing the TAKEOVER command or include the ! option in the TAKEOVER command.
You should place the RDFCOM component on $SYSTEM.SYSTEM, or you must add the new software location to your TACL search-subvolume list. EMS Support RDF/IMP, IMPX, and ZLT all support the Event Management System (EMS). They direct their command, event, warning, and error messages to an EMS collector in the form of fully-tokenized messages. You can view messages in the EMS log online using Viewpoint or any other tool you normally use for monitoring $0. When you do so, you are perusing the entire EMS log.
For information about this capability, see Chapter 14 (page 295). RDF and NonStop SQL/MX RDF can replicate NonStop SQL/MX user tables and indexes as well as NonStop SQL/MP objects and Enscribe files. For information about this capability, see Chapter 16 (page 323).
2 Preparing the RDF Environment Before RDF can be run on a NonStop system, the system configurations and user applications must meet certain RDF requirements. This chapter explains how to prepare each system for RDF installation and operation, ensuring that all these requirements are met and that you understand the RDF product’s restrictions.
rate of audit transmission from the primary system to the backup system, the database update rate, and whether or not you have copies of your applications installed (in “standby” mode). Sizing the RDF configuration is a complex task that is best carried out by HP personnel. Those personnel can assist you in configuring and sizing your RDF environment using tools and utilities designed and developed as part of the RDF Professional Service. Contact your service provider for further details.
1. Enter a FUP INFO command for the current TMF MAT and record the end-of-file (EOF) value; for example:
   FUP INFO $AUDIT.ZTMFAT.*
                   CODE         EOF   LAST MODIF   OWNER   RWEP   TYPE   REC   BLOCK
   $AUDIT.ZTMFAT
   AA000003         134    11292672   10:05           -1   GGGG
2. Enter a FUP INFO command for the current MAT 5 minutes later and record the EOF value; for example:
   FUP INFO $AUDIT.ZTMFAT.*
                   CODE         EOF   LAST MODIF   OWNER   RWEP   TYPE   REC   BLOCK
   $AUDIT.ZTMFAT
   AA000003         134    11653120   10:10           -1   GGGG
3.
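The two EOF readings above can be turned into an approximate audit generation rate. The following Python sketch assumes the rate is simply the growth in EOF divided by the elapsed time, and that both readings come from the same audit trail file (AA000003 here); the function name is hypothetical.

```python
def audit_rate_bytes_per_sec(eof_start, eof_end, elapsed_seconds):
    """Approximate audit generation rate from two EOF readings of one file."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    if eof_end < eof_start:
        # A smaller second reading suggests the readings are from different files.
        raise ValueError("EOF decreased between readings")
    return (eof_end - eof_start) / elapsed_seconds

eof_at_1005 = 11292672   # EOF recorded in step 1
eof_at_1010 = 11653120   # EOF recorded in step 2, 5 minutes later

growth = eof_at_1010 - eof_at_1005
rate = audit_rate_bytes_per_sec(eof_at_1005, eof_at_1010, 5 * 60)

print(f"audit generated in 5 minutes: {growth} bytes")
print(f"approximate rate: {rate:.1f} bytes per second")
```

For the sample readings, the MAT grew by 360448 bytes in 300 seconds, or roughly 1.2 KB per second. A single 5-minute sample reflects only the load at that moment, so readings taken during peak activity give a more useful sizing figure.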
Table 2-2 Software Requirements

Software     Requirement
Files        The RDF/IMP, IMPX, and ZLT products protect only files on the primary system that are audited by the TMF subsystem.
Auditing     The RDF/IMPX and ZLT products support the use of TMF auxiliary audit trails on the primary system (volumes protected by RDF can store audit data in either the MAT or an auxiliary audit trail). The backup database files are audited, and therefore must also reside on TMF data volumes.
extractor-to-receiver throughput. Please note that altering the value of AUDITTRAILBUFFER can be done offline or online, but if you do it online your new value will not take effect until you take the disk down and then bring it back up. TMF Configuration With Dump Process on the Primary System When you configure TMF with audit dump on, that subsystem dumps an audit trail file to tape or disk before purging the audit trail file. This approach is strongly recommended on the primary system.
backup database, you must also take audit dumps. For more information, see the “SET RDF” command in Chapter 8 (page 187).
DSM Catalogs and File Code 900 All files that have the file code 900 are replicated by the RDF product. These consist of DSM Tape Catalog files as well as some related files. Replicating these files to the RDF backup system can provide critical information if you later lose the primary system to a disaster.
Designing Transactions for RDF Protection
When designing applications containing transactions that update databases protected by RDF, you must consider the following restrictions that apply to the subsystem:
• The effects of network (distributed) transactions after an RDF takeover operation
• Database operations not replicated by RDF
The sections that follow explain these restrictions.
Partitioned Files All partitions of a partitioned Enscribe file or NonStop SQL table or index must reside on volumes protected by RDF, or none should. Corresponding partitions on each system must have the same key values. CAUTION: For partitioned files, it is essential that the partial key value for Enscribe files or first key value for NonStop SQL tables on the backup system exactly match those on the primary system. This is the RDF database administrator’s responsibility.
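The all-or-none placement rule for partitions can be expressed as a simple check. The sketch below is illustrative only: the function name and the volume names are hypothetical, and it checks placement alone, not the matching of partition key values between systems.

```python
def partition_placement_ok(partition_volumes, protected_volumes):
    """Return True if all partitions are on protected volumes, or none are."""
    flags = [vol in protected_volumes for vol in partition_volumes]
    return all(flags) or not any(flags)

# Hypothetical set of primary system volumes configured for RDF protection.
protected = {"$DATA01", "$DATA02", "$DATA03"}

fully_protected = partition_placement_ok(["$DATA01", "$DATA02"], protected)
fully_unprotected = partition_placement_ok(["$DATA07", "$DATA08"], protected)
mixed = partition_placement_ok(["$DATA01", "$DATA07"], protected)

print(fully_protected, fully_unprotected, mixed)
```

Only the mixed layout fails the check, because one partition would be replicated while the other would not.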
Configuring an SMF Environment on the Primary System When configuring an SMF environment on an RDF primary system, make sure that SMF catalog files are not replicated by RDF to the backup system. The SMF catalogs on the primary and backup systems must remain independent of each other. There are three ways to do so: • Place the SMF catalog on a primary system volume that is not protected by RDF.
SMF allows physical disks to be added and removed from pools. The RDF updaters must be stopped prior to the addition or deletion of any physical disks from SMF pools on the backup system.
3 Installing and Configuring RDF After preparing your system configurations and user applications to meet RDF requirements, you are ready to install and configure RDF. This chapter, which is intended for system managers, system analysts, and database administrators, describes how to do these tasks.
Separating NonStop SQL Tables It is recommended that you avoid registering NonStop SQL tables protected by RDF in the same catalogs as tables that are not protected by RDF. Separating protected tables from unprotected ones simplifies the comparison of primary system catalogs with backup system catalogs. Compressing Audit Data for Tables and Files Although not required by RDF, using the AUDITCOMPRESS file attribute will enhance RDF performance.
The backup system should also have copies of the following files in case an RDF takeover operation is necessary:
• OBEY command files and TACL scripts containing NonStop SQL/MP or NonStop SQL/MX DDL commands that define the database
• SQLCI or MXCI report definitions
To make it easy to compare catalogs on the primary and backup systems, it is strongly recommended that you register objects protected by RDF in separate catalogs from objects not protected by RDF.
3. Copy the command file or TACL macro to the backup system.
Now do the following on the backup system:
• Change any system references in the command file or TACL macro from the primary system name to the backup system name. If the volume names are different or if you want a different database layout on the backup system, change volume references as well.
• Through the TACL command interpreter, issue an OBEY filename command or run the macro to create the backup database.
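Changing the system references in a copied command file is a plain text substitution, which can be sketched in Python as follows. The node names \PRIM and \BKUP, the sample DDL line, and the function name `retarget` are all hypothetical; review the rewritten file before running it, because volume names might also need to change.

```python
import re

def retarget(script_text, primary, backup):
    """Replace references to the primary node name with the backup node name.

    The match is case-insensitive and stops at the end of the node name, so
    a longer name that merely starts with the primary name is left alone.
    """
    pattern = re.compile(re.escape(primary) + r"(?![A-Za-z0-9])", re.IGNORECASE)
    # A function is used as the replacement so that the backslash in the
    # node name is inserted literally rather than read as an escape.
    return pattern.sub(lambda match: backup, script_text)

original = "CREATE TABLE \\PRIM.$DATA01.SALES.EMPLOYEE ( EMPNUM NUMERIC (5) );"
rewritten = retarget(original, "\\PRIM", "\\BKUP")

print(rewritten)
```

The lookahead keeps a node such as \PRIMX from being rewritten by accident when the primary node is named \PRIM.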
   CREATE CONSTRAINT EMPNUM_CONSTRNT ON =EMPLOYEE
      CHECK EMPNUM BETWEEN 1 AND 99999;
6. Create the index for the NonStop SQL/MP table on the primary system:
   CREATE INDEX =EMPLNAME ON =EMPLOYEE( LAST_NAME, FIRST_NAME );
7.
system. You must include the AUDITED parameter in both the BACKUP and RESTORE commands.
   BACKUP $TAPE,($DATA01.*.*,$DATA02.*.*,$DATA03.*.*, $DATA04.*.*), AUDITED
   RESTORE $TAPE,($DATA01.*.*,$DATA02.*.*,$DATA03.*.*, $DATA04.*.*), AUDITED
Synchronizing Databases With FUP
You can use the FUP DUP command to copy Enscribe database files from the primary system to the backup system.
Installing RDF The RDF/IMP, IMPX, or ZLT software, and all related documentation, is distributed on three independent product release compact disks (CDs). After loading a CD, double click on the Readme icon for complete instructions on how to install the RDF/IMP, IMPX, or ZLT software. Before installing this product, use NonStop SPR Scout to obtain access to all applicable software product revisions (SPRs).
RDF/ZLT (T0618) Product Components

The release CD includes the following components for the RDF/ZLT product:
RDF/ZLT   The RDF/ZLT enabler module
Readme    The software documentation file
To use the RDF/ZLT product, you must purchase both RDF/IMPX and RDF/ZLT (two separate CDs), install RDF/IMPX, and then install RDF/ZLT.
Table 3-1 RDF Process and Program Security Attributes (continued)

Program Name   Run Under a Specific Logon?   LICENSE Required for Object File?
RDFEXTO        YES ++                        YES
RDFMONO        YES ++                        YES
RDFNETO        YES ++                        NO
RDFPRGO        YES ++                        YES
RDFRCVO        YES ++                        YES
RDFSCAN        NO ++++                       NO
RDFSNOOP       YES +++                       YES
RDFUPDO        YES ++                        YES
READLIST       NO                            NO
RDIMAGE        YES ++                        YES

+ RDFCOM operational commands require super-user group access; however, INFO and STATUS commands can be issued by all users.
• RDFSNOOP. The RDFSNOOP program opens the image files in privileged mode and must be licensed with FUP or by running the RDFINST macro. RDFSNOOP can be owned by any user ID. RDFSNOOP must be run by a member of the super-user group (user ID 255,nnn) to read the image files.
• RDFUPDO. RDF updater programs open image files in privileged mode and must be licensed with FUP or by running the RDFINST macro. RDFUPDO also must be able to open database files for protected write access.
TMF Subsystem Running Previously If TMF was running on the primary system and you have shut the TMF subsystem down, and if you have started TMF on the backup system and added the RDF updater volumes to the TMF configuration, you need not take any other steps with respect to TMF. Proceed to the next task, described in “Initializing RDF”. Initializing and Configuring RDF After initializing and configuring TMF, you are ready to initialize and configure RDF.
Initializing RDF To a TMF Shutdown Timestamp If TMF was running previously on the primary system and did not need to be initialized and configured, you can initialize RDF to a timestamp that reflects the time of the last TMF shutdown. This initialization is typically used when one stops TMF in order to initialize RDF to that TMF stop location. This might be useful if you are about to use RDF for the first time and you stop TMF in order to synchronize your backup database to your primary database.
Determining a Valid inittime Value When using the INITTIME parameter without the NOW clause, it is important that you specify a valid inittime value. To do so, first issue a STATUS RDF command and take note of the highest updater RTD time. Then round that RTD time up to the next higher minute (0:43 becomes 1:00, 1:27 becomes 2:00, 3:04 becomes 4:00, and so forth). Finally, subtract that rounded-up time from the current system time shown in the status display.
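The arithmetic above can be sketched as follows. This is a minimal illustration and not part of RDFCOM: the function names are hypothetical, the RDF time delay (RTD) value is assumed to be written as minutes:seconds, and an RTD that already falls on a minute boundary is assumed to need no further rounding.

```python
from datetime import datetime, timedelta

def round_rtd_up(rtd: str) -> timedelta:
    """Round an RTD value such as "0:43" (minutes:seconds) up to whole minutes."""
    minutes, seconds = (int(part) for part in rtd.split(":"))
    # Assumption: an RTD already on a minute boundary ("2:00") is left unchanged.
    if seconds > 0:
        minutes += 1
    return timedelta(minutes=minutes)

def candidate_inittime(status_time: datetime, highest_rtd: str) -> datetime:
    """Subtract the rounded-up RTD from the system time shown by STATUS RDF."""
    return status_time - round_rtd_up(highest_rtd)

if __name__ == "__main__":
    shown = datetime(2008, 8, 11, 5, 24)  # hypothetical time from a status display
    print(candidate_inittime(shown, "1:27"))
```

Take the highest RTD across all updaters as the input, so that the resulting time is not later than the point any updater has reached.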
tableA (which used to contain it, but now does not), and the audit record will not be applied to the backup database. In this particular case, the database is not corrupted, but data corruption could happen for other NonStop SQL/MP or NonStop SQL/MX DDL SHARED ACCESS operations.
4. Subtract this value from the general timestamp (11AUG2008 05:24).
5. Issue the STOP UPDATE command. This command stops the updaters but allows the extractor and receiver to continue shipping and storing audit, respectively.
6. Install the new RDF software in a different volume.subvolume from that housing the current version of RDF that is running. For example, if you are upgrading to T0346ABS, you might specify $system.rdfabs.
7. Run $system.rdfabs.
For RDF network environments, you should subtract an additional 15 minutes from the timestamp you calculated in Step 4.
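As a worked illustration of this adjustment, the sketch below subtracts the extra 15 minutes and prints the result. The function names are hypothetical, and the DDMMMYYYY HH:MM output layout is an assumption modeled on timestamps such as 11AUG2008 05:24; check the INITIALIZE RDF syntax for the exact format that RDFCOM accepts.

```python
from datetime import datetime, timedelta

MONTHS = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]

# Additional margin for RDF network environments, as stated in the text.
NETWORK_MARGIN = timedelta(minutes=15)

def network_inittime(step4_time: datetime) -> datetime:
    """Move the Step 4 timestamp back by the extra network margin."""
    return step4_time - NETWORK_MARGIN

def format_timestamp(ts: datetime) -> str:
    """Render a timestamp as DDMMMYYYY HH:MM (assumed layout, see lead-in)."""
    return f"{ts.day:02d}{MONTHS[ts.month - 1]}{ts.year} {ts.hour:02d}:{ts.minute:02d}"

if __name__ == "__main__":
    step4 = datetime(2008, 8, 11, 5, 10)  # hypothetical result of Step 4
    print(format_timestamp(network_inittime(step4)))
```

Using `datetime` arithmetic rather than editing the minutes by hand keeps the date correct when the adjustment crosses midnight.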
NOTE: Instead of issuing SET and ADD commands interactively within an RDFCOM session, you can create and execute an RDF configuration command file. The first time you configure RDF, you can either configure it interactively or use the text editor to create a command file. After you have configured RDF, you can easily create a command file from the existing configuration file as explained in “Creating a Configuration Command File” (page 96).
UPDATERTXTIME Attribute The UPDATERTXTIME attribute specifies the maximum transaction duration in seconds (from 10 to 300) for all updater processes. The default is 60 seconds. RDF updaters operate in transaction mode. Updater transactions are essentially long-running transactions that pin audit trail files on the backup system and can affect the duration of backout operations if an updater transaction aborts for any reason.
NETWORK Attribute The NETWORK attribute specifies whether or not you are configuring an RDF network. When set to OFF (the default value), an RDF takeover operation provides local database consistency, but it cannot provide transaction consistency for network transactions that involved several RDF backup databases. When set to ON, the RDF subsystem provides database consistency for network transactions that were replicated to other backup databases by other RDF subsystems.
REMOTE STANDBY Attribute The REMOTE STANDBY attribute specifies the system name of the ZLT standby system. node-name must be a valid name and must identify a system in your current Expand network. The default is the name of the backup system. For information about the ZLT capability, see Chapter 17 (page 337). OWNER Attribute The OWNER attribute specifies a userid under which all RDF processes will always run.
NOTE: To have secondary image trails, you must add them after initialization and before RDF has been started for the first time. Also you cannot add secondary image trails until you have configured the receiver, as described in the previous paragraphs. The secondary image trail files have the same extents as the master image trail files. To delete a secondary image trail, you must stop RDF, delete any updaters associated with the particular trail, and then delete the trail.
• CPUS
• PRIORITY
• WAIT or NOWAIT
The PROGRAM parameter specifies the name of a Guardian object file that is executed once RDF has reached a particular state after a STOP RDF, REVERSE, or TAKEOVER operation. The INFILE attribute specifies the name of an edit file that will be passed as the IN file to the trigger process when it is created. The OUTFILE attribute specifies the name of a file or process that will be passed as the OUT file to the trigger process when it is created.
RDF network cannot contain two or more RDF subsystems with the same primary system (that is, it cannot contain RDF subsystems for \A to \B and \A to \C). BACKUPSYSTEM Attribute The BACKUPSYSTEM attribute specifies the name of the backup system associated with the specified primary system. There is no default value. REMOTECONTROLSUBVOL Attribute The REMOTECONTROLSUBVOL attribute specifies the name of the control subvolume used by the RDF subsystem configured for the specified primary and backup systems.
back online, the monitor creates its own backup process in the primary processor and then switches control to that monitor process. The PRIORITY attribute specifies the priority at which the monitor will run. You should set the monitor’s priority higher than that of any application’s process. The PROCESS attribute supplies a name for the monitor process. You should specify a meaningful mnemonic such as $AMON or $MON1.
To configure an RDF extractor process named $EXT to run as a process pair in CPUs 5 and 3 of the primary system, at a priority of 185, with an RTD warning threshold of 60 seconds, issue the following commands:
]SET EXTRACTOR ATINDEX 0
]SET EXTRACTOR PROCESS $EXT
]SET EXTRACTOR CPUS 5:3
]SET EXTRACTOR PRIORITY 185
]SET EXTRACTOR RTDWARNING 60
]ADD EXTRACTOR
You can issue ADD EXTRACTOR commands only when RDF is stopped.
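If you keep your configuration in a command file, a small generator can produce the same SET and ADD sequence from a handful of values. The sketch below is illustrative only: the function name is hypothetical, it emits just the extractor attributes used in the example above, and it does not validate the values against RDFCOM's rules.

```python
def extractor_commands(process, cpus, priority, rtdwarning, atindex=0):
    """Return the RDFCOM command lines that configure one extractor.

    cpus is a (primary_cpu, backup_cpu) pair. No range checking is done;
    RDFCOM remains the authority on which values are acceptable.
    """
    primary_cpu, backup_cpu = cpus
    return [
        f"SET EXTRACTOR ATINDEX {atindex}",
        f"SET EXTRACTOR PROCESS {process}",
        f"SET EXTRACTOR CPUS {primary_cpu}:{backup_cpu}",
        f"SET EXTRACTOR PRIORITY {priority}",
        f"SET EXTRACTOR RTDWARNING {rtdwarning}",
        "ADD EXTRACTOR",
    ]

if __name__ == "__main__":
    # Values taken from the example commands; adjust them for your system.
    for line in extractor_commands("$EXT", (5, 3), 185, 60):
        print(line)
```

The lines are written without the RDFCOM prompt character so that they can be saved to a file and read back as a command file.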
UPDATERDELAY attribute in the global RDF configuration record, the updaters can then read the image trails and apply the freshly written audit to the backup database immediately, thereby keeping updater RTD times to the lowest possible value. Because the receiver writes the audit immediately to the image trails after processing each extractor message, having FASTUPDATEMODE set ON can impact extractor-to-receiver throughput.
You can issue ADD PURGER commands only when RDF is stopped.

Updater Processes

Use SET VOLUME and ADD VOLUME commands to configure the following updater attributes:
• ATINDEX
• CPUS primary-CPU : backup-CPU
• PRIORITY
• PROCESS
• IMAGEVOLUME
• UPDATEVOLUME
• INCLUDE
• EXCLUDE
• EXCLUDEPURGE
• INCLUDEPURGE
You must configure an updater process for each primary system volume to be protected by RDF.
The following RDFCOM commands configure an updater named $UP01 to run as a process pair in CPUs 2 and 4 at a priority of 180. The updater will be associated with a secondary image trail on the volume $IMAGA1. The name of the backup volume and the primary volume being protected is $DATA01.
Enabling RDF Operations After you have copied all pertinent database files from the primary system to the backup system, installed the RDF software on both systems, initialized and configured TMF on the primary and all backup systems, and initialized and configured RDF, you can then start the TMF and RDF subsystems. You must start TMF on the primary and all backup systems before you can start RDF. Starting TMF To start or restart TMF, issue the TMFCOM command START TMF.
If you later want to start the updater processes, you merely issue a START UPDATE command. Restarting the Applications As the final step in establishing an RDF environment, if you had shut down your applications previously, you can restart them now.
4 Operating and Monitoring RDF To operate and monitor RDF, you enter commands through two online utilities: the RDFCOM and RDFSCAN interactive command interpreters. Through these utilities, you initiate communication with RDF, request various RDF operations or information displays, and terminate communication with the subsystem.
IN command-file specifies a command file from which RDFCOM commands are to be read. RDFCOM reads 132-byte records from the specified file until it encounters either the end-of-file mark or an EXIT command. If you do not specify the IN option, TACL automatically supplies the name of its current default input file—usually the terminal from which you issued the RDFCOM command. Typically, it is very useful to have your RDF configuration commands specified in a text file.
>RDFCOM [control-subvolume]
For example, to start a session on a primary system named SANFRAN, you would enter the following command (assuming that no suffix character was specified in the INITIALIZE RDF command):
>RDFCOM SANFRAN
If the suffix character “3” was specified in the INITIALIZE RDF command, then you would enter the following command:
>RDFCOM SANFRAN3
When RDFCOM starts, it searches the specified control-subvolume on $SYSTEM of the local system for the RDF configuration file to open.
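The naming rule in these examples (primary system name followed by the optional suffix character) can be expressed as a short helper. This is a sketch for scripts that build RDFCOM command lines; the function names are hypothetical and the check on the suffix length is an assumption based on the text calling it a suffix character.

```python
def control_subvolume(primary_system: str, suffix: str = "") -> str:
    """Build the control subvolume name from the primary system name.

    primary_system may be given with or without its leading backslash.
    suffix is the optional single character given at INITIALIZE RDF time.
    """
    if len(suffix) > 1:
        # Assumption: the suffix is a single character, as in the "3" example.
        raise ValueError("suffix must be at most one character")
    return primary_system.lstrip("\\").upper() + suffix

def rdfcom_command(primary_system: str, suffix: str = "") -> str:
    """Return the TACL command text that starts an RDFCOM session."""
    return f"RDFCOM {control_subvolume(primary_system, suffix)}"

if __name__ == "__main__":
    print(rdfcom_command("SANFRAN"))
    print(rdfcom_command("\\SANFRAN", "3"))
```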
resume communication with RDFCOM by entering the operating system command PAUSE at the TACL prompt.
• If you press BREAK when an RDFCOM command that displays information (such as STATUS RDF) is in progress, RDFCOM terminates execution of this command and prompts you for another one.
• If you press BREAK when an RDFCOM command that changes the RDF configuration or status (such as ALTER RDF) is in progress, RDFCOM continues to execute this command while immediately prompting you for another one.
To run RDFCOM and execute the commands in this file, supply the command file name in the IN option of the command to start RDFCOM:
4> RDFCOM /IN RDFSET/ control-subvolume
When it uses a command file in this way, RDFCOM works in batch mode: RDFCOM begins the session, reads and executes each command from the command file, and displays the associated output at your terminal.
You would execute this command as an OBEY file to your TACL prompt. For this example, assume you have been running an RDF subsystem where \Boston is your primary system and \SF is your backup system. You have stopped TMF and RDF, you have reinitialized and reconfigured TMF, and you want to reinitialize, reconfigure, and restart RDF. Recall that before you can initialize RDF you must delete the control subvolumes on both primary and backup systems.
RDF Subsystem PRIM1    \PRIM1 ------------------> \BACK1
RDF Subsystem PRIM2    \PRIM2 ------------------> \BACK2
RDF Subsystem PRIM3    \PRIM3 ------------------> \BACK3
Now suppose these RDF subsystems are running as an RDF network, you have lost PRIM1, you have stopped the applications on PRIM2 and PRIM3, and you want to execute the takeover commands from a single obey file to be executed on BACK1. Here are the commands you would put in the obey file. Assume that you have put RDFCOM on $SYSTEM.
Table 4-1 RDFCOM Configuration Commands (continued)

Command  Object                              Function
SET      RDF; MONITOR; EXTRACTOR; RECEIVER;  Adds option values to the configuration
         VOLUME; IMAGETRAIL; PURGER;         memory table for the specified process.
         RDFNET; NETWORK; TRIGGER;
SHOW     RDF; MONITOR; EXTRACTOR; RECEIVER;  Lists current option values from the
         IMAGETRAIL; VOLUME; PURGER;         configuration memory table for the
         RDFNET; NETWORK; TRIGGER;           specified process.
Table 4-3 RDFCOM Utility Commands

Command  Object              Function
EXIT     --                  Terminates an RDFCOM session.
FC       --                  Enables you to edit (fix) a previously issued command.
HELP     {ABBREVIATIONS }    Displays help text for commands and messages.
         {ALL           }
         {command       }
         {message-number}
HISTORY  --                  Displays the 10 most recently issued RDFCOM commands.
OBEY     filename            Causes RDFCOM to read commands from the specified command file.
OPEN     control-subvolume   Sets the RDF control subvolume to $SYSTEM.
{ { { { { { PURGER RDFNET NETWORK TRIGGER trigger-type VOLUME $volume $volume } } } } } } Cannot be performed with RDF running. Only a user in the SUPER group can execute this command.
In response, RDFCOM displays the following information:
-------------------------------------------------------------
| 715 Primary Stopped                                       |
-------------------------------------------------------------
Cause: The primary process of a NonStop process pair has stopped. This probably was the result of an operator inadvertently issuing a STOP command from TACL.
Effect: The backup process takes over, but not in fault-tolerant mode, until the primary process can be re-created.
Enter the RDFSCAN function you want:

To begin an RDFSCAN session and open the file $SPOOL.SANFRAN.RDFLOG for scanning, enter:
>RDFSCAN $SPOOL.SANFRAN.RDFLOG
RDFSCAN displays the following:
RDFSCAN - T0346A06 - 14MAR04
(C)1988 Tandem (C)2004 Hewlett Packard Development Company, L.P.
File: $SPOOL.SANFRAN.
Table 4-4 RDFSCAN Commands (continued)

Command  Object  Function
NOLOG    --      Turns off the LOG command.
SCAN     number  Beginning at the current record, examines the specified number of
                 messages in the message file, and displays messages that contain
                 the current match pattern.

The complete syntax for all RDFSCAN commands appears in Chapter 9 (page 261).
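To make the SCAN behavior concrete, the following sketch models it on an in-memory list of messages. It is not RDFSCAN and does not read an RDFLOG file: the function name is hypothetical, matching is reduced to a case-insensitive substring test, and RDFSCAN's own pattern syntax is not reproduced.

```python
def scan(records, current, count, pattern=None):
    """Examine `count` records starting at index `current`.

    Returns (new_current, matches), where matches is a list of
    (record_index, text) pairs for records that contain `pattern`.
    With no pattern, every examined record is returned.
    """
    window = records[current:current + count]
    matches = [
        (index, text)
        for index, text in enumerate(window, start=current)
        if pattern is None or pattern.upper() in text.upper()
    ]
    # The current record advances past everything that was examined,
    # whether or not it matched.
    return current + len(window), matches

if __name__ == "__main__":
    log = ["extractor started", "REMOTE receiver open", "purger idle",
           "remote updater stopped"]  # hypothetical messages
    position, found = scan(log, 0, 3, "REMOTE")
    print(position, found)
```

The point of the model is that the record count limits how many messages are examined, not how many are displayed; a scan of three records can display fewer than three messages.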
Scan - Reads "n" lines of the RDFLOG and displays them with optional pattern matching.
FILE: \WHICH.$SYSTEM.RDF.RDFLOG, current record: 37501, last record: 37513
Enter the next RDFscan function you want:

Introductory Usage Information

To display a brief introduction to the purpose, features, and use of RDFSCAN, enter HELP INTRO:
Enter the next RDFscan function you want: HELP INTRO
In response, RDFSCAN displays:
RDFSCAN is a utility for quickly scanning the RDFLOG file.
Receiver (0) Receiver (1) Imagetrail (0) Imagetrail (1) Purger $DATA06 -> $DATA06 $DATA07 -> $DATA07 $DATA08 -> $DATA08 $RRCV0 $RRCV1 $RPRG $RUPD1 $RUPD2 $RUPD3 0:00 185 $MIT 0:00 185 $IMAGE0 $IMAGEA 185 0:06 185 $IMAGE0 0:00 185 $IMAGEA 0:06 185 $IMAGEA 44 1: 2 1: 2 22 3 22 3 3 1: 2 9568 1: 2 811008 2: 3 811568 3: 0 In the STATUS RDF display, the first line gives the name of the primary system (\RDF04 in this example), the name of the backup system (\RDF05 in this example), and the timestamp that s
Table 4-5 RDF States (continued)

Status                   Description
Update NSA Stopped       RDF had been running with Update On, a Shared Access NonStop SQL/MP or
                         SQL/MX operation was detected, and all updaters have completed their
                         shutdown. Note, you must consult the RDF LOG for either the RDF event 905
                         or 908 to determine if it is safe for you to perform the DDL operation on
                         the backup system.
* Monitor Unavailable *  The monitor is either stopped or is running but unable to respond.
The receiver RTD time virtually always mirrors that of the extractor sending to it. The only time it varies is during a receiver restart condition. The value of this RTD time has in large part become obsolete, but it continues to be displayed for continuity with older RDF releases.
column for any RDF process, you should examine the messages in the RDF log file or on the RDF log device to determine what is happening and what corrective action to take. Except for updaters, asterisks in the Error column continue to appear in every STATUS RDF display until the error condition has been corrected.
To change any of the attribute values listed above, you start RDFCOM and use the ALTER command. ALTER is a restricted command; it can be issued only by members of the super-user group. See the description of the ALTER command in Chapter 8 (page 187). Process Priority All configured RDF processes should run at a priority greater than that of any application process.
Reading Log Messages

RDF messages are sent to the EMS log (collector) specified during RDF configuration. If RDF encounters an error while attempting to open or send a message to the configured log, RDF takes the following actions:
1. RDF writes either of the following messages to the local $0 process:
"705 File Open Error error# filename"
"700 File System Error error# filename"
2. RDF then closes the log (if it is open). The log remains as configured.
NOTE: The record numbers reflected by RDFSCAN are approximate and might not exactly match the record numbers that would be displayed by a FUP INFO RDFLOG, STAT command.
With RDFSCAN you can specify:
• A starting point within the message file
• The number of records to retrieve
• Text to search for in the message file
RDFSCAN displays those RDF messages that meet the criteria you specify. The following is a sample display for a primary system.
Enter the next RDFSCAN function you want: DISPLAY ON
File: $SYSTEM.RDF.RDFLOG, current record: 750, last record: 903, Pattern: *REMOTE*
Enter the next RDFSCAN function you want:
Record number: 751
2004/06/04 11:20:16 \LAB1
Record number: 752
2004/06/04 11:20:26 \LAB1 $LOST -> $BLOST
Record number: 756
2004/06/04 11:20:30 \LAB1
Record number: 758
2004/06/04 11:21:46 \LAB1 $INFO -> $BINFO
Record number: 760
2004/06/04 11:22:33 \LAB1 $POPPY -> $BPOPPY
File: $SYSTEM.RDF.
5 Critical Operations, Special Situations, and Error Conditions When running RDF, there are a number of critical operations and situations that need careful consideration. Understanding all aspects of these operations and situations is essential. Understanding critical operations ensures that you perform said operations correctly, quickly, and efficiently. Understanding critical situations and error conditions ensures that you achieve resolution as quickly as possible.
Some errors involving one or more updaters might require you to resynchronize certain files; see the EMS event log for further information. Any error that cannot be explained should be reported to your service provider. For information about the causes, effects, and recovery actions for all RDF event messages, see Appendix C (page 365) or at the RDFCOM prompt enter the HELP command followed by the RDF event number.
Table 5-1 Recovery From File Modification Failures (RDF Event 700) (continued)

File System Error   Recovery Action
200 through 231     Repair the device or clear the condition.
707                 Enable the volume for TMF transaction processing.

Table 5-2 lists the file-system error numbers and recovery actions for RDF event 705, which reports file-opening failures.

Table 5-2 Recovery From File Open Failures (RDF Event 705)

File System Error   Recovery Action
11                  Resynchronize the file.
Table 5-3 Recovery From File Creation Failures (RDF Event 739) (continued)

File System Error   Recovery Action
60 through 66       Repair the device or clear the condition.
100                 Repair the device or clear the condition.
103                 Repair the device or clear the condition.
120 through 121     Repair the device or clear the condition.
130 through 139     Repair the device or clear the condition.
157                 Check file integrity.
190                 Repair the device or clear the condition.
199                 Alter the security (probably Safeguard).
Exceeding the Maximum Number of Concurrent File Opens

The maximum number of audited files a single updater can have concurrently open is 3,000. If you have more than 3,000 audited files being replicated by a single updater, then it is possible that the updater associated with the volume may report RDF event 813 - "Concurrent file opens exceeds capacity". This happens if the updater has 3,000 files open and it must open a new file.
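A simple planning check can flag volumes that are at risk of this condition. The sketch below is an illustration under stated assumptions: the per-volume counts of audited files are supplied by you (the mapping shown is hypothetical, and gathering the counts is outside the sketch), and the function name is not part of any RDF utility.

```python
MAX_CONCURRENT_OPENS = 3000  # per-updater limit stated in the text

def volumes_over_limit(audited_file_counts):
    """Return the volumes whose audited-file count exceeds the updater limit.

    audited_file_counts maps a volume name to the number of audited files
    that the updater for that volume replicates.
    """
    return sorted(
        volume
        for volume, count in audited_file_counts.items()
        if count > MAX_CONCURRENT_OPENS
    )

if __name__ == "__main__":
    counts = {"$DATA01": 2500, "$DATA02": 3001, "$DATA03": 3000}  # hypothetical
    print(volumes_over_limit(counts))
```

A volume on the list will not necessarily report event 813, because the event depends on how many files are open at once, but it is a candidate for spreading files across more volumes.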
• Failure of a TMF audited volume on the primary system
• TMF subsystem failure after which the TMF volume recovery is successful
• TMF file recovery operation on the primary system that is not to a timestamp, first purge, or TOMATPOSITION position
• TMF ABORT TRANSACTION with the AVOIDHANGING option on the primary system
RDF cannot recover from the following events:
• TMF file recovery operation to a timestamp, first purge, or TOMATPOSITION on the primary system
Processor Failures All RDF processes other than RDFCOM run as process pairs. If a CPU failure causes a primary RDF process to fail, the backup process takes over without interruption in service. If any RDF process pair stops unexpectedly, the monitor sends abort messages to the other RDF processes in order to bring about an orderly shutdown of RDF. You can then restart the subsystem by merely issuing a START RDF command.
Purger Failure If the primary CPU of the purger process fails, the purger process in the backup CPU takes over, the current PURGETIME interval is aborted, and a new PURGETIME interval is started. When the CPU that failed comes back up, RDF switches the purger to run on the reactivated primary CPU. If both the primary and backup CPUs of the purger process fail, RDF aborts. RDFNET Failure If the primary CPU of the RDFNET process fails, the RDFNET process in the backup CPU takes over.
~DISABLE DATAVOLS *
~START TMF
Notice that these commands prevent any disk volumes on the local system from being enabled for TMF operations before starting the subsystem.
3. Reenable all pertinent disk volumes for TMF operations by entering the following command through TMFCOM:
~ENABLE DATAVOLS *
When this command is executed, TMF performs its volume recovery operation on the audited volumes, and RDF reads the audit during this operation.
4.
File Recovery on the Primary System A file recovery operation occurs whenever a TMFCOM RECOVER FILES command is issued at the primary system. A simple file recovery operation does not affect RDF nor does it require database synchronization. A file recovery operation to a timestamp or a first purge, however, does require you to stop RDF, reinitialize, and resynchronize the affected files. The file recovery TOMATPOSITION is a special usage that achieves synchronization itself.
• On the primary system, reinitialize RDF with the INITTIME option, specifying the calculated timestamp from the above step.
• Restart RDF.
When the updaters have caught up with transaction activity on the primary system, the backup database is once again synchronized with your primary database.
1. Issue the RDFCOM UNPINAUDIT command. If you have only one RDF subsystem configured on your primary system and the control subvolume is the name of the primary system, then this is a simple operation. If, however, you have multiple RDF subsystems configured on the primary system, each with its own set of extractors, then you may need to issue the UNPINAUDIT command for each RDF subsystem.
enables the RDF processes to resume processing where they stopped before the shutdown, unless an audit trail file that RDF needs has been purged and cannot be restored to disk. Stopping RDF by Stopping TMF The reason for stopping RDF by stopping TMF is to ensure that the primary and backup databases are logically identical when the shutdown is complete (RDF has applied all changes to the backup database).
When you shut down RDF by issuing a TMFCOM STOP TMF command, you can use successive STATUS RDF commands to determine when all of the RDF processes have stopped. Stopping RDF From the Primary System When you issue the STOP RDF command on the primary system, all RDF processes stop immediately without processing to the end-of-file mark in the MAT (except the updaters, which might continue for a short while to finish up their work in progress).
When you issue a STOP RDF command on the backup system, RDFCOM attempts to contact the RDF monitor on the primary system. After discovering that the monitor is not accessible, RDFCOM sends individual stop messages to all RDF processes on the backup system. If RDFCOM can contact the monitor on the primary system, the STOP RDF command is aborted.
Restarting RDF

If you want to restart RDF and have it resume processing where it stopped at the previous shutdown, you can do so only if you have not reinitialized the RDF subsystem since the shutdown. Use the START RDF command to restart RDF. RDF automatically starts with UPDATE ON unless you explicitly specify UPDATE OFF in the START RDF command. When RDF restarts, it uses the information in the context files to determine where it last stopped, and resumes processing from that point.
When the extractor for the new RDF subsystem running from \B to \A reports an RTD time of 0:00, then you know that extractor has caught up and you can then prepare for another switchover operation to move your application processing back to \A. The planned switchover repeats the procedure described above, except that you reverse the roles of systems \A and \B. After doing so, RDF replication once again occurs from \A to \B.
With reciprocal configurations it is imperative that you make sure the file-sets being replicated by the two RDF subsystems are absolutely independent of each other, and this can only be done in one of two ways: 1. 2. The volumes protected by RDF Subsystem #1 are completely different from the volumes protected by RDF Subsystem #2. For example, RDF Subsystem #1 protects volumes $DATA1-$DATA10, and RDF Subsystem #2 protects volumes $DATA20-$DATA30.
When the extractor for RDF subsystem #1 reports an RTD time of 0:00, then you know that extractor has caught up and you can then prepare for another switchover operation to move your application processing back to \A, as follows:
1. On system \B, create an audited Enscribe file on each data volume in the RDF subsystem #1 configuration.
2. Wait until all of those files are created on system \A.
3. On system \B, stop RDF subsystem #1.
4. Purge the Enscribe files on both systems.
Transactions that must be undone during this undo pass are stored in the ZTXUNDO file in your Master Image Trail subvolume. You can use the READLIST utility to see what transactions were undone by this Local Undo operation. Phase Two Undo Pass This is also known as File Undo.
NOTE: If you do not use the ! option and if the primary system is down, then RDFCOM will need to wait for the Expand level-4 timer to expire. This timer is usually set to 4 or 5 minutes, and this means that the actual takeover processing does not commence until after the timer expires. Compared to all the other non-RDF tasks that need to be completed before you can resume application processing on your backup system, this short delay may not even be noticed.
By using the TAKEOVER ! version of the TAKEOVER command you eliminate the Expand level-4 timer and the prompt. For the fastest possible takeover, see “How to Plan for the Fastest Movement of Business Operations to Your Backup System After Takeover” (page 144).

Issuing the TAKEOVER Command in an Obey File

You have the following three options to issue the TAKEOVER command through an OBEY file.
1.
Takeover Failure If a double CPU failure occurs and any RDF process pair fails during the takeover operation, you can restart the operation just by entering the TAKEOVER command through RDFCOM again. You can ascertain that a takeover operation failed by issuing a STATUS RDF command and getting a response such as the following: STATUS RDF (\RDF04 -> \RDF05) is NOT running A partial RDF TAKEOVER has completed Also, a takeover failure generates RDF event 725 in the EMS log.
Your database administrator can use the RDFSNOOP utility to examine exception records in exception files. For information about RDFSNOOP, see Appendix B (page 359). CAUTION: The absence of exception file records after a successful takeover operation does not necessarily indicate that the backup database is logically identical to the primary database. It is possible that no audit data reached the backup system for some transactions committed on the primary system.
as the RDF takeover operation completes and you will have full TMF protection. For more details see the discussion on “TMF and Online Dumps on the Backup System” (page 154).
4. For the majority of RDF users, the decision to take over on the backup system is not automated; most customers require an executive-level decision.
a. Make sure your system operators have a hierarchical list of who to contact in case of the loss of your primary system.
advantageous to have one application that performs query processing and another that does read/write operations.
d. If your applications have files open for read access, then your operations staff can close files and/or restart the applications while the decision is made whether to take over.
– If the decision is not to take over, allow query processing to resume as normal.
– If the decision is to take over, you have already closed the files and can proceed with the next tasks needed to commence work on your backup system.
9.
in order to resume business operations on your backup system. Do it when you can schedule down time or do it during periods of low activity. Since a lot can change over the course of a year, it is a standard disaster recovery practice that you perform this exercise at least once a year. The age-old adage is "practice makes perfect", and this certainly applies here. An annual test run can mean a considerable difference between a lengthy RTO and a rapid RTO.
If the takeover completes on the backup system, the purger logs an RDF event 888 specifying a MAT position (sno, rba). Subsequently, when the primary system is once again online and you are ready to switch the applications back to the primary, you first initiate a TMF file recovery command on the former primary system, using the TOMATPOSITION option with the MAT position from the 888 event.
Reading the Backup Database (BROWSE versus STABLE Access) Unlike databases protected by TMF, backup databases for RDF protection have no locks on rows or records, even while these rows or records are being updated. Therefore, applications can read the backup databases at any time; the data can, however, be inconsistent because reading and updating can occur simultaneously.
Access to Backup Databases with Stable Access Because the RDF updaters work asynchronously with respect to one another and to transaction boundaries, when you use the backup database as a read-only resource you are almost always accessing an inconsistent database, meaning that you normally only have Browse access to the backup database. The following discussions provide three means of achieving Stable access to your backup database without having to perform an RDF Takeover operation.
undo, the updaters read this list, and they read backwards in the image trail, performing logical undo operations on those records that need to be backed out. The following example illustrates the effect of a STOP UPDATE, TIMESTAMP command. In the example, t+number indicates a transid, and the timestamp below reflects the time of most recent commit or abort record in the audit trail.
The only operations that must be performed WITH SHARED ACCESS are merge partitions and move boundaries. It is recommended that you perform all other operations with nonshared access. NOTE: When you make DDL changes to your primary database, you can use the NonStop SQL DDL Replicator product to replicate NonStop SQL/MP DDL changes to your backup database automatically, instead of you having to perform those changes manually on the backup system.
again checks to see that all updaters have processed all image audit up to this special record. When the purger generates the RDF event 908, you are ready to perform steps 2 and 3 above.
CAUTION: While the NonStop SQL products allow a DDL change with Shared Access where the target is located on a different node, RDF does not support this. Consider an example where you have a Table X on your RDF primary system \A and you want to create a new partition for the table on \B.
1. 2. 3. Execute a process that opens the image trail file with shared read access. This can be a simple process that you supply to perform only this operation. When the purger determines that all updaters are finished with this image trail file (named, say, AA000007), and have moved on to the next image trail file (named, say, AA000010), then it might try to purge AA000007. The purge operation will fail, however, because your process still has AA000007 open.
Of course, if you are taking online dumps of your backup database, you must also configure TMF to perform audit dumping either to tape or disk.
Doing FUP RELOAD Operations With Updaters Running
Because the backup database is audited by TMF, you cannot do FUP RELOAD operations on it unless you have altered the RDF UPDATEROPEN attribute to SHARED. Previously you needed to stop the updaters before you could alter this attribute, but RDF now allows you to do this online.
NOTE: If you enter the SCF PRIMARY DISK command for an updater's UPDATEVOLUME, the affected updater might report a number of RDF 700 events with the file-system errors 10, 11, and 71. If these errors occur, they are reported immediately following the disk primary event. In this situation, these errors are expected and do not indicate that the backup database has become inconsistent with the primary database.
6 Maintaining the Databases
A vital task in working with RDF is to keep the backup and primary databases synchronized with each other.
Figure 6-2 Synchronized Databases During RDF Operations
Figure 6-3 shows synchronized databases where the application is running on \PRIMARY and the transaction data for the three new transactions has been applied to the backup database.
Figure 6-3 Synchronized Databases, No Outstanding Audit
Figure 6-4 shows synchronized databases where TMF has just been shut down.
Figure 6-4 Synchronized Databases After STOP TMF Command
Figure 6-5 shows unsynchronized databases. In this figure, T5 and T6 (transactions 5 and 6) have not been transmitted to the backup system because of a physical disaster, such as fire or flood, or because the primary or backup systems have failed.
NonStop SQL/MP or NonStop SQL/MX Databases
For NonStop SQL/MP or NonStop SQL/MX databases, changes you need to perform manually on the backup system include:
• Catalog changes
• Results of DDL operations, including creating or altering tables and views
• Partition key changes
• Table purges
Catalog Changes
RDF regards NonStop SQL/MP and NonStop SQL/MX DDL operations like updates to SQL catalogs. Although SQL catalogs are audited tables, RDF does not replicate catalog changes.
you need to restart updating. When restarted, the only updaters that do any work are those that terminated prematurely last time. When they reach the special record, they stop and the purger then logs the event 908. See the section “RDF and NonStop SQL DDL Operations” (page 151) for further discussion.
Guidelines for Create Index and Alter Table Move Operations The following guidelines apply to NonStop SQL/MP and NonStop SQL/MX DDL operations: • Creating an index or loading data into an added table partition does not interfere with RDF protection. Although a CREATE INDEX or ALTER TABLE MOVE FROM FIRST KEY UP TO KEY operation seems to create an audited index or partition within a transaction, only the updates to the catalog and file labels are audited.
a backup index to be different than that of the primary index. In such a case, the index rows transmitted from the primary system to the backup system will be corrupt with regard to their key values. Although the records are physically present in the index on the backup system, NonStop SQL/MP does not see them because the actual key specifier value does not match the expected one.
If you purge a table on the primary system, you must not re-create it on the primary system until you are certain that the updaters have caught up, and you have purged and re-created the table on the backup system.
The remainder of this chapter describes how to do offline resynchronization. For information about online resynchronization, see Chapter 7 (page 167). To resynchronize the primary and backup databases, you need to make all backup database files or tables logically identical to the primary database files or tables when there is no audit data to be processed for the files or tables.
To resynchronize only the affected volume or the individual files/tables on that volume, do the following:
1. Stop your applications.
2. Either stop TMF or stop RDF using the Drain option (see the discussion of this option in Chapter 5).
3. Make a copy of the tables and files that reside on the particular volume.
4. Move the copy of the database taken in Step 3 to your backup system.
5. Restart TMF, if it was stopped in Step 2.
7 Online Database Synchronization
With RDF/IMP, IMPX, or ZLT you can synchronize entire databases or selected volumes, files, tables or even partitions while your applications continue to run. For information about NonStop SQL/MX databases, see Chapter 16 (page 323).
Overview
The RDF online database synchronization protocol consists of the following general steps (the details of which are discussed later in this chapter):
• Initialize the RDF configuration with the SYNCHDBTIME option.
NOTE: RDF does not replicate NonStop SQL/MP and NonStop SQL/MX catalogs. Therefore, if you are synchronizing NonStop SQL/MP and NonStop SQL/MX tables, you might need to create NonStop SQL/MP and NonStop SQL/MX catalogs manually on the backup system if they do not already exist.
Synchronizing Entire Databases Online
To synchronize an entire RDF backup database to the primary database online:
1. If RDF is currently running, issue a STOP RDF command on the primary system.
purpose of this command is to enable RDF to determine when the synchronization operation has completed and the backup database is synchronized with the primary database. When the extractor completes its role in the online synchronization operation, it generates the RDF Event 782 and then resumes normal operations. For more detailed information, see “Phases of Online Database Synchronization” (page 183). 6.
Duration and Preparation Issues As indicated in the steps described above, getting a complete copy of your entire database and placing it on the backup system can take quite a bit of time, and you cannot start the updaters until the database is fully prepared on the backup system. This leads to an issue that you must consider. While you are making a copy of the database and then getting it prepared on the backup system, you must run RDF with UPDATE OFF.
through M and a new table (tableB) contains the keys N through Z. Suppose also that you performed this operation manually on the backup system.
• Partitioned files (key-sequenced or relative). For partitioned files, you can initiate the load operation with a single command by executing the LOAD command against the primary partition.
• Alternate key files (key-sequenced or relative). You should execute LOAD commands against all alternate key files.
Special Consideration for Enscribe Files
If you create empty Enscribe files on your primary system, you should create them with the audit attribute set off.
If the file is empty and contains zero records, you must reissue your original command and recheck the contents of the target file.
FUP COPY QUEUE1, QUEUE2, FIRST 1, SHARE
FUP COPY QUEUE2,, H
The target file, QUEUE2 in this example, is not ready for synchronization until it has at least one record in it. Therefore, you might need to repeat the above operation until a record appears.
FUP ALTER $DATA1.subvol.file, PART (1,$DATA3) • If you move duplicate Enscribe alternate key files, you must alter the system name in the file label of the duplicate file or table to specify the backup system. For example, if you moved a duplicate Enscribe alternate key file named ALTF0100 associated with the file PART0100, you must change the system name in the file label of the duplicate alternate key file to that of the backup system: FUP ALTER $DATA1.TEST.PART0100 ALTFILE (0,\backup.$DATA.TEST.
set code 4700 set part (1, $data2, 2, 2 ) set altkey (1, file 0, keyoff 6, keylen set altkey (2, file 0, keyoff 6, keylen set altkey (3, file 0, keyoff 6, keylen set altkey (4, file 0, keyoff 6, keylen set altkey (5, file 0, keyoff 6, keylen set altfile (0, $data3.test.altf0200 ) create $data3.test.part0200 set altfile (0, $data3.test.altf0201 ) create $data3.test.
Synchronizing Selected Database Portions Online
There are a number of reasons why you might want to synchronize only selected portions of your database. For example:
• If you have a large database, it might be easier to break the total number of volumes into subsets, and then synchronize one subset at a time.
• If a file or table has become corrupt, you might want to synchronize just that one file.
option. For the timestamp to be used with the SYNCHDBTIME attribute, you specify a timestamp following the guidelines for the INITTIME option. When you configure a new RDF subsystem, use your existing RDF configuration file. You then follow the guideline for an entire database synchronization operation, except that you only need to obtain a new copy of the one file or partition. Partial Database Synchronization Issues There are many considerations when synchronizing selected portions of a database.
To load the primary partition only, issue the following command:
FUP LOAD $DATA1.TEST.PART0100, $DATA1.TEMP.PART0100, PARTONLY,SHARE
To load the secondary partition only, issue the following command:
FUP LOAD $DATA2.TEST.PART0100, $DATA2.TEMP.PART0100, PARTONLY,SHARE
When the load operations are finished, use BACKUP and RESTORE (or FUP DUP) with the PARTONLY option to copy the partition you need to the backup system.
FRNL (Step 4, Method 2) This method can be used for tables with or without SYSKEY or clustering keys. There are no special considerations for key-sequenced tables with indexes, but see below for issues regarding the synchronization of indexes. NonStop SQL/MP and NonStop SQL/MX Tables With Partitions The utilities associated with and related to the NonStop SQL products have limitations that make synchronization of individual partitions complicated and difficult.
Thus, you now have on tape empty partitions for the entire table. Should you ever lose a volume to a complete media failure, you can install a new disk and then use the RESTORE utility with the PARTONLY option to recover the missing partition. Because you have backed up a table with the name you need on the backup system, you can restore any partition that you need to with the PARTONLY option and without having to use the MAP NAMES option.
table with all its partitions onto disk on the backup system. You must use MAP NAMES to correct the system name. Thus, $DATA.DUP.PART is now on the backup system. If you created the duplicate table directly on the backup system, skip this step.
8. Rename the original table on the backup system whose primary partition is being synchronized to a temporary name using the SQLCI ALTER TABLE command ($DATA.TEST.PART becomes $DATA.TEMP.PART).
2. Purge the RDF control subvolume and then issue an INITIALIZE RDF command of the following form on the primary system:
INITIALIZE RDF, BACKUPSYSTEM \system, SYNCHDBTIME ddmmmyyyy hh:mm
For the timestamp, follow the guidelines for the INITTIME option.
3. Configure RDF and then issue a START RDF, UPDATE OFF command on the primary system.
14. Rename the original table on the backup system from its temporary name back to its original name using the SQLCI ALTER TABLE command ($DATA.TEMP.PART becomes $DATA.TEST.PART).
15. Use the RESTORE utility with the PARTONLY option to put the loaded primary partition of the duplicate table into the correct location. MAP NAMES is not required because the loaded partition now has the correct name on tape and can be restored directly.
16.
Phase 2 Phase 2 completes when the extractor is certain the synch-complete image record has been successfully written in all image trails, and the extractor’s restart location is at a point in the audit trail following the TMP control point record associated with the completion of phase 1, part 3, above. Upon completion of phase 2, the extractor logs message 782. Updater Phase 2 You cannot start the updaters until the extractor has completed phase 2.
Additionally, the STATUS RDF display has been enhanced to identify which database volumes are still being synchronized. That information is reported in the error column of the display. If a volume is still being synchronized, its entry in the error column is sync. As soon as a volume is successfully synchronized, its entry in the error column is blank. If an updater encounters an error during synchronization, the associated entry in the error column is ****.
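The error-column conventions described above lend themselves to a small lookup, for example when post-processing captured STATUS RDF output. The following Python sketch is illustrative only: the function name and the returned labels are assumptions made for this example, not part of RDF, and it covers only the three cases named in the text.

```python
def classify_error_column(entry: str) -> str:
    """Interpret an updater's entry in the STATUS RDF error column
    during online synchronization (hypothetical helper)."""
    value = entry.strip()
    if value == "":
        # Blank: the volume has been successfully synchronized.
        return "synchronized"
    if value.lower() == "sync":
        # sync: the volume is still being synchronized.
        return "synchronizing"
    if value == "****":
        # ****: the updater encountered an error during synchronization.
        return "error during synchronization"
    # Anything else is outside the cases described here; pass it through.
    return "other: " + value
```

A caller would apply this to the error-column text of each updater line; how that text is extracted from the display is left open here.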
8 Entering RDFCOM Commands To manage, operate, and control RDF and its environment, you enter commands through the RDFCOM online utility. This chapter, directed to system managers and operators, describes the RDFCOM commands and their attributes.
The default security restrictions for all RDFCOM commands are summarized in Table 8-2. RDF State Requirement Some RDFCOM commands can only be entered after RDF has been started; others must be entered before the subsystem has been started or after it has been stopped. In each command description, these constraints are listed under the heading “RDF State Requirement.” Usage Guidelines Details about the proper use of a command appear in “Usage Guidelines.
Table 8-1 Systems for RDFCOM Commands (continued) Extractor Image Monitor RDF Receiver Purger Trail Update Volume RDFNET Network Trigger Other Objects TAKEOVER B UNPINAUDIT P VALIDATE P Legend P = Primary only B = Backup only E = Either * = SYNCH ** = RTDWARNING Table 8-2 Default User Security for RDFCOM Commands Extractor Image Monitor RDF Receiver Purger Trail ADD S ALTER S S Update Volume RDFNET Network Trigger S S S S S S S S S S S S S S S S COPYAUDIT Other Objects O*
Table 8-2 Default User Security for RDFCOM Commands (continued) Extractor Image Monitor RDF Receiver Purger Trail Update Volume RDFNET Network Trigger OUT A RESET S S S S S S S S S S SET S S S S S S S S S S SHOW A A A A A A A A A A A* A A START STATUS Other Objects O* A STOP A* S* S* A* A* S* A** S*** TAKEOVER O UNPINAUDIT S VALIDATE S Legend: A = All users S = Super-user group only O = owner of RDF subsystem * = Must also have remote password for primar
The system does not distinguish between uppercase and lowercase alphabetic characters in a file name. If all the optional left-hand parts of a file name are present, it is called a fully qualified file name; if any of the optional left-hand parts are missing, it is called a partially qualified file name. For more information about file names and process identifiers and the rules that govern them, see the Guardian Procedure Calls Reference Manual.
device-name specifies the name of a device. A device name consists of a dollar sign ($) followed by one to seven alphanumeric characters; the first alphanumeric character must be a letter. qualifier is an optional qualifier. It consists of a pound sign (#) followed by one to seven alphanumeric characters; the first alphanumeric character must be a letter. ldev-number specifies a logical device number. A logical device number is represented by a dollar sign ($) followed by a maximum of five digits.
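The naming rules above translate directly into patterns. The following Python sketch is a hedged illustration: the pattern and function names are hypothetical, and it assumes that a logical device number has at least one digit, which the text does not state explicitly.

```python
import re

# "$" followed by one to seven alphanumeric characters, the first a letter.
DEVICE_NAME = re.compile(r"\$[A-Za-z][A-Za-z0-9]{0,6}")
# "#" followed by one to seven alphanumeric characters, the first a letter.
QUALIFIER = re.compile(r"#[A-Za-z][A-Za-z0-9]{0,6}")
# "$" followed by a maximum of five digits (assumed: at least one digit).
LDEV_NUMBER = re.compile(r"\$[0-9]{1,5}")


def is_device_name(text: str) -> bool:
    return DEVICE_NAME.fullmatch(text) is not None


def is_qualifier(text: str) -> bool:
    return QUALIFIER.fullmatch(text) is not None


def is_ldev_number(text: str) -> bool:
    return LDEV_NUMBER.fullmatch(text) is not None
```

For example, `is_device_name("$DATA1")` is true, while `is_device_name("$1DATA")` is false because the first character after the dollar sign is not a letter.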
ADD
The ADD command applies configuration parameter values for the specified process or other object from the RDF configuration memory table to the RDF configuration file.
ADD {RDF                 }
    {MONITOR             }
    {EXTRACTOR           }
    {RECEIVER            }
    {IMAGETRAIL $volume  }
    {PURGER              }
    {RDFNET              }
    {NETWORK             }
    {[VOLUME] $volume    }
    {TRIGGER trigger-type}
RDF applies RDF global configuration parameters.
MONITOR applies configuration parameters for the monitor.
EXTRACTOR applies configuration parameters for an extractor.
Usage Guidelines With the ADD command, all configuration parameter settings that you previously supplied in SET commands for the particular process or other object are applied from the RDF memory table to the RDF configuration file. Any parameter settings that you did not supply are set to their default values. Each volume on the primary system protected by RDF requires a corresponding updater process on the backup system.
When the preceding command sequence is executed, all of the other RDF global parameters are set to their default values: (In this list, \LONDON is the system at which you issued the command sequence.
trigger-type is REVERSE or TAKEOVER. This command parameter alters a trigger that has already been added to the RDF configuration. Where Issued These commands can be issued only at the primary system, except altering the TAKEOVER trigger, which can also be issued on the backup system if and only if the primary system is not available. NOTE: You should only alter the TAKEOVER trigger on the backup system if you are about to issue the TAKEOVER command.
The following command changes the execution priority of the auxiliary extractor process associated with the auxiliary audit trail AUX02 to 170:
]ALTER EXTRACTOR ATINDEX 2 PRIORITY 170
To change the primary and backup CPUs for the master receiver process to CPUs 3 and 4 respectively, enter an ALTER RECEIVER CPUS command:
]ALTER RECEIVER CPUS 3:4
Remember that you cannot do this particular alter operation while RDF is running.
(The RDF control subvolume is A1 on both systems.)
RDF Configuration #2: \A ------------------> \C
(The RDF control subvolume is A2 on both systems.)
Assume you have lost the original primary system (\A), you have successfully completed a takeover on both backup systems (\B and \C), and the MAT positions displayed by the respective 735 messages are:
\B: 735 LAST MAT POSITION: Sno 10, RBA 100500000
\C: 735 LAST MAT POSITION: Sno 10, RBA 100000000
500 kilobytes of audit records are missing at \C.
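The 500-kilobyte figure follows from subtracting the two RBA values reported in the 735 messages. The following Python sketch reproduces that arithmetic. The type and function names are hypothetical, and the sketch assumes that Sno is the audit trail file sequence number and RBA is a byte offset within that file, so a plain difference is only meaningful when both positions are in the same file.

```python
from typing import NamedTuple


class MatPosition(NamedTuple):
    sno: int  # audit trail file sequence number (assumed meaning of "Sno")
    rba: int  # byte offset within that file (assumed meaning of "RBA")


def missing_bytes(ahead: MatPosition, behind: MatPosition) -> int:
    """Return how many bytes of audit 'behind' lacks relative to 'ahead'."""
    if ahead.sno != behind.sno:
        raise ValueError("positions are in different audit trail files")
    return ahead.rba - behind.rba


system_b = MatPosition(sno=10, rba=100_500_000)
system_c = MatPosition(sno=10, rba=100_000_000)

# Tuples compare by (sno, rba), so max() picks the system that is further ahead.
furthest = max(system_b, system_c)
print(missing_bytes(furthest, min(system_b, system_c)))  # 500000
```

Here \B is further ahead, so \C is the system that needs the missing audit records.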
on at the time of the error condition, and then resumes copying. Because it keeps track of where it was in the COPYAUDIT operation, RDFCOM does not have to recopy the previously copied image files. RDFCOM abends if it encounters network problems while searching the remote image trails for missing audit records. If that happens, RDFCOM logs a message to the EMS event log, but not to the home terminal. If RDFCOM encounters network problems during any other phase of COPYAUDIT execution, it does not abend.
NOTE: You should only delete the TAKEOVER trigger on the backup system prior to issuing the TAKEOVER command. If you delete the TAKEOVER trigger on the backup system when you are not intending to execute a takeover operation, then you must remember to delete the trigger on the primary system too when the latter comes back online. Failure to do this means that when you start RDF next, RDFCOM will copy the TAKEOVER trigger information over to the backup system, thereby reinstating it on that backup system.
Now assume that RDF is protecting primary system data volume $DATA06, which is configured to auxiliary audit trail AUX01. Assume also that the changes are being replicated to backup system volume $DATA6, and that the updater for that volume is acquiring its audit data from secondary image trail volume $SECITB.
R, I, and D to replace, insert, and delete characters in the command line. If you omit the text parameter, RDFCOM displays the most recently issued command. {?} [ text ] requests RDFCOM to display the most recently issued command that begins with the specified text string. If you omit the text parameter, RDFCOM displays the most recently issued command. {!} [ text ] requests RDFCOM to execute the most recently issued command that begins with the specified text string.
RDF now displays the corrected INFO MONITOR command followed by another prompt that asks for any further corrections. Because you have no further changes, you press the Return key after the subcommand prompt. Now, RDFCOM processes the INFO MONITOR command, this time successfully. ]INFO MONITOR .
RDF State Requirement You can issue the HELP command at any time, whether or not RDF has been started. Usage Guidelines This command retrieves and displays information from the RDFHELP file. If you omit all options, RDFCOM uses the ALL option and lists all RDFCOM commands.
HISTORY OBEY OPEN OUT
RDF Concepts: Abbreviations
RDF error messages: error-number
E.g., "help 700" prints an explanation for the RDF error message 700
To display information about RDF message 715, enter:
]HELP 715
RDFCOM displays the following description:
------------------------------------------------------------
| 715 Primary Stopped |
------------------------------------------------------------
Cause: The primary process of a NonStop process pair has stopped.
Now, suppose you issue a HISTORY command:
]HISTORY
In response, RDFCOM displays:
History:
ADD EXTRACTOR
START RDF
SHOW EXTRACTOR
ALTER EXTRACTOR PRIORITY 170
SHOW RECEIVER
ALTER RECEIVER PRIORITY 175
STATUS RDF
ALTER MONITOR PRIORITY 195
INFO *
HISTORY
INFO
The INFO command displays the current configuration parameter values from the configuration file for the specified process or other object.
The subsystem saves the text in the command file, also embedding the appropriate SET and ADD commands. Any time you want, you can execute the text by specifying the command file name in an OBEY command or in the IN option of the RDFCOM command that begins a session, producing a new RDF configuration based on the one captured by the INFO command. You can use the OBEYFORM option with any variation of the INFO command: for example, with INFO EXTRACTOR, INFO RDF, or INFO *.
Output Displayed The parameters displayed for the RDF global options, secondary image trails, and the individual processes are explained under the SET IMAGETRAIL, SET RDF, SET MONITOR, SET EXTRACTOR, SET RECEIVER, SET RDFNET, SET NETWORK, SET PURGER, and SET VOLUME command descriptions. Examples Examples of several INFO commands follow.
VOLUME CPUS 2:1
VOLUME PRIORITY 160
VOLUME UPDATEVOLUME $DATA1
VOLUME IMAGEVOLUME $SECIT1
VOLUME PROCESS $UP01
VOLUME $DATA02
VOLUME ATINDEX 0
VOLUME CPUS 2:1
VOLUME PRIORITY 160
VOLUME UPDATEVOLUME $DATA2
VOLUME IMAGEVOLUME $SECIT2
VOLUME PROCESS $UP02
VOLUME $DATA03
VOLUME ATINDEX 0
VOLUME CPUS 2:1
VOLUME PRIORITY 160
VOLUME UPDATEVOLUME $DATA3
VOLUME IMAGEVOLUME $SECIT2
VOLUME PROCESS $UP03
TRIGGER PROGRAM $SYSTEM.RDF.RDFCOM
TRIGGER INFILE $DATA01.RDF.
SET EXTRACTOR RTDWARNING 60
ADD EXTRACTOR
INFO MONITOR Command
To display the current configuration parameters for the monitor process, enter:
]INFO MONITOR
RDF displays output in the following format:
MONITOR PROCESS $MON
MONITOR CPUS 2:1
MONITOR PRIORITY 170
You would see this particular output, for example, if you originally configured the monitor to run in CPUs 2 and 1 at the default priority of 165, but later changed the priority to 170 (using an ALTER command).
system. The updaters $UP01 and $UP02 are accessing the secondary image trail $SECIT1; updater $UP03 is accessing the secondary image trail $SECIT2.
TRIGGER INFILE $DATA01.RDF.TKOVER
TRIGGER OUTFILE $DATA01.RDF.OUTFILE
TRIGGER CPUS 0:1
TRIGGER PRIORITY 150
TRIGGER NOWAIT
TRIGGER TAKEOVER
INFO TRIGGER Command With OBEYFORM Option
Like all INFO commands, INFO TRIGGER supports the optional OBEYFORM parameter. The output of an INFO TRIGGER REVERSE, OBEYFORM command might be:
SET TRIGGER PROGRAM $SYSTEM.RDF.RDFCOM
SET TRIGGER INFILE $DATA01.RDF.RDFCONF
SET TRIGGER OUTFILE $DATA01.RDF.
INITIALIZE RDF , BACKUPSYSTEM backup-system-name [ , SUFFIX suffix-character ] [ , TIMESTAMP : ] [ , INITTIME : | NOW ] [ , SYNCHDBTIME :] [!] backup-system-name specifies the backup system. The system name begins with a backslash (\) followed by 1 to 7 letters or digits; the first character following the backslash must be a letter. There is no default system name.
SYNCHDBTIME : is a timestamp used for online database synchronization. It has the same format as the timestamp parameter described above. There are no special considerations for specifying the synchdbtime parameter, except that it must designate a time earlier than the present time. The SYNCHDBTIME parameter can only be used if RDF/IMP, IMPX, or ZLT is installed on both the primary and backup systems.
! causes the control subvolume files on both the primary and backup systems to be purged before initialization.
• If used in an interactive mode (either as a command from RDFCOM or in an OBEY file) without the "!" operator, RDFCOM displays:
RDFCOM will purge all the files in the control subvolume (of both local & remote systems) if present. Do you wish to proceed? [Y/N]
• Processing continues based on the user's reply. This operator cannot be used inside an IN file without "!".
Where Issued
Primary system only.
• If you include the TIMESTAMP, INITTIME, or SYNCHDBTIME options in the INITIALIZE RDF command, the initialization completes much more quickly if all the audit trail files, from the current one back to the one that contains the timestamp being sought, are on disk. If, however, some of these audit files have been dumped to tape, RDFCOM triggers TMF to prompt you to restore needed audit trail files.
• If you include the TIMESTAMP option in the INITIALIZE RDF command, use the following guidelines to determine when you must restore the backup database: — If you are going to start RDF with UPDATE ON, restore the database to the backup system before you start RDF. — If you are going to start RDF with UPDATE OFF, you do not have to restore the database. However, if the need for an RDF takeover arises, you must then restore the database on the backup system before you issue the TAKEOVER command.
Usage Guidelines
If you omit system, volume, or subvolume, RDFCOM uses the defaults in effect when RDFCOM was started. A command file can contain other OBEY commands, nested up to four levels deep. RDFCOM reads the commands in the command file until it reaches an EXIT command or the end of the file:
• If it encounters an EXIT command, RDFCOM closes the command file, terminates the RDFCOM session, and passes control back to the TACL command interpreter.
accomplish the same thing, identifying DENVER3 as the RDF control subvolume and then obtaining current status information for that system:
• Sequence A:
>RDFCOM
]OPEN DENVER3
]STATUS RDF
• Sequence B:
>RDFCOM DENVER3
]STATUS RDF
Remember that, when you enter the RDFCOM command without specifying a control subvolume, RDFCOM assumes that the control subvolume name is the same as that of the local system on which RDFCOM is running (without the backslash and with no suffix character).
Where Issued Primary or backup system. Security Restrictions None; anyone can enter the OUT command. RDF State Requirement You can enter the OUT command at any time, whether or not RDF has been started. Usage Guidelines The OUT command specifies a file to which all subsequent output, other than prompts for entering RDFCOM commands, is to be written during this session. This file will receive listings produced by INFO, SHOW, and STATUS commands.
RDF resets the values for the RDF global options. MONITOR resets the values for the monitor process. EXTRACTOR resets the values for the extractor process (this includes resetting the ATINDEX value to 0). RECEIVER resets the values for the receiver process (this includes resetting the ATINDEX value to 0). VOLUME resets the values for the updater processes (this includes resetting the ATINDEX value to 0 and clearing all EXCLUDE and INCLUDE clauses).
Examples
To reset the extractor process parameters in the configuration memory table to their default values, enter:
]RESET EXTRACTOR
To reset the extractor process parameters in the configuration file to their default values so that these values now affect RDF, issue the following commands after RDF has been initialized:
]RESET EXTRACTOR
]SET EXTRACTOR PROCESS $EXT
]ADD EXTRACTOR
To reset the updater process parameters in the configuration memory table to their default values, enter:
]RESET VOLUME
To re
included in its display. The display includes only those RDF processes (extractor or updaters) whose RTD exceeds the configured threshold. The default is 60 seconds. VOLUME volume-name specifies a valid volume name in the current TMF configuration on your primary system. When configuring RDF for ZLT, you must add the complete set of audit trail volumes to which protected data volumes are configured. You use a SET EXTRACTOR VOLUME statement for each individual volume.
]SET EXTRACTOR RTDWARNING 180
]ADD EXTRACTOR
To configure a master extractor in an RDF/ZLT environment, where there are two active volumes ($TMFMAT1 and $TMFMAT2), and one overflow volume ($MATOFLO), issue the following commands:
]SET EXTRACTOR PROCESS $EXTR
]SET EXTRACTOR CPUS 3:4
]SET EXTRACTOR RTDWARNING 165
]SET EXTRACTOR VOLUME $TMFMAT1
]SET EXTRACTOR VOLUME $TMFMAT2
]SET EXTRACTOR VOLUME $TMFOFLO
]ADD EXTRACTOR
SET IMAGETRAIL
The SET IMAGETRAIL command associates an image trail with a specific aud
CPUS primary-CPU : backup-CPU identifies the CPUs in which the monitor process is to run as a process pair on the primary system; primary-CPU is the primary CPU; backup-CPU is the backup CPU. Values range from 0 through 15. The defaults are 0:1. PRIORITY priority identifies the execution priority for the monitor process; priority is the execution priority, from 10 through 199. The default priority is 165.
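The ranges and defaults stated above can be checked before a configuration script is generated. The following Python sketch is a hypothetical pre-check written for this example, not an RDFCOM facility; it enforces only the limits given in the text (CPU numbers 0 through 15, priority 10 through 199) and uses the documented defaults of 0:1 and 165.

```python
from typing import Tuple


def parse_cpus(spec: str = "0:1") -> Tuple[int, int]:
    """Parse a 'primary-CPU : backup-CPU' pair and check the 0-15 range."""
    primary_text, backup_text = spec.split(":")
    primary, backup = int(primary_text), int(backup_text)
    for cpu in (primary, backup):
        if not 0 <= cpu <= 15:
            raise ValueError("CPU number out of range 0-15: %d" % cpu)
    return primary, backup


def check_priority(priority: int = 165) -> int:
    """Check that an execution priority lies in the 10-199 range."""
    if not 10 <= priority <= 199:
        raise ValueError("priority out of range 10-199: %d" % priority)
    return priority
```

For instance, `parse_cpus("2:1")` returns `(2, 1)`, and `check_priority(200)` raises `ValueError`.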
BACKUPSYSTEM backup-system For a network master, specifies the name of the associated backup system. For a nonnetwork master, specifies the name of the network master’s backup system. REMOTECONTROLSUBVOLUME subvolume For a network master, specifies the name of the primary system’s remote control subvolume. For a nonnetwork master, specifies the name of the network master’s remote control subvolume.
PRIORITY priority identifies the execution priority for the purger process; priority is the execution priority, from 10 through 199. The default is 165. PROCESS process-name specifies the process name for the purger process; process-name is any unique, valid process name of up to six characters; the first character must be a dollar sign ($). You cannot specify any of the reserved process names listed in the Guardian Procedure Calls Reference Manual. This parameter is not optional.
from file AA000010 to file AA000013. Files AA000010 through AA000012 might no longer be needed, but, because the RETAINCOUNT is set to three, the purger process can only purge AA000010 (it must keep AA000011 and AA000012 on disk). Thus, as long as the RTD times of the extractors on the two backup systems are less than 24 hours apart, the triple contingency protocol will work successfully.
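The RETAINCOUNT arithmetic in this example can be sketched as follows. This Python fragment is an illustration under stated assumptions, not the purger's actual logic: it assumes image file names are a two-letter prefix plus a six-digit sequence number, and that RETAINCOUNT counts the file currently in use, which is the reading that matches the example (AA000013 in use, AA000011 and AA000012 retained, AA000010 purgeable). The function name and parameters are hypothetical.

```python
from typing import List


def purgeable_files(first_seq: int, current_seq: int, retaincount: int,
                    prefix: str = "AA") -> List[str]:
    """List image files that may be purged, keeping the `retaincount`
    most recent files (the file in current use included)."""
    if retaincount < 1:
        raise ValueError("retaincount must be at least 1")
    # Oldest sequence number that must stay on disk.
    oldest_kept = current_seq - retaincount + 1
    stop = min(oldest_kept, current_seq)
    return ["%s%06d" % (prefix, seq) for seq in range(first_seq, stop)]


print(purgeable_files(first_seq=10, current_seq=13, retaincount=3))
# ['AA000010']
```

The sketch ignores whether the updaters still need a file; it models only the retention limit.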
LOGFILE ems-collector-name specifies a device (EMS collector) that is to receive messages from RDF. The specified device must exist on both the primary and backup systems. The default is $0. The device on the primary system receives log messages from the extractor and monitor processes plus RDFCOM messages that are logged in message 835 and messages from RDFNET, if configured.
SOFTWARELOC $volume.subvolume specifies where the RDF software is installed. The default is $SYSTEM.RDF. NETWORK {ON | OFF} specifies whether or not you are configuring an RDF network. When set to OFF (the default value), RDF takeover operations execute and database consistency is not guaranteed for transactions spanning more than one RDF backup database. When set to ON, the RDF subsystem guarantees database consistency across multiple RDF backup systems configured within an RDF network.
{OWNER {owner-id} where owner-id is either groupname,username or groupnumber,usernumber. This parameter specifies the userid under which all RDF processes will always run. This global configuration parameter provides functionality whereby any super-user group userid can start and stop RDF. Once the OWNER attribute is set, you must limit EXECUTE access to the RDFCOM object so that only those super group users authorized to manage RDF can run RDFCOM.
{PRIORITY priority }
{PROCESS process-name }
CPUS primary-CPU : backup-CPU identifies the CPUs in which the RDFNET process is to run as a process pair on the primary system; primary-CPU is the primary CPU; backup-CPU is the backup CPU. Values range from 0 through 15. The defaults are 0:1. PRIORITY priority identifies the execution priority for the RDFNET process; priority is the execution priority, from 10 through 199. The default priority is 165.
ATINDEX audittrail-index-number is an integer value identifying a configured TMF audit trail on the primary system. 0 specifies the MAT. 1 through 15 specify auxiliary audit trails AUX01 through AUX15. The default is 0. For each configured extractor, there must be a corresponding receiver with the same ATINDEX value. For information about protecting auxiliary audit trails, see Chapter 13 (page 291).
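The ATINDEX numbering can be summarized in a few lines. This is a small sketch; the function name `audit_trail_name` is hypothetical, and only the mapping itself comes from the text.

```python
def audit_trail_name(atindex):
    """Map an ATINDEX value to the audit trail it identifies."""
    if atindex == 0:
        return "MAT"  # 0 specifies the master audit trail
    if 1 <= atindex <= 15:
        return f"AUX{atindex:02d}"  # 1 through 15 specify AUX01 through AUX15
    raise ValueError(f"ATINDEX {atindex} is outside 0 through 15")
```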
only specify FASTUPDATEMODE ON if your throughput rate is typically low to moderate. In environments with high extractor-to-receiver throughput, specifying FASTUPDATEMODE ON will cause the extractor to fall behind TMF audit generation. See Chapter 3 “Installing and Configuring RDF” for a more complete discussion of this option, and note that for FASTUPDATEMODE to achieve what you want, you must also set the RDF UPDATERDELAY to 1 second. The default is FASTUPDATEMODE OFF. Where Issued Primary system only.
By default, in this example the auxiliary receiver process will run at a priority of 165. SET TRIGGER The SET TRIGGER command sets trigger parameters within the RDF configuration memory table. The supplied values are not applied to the RDF configuration file, however, until you issue an ADD TRIGGER command. The trigger type (REVERSE or TAKEOVER) is specified in the ADD TRIGGER command.
Security Restrictions None. RDF State Requirements None. Usage Guidelines The SET TRIGGER command enters the trigger parameter values specified in this command into the RDF configuration table in memory. This table serves as an input buffer only, so these values do not affect the subsystem until they are applied to the RDF configuration file through the ADD command. Example In the following example, you are configuring an RDF environment to run from \Boston to \London.
{ATINDEX audittrail-index-number }
{CPUS primary-CPU : backup-CPU }
{PRIORITY priority }
{PROCESS process-name }
{IMAGEVOLUME $volume }
{UPDATEVOLUME $volume }
{INCLUDE subvol.file }
{EXCLUDE subvol.file }
{MAPFILE $vol.subvol.file }
{MAPLOG $vol.subvol.file }
ATINDEX audittrail-index-number is an integer value from 0 through 15 specifying the audit trail on the primary system to which the data volume being protected is mapped. 0 specifies the MAT.
MAPFILE $vol.subvol.fname specifies the mapfile on the backup system that contains mapping strings that constitute the mapping rules. The updater will apply these rules to the audit records present in the image trail files. This parameter is optional. MAPLOG $vol.subvol.fname specifies the log file on the backup system into which the updater should log the source and target filename pairs if a mapping rule is applied. This parameter is optional. Where Issued Primary system only.
If you want to specify different INCLUDE/EXCLUDE lists for each volume, then you should use the RESET VOLUME command after you ADD each updater. The RESET VOLUME command clears out any INCLUDE/EXCLUDE lists you SET for the previous updater. To view the current INCLUDE and EXCLUDE parameters in the RDF configuration memory table, issue a SHOW VOLUME command. To view the INCLUDE and EXCLUDE parameters for an updater that has already been added, issue an INFO VOLUME or INFO $volume command.
RECEIVER displays the current configuration parameter values for the receiver process. IMAGETRAIL displays the current configuration parameter values for the image trail. PURGER displays the current configuration parameter values for the purger process. RDFNET displays the current configuration parameter values for the RDFNET process. NETWORK displays the current configuration parameter values for an RDF network. TRIGGER displays the values of the TRIGGER attributes as they are currently set in memory.
RDF UPDATERDELAY 10
RDF UPDATERTXTIME 60
RDF UPDATERRTDWARNING 60
RDF UPDATEROPEN PROTECTED
RDF NETWORK OFF
RDF NETWORKMASTER OFF
RDF UPDATEREXCEPTION ON
RDF REPLICATEPURGE OFF
RDF OWNER SUPER.RDF
The primary system name is set implicitly and the backup system name is set in the INITIALIZE RDF command.
• The updater is to use the secondary image trail $SECIT1 (which was previously added to the RDF configuration by way of an ADD IMAGETRAIL command).
• You have configured this updater to replicate only files in the subvolume MYFILESET.*, but you do not want to replicate MYFILESET.LOG.
START RDF [, UPDATE {ON | OFF}] UPDATE ON Enables update processing on the backup system; this is the default value. UPDATE OFF Disables update processing on the backup system. RDF image files are not purged from the backup system. Where Issued Primary system only. Security Restrictions You can issue the START RDF command if you are a member of the super-user group that initialized RDF and have a remote password from the RDF primary system to the backup.
Examples To start RDF with updating enabled, enter: ]START RDF To start RDF with updating disabled, enter: ]START RDF, UPDATE OFF START UPDATE The START UPDATE command starts all updater processes on the backup system. START UPDATE Where Issued Primary system only. Security Restrictions You can issue the START UPDATE command if you are a member of the super-user group and have a remote password from the RDF primary system to the backup.
RECEIVER requests information and statistics for the receiver process. PURGER requests information and statistics for the purger process. PROCESS procname requests information and statistics for the specified process. VOLUME requests information and statistics for all configured updater processes. RTDWARNING requests information and statistics for only those processes (the extractor or any updater) that have fallen behind the configured RTD threshold (rtd-time).
Imagetrail Purger $DATA06 -> $DATA07 -> $DATA08 -> (1) $RPRG $DATA06 $RUPD1 $DATA07 $RUPD2 $DATA08 $RUPD3 $IMAGEA 185 0:06 185 $IMAGE0 0:00 185 $IMAGEA 0:06 185 $IMAGEA 3 22 3 3 9568 811008 811568 1: 1: 2: 3: 2 2 3 0 RDFCOM - T0346H09 – 11AUG08 C)2008 Hewlett-Packard Development Company, L.P. Status of \RDF04 -> \RDF05 RDF 2008/08/11 05:26:49.082 Control Subvol: $SYSTEM.
($RRCV0) associated with the MAT and writing to the Master Image Trail ($MIT) and a Secondary Image Trail ($IMAGE0), a second receiver ($RRCV1) associated with AUX01 and writing to a Secondary Image Trail ($IMAGEA1), updater $RUPD1 associated with the MAT reading $IMAGE0 and applying updates to $DATA06, updater $RUPD2 associated with AUX01 reading $IMAGEA1 and applying updates to $DATA07, and updater $RUPD3 associated with AUX01 reading $IMAGEA1 and applying updates to $DATA08.
The RTD value reported for each updater process is the difference between the “last modified time” of that updater's audit trail on the primary system and the timestamp added to the image record by the extractor before sending it to the receiver. As is the case with the receiver during an RDF takeover operation, the RTD is replaced by dots to indicate there is no RTD. On a finely tuned RDF backup node, the RTD for an updater can regularly lag 1 to 15 seconds behind TMF processing.
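The updater RTD calculation described above amounts to a timestamp difference. The sketch below is illustrative only: the function names are hypothetical, clamping a negative difference to zero is an assumption, and the minutes:seconds rendering is an assumed reading of values such as 0:06 in the STATUS display.

```python
from datetime import datetime


def updater_rtd_seconds(audit_trail_last_modified, image_record_timestamp):
    """Difference between the audit trail's last modified time on the primary
    system and the timestamp the extractor added to the image record."""
    delta = audit_trail_last_modified - image_record_timestamp
    return max(0, int(delta.total_seconds()))  # assumption: never report a negative RTD


def format_rtd(seconds):
    """Render seconds as minutes:seconds (assumed display format)."""
    minutes, secs = divmod(seconds, 60)
    return f"{minutes}:{secs:02d}"


last_modified = datetime(2008, 8, 11, 5, 26, 49)
record_stamp = datetime(2008, 8, 11, 5, 26, 43)
rtd_text = format_rtd(updater_rtd_seconds(last_modified, record_stamp))
```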
Cpus The eighth column specifies the CPUs in which each process pair is running. Error The final column is used for several purposes. For all RDF processes it is usually blank, which indicates the process is running normally and without any error condition. The following displays can also be reported. **** The specific process has encountered a serious error. You should examine the event log to see what happened.
Examples To display current RDF configuration information and operational statistics once, enter this command: ]STATUS RDF To display that information 10 times, once every minute, enter: ]STATUS RDF, PERIOD 60, COUNT 10 To display current information and statistics for all configured extractor processes once, enter this command: ]STATUS EXTRACTOR To display current information and statistics for only those processes (the extractors or any updater) that have fallen behind the configured RTD threshold (rtd
Security Restrictions You can issue the STOP RDF command if you are a member of the super-user group that initialized RDF and have a remote password from the RDF primary system to the backup. RDF State Requirement You can issue the STOP RDF, DRAIN and STOP RDF, REVERSE commands only while RDF is running and update is on. Usage Guidelines The decision to stop RDF is a management decision that should be carefully planned and performed.
NOTE: Before you can restart RDF, you must stop RDF on the primary system as well. When RDFCOM executes the STOP RDF command, it writes a message to the RDF log file indicating this action. Updaters cannot always respond immediately to a STOP RDF command. If an updater has audit records queued for the disk process, the updater must wait until all of that information is processed before it can shut down.
Usage Guidelines You must wait until the preceding load or TMF FRNL operations have completed before issuing this command. See the descriptions of online database synchronization in Chapter 7 (page 167) for the proper use of this command. Example To issue this command as part of online database synchronization, enter: ]STOP SYNCH STOP UPDATE The STOP UPDATE command suspends updating of the backup database and stops all updater processes.
Usage Guidelines Use the STATUS RDF command to determine whether updating is enabled or disabled. If updating is disabled, the STATUS RDF display specifies the state “Update stopped” and shows no status information for the updater processes. When you disable updating with the STOP UPDATE command, the extractor continues to send all relevant audit from the primary system to the receiver, and the latter stores it in the image trails. Therefore if you STOP UPDATE, you still have full RDF protection.
TAKEOVER The TAKEOVER command puts the backup database into a consistent state with regard to transaction boundaries, after which it can become your new database of record. TAKEOVER [!] If you omit the ! option, then RDFCOM attempts to reach the primary system to verify that it is indeed inaccessible. If it is able to reach the RDF monitor and extractor on the primary system, then the TAKEOVER command is immediately aborted.
In a non-network configuration, a takeover operation occurs in two phases.
• Phase 1 (local undo) undoes transaction data that was incomplete at the backup system at the time the primary system failed. That is, it undoes transactions that were applied during the redo phase but whose final states are unknown to RDF.
If your primary system is recovered and comes back online, see Chapter 5 (page 121) for how to recover it and use its database as a backup to the database on your backup system where your application processing is now taking place. For further related considerations, see also Exception File Optimization in Chapter 5 (page 121). Limitation When building the undo list for an RDF takeover operation, the purger has a limit of 655,360 transactions.
RDF State Requirement You can only issue the UNPINAUDIT command while RDF is stopped. Usage Guidelines If the system at which you issue the UNPINAUDIT command is the primary system in more than one RDF configuration, then you must open the RDF control subvolume and issue another UNPINAUDIT command for each of the other RDF configurations as well. The TMF audit trail files will not be unpinned until an UNPINAUDIT command has been executed for all RDF configurations that use them.
In response to a VALIDATE CONFIGURATION command, RDF verifies:
• RDF global options are configured.
• RDF is initialized, and TMF is running on the primary system.
• The monitor, extractor, receiver, purger, and at least one updater are all configured.
• The primary and backup CPUs are different from each other for each of the monitor, extractor, receiver, purger, and updater processes.
• The TMF audit trail referred to by the context file exists (for an RDF restart).
9 Entering RDFSCAN Commands All RDF messages are directed to an EMS event log (collector). To examine that log without looking at all events for the entire system, you first use the standard EMS filter RDFFLTO to create an intermediate entry-sequenced file copy of the RDF log, and then enter commands through the RDFSCAN online utility. This chapter, which is written for system managers and operators, describes the RDFSCAN commands and their attributes.
In addition, this element is included only if applicable: • Output Displayed: Only two RDFSCAN commands (LIST and SCAN) produce output, although others influence its content and destination. For information about the other elements, see “Command Description Elements” in Chapter 8 (page 187). Except for the LOG and NOLOG commands, you can abbreviate the command name by entering only the first character (such as L for LIST) or any number of the leading characters (such as DIS for DISPLAY).
OFF disables the display of record numbers. Usage Guidelines The DISPLAY function is automatically enabled if pattern matching is enabled and is automatically disabled if pattern matching is disabled. For information about enabling and disabling pattern matching, see the MATCH command description in “MATCH” (page 267). Examples Suppose that $SYSTEM.SANFRAN.
Examples If you issue an EXIT command in response to the RDFSCAN prompt, RDFSCAN terminates the session and displays a logoff message:
Enter the next RDFscan function you want: EXIT
Thank you for using RDFscan
If you press Ctrl-Y in response to the RDFSCAN prompt, RDFSCAN terminates the session and displays an end-of-file indication followed by the logoff message:
Enter the next RDFscan function you want: Ctrl-Y
EOF!
Thank you for using RDFscan
FILE The FILE command selects a file generated by the RDFF
HELP The HELP command displays the syntax of RDFSCAN commands or introductory information about the RDFSCAN utility. HELP [ ALL ] [ INTRO ] [ command ] ALL displays the syntax of all RDFSCAN commands. INTRO displays information on how to use the RDFSCAN utility. command displays the syntax of the RDFSCAN command indicated by command.
If pattern matching is disabled, the LIST command displays the specified number of messages starting at the current record. This behavior is identical to using the SCAN command with pattern matching disabled. For information about enabling and disabling pattern matching, see the MATCH command description in “MATCH” (page 267).
Usage Guidelines The LIST command always transmits its output to the standard output device for RDFSCAN, which is normally your terminal. When you specify a destination file in the LOG command, RDFSCAN directs subsequent LIST command output to that destination file as well as producing it on the standard output device. That is, with the LOG command, LIST output goes both to your terminal and the file specified in LOG.
If you enter the MATCH command but omit the text parameter, RDFSCAN prompts you for a match pattern. To disable pattern matching, merely press the RETURN key at the prompt without entering a pattern. When entering a match pattern, you can use asterisks (*) and question marks (?) as wild-card characters. When pattern matching is enabled, the DISPLAY function is automatically enabled; when pattern matching is disabled, the DISPLAY function is automatically disabled.
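Wild-card matching of this kind can be sketched as below. This is not the RDFSCAN implementation. The sketch assumes that an asterisk matches any run of characters (including none), that a question mark matches exactly one character, that the pattern must cover the whole record, and that matching is case-sensitive; none of these details is confirmed by the text. The function names are hypothetical.

```python
import re


def pattern_to_regex(pattern):
    """Translate a match pattern with * and ? into a compiled regular expression."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")   # assumption: any run of characters, possibly empty
        elif ch == "?":
            parts.append(".")    # assumption: exactly one character
        else:
            parts.append(re.escape(ch))  # every other character matches literally
    return re.compile("".join(parts), re.DOTALL)


def matches(pattern, record):
    """Return True if the whole record fits the pattern."""
    return pattern_to_regex(pattern).fullmatch(record) is not None
```

For example, `matches("*$AU02*", record)` is true for any record containing the text $AU02.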
Usage Guidelines When you issue the NOLOG command, RDFSCAN stops copying records to the file specified in the LOG command. However, RDFSCAN continues to display at your terminal all records accessed by subsequent LIST commands. Examples This command disables the copying of LIST command output: Enter the next RDFSCAN function you want: NOLOG File: $SYSTEM.SANFRAN.
Enter the next RDFSCAN function you want: AT 1000 File: $SYSTEM.SANFRAN.RDFLOG, current record: 1000, last record: 2955 Enter the next RDFSCAN function you want: MATCH *$AU02* File: $SYSTEM.SANFRAN.
10 Triple Contingency The triple contingency feature makes it possible for your applications to resume running with full RDF protection within minutes after loss of your primary system. NOTE: Replication of network transactions is not supported in conjunction with the triple contingency feature, nor is the replication of auxiliary audit trails.
(that is, which system had received the least amount of audit data from the extractor by the time the primary system was lost).
• On the backup system that was further behind (had the least amount of audit data), issue the COPYAUDIT command specifying the name of the other backup system and its RDF control subvolume. That command copies over all missing audit records from the designated system.
• Upon successful completion of the COPYAUDIT operation, do a second takeover on that system.
It is strongly recommended, however, that the various RDF process priorities be identical on both backup systems so that the performance of the two systems is approximately the same. WARNING! If the two backup systems are configured differently from one another in any important regard, the triple contingency feature will not work when you need it, and there will be no advance warning to that effect.
(and therefore be eligible for purging), but, because the RETAINCOUNT is set to three, the purger process can only purge AA000010 (it must keep AA000011 and AA000012 on disk). Thus, as long as the RTD times of the extractors on the two backup systems are less than 36 hours apart, the triple contingency protocol will work successfully.
500 kilobytes of audit records is missing on \C. Because \C has the least amount of audit records, you must issue this command on \C: COPYAUDIT, REMOTESYS \B, REMOTECONTROLSUBVOL A1 For each image trail, RDFCOM on \C reads its own context file to determine the MAT position of the last audit record in the trail. RDFCOM then searches the corresponding trail on \B to find that audit record and performs large block transfers to move all audit records beyond that point to the trail on \C.
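The decision of where to issue COPYAUDIT can be sketched as a comparison of the last MAT position each backup system received. This is a hedged illustration: representing a MAT position as a (sequence number, relative byte address) pair is an assumption, the function name `copyaudit_plan` is hypothetical, and the control subvolume name A2 in the usage lines is invented for the example. Only the command format is taken from the text.

```python
def copyaudit_plan(positions, control_subvols):
    """Decide on which backup system COPYAUDIT is issued and build the command.

    positions: system name -> (sequence number, relative byte address) of the
               last audit record received (assumed representation).
    control_subvols: system name -> RDF control subvolume of that configuration.
    Returns (system to issue the command on, command text), or None when both
    systems hold the same amount of audit.
    """
    if len(positions) != 2:
        raise ValueError("exactly two backup systems are expected")
    behind, ahead = sorted(positions, key=positions.get)
    if positions[behind] == positions[ahead]:
        return None
    command = (
        f"COPYAUDIT, REMOTESYS {ahead}, "
        f"REMOTECONTROLSUBVOL {control_subvols[ahead]}"
    )
    return behind, command


plan = copyaudit_plan(
    {r"\B": (12, 900000), r"\C": (12, 400000)},
    {r"\B": "A1", r"\C": "A2"},  # A2 is a made-up name for illustration
)
```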
RDF Subsystem #2 \A ---------> \C Because the two subsystems run independently of one another, if system \A fails and you execute TAKEOVER commands on systems \B and \C, the two backup databases might not be synchronized with one another. The extractor for the \A-to-\B subsystem, for example, might have replicated audit data to system \B, but, before the extractor for the \A-to-\C subsystem could replicate the same data to system \C, system \A failed.
Summary To be able to use the triple contingency feature, you must:
1. Establish two RDF configurations with the same primary system and separate backup systems.
2. Ensure that the hardware configurations of the two backup systems are identical with regard to data volumes and image trail volumes.
3. Ensure that the data volumes and image trails of the two RDF configurations are configured identically with respect to the two backup systems (with the few minor exceptions noted earlier in this chapter).
11 Subvolume-Level and File-Level Replication By default, RDF provides volume-level protection, wherein changes to all audited files and tables on each protected primary system data volume are replicated to an associated backup system data volume. RDF/IMP, IMPX, and ZLT also support subvolume-level and file-level replication. To use this capability, you supply INCLUDE and EXCLUDE clauses when configuring updaters to identify specific subvolumes and files you want either replicated or not replicated.
Wildcard Character (*) The asterisk (*) can be used as a wildcard character in both subvolume and file names. Within Subvolume Names When used to designate subvolume names, the * must always be used as a suffix. su*v.fname, *.fname, and *.*, for example, are not valid. But DB*.filename is valid because the asterisk is used as a subvolume name suffix. In this case, changes made to all audited files and tables on all subvolumes whose name starts with DB on the protected data volume are replicated.
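The suffix-only rule for subvolume names can be sketched as follows. This is an illustration rather than the RDFCOM parser: the function names are hypothetical, and comparing names without regard to case is an assumption.

```python
def is_valid_subvol_pattern(subvol):
    """A subvolume pattern may contain one asterisk, and only as a suffix
    that follows at least one other character."""
    if "*" not in subvol:
        return bool(subvol)
    return subvol.count("*") == 1 and subvol.endswith("*") and len(subvol) > 1


def subvol_matches(pattern, subvol):
    """Return True if the subvolume name fits the (valid) pattern."""
    if not is_valid_subvol_pattern(pattern):
        raise ValueError(f"invalid subvolume pattern: {pattern}")
    pattern, subvol = pattern.upper(), subvol.upper()  # assumption: case-insensitive
    if pattern.endswith("*"):
        return subvol.startswith(pattern[:-1])
    return subvol == pattern
```

Under these rules, the subvolume parts of su*v.fname and *.fname are rejected, while DB* matches every subvolume whose name starts with DB.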
SET VOLUME CPUS 1:2
SET VOLUME IMAGEVOLUME $IMAGE
SET VOLUME PRIORITY 185
SET VOLUME PROCESS $MM02
SET VOLUME UPDATEVOLUME $DATA02
SET VOLUME INCLUDE SBSUBVOL.MYFILE
SET VOLUME INCLUDE SBTEST10.FILE1
SET VOLUME INCLUDE SBTEST10.FILE2
ADD VOLUME $DATA02
RESET VOLUME
If you did not use the RESET VOLUME command above, then the INCLUDE lists for $DATA01 and $DATA02 are as follows:
$DATA01
MYSUBVOL.MYFILE
MMTEST10.FILE1
MMTEST10.FILE2
$DATA02
MYSUBVOL.MYFILE
MMTEST10.FILE1
MMTEST10.FILE2
SBSUBVOL.
In the above example, the INCLUDE clause specifies that only audited files in $DATA01.MMTEST10 are to be replicated. The INCLUDEPURGE clause specifies that every Enscribe purge operation involving files in this same subvolume is to be replicated, but the EXCLUDEPURGE clause specifies that any purge operations involving the file $DATA01.MMTEST10.FILE10 are NOT to be replicated. Error Checking Extensive checking is done when the subvolume and file names are parsed, and invalid names cause errors.
SET VOLUME INCLUDEPURGE RRVOL*.* SET VOLUME EXCLUDEPURGE RRTYP*.* There is still one updater responsible for replicating changes from $DATA01 on the primary system to $DATA01 on the backup system, but the INCLUDE and EXCLUDE clauses explicitly identify which subvolumes and files on \PRIMARY.DATA01 are to be replicated (all audited files and tables in the subvolumes MMTEST10, DATA*, and DB* are replicated, except MMTEST10.CONC0826 and any files or tables in DATA* whose names start with "C").
12 Subvolume Name Mapping RDF allows users to replicate data from primary system source subvolumes to differently named destination subvolumes on the backup system. However, the recommended configuration is still one-to-one mapping between source subvolumes on the primary system and their corresponding destination subvolumes on the backup system. One-to-one mapping ensures that each partition of a partitioned file or table is mapped to the correct backup subvolume.
• Volume names are not allowed in mapping strings. If the updater detects a $ character, it logs an error.
• Reserved names are not allowed in mapping strings. See the examples of invalid mapping strings listed below.
• When two or more mapping rules are present in a mapfile, the rule listed first always takes precedence if it fits. For example, assume these two mapping strings are present:
MAP NAMES SUBVOL1.* TO SUBVOL2.*
MAP NAMES SUBVOL*.* TO SUBVOL3.
MAP NAMES TEST1.* TO TEST2.* Assume that the file $DATA01.TEST1.FILE on the primary system is modified. RDF applies the mapping rule to this file and replicates its changes to the file $DATA01.TEST2.FILE on the backup system. Next, the file $DATA01.TEST2.FILE on the primary system is modified. RDF determines that the mapping rule does not apply. If RDF had to replicate the changes, they would be replicated to the file $DATA01.TEST2.FILE on the backup system.
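The way a mapping rule is selected and applied can be sketched as below. This is a simplified model, not the updater's logic: each rule is reduced to a (source subvolume, target subvolume) pair, the file-name part is assumed to be * in every rule, and a source may end in * to match by prefix. The function name and the rule names APP1, APP*, BK1, and BKALL are hypothetical.

```python
def apply_mapping(rules, subvol, filename):
    """Return the backup (subvolume, filename) for a primary-system file.

    rules: ordered list of (source subvolume pattern, target subvolume).
    The first rule that fits wins; if none fits, the name is unchanged.
    """
    for source, target in rules:
        if source.endswith("*"):
            fits = subvol.startswith(source[:-1])
        else:
            fits = subvol == source
        if fits:
            return target, filename
    return subvol, filename


# The single rule from the text: MAP NAMES TEST1.* TO TEST2.*
test_rules = [("TEST1", "TEST2")]
mapped = apply_mapping(test_rules, "TEST1", "FILE")
unmapped = apply_mapping(test_rules, "TEST2", "FILE")

# Hypothetical rules showing that the first fitting rule takes precedence.
ordered_rules = [("APP1", "BK1"), ("APP*", "BKALL")]
```

Both TEST1.FILE and TEST2.FILE on the primary system end up at TEST2.FILE on the backup system, which is the collision the text describes.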
Adding a Mapfile and Maplog to an Updater's Configuration Record
Use the RDFCOM SET command to store the names and paths for an updater's mapfile and maplog into the updater's configuration record. For example:
RESET VOLUME
SET VOLUME ATINDEX 0
SET VOLUME CPUS 2:1
SET VOLUME PRIORITY 175
SET VOLUME PROCESS $WU01
SET VOLUME UPDATEVOLUME $DATA04
SET VOLUME IMAGEVOLUME $DATA01
SET VOLUME MAPFILE $data05.napconfg.mapfile
SET VOLUME MAPLOG $data05.napconfg.
ADD VOLUME
To illustrate this problem scenario, assume these circumstances:
• You create an audited, partitioned, key-sequenced file (Enscribe, SQL/MP, or SQL/MX) on the primary system where the primary and secondary partitions are on the same subvolume at $DATA01.SVOL.FILE and $DATA02.SVOL.FILE.
• One updater replicates the changes for the primary partition $DATA01.SVOL.FILE on the primary system to $DATA11.SVOL1.FILE on the backup system using this mapping string:
MAP NAMES SVOL.* TO SVOL1.
13 Auxiliary Audit Trails In addition to the Master Audit Trail (MAT), RDF/IMPX and ZLT support protection of up to 15 auxiliary audit trails. If you want to protect data volumes associated with an auxiliary audit trail, you must configure an auxiliary extractor and an auxiliary receiver for that trail. Thus, for each auxiliary audit trail, there will be one auxiliary extractor-receiver pair. Auxiliary Extractor An auxiliary extractor can only be configured to a single auxiliary audit trail.
• It is an error if the specified atindex does not correspond to a valid index of a configured auxiliary audit trail. That is, if you have configured two TMF auxiliary audit trails with the respective audit trail numbers of 1 and 2, you cannot configure an auxiliary extractor with an atindex value of 3.
• It is an error to specify two extractors or two receivers with the same atindex value.
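These atindex rules, together with the requirement that every extractor has a receiver with the same ATINDEX value, can be sketched as a simple check. This is illustrative only: the function name `check_atindexes` and the message wording are hypothetical, and treating index 0 (the MAT) as always configured is an assumption.

```python
def check_atindexes(configured_aux_trails, extractors, receivers):
    """Return a list of configuration errors (empty when none are found).

    configured_aux_trails: indexes of the configured TMF auxiliary audit trails.
    extractors, receivers: ATINDEX values of the configured processes.
    """
    errors = []
    valid = {0} | set(configured_aux_trails)  # assumption: the MAT always exists
    for kind, indexes in (("extractor", extractors), ("receiver", receivers)):
        seen = set()
        for atindex in indexes:
            if atindex not in valid:
                errors.append(f"{kind} ATINDEX {atindex} has no configured audit trail")
            if atindex in seen:
                errors.append(f"duplicate {kind} ATINDEX {atindex}")
            seen.add(atindex)
    for atindex in sorted(set(extractors) - set(receivers)):
        errors.append(f"extractor ATINDEX {atindex} has no matching receiver")
    return errors
```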
up with the master extractor. When that happens, RDF (or more specifically, the master receiver process) might falsely appear to be hung. As soon as the auxiliary extractor has caught up, however, the TMF shutdown operation proceeds. The same can happen to the updaters when a stop-update-to-time or a SQL shared access DDL operation enters the RDF subsystem, wherein the updaters configured to an auxiliary audit trail may take a long time to shut down if the auxiliary extractor has fallen behind.
For more information about Expand multi-CPU paths, see the Expand Configuration and Management Manual.
14 Network Transactions The RDF/IMPX and RDF/ZLT products are able to guarantee backup database consistency for transactions that update data residing on more than one RDF primary system. RDF/IMPX and RDF/ZLT can map the volumes being protected to both the MAT and auxiliary audit trails. NOTE: Network transaction processing is currently not supported in configurations that use the triple contingency feature. You must use RDF/IMPX or RDF/ZLT to protect all databases open to network transactions.
NETWORKMASTER Attribute This attribute, located in the RDF configuration record, specifies whether or not the particular system is the master of the RDF network. Each RDF network has one, and only one, network master. To set this attribute, use this RDFCOM command: SET RDF NETWORKMASTER {ON | OFF} When this attribute is set to OFF (the default value), the particular system is not the network master. When this attribute is set to ON, the particular system is the network master of the RDF network.
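The one-master rule can be expressed as a short check across the systems of an RDF network. This is a sketch under stated assumptions: the function name `find_network_master` is hypothetical, and the dictionary of per-system NETWORKMASTER settings is an illustrative representation, not an RDF data structure.

```python
def find_network_master(networkmaster_settings):
    """Return the single system whose NETWORKMASTER attribute is ON.

    networkmaster_settings: system name -> "ON" or "OFF".
    Raises ValueError unless exactly one system is the network master.
    """
    masters = [name for name, value in networkmaster_settings.items() if value == "ON"]
    if len(masters) != 1:
        raise ValueError(f"expected exactly one network master, found {len(masters)}")
    return masters[0]
```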
There is no default value. REMOTECONTROLSUBVOL (RCSV) Network Attribute The remote control subvolume (RCSV) is the name of the control subvolume used by the RDF subsystem configured for the specified primary and backup systems. It is set by this RDFCOM command. SET NETWORK REMOTECONTROLSUBVOL subvolume-name There is no default value. PNETTXVOLUME Network Attribute You only use this attribute when configuring the network master.
the MAT and auxiliary audit trails and send data to the receivers. The updaters read their data from their image trails and apply it to their UpdateVolumes. During normal processing, no RDF subsystem (except the RDFNET process within the network master primary system) interacts with any other RDF subsystem in the RDF network. Therefore, the performance of an individual RDF subsystem is unaffected by its inclusion within an RDF network.
(it is undone during phase 1 on the tenth system). All of the updaters then look for audit data associated with the transaction, and undo it. The purger of the network master determines what network transactions are incomplete across the different backup systems, and it produces the master network undo list. Each purger then uses this master list to ascertain the transaction data that must be undone on its backup database.
Takeover and File Recovery When a takeover operation completes in an RDF network environment, the purger logs two events: one reports a safe MAT position (indicating that all committed data up to that location was successfully applied to the backup database), and the second (888 or 858) reports whether or not a File Recovery position is available for use on the primary system.
More specifically, assume that system \A (the network master) executes:
1. T10 (network transaction started on \A)
2. T11 (non-network transaction)
3. T11 commit
4. T10 commit
5. T12 (network transaction started on \A)
6. T12 commit
7. T13 (network transaction started on \B)
8. T13 commit
9. T14 (non-network transaction)
10. T15 (network transaction started on \A)
11. T14 commit
12. T15 commit
At approximately the same time system \B executes:
T13 preceded the commit for T12. Therefore, the purger determines that these two transactions could not have touched the same data, and T13 does not need to be undone. This illustrates a very important point. With network transactions, the commit sequence of network transactions might differ from one node to another, depending on where the transactions originated. For example, consider two network transactions: T101 and T102. Assume T101 originated on \M and T102 originated on \N.
Network Configurations and Shared Access NonStop SQL/MP DDL Operations
Under certain circumstances after a shared access NonStop SQL/MP DDL operation, takeover network undo processing leads to an abort with database corruption. To avoid this problem, use this protocol when performing shared access NonStop SQL/MP DDL operations in a network environment:
1. Issue the RDFCOM STOP RDF command on the primary system where you plan to perform the shared access NonStop SQL/MP DDL operation.
1. The network configuration record must point to the network master of the RDF network.
2. You must ensure that the updater responsible for the PNETTXVOLUME is also configured to the same image trail as that listed in the network master’s network configuration record. Otherwise, validation will fail and you will be unable to start the newly configured subsystem.
It is rare for clocks on different systems to have exactly the same values, thus rendering it impossible for stop-update-to-time operations to perform correctly across multiple backup systems. Sample Configurations Two sample configurations follow, one for the network master and one for a non-network master. The network attributes are highlighted in bold.
SET NETWORK REMOTECONTROLSUBVOLUME RDF05
SET NETWORK PNETTXVOLUME $DATA08
ADD NETWORK
SET RDFNET CPUS 0:2
SET RDFNET PRIORITY 165
SET RDFNET PROCESS $MNET
ADD RDFNET
SET VOLUME ATINDEX 0
SET VOLUME CPUS 1:2
SET VOLUME IMAGEVOLUME $DATA06
SET VOLUME PRIORITY 185
SET VOLUME PROCESS $RU43
SET VOLUME UPDATEVOLUME $DATA07
ADD VOLUME $DATA07
RESET VOLUME
Sample Non-Network Master Configuration
The configuration that follows is for an RDF subsystem running from \RDF05 to \RDF06, where the network master is \
SET NETWORK PRIMARYSYSTEM \RDF04
SET NETWORK BACKUPSYSTEM \RDF06
SET NETWORK REMOTECONTROLSUBVOLUME RDF04
ADD NETWORK
SET VOLUME ATINDEX 0
SET VOLUME CPUS 1:2
SET VOLUME IMAGEVOLUME $DATA06
SET VOLUME PRIORITY 185
SET VOLUME PROCESS $RU53
SET VOLUME UPDATEVOLUME $DATA08
ADD VOLUME $DATA08
RESET VOLUME
RDFCOM STATUS Display
This example illustrates the RDFCOM STATUS display for a network master (it includes the RDFNET process). RDFCOM - T0346H09 – 11AUG08 (C)2008 Hewlett-Packard Development Company, L.
15 Process-Lockstep Operation The RDF/IMPX products include the process-lockstep operation, which is process-based. That is, when a process invokes the lockstep operation for a business transaction, the process must wait until all audit records associated with that business transaction are safely stored in image trails on the backup system before continuing. Process-lockstep is not needed with RDF/ZLT because ZLT functionality provides means whereby no committed data is ever lost during an unplanned outage.
The DoLockstep Procedure
How you invoke the DoLockstep procedure differs depending on whether your applications are written in COBOL or TAL.
Including the DoLockstep in COBOL85 Applications
To invoke the DoLockstep procedure from a COBOL85 program, you must first include the DoLockstep object module in the SPECIAL-NAMES paragraph in the CONFIGURATION section.
CONFIGURATION SECTION.
SOURCE-COMPUTER. HP NONSTOP.
OBJECT-COMPUTER. HP NONSTOP.
SPECIAL-NAMES.
    FILE "$vol.subvol.LSLIBTO" IS LOCKSTEP-LIB.
When the RDF receiver has flushed all audit records up to and including the lockstep audit into the image trail, it replies to the extractor that the lockstep data is safe. When the extractor receives that information, it replies to the gateway which, in turn, passes status back to the DoLockstep call, and the latter then returns status to the application.
Multiple Concurrent Lockstep Operations Because DoLockstep suspends the calling application until the associated lockstep transaction commits on the backup system, a single application process cannot have more than one lockstep operation in progress at any one time. Multiple application processes, however, can invoke DoLockstep concurrently.
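The serialization of concurrent requests can be pictured with a small model. The sketch below is illustrative only and is not part of the RDF product: the class name GatewayModel and its methods are hypothetical, and it assumes that the gateway serves waiting requests one at a time in arrival order, which this manual does not state. It relies only on the documented behavior that a request arriving while a lockstep transaction is in progress is queued until that transaction is committed on the backup system.

```python
from collections import deque


class GatewayModel:
    """Toy model of a gateway that runs one lockstep transaction at a time."""

    def __init__(self):
        self.in_progress = None   # process whose lockstep request is being served
        self.waiting = deque()    # requests queued behind the current one

    def request(self, process):
        """A process calls DoLockstep; return True if it is served immediately."""
        if self.in_progress is None:
            self.in_progress = process
            return True
        # A lockstep transaction is already in progress: queue the request.
        self.waiting.append(process)
        return False

    def committed_on_backup(self):
        """The current lockstep transaction is safe on the backup system.

        Returns the process that is released and promotes the next queued
        request (FIFO order is an assumption of this sketch).
        """
        finished = self.in_progress
        self.in_progress = self.waiting.popleft() if self.waiting else None
        return finished


gateway = GatewayModel()
first_immediate = gateway.request("$APP1")    # served at once
second_immediate = gateway.request("$APP2")   # must wait for $APP1
released = gateway.committed_on_backup()      # $APP1 resumes, $APP2 is next
```

In this model the second caller stays suspended for the remainder of the first caller's lockstep transaction plus its own, which is why queued requests can lengthen response times.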
STARTUPMSG This attribute must include the process name of your RDF extractor (for example, STARTUPMSG "ENABLE $MEXT"). The startup message must also include either ENABLE or DISABLE as the first parameter. Failure to include either of these parameters will cause the gateway to stop. The gateway can only communicate with one extractor. If you have multiple RDF subsystems using the same node as their primary system, only one of them can execute lockstep operations.
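Because a malformed STARTUPMSG stops the gateway, it can be worth checking the string before running the SCF script. The following is a minimal sketch, not an RDF utility: the function check_startupmsg is hypothetical, the process-name pattern is an assumption about local process names, and the checks mirror only the rules stated in this chapter (ENABLE or DISABLE first, an extractor process name without a node name, no extra arguments).

```python
import re

# Assumed shape of a local process name: "$" followed by a letter and up to
# four more alphanumerics. This pattern is an illustration, not RDF's parser.
PROCESS_NAME = re.compile(r"^\$[A-Za-z][A-Za-z0-9]{0,4}$")


def check_startupmsg(text):
    """Return a list of problems found in a STARTUPMSG string."""
    problems = []
    tokens = text.split()
    if tokens and tokens[0].upper() in ("ENABLE", "DISABLE"):
        rest = tokens[1:]
    else:
        problems.append("first parameter must be ENABLE or DISABLE")
        rest = tokens
    if not rest:
        problems.append("extractor process name is missing")
    elif len(rest) > 1:
        problems.append("extra argument after the extractor process name")
    elif "\\" in rest[0]:
        problems.append("process name must not include a node name")
    elif not PROCESS_NAME.match(rest[0]):
        problems.append("unrecognizable process name")
    return problems


ok = check_startupmsg("ENABLE $MEXT")
no_keyword = check_startupmsg("$MEXT")
with_node = check_startupmsg("ENABLE \\NODE.$MEXT")
extra = check_startupmsg("ENABLE $MEXT FOO")
```

An empty list means the string passed every check in the sketch; it does not guarantee that the named extractor exists or is running.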
transaction is in progress, the RDF gateway merely queues the request. Consequently, if an application process issues a DoLockstep request immediately after the gateway has started a lockstep transaction, that request must wait to be performed until the current lockstep transaction is committed on the backup system. That could also increase response times. Lockstep and Auxiliary Audit Trails You cannot use lockstep processing in an RDF subsystem that is protecting auxiliary audit trails.
errnum is a file-system error number. procname is the name of an extractor process. Cause The RDF extractor is no longer responding, and it might be stopped. Effect The lockstep gateway stops. Recovery Determine why the RDF subsystem stopped, correct the problem, and then restart the subsystem. 5 The lockstep gateway received error errnum from the RDF extractor procname. errnum is a file-system error number. procname is the name of the process that is in use.
8 Error errnum received when attempting to obtain info on file filename. errnum is a file-system error number. filename is the name of a lockstep file. Cause The lockstep gateway received the specified error while attempting to call FILE_GETINFOLIST_ on the specified file. Effect The lockstep gateway stops. Recovery Correct the error condition and restart the lockstep gateway. 9 Error errnum returned when attempting to update file filename. errnum is a file-system error number.
Effect The lockstep gateway stops. Recovery This is an internal error, but the gateway is restarted. If the problem persists, contact the Global Mission Critical Solution Center (GMCSC) or your service provider. 13 Invalid data returned from the RDF Extractor. Cause The lockstep gateway sent a request to the RDF extractor, and the latter returned invalid data. Effect The lockstep gateway stops. Recovery This is an internal error, but the gateway is restarted.
Effect The lockstep gateway stops. Recovery This is an internal error, but the gateway is restarted. If the problem persists, contact the Global Mission Critical Solution Center (GMCSC) or your service provider. 18 Filename formatting error errnum. errnum is a file-system error number. Cause The lockstep gateway received the specified error while attempting to format the lockstep filename. Effect The lockstep gateway stops. Recovery This is an internal error, but the gateway is restarted.
Cause The specified error was returned when the lockstep gateway attempted to open the RDF extractor. Effect The gateway continues trying to open the extractor and will repeat this message every five minutes until the open is successful. Recovery This is an informational message. If you want lockstep operations to resume, and if your RDF subsystem is not currently running, you must restart the subsystem. 23 Position error errnum returned when attempting to position in lockstep file filename.
Recovery This is an informational message; no recovery is required. 27 Lockstep Gateway Started. Cause The lockstep gateway is started. Effect The lockstep gateway continues its initialization activity. Recovery This is an informational message; no recovery is required. 28 RDF extractor procname responded with error errnum to lockstep request. procname is the name of an extractor process. errnum is a file-system error number.
Recovery Correct the STARTUPMSG attribute script and then manually delete the RDF lockstep gateway process from SCF and run your newly edited SCF script. The process name must not include a node name. 32 A spurious STARTUPMSG argument was encountered for the lockstep gateway. Cause In your SCF script for starting the lockstep gateway, the STARTUPMSG attribute contained an extra or unrecognizable argument. Effect The lockstep gateway stops.
16 NonStop SQL/MX and RDF RDF supports replication of NonStop SQL/MX user tables (file code 550) and indexes (file code 552). These operations are supported in much the same way as they are with NonStop SQL/MP, and the same types of data and DDL operations are replicated.
3. If you want each catalog to be seen from both systems, register your primary and backup catalogs. To register the primary catalog on the backup system, issue a REGISTER CATALOG command on the primary system. To register the backup catalog on the primary system, issue a REGISTER CATALOG command on the backup system. The format of the REGISTER CATALOG command is: REGISTER CATALOG catalog ON node.
LOCATION clause when creating the primary system's schema, you must query the primary system to obtain the Guardian subvolume name, and you must use the Guardian subvolume name with the LOCATION clause here. For example, if issued on the backup system, this command creates a schema on the backup system called SCH in catalog BCAT using subvolume ZSDXYZ3A: CREATE SCHEMA BCAT.SCH LOCATION ZSDXYZ3A; 6. Create each object (table or index) on the primary system.
7. Create each object on the backup system. The ANSI name of the object must be constructed as follows:
• catalog name: use the name of the backup catalog you created in Step 2.
• schema name: use the name you used in Steps 4 and 5.
• table or index name: must match on the primary and backup systems.
This command creates a table called TAB1 in the schema BCAT.SCH, with three partitions, located on volumes $data02, $data13, $data14, respectively. CREATE TABLE BCAT.SCH.
the name of the subvolume used for the schema on the primary system, then you must query the primary system to obtain the Guardian subvolume name, and you must use the Guardian subvolume name with the LOCATION clause here. See “Creating NonStop SQL/MX Primary and Backup Databases” (page 323).
3. If you want each catalog to be seen from both systems, register your primary and backup catalogs. See “Creating NonStop SQL/MX Primary and Backup Databases” (page 323) for instructions and examples.
6. At the backup system, use the RESTORE utility to place the objects on the backup system, specifying the ANSI names for the backup system. Use the LOCATION clauses to have RESTORE place the objects in the correct Guardian locations. See “Restoring to a Specific Location” for general restore syntax for NonStop SQL/MX databases. For example, assume you have the objects on your primary system that have these fully qualified Guardian names: \pnode.$DATA01.ZSDABCDEF.FILE100 \pnode.$DATA02.ZSDABCDEF.
MXGNAMES utility as described in “Creating a NonStop SQL/MX Backup Database From an Existing Primary Database” (page 326) to generate the LOCATION clauses for the temporary objects, modifying the volume names as necessary and using the primary node name for the -node option. Alternatively, you can use the SHOWDDL command to obtain the fully qualified filenames of the objects you want replicated and specify the same Guardian subvol.
The backup database is now ready for RDF replication, and you can drop the temporary catalog.schema.objects on your primary system. Creating the Fuzzy Copy on the Backup System The advantage of this method is that it eliminates the use of temporary objects as well as tape handling because you create your backup objects directly on the backup system.
2. Load the rows from the primary partition into the backup partition. This requires each catalog to be registered on the other node as described in “Creating NonStop SQL/MX Primary and Backup Databases” (page 323). INSERT INTO backup-table SELECT * FROM primary-table WHERE key-column >= 'F' AND key-column < 'K'; Indirectly From the Primary to the Backup By Way of a Temporary File If the number of rows to load over the network is too great, you can use a temporary file on the primary system: 1. 2. 3. 4.
Perform the steps described in “Online Database Synchronization With NonStop SQL/MX Objects” (page 328). In this case, you are only dealing with a single partition. If you create a temporary table on your primary system, you only need to populate the one partition with the INSERT statement shown in “Offline Synchronization for a Single Partition” (page 330).
Consideration for Creating Backup Tables Currently, you cannot use the CREATE LIKE statement to create backup or temporary tables because CREATE LIKE does not preserve the original Guardian file names that are essential for RDF. At some point in the future when ANSI names are supported, CREATE LIKE will be a viable means of creating backup or temporary tables, but until then the following discussion has the utmost significance.
1. Primary Node: \P
2. Backup Node: \B
3. All volume names are identical on the primary and backup systems.
4. Primary catalog name: PCAT
5. Backup catalog name: BCAT
You are restoring four tables from two different schemas in catalog PCAT.
1. Schema information:
Primary schema name   Schema subvolume   Backup schema name
PCAT.MYSCHEMA         ZSDAAAAA           BCAT.MYSCHEMA
PCAT.MYSCHEMAX        ZSDBBBBB           BCAT.MYSCHEMAX
2. Table and Index information:
Table or Index Name   Guardian Names for partitions and indexes
PCAT.
As described in “Creating a NonStop SQL/MX Backup Database From an Existing Primary Database” (page 326), you can use the MXGNAMES utility to automatically generate the correct LOCATION clauses, substituting the backup node name as needed. However, you must remap any nonmatching volume names in these locations manually.
17 Zero Lost Transactions (ZLT) Zero Lost Transactions (ZLT), functionality that is available only with the RDF/ZLT product, ensures that no transactions that commit on the primary system are lost on the RDF backup system if that primary system is downed by an unplanned outage. RDF achieves this through the use of remote mirroring for the relevant TMF audit trail volume(s).
Figure 17-1 ZLT Configuration With a Single Standby/Backup System Figure 17-2 shows the configuration where a single system serves as both the standby and backup systems, and the remote mirror is located at an intermediate site. Figure 17-2 ZLT Configuration With a Single Standby/Backup System and With the Remote Mirror Located at an Intermediate Site Figure 17-3 shows the configuration where individual standby and backup systems are located at separate sites.
Figure 17-3 ZLT Configuration With Standby and Backup Systems Located at Separate Sites If the standby and backup systems are not one and the same, you must remember to set up remote passwords between the standby and backup systems. You must do so with the same userid that has control over starting and stopping RDF. If you lose your primary system due to an unplanned outage, you connect the remote mirrors to the standby system, and then initiate a takeover operation on the backup system.
Using CommitHoldMode If you want absolute ZLT protection, you must configure your audit trails with the COMMITHOLDMODE attribute set to on. Doing so causes each write to the audit trail to be directed to the remote audit trail disk first. If that write fails for any reason, TMF activates commit-hold mode. If commit-hold mode is activated, TMF stops all further commit operations.
the primary CPU for all extractors, it puts the primary processes of the extractors in as many different CPUs as possible to achieve load balancing provided there are enough CPUs. If, for example, you have six extractors configured, but you only have two CPUs on your standby system, the monitor places the primary processes of three extractors on one CPU and the primary processes for the other three extractors on the other CPU.
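The placement described above can be reproduced with a simple round-robin assignment. The sketch below is an illustration under stated assumptions, not the monitor's actual algorithm, which this manual does not document: the function place_primaries, the extractor names, and the CPU numbers are all hypothetical.

```python
from collections import Counter


def place_primaries(extractors, cpus):
    """Assign each extractor's primary process to a CPU in round-robin order."""
    if not cpus:
        raise ValueError("at least one CPU is required")
    return {name: cpus[i % len(cpus)] for i, name in enumerate(extractors)}


# Six hypothetical extractors on a standby system with two CPUs.
extractors = [f"$EXT{i}" for i in range(1, 7)]
placement = place_primaries(extractors, cpus=[0, 1])
per_cpu = Counter(placement.values())
```

With six extractors and two CPUs, each CPU receives three primary processes, matching the example in the text. With at least as many CPUs as extractors, every primary lands on a different CPU.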
can have up to 16 active, 16 restore, and 16 overflow volumes, an extractor list can contain up to 48 volume names. Set each volume name: SET EXTRACTOR VOLUME volume-name volume-name must be a valid volume name specified in the current TMF configuration on your primary system. Use a SET statement for each individual volume. You do not need to specify whether the volume is an active volume, restore volume, or overflow volume; you merely specify the volume name.
Each extractor logs RDF event 901 reporting it is started for ZLT processing, starts a special audit-fixup process to fix up the last file in the audit trail (see “The Audit-Fixup Process” below), and sends all remaining audit records to its receiver. When an extractor reaches the end of its audit trail, it sends a “ZLT finished” indication to its receiver, and logs RDF event 900 reporting it has completed its ZLT task. When all extractors are finished, they are terminated and deleted.
If a takeover does not involve ZLT, the extractor is not included in the STATUS display during an RDF takeover operation. With ZLT configured and enabled, the STATUS RDF display changes during an RDF takeover. During phase 1 (ZLT processing), status is displayed for the extractor(s), consisting of process name, sno, rba, cpus, error. The RTD field, however, is left blank.
1. Determine which disks (the local disk on the primary system or the remote mirror on the standby system) for all audit trails in the RDF configuration received the most audit records. The example that follows shows how to do so for the MAT. If your RDF configuration includes one or more auxiliary audit trails, you must do the same for each auxiliary audit trail.
d. Once the remote mirror is started, issue SCF START $audit-vol (which causes the revive from -M to -P).
e. Start TMF.
When startup is complete, the database on the primary system contains the same data that the database on the backup system had at the conclusion of the RDF takeover operation.
SQL Shared Access DDL Operations
Normal support for SQL shared access DDL operations is provided during ZLT takeover operations:
• The updaters are guaranteed to stop at the correct locations.
• If some of the updaters terminated prematurely while a shared access operation is in the system, only those that had not completed the task are restarted during the next takeover operation.
A RDF Commands Quick Reference The syntax rules for the RDFCOM and RDFSCAN commands, explained in detail in Chapter 8 (page 187) and Chapter 9 (page 261), are summarized in this appendix. This appendix, which is written for system managers and operators, summarizes the syntax descriptions for: • The command to run RDFCOM from the Guardian user interface to the NonStop operating system. See “RDFCOM Run Syntax”. • The RDFCOM commands, listed in alphabetical order, beginning with the ADD command.
{RECEIVER receiver-option }
{PURGER purger-option }
{RDFNET netsync-option }
{TRIGGER {trigger-type } {trigger-option } }
{VOLUME updater-option }
COPYAUDIT The COPYAUDIT command copies missing audit records from the backup system that has the most to the backup system that has the least. This command is only for use with the triple contingency feature. Where Issued: Backup system only (the backup system with the least amount of audit records).
Where Issued: Primary or backup system. Security: Any user. HISTORY INFO The INFO command displays the current configuration parameter values from the configuration file for the specified process or other object. Where Issued: Primary or backup system. Security: Any user.
Where Issued: Primary or backup system. Security: Any user. OUT [\system.][$volume.][subvolume.][file] RESET The RESET command resets all configuration parameters for the specified process to their default values within the RDF configuration memory table. The corresponding parameters within the configuration file do not change, however, unless you issue an ADD command. Where Issued: Primary system only. Security: Super-user group member.
{CPUS primary-CPU : backup-CPU } {PRIORITY priority } {PROCESS process-name } SET NETWORK The SET NETWORK command sets RDF network configuration parameters within the RDF configuration memory table. The supplied values are not applied to the RDF configuration file, however, until you issue an ADD NETWORK command. Where Issued: Primary system only. Security: Super-user group member.
SET RDFNET The SET RDFNET command sets the designated configuration parameters for the RDFNET process to the supplied values within the RDF configuration memory table. The supplied values are not applied to the RDF configuration file, however, until you issue an ADD command. Where Issued: Primary system only. Security: Super-user group member.
where volume-option is:
{ATINDEX atindex }
{CPUS primary-CPU : backup-CPU }
{PRIORITY priority-number }
{PROCESS process-name }
{IMAGEVOLUME volume }
{UPDATEVOLUME volume }
{INCLUDE subvol.file }
{EXCLUDE subvol.file }
{INCLUDEPURGE subvol.file }
{EXCLUDEPURGE subvol.file }
{MAPFILE $vol.subvol.file }
{MAPLOG $vol.subvol.file }
SHOW The SHOW command displays the current parameter values contained in the RDF configuration memory table for the specified process.
{PROCESS procname }
{VOLUME }
{RTDWARNING }
{RDFNET }
STOP RDF The STOP RDF command shuts down the RDF subsystem. Where Issued: Primary or backup system (can be issued on the backup system only when all communications lines to the primary system are down). Security: Super-user group member with remote password from the primary system to the backup. STOP RDF { [, DRAIN ] } { [, REVERSE ] } STOP SYNCH The STOP SYNCH command is used as part of the online database synchronization protocol.
RDFSCAN Commands Quick Reference RDFSCAN runs under the Guardian user interface (normally the TACL command interpreter) to the NonStop operating system. The RDFSCAN command starts an RDFSCAN session that lets you enter RDFSCAN commands interactively, noninteractively, or through a command file. Where issued: primary or backup system. Security: Any user. RDFSCAN [ file ] AT The AT command specifies the record in the RDF log file at which RDFSCAN begins the next operation.
NOLOG The NOLOG command disables LIST command copying that was previously enabled by a LOG command. NOLOG SCAN The SCAN command scans a specific number of messages in the log file and displays all of those in that range that contain the current match pattern. SCAN number File Names and Process Identifiers File names and process identifiers sometimes appear as parameters in RDFCOM and RDFSCAN commands. These names typically identify objects such as disk files, log devices, and processes.
B Additional Reference Information This appendix provides additional reference information about: • “Default Configuration Parameters” (page 359) • “Sample Configuration File” (page 360) • “RDFSNOOP Utility” (page 362) • “RDF System Files” (page 362) • “RDF File Codes” (page 364) Process names are also reserved: $X* , $Y* , and $Z*. Certain keywords in the NonStop SQL/MP product are reserved words in SQL commands. Those reserved words are listed in the SQL/MP Reference Manual.
Parameter                  Default Value(s)     MIN    MAX
RECEIVER FASTUPDATEMODE    off                  n.a.   n.a.
TRIGGER CPUS               0:1                  0      15
TRIGGER PRIORITY           150                  10     199
TRIGGER WAIT               WAIT                 n.a.   n.a.
TRIGGER NOWAIT             WAIT                 n.a.   n.a.
PURGER CPUS                0:1                  0      15
PURGER PRIORITY            165                  10     199
PURGER PURGETIME           60                   30     1440
PURGER RETAINCOUNT         2                    2      5000
VOLUME ATINDEX             0                    0      15
VOLUME CPUS                0:1                  0      15
VOLUME PRIORITY            160                  10     199
VOLUME UPDATEVOLUME        $SYSTEM              n.a.   n.a.
VOLUME IMAGEVOLUME         RECEIVER RDFVOLUME   n.a.   n.a.
SET RECEIVER CPUS 1:2
SET RECEIVER EXTENTS (1000,1000)
SET RECEIVER PRIORITY 165
SET RECEIVER RDFVOLUME $GOLD
SET RECEIVER FASTUPDATEMODE ON
SET RECEIVER PROCESS $MRECV
| ***
| *** Add the receiver parameters to the
| *** RDF configuration file.
| ***
ADD RECEIVER
| ***
| *** Add secondary image trails.
| ***
ADD IMAGETRAIL $SECIT1
ADD IMAGETRAIL $SECIT2
| ***
| *** Set the updater parameters for the first
| *** volume to be protected by the RDF product.
| *** $U01 is the name of this updater.
| *** $DB3 on the backup node corresponds to
| *** the volume $DB03 on the primary node.
| *** Note that the IMAGEVOLUME parameter is omitted;
| *** it defaults to $SECIT2 because it was not reset
| *** after the previous ADD VOLUME command.
| ***
SET VOLUME CPUS 2:1
SET VOLUME PRIORITY 160
SET VOLUME UPDATEVOLUME $DB3
SET VOLUME PROCESS $U03
| ***
| *** Add the RDF updater parameters for
| *** the third updater process to the
| *** configuration file.
commands. The configuration file resides on both the primary and backup node; on both nodes, the configuration file is named: $SYSTEM.control-subvolume.CONFIG • Context file The context file is a key-sequenced file with record length 4062. The context file contains the context information that tells the RDF subsystem where the RDF processes stopped. There is a separate context file on the primary node and the backup node; on both nodes, the context file is named: $SYSTEM.control-subvolume.
The ZFILEINC file resides on the backup node and is named $SYSTEM.control-subvolume.ZFILEINC. • RDFTKOVR file This file records whether an RDF Takeover operation has completed successfully. This file is empty under normal circumstances (eof = 0). If, however, you have executed an RDF Takeover operation and it completes successfully, then the keyword "DONE" is written in the file by RDF. This file can be used for executing fast business takeover operations.
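A takeover script could use the RDFTKOVR contents as its go or no-go signal. The sketch below is a hedged illustration rather than an RDF interface: the function takeover_completed is hypothetical, it works on bytes that some other tool has already read from the file, and it assumes that DONE appears at the start of the file, since the exact record layout is not described here.

```python
def takeover_completed(contents: bytes) -> bool:
    """Interpret the contents of an RDFTKOVR file.

    An empty file (eof = 0) means no successful takeover is recorded.
    A file that starts with the keyword DONE means the takeover completed.
    The position of the keyword within the file is an assumption.
    """
    return contents.strip().upper().startswith(b"DONE")


normal_state = takeover_completed(b"")       # empty file, no takeover recorded
after_takeover = takeover_completed(b"DONE")  # keyword written by RDF
```

A script built on this check would wait or alert while the function returns False and proceed with application startup on the backup system once it returns True.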
C Messages This appendix describes the messages generated by RDF.
(3) The name of the system on which the particular RDF process is running. (4) The name or process ID of the RDF process that issued this message. (5) The message number. (6) The message text that explains the log entry. If the EMS event log is $0 (the default collector), only items (3), (4), (5) and (6) are logged because of file-length restrictions. The pages that follow list all the RDF messages that RDF produces. The messages appear in ascending order by message number.
program is the name of the program file that RDF tried to execute. expected is the expected version number of the program. received is the actual version number of the program, as reflected by the program file. Cause In response to a START RDF command, RDF attempted to execute the designated program file ( program ). The program in that file, however, had a different version number ( received ) than was expected ( expected ). You have installed the wrong version of RDF. Effect RDF stops.
Effect If this message is issued by an updater process, see Table 5-2 in the RDF manual to determine the appropriate recovery actions. The extractor retries OPEN calls for the audit trail files if the error is 11 (file missing), 12 (file in use), or 59 (file is bad). Those errors might occur while the audit trail file is being restored to disk. The receiver retries OPEN calls for the image files if the error is 11 (file missing), 12 (file in use), or 59 (file is bad).
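The retry rule for OPEN calls can be sketched as follows. This is an illustration only and does not show RDF internals: the function open_with_retry, the callable open_file, and the max_attempts limit are hypothetical, and the source does not state how many times or how often the extractor and receiver retry. Only the three retryable error numbers come from the text.

```python
# Errors that are retried, as listed in the text.
RETRYABLE_OPEN_ERRORS = {11: "file missing", 12: "file in use", 59: "file is bad"}


def open_with_retry(open_file, max_attempts=5):
    """Call open_file() until it succeeds or returns a non-retryable error.

    open_file is a hypothetical callable that returns a file-system error
    number, with 0 meaning success. Returns (attempts_made, last_error).
    A real process would also pause between attempts.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    attempt = 0
    error = None
    while attempt < max_attempts:
        attempt += 1
        error = open_file()
        if error == 0 or error not in RETRYABLE_OPEN_ERRORS:
            break
    return attempt, error


# Simulated OPEN that fails twice while the file is being restored.
results = iter([11, 12, 0])
attempts, last_error = open_with_retry(lambda: next(results))
```

In the simulated run the OPEN succeeds on the third attempt. Any error outside the retryable set ends the loop on the first attempt and is returned to the caller.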
710 TMP is inaccessible Cause An RDF process has tried to obtain audit trail information from the TMF management process pair (TMP), but the TMP was not accessible. The probable cause is that TMF is not currently running. RDF requires that TMF be up and running on the primary and all backup nodes. Effect The RDF process abends. Recovery You must start the TMF product on the affected node. 711 Failure to get process info - error error error is the file-system error number that identifies the specific error.
Recovery See the description of the NEWPROCESS procedure in the Guardian Procedure Errors and Messages Manual to determine the cause of the failure. Once the underlying cause is corrected, the backup process can be created. 714 CHECKPOINT Failure - backup comm error error error is the file-system error number that identifies the specific error. Cause A call to the checkpoint procedure failed, and the backup process of a NonStop process pair is still running.
719 Bad parameter in CHECKPOINT - status nnn nnn is the status word returned by CHECKPOINT. Cause A CHECKPOINT call from the primary process of a NonStop process pair to its backup process failed because of a parameter error. This message indicates a programming problem within RDF. The message includes the status word returned by the CHECKPOINT procedure. Effect The backup process is stopped, and a new one is created after about 15 seconds. Recovery This is an informational message; no recovery is required.
rba is the relative byte address where the error occurred in the audit trail file. Cause The extractor encountered an irrecoverable error at the designated relative byte address (rba) in the designated TMF audit trail file. This message indicates an internal RDF or TMF error. Effect This is a catastrophic error; the extractor abends, and RDF stops. Recovery Because this message indicates a system error, you should preserve the indicated audit trail file for further analysis by your service provider.
Recovery This is an internal error. Contact your service provider. 728 Backup Processor Down Cause The CPU of the backup process of a NonStop process pair failed. Effect The primary process continues to run, but not in fault-tolerant mode. Recovery Reload the downed processor. The backup process is re-created when the processor is reloaded. 729 Attempt to alter process priority failed priority priority is the priority requested for the process.
Cause The updater has found a Stop-RDF-Updater record in the image trail. This special record is generated in the TMF audit trail on the primary system when an SQL DDL operation WITH SHARED ACCESS involving the specified file has completed. Each updater will stop when it reaches this record in the image trail. Effect The updaters stop. Recovery When all updaters have stopped, you must perform the same SQL DDL operation on the RDF backup system that was originally performed on the primary system.
• To compare MAT positions in preparation for the RDFCOM COPYAUDIT command of the triple contingency protocol • To use for File Recovery to a MAT position on the primary system Recovery This is an informational message; no recovery is required. 736 [ANSI-object-type ANSI-name, Partition partition-id,] [File filename] missing on backup system ANSI-object-type is the ANSI object type (for example, table or index). ANSI-name is the ANSI name of the SQL/MX object that encountered the error.
filename is the name of the file that the updater tried to create. Cause An updater was unable to create a file on its UPDATEVOLUME disk. The message includes both the file-system error number and the name of the file the updater attempted to create. Effect See Table 5-3 (page 123) to determine the effect of this error. Recovery See Table 5-3 (page 123) to determine the appropriate recovery actions. 740 Create for unprotected RDF volume failed volume.
Cause A fatal error occurred in RDF. Effect RDF stops. Recovery Restart RDF and contact your service provider. 744 FILEINFO obtained on [ANSI-object-type ANSI-name, Partition partition-id,] file filename ANSI-object-type is the ANSI object type (for example, table or index). ANSI-name is the ANSI name of the SQL/MX object that encountered the error. partition-id is the partition ID of the SQL/MX object that encountered the error.
748 Internal error - RDF extractor abending Cause The extractor has detected an audit record of an unknown version. Effect The extractor process abends. Recovery This is an internal error. Contact your service provider. 749 Old audit record format encountered Cause The extractor has detected an audit record generated by an unsupported version of TMF. Effect The extractor abends. Recovery Reinitialize RDF. You might need to resynchronize the primary and backup databases.
The message includes the file name, relative block number, and relative byte address of the audit file in question. Effect This is a catastrophic error; the extractor abends, and RDF stops. Recovery This message indicates an internal RDF or TMF error. You must resynchronize the primary and backup databases. Save the audit trail file, and report this error to your service provider. 753 Audit trail file stutter.
Cause The purger has detected that all the updaters have shut down following a successful stop-update-to-time operation. Effect The database is now in a consistent state. Recovery This is an informational message; no recovery is required. 758 Process abending Cause The indicated process is abending. Effect A SAVEABEND file is created, a stack trace is logged, and the process (and its backup process, if any) abends. RDF stops. Recovery Restart the RDF product and report the error to your service provider.
Effect You cannot start the RDF subsystem. Recovery Purge all existing context and configuration files on the primary and backup system. Then initialize the RDF subsystem. 763 Process incompatible with local system Cause The process reporting the error has determined that it has been installed on the wrong operating system. Effect The process abends. Recovery Install the version of the RDF product that is compatible with the installed release of the operating system.
768 Phase one part 3 database synchronization complete Cause The third part of phase one of a database synchronization operation has completed. Effect The extractor continues with phase two. Recovery This is an informational message; no recovery is required. 769 Rolling over to filename filename is the name of the next image file in the sequence. Cause The reporting process has filled the current image file and is ready to begin writing to the next file in the sequence.
Recovery This is an informational message; no recovery is required. 775 Restart position adjusted for database synchronization Cause The extractor has encountered a restart condition during an online database synchronization operation, and its current audit trail restart position sent by the receiver might lead to loss of data. Effect The extractor revises its restart location to an earlier point in the audit trail, thereby guaranteeing that no data will be lost.
Effect If all the records for a transaction are not received on the backup node, the transaction is treated as if it aborted. For every image record that is not applied to the backup database, an exception record is written to the designated exception file. Recovery This is a normal occurrence during TAKEOVER processing. The system manager can use RDFSNOOP to list the image records that were not applied to the backup database.
timestamp is the timestamp specified previously by an operator in an RDFCOM STOP UPDATE, TIMESTAMP timestamp command. Cause A STOP UPDATE, TIMESTAMP timestamp command has been issued and the updater has completed its redo pass. Effect The updater is ready to commence its undo pass. Recovery This is an informational message; no recovery is required. 786 STOP SYNCH message received Cause The extractor has received notification of the RDFCOM STOP SYNCH command.
Effect This error is not fatal; processing continues. The backup process is stopped and then re-created later. Recovery See the description of the CHECKALLOCATESEGMENT procedure in the Guardian Procedure Errors and Messages Manual to determine the cause of the failure. Some corrective action might be required for the backup process to be re-created without repeated failures. The most likely cause is insufficient space available for swapping on the swap volume.
NOTE: Under some circumstances, after the configured backup receiver process re-creates its primary process and switches to that process, it continues to hold an image file open. When the receiver no longer needs this image file and the purger tries to purge it, the purger logs RDF error 797 accompanied by file-system error 12.
For error 43 (unable to obtain disk space for extent), the receiver retries the write operation. All other errors are fatal; the receiver abends, and RDF stops. Recovery The only recovery from an error 43 condition is to free some disk space. You can do that by purging unused files, by using FUP DEALLOCATE to deallocate unused extents (not for image files), or by using DCOM to move extents so that small free areas are combined into a larger free space.
805 WRITEUPDATE error error on filename error is the file-system error number that identifies the specific error. filename is the name of the file on which the error occurred. Cause The RDFNET process has encountered the specified error on the specified file. Effect The RDFNET process aborts its current transaction, posts a timer, and waits for that timer to expire before attempting a new transaction. Recovery You should determine the cause of the error and then take appropriate corrective action.
Cause The operator issued a TAKEOVER command. Effect RDF starts a TAKEOVER operation. Recovery This is an informational message; no recovery is required. 812 Error error communicating with procname error is the file-system error number that identifies the specific error. procname is the name of the process with which the updater cannot communicate. Cause The updater encountered a file-system error while attempting to communicate with the receiver or purger.
error is a file-system error number. filename is the name of the image file associated with the error. Cause The receiver or purger process has encountered an error while attempting to perform a setmode operation on the specified file. Effect The process abends. Recovery Correct the problem that led to the error and restart RDF. 817 Shutting down in response to STOP UPDATE Cause The operator issued a STOP UPDATE command, and the updater is stopping normally.
821 RDF updater stopped unexpectedly, updater updater is the name of the updater process that stopped. Cause An updater has stopped unexpectedly. The message includes the name of the stopped process. Effect This message is issued by the RDF monitor. The monitor sends an abort request to all remaining RDF processes to stop RDF.
826 Missing RDF updater CONFIG record Cause The RDF monitor was unable to find an updater configuration record when performing a START RDF command. Effect The START operation fails and RDF shuts down. Recovery Use the SET and ADD commands to create one or more updater configuration records. 827 RDF version incompatible with TMF Cause The RDF process is not compatible with the audit format being generated by TMF. Effect RDF stops.
Effect The equivalent of an INSPECT TRACE is written and then the process will abend. Recovery This message gives your service provider information about the state of a process that is terminating abnormally. You might be able to correct the underlying problem and restart RDF. Otherwise it might be necessary to reinitialize RDF. 832 Open error error on filename error is the file-system error number that identifies the specific error. filename is the name of the affected file.
command-text is the text of the command that was issued. userid if present, is the Guardian userid (group.user) of the user who issued the command. Cause RDFCOM logs this message whenever you issue any of these commands: ALTER, INITIALIZE RDF, START RDF, START UPDATE, STOP RDF, STOP UPDATE, or TAKEOVER. command-text is the command text. If the event includes a userid, it indicates that the userid was not the RDF OWNER or that the userid was not the owner of the RDFCOM object file.
839 Error - Audit-trail file is missing. File filename filename is the name of the audit trail file that could not be found. Cause The extractor was unable to find the designated audit trail file. Usually this occurs because TMF has purged the audit trail file while RDF was stopped.
843 Incorrect version of audit received Cause The receiver has received audit from the extractor that does not match the version of audit that the receiver expects. Effect The receiver abends. Recovery This is an internal error. Contact your service provider. 844 Phase one database synchronization complete Cause The updater that generated this message has completed phase one of the online synchronization operation for its volume on the backup system.
Cause TMF was stopped during an RDF online database synchronization operation, before the extractor had completed its phase one processing. Effect The extractor abends because the database synchronization operation can no longer succeed. Recovery You must reinitialize the RDF product and restart the online database synchronization operation.
855 RDF transaction already aborted Cause The named RDF process has encountered an error condition that requires that its last transaction be aborted, but the transaction has already been aborted. Effect The process restarts and continues processing. Recovery This is an informational message; no recovery is required. 856 Commencing image trail purge pass Cause The purger process is ready to start the task of determining what image files it can purge.
Recovery See the description of BEGINTRANSACTION errors in the TMF Application Programmer’s Guide and take appropriate corrective action. 861 Extractor processname RTD (rtd) exceeds RTD warning threshold (threshold#) processname is an extractor process name. rtd is an RTD value. threshold# is an RTDWARNING warning threshold value. Cause The extractor has fallen behind the configured RTDWARNING threshold specified in the RDF configuration. Effect The extractor continues normal processing.
865 Missing purger config record Cause The purger configuration record is not in the RDF configuration file. Effect The reporting process abends and RDF will abort. Recovery Restart RDF. If the problem persists, contact your service provider. 866 RDF purger stopped unexpectedly Cause The purger process has terminated unexpectedly. Effect RDF aborts. Recovery Determine why the purger stopped, and then restart RDF. If the problem persists, contact your service provider.
Recovery This is an informational message; no recovery is required. 874 SEGMENT_ALLOCATE_ returned error error, error-detail# error is an error number. error-detail# is the detailed error number. Cause The specified error occurred while attempting to allocate an extended segment. Effect The affected process abends and RDF will abort. Recovery Try to determine why the segment could not be allocated. Take appropriate corrective action and restart RDF.
878 Invalid image filename or filecode [ANSI-object-type ANSI-name, Partition partition-id,] file filename ANSI-object-type is the ANSI object type (for example, table, index, and so on.). ANSI-name is the ANSI name of the SQL/MX object that encountered the error. partition-id is the partition ID of the SQL/MX object that encountered the error. filename is the Guardian filename of the file that encountered the error.
Recovery This is an informational message. You must examine the event log to determine why the process is restarting and if any recovery action is required. 882 RDF process transaction unilaterally aborted Cause The named RDF process' current transaction has been aborted by TMF and the disk process. Effect The process restarts. Recovery This is an informational message. You must examine the event log to determine why the process is restarting and if any recovery action is required.
888 MAT position for File Recovery: SNO num RBA num Cause A successful takeover has completed. If you need to bring your primary database back into synchronization with your backup database, use TMF File Recovery on your primary system with the specified MAT position. Effect None. Recovery This is an informational message. See the TMF Manual for information about TMF File Recovery to an audit trail position (TOMATPOSITION).
893 Stop Update to Time Operation rejected Cause You attempted to issue a new STOP UPDATE, TIMESTAMP command, but the existing ZTXUNDO list from your last stop-update-to-time operation is still needed. Effect The command is aborted. Recovery Wait until all RDF updaters have been started and have caught up, then retry the operation. If you get this event message again, stop the RDF product, then restart it. After RDF starts you might issue another STOP UPDATE, TIMESTAMP command.
Cause The updater has detected more than the default maximum number of transactions that need to be undone. This exceeds the number of transactions that can currently be loaded into memory. Effect The updater abends, and the takeover operation aborts. Recovery If this happens during a takeover operation, reissue the TAKEOVER command. When the updater restarts, the table will automatically be resized to accommodate the required number of transactions.
Effect The master receiver waits until all auxiliary receivers have caught up. The auxiliary receiver might continue to report this event as it continues to catch up. Recovery If the master receiver has been waiting for more than 30 seconds, you should check the status of all auxiliary extractors and receivers with the RDFCOM STATUS PROCESS command.
908 A file is prepared for SQL DDL operation Cause You have completed an SQL DDL operation WITH SHARED ACCESS for a file on your primary system, and all updaters have processed the required audit. It is now safe for you to execute the same DDL operation on the backup database. To obtain the name of the source file involved in the operation on the primary system, look for the last RDF 733 event, which lists the name of the file.
text is the termination text. Cause The purger has seen that the user-specified trigger process has stopped. The message might include completion code and/or termination text information that indicates why the process stopped. If the process abended, that will also be indicated. Effect The trigger process is stopped. Recovery There is no set recovery procedure. See the Guardian Procedure Errors and Messages Manual for a description of completion codes.
Cause The extractor has encountered a critical audit record that pertains to either a STOP TMF operation or a NonStop SQL/MP or NonStop SQL/MX DDL operation WITH SHARED ACCESS, and the Monitor was unable to communicate information about the operation to another RDF process. See the most recent RDF Event 701 for details. Effect The extractor delays for a short period of time and then tries to process this audit record again. Recovery See the most recent RDF event 701 and take corrective action.
Cause The updater has detected that the specified mapfile is not found. Effect The updater stops and RDF aborts. Recovery Provide an existing mapfile, and restart RDF. 927 Filenames filename1 and filename2 collide on the same physical filename filename3 on the backup system filename1 is the name of one colliding file on the primary system. filename2 is the name of another colliding file on the backup system. filename3 is the name of a physical file on the backup system on which the two files are colliding.
WITH SHARED ACCESS involving the specified table or index has completed. Each updater stops when it reaches this record in the audit trail. Effect All the updaters stop. Recovery When all updaters have stopped, you must perform the same SQL DDL operation on the RDF backup node that was originally done on the primary node. When this operation has been performed, start the updaters using the START UPDATE command.
Effect The ALTER operation was not completed. Recovery See the Operator Messages Manual for a description of the error code. For additional details about understanding and correcting file-system errors, see the Guardian Procedure Errors and Messages Manual. If possible, correct the error and reenter the command that encountered the error. Otherwise, see your system manager. ALTER of this element cannot be performed with RDF up Cause An ALTER command was issued for a non-runtime ALTER option.
Cause You must specify the PrimarySystem, BackupSystem, and RemoteControlSubvolume for the network master of your RDF network. Effect The configuration command fails. Recovery Set the missing fields and add the record. A TAKEOVER operation has not completed on the local system Cause You tried to execute the COPYAUDIT command, but the RDF TAKEOVER operation has not completed on the local system. Effect The COPYAUDIT command is aborted.
Cause The network master network record does not have the specified backup system name for the local RDF subsystem. Effect Validation fails. Recovery You must reconfigure your network master. BACKUPSYSTEM is Not Defined Cause The RDF configuration file is invalid. Effect The RDF configuration record was not added. Recovery Enter a SET RDF command, using the BACKUPSYSTEM parameter to identify the backup system. Then, enter an ADD RDF command to add this system to the configuration.
Cause The context file data could not be created or cleared while START RDF processing was being performed after the INITIALIZE RDF command. Effect START RDF processing is aborted. Recovery See the Operator Messages Manual for a description of the error code. For additional details about understanding and correcting file-system errors, see the Guardian Procedure Errors and Messages Manual. If possible, correct the error and reenter the command that encountered the error. Otherwise, see your system manager.
cpu:cpu are the primary and backup CPUs, respectively. Cause Both the configured primary and backup CPUs were unavailable when a START RDF command was issued. Effect The command fails. Recovery Wait until both CPUs become available, and reenter the START RDF command. If necessary, see your system manager. cpu:cpu CPUS are not SYSGEN’d cpu:cpu are the primary and backup CPUs, respectively. Cause A START RDF command failed because the specified CPUs do not exist (they were not configured during SYSGEN).
Cause You tried to add an auxiliary extractor or receiver to an RDF environment that had previously been configured for lockstep operations. Effect The ADD EXTRACTOR or ADD RECEIVER command fails. Recovery You cannot have both lockstep operation and protection for data configured to an auxiliary audit trail in the same RDF subsystem. If you want lockstep protection, then your data must be placed on TMF data volumes configured to the Master Audit Trail.
and Messages Manual. Take appropriate corrective action, and then reissue the UNPINAUDIT command. Error error# obtaining FILECODE and CRVSN of the MAPFILE filename error# is the file-system error number that identifies the specific error. filename is the name of the updater mapfile specified in the updater configuration. Cause RDFCOM could not obtain the file code and CRVSN of the updater mapfile when an ADD VOLUME, START RDF, START UPDATE, or VALIDATE CONFIGURATION command was being executed.
error# is the file-system error number that identifies the specific error. filename is the name of the image trail file associated with the error. Cause The COPYAUDIT command encountered the specified error while attempting to create the specified image file on the local image trail volume. Effect The COPYAUDIT command aborts. Recovery See the Operator Messages Manual for a description of the error code.
Recovery See the Guardian Procedure Errors and Messages Manual for a description of the recovery actions for the file-system error. Correct the error indicated by error#, then reenter the command. Error error# on process info attempt error# is the file-system error number that identifies the specific error. Cause The COPYAUDIT command encountered the specified error while attempting to process information about the current RDFCOM session. Effect The COPYAUDIT command aborts.
Cause During execution of a VALIDATE CONFIGURATION command, RDFCOM determined that the updater mapfile is invalid. Effect The command fails. Recovery Correct the mapfile and reenter the command. Expected MAP in the mapping string mapping-string in the MAPFILE filename mapping-string is the erroneous mapping string specified in the mapfile. filename is the name of the updater mapfile specified in the updater configuration.
Expecting 'Yes' or 'No' response. Cause You have entered an unexpected response to an RDFCOM prompt that requires only either YES (or Y) or NO (or N) as verification to proceed with your request. Effect The requested operation does not take place. Recovery Reenter your request, this time specifying either YES, Y, NO, or N to the prompt. Expecting 'Yes' or 'No' response.
Recovery Set FASTUPDATEMODE ON for the Master Receiver before setting FASTUPDATEMODE ON for any Auxiliary Receiver. FASTUPDATEMODE should be turned OFF for all the Auxiliary Receivers before it is turned OFF for the Master Receiver. Cause You have tried to turn FASTUPDATEMODE OFF for the Master Receiver when FASTUPDATEMODE is ON for one or more Auxiliary Receivers. Effect FASTUPDATEMODE will not be turned OFF for the Master Receiver.
error# is the NEWPROCESS error number that identifies the specific error. Cause A NEWPROCESS error occurred during START RDF or TAKEOVER processing. Effect The START RDF or TAKEOVER operation is aborted. Recovery See the Operator Messages Manual for a description of the error code. For additional details about understanding and correcting process errors, see the Guardian Procedure Errors and Messages Manual. Correct the error and reenter the START RDF or TAKEOVER command.
Cause You tried to add a secondary image trail on the volume volume-name, but an image trail has already been added for this volume. Effect The command fails. Recovery Select a different volume for the secondary image trail. IMAGETRAIL volume-name is not used by any updater volume-name is the name of the image trail. Cause While validating your configuration, RDFCOM determined that the image trail on the volume volume-name is not referenced by any updater processes. Effect The validation operation aborts.
timestamp is an INITTIME timestamp specified previously by an operator in an RDFCOM INITIALIZE RDF command. Cause You are attempting to initialize RDF to a timestamp that is earlier than the current time, and database synchronization is not involved. A record whose timestamp is less than timestamp has been found. Effect The RDFCOM INITIALIZE RDF command continues. Recovery This is an informational message; no recovery is required.
Recovery Examine the specified timestamp. If the format is correct, then the error arises from an incorrect value for the date or time, such as 30 for hour or 32 for day. Correct the specified timestamp and reenter the INITIALIZE RDF command. The correct format for the timestamp is day month year hour:min, where: day is a number from 1 to 31. month is the first three letters of the month, such as JAN, FEB, MAR. year is a two-digit or four-digit number, such as 91 or 1991.
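The format rules above can be checked before a timestamp is typed into RDFCOM. The following Python fragment is a minimal sketch of such a pre-check, not part of the RDF product: the function name check_timestamp is hypothetical, the hour range 0 to 23 and minute range 0 to 59 are assumed from ordinary clock conventions, and acceptance of lowercase month names and one-digit hours is an assumption about input style rather than documented RDFCOM behavior. It does not check the day against the length of the month.

```python
import re

# Three-letter month abbreviations, as in the format description (JAN, FEB, MAR, ...).
MONTHS = ("JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC")

# day month year hour:min, for example "15 JUN 2009 08:30".
_PATTERN = re.compile(
    r"^(\d{1,2}) ([A-Za-z]{3}) (\d{2}|\d{4}) (\d{1,2}):(\d{2})$"
)


def check_timestamp(text):
    """Return a list of problems found in text; an empty list means none found."""
    # Collapse runs of whitespace so extra blanks do not cause a format error.
    match = _PATTERN.match(" ".join(text.split()))
    if match is None:
        return ["format is not 'day month year hour:min'"]

    day, month, year, hour, minute = match.groups()
    problems = []
    if not 1 <= int(day) <= 31:
        problems.append(f"day {day} is outside 1-31")
    if month.upper() not in MONTHS:
        problems.append(f"month {month} is not a three-letter month name")
    if not 0 <= int(hour) <= 23:      # assumed range
        problems.append(f"hour {hour} is outside 0-23")
    if not 0 <= int(minute) <= 59:    # assumed range
        problems.append(f"minute {minute} is outside 0-59")
    return problems


if __name__ == "__main__":
    for sample in ("15 JUN 2009 08:30", "32 JUN 2009 30:00", "2009-06-15"):
        print(sample, "->", check_timestamp(sample) or "no problems found")
```

The second sample reproduces the two value errors named in the recovery text (32 for day, 30 for hour), so both are reported even though the overall format is correct.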
Cause RDFCOM expected * in the filename portion of the subvolume indicated by subvolume-name when an ADD VOLUME, ALTER VOLUME, START RDF, START UPDATE, or VALIDATE CONFIGURATION command was being executed. Effect The command fails. Recovery Correct the mapping string, then reenter the command. Mapping string mapping-string is invalid in the MAPFILE filename, error error# mapping-string is the erroneous mapping string specified in the mapfile.
Cause You have already added an extractor with an ATINDEX value of 0. Effect The ADD EXTRACTOR command fails. Recovery Review and revise your RDF configuration. Master RECEIVER Record already exists Cause You have already added a receiver with an ATINDEX value of 0. Effect The ADD RECEIVER command fails. Recovery Review and revise your RDF configuration. Maximum number of image trails already added Cause You have already added the maximum number of secondary image trails (255), and cannot add more.
Recovery You must add the appropriate NETWORK configuration record. Network synch file ZRDFNETX file must be INCLUDED Cause An INCLUDE pattern has been specified that will cause audit records associated with the NetSynch data file to be filtered out. Effect RDF cannot be started. Recovery Correct the VOLUME INCLUDE associated with the PNETTXVOLUME so that the file $volume.control-subvolume.ZRDFNETX is included.
Cause The COPYAUDIT command could not find any image files on the remote image trail. This problem indicates that the receiver’s RETAINCOUNT value was probably not set high enough and that, as a result, some image files on the remote system were purged. Effect The COPYAUDIT command aborts. Recovery There is no recovery action. The COPYAUDIT command cannot be executed because image files needed for this command were already purged from the remote system.
Recovery If the network record you have previously added pertains to the RDF network master subsystem, then do not add any further network records. If the network record you have previously added does not pertain to the RDF network master subsystem, then you need to purge your current configuration, reinitialize, and reconfigure. Only the SUPER group can execute this command Cause You issued a command restricted to members of the super-user group, but your logon ID does not indicate that group.
Effect The command fails. Recovery Enter another command, or shut down RDF and reenter this command. Operation must be performed on the PRIMARYSYSTEM \primary or BACKUPSYSTEM \backup primary is the name of the primary node that can perform the operation. backup is the name of the backup node that can perform the operation. Cause The command was issued at a third system, which is not allowed. Effect The command fails.
Recovery See the Operator Messages Manual for a description of the error code. For additional details about understanding and correcting file-system errors, see the Guardian Procedure Errors and Messages Manual. If possible, correct the error and reenter the COPYAUDIT command. Otherwise, contact your service provider. Position error error# on local image file error# is the file-system error number that identifies the specific error.
Cause The RDF configuration file is invalid: both primary and backup node names are identical. Effect The validation fails. Recovery Alter the RDF configuration to reflect different names for these two nodes. Process Name Error: error# error# is the error number that identifies the specific error. Cause A NEWPROCESS error occurred during START RDF or TAKEOVER processing. Effect The START RDF or TAKEOVER operation is aborted. Recovery See the Operator Messages Manual for a description of the error code.
filename is the name of the nonexistent EMS collector (RDF log file). node is the name of the system where the collector name is invalid. Cause The RDF configuration file is invalid: a nonexistent EMS collector was specified. Effect The ADD RDF command fails. Recovery Specify a valid EMS collector name in a SET RDF command, and then reenter the ADD RDF command. RDF network subsystem ctrl-subvol has not been validated ctrl-subvol is the name of an RDF subsystem control subvolume.
ctrl-subvol is the name of an RDF subsystem control subvolume. Cause The RDF subsystem that you specified as your network master has not been configured as a network master. Effect Validation fails. Recovery You need to reconfigure your local subsystem and specify the control subvolume of your network master. You might also need to reconfigure your network master. RDF subsystem ctrl-subvol stopped. TMF audit trails remain pinned. ctrl-subvol is the name of an RDF subsystem control subvolume.
Effect Validation fails. Recovery You must reconfigure your network master and possibly your local configuration. RDFVOLUME is not allowed for an aux receiver. Auxiliary receivers do not have an RDFVOLUME. Cause You tried to add an auxiliary receiver for which you had specified an RDFVOLUME. Effect The ADD command fails. Recovery Issue a RESET RECEIVER command, and then reconfigure the particular receiver without specifying an RDFVOLUME.
As one of its validation checks during START RDF processing, RDFCOM tries to create a temporary image file on the receiver’s RDFVOLUME and then to allocate all 16 extents. This check, if successful, verifies that: • If RDF is starting for the first time, there is enough storage for at least one image file • If RDF has been started previously, there is enough storage for one image file when the next image-file rollover occurs If the check fails because there is insufficient storage, this message appears.
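The size of the storage that this check needs can be estimated ahead of time. The following Python fragment is a rough sketch only: the function names and the sample extent sizes are hypothetical, and it assumes that the image file has one primary extent followed by secondary extents and that extent sizes are expressed in 2048-byte pages. Because each extent must also fit in a contiguous free area, enough total free space is a necessary condition for the check to pass, not a guarantee.

```python
PAGE_BYTES = 2048   # assumption: extent sizes are counted in 2048-byte pages
EXTENT_COUNT = 16   # the validation check allocates all 16 extents


def image_file_bytes(primary_pages, secondary_pages, extents=EXTENT_COUNT):
    """Bytes needed for one fully allocated image file."""
    # One primary extent plus (extents - 1) secondary extents.
    total_pages = primary_pages + (extents - 1) * secondary_pages
    return total_pages * PAGE_BYTES


def has_room(free_bytes, primary_pages, secondary_pages):
    """True if the free space is at least large enough for one image file."""
    return free_bytes >= image_file_bytes(primary_pages, secondary_pages)


if __name__ == "__main__":
    # Hypothetical extent sizes, chosen only to illustrate the arithmetic.
    needed = image_file_bytes(primary_pages=1000, secondary_pages=1000)
    print(f"one image file needs {needed} bytes")
    print("enough total free space:", has_room(50_000_000, 1000, 1000))
```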
RECEIVER record exists, use ALTER RECEIVER Cause An ADD RECEIVER command was issued when the configuration file already contained a receiver record. Effect The command fails. Recovery No recovery is required if you want to use the existing receiver process as it is configured. If you want to change any of the receiver’s configuration options, however, enter an ALTER RECEIVER command that specifies those changes. RECEIVER record NOT found.
mapping-string is the erroneous mapping string specified in the mapfile. filename is the name of the updater mapfile specified in the updater configuration. Cause RDFCOM found a reserved subvolume name in the mapping string specified in the updater mapfile when an ADD VOLUME, ALTER VOLUME, START RDF, START UPDATE, or VALIDATE CONFIGURATION command was being executed. Effect The command fails. Recovery Remove the reserved subvolume name in the mapping string, then reenter the command.
SHUTDOWN Failure: error# on VOLUME volume error# is the error number that identifies the specific error. volume is the name of an RDF data volume. Cause RDFCOM could not stop the updater for volume volume because of error. Effect The shutdown is aborted. Recovery See the Operator Messages Manual for a description of the error code. For additional details about understanding and correcting errors, see the Guardian Procedure Errors and Messages Manual.
to disk. Alternatively, you can use a different TMF shutdown point that is located in a MAT file still on disk, or you can stop TMF and use that resulting shutdown point. START RDF Aborted Cause A START RDF command aborted. Effect The command fails. Recovery Scan the EMS event log to determine why the command aborted, correct the error if possible, and reenter the START RDF command. START UPDATE in progress, Please Wait... Cause A START UPDATE command is being executed.
Effect Some updaters might have shut down, but others never received the stop message and are still running. The receiver and monitor cannot now identify these updaters, and you cannot stop them with another STOP UPDATE command or a STOP RDF command. Recovery All remaining updaters must be manually stopped from the TACL interface (for example, with a TACL STOP $UPD1 command).
Cause A takeover operation is underway. Effect The takeover operation continues. Recovery This is an informational message; no recovery is required. TAKEOVER command is not allowed in an OBEY/IN file without the bang (!) option. Cause The TAKEOVER command was issued through an OBEY/IN file without the bang (!) option. Effect The TAKEOVER operation is aborted. Recovery Specify the bang (!) option along with the TAKEOVER command in the OBEY/IN file, or issue the TAKEOVER command from the RDFCOM command prompt.
probably not set high enough and that, as a result, some image files on the remote system were purged. Effect The COPYAUDIT command aborts. Recovery There is no recovery action. The COPYAUDIT command cannot be executed because image files needed for this command were already purged from the remote system. The MAPFILE filename is not found filename is the name of the updater mapfile specified in the updater configuration.
Cause While searching for a TMF shutdown timestamp to use to initialize RDF, RDFCOM found that the audit trail file with the specified sequence number is not currently available. If you respond to the prompt with YES or Y, RDFCOM directs the TMP to begin restoring this file. NOTE: If the file was dumped to tape, you must wait for the TMP to prompt you to mount the tape. This mount request is logged in the EMS log. If you respond with NO or N, then RDFCOM aborts the initialization attempt.
\bksys is the name of the RDF backup system. subvol is the name of the remote control subvolume. Cause You tried to execute an INITIALIZE RDF command, but the RDF control files (such as CONFIG or CONTEXT) already exist on the remote control subvolume. If these files are on the backup system, then that name is specified. Effect The INITIALIZE RDF command aborts. Recovery You must purge \$SYSTEM.subvol.* on the backup systems before you can retry the INITIALIZE RDF command.
Effect The start command fails. Recovery You must reconfigure RDF with a named updater process. The year must be greater than 1996. Cause You specified the year of a timestamp that is earlier than 1997. Effect The command involving the timestamp fails. Recovery Reissue the command, specifying a timestamp year that is 1997 or greater. This aux EXTRACTOR Record already exists. Cause You tried to add an EXTRACTOR with a particular ATINDEX value, but there is already one configured with that value.
Recovery Check the contents of the RDF configuration file, issue a VALIDATE RDF command to verify the configuration, and reissue your request for the RDFCOM operation you originally wanted to perform. TMF NAT table is full. Cause There is a problem with TMF. Effect The configuration validation fails. Recovery Check the status of TMF. When TMF is operational, reenter the command. TMF Shutdown at timestamp has been found.
Cause The COPYAUDIT command has aborted because of a problem reported in the previous RDFCOM message. Effect The COPYAUDIT command aborts. Recovery Correct the problem reported in the previous error message and reissue the COPYAUDIT command. Unable to allocate Map Cause A NEWPROCESS error occurred during START RDF or TAKEOVER processing. Effect The NEWPROCESS procedure fails. Recovery Make sufficient space available on the swap volume for the requested operation.
error# is the file-system error number that identifies the specific error. Cause A DELETE IMAGETRAIL command tried to delete an image trail, but RDFCOM could not purge all image files in the trail because of the file-system error denoted by error#. Effect The command fails. Recovery See the Operator Messages Manual for a description of the error code. For additional details about understanding and correcting file-system errors, see the Guardian Procedure Errors and Messages Manual.
Effect RDF will not start.
Recovery Change the RDF configuration to reflect a valid disk volume.

VOLUME device UPDATEVOLUME is NOT a disk volume
device is the non-disk device assigned for the UPDATEVOLUME.
Cause The RDF configuration file is invalid.
Effect RDF will not start.
Recovery Change the RDF configuration to reflect a valid disk volume.

VOLUME Record exists, use ALTER VOLUME volume
Cause An ADD VOLUME command was issued when the configuration file already contained an updater record for the volume.
Recovery Delete the updater, and then delete the image trail.

VOLUME vol-name does not match imagetrail ATINDEX atindex
Cause You added an updater with the specified ATINDEX, but the IMAGEVOLUME configured for the updater does not have that value.
Effect The validation fails.
Recovery Alter the particular updater’s ATINDEX value to match the appropriate audit trail number or delete the updater.
procname is the RDF process without a backup CPU, which is one of: EXTRACTOR, MONITOR, RECEIVER, or $volume UPDATER.
Cause RDF was started without a backup process for the process identified in this message.
Effect RDF is started.
Recovery Stop RDF, reconfigure it to include a backup CPU for the RDF process, and restart the subsystem.

* * * WARNING * * * NSA SQL DDL operation encountered in the audit trail.
another three minutes from the specified timestamp to ensure that the starting position in the audit trail is a safe one.
Recovery This is an informational message; no recovery is required.

*** WARNING *** REPLICATEPURGE is not turned ON.
Cause REPLICATEPURGE is turned OFF but INCLUDEPURGE or EXCLUDEPURGE lists have been added for a volume.
Effect The INCLUDEPURGE or EXCLUDEPURGE lists will be ignored for the volume.
Recovery SET RDF REPLICATEPURGE to ON before starting RDF.
error# is the file-system error number that identifies the specific error.
filename is the name of the image trail file associated with the error.
Cause The COPYAUDIT command encountered the specified error while attempting to write data into the specified image file on the local image trail volume.
Effect The COPYAUDIT command aborts.
Recovery See the Operator Messages Manual for a description of the error code.
You cannot add more than 48 network records
Cause The current limit for the number of RDF subsystems in your RDF network is 48, and you attempted to add a 49th.
Effect The configuration command fails.
Recovery Do not add any more network records.

You cannot alter MAPFILE on the backup system if the primary system is available
Cause During execution of an ALTER VOLUME command on the backup system, RDFCOM determined that the primary system is accessible.
Effect The command fails.
RDFSCAN Messages
The following RDFSCAN messages (listed alphabetically by text) can appear on your terminal screen during an RDFSCAN session.

Beyond eof!
Cause The AT position specified is beyond the end-of-file mark in the current log file.
Effect The AT command fails.
Recovery Reenter the AT command, this time with a record-number parameter that indicates a position before the end-of-file mark.
Filename is the invalid filename.
Cause The specified file is not a valid file recognized by the operating system.
Effect The command fails.
Recovery Check the filename for correct spelling and compliance with syntax rules.

HELP for command not found
command is the RDFSCAN command for which online help was requested.
Cause The command for which HELP text was requested is not a valid RDFSCAN command.
Effect The HELP command fails.
D Operational Limits
Table D-1 Operational Limits for RDF/IMP, IMPX, and ZLT

Limit Description                                                     Maximum Value
Number of volumes being protected                                     255
Number of volumes in an SMF pool on backup system                     15
Number of auxiliary image trails                                      255
Number of files per updater                                           3000
Number of RDF configurations with the same primary system             37
Number of systems that can contribute audit to a primary system       255
Maximum number of image trail file primary and secondary extents      65,500
Maximum number of primary
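
The limits in Table D-1 lend themselves to a mechanical check before a configuration is committed. The following is a minimal Python sketch of such a check and is not part of RDF: the dictionary keys and the find_limit_violations helper are hypothetical names chosen for illustration, and only the rows whose values appear in full in the table are included.

```python
# Hypothetical pre-flight check against the operational limits in Table D-1.
# Key names are illustrative assumptions; the values come from the table.
OPERATIONAL_LIMITS = {
    "protected_volumes": 255,
    "smf_pool_volumes_on_backup": 15,
    "auxiliary_image_trails": 255,
    "files_per_updater": 3000,
    "rdf_configurations_per_primary": 37,
    "systems_contributing_audit": 255,
}


def find_limit_violations(planned):
    """Return one message per planned value that exceeds its documented limit.

    `planned` maps a limit name (a key of OPERATIONAL_LIMITS) to the value
    you intend to configure. Names not in the table are ignored.
    """
    violations = []
    for name, value in planned.items():
        limit = OPERATIONAL_LIMITS.get(name)
        if limit is not None and value > limit:
            violations.append(f"{name}: {value} exceeds maximum of {limit}")
    return violations


if __name__ == "__main__":
    plan = {"protected_volumes": 256, "files_per_updater": 3000}
    for message in find_limit_violations(plan):
        print(message)
```

A value equal to the documented maximum is treated as acceptable; only values above it are reported.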
E Using ASAP ASAP (Availability Statistics and Performance) allows many subsystem entities to be monitored across a network of NonStop servers. The status and statistics for the entities are collected on a single system, and are then monitored either through the ASAP command interface or through the ASAP graphical user interface (GUI) PC client.
Figure E-1 The RDF/ASAP Environment

Installation
The RDF SGP is packaged with the RDF/IMP and IMPX products and, by default, is installed on $SYSTEM.RDF. You can, however, place this object file wherever you want. If you install the SGP object file somewhere other than $SYSTEM.RDF, you must ensure that the ASAP configuration points to the correct location (by way of the SET RDF command within the ASAP command interface). See the ASAP Server Manual for details about the SET RDF command.
MONITOR RDF DOME->TANDA
This command uses DOME as the control subvolume (CSV). To use a control subvolume with a suffix, for example E, use the command:
MONITOR RDF DOMEE->TANDA
where DOMEE is the control subvolume and TANDA is the name of the RDF backup system without the leading backslash (\).

Adding and Removing RDF Environments
When the process starts, the RDF SGP automatically detects and processes the RDF environments added through the MONITOR command.
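
The MONITOR RDF examples above follow a simple pattern: the control subvolume name (with any suffix character appended), an arrow, and the backup system name without its backslash. The following minimal Python sketch assembles such a command string. It is illustrative only: the monitor_rdf_command helper and its parameter names are hypothetical, and it only builds text to be entered in the ASAP command interface.

```python
def monitor_rdf_command(control_subvolume_base, backup_system, suffix=""):
    """Build a MONITOR RDF command string in the form shown in the text.

    control_subvolume_base: control subvolume name without a suffix (e.g. "DOME")
    backup_system: backup system name, with or without a leading backslash
    suffix: optional control subvolume suffix character (e.g. "E")
    All three names are assumptions made for this sketch.
    """
    # The backup system is written without the leading backslash.
    backup = backup_system.lstrip("\\")
    control_subvolume = control_subvolume_base + suffix
    return f"MONITOR RDF {control_subvolume}->{backup}"


if __name__ == "__main__":
    print(monitor_rdf_command("DOME", "\\TANDA"))
    print(monitor_rdf_command("DOME", "\\TANDA", suffix="E"))
```

Stripping the backslash in the helper means the backup system can be passed in either form without producing an invalid entity name.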
Table E-1 RDF Metrics Reported by ASAP (continued) Information Passed to ASAP Extractor Receiver Imagetrail Purger RDFNET Updater TMF — Auxiliary Audit Index X X X — — X File Sequence Number X X X — — — X Relative Byte Address — X — — — — X RTD Time X X X — — — X Primary CPU X X X — X X1 X Backup CPU X X X — X X1 X Priority X X X — X X1 X 1 2 468 Monitor Only in an RDF Network environment Only reported by the master receiver where the master i