RDF System Management Manual
Table Of Contents
- RDF System Management Manual
- What’s New in This Manual
- About This Manual
- 1 Introducing RDF
- RDF Subsystem Overview
- RDF Processes
- RDF Operations
- Reciprocal and Chain Replication
- Available Types of Replication to Multiple Backup Systems
- Triple Contingency
- Loopback Configuration (Single System)
- Online Product Initialization
- Online Database Synchronization
- Online Dumps
- Subvolume- and File-Level Replication
- Shared Access DDL Operations
- EMS Support
- SMF Support
- RTD Warning Thresholds
- Process-Lockstep Operation
- Support for Network Transactions
- RDF and NonStop SQL/MX
- Zero Lost Transactions (ZLT)
- Monitoring RDF Entities With ASAP
- 2 Preparing the RDF Environment
- 3 Installing and Configuring RDF
- 4 Operating and Monitoring RDF
- 5 Managing RDF
- Recovering From File System Errors
- Handling Disk Space Problems
- Responding to Operational Failures
- Stopping RDF
- Restarting RDF
- Carrying Out a Planned Switchover
- Takeover Operations
- Reading the Backup Database
- Access to Backup Databases in a Consistent State
- RDF and NonStop SQL/MP DDL Operations
- RDF and NonStop SQL/MX Operations
- Backing Up Image Trail Files
- Making Online Dumps With Updaters Running
- Doing FUP RELOAD Operations With Updaters Running
- Exception File Optimization
- Switching Disks on Updater UPDATEVOLUMES
- 6 Maintaining the Databases
- 7 Online Database Synchronization
- 8 Entering RDFCOM Commands
- 9 Entering RDFSCAN Commands
- 10 Triple Contingency
- 11 Subvolume- and File-Level Replication
- 12 Auxiliary Audit Trails
- 13 Network Transactions
- Configuration Changes
- RDF Network Control Files
- Normal RDF Processing Within a Network Environment
- RDF Takeovers Within a Network Environment
- Takeover Phase 1 – Local Undo
- Takeover Phase 2 – File Undo
- Takeover Phase 3 – Network Undo
- Takeover Phase 3 Performance
- Communication Failures During Phase 3 Takeover Processing
- Takeover Delays and Purger Restarts
- Takeover Restartability
- Takeover and File Recovery
- The Effects of Undoing Network Transactions
- Takeover and the RETAINCOUNT Value
- Network Configurations and Shared Access NonStop SQL/MP DDL Operations
- Network Validation and Considerations
- RDF Re-Initialization in a Network Environment
- RDF Networks and ABORT or STOP RDF Operations
- RDF Networks and Stop-Update-to-Time Operations
- Sample Configurations
- RDFCOM STATUS Display
- 14 Process-Lockstep Operation
- Starting a Lockstep Operation
- The DoLockstep Procedure
- The Lockstep Transaction
- RDF Lockstep File
- Multiple Concurrent Lockstep Operations
- The Lockstep Gateway Process
- Disabling Lockstep
- Reenabling Lockstep
- Lockstep Performance Ramifications
- Lockstep and Auxiliary Audit Trails
- Lockstep and Network Transactions
- Lockstep Operation Event Messages
- 15 NonStop SQL/MX and RDF
- Including and Excluding SQL/MX Objects
- Obtaining ANSI Object Names From Updater Event Messages
- Creating NonStop SQL/MX Primary and Backup Databases from Scratch
- Creating a NonStop SQL/MX Backup Database From an Existing Primary Database
- Online Database Synchronization With NonStop SQL/MX Objects
- Offline Synchronization for a Single Partition
- Online Synchronization for a Single Partition
- Correcting Incorrect NonStop SQL/MX Name Mapping
- Consideration for Creating Backup Tables
- Restoring to a Specific Location
- Comparing NonStop SQL/MX Tables
- 16 Zero Lost Transactions (ZLT)
- A RDF Command Summary
- B Additional Reference Information
- C Messages
- D Operational Limits
- E Using ASAP
- Index
Introducing RDF
HP NonStop RDF System Management Manual—524388-003
Reciprocal and Chain Replication
Consider the following example. Assume that Primary DB 1 and Backup DB 2 are both
located on $DATA on \A, and assume that Primary DB 2 and Backup DB 1 are also
located on $DATA on \B. Using the reciprocal example, suppose your application does
an update on \A to Primary DB 1. The extractor of RDF Subsystem 1 sees that the
update was for $DATA and sends that update to \B where the updater applies that
update to Backup DB 1. This update generates an audit record that goes into the audit
trail on \B and is marked as updater-generated. The extractor for RDF Subsystem 2
reads the audit trail looking for audit associated with $DATA. When it reads the record
generated by the updater, it sees the update was associated with $DATA, but it also
sees that the record was updater-generated, which causes the extractor to filter that
record out and not send it to \A. This is correct and desired behavior.
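The extractor's filtering decision described above can be modeled as a simple predicate. The following is a minimal Python sketch, not actual RDF code; the record fields and function names are illustrative:

```python
from dataclasses import dataclass

@dataclass
class AuditRecord:
    volume: str              # volume the audit record applies to, e.g. "$DATA"
    updater_generated: bool  # True if an RDF updater produced this audit

def extractor_ships(record: AuditRecord, protected_volume: str) -> bool:
    """Ship a record only if it is for the protected volume and was
    not generated by an updater (avoiding a replication loop)."""
    return record.volume == protected_volume and not record.updater_generated

# The application's update on \A, shipped to \B by RDF Subsystem 1:
app_update = AuditRecord(volume="$DATA", updater_generated=False)
# The audit written on \B when the updater applies that update:
updater_audit = AuditRecord(volume="$DATA", updater_generated=True)

print(extractor_ships(app_update, "$DATA"))     # True: extractor 1 ships it
print(extractor_ships(updater_audit, "$DATA"))  # False: extractor 2 filters it
```

Both conditions matter: the volume match selects the audit of interest, and the updater-generated test is what breaks the loop in this shared-volume configuration.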
If an updater transaction aborts, the TMF Backout process executes undo for the
aborted transaction, and Backout has no information about what process generated
the original audit for the transaction before it aborted. This can corrupt your primary
and backup databases unless you take appropriate steps (see below).
Consider the following extension to the example above. After the updater on \B has
replicated the application's update from \A, but before the updater can commit its
transaction on \B, a CPU failure causes TMF to abort the transaction. Backout undoes
the updater's update. The resulting audit record is associated with $DATA, but Backout
does not know which process generated the original update, and the resulting record is
not marked as updater-generated. When the extractor for RDF Subsystem 2 reads
this record generated by Backout, it sees it was for $DATA and it sees that the record
was not updater generated. It therefore sends this record to \A. Now, when the
updater for RDF Subsystem 2 on \A applies this record to Primary DB 1, it thereby
backs out the committed update of your application. Additionally, Primary DB 1 and
Backup DB 1 are no longer in synch. Even though the updater on \B had its transaction
aborted, that updater will re-apply the application's update to Backup DB 1. When done,
Primary DB 1 no longer has the update, but Backup DB 1 does.
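The failure mode comes down to the updater-generated marking: Backout cannot set it on its undo audit, so that audit passes the extractor's filter. A minimal Python sketch of this decision follows; the record fields and the filter predicate are illustrative models, not actual RDF structures:

```python
def extractor_ships(record: dict, protected_volume: str) -> bool:
    # Ship only audit for the protected volume that was not updater-generated.
    return record["volume"] == protected_volume and not record["updater_generated"]

# Backout undoes the updater's change on \B, but it has no way to mark
# the resulting audit as updater-generated:
backout_undo = {"volume": "$DATA", "updater_generated": False}

# Extractor 2 sees a $DATA record with no updater marking, so it ships the
# undo back to \A, where it backs out the application's committed update.
print(extractor_ships(backout_undo, "$DATA"))  # True: the undo leaks back
```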
Note that, although this example describes a reciprocal configuration, the same basic
problem can happen with chain replication. In the chain case, the extractor for RDF
Subsystem 2 would be sending a Backout-generated update to \C where the file or
table involved in the update does not even exist. This will cause the updater
responsible for $DATA on \C to stall, waiting for you to create the file or table on \C.
The same problem occurs in reciprocal or chain environments in which the
REPLICATEPURGE attribute is also set. In this case,
the updater purges the file through the file system, and the resulting audit record does
not indicate that it was generated by an updater. If the extractor sends the audit record
for the purge to its backup system, the updater might purge a file you do not want
purged, or it might encounter an error 11.
To prevent these problems in a reciprocal configuration or chain configuration, you
must ensure that Backup DB 1 and Primary DB 2 are on mutually exclusive volumes.
For example, put Primary DB 1 and Backup DB 1 on $DATA1, and put Primary DB 2
and Backup DB 2 on $DATA2. The extractor can then filter out the audit by volume
name and need not depend on records being marked as updater-generated.
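With mutually exclusive volumes, the extractor's volume test alone is sufficient, because no record Backout generates for the other subsystem's volume can match. A small illustrative sketch (hypothetical names; the predicate models the extractor's filtering decision, assuming each extractor protects only its configured volume):

```python
def extractor_ships(record: dict, protected_volume: str) -> bool:
    # Same filter as always: volume match, and not updater-generated.
    return record["volume"] == protected_volume and not record["updater_generated"]

# Corrected layout: Backup DB 1 is on $DATA1; Primary DB 2 is on $DATA2.
# Extractor 2 on \B protects only $DATA2.
backout_undo = {"volume": "$DATA1", "updater_generated": False}
app_update_2 = {"volume": "$DATA2", "updater_generated": False}

print(extractor_ships(backout_undo, "$DATA2"))   # False: filtered by volume alone
print(extractor_ships(app_update_2, "$DATA2"))   # True: legitimate audit still ships
```

The volume check now does the work even when the updater-generated marking is absent, which is exactly why the manual requires the mutually exclusive volume layout.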