RDF System Management Manual
Table Of Contents
- RDF System Management Manual
- What’s New in This Manual
- About This Manual
- 1 Introducing RDF
- RDF Subsystem Overview
- RDF Processes
- RDF Operations
- Reciprocal and Chain Replication
- Available Types of Replication to Multiple Backup Systems
- Triple Contingency
- Loopback Configuration (Single System)
- Online Product Initialization
- Online Database Synchronization
- Online Dumps
- Subvolume- and File-Level Replication
- Shared Access DDL Operations
- EMS Support
- SMF Support
- RTD Warning Thresholds
- Process-Lockstep Operation
- Support for Network Transactions
- RDF and NonStop SQL/MX
- Zero Lost Transactions (ZLT)
- Monitoring RDF Entities With ASAP
- 2 Preparing the RDF Environment
- 3 Installing and Configuring RDF
- 4 Operating and Monitoring RDF
- 5 Managing RDF
- Recovering From File System Errors
- Handling Disk Space Problems
- Responding to Operational Failures
- Stopping RDF
- Restarting RDF
- Carrying Out a Planned Switchover
- Takeover Operations
- Reading the Backup Database
- Access to Backup Databases in a Consistent State
- RDF and NonStop SQL/MP DDL Operations
- RDF and NonStop SQL/MX Operations
- Backing Up Image Trail Files
- Making Online Dumps With Updaters Running
- Doing FUP RELOAD Operations With Updaters Running
- Exception File Optimization
- Switching Disks on Updater UPDATEVOLUMES
- 6 Maintaining the Databases
- 7 Online Database Synchronization
- 8 Entering RDFCOM Commands
- 9 Entering RDFSCAN Commands
- 10 Triple Contingency
- 11 Subvolume- and File-Level Replication
- 12 Auxiliary Audit Trails
- 13 Network Transactions
- Configuration Changes
- RDF Network Control Files
- Normal RDF Processing Within a Network Environment
- RDF Takeovers Within a Network Environment
- Takeover Phase 1 – Local Undo
- Takeover Phase 2 – File Undo
- Takeover Phase 3 – Network Undo
- Takeover Phase 3 Performance
- Communication Failures During Phase 3 Takeover Processing
- Takeover Delays and Purger Restarts
- Takeover Restartability
- Takeover and File Recovery
- The Effects of Undoing Network Transactions
- Takeover and the RETAINCOUNT Value
- Network Configurations and Shared Access NonStop SQL/MP DDL Operations
- Network Validation and Considerations
- RDF Re-Initialization in a Network Environment
- RDF Networks and ABORT or STOP RDF Operations
- RDF Networks and Stop-Update-to-Time Operations
- Sample Configurations
- RDFCOM STATUS Display
- 14 Process-Lockstep Operation
- Starting a Lockstep Operation
- The DoLockstep Procedure
- The Lockstep Transaction
- RDF Lockstep File
- Multiple Concurrent Lockstep Operations
- The Lockstep Gateway Process
- Disabling Lockstep
- Reenabling Lockstep
- Lockstep Performance Ramifications
- Lockstep and Auxiliary Audit Trails
- Lockstep and Network Transactions
- Lockstep Operation Event Messages
- 15 NonStop SQL/MX and RDF
- Including and Excluding SQL/MX Objects
- Obtaining ANSI Object Names From Updater Event Messages
- Creating NonStop SQL/MX Primary and Backup Databases from Scratch
- Creating a NonStop SQL/MX Backup Database From an Existing Primary Database
- Online Database Synchronization With NonStop SQL/MX Objects
- Offline Synchronization for a Single Partition
- Online Synchronization for a Single Partition
- Correcting Incorrect NonStop SQL/MX Name Mapping
- Consideration for Creating Backup Tables
- Restoring to a Specific Location
- Comparing NonStop SQL/MX Tables
- 16 Zero Lost Transactions (ZLT)
- A RDF Command Summary
- B Additional Reference Information
- C Messages
- D Operational Limits
- E Using ASAP
- Index
Triple Contingency
HP NonStop RDF System Management Manual—524388-003
How Does It Work?
In general, the triple contingency feature works as follows:
- The RETAINCOUNT configuration parameter on both backup systems prevents the purger process from purging image trail files that might be needed for triple contingency recovery.
- If the primary system fails, you execute two takeovers: one on each backup system. Upon successful completion of both takeovers (signalled by a 724 message in the EMS event log of both backup systems), the databases on the two backup systems will almost certainly not be identical: one of the extractors will have been ahead of the other in its RDF processing when the failure occurred.
- Examine the EMS event log on both backup systems for a 735 message. That message, which follows the 724 message in the log, specifies the last position in the MAT (master audit trail) that was seen by the receiver process. Compare the MAT positions in the two 735 messages to determine which of the two systems was further behind in its RDF processing when the failure occurred (that is, which system had received the least audit data from the extractor by the time the primary system was lost).
- On the backup system that was further behind (had the least audit data), issue the COPYAUDIT command, specifying the name of the other backup system and its RDF control subvolume. That command copies all missing audit records from the designated system.
- Upon successful completion of the COPYAUDIT operation, perform a second takeover on that system. When the second takeover has completed successfully, initialize and configure the two backup systems as a new primary-backup pair (either system can be the primary), and then restart application processing on the new primary system.
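The decision step in the procedure above (comparing the MAT positions from the two 735 messages) can be sketched as follows. This is an illustrative sketch only, not part of RDF: in practice you read the MAT positions from the EMS event logs, and the system names and positions used here are hypothetical.

```python
# Illustrative sketch: decide where to run COPYAUDIT after both takeovers.
# MAT positions are modeled as plain integers; in practice they come from
# the 735 messages in each backup system's EMS event log.

def copyaudit_target(sys_a, mat_a, sys_b, mat_b):
    """Return (target, source): issue COPYAUDIT on `target`, naming
    `source` as the system to copy the missing audit records from.

    The system with the smaller MAT position was further behind (it had
    received less audit data when the primary was lost), so it is the one
    that must run COPYAUDIT against the other system.
    """
    if mat_a == mat_b:
        return None  # both systems saw the same audit; nothing to copy
    if mat_a < mat_b:
        return (sys_a, sys_b)
    return (sys_b, sys_a)

# Hypothetical example: \B2 saw less audit than \B1,
# so COPYAUDIT would be issued on \B2, copying from \B1.
target, source = copyaudit_target("\\B1", 501_200, "\\B2", 498_750)
```

The actual COPYAUDIT command syntax and the RDF control subvolume name to supply are covered later in this section; consult the RDFCOM command descriptions for the exact form.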
The remainder of this section discusses the hardware and software requirements, the
RETAINCOUNT parameter, and the COPYAUDIT command in detail.
Hardware Requirements
Both backup systems should have similar hardware with respect to RDF operation; in particular, the configuration of the data volumes and image trails must be identical on the two systems. It is also strongly recommended that the Expand bandwidth be the same between the primary system and each backup system, and between the two backup systems.
WARNING. To use the triple contingency feature, you must carefully follow the instructions and caveats presented in this section.