RDF System Management Manual
Table Of Contents
- RDF System Management Manual
- What’s New in This Manual
- About This Manual
- 1 Introducing RDF
- RDF Subsystem Overview
- RDF Processes
- RDF Operations
- Reciprocal and Chain Replication
- Available Types of Replication to Multiple Backup Systems
- Triple Contingency
- Loopback Configuration (Single System)
- Online Product Initialization
- Online Database Synchronization
- Online Dumps
- Subvolume- and File-Level Replication
- Shared Access DDL Operations
- EMS Support
- SMF Support
- RTD Warning Thresholds
- Process-Lockstep Operation
- Support for Network Transactions
- RDF and NonStop SQL/MX
- Zero Lost Transactions (ZLT)
- Monitoring RDF Entities With ASAP
- 2 Preparing the RDF Environment
- 3 Installing and Configuring RDF
- 4 Operating and Monitoring RDF
- 5 Managing RDF
- Recovering From File System Errors
- Handling Disk Space Problems
- Responding to Operational Failures
- Stopping RDF
- Restarting RDF
- Carrying Out a Planned Switchover
- Takeover Operations
- Reading the Backup Database
- Access to Backup Databases in a Consistent State
- RDF and NonStop SQL/MP DDL Operations
- RDF and NonStop SQL/MX Operations
- Backing Up Image Trail Files
- Making Online Dumps With Updaters Running
- Doing FUP RELOAD Operations With Updaters Running
- Exception File Optimization
- Switching Disks on Updater UPDATEVOLUMES
- 6 Maintaining the Databases
- 7 Online Database Synchronization
- 8 Entering RDFCOM Commands
- 9 Entering RDFSCAN Commands
- 10 Triple Contingency
- 11 Subvolume- and File-Level Replication
- 12 Auxiliary Audit Trails
- 13 Network Transactions
- Configuration Changes
- RDF Network Control Files
- Normal RDF Processing Within a Network Environment
- RDF Takeovers Within a Network Environment
- Takeover Phase 1 – Local Undo
- Takeover Phase 2 – File Undo
- Takeover Phase 3 – Network Undo
- Takeover Phase 3 Performance
- Communication Failures During Phase 3 Takeover Processing
- Takeover Delays and Purger Restarts
- Takeover Restartability
- Takeover and File Recovery
- The Effects of Undoing Network Transactions
- Takeover and the RETAINCOUNT Value
- Network Configurations and Shared Access NonStop SQL/MP DDL Operations
- Network Validation and Considerations
- RDF Re-Initialization in a Network Environment
- RDF Networks and ABORT or STOP RDF Operations
- RDF Networks and Stop-Update-to-Time Operations
- Sample Configurations
- RDFCOM STATUS Display
- 14 Process-Lockstep Operation
- Starting a Lockstep Operation
- The DoLockstep Procedure
- The Lockstep Transaction
- RDF Lockstep File
- Multiple Concurrent Lockstep Operations
- The Lockstep Gateway Process
- Disabling Lockstep
- Reenabling Lockstep
- Lockstep Performance Ramifications
- Lockstep and Auxiliary Audit Trails
- Lockstep and Network Transactions
- Lockstep Operation Event Messages
- 15 NonStop SQL/MX and RDF
- Including and Excluding SQL/MX Objects
- Obtaining ANSI Object Names From Updater Event Messages
- Creating NonStop SQL/MX Primary and Backup Databases from Scratch
- Creating a NonStop SQL/MX Backup Database From an Existing Primary Database
- Online Database Synchronization With NonStop SQL/MX Objects
- Offline Synchronization for a Single Partition
- Online Synchronization for a Single Partition
- Correcting Incorrect NonStop SQL/MX Name Mapping
- Consideration for Creating Backup Tables
- Restoring to a Specific Location
- Comparing NonStop SQL/MX Tables
- 16 Zero Lost Transactions (ZLT)
- A RDF Command Summary
- B Additional Reference Information
- C Messages
- D Operational Limits
- E Using ASAP
- Index
Introducing RDF
HP NonStop RDF System Management Manual—524388-003
Unplanned Outages Without ZLT
Without ZLT functionality, some committed transactions can be lost during an
unplanned outage. When the RDF TAKEOVER command is issued, any transaction
whose final outcome is unknown on the backup system is backed out of the
backup database. One or more transactions might have committed on the primary
system, but the primary system failed before the extractor could read the
associated audit data and send it to the backup system. Audit data lost in
this manner typically amounts to no more than a fraction of a second's worth.
If the primary system is unexpectedly brought down because of a disaster, the
outcome of some transactions might never be known, as illustrated in Table 1-1.
In the example illustrated in Table 1-1, a disaster has brought down the primary system
immediately after the commit record for transaction 100 was written to the MAT, but
before the RDF extractor process was able to send the commit record to the backup
system. For transaction 101, a single update was logged in the MAT and sent to the
backup system, but the primary system was brought down before the transaction was
completed.
When the TAKEOVER command is issued, the updater processes treat all
transactions whose outcomes are unknown as aborted transactions. Only the
changes belonging to transactions known with certainty to have committed on
the primary system remain in the backup database. Therefore, in the example
illustrated in Table 1-1, the audit information associated with transactions
100 and 101 is backed out of the backup database.
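The takeover decision described above can be modeled as a simple rule: a transaction's changes are kept only if its commit record reached the backup image trail. The following is an illustrative sketch of that rule, not actual RDF code; the record format and the function name `resolve_takeover` are invented for the example.

```python
# Illustrative model of the RDF takeover undo decision (not actual RDF code).
# At takeover, updaters keep only the changes of transactions whose commit
# record reached the backup image trail; all others are treated as aborted.

def resolve_takeover(image_trail):
    """Given a list of (tx_id, record_type) image-trail records, return
    (kept, undone): the sets of transaction IDs whose changes are kept
    and backed out, respectively."""
    committed = {tx for tx, rec in image_trail if rec == "commit"}
    touched = {tx for tx, rec in image_trail}
    return committed, touched - committed

# The Table 1-1 scenario: TRANS100's ten updates arrived at the backup but
# its commit record did not; TRANS101 sent one update and never completed.
trail = [("TRANS100", "update")] * 10 + [("TRANS101", "update")]
kept, undone = resolve_takeover(trail)
# Neither commit record reached the backup, so both transactions are undone.
```

Had the commit record for TRANS100 arrived before the failure, only TRANS101 would be backed out.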
Typically, the extractor process sends audit information to the backup system
within a second of its being written to the MAT on the primary system, so very
few transactions are lost when a disaster brings down the primary system.
Table 1-1. Audit Information At the Time of a Primary System Failure

| Primary database updates (sequence in master audit trail file) | Updates sent to the backup (sequence in image trail file) |
|---|---|
| TRANS100—Update 1 | TRANS100—Update 1 |
| TRANS100—Update 2 | TRANS100—Update 2 |
| … | … |
| TRANS100—Update 10 | TRANS100—Update 10 |
| TRANS101—Update 1 | TRANS101—Update 1 |
| TRANS100—Commit record (primary system fails) | |