RDF System Management Manual
Operating and Monitoring RDF
HP NonStop RDF System Management Manual—524388-003
Displaying Current Configuration Parameters and Operating Statistics
• Error indicates whether a process has experienced an error. If the column is
blank, no error has occurred. If the column for an updater contains asterisks
(*****), the updater has experienced a critical error. If the updater is doing an
undo pass, the word undo appears in the Error column. If RDFCOM cannot reach a
particular process, the Error column for that process contains the applicable file
system error number.
The occurrence of a critical error could mean that the backup database is no
longer synchronized with the primary database because of data loss. If asterisks
appear in the Error column for any RDF process, you should examine the
messages in the RDF log file or on the RDF log device to determine what is
happening and what corrective action to take.
Except for updaters, asterisks in the Error column continue to appear in every
STATUS RDF display until the error condition has been corrected.
For updaters, the asterisks disappear when the error is corrected and updating is
restarted after execution of any of the following commands:
STOP UPDATE
STOP RDF
STOP TMF
Note that although the occurrence of a critical error might mean that the primary
and backup databases are no longer synchronized with one another, that is not
always the case. If, for example, the primary CPU of the disk process goes down,
all updater processes affected by that error condition report a file system error and
then attempt to restart. If the error does not occur again when the affected updater
processes restart, the databases are probably still synchronized with one another.
In that case, the asterisks are cleared from subsequent STATUS RDF displays.
For more information on critical errors, you can scan the EMS collectors on the
primary and backup systems:
  - The EMS collector on the primary system contains log messages for the
    extractor and monitor processes.
  - The EMS collector on the backup system contains log messages for the
    receiver, purger, and all updater processes.
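The Error column values described above can be summarized as a small lookup. The following Python sketch is illustrative only and is not part of RDF or RDFCOM. The function name classify_error_column and the category strings are hypothetical, and the sketch assumes the column text has already been extracted from a STATUS RDF display by some other means.

```python
def classify_error_column(value: str) -> str:
    """Map the text of one Error column cell to a status category.

    Hypothetical helper: the categories mirror the rules in the text above,
    but the names are illustrative and not defined by RDF.
    """
    text = value.strip()
    if text == "":
        # A blank column means no error has occurred.
        return "ok"
    if set(text) == {"*"}:
        # Asterisks mean a critical error; check the RDF log or EMS collector.
        return "critical"
    if text.lower() == "undo":
        # The updater is performing an undo pass.
        return "undo-pass"
    if text.isdigit():
        # RDFCOM could not reach the process; the value is a file system error number.
        return "unreachable"
    # Anything else is outside the cases described in the text.
    return "unknown"
```

A monitoring script might call this once for each process row and raise an alert only for the "critical" and "unreachable" categories, because "undo-pass" describes normal updater activity rather than a fault.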
When RDF is not running, the STATUS RDF report indicates why. For example, the
report might indicate that the subsystem has never been started, or that it has crashed.
The report also indicates where processing resumes in the TMF audit trail when RDF
is restarted.
If you press the BREAK key while the STATUS RDF command is executing with the
PERIOD option (which requests repeated displays at a specified interval), the
break takes effect within one second; RDFCOM does not wait until the end of the
current interval.
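One common way to obtain this kind of responsiveness is to divide the wait between displays into short slices and check for an interrupt between slices. The Python sketch below illustrates that general technique under stated assumptions. It is not RDFCOM's implementation, and the names run_periodic, show, should_stop, and slice_seconds are hypothetical.

```python
import time


def run_periodic(show, period, should_stop, sleep=time.sleep, slice_seconds=1.0):
    """Call show() once per period until should_stop() returns True.

    The wait between displays is split into slices of at most slice_seconds,
    so a stop request is noticed within about one slice instead of at the
    end of the full period. Returns the number of displays produced.
    """
    displays = 0
    while not should_stop():
        show()
        displays += 1
        remaining = float(period)
        while remaining > 0:
            if should_stop():
                # Stop was requested mid-interval: return without finishing the wait.
                return displays
            step = min(slice_seconds, remaining)
            sleep(step)
            remaining -= step
    return displays
```

With a 60-second period and the default one-second slice, a stop request raised three seconds into the wait ends the loop after roughly three seconds, not sixty. The sleep parameter is injectable only so that the behavior can be exercised without real delays.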