RDF System Management Manual for J-series and H-series RVUs (RDF 1.10)
• Failure of a TMF audited volume on the primary system
• TMF subsystem failure after which the TMF volume recovery is successful
• TMF file recovery operation on the primary system that is not to a timestamp, first purge, or
TOMATPOSITION
• TMF ABORT TRANSACTION with the AVOIDHANGING option on the primary system
RDF cannot recover from the following events:
• TMF file recovery operation to a timestamp, first purge, or TOMATPOSITION on the primary
system.
• TMF subsystem failure after which TMF cannot perform a successful volume recovery operation
After a TMF file recovery to a timestamp, first purge, or TOMATPOSITION, or after a TMF subsystem
failure for which volume recovery cannot succeed, the databases or the affected files on the primary
and backup systems must be resynchronized.
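The two lists above amount to a simple decision rule: an event either lets RDF recover on its own
or forces a resynchronization of the primary and backup databases. The following is a minimal
Python sketch of that rule, not part of RDF or TMF. The enumeration labels and the function name
are hypothetical and chosen only for illustration, and only the events named in this section are
covered.

```python
from enum import Enum


class TmfEvent(Enum):
    # Hypothetical labels for the events named in this section.
    # They are not RDF or TMF identifiers.
    AUDITED_VOLUME_FAILURE = "failure of a TMF audited volume"
    SUBSYSTEM_FAILURE_VOLUME_RECOVERY_OK = "TMF failure, volume recovery succeeds"
    SUBSYSTEM_FAILURE_VOLUME_RECOVERY_FAILED = "TMF failure, volume recovery fails"
    FILE_RECOVERY_OTHER = "file recovery not to a timestamp, first purge, or TOMATPOSITION"
    FILE_RECOVERY_TO_TIMESTAMP = "file recovery to a timestamp"
    FILE_RECOVERY_TO_FIRST_PURGE = "file recovery to first purge"
    FILE_RECOVERY_TOMATPOSITION = "file recovery to TOMATPOSITION"
    ABORT_TRANSACTION_AVOIDHANGING = "ABORT TRANSACTION with AVOIDHANGING"


# Events after which the databases or affected files must be resynchronized.
RESYNC_REQUIRED = frozenset(
    {
        TmfEvent.SUBSYSTEM_FAILURE_VOLUME_RECOVERY_FAILED,
        TmfEvent.FILE_RECOVERY_TO_TIMESTAMP,
        TmfEvent.FILE_RECOVERY_TO_FIRST_PURGE,
        TmfEvent.FILE_RECOVERY_TOMATPOSITION,
    }
)


def requires_resync(event: TmfEvent) -> bool:
    """Return True if RDF cannot recover from the event on its own."""
    return event in RESYNC_REQUIRED
```

For example, requires_resync(TmfEvent.FILE_RECOVERY_TO_TIMESTAMP) returns True, while
requires_resync(TmfEvent.AUDITED_VOLUME_FAILURE) returns False.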
Communication Line Failures
RDF can recover from communication line failures. When the extractor detects that a communication
line to the backup system is down, it reports the error to the EMS event log. The extractor attempts
to resend data every minute until the line to the backup system is reenabled.
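The retry behavior described above can be pictured as a fixed-interval retry loop. The following
Python sketch only illustrates that pattern and is not the extractor's implementation. The function
name, its parameters, and the choice to report the failure once (on the first failed attempt) are
assumptions made for this example.

```python
import time
from typing import Callable


def resend_until_delivered(
    send: Callable[[], bool],
    report: Callable[[str], None],
    retry_interval_seconds: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call send() until it returns True and return the number of failed attempts.

    send   -- hypothetical callable that returns True when the data is delivered
    report -- hypothetical callable that stands in for logging an event
    sleep  -- injectable so the loop can be exercised without real delays
    """
    failures = 0
    while not send():
        if failures == 0:
            # Assumption: the failure is reported once, when it is first detected.
            report("communication line to the backup system is down")
        failures += 1
        # Wait one interval (one minute by default) before the next attempt.
        sleep(retry_interval_seconds)
    return failures
```

Passing a recording function for sleep and a send function that fails twice before succeeding
makes the loop return 2 after two one-minute waits and a single report.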
Unless you are running the RDF/ZLT product, a communication line failure leads to the loss of
committed transactions if you also lose your primary system and must perform an RDF Takeover
operation before the extractor has caught up. This risk is eliminated with the RDF/ZLT product
and a proper configuration for CommitHold. For further details, see “Zero Lost Transactions
(ZLT)” (page 320).
If you stop RDF on the primary system when the communication line to the backup system is down,
the monitor tries to send a stop message to the processes on the backup system and reports that
the line is down. All of the processes on the backup system continue to run until a STOP RDF
command is issued at the backup system.
NOTE: If you issue a STOP RDF command on the primary or backup system while the network
is down, you must also issue a STOP RDF command on the other system while the network is still
down.
If you have an RDF network running and the Network Master's RDFNET process encounters a
communication line failure while attempting to perform a network transaction on another primary
node in the RDF network, the failure can increase the amount of work to be performed during an
RDF Takeover operation. Once the communication line comes back up and the RDFNET process
resumes its network transactions, that additional takeover work is no longer needed.
System Failures
If you lose your primary system and you can recover it without having to perform an RDF Takeover
operation, then no special recovery is required for RDF. After you have restarted your primary
system, restart RDF before you restart your applications.
If you lose your primary system and you need to restart your applications as quickly as possible,
then perform the RDF Takeover operation on your backup system. The tasks you need to perform
after the RDF Takeover are described later in this manual. If you can eventually recover your
primary system, a later discussion also explains how to recover the database on that system and
bring it into synchronization with the database on your backup system, where your applications
are now running.
If you lose your backup system, you only need to recover it and then restart RDF on your primary
system as quickly as possible. If the communications line to your backup system has sufficient
bandwidth, then RDF can catch up very quickly.
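To see why bandwidth matters for catching up, consider a rough estimate: the backlog shrinks only
at the rate by which the line's throughput exceeds the rate at which new audit data is generated.
The following Python sketch expresses that arithmetic. It is a simplified model built on
assumptions (the line is the only bottleneck, both rates are constant, protocol overhead is
ignored), and the function and parameter names are hypothetical.

```python
import math


def catch_up_seconds(
    backlog_bytes: float,
    line_bytes_per_second: float,
    audit_bytes_per_second: float,
) -> float:
    """Estimate the time needed to drain a backlog of unsent audit data.

    Simplified model: the backlog drains at (line rate - audit generation rate).
    Returns math.inf when the line cannot outpace new audit generation.
    """
    if backlog_bytes < 0 or line_bytes_per_second < 0 or audit_bytes_per_second < 0:
        raise ValueError("all arguments must be non-negative")
    if backlog_bytes == 0:
        return 0.0
    surplus = line_bytes_per_second - audit_bytes_per_second
    if surplus <= 0:
        # The backlog never shrinks under this model.
        return math.inf
    return backlog_bytes / surplus
```

For example, a 600 MB backlog on a line that carries 10 MB per second while applications generate
4 MB of audit data per second drains in an estimated 100 seconds under this model.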