RDF System Management Manual for J-series and H-series RVUs (RDF 1.10)

• Failure of a TMF audited volume on the primary system

• TMF subsystem failure after which the TMF volume recovery is successful

• TMF file recovery operation on the primary system that is not to a timestamp, first purge, or TOMATPOSITION position.

• TMF ABORT TRANSACTION with the AVOIDHANGING option on the primary system

RDF cannot recover from the following events:

• TMF file recovery operation to a timestamp, first purge, or TOMATPOSITION on the primary system.

• TMF subsystem failure after which TMF cannot perform a successful volume recovery operation

After a TMF file recovery to a timestamp, first purge, or TOMATPOSITION, or after a TMF subsystem failure for which volume recovery cannot succeed, the databases or the affected files on the primary and backup systems must be resynchronized.
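
The distinction above can be captured as a small lookup. The following is a minimal sketch in Python, not part of RDF or TMF: the enum member names and the function `needs_resynchronization` are hypothetical labels for the events listed in this section only, chosen for illustration.

```python
from enum import Enum, auto


class TmfEvent(Enum):
    """Hypothetical labels for the TMF events listed above (not TMF identifiers)."""
    AUDITED_VOLUME_FAILURE = auto()
    SUBSYSTEM_FAILURE_VOLUME_RECOVERY_OK = auto()
    SUBSYSTEM_FAILURE_VOLUME_RECOVERY_FAILED = auto()
    FILE_RECOVERY_OTHER = auto()
    FILE_RECOVERY_TO_TIMESTAMP = auto()
    FILE_RECOVERY_TO_FIRST_PURGE = auto()
    FILE_RECOVERY_TO_MAT_POSITION = auto()
    ABORT_TRANSACTION_AVOIDHANGING = auto()


# Events after which the primary and backup databases (or the affected
# files) must be resynchronized, per the text above.
NEEDS_RESYNC = frozenset({
    TmfEvent.FILE_RECOVERY_TO_TIMESTAMP,
    TmfEvent.FILE_RECOVERY_TO_FIRST_PURGE,
    TmfEvent.FILE_RECOVERY_TO_MAT_POSITION,
    TmfEvent.SUBSYSTEM_FAILURE_VOLUME_RECOVERY_FAILED,
})


def needs_resynchronization(event: TmfEvent) -> bool:
    """Return True if RDF cannot recover from the event on its own."""
    return event in NEEDS_RESYNC
```

An operations checklist could use such a table to decide whether a resynchronization must be scheduled after a TMF incident on the primary system.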

Communication Line Failures

RDF can recover from communication line failures. When the extractor detects that a communication line to the backup system is down, it reports the error to the EMS event log. The extractor attempts to resend data every minute until the line to the backup system is reenabled.
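
The retry behavior described above follows a common fixed-interval pattern. The sketch below illustrates that pattern in Python; it is not the extractor's implementation. The names `resend_until_line_up`, `send`, and `log_event` are hypothetical, and logging the outage only once is an assumption made for the example.

```python
import time
from typing import Callable


def resend_until_line_up(
    send: Callable[[], bool],
    log_event: Callable[[str], None],
    retry_interval_s: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call send() until it succeeds; return the number of failed attempts.

    send() is assumed to return False while the line is down.
    """
    failures = 0
    while not send():
        if failures == 0:
            # Assumption: report the outage once, when it is first detected.
            log_event("communication line to backup system is down")
        failures += 1
        # Wait one interval (one minute by default) before resending.
        sleep(retry_interval_s)
    return failures
```

Passing `sleep` as a parameter lets the loop be exercised without real delays, for example by supplying a function that only records the requested intervals.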

Unless you are running the RDF/ZLT product, a communication line failure leads to the loss of committed transactions if you also lose your primary system and must perform an RDF Takeover operation before the extractor has been able to catch up. This risk is eliminated with the RDF/ZLT product and a proper configuration for CommitHold. For further details, see “Zero Lost Transactions (ZLT)” (page 320).

If you stop RDF on the primary system when the communication line to the backup system is down, the monitor tries to send a stop message to the processes on the backup system and reports that the line is down. All of the processes on the backup system continue to run until a STOP RDF command is issued at the backup system.

NOTE: If you issue a STOP RDF command on the primary or backup system while the network is down, you must also issue a STOP RDF command on the other system while the network is still down.

If you have an RDF network running and the Network Master's RDFNET process encounters a communication line failure when attempting to perform a network transaction on another primary node in the RDF network, the failure can increase the work to be performed during an RDF Takeover operation. Once the communication line comes back up and the RDFNET process can resume its network transactions, that need for increased takeover work is eliminated.

System Failures

If you lose your primary system and you can recover it without having to perform an RDF Takeover operation, then no special recovery is required for RDF. After you have restarted your primary system, restart RDF before you restart your applications.

If you lose your primary system and you need to restart your applications as quickly as possible, then perform the RDF Takeover operation on your backup system. Details of the various tasks you need to do after the RDF Takeover are provided further below. Additionally, if you can eventually recover your primary system, a discussion is provided further below on how you can recover the database on that system and bring it into synchronization with the database on your backup system, where your applications are now running.

If you lose your backup system, you only need to recover it and then restart RDF on your primary system as quickly as possible. If the communications line to your backup system has sufficient bandwidth, then RDF can catch up very quickly.
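
The three system failure cases above reduce to a short decision procedure. The following Python sketch restates them; the names `Action` and `action_after_system_failure` are hypothetical and serve only to summarize the text, not to represent any RDF interface.

```python
from enum import Enum


class Action(Enum):
    """Hypothetical summary labels for the responses described above."""
    RESTART_RDF_THEN_APPLICATIONS = "restart RDF, then restart applications"
    TAKEOVER_ON_BACKUP = "perform RDF Takeover on the backup system"
    RECOVER_BACKUP_THEN_RESTART_RDF = "recover backup, then restart RDF on primary"


def action_after_system_failure(failed_system: str,
                                takeover_needed: bool = False) -> Action:
    """Map a lost system to the response described in this section.

    failed_system is "primary" or "backup". takeover_needed applies only to
    a primary failure: True when applications must be restarted as quickly
    as possible on the backup system.
    """
    if failed_system == "backup":
        return Action.RECOVER_BACKUP_THEN_RESTART_RDF
    if failed_system == "primary":
        if takeover_needed:
            return Action.TAKEOVER_ON_BACKUP
        return Action.RESTART_RDF_THEN_APPLICATIONS
    raise ValueError(f"unknown system: {failed_system!r}")
```

For example, `action_after_system_failure("primary", takeover_needed=True)` returns `Action.TAKEOVER_ON_BACKUP`.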
