RDF System Management Manual for H-Series RVUs (RDF 1.8)
Takeover Phase 2 – File Undo
This undo phase runs only if volumes went down on the primary system, transactions
were aborted, and the volumes were never reenabled on the primary system before the primary
system was lost. In that situation, RDF determines what Backout could not undo and performs
that undo.
Takeover Phase 3 – Network Undo
With RDF/ZLT, no undo operations are performed during the network undo phase because no
committed transactions are undone.
Phase 3 determines whether network transaction data is missing from any of the backup systems
in the RDF network, and marks the affected transactions to be undone on all of the systems
involved. For example, suppose you began a network transaction, updated tables on ten different
systems, and then committed the transaction. Now suppose that nine of the ten systems were able
to transmit their updates and commit records to their backup systems, but the tenth primary
system went down before its extractor was able to do so. Phase 3 determines that the transaction
involved all ten databases and that one of the backup databases is missing audit data for that
transaction, and it identifies the transaction as one that must be undone on the other nine
systems (on the tenth system, it is undone during phase 1). All of the updaters then look for
audit data associated with the transaction, and undo it.
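The ten-system example above can be sketched in a few lines of Python. This is only an illustration of the decision being described, not RDF code: the function name, the system names (SYS1 through SYS10), the transaction IDs, and the two dictionaries are all hypothetical, and the real purger works from audit trail data rather than in-memory sets.

```python
from typing import Dict, Set


def master_network_undo(
    participants: Dict[str, Set[str]],
    complete_on_backup: Dict[str, Set[str]],
) -> Set[str]:
    """Return the network transactions that are incomplete on some backup.

    participants:        transaction ID -> systems the transaction updated
    complete_on_backup:  system -> transaction IDs whose updates and commit
                         record reached that system's backup
    (Both structures are assumptions made for this sketch.)
    """
    undo: Set[str] = set()
    for tx_id, systems in participants.items():
        # One participating backup without the data is enough to force
        # the transaction to be undone everywhere it was replicated.
        if any(tx_id not in complete_on_backup.get(s, set()) for s in systems):
            undo.add(tx_id)
    return undo


systems = [f"SYS{n}" for n in range(1, 11)]

# TX1 touched all ten systems; TX2 touched only two of them.
participants = {"TX1": set(systems), "TX2": {"SYS1", "SYS2"}}

# Nine backups received everything; the tenth primary went down
# before its extractor could send anything.
complete_on_backup = {s: {"TX1", "TX2"} for s in systems}
complete_on_backup["SYS10"] = set()

print(master_network_undo(participants, complete_on_backup))  # {'TX1'}
```

In this sketch TX1 is marked for undo because the backup of SYS10 never received it, while TX2 is left alone because SYS10 did not participate in it.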
More specifically, each purger process has three phases of work to do:
1. produce the local undo list in the ZTXUNDO file
2. produce the file undo list, if required
3. produce the network undo list in the ZNETUNDO file
The purger of the network master determines what network transactions are incomplete across
the different backup systems, and it produces the master network undo list. Each purger then
uses this master list to ascertain the transaction data that must be undone on its backup database.
For example, if a network transaction involved only four of the ten primary systems in an RDF
network, then that transaction only needs to be undone on the backup databases where that data
was replicated. Because the other systems were not involved, the transaction does not need to
be listed there. The list of network transactions that need to be undone on a specific system
resides in its ZNETUNDO file.
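The way each purger narrows the master list down to its own system can be sketched as a simple filter. The names below (local_network_undo, the SYS1 through SYS10 labels, TX7) are hypothetical and chosen only for illustration; the sketch does not reflect the actual format of the ZNETUNDO file or of the master network undo list.

```python
from typing import Dict, List, Set


def local_network_undo(
    master_undo: Set[str],
    participants: Dict[str, Set[str]],
    system: str,
) -> List[str]:
    """Return the entries of the master list that apply to one system.

    A transaction is listed for a system only if that system took part
    in it; systems that were not involved have nothing to undo.
    """
    return sorted(
        tx_id
        for tx_id in master_undo
        if system in participants.get(tx_id, set())
    )


# A network of ten systems in which TX7 involved only four of them.
participants = {"TX7": {"SYS1", "SYS2", "SYS3", "SYS4"}}
master_undo = {"TX7"}

print(local_network_undo(master_undo, participants, "SYS2"))  # ['TX7']
print(local_network_undo(master_undo, participants, "SYS9"))  # []
```

Here SYS2 would record TX7 in its list, whereas SYS9, which was not involved in the transaction, would record nothing.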
Takeover Phase 3 Performance
The speed with which a takeover completes for an entire RDF network varies based on the
number of systems in the network and how far any system had fallen behind when the takeover
was initiated.
For example, suppose you have three systems in your RDF network, the extractors on all three
systems were keeping up with audit generation, and then one system fails. In that case, phase 3
takeover processing might add only a modest number of seconds to the takeover.
In contrast, if you have three systems in your RDF network, and one extractor had fallen 60
minutes behind at the time its system went down, then phase 3 takeover processing on the other
two systems will take many more seconds to complete. The reason is that phase 3 processing on
the two systems that were not behind must go through 60 minutes of data to determine what must
be undone because of data missing on the system that had fallen behind.
A variation of the first example is that no extractors have fallen behind, but you have 25 systems
in your RDF network. In such a case, phase 3 processing might take many additional seconds
because data must be checked for so many different systems in order to determine what network
data might be missing from the various systems in the RDF network.










