RDF System Management Manual for J-series and H-series RVUs (RDF 1.10)

Usage Guidelines
The TAKEOVER command is customarily issued when the primary system fails or otherwise becomes
unavailable, and you want to make the backup database your new database of record for your
applications.
CAUTION: The TAKEOVER command is not a normal operational command. Operators should
never issue this command strictly on their own initiative. Issue this command only when specifically
told to do so by someone in high authority.
For a thorough discussion of the issues you need to plan for to move your application processing
to the backup system as quickly as possible, see “How to Plan for the Fastest Movement of Business
Operations to Your Backup System After Takeover” (page 135) in Chapter 5. The TAKEOVER command
itself normally takes only a few seconds; the other considerations and tasks are what delay moving
your applications to the backup system. With advance planning, RDF customers have been able to
recover from loss of the primary system and resume operations on the backup system within a small
number of minutes.
For takeover considerations in a ZLT environment, see Chapter 17 (page 320).
If RDF is running with Update On in a non-ZLT environment, then RDFCOM sends a takeover
message to each RDF process on the backup system, and an RDF monitor is not started.
If RDF is running with updating off, RDFCOM stops the receiver and purger processes and starts
the monitor in takeover mode. The monitor then starts the receiver and purger processes and all
updater processes.
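The two non-ZLT cases described above can be summarized in a small illustrative model. This is only a sketch in Python, not RDFCOM code: the function name takeover_actions and the step strings are hypothetical, and the model does not cover ZLT environments.

```python
from __future__ import annotations


def takeover_actions(update_on: bool) -> list[str]:
    """Return the ordered steps described above for TAKEOVER (non-ZLT only).

    Hypothetical model for illustration; the step strings paraphrase the
    manual text and are not actual RDFCOM messages.
    """
    if update_on:
        # Update is on: the running RDF processes are told to take over,
        # and no RDF monitor is started.
        return ["send takeover message to each RDF process on the backup system"]
    # Update is off: RDFCOM restarts the backup-side processes under a monitor.
    return [
        "stop receiver and purger processes",
        "start monitor in takeover mode",
        "monitor starts receiver, purger, and all updater processes",
    ]


if __name__ == "__main__":
    for state in (True, False):
        print("update on" if state else "update off", "->", takeover_actions(state))
```

The point of the model is that the monitor appears only in the update-off path; with update on, the takeover is driven entirely by messages to processes that are already running.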
In a non-network configuration, a takeover operation occurs in two phases.
Phase 1 (local undo) undoes transaction data that was incomplete at the backup system at
the time the primary system failed. That is, it undoes transactions that were applied during the
redo phase but whose final states are unknown to RDF.
Phase 2 (file undo) runs only if volumes went down on the primary system, transactions were
aborted, and the volumes were never reenabled on the primary system before the primary
system was lost. In that situation, RDF determines what Backout could not undo, and performs
that undo itself.
A network configuration adds a third phase (network undo). See Chapter 14 (page 279).
For more information about undo processing during a takeover operation, see Takeover Operations
in Chapter 5 (page 113).
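As a rough illustration of which undo phases apply in a given takeover, the conditions described above can be written as a short Python sketch. The function and parameter names are hypothetical, the single flag primary_volumes_left_down stands in for the full file-undo condition (volumes down, transactions aborted, volumes never reenabled before the primary system was lost), and the sketch assumes the phases run in the order they are numbered.

```python
from __future__ import annotations


def undo_phases(primary_volumes_left_down: bool,
                network_configuration: bool = False) -> list[str]:
    """List the undo phases a takeover would include (illustrative model only)."""
    # Phase 1 is part of every takeover operation.
    phases = ["local undo"]
    # Phase 2 runs only when Backout on the primary system could not finish
    # its undo work because volumes stayed down until the primary was lost.
    if primary_volumes_left_down:
        phases.append("file undo")
    # A network configuration adds a third phase.
    if network_configuration:
        phases.append("network undo")
    return phases


if __name__ == "__main__":
    print(undo_phases(primary_volumes_left_down=False))
    print(undo_phases(primary_volumes_left_down=True, network_configuration=True))
```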
During the takeover operation, the purger produces lists that identify all transactions that must be
undone by the updaters during the three different undo phases. These are stored in structured files,
but they can be read with the READLIST utility in the RDF software's subvolume. See page 334 for
the files that are created for the three undo phases. Additionally, if you have configured RDF
UPDATEREXCEPTION ON, then each updater records information about each audit record it undoes
during the undo passes in its own exception file, thereby giving you an accurate account of what
was undone by each updater. If this attribute is off, each updater records only the first and last
audit records it has undone. Under normal conditions, the number of transactions undone by an
updater is small and writing to the exception file has no measurable cost. In some circumstances,
however, writing to the exception file can prolong the RDF takeover operation:
long-running batch transactions
If a long batch transaction that performed a large number of updates was running on your
primary system at the time the primary system failed, then all of these need to be undone by the