RDF System Management Manual for J-series and H-series RVUs (RDF 1.10)

When all of the updater processes have stopped, the purger logs either the RDF event number 724
or 725 before stopping. Event 724 indicates that the takeover completed successfully. Event 725
indicates that it did not, and you should reissue the TAKEOVER command. Event 724 is always
followed by event 735, which indicates the last MAT position seen by the receiver process. The
735 event is used primarily for triple contingency. These events will be followed by either RDF
event 888 or 858. See “Restoring the Primary System” for more information.
For RDF network takeover considerations, see Chapter 14 (page 279).
For super fast takeover, see “How to Plan for the Fastest Movement of Business Operations to Your
Backup System After Takeover” (page 135).
Takeover Failure
If a double CPU failure occurs and any RDF process pair fails during the takeover operation, you
can restart the operation by reissuing the TAKEOVER command through RDFCOM. To confirm that
a takeover operation failed, issue a STATUS RDF command. A failed takeover produces a
response such as the following:
STATUS RDF (\RDF04 -> \RDF05) is NOT running
A partial RDF TAKEOVER has completed
Also, a takeover failure generates RDF event 725 in the EMS log.
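The check described above can be scripted once the STATUS RDF response has been captured as text. The following Python fragment is a minimal sketch under stated assumptions: the function name takeover_incomplete is hypothetical, the way the response text is captured is left to you, and the marker string is taken from the sample response above, which the manual presents only as typical output, so the exact wording on your system may differ.

```python
# Marker copied from the sample STATUS RDF response in this section.
# Assumption: the wording matches what your RDF version prints.
PARTIAL_TAKEOVER_MARKER = "A partial RDF TAKEOVER has completed"


def takeover_incomplete(status_output: str) -> bool:
    """Return True if captured STATUS RDF text reports a partial takeover."""
    return any(
        PARTIAL_TAKEOVER_MARKER in line
        for line in status_output.splitlines()
    )


# Example: the sample response shown above (raw strings keep the backslashes).
sample = (
    r"STATUS RDF (\RDF04 -> \RDF05) is NOT running" "\n"
    "A partial RDF TAKEOVER has completed\n"
)
if takeover_incomplete(sample):
    print("Takeover did not complete; reissue the TAKEOVER command.")
```

A text match of this kind is only a convenience. RDF event 725 in the EMS log remains the definitive indication of a failed takeover.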
Monitor Considerations
Regardless of whether the RDF monitor was started when the initial TAKEOVER command was executed,
the monitor is always started when the TAKEOVER command is reissued.
Updater Considerations
When the purger shuts down at the end of the takeover operation, it examines the context record
of each updater process to determine if that updater has processed all applicable audit data
through to end-of-file in the image trail. If all updaters have processed through to end-of-file, the
purger logs a 724 message to the EMS event log, indicating that the takeover operation completed
successfully. If, however, the purger determines that one or more updaters terminated prematurely, it
logs RDF event 726 for the first updater that failed and then logs RDF event 725,
a general message indicating that the takeover operation did not complete successfully. If these
messages appear in the EMS event log, you must reissue the TAKEOVER command.
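The purger's decision described above can be modeled in a few lines. This Python fragment is an illustrative sketch only, not RDF code: the UpdaterContext class, its reached_eof field, the purger_events function, and the updater names are all hypothetical stand-ins for the context records the purger actually examines. Only the event numbers (724, 725, 726) and their order come from the text.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class UpdaterContext:
    """Hypothetical stand-in for one updater's context record."""
    name: str
    reached_eof: bool  # True if the updater processed its image trail to end-of-file


def purger_events(updaters: List[UpdaterContext]) -> List[Tuple[int, Optional[str]]]:
    """Return the (event number, updater name) pairs the purger would log."""
    failed = [u for u in updaters if not u.reached_eof]
    if not failed:
        # Every updater reached end-of-file: takeover completed successfully.
        return [(724, None)]
    # Event 726 names the first failed updater; 725 is the general failure message.
    return [(726, failed[0].name), (725, None)]


# Example with made-up updater names.
events = purger_events([
    UpdaterContext("$UPD1", reached_eof=True),
    UpdaterContext("$UPD2", reached_eof=False),
    UpdaterContext("$UPD3", reached_eof=False),
])
print(events)
```

In this model, any result containing event 725 corresponds to the case in which you must reissue the TAKEOVER command.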
Takeover and Triple Contingency
If you have configured two RDF subsystems for Triple Contingency, then when both takeover
operations complete you must examine the RDF event 735 on each backup system. If both report
the exact same MAT position, then you can designate either system as your new primary, configure
a new RDF system to run from this new primary to the backup, and then resume application
processing on the new primary with full RDF protection.
If each reports a different MAT position, go to the backup system with the lowest MAT position
and execute the COPYAUDIT command (see Chapter 10 for details). The COPYAUDIT command
copies over all additional audit that the other backup system has. When the command completes,
enter a new TAKEOVER command on the local system. When that takeover completes, the two databases
are in complete synchronization and you can then resume application processing on either backup
system, as indicated above.
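The comparison of the two event 735 MAT positions can be expressed as a small decision function. The Python fragment below is a sketch under stated assumptions: it assumes a MAT position can be represented as a pair of integers (an audit file sequence number and a relative byte address) that order lexicographically, which is a modeling choice and not something this section specifies. The names MatPosition and copyaudit_system, and the system names in the example, are hypothetical.

```python
from typing import NamedTuple, Optional


class MatPosition(NamedTuple):
    """Assumed representation of a MAT position; not defined by this section."""
    sequence: int  # assumed: audit trail file sequence number
    rba: int       # assumed: relative byte address within that file


def copyaudit_system(
    name_a: str, pos_a: MatPosition, name_b: str, pos_b: MatPosition
) -> Optional[str]:
    """Return the backup system on which to run COPYAUDIT, or None if not needed."""
    if pos_a == pos_b:
        # Same MAT position on both: either system can become the new primary.
        return None
    # Tuples compare field by field, so the lower position sorts first.
    return name_a if pos_a < pos_b else name_b


# Example with made-up system names and positions.
target = copyaudit_system(
    r"\BKUP1", MatPosition(12, 500),
    r"\BKUP2", MatPosition(12, 900),
)
print(target)  # the system that is behind and needs the additional audit
```

After COPYAUDIT completes on the returned system, the procedure above still requires a new TAKEOVER command on that system before the two databases are synchronized.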
Checking Exception Files for Uncommitted Transactions
Exception files are used by updaters to store information about each audit record that the updater
undoes during the three possible undo passes. An exception record logs information about a
specific audit record that the updater has undone. This may or may not be useful information for
you. If the volume of audit is small, then logging an exception record for each record undone might
have only a slight performance impact during the takeover operation. If, however, the volume of