RDF System Management Manual for J-series and H-series RVUs (RDF 1.10)

Purger Failure
If the primary CPU of the purger process fails, the purger process in the backup CPU takes over,
the current PURGETIME interval is aborted, and a new PURGETIME interval is started. When the
CPU that failed comes back up, RDF switches the purger to run on the reactivated primary CPU.
If both the primary and backup CPUs of the purger process fail, RDF aborts.
RDFNET Failure
If the primary CPU of the RDFNET process fails, the RDFNET process in the backup CPU takes over.
When the CPU that failed comes back up, RDF switches the RDFNET process to run on the
reactivated primary CPU.
If both the primary and backup CPUs of the RDFNET process fail, RDF aborts.
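The takeover rules described above for the purger and RDFNET process pairs can be summarized as a small state model. The following Python sketch is illustrative only and is not part of RDF: the class name ProcessPair and its fields are hypothetical, the restart of the PURGETIME interval is not modeled, and the behavior when only the backup CPU fails (the process keeps running on the primary CPU) is an assumption that the text above does not state.

```python
from dataclasses import dataclass


@dataclass
class ProcessPair:
    """Toy model of an RDF process pair (purger or RDFNET). Hypothetical, not RDF code."""
    name: str
    primary_up: bool = True
    backup_up: bool = True
    running_on: str = "primary"   # "primary", "backup", or "none"
    rdf_aborted: bool = False

    def cpu_down(self, cpu: str) -> None:
        """Record the failure of the "primary" or "backup" CPU."""
        self._set_cpu(cpu, up=False)

    def cpu_up(self, cpu: str) -> None:
        """Record that the "primary" or "backup" CPU has been reloaded."""
        self._set_cpu(cpu, up=True)

    def _set_cpu(self, cpu: str, up: bool) -> None:
        if cpu == "primary":
            self.primary_up = up
        elif cpu == "backup":
            self.backup_up = up
        else:
            raise ValueError(f"unknown CPU role: {cpu!r}")
        self._reassign()

    def _reassign(self) -> None:
        # Assumption: once RDF has aborted, a CPU reload alone does not revive it.
        if self.rdf_aborted:
            return
        if not self.primary_up and not self.backup_up:
            # Both CPUs of the pair failed: RDF aborts.
            self.running_on = "none"
            self.rdf_aborted = True
        elif self.primary_up:
            # The process runs on the primary CPU whenever that CPU is up.
            self.running_on = "primary"
        else:
            # The primary CPU is down: the backup process takes over.
            self.running_on = "backup"


purger = ProcessPair("purger")
purger.cpu_down("primary")   # the backup process takes over
purger.cpu_up("primary")     # the purger switches back to the primary CPU
```

The model keeps one rule in one place (_reassign), so the three cases in the text (takeover, switch back, abort) can be read directly from the branches.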
RDF State Transition Failure
Periods during which RDF or the updating process is either starting or stopping are known as RDF
state transitions. In rare instances, when a primary CPU fails while RDF is either starting or stopping,
it is possible that not all processes complete the stop or start operation.
To minimize the chance of encountering this kind of failure, avoid CPU reloads during RDF state
transitions. Furthermore, if a CPU failure does occur during a state transition, carefully review the
EMS event log for signs of incorrect behavior. If the failure occurred while RDF or the updating
facility was stopping, check the Process Pair Directory (PPD) to ensure that the appropriate RDF
processes all have stopped; if they have not, you must stop them manually.
If a state transition failure occurs during execution of a STOP RDF command and the operation
appears to be stalled, manually stop all of the RDF processes by issuing the following command
on both the primary and backup systems:
STATUS *, PROG RDF-software-loc.*, STOP
For example:
STATUS *, PROG $SYSTEM.RDF.*, STOP
If a state transition failure occurs during execution of a STOP UPDATE command and the operation
appears to be stalled, manually stop all of the RDF updaters by issuing the following command on
the backup system:
STATUS *, PROG RDF-software-loc.RDFUPDO, STOP
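CAUTION: Issue this command only if this system is the backup system for a single RDF environment.
The command stops every updater process running from the specified program location, so it also
stops the updaters of any other RDF environment that uses this system as its backup.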
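The choice of manual stop command described above depends on which operation stalled and on where the command must be issued. As a minimal sketch, the following Python helper builds the command text from the templates shown above. The function name manual_stop_commands and its parameters are hypothetical; the helper only formats strings and does not run anything on a NonStop system.

```python
def manual_stop_commands(stalled_command, software_loc, single_rdf_env_on_backup=False):
    """Return (system, TACL command) pairs for a stalled STOP RDF or STOP UPDATE.

    Hypothetical helper: software_loc is the RDF software location,
    for example "$SYSTEM.RDF".
    """
    operation = " ".join(stalled_command.upper().split())
    if operation == "STOP RDF":
        # Stop all RDF processes on both the primary and backup systems.
        command = f"STATUS *, PROG {software_loc}.*, STOP"
        return [("primary", command), ("backup", command)]
    if operation == "STOP UPDATE":
        # Stopping all updaters is safe only on a backup system that serves
        # a single RDF environment, so refuse to build the command otherwise.
        if not single_rdf_env_on_backup:
            raise ValueError(
                "unsafe unless this is the backup system for a single RDF environment"
            )
        return [("backup", f"STATUS *, PROG {software_loc}.RDFUPDO, STOP")]
    raise ValueError(f"no manual stop procedure for: {stalled_command!r}")


# Example: list where to issue each command for a stalled STOP RDF.
for system, command in manual_stop_commands("STOP RDF", "$SYSTEM.RDF"):
    print(f"{system}: {command}")
```

Raising an error for STOP UPDATE unless the caller confirms the single-environment condition mirrors the caution above, so that the check cannot be skipped by accident.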
Problems Involving TMF
TMF Audited Volume Failure
RDF can recover from a failure of a TMF audited volume on the primary or backup system. If the
volume is successfully recovered by volume recovery, then you do not have to perform any special
RDF procedures.
TMF Subsystem Failure on the Primary System
RDF can recover from a TMF failure on the primary system if the TMF volume recovery operation
is successful after the failure. To perform this recovery:
1. Stop RDF on the primary system by entering the following command through RDFCOM:
]STOP RDF
2. Restart TMF by entering the following command sequence through TMFCOM:
~DISABLE DATAVOLS *
~START TMF