RDF System Management Manual for H-Series RVUs (RDF 1.8)

When the CPU that failed comes back up, RDF switches the extractor to run on the reactivated
primary CPU.
Receiver Failure
If the primary CPU of the receiver process fails, the receiver process in the backup CPU takes
over and resynchronizes with the extractor process. The extractor process might have to resend
audit data that was generated several seconds earlier. When the CPU that failed comes back up,
RDF switches the receiver to run on the reactivated primary CPU.
Updater Failure
If the primary CPU of an updater process fails, the corresponding updater process in the backup
CPU takes over.
If both the primary and backup CPUs of an updater fail, RDF aborts. A subsequent START RDF
command restarts the process without requiring database resynchronization. To support
restartability, however, the updaters use a different mechanism than the extractor or receiver:
the updaters rely entirely on context saving rather than checkpointing. For this reason, if the
backup member of an updater process pair takes over because the CPU of the primary member
failed, the backup updater might have to start at an earlier point in the image trail and require
several minutes to reach the point where the primary process was positioned when the CPU
failed.
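The effect of relying on saved context can be illustrated with a toy model. The sketch below is not RDF code: the class UpdaterModel, the save_every interval, and the record counts are all hypothetical, and the real contents and frequency of an updater's saved context are not described here. It only shows why a backup that resumes from the last saved context starts behind the position the primary had reached.

```python
from dataclasses import dataclass


@dataclass
class UpdaterModel:
    """Toy model of an updater that saves restart context only periodically."""

    save_every: int          # hypothetical: records applied between context saves
    position: int = 0        # next image-trail record to apply
    saved_position: int = 0  # position recorded in the last saved context

    def apply_next(self) -> None:
        """Apply one image-trail record and save context at each interval."""
        self.position += 1
        if self.position % self.save_every == 0:
            self.saved_position = self.position

    def takeover(self) -> "UpdaterModel":
        """Model a backup takeover: resume from saved context, not the live position."""
        return UpdaterModel(self.save_every, self.saved_position, self.saved_position)


primary = UpdaterModel(save_every=100)
for _ in range(250):
    primary.apply_next()

# The primary's CPU fails here; the backup knows only the saved context.
backup = primary.takeover()
records_to_replay = primary.position - backup.position
print(backup.position, records_to_replay)  # 200 50
```

In this model, the gap between the live position and the saved position is the work the backup must repeat, which corresponds to the catch-up time described above.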
If the primary CPU of an updater process fails and then comes back up, RDF does not switch
the updater to run on the reactivated primary CPU. Instead, once the backup updater takes over,
it becomes (and remains) the new primary process. If you subsequently stop and then restart
updating, however, the original CPU configuration for this updater process is restored.
Purger Failure
If the primary CPU of the purger process fails, the purger process in the backup CPU takes over,
the current PURGETIME interval is aborted, and a new PURGETIME interval is started. When
the CPU that failed comes back up, RDF switches the purger to run on the reactivated primary
CPU.
If both the primary and backup CPUs of the purger process fail, RDF aborts.
RDFNET Failure
If the primary CPU of the RDFNET process fails, the RDFNET process in the backup CPU takes
over. When the CPU that failed comes back up, RDF switches the RDFNET process to run on
the reactivated primary CPU.
If both the primary and backup CPUs of the RDFNET process fail, RDF aborts.
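The per-process behavior described in the preceding sections can be collected into a small lookup table, which might be useful in an operator script or runbook. This is a sketch only: the names FailoverBehavior, BEHAVIOR, and runs_in_original_cpu_after_reload are hypothetical, and a value of None marks behavior that this section does not state.

```python
from typing import NamedTuple, Optional


class FailoverBehavior(NamedTuple):
    # True if RDF moves the process back when the failed primary CPU is reloaded.
    switches_back_to_primary_cpu: bool
    # True if RDF aborts when both CPUs fail; None if not stated in this section.
    aborts_rdf_on_double_failure: Optional[bool]


BEHAVIOR = {
    "extractor": FailoverBehavior(True, None),
    "receiver": FailoverBehavior(True, None),
    "updater": FailoverBehavior(False, True),
    "purger": FailoverBehavior(True, True),
    "rdfnet": FailoverBehavior(True, True),
}


def runs_in_original_cpu_after_reload(process: str) -> bool:
    """Return True if the process returns to its original primary CPU after reload."""
    return BEHAVIOR[process.lower()].switches_back_to_primary_cpu


# Processes that remain in the backup CPU until updating is stopped and restarted.
stay_on_backup = sorted(
    name for name, b in BEHAVIOR.items() if not b.switches_back_to_primary_cpu
)
print(stay_on_backup)  # ['updater']
```

The table makes the one exception easy to see: only the updater keeps running in the backup CPU after the original primary CPU comes back up.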
RDF State Transition Failure
Periods during which the RDF updaters (or RDF itself) are either starting or stopping are known
as RDF state transitions. In rare instances, if the primary CPU of an RDF process fails during
execution of a STOP RDF or STOP UPDATE command, not all RDF processes complete the state
transition properly.
To minimize the chance of encountering this kind of failure, avoid CPU reloads during RDF state
transitions. Furthermore, if a CPU failure does occur during a state transition, carefully review
the EMS event log for signs of incorrect behavior. If the failure occurred while RDF or the updating
facility was stopping, check the Process Pair Directory (PPD) to verify that all of the appropriate
RDF processes have stopped; if any have not, you must stop them manually.
If a state transition failure occurs during execution of a STOP RDF command and the operation
appears to be stalled, manually stop all of the RDF processes by issuing the following command
on both the primary and backup systems: