RDF/IMP and IMPX System Management Manual (RDF 1.3+)

Managing RDF

Compaq NonStop™ RDF/IMP and IMPX System Management Manual—522204-001

Processor Failures

If any RDF process pair stops unexpectedly, the monitor sends an abort message to all other RDF processes.

The subtopics that follow discuss how RDF responds to extractor, receiver, updater, and RDF state transition failures.

Extractor Failure

Although the extractor runs as a process pair, the primary process neither maintains restart information nor checkpoints such information to its backup. Instead, the receiver maintains all restart information for the extractor, ensuring that the extractor is restartable. The restart point is based on the Master Audit Trail (MAT) position of the last record stored in the image trail on the backup system.

If the extractor process pair stops unexpectedly, the monitor sends abort messages to the other RDF processes to bring about an orderly shutdown of RDF. You can then restart the subsystem by issuing a START RDF command.

If the primary extractor process fails, the backup process requests a new starting position in the MAT from the receiver, ensuring a correct restart position. This extractor-receiver protocol also protects against messages from the extractor arriving out of order: if a message arrives out of order, the receiver simply directs the extractor to restart. When the failed primary CPU comes back up, RDF switches the new primary (formerly the backup) extractor process so that it runs in the primary CPU.
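The restart protocol described above can be illustrated with a minimal Python sketch. It is a toy model, not RDF code: the class names, the use of a list index as a stand-in for a MAT position, and the `accept`/`send`/`restart` methods are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Receiver:
    """Toy receiver: owns the extractor's restart point (a MAT position)."""
    restart_position: int = 0  # stand-in for the MAT position of the last stored record
    image_trail: list = field(default_factory=list)

    def accept(self, start: int, records: list) -> bool:
        """Store a message only if it continues from the restart point."""
        if start != self.restart_position:
            return False  # out of order: the extractor is told to restart
        self.image_trail.extend(records)
        self.restart_position = start + len(records)
        return True


@dataclass
class Extractor:
    """Toy extractor: keeps no durable restart state of its own."""
    receiver: Receiver
    position: int = 0

    def restart(self) -> None:
        # Ask the receiver where to resume reading the MAT.
        self.position = self.receiver.restart_position

    def send(self, mat: list, count: int) -> bool:
        """Send the next `count` records; restart if the receiver rejects them."""
        records = mat[self.position:self.position + count]
        if self.receiver.accept(self.position, records):
            self.position += len(records)
            return True
        self.restart()
        return False
```

In this model, a backup extractor that takes over with no state of its own calls `restart()` and resumes at the receiver's position, and a message sent from the wrong position is rejected and followed by the same restart.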

Receiver Failure

If the primary receiver process stops, the backup receiver process takes over and resynchronizes with the extractor process. The extractor process might have to resend audit data that was generated several seconds earlier. When the failed primary CPU comes back up, RDF switches the new primary (formerly the backup) receiver process so that it runs in the primary CPU.
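The takeover and switch-back behavior described for the extractor and receiver process pairs can be pictured with a small state model. This is only a sketch under stated assumptions: the `ProcessPair` class, its field names, and the `cpu_down`/`cpu_up` events are hypothetical and do not correspond to any RDF or NonStop interface.

```python
from dataclasses import dataclass, field


@dataclass
class ProcessPair:
    """Toy process pair with a configured primary CPU and a backup CPU."""
    name: str
    primary_cpu: int  # CPU the primary process is configured to run in
    backup_cpu: int   # CPU the backup process runs in
    active_cpu: int = field(init=False)  # CPU hosting the acting primary

    def __post_init__(self) -> None:
        self.active_cpu = self.primary_cpu

    def cpu_down(self, cpu: int) -> None:
        """The backup takes over when the configured primary CPU fails."""
        if cpu == self.primary_cpu and self.active_cpu == cpu:
            self.active_cpu = self.backup_cpu

    def cpu_up(self, cpu: int) -> None:
        """Switch the acting primary back once the primary CPU returns."""
        if cpu == self.primary_cpu:
            self.active_cpu = self.primary_cpu
```

For example, a pair created as `ProcessPair("receiver", primary_cpu=1, backup_cpu=2)` moves to CPU 2 after `cpu_down(1)` and returns to CPU 1 after `cpu_up(1)`. A double CPU failure is deliberately not modeled.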

Note. If the monitor process pair stops unexpectedly (for example, in a double CPU failure), you must stop the other RDF processes manually and then restart the subsystem. When stopping RDF processes manually, you must first stop the extractor on the primary system, then stop all updaters on the backup system, and finally stop the receiver on the backup system. The easiest way to do this is to issue a series of commands of the following form:

STATUS *, PROG $SYSTEM.RDF.procname, STOP

The following command provides an example:

STATUS *, PROG $SYSTEM.RDF.RDFUPDO, STOP

Alternatively, after stopping the extractor, you can stop all updaters and the receiver on the backup system by issuing a STOP RDF command on the backup system.
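The manual stop order in this note can be written down as a short helper that produces the commands in sequence. This is a sketch only: RDFUPDO comes from the example above, but the extractor and receiver program file names (RDFEXTO and RDFRCVO) are assumptions that should be checked against the files actually installed in the RDF subvolume, and the `stop_commands` function is hypothetical.

```python
# Order matters: extractor first, then updaters, then receiver.
STOP_ORDER = [
    ("primary", "extractor", "RDFEXTO"),  # assumed program file name
    ("backup", "updaters", "RDFUPDO"),    # name taken from the manual's example
    ("backup", "receiver", "RDFRCVO"),    # assumed program file name
]


def stop_commands(subvolume: str = "$SYSTEM.RDF") -> list:
    """Return (system, command) pairs in the order they should be issued."""
    return [
        (system, f"STATUS *, PROG {subvolume}.{program}, STOP")
        for system, _role, program in STOP_ORDER
    ]


if __name__ == "__main__":
    for system, command in stop_commands():
        print(f"on the {system} system: {command}")
```

The helper only formats text; each command still has to be entered on the system indicated.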

Caution. During the interval between the loss of the extractor and the restart of the RDF subsystem, do not add any disk volumes to the RDF configuration (with the ADD VOLUME command).