RDF System Management Manual for J-series and H-series RVUs (RDF 1.10)
the updater associated with the volume may report RDF event 813 - "Concurrent file opens exceeds
capacity". This happens if the updater has 3,000 files open and it must open a new file. Should
this occur, the updater immediately generates RDF event 813, commits its current transaction,
closes all files, and restarts, which generates RDF event 837. When it restarts, it resumes processing
image audit at the audit record for the file that caused the problem. In this sense, the problem is
self-correcting: it does not impact updater performance, and it is safe for you to have more than
3,000 audited files on the volume. The danger arises when you have more than 3,000 audited
files on the volume and all of them need to be updated every 6-10 minutes. If
this happens regularly over a period of time, it can cause the purger to stop purging files.
When this situation occurs, you must stop RDF as soon as possible and rebalance the number of
audited files on the primary and backup volumes of the affected updater so that you have no more
than 3,000 audited files on that volume. When RDF has been stopped and you have rebalanced the
audited files, reinitialize and reconfigure RDF using the INITTIME option. See “Initializing RDF
Without Stopping TMF (Using INITTIME Option)” (page 73).
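The restart cycle described above can be illustrated with a small model. This is only a sketch of the behavior as the text describes it, not the updater's actual logic, which is internal to RDF. The function name replay and the idea of representing each audit record by the name of the file it touches are assumptions made for illustration.

```python
MAX_OPEN_FILES = 3000  # open-file limit stated in the text


def replay(audit_records, max_open=MAX_OPEN_FILES):
    """Replay audit records and return the RDF event numbers raised.

    Hypothetical model: each record is represented only by the name of
    the file it updates.
    """
    if max_open < 1:
        raise ValueError("max_open must be at least 1")
    open_files = set()
    events = []
    i = 0
    while i < len(audit_records):
        name = audit_records[i]
        if name not in open_files and len(open_files) >= max_open:
            events.append(813)  # concurrent file opens exceed capacity
            open_files.clear()  # commit the transaction and close all files
            events.append(837)  # updater restart
            continue            # resume at the record that caused the problem
        open_files.add(name)
        i += 1
    return events
```

In this model, a workload that touches 3,001 distinct files in turn raises one 813/837 pair per pass, which matches the point made in the text: the restart is harmless when it is occasional and becomes a problem only when it recurs on every cycle.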
If the above problem occurs, the purger has stopped purging files, and you are unable to stop RDF
to rebalance the number of audited files on the volume, you can try lowering the duration of the
updater's transaction to the minimum value of 10 seconds as a short-term workaround. If this does
not correct the problem, the easiest remedy is to suspend the extractor on
the primary system for 10 minutes. If you have RDF/ZLT protection, you are not at risk of
losing any data if your primary system fails while the extractor is suspended. If you do not
have RDF/ZLT protection, you are vulnerable to data loss from an unplanned outage of your
primary system from the point where you suspend the extractor to the point where the extractor
has caught up after you have activated it, but this workaround will allow the purger to purge files.
If suspending the extractor is not acceptable, use the following steps to resolve the problem
for the short term:
1. Issue a STATUS RDF command to get the name of the latest image file on the image trail of the
updater generating the 813 events.
2. STOP RDF.
3. Restart RDF with UPDATE OFF; this causes the receiver to roll over to a new image file on each
image trail.
4. After one minute, STOP RDF again.
5. Restart RDF with UPDATE OFF; this causes the receiver to roll over to a new image file on each
image trail.
6. On the image trail for the updater generating the 813 events, move the next file in sequence
(the one after the file identified in step 1) to a different subvolume. For example, if the updater
is reading file AA000100, then move AA000101.
7. START UPDATE. The updater starts reporting 892 events after it completes reading the current
image trail file AA000100.
8. Move the file AA000101 back to the image trail subvolume.
Completing steps 6 through 8 ensures that the updater does not read any further than the file it is
currently processing. On completing step 8, the updater can continue with the next image file,
allowing the purger to purge the previous file.
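Step 6 requires identifying the next image file in sequence. The helper below is a minimal sketch of that calculation. It assumes, based only on the AA000100 and AA000101 example above, that an image file name is a two-letter prefix followed by a six-digit sequence number. The function name next_image_file is hypothetical, and what happens after sequence number 999999 is not covered by the text, so the sketch raises an error rather than guess.

```python
import re


def next_image_file(name):
    """Return the image file name that follows `name` in sequence.

    Assumed format: two uppercase letters plus six digits (e.g. AA000100).
    """
    match = re.fullmatch(r"([A-Z]{2})(\d{6})", name)
    if match is None:
        raise ValueError(f"unexpected image file name: {name!r}")
    prefix, sequence = match.group(1), int(match.group(2))
    if sequence == 999999:
        # Rollover behavior is not described in the text; do not guess.
        raise ValueError("sequence number cannot be incremented")
    return f"{prefix}{sequence + 1:06d}"
```

For the example in step 6, next_image_file("AA000100") returns "AA000101", the file to move out of the image trail subvolume and later move back.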
Remember that all of this can be avoided by keeping your audited files balanced so that you do not
have more than 3,000 on any single RDF-protected volume.
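A periodic check against the 3,000-file limit can be sketched as follows. This is an illustration only: the function name volumes_over_limit, the volume names, and the counts are invented, and how you obtain the number of audited files per volume on your system is outside the scope of the sketch.

```python
AUDITED_FILE_LIMIT = 3000  # per-volume limit stated in the text


def volumes_over_limit(counts, limit=AUDITED_FILE_LIMIT):
    """Map each over-limit volume to the number of files above the limit.

    `counts` maps a volume name to its number of audited files.
    """
    return {vol: n - limit for vol, n in counts.items() if n > limit}


# Hypothetical inventory of audited files per RDF-protected volume.
inventory = {"$DATA1": 3500, "$DATA2": 3000, "$DATA3": 120}
excess = volumes_over_limit(inventory)
```

Here excess contains only $DATA1, with 500 files to move elsewhere; a volume holding exactly 3,000 audited files is within the limit.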
Responding to Operational Failures
RDF can recover from any of the following events, as described in detail in the following pages:
• Communications line failure on the primary or backup system
• System failure that does not require an RDF Takeover operation
• Processor failure on the primary or backup system