EMS Manual

Configuring EMS
EMS Manual 426909-005
Logging Integrity
The CPU in which the event message was generated fails before the message can
be delivered to the appropriate collector. (EMS cannot prevent this.)
A collector’s log-file space is inadequate. This occurs when a combination of these
attribute settings causes log files to fill up too fast:
ROTATEFILES is FALSE.
MAXFILE is too small.
PRIMARYEXTENT, SECONDARYEXTENT, or both are too small.
To avoid this problem, take one or more of these actions:
Let a collector overwrite existing log files, by setting ROTATEFILES to TRUE.
Dedicate more disk space to a collector, by increasing MAXFILE,
PRIMARYEXTENT, or SECONDARYEXTENT.
Archive and remove filled files at a faster rate than they are currently being
removed.
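As a rough planning aid, the relationship between log-file space and fill time when ROTATEFILES is FALSE can be sketched in a few lines of Python. This is an illustrative estimate only, not part of EMS: the function name, the inputs, and the sample numbers are all hypothetical. In particular, bytes_per_file is assumed to be supplied by the reader; the sketch does not model how PRIMARYEXTENT and SECONDARYEXTENT translate into file capacity.

```python
from dataclasses import dataclass


@dataclass
class LogSpaceEstimate:
    total_bytes: int            # combined capacity of all log files
    seconds_until_full: float   # time until that capacity is used up


def estimate_log_space(max_files: int,
                       bytes_per_file: int,
                       events_per_second: float,
                       avg_event_bytes: int) -> LogSpaceEstimate:
    """Estimate how long a fixed set of log files lasts without rotation.

    All four inputs are assumptions supplied by the caller; none is read
    from a collector.
    """
    if min(max_files, bytes_per_file, avg_event_bytes) <= 0 or events_per_second <= 0:
        raise ValueError("all inputs must be positive")
    total_bytes = max_files * bytes_per_file
    bytes_per_second = events_per_second * avg_event_bytes
    return LogSpaceEstimate(total_bytes, total_bytes / bytes_per_second)


if __name__ == "__main__":
    # Hypothetical figures: 10 files of 1 MB each, 50 events/s of 400 bytes.
    est = estimate_log_space(max_files=10, bytes_per_file=1_000_000,
                             events_per_second=50, avg_event_bytes=400)
    print(f"{est.total_bytes} bytes last about "
          f"{est.seconds_until_full / 60:.1f} minutes")
```

If the estimated fill time is shorter than the interval at which filled files are archived and removed, one of the actions listed above is probably needed.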
Both CPUs in which the disk process is running fail simultaneously, and the disk
cache, the end-of-file pointer, or both are lost. To reduce the impact of such
CPU failures, set EOFREFRESH and WRITETHRUCACHE to TRUE. Doing so
reduces peak logging performance. Of the two attributes, EOFREFRESH offers
the greater potential benefit, in terms of the number of messages that might
otherwise be lost, and it costs less in performance than WRITETHRUCACHE.
That is why collectors use default values of TRUE for EOFREFRESH and FALSE
for WRITETHRUCACHE.
The rate at which a collector can log event messages is exceeded. This occurs
when some combination of these circumstances floods the collector with event
messages:
A collector is slowed down by user-requested disk overhead when
EOFREFRESH or WRITETHRUCACHE, or both, are TRUE, and BLOCKING
is OFF.
There is significant contention for the disk volume to which a collector is
logging.
One or more event-message sources (subsystems or forwarding distributors)
are generating event messages at an exceedingly high rate.
A collector’s queuing space is insufficient.
To avoid this problem, take one or more of these actions:
Reduce the disk overhead, by setting EOFREFRESH and
WRITETHRUCACHE to FALSE and by logging to a disk volume for which
there is less contention. (Setting EOFREFRESH and WRITETHRUCACHE
to TRUE reduces the risk of losing event messages because of
simultaneous CPU failures in the disk-process CPUs. However, setting
these attributes to TRUE can actually cause messages to be lost if the