DataLoader/MP Reference Manual

Running DataLoader/MP
DataLoader/MP Reference Manual 424148-003
experiment with this method. For example, you can start with one data source and
one DataLoader/MP process feeding into multiple downstream processes. You
may discover that if you use two or more DataLoader/MP processes for input, your
load becomes better balanced and throughput improves.
Inter-process communication (IPC) costs. When splitting the load by adding
processes, be aware of the increased IPC costs. With too many processes, your
configuration can reach a point of diminishing returns, and performance can
actually decrease. Another important consideration is that fewer processes are
simpler to manage.
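The diminishing-returns effect can be illustrated with a toy model. This is only a sketch under stated assumptions, not a DataLoader/MP formula: the function names, the per-process rate, and the quadratic IPC overhead term are all hypothetical values chosen for illustration.

```python
def modeled_throughput(processes, per_process_rate=1000.0, ipc_cost=0.08):
    """Toy estimate of records/second for a given number of processes.

    Assumption (not from the manual): each process adds a fixed amount of
    capacity, while IPC overhead grows with the square of the number of
    additional processes.
    """
    if processes < 1:
        raise ValueError("at least one process is required")
    overhead = 1.0 + ipc_cost * (processes - 1) ** 2
    return processes * per_process_rate / overhead


def best_process_count(max_processes, **model_args):
    """Return the process count with the highest modeled throughput."""
    return max(
        range(1, max_processes + 1),
        key=lambda n: modeled_throughput(n, **model_args),
    )


if __name__ == "__main__":
    for n in range(1, 9):
        print(n, round(modeled_throughput(n), 1))
    print("best:", best_process_count(8))
```

With these hypothetical parameters, modeled throughput rises for the first few processes and then falls, which is the shape to look for when you measure a real configuration.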
In addition to these guidelines, examples of the four basic loading configurations are
provided in Section 6, DataLoader/MP Examples. These examples should provide a
basis for nearly every loading scenario you will encounter.
Analyzing Your Configuration
After your configuration is defined, you should analyze it for performance by running
DataLoader/MP, measuring it, and analyzing the results of a few simple experiments.
For most configurations, the statistics provided through the -S= parameter may be
sufficient. However, you can also use Measure and other performance tools.
When you run DataLoader/MP with the -S= parameter, it generates statistics about the
internal buffer pool. DataLoader/MP uses this pool to buffer data for the downstream
reader processes (typically other DataLoader/MP processes). When a DataLoader/MP
process uses $RECEIVE as its output, it reserves 16 buffers to block the records for up
to 16 different reader processes. When all the buffers are full and no reader has
appeared, the initial DataLoader/MP must stop processing data until at least one buffer
is freed up.
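The stall behavior described above can be sketched as a simple tick-based simulation. Only the pool size of 16 comes from this manual; the tick model, the function name, and the rate parameters are assumptions for illustration and do not reflect how DataLoader/MP is implemented or what the -S= output looks like.

```python
POOL_SIZE = 16  # number of buffers reserved for $RECEIVE output, per the manual


def simulate_pool(ticks, produced_per_tick, readers, consumed_per_reader):
    """Simulate a bounded buffer pool and report how often it is full or empty.

    Hypothetical model: in each tick the producer tries to fill
    `produced_per_tick` buffers, then the readers together drain up to
    `readers * consumed_per_reader` buffers.
    Returns (fraction_full, fraction_empty, stalled_ticks).
    """
    filled = 0
    full_ticks = empty_ticks = stalled_ticks = 0
    for _ in range(ticks):
        # The producer can only use free buffers; any shortfall is a stall.
        added = min(produced_per_tick, POOL_SIZE - filled)
        if added < produced_per_tick:
            stalled_ticks += 1
        filled += added
        if filled == POOL_SIZE:
            full_ticks += 1  # sampled after the producer has run
        # The readers drain whatever is available, up to their capacity.
        filled -= min(filled, readers * consumed_per_reader)
        if filled == 0:
            empty_ticks += 1  # sampled after the readers have run
    return full_ticks / ticks, empty_ticks / ticks, stalled_ticks


if __name__ == "__main__":
    print("one slow reader:  ", simulate_pool(100, 4, 1, 1))
    print("four fast readers:", simulate_pool(100, 4, 4, 2))
```

In this model, a single slow reader leaves the pool full and the producer stalled for most ticks, while several fast readers leave it empty.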
DataLoader/MP’s statistics for the buffer pool provide a good indication of how well
balanced the configuration is, as follows:
• If the buffer pool is full most of the time, you might not have enough downstream
DataLoader/MP processes to consume the data at the rate the initial
DataLoader/MP is generating it. You might want to increase the number of
downstream DataLoader/MP processes to the point at which the consumption rate is
as high as or higher than the production rate.
• If the buffer pool is empty most of the time, the downstream processes are running
ahead of the initial DataLoader/MP process. You might want to consider shifting
some work from this DataLoader/MP process to the downstream DataLoader/MP
processes.
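These two guidelines can be written as a small decision helper. This is a sketch only: the function name and its inputs are hypothetical, the fractions must be derived by you from the statistics that -S= reports (whose format is not assumed here), and the 0.5 threshold is simply one reading of "most of the time".

```python
def suggest_adjustment(fraction_full, fraction_empty, threshold=0.5):
    """Map buffer-pool occupancy fractions to one of the manual's guidelines.

    fraction_full / fraction_empty: share of the run (0.0 to 1.0) during
    which the pool was full or empty. Both are hypothetical inputs computed
    from your own measurements.
    """
    for value in (fraction_full, fraction_empty):
        if not 0.0 <= value <= 1.0:
            raise ValueError("fractions must be between 0.0 and 1.0")
    if fraction_full + fraction_empty > 1.0:
        raise ValueError("fractions cannot sum to more than 1.0")
    if fraction_full > threshold:
        # Readers cannot keep up with the initial process.
        return "add downstream processes"
    if fraction_empty > threshold:
        # Readers are running ahead of the initial process.
        return "shift work to downstream processes"
    return "no change indicated"


if __name__ == "__main__":
    print(suggest_adjustment(0.85, 0.02))
    print(suggest_adjustment(0.05, 0.70))
    print(suggest_adjustment(0.30, 0.30))
```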
A well-designed loading application uses the smallest number of processes required to
accomplish the loading task, and distributes the load evenly across all available CPUs.
See Section 6, DataLoader/MP Examples, for examples of creating and maintaining
parallelism within four basic loading scenarios. These scenarios are the basis for
nearly every loading situation you will encounter.