DataLoader/MX Reference Manual (G06.24+)
Running DataLoader/MX
DataLoader/MX Reference Manual—525872-002
• Taking advantage of parallelism
Creating Parallelism
If the input is a single stream, such as a set of tapes that cannot be processed
separately or a single LAN transfer that cannot be changed into multiple simultaneous
transfers, you must break the single input stream into multiple streams that can be
processed in parallel. To do this, direct a single DataLoader/MX process to read the
input and distribute it to other processes (usually other DataLoader/MX processes).
If it does not matter which process receives a given input record, you can run an initial
DataLoader/MX process with $RECEIVE as its output and specify this DataLoader/MX
process as the input for each of the downstream processes. This approach provides a
self-balancing and easily tunable way to create parallelism.
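One way to picture why this arrangement balances itself is a single reader feeding a shared queue from which each downstream worker pulls a record only when it is ready for one. The following Python sketch is a conceptual illustration only, not DataLoader/MX syntax: the names `distribute` and `worker` are hypothetical, and a thread-safe queue stands in (as an assumption) for the hand-off that $RECEIVE provides.

```python
import queue
import threading

def distribute(records, worker_count):
    """Feed one input stream to several workers that pull at their own pace."""
    work = queue.Queue(maxsize=8)        # bounded, so the reader cannot run far ahead
    received = [[] for _ in range(worker_count)]

    def worker(index):
        while True:
            record = work.get()
            if record is None:           # sentinel: the input is exhausted
                break
            received[index].append(record)

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(worker_count)]
    for t in threads:
        t.start()
    for record in records:               # single reader of the single input stream
        work.put(record)
    for _ in threads:                    # one sentinel per worker
        work.put(None)
    for t in threads:
        t.join()
    return received

received = distribute(range(100), worker_count=4)
```

No worker is assigned a fixed share of the input. A slower worker simply takes fewer records, and changing `worker_count` is the only tuning needed in this sketch.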
If each input record must go to a specific downstream process, you can specify the
KEYRANGE interpretation for the initial DataLoader/MX process output file.
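The idea behind key-range distribution can be sketched as a lookup that maps each record key to the one downstream process that owns its range. This Python fragment is illustrative only and does not show how a KEYRANGE specification is actually written; the function name `route_by_key_range`, the boundary values, and the sample keys are all hypothetical.

```python
import bisect

def route_by_key_range(key, upper_bounds):
    """Return the index of the downstream process whose range holds key.

    upper_bounds is a sorted list of exclusive upper bounds. Keys at or
    above the last bound go to the final process.
    """
    return bisect.bisect_right(upper_bounds, key)

# Hypothetical ranges: keys < "H", keys from "H" up to < "P", keys >= "P".
bounds = ["H", "P"]
keys = ["ADAMS", "HALL", "OWENS", "PIKE", "ZHOU"]
targets = [route_by_key_range(k, bounds) for k in keys]
```

Unlike the self-balancing arrangement, the destination here depends only on the key, so the same key always reaches the same downstream process.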
Sometimes the load or maintenance strategy does not involve completely processing
the input at the time it is read (perhaps the data will be stored in intermediate
files for recovery or batch control purposes). If you want to break the input
into a number of files based only on the record count, use an INDIRECT file as the -O=
file with the MAX modifier on each of the file names in the INDIRECT file. This
approach makes DataLoader/MX divide the input into multiple output files (usually on
multiple disks on multiple processors), which form the basis for complete parallelism
at the next stage of processing.
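Splitting by record count can be pictured with the short Python sketch below. It is not DataLoader/MX syntax, and it assumes that each output file is filled to its limit before the next one is used, which this section does not spell out. The function name `split_by_count` and the limits shown are hypothetical.

```python
def split_by_count(records, limits):
    """Fill each output in turn, up to its limit (one maximum count per output)."""
    outputs = [[] for _ in limits]
    current = 0
    for record in records:
        # Advance past any output that has already reached its limit.
        while current < len(limits) and len(outputs[current]) >= limits[current]:
            current += 1
        if current == len(limits):
            raise ValueError("input exceeds the combined limits of all outputs")
        outputs[current].append(record)
    return outputs

# Ten records divided among three outputs of at most four records each.
parts = split_by_count(list(range(10)), limits=[4, 4, 4])
```

Each resulting part can then be handed to a separate process at the next stage, because the split depends only on the record count and not on record content.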
Taking Advantage of Parallelism
When parallelism exists, whether it has been created by using the methods described
previously or is the result of having multiple, simultaneous input streams,
DataLoader/MX can continue to process in parallel.
If each of the now-parallel streams does inserts, updates, or deletes, each of the
DataLoader/MX processes continues, and the parallelism inherent in the underlying
system (such as the sophisticated locking techniques used in SQL/MX) enables them to
process in parallel.
Alternatively, if each of the now-parallel streams performs a load, each of the
DataLoader/MX processes can supply records to its own import or SQLCI
LOAD/COPY (for SQL/MP) process, which performs a partition load, so
parallelism is maintained.
Building Your Loading Application
A typical loading problem cannot be solved by running a single DataLoader/MX
process. Instead, loading is performed by an application consisting of multiple
DataLoader/MX processes, some customized, others standard, and perhaps one or