DataLoader/MX Reference Manual (G06.24+)

Running DataLoader/MX

DataLoader/MX Reference Manual—525872-002



• Taking advantage of parallelism

Creating Parallelism

If the input is a single stream, such as a set of tapes that cannot be processed
separately or a single LAN transfer that cannot be changed into multiple simultaneous
transfers, you must break the single input stream into multiple streams that can be
processed in parallel. To do this, direct a single DataLoader/MX process to read the
input and distribute it to other processes (usually other DataLoader/MX processes).

If it does not matter which process receives a given input record, you can run an initial
DataLoader/MX process with $RECEIVE as its output and specify that DataLoader/MX
process as the input for the multiple downstream processes. This approach provides a
self-balancing and easily tunable way to create this type of parallelism.
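The self-balancing behavior can be illustrated with a small conceptual sketch. It is not DataLoader/MX syntax and does not use $RECEIVE. It is a Python analogue in which a bounded queue stands in for the initial process and threads stand in for the downstream processes. The function name distribute_by_pull and its parameters are hypothetical names chosen for the illustration.

```python
import queue
import threading


def distribute_by_pull(records, worker_count):
    """One reader feeds a shared queue; each worker pulls when it is free."""
    # Bounded queue: the reader cannot run far ahead of the workers.
    work = queue.Queue(maxsize=worker_count * 2)
    # Records each worker ended up handling (stands in for real processing).
    received = [[] for _ in range(worker_count)]

    def worker(index):
        while True:
            record = work.get()
            if record is None:  # sentinel: the input stream is exhausted
                break
            received[index].append(record)

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(worker_count)]
    for t in threads:
        t.start()
    for record in records:  # the single input stream
        work.put(record)
    for _ in threads:  # one sentinel per worker
        work.put(None)
    for t in threads:
        t.join()
    return received
```

Because workers request records rather than being assigned them, a slow worker simply takes fewer records. In this analogue, tuning amounts to changing worker_count.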

If each input record must go to a specific downstream process, you can specify the
KEYRANGE interpretation for the initial DataLoader/MX process output file.
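The idea of routing each record by the range its key falls into can be sketched as follows. This is a conceptual Python illustration only. How DataLoader/MX itself expresses key ranges is not shown in this section, so the function route_by_key_range, the upper_bounds list, and the key callable are all assumptions made for the example.

```python
import bisect


def route_by_key_range(records, upper_bounds, key):
    """Assign each record to the stream whose key range contains its key.

    upper_bounds is a sorted list of exclusive upper bounds. Stream i takes
    keys below upper_bounds[i] (and at or above the previous bound); one
    extra final stream takes every key at or above the last bound.
    """
    streams = [[] for _ in range(len(upper_bounds) + 1)]
    for record in records:
        # bisect_right counts the bounds that are <= the key, which is
        # exactly the index of the range the key belongs to.
        index = bisect.bisect_right(upper_bounds, key(record))
        streams[index].append(record)
    return streams
```

Unlike the pull-based approach, the destination here is fixed by the data, so an uneven key distribution produces uneven streams.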

Sometimes the load or maintenance strategy does not involve fully processing the
input at the time it is read (perhaps the data will be stored in intermediate files
for recovery or batch control purposes). If you want to break the input into a number
of files based only on the record count, use an INDIRECT file as the -O= file with the
MAX modifier on each of the file names in the INDIRECT file. This approach makes
DataLoader/MX divide the input into multiple output files (usually on multiple disks
on multiple processors), which is the basis for complete parallelism at the next
stage of processing.
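Splitting purely by record count can be sketched as below. The sketch assumes that a per-file limit caps the number of records written to one output before writing moves on to the next; that reading of the MAX modifier is an assumption, not something this section states. In-memory lists stand in for the output files, and split_by_count and limits are hypothetical names.

```python
def split_by_count(records, limits):
    """Fill one output per limit, moving on once the current output is full."""
    outputs = [[] for _ in limits]
    current = 0
    for record in records:
        # Skip past any output that has already reached its limit.
        while current < len(limits) and len(outputs[current]) >= limits[current]:
            current += 1
        if current == len(limits):
            raise ValueError("input exceeds the combined capacity of all outputs")
        outputs[current].append(record)
    return outputs
```

Each resulting output can then be handed to a separate process in the next stage without any further coordination.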

Taking Advantage of Parallelism

When parallelism exists, whether it has been created by using the methods described
previously or is the result of having multiple, simultaneous input streams,
DataLoader/MX can continue to process in parallel.

If each of the now-parallel streams does inserts, updates, or deletes, each of the
DataLoader/MX processes continues, and the parallelism inherent in the underlying
system (such as the sophisticated locking techniques used in SQL/MX) enables them to
process in parallel.

Alternatively, if each of the now-parallel streams performs a load, each of the
DataLoader/MX processes can supply records to its own import or SQLCI
LOAD/COPY (for SQL/MP) process, which performs a partition load, so
parallelism is maintained.

Building Your Loading Application

A typical loading problem cannot be solved by running a single DataLoader/MX

process. Instead, loading is performed by an application consisting of multiple

DataLoader/MX processes, some customized, others standard, and perhaps one or