DataLoader/MX Reference Manual (G06.24+)

Recovery Strategies
DataLoader/MX Reference Manual525872-002
7-5
Restarting From An Unknown State
have been deleted by the failed run. For updates, however, the situation is more
complex.
Updating
Although many tables contain dates that pertain to the business use of the data, it
usually is not possible to use this information to know at what record the failed load
stopped performing updates. Because of the possibility of delays of data entering the
system, the date column is not necessarily be a reliable indicator of when the table or
record was last updated.
You can set up a system in which data is segregated by date into separate loads so
that you can determine when the table or record was last updated. For example, on
Sunday, if the posting of data from previous weeks is delayed, a separate load run is
done for each of the delayed weeks’ work, followed by the load run for the data of the
week just completed. If this loading can be done reliably, you can restart from an
unknown state based upon the business data.
However, in most cases, a table does not have business data that can be used to
identify the load run that last altered it. In this case, you have the option of adding a
column for this purpose. Give this column a name, such as LOAD_ID, and make it a
32-bit integer. LOAD_ID starts at zero and is used by the load program for comparison
purposes. If the load run finishes successfully, the load run increments the LOAD_ID
column’s value. If the load run fails, the load run does not increment LOAD_ID.
Following is an example of this algorithm:
At initialization, the load program reads the current LOAD_ID from a parameter file or
table.
The updating of each row depends upon that row not having its LOAD_ID equal to the
current LOAD_ID:
UPDATE
SALESTABLE
SET
LOAD_ID = :CurLoadID,
SALES_THIS_YEAR = SALES_THIS_YEAR + :SalesThisWeek
WHERE
LOAD_ID < :CurLoadID
AND
PART_NUM = :PartNum
At successful completion of the load job, the program increments the LOAD_ID in the
parameter file or table.
The LOAD_ID field can serve another purpose. It enables you to query the database to
determine which rows were and were not altered in the latest load run.
Generally, restarting from an unknown state is not the best recovery strategy when
multiple updates to a record can occur in a single load job. The previous example
could be extended to accommodate multiple updates by adding another column called