DataLoader/MX Reference Manual (H06.03+, J06.03+)

Recovery Strategies
DataLoader/MX Reference Manual543544-001
7-5
Restarting From An Unknown State
Inserting and Deleting
If the load consists of inserting new rows, you can use the restarting from an unknown
state method. Your program can work its way through the file, attempting to insert
rows. If the insert fails because of a duplicate key, the row was inserted by the failed
run of this same load job. The program can continue until it is able to insert a row
without error, and then insert the rest of the rows. The same is true of deletes. If the
load attempts a delete of a row and finds that the row does not exist, that row must
have been deleted by the failed run. For updates, however, the situation is more
complex.
Updating
Although many tables contain dates that pertain to the business use of the data, it
usually is not possible to use this information to know at what record the failed load
stopped performing updates. Because of the possibility of delays of data entering the
system, the date column is not necessarily be a reliable indicator of when the table or
record was last updated.
You can set up a system in which data is segregated by date into separate loads so
that you can determine when the table or record was last updated. For example, on
Sunday, if the posting of data from previous weeks is delayed, a separate load run is
done for each of the delayed weeks’ work, followed by the load run for the data of the
week just completed. If this loading can be done reliably, you can restart from an
unknown state based upon the business data.
However, in most cases, a table does not have business data that can be used to
identify the load run that last altered it. In this case, you have the option of adding a
column for this purpose. Give this column a name, such as LOAD_ID, and make it a
32-bit integer. LOAD_ID starts at zero and is used by the load program for comparison
purposes. If the load run finishes successfully, the load run increments the LOAD_ID
column’s value. If the load run fails, the load run does not increment LOAD_ID.
This is an example of this algorithm:
At initialization, the load program reads the current LOAD_ID from a parameter file or
table.
The updating of each row depends upon that row not having its LOAD_ID equal to the
current LOAD_ID:
UPDATE
SALESTABLE
SET
LOAD_ID = :CurLoadID,
SALES_THIS_YEAR = SALES_THIS_YEAR + :SalesThisWeek
WHERE
LOAD_ID < :CurLoadID
AND
PART_NUM = :PartNum