Dataloader/MP Reference Manual

Table Of Contents
Recovery Strategies
DataLoader/MP Reference Manual424148-003
7-2
Design Time Considerations
Design Time Considerations
You should consider how you will load your database at the time you design it, along
with other considerations, such as query performance and online transaction
processing performance. Along with those considerations, consider how you will
recover if the load process fails. You must ensure that DataLoader/MP will be able to
read the data again, possibly in the same order as it read it on its first run.
The database design process is a trade-off between a design that can be loaded
quickly and recover from failures during loads and a design that is optimized for other
functions. For example, putting all of the new data in a new partition or at the end of an
existing partition is good for load performance and recovery but can result in poor
application performance because placing all of the new data on a single disk eliminates
the possibility of using parallelism in the normal use of the database. No fixed set of
priorities exist, because different types of applications have different considerations.
Tools
The tools used to implement loading are:
DataLoader/MP: This tool uses single threaded or parallel processing to load
databases.
SQLCI LOAD: This tool replaces all of the data in a partition of a SQL/MP table.
SQLCI APPEND: This tool appends data to the end of a partition of a SQL/MP
table.
Some Basic Rules
In every case where it is necessary to perform recovery of the load process, recovery
will require that the DataLoader/MP process read the input data again, and in some
cases it must be in the exact same order, so you must determine if this is possible.
Whatever the data’s source, you must make sure that the data, or a copy of it, remains
unmodified until the load is successfully finished. If necessary, you must add steps to
your load scenario to make this occur.
For example, data in a disk file can easily be read again. However, if the data is being
received by FTP from another system, when you rerun the load after a failure, the data
might no longer be available, or might not be available in the same order. In that case,
you may need to add a step to the normal loading scenario to FTP the data to a disk
from which you can load, rather than loading it directly from the FTP process.
If you cannot reread the input data stream that was being processed when the failure
occurred, then you may not be able to craft a reliable failure recovery scenario.
Your goal should be to construct a failure recovery plan that will enable you to run the
failed loading batch job again. The loading batch job should contain logic, if necessary,
to recognize and recover from the failure. In some cases, you may have to do some
simple operations, such as setting a parameter, purging data, or copying a table to