Dataloader/MP Reference Manual

ManualsBrandsHP ManualsServerHP NonStop G-Series

111

112

113

114

115

116

117

118

119

120

Table Of Contents

Recovery Strategies

DataLoader/MP Reference Manual—424148-003

7-2

Design Time Considerations

You should consider how you will load your database at the time you design it, along

with other considerations, such as query performance and online transaction

processing performance. Along with those considerations, consider how you will

recover if the load process fails. You must ensure that DataLoader/MP will be able to

read the data again, possibly in the same order as it read it on its first run.

The database design process is a trade-off between a design that can be loaded

quickly and recover from failures during loads and a design that is optimized for other

functions. For example, putting all of the new data in a new partition or at the end of an

existing partition is good for load performance and recovery but can result in poor

application performance because placing all of the new data on a single disk eliminates

the possibility of using parallelism in the normal use of the database. No fixed set of

priorities exist, because different types of applications have different considerations.

Tools

The tools used to implement loading are:

•

DataLoader/MP: This tool uses single threaded or parallel processing to load

databases.

•

SQLCI LOAD: This tool replaces all of the data in a partition of a SQL/MP table.

•

SQLCI APPEND: This tool appends data to the end of a partition of a SQL/MP

table.

Some Basic Rules

In every case where it is necessary to perform recovery of the load process, recovery

will require that the DataLoader/MP process read the input data again, and in some

cases it must be in the exact same order, so you must determine if this is possible.

Whatever the data’s source, you must make sure that the data, or a copy of it, remains

unmodified until the load is successfully finished. If necessary, you must add steps to

your load scenario to make this occur.

For example, data in a disk file can easily be read again. However, if the data is being

received by FTP from another system, when you rerun the load after a failure, the data

might no longer be available, or might not be available in the same order. In that case,

you may need to add a step to the normal loading scenario to FTP the data to a disk

from which you can load, rather than loading it directly from the FTP process.

If you cannot reread the input data stream that was being processed when the failure

occurred, then you may not be able to craft a reliable failure recovery scenario.

Your goal should be to construct a failure recovery plan that will enable you to run the

failed loading batch job again. The loading batch job should contain logic, if necessary,

to recognize and recover from the failure. In some cases, you may have to do some

simple operations, such as setting a parameter, purging data, or copying a table to