NonStop Systems Introduction

Transaction Management

NonStop Systems Introduction—527825-001


Volume Recovery

accumulate in cache (a high-speed area of main memory) before writing the whole set

of records to disk in a single physical update operation.

This write-in-cache procedure significantly reduces the number of writes to audited files

and keeps performance high. The drawback is that when a system failure occurs,

some changed records in cache might not yet have been written to disk.
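The effect of write caching on a failure can be illustrated with a minimal sketch. The class and method names below (CachedVolume, update, flush, crash) are hypothetical and chosen only for illustration; they are not part of TMF or the disk process.

```python
class CachedVolume:
    """Toy model of a disk volume with a write cache (hypothetical, not a TMF API)."""

    def __init__(self):
        self.disk = {}            # records as they exist on disk
        self.cache = {}           # changed records not yet written to disk
        self.physical_writes = 0  # number of physical update operations

    def update(self, key, value):
        # A logical update changes only the cached copy of the record.
        self.cache[key] = value

    def flush(self):
        # One physical operation writes the whole set of cached records.
        if self.cache:
            self.disk.update(self.cache)
            self.cache.clear()
            self.physical_writes += 1

    def crash(self):
        # A system failure loses main memory; only the disk contents survive.
        self.cache.clear()


volume = CachedVolume()
volume.update("item-1", 10)
volume.update("item-2", 20)
volume.flush()               # two logical updates, one physical write
volume.update("item-1", 7)   # changed in cache only
volume.crash()               # the change to item-1 never reached disk
```

After the simulated failure, the disk still holds the old value of item-1, which is the situation that recovery has to repair.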

TMF solves this potential data integrity problem by redoing committed transactions

after a failure, to make sure that the database changes made by such transactions are

reflected on disk.

After the system comes back up, TMF automatically initiates volume recovery to redo

the committed transactions and then undo any incomplete transactions.

Volume recovery uses the after-images in the audit trail to redo committed

transactions. But how does volume recovery determine which transactions need to be

redone? TMF limits the amount of redo work by requiring the disk process to periodically

perform a routine known as control-point processing.

The disk process is an operating system process that manages physical updates to a

disk. When called upon to perform control-point processing, the disk process writes to

disk the changed records that have been accumulating in cache. Then the disk

process informs TMF of a new “redo location” in the audit trail, where volume recovery

must begin.
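Control-point processing as described above might be sketched as follows. This is a simplified model under stated assumptions: the names AuditTrail, DiskProcess, control_point, and redo_location are hypothetical, the audit trail is a plain list, and the redo location is simply the index of the next audit record to be written.

```python
class AuditTrail:
    """Toy audit trail: an append-only list of audit records (assumed layout)."""

    def __init__(self):
        self.records = []
        self.redo_location = 0  # index where volume recovery would begin

    def append(self, record):
        self.records.append(record)


class DiskProcess:
    """Toy disk process that caches changes and performs control points."""

    def __init__(self, trail):
        self.trail = trail
        self.disk = {}
        self.cache = {}

    def update(self, txn, key, after):
        # Record the before-image and after-image, then change the cache.
        before = self.cache.get(key, self.disk.get(key))
        self.trail.append(
            {"txn": txn, "key": key, "before": before, "after": after}
        )
        self.cache[key] = after

    def control_point(self):
        # Write the accumulated changes to disk, then report a new
        # redo location: nothing before it needs to be redone.
        self.disk.update(self.cache)
        self.cache.clear()
        self.trail.redo_location = len(self.trail.records)


trail = AuditTrail()
dp = DiskProcess(trail)
dp.update("T1", "bolts", 90)
dp.update("T1", "nuts", 45)
dp.control_point()
dp.update("T2", "bolts", 80)

# Only the audit records after the redo location are candidates for redo.
to_scan = trail.records[trail.redo_location:]
```

In this sketch, only the single record written after the control point remains to be examined, which is how control points keep the redo work small.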

When it is time for volume recovery to apply the after-images of committed

transactions to the database, volume recovery needs to read only the portion of the

audit trail that follows the most recent redo location.

After redoing the successful transactions, volume recovery goes through the audit trail

again and backs out any incomplete transactions by applying before-images to the

affected records on disk.
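The two passes of volume recovery can be sketched in a few lines. This is an illustrative model only, not the TMF implementation: the record layout, the function name volume_recovery, and the sample warehouse items are assumptions, and transactions that were already backed out before the failure are not modeled.

```python
def volume_recovery(disk, audit_trail, redo_location):
    """Redo committed transactions, then undo incomplete ones (toy model)."""
    committed = {r["txn"] for r in audit_trail if r["type"] == "commit"}

    # Pass 1 (redo): read forward from the most recent redo location and
    # apply the after-images of committed transactions.
    for r in audit_trail[redo_location:]:
        if r["type"] == "update" and r["txn"] in committed:
            disk[r["key"]] = r["after"]

    # Pass 2 (undo): go through the audit trail again, newest record first,
    # and apply the before-images of incomplete transactions.
    for r in reversed(audit_trail):
        if r["type"] == "update" and r["txn"] not in committed:
            if r["before"] is None:
                disk.pop(r["key"], None)  # record did not exist before
            else:
                disk[r["key"]] = r["before"]
    return disk


audit_trail = [
    {"type": "update", "txn": "T1", "key": "bolts", "before": 100, "after": 90},
    {"type": "commit", "txn": "T1"},
    {"type": "update", "txn": "T3", "key": "washers", "before": 200, "after": 150},
    # control point taken here: redo location = 3
    {"type": "update", "txn": "T2", "key": "nuts", "before": 50, "after": 45},
    {"type": "commit", "txn": "T2"},
]

# Disk contents at the time of the failure: T2's change was still in cache,
# while T3's uncommitted change had been written at the control point.
disk_at_failure = {"bolts": 90, "washers": 150, "nuts": 50}

recovered = volume_recovery(disk_at_failure, audit_trail, redo_location=3)
```

Here the committed change by T2 is reapplied from its after-image, and the uncommitted change by T3 is backed out from its before-image, leaving the sample database consistent.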

The advantages of volume recovery are:

• The database is restored to a consistent state within minutes because the use of

disk process control points and redo locations limits the number of transactions

that must be redone.

• TMF starts volume recovery automatically when the system comes back up.

Figure 5-6 on page 5-12 shows the first stage of the recovery of the warehouse

database by volume recovery when power is restored in the warehouse. In this stage,

volume recovery applies after-images to redo committed transactions. In the next

stage, volume recovery will apply before-images to undo any incomplete transactions.

The end result of both stages is the restoration of the database to a consistent state.