Technical data

Optimizing with MACRO-32

Example 3–6 shows another use of a deferred arithmetic instruction. In
this case, a divide instruction is followed by an add and then a load.
The add uses the result of the divide (V3), so it cannot begin executing
until the divide completes. The deferred instruction queue and the length
of the divide instruction combine to "hide" the load instruction (that is,
the execution time of the load instruction does not contribute to the
total execution time of the instruction sequence). Note also that the
divide instruction completes after the load completes, even though the
load was issued later. Out-of-order completion of instructions is possible.

Example 3–6 Use of the Deferred Arithmetic Instruction Queue

Instruction Sequence

VVDIVL  V1,V2,V3
VVADDL  V3,V1,V4
VLDL    base,#4,V5

Execution without Deferred Instruction Queue

Issue VVDIVL           IEEEEEEEEEEEEEEEEEEEE
Issue VVADDL                                IEEEEEEEE
Issue VLDL                                   IEEEEEEEEEEEEEE

Execution with Deferred Instruction Queue

Issue VVDIVL           IEEEEEEEEEEEEEEEEEEEE
Issue deferred VVADDL   I...................EEEEEEEE
Issue VLDL               IEEEEEEEEEEEEEE
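
The timing in Example 3–6 can be approximated with a small model. The following Python sketch is illustrative only and is not part of MACRO-32. It assumes one cycle to issue each instruction, takes the execution lengths (20, 8, and 14 cycles) from the E counts in the example, and assumes that without the queue the add cannot issue until the divide has completed. The names EXEC, schedule, render, and total_cycles are hypothetical.

```python
# Illustrative timing model for Example 3-6 (not a hardware-accurate simulator).
# Assumed: 1 issue cycle per instruction; execution lengths read off the example.
EXEC = {"VVDIVL": 20, "VVADDL": 8, "VLDL": 14}


def schedule(deferred_queue):
    """Return {name: (issue, start, done)} in cycles; 'done' is exclusive."""
    times = {}

    # The divide issues in cycle 0 and executes in the cycles that follow.
    div_issue = 0
    div_done = div_issue + 1 + EXEC["VVDIVL"]
    times["VVDIVL"] = (div_issue, div_issue + 1, div_done)

    if deferred_queue:
        # The add issues at once, then waits in the queue for V3.
        add_issue = div_issue + 1
        add_start = max(add_issue + 1, div_done)
    else:
        # Assumption: the add cannot issue until the divide has completed.
        add_issue = div_done
        add_start = add_issue + 1
    times["VVADDL"] = (add_issue, add_start, add_start + EXEC["VVADDL"])

    # The load does not depend on V3; it issues right after the add issues.
    load_issue = add_issue + 1
    times["VLDL"] = (load_issue, load_issue + 1, load_issue + 1 + EXEC["VLDL"])
    return times


def render(name, issue, start, done):
    """Draw one timeline row: I = issue, . = waiting in queue, E = executing."""
    bar = " " * issue + "I" + "." * (start - issue - 1) + "E" * (done - start)
    return f"{name:<8}{bar}"


def total_cycles(times):
    """Cycle count at which the last instruction of the sequence completes."""
    return max(done for _, _, done in times.values())


for use_queue in (False, True):
    result = schedule(use_queue)
    print("with queue" if use_queue else "without queue", total_cycles(result))
    for name, row in result.items():
        print(render(name, *row))
```

Under these assumptions the sequence takes 37 cycles without the queue and 29 cycles with it. In the queued case the load finishes at cycle 17, before the divide finishes at cycle 21, which is the out-of-order completion described above.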

3.7 CHAINING

Vector operands are generally read from and written to the vector register
file. An exception to this process occurs when a store instruction is
waiting for the results of a currently executing arithmetic instruction.
(Divide instructions are not included in this exception because they do
not have the same degree of pipelining as the other instructions.) As
results are generated by the arithmetic instruction and are ready to be
written to the register file, they are also immediately available for input
to the waiting store instruction. Therefore, the store instruction can begin
processing the data before the arithmetic instruction has completed. This
process is called "chain into store." The store instruction will not overrun
the arithmetic instruction because the store instruction cannot process
data faster than the arithmetic unit can generate results.
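
The benefit of chain into store can be illustrated with a simple cycle count. The Python sketch below is a model under stated assumptions, not a description of the hardware: the arithmetic unit is assumed to deliver one result per cycle after a pipeline latency, and the store is assumed to need store_interval cycles per element. The function name store_finish and all parameter values are hypothetical.

```python
def store_finish(n_elements, latency, store_interval, chained):
    """Return (finish_cycle, store_start_cycles) for a store that follows
    an arithmetic instruction producing one result per cycle."""
    # Cycle in which result i becomes available (assumed pipeline model).
    ready = [latency + i for i in range(n_elements)]
    last_ready = ready[-1]

    t = 0            # cycle at which the store unit is next free
    starts = []
    for r in ready:
        # Chained: element i may be stored as soon as it is produced.
        # Unchained: the store waits until every result has been written.
        earliest = r if chained else last_ready
        begin = max(t, earliest)
        starts.append(begin)
        t = begin + store_interval
    return t, starts


# Illustrative values: 64 elements, 4-cycle latency, 1 cycle per stored element.
chained_end, chained_starts = store_finish(64, 4, 1, chained=True)
unchained_end, _ = store_finish(64, 4, 1, chained=False)
print(chained_end, unchained_end)
```

With these values the chained store finishes at cycle 68 rather than cycle 131. In the chained case every store begins at or after the cycle in which its element is produced, which is the model's counterpart of the store not overrunning the arithmetic instruction.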
