Optimizing with MACRO-32
Example 3–3 Effects of Register Conflict
Instruction Sequence 1
VVADDL V1,V2,V3 IEEEEE
VVMULL V1,V2,V4 I....EEEEE
VLDL base,#4,V1 IEEEEEEEEE
Instruction Sequence 2
VVADDL V1,V2,V3 IEEEEE
VVMULL V1,V2,V4 I....EEEEE
VLDL base,#4,V5 IEEEEEEEEE
Nonunity stride loads and stores can have a significantly greater impact on
XMI memory bus performance than unity stride operations, because nonunity
stride requires far more memory references than unity stride. If the ratio
of cache-miss load/store instructions to arithmetic instructions is
sufficiently high and nonunity stride is used, bus bandwidth can become the
limiting performance factor.
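The effect of stride on memory traffic can be illustrated with a rough
count of how many distinct memory blocks a single vector load touches. The
following Python sketch is only a model: the function name blocks_touched,
the 32-byte block size, and the aligned base address are assumptions chosen
for illustration and are not taken from the XMI or cache specifications.
The stride is given in bytes, so a stride of 4 is treated here as unity
stride for longword elements.

```python
def blocks_touched(num_elements, stride_bytes, element_bytes=4,
                   block_bytes=32, base=0):
    """Count distinct memory blocks referenced by one strided vector access.

    block_bytes is a hypothetical transfer size, not a documented value.
    """
    blocks = set()
    for i in range(num_elements):
        first_byte = base + i * stride_bytes
        last_byte = first_byte + element_bytes - 1
        # An element may straddle a block boundary, so record every block it covers.
        blocks.update(range(first_byte // block_bytes,
                            last_byte // block_bytes + 1))
    return len(blocks)


# 64 longwords, unity stride: consecutive elements share blocks.
print(blocks_touched(64, 4))    # prints 8
# 64 longwords, stride of 32 bytes: every element lands in its own block.
print(blocks_touched(64, 32))   # prints 64
```

Under these assumptions, the nonunity stride access touches eight times as
many blocks for the same number of elements, which is the kind of increase
in memory references that can make bus bandwidth the limiting factor.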
3.6 OUT-OF-ORDER INSTRUCTION EXECUTION
The deferred instruction queue (of length 1) associated with the arithmetic
unit allows the vector issue unit to queue one instruction to the arithmetic
unit while that unit is still executing a previous instruction. The issue
unit checks the status of this queue when it does the functional unit
availability check for an instruction. (Both the deferred and the currently
executing instructions are checked for register availability.) Queuing the
instruction frees the issue unit to process another instruction rather than
waiting for the arithmetic unit to complete its current instruction.
Example 3–4 shows the use of the deferred arithmetic instruction queue.
If a deferred instruction queue were not implemented, the VVMULL
instruction could not be issued until the VVADDL had completed (or
nearly completed). The VLDL instruction would then not issue until
after the VVMULL was issued and would complete much later than in
the deferred instruction case. Once the VLDL instruction is issued, no
other instructions may be issued. The VLDL instruction can overlap the
deferred VVMULL instruction because there are no register conflicts
between the two instructions. The overlap of instruction execution made
possible by the deferred instruction queue can significantly reduce the
total execution time.
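The benefit of the deferred instruction queue can be sketched with a small
in-order issue model. This Python sketch is a simplification under stated
assumptions, not a description of the actual hardware: the function names
simulate and total_cycles, the one-cycle issue time, and the execution
lengths (5 cycles for each arithmetic instruction, 9 for the load) are
hypothetical values chosen to resemble the timing diagrams in this chapter.
The model assumes no register conflicts and requires the previous
arithmetic instruction to be fully complete when no queue is present.

```python
def simulate(program, queue_depth):
    """Model in-order issue to an arithmetic unit and a load/store unit.

    program: list of (name, unit, exec_cycles), unit is "arith" or "mem".
    queue_depth: deferred arithmetic queue length (0 = no queue).
    Returns {name: (issue_cycle, start_cycle, finish_cycle)}.
    """
    issue_free = 0                      # next cycle the issue unit is free
    unit_free = {"arith": 0, "mem": 0}  # cycle at which each unit goes idle
    arith_finishes = []                 # finish cycles of issued arithmetic instructions
    timeline = {}
    for name, unit, cycles in program:
        t = issue_free
        if unit == "arith":
            # Issue only when at most queue_depth earlier arithmetic
            # instructions are still incomplete.
            pending = sorted(arith_finishes, reverse=True)
            if len(pending) > queue_depth:
                t = max(t, pending[queue_depth])
        else:
            t = max(t, unit_free[unit])
        start = max(t + 1, unit_free[unit])  # execution follows the issue cycle
        finish = start + cycles
        unit_free[unit] = finish
        if unit == "arith":
            arith_finishes.append(finish)
        issue_free = t + 1
        timeline[name] = (t, start, finish)
    return timeline


def total_cycles(timeline):
    """Cycle at which the last instruction finishes."""
    return max(finish for _, _, finish in timeline.values())


# Hypothetical sequence with no register conflict between VVMULL and VLDL.
program = [("VVADDL", "arith", 5), ("VVMULL", "arith", 5), ("VLDL", "mem", 9)]

print(total_cycles(simulate(program, queue_depth=1)))  # prints 12
print(total_cycles(simulate(program, queue_depth=0)))  # prints 17
```

In this model the queue lets VVMULL issue in cycle 1 and wait for the
arithmetic unit, so VLDL issues in cycle 2 and overlaps both arithmetic
instructions. Without the queue, VVMULL holds the issue unit until cycle 6
and VLDL cannot issue until cycle 7.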