Vector Processing Concepts
1.9 COMBINING VECTOR OPERATIONS TO IMPROVE EFFICIENCY
Some of the techniques available to increase vector instruction efficiency
include overlapping and chaining.
1.9.1 Instruction Overlap
Instruction overlap means executing two or more instructions at the
same time so that the total execution time is reduced. If a vector
processor has independent function units, it can perform different
operations on different operands simultaneously. Overlapping can provide
a significant gain in performance. However, overlapping may not be
possible if a register must be reused or if data is not yet available.
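The effect of overlap can be illustrated with a rough cycle-count sketch.
It is not a description of any particular processor: the VectorOp record,
the startup values, and the rule that an instruction takes "startup plus
one cycle per element" are all assumptions chosen for illustration.

```python
from collections import namedtuple

# Hypothetical description of a vector instruction (not a real instruction
# format): the function unit it uses, its destination register, its source
# registers, and an assumed pipeline startup time in cycles.
VectorOp = namedtuple("VectorOp", ["unit", "dest", "sources", "startup"])


def can_overlap(first, second):
    """Return True if `second` may execute at the same time as `first`."""
    if first.unit == second.unit:
        return False  # both instructions need the same function unit
    if first.dest in second.sources:
        return False  # data is not yet available
    if second.dest == first.dest or second.dest in first.sources:
        return False  # a register would have to be reused
    return True


def total_cycles(first, second, length):
    """Assumed cost model: startup cycles, then one element per cycle."""
    t1 = first.startup + length
    t2 = second.startup + length
    if can_overlap(first, second):
        return max(t1, t2)  # both instructions run side by side
    return t1 + t2          # the second waits for the first to finish


add = VectorOp(unit="add", dest="V3", sources=("V1", "V2"), startup=4)
mul = VectorOp(unit="multiply", dest="V6", sources=("V4", "V5"), startup=6)
dependent_mul = VectorOp(unit="multiply", dest="V6",
                         sources=("V3", "V5"), startup=6)

overlapped = total_cycles(add, mul, 64)            # independent operands
serialized = total_cycles(add, dependent_mul, 64)  # multiply reads V3
```

Under these assumptions the independent pair finishes in the time of the
slower instruction, while the dependent pair costs the sum of both
instructions.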
1.9.2 Chaining
Chaining, a special form of instruction overlap, is possible with multiple
function units. Chaining passes the result of an operation in one
function unit directly to another function unit. For example, an add
instruction followed by a store instruction can be chained so that each
element of the vector is stored as soon as its result is obtained. The
processor does not have to wait for the add instruction to finish before
storing the data.
VADD V1,V2,V3
VSTORE V3
As results are generated by the add instruction, they are immediately
available for input to the waiting store instruction. The store instruction
can then begin processing the data.
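The saving from chaining the add and store above can be estimated with a
small timing sketch. The function names, the startup values, and the
one-element-per-cycle rate are assumptions for illustration only; actual
timings depend on the processor.

```python
def unchained_cycles(add_startup, store_startup, length):
    """The store waits until the add has produced its last element."""
    return (add_startup + length) + (store_startup + length)


def chained_cycles(add_startup, store_startup, length):
    """The store consumes each sum as soon as the add produces it."""
    return add_startup + store_startup + length


def chained_store_times(add_startup, store_startup, length):
    """Cycle at which each element is stored when the two are chained."""
    # Element i leaves the adder at cycle add_startup + i + 1 and is
    # stored store_startup cycles later.
    return [add_startup + i + 1 + store_startup for i in range(length)]


# Assumed values: 4-cycle add startup, 2-cycle store startup, 64 elements.
without_chaining = unchained_cycles(4, 2, 64)
with_chaining = chained_cycles(4, 2, 64)
store_times = chained_store_times(4, 2, 64)
```

In this model the chained pair pays each startup once and streams the
elements through both function units together, so the element count is
counted once instead of twice.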
Instruction chaining only works if all the data to be processed is available
at the beginning of the pipeline. If the result of one operation must be
used as input to another operation in the same data stream, instruction
chaining cannot be used.