Technical data
Vector Processing Concepts
A scalar processor operates on single quantities of data. Consider the
following operation: A(I) = B(I) + 50. As illustrated in Figure 1–1, five
separate instructions must be performed, using one instruction per unit
of time, for each value from 1 to Imax (some CPUs may combine steps and
use fewer units of time):
1 Load first element from location B.
2 Add 50 to the first element.
3 Store the result in location A.
4 Increment the counter.
5 Test the counter for index Imax.
If Imax is reached, the operation is complete. If not, steps 1 through 5 are
repeated. Calculating A(I) for 100 elements with these instructions takes
5 x 100, or 500, scalar instructions and therefore 500 units of computer time.
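The five-step scalar loop can be sketched in Python as a rough model. This is an illustration only: the function name scalar_add and the steps counter are hypothetical, and each step is assumed to cost exactly one unit of time, which real CPUs may not match.

```python
def scalar_add(b, constant):
    """Model the five-step scalar loop; return (result, steps executed)."""
    if not b:
        return [], 0
    a = [0] * len(b)
    steps = 0
    i = 0
    while True:
        element = b[i]              # step 1: load one element from B
        steps += 1
        total = element + constant  # step 2: add the constant to it
        steps += 1
        a[i] = total                # step 3: store the result in A
        steps += 1
        i += 1                      # step 4: increment the counter
        steps += 1
        steps += 1                  # step 5: test the counter against Imax
        if i == len(b):
            break
    return a, steps


b = list(range(100))        # sample data: 100 elements (assumed values)
a, steps = scalar_add(b, 50)
print(steps)                # 5 steps x 100 elements
```

With 100 elements the counter reaches 500, matching the 5 x 100 figure in the text.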
Because a vector processor operates on complete vectors of independent data
at the same time, it needs only three instructions to perform the same
operation A(I):
1 Load the array from memory into the vector processor register B.
2 Add 50 to all elements in the array, placing the results in register A.
3 Store the entire vector back into memory.
This flow of data optimizes the use of memory and reduces the overhead
of performing each operation, because the counter no longer has to be
incremented and tested for every element. Within the vector processor, much
the same processing may occur as in the scalar processor, but the vector
processor is optimized to do it faster. It is important to remember that
vector operations generate the same results as scalar operations.
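The three vector instructions, and the claim that they produce the same result as the scalar loop, can be modeled with a short Python sketch. The names vector_add, register_b, and register_a are hypothetical, and the sketch assumes the whole array fits in one vector register. Python still processes the list element by element internally, so the instruction count is a model of the concept, not a measurement.

```python
def vector_add(memory_b, constant):
    """Model the three vector instructions; return (result, instruction count)."""
    instructions = 0

    register_b = list(memory_b)                      # 1: load the array into register B
    instructions += 1

    register_a = [x + constant for x in register_b]  # 2: add to all elements, result in register A
    instructions += 1

    memory_a = list(register_a)                      # 3: store the entire vector back to memory
    instructions += 1

    return memory_a, instructions


b = list(range(100))            # sample data: 100 elements (assumed values)
a_vector, count = vector_add(b, 50)

# Element-at-a-time reference result, for comparison.
a_scalar = []
for x in b:
    a_scalar.append(x + 50)

print(count)                    # number of modeled vector instructions
print(a_vector == a_scalar)     # vector and scalar results agree
```

The modeled instruction count stays at three regardless of array length, while the result is identical to the element-at-a-time computation.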