Technical data

Vector Processing Concepts
I/O device, controlled under program direction through special registers
and operating asynchronously from the host. Program data is moved back
and forth between the attached processor and the host with standard I/O
operations. The host processor requires no special internal hardware to
use an attached vector processor.
There is no "pairing" of a host processor to an attached vector processor. A
system can have multiple host scalar processors and one attached vector
processor. Some systems can also have one host processor and a number
of attached vector processors, all driven by a program executing on the
host.
Because it runs in parallel with its host scalar CPU, an attached vector
processor can perform well for suitable applications. However, attached
vector processors can be difficult to program, and the need to use I/O
operations to transfer program data between processors can result in
very high overhead. If the data format of the attached processor is
different from that of the host system, the data files must also be
converted on input and output.
To perform well on an attached vector processor, an application must have
a high percentage of vector operations that need no I/O support from the
host. Also, the computation time of those vector operations should be
long compared to the time spent in any required I/O operations.
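The two conditions above can be illustrated with a rough timing model. The
following is a minimal sketch, not a measurement of any real system: the
function name, its parameters, and the simple additive model (which ignores
any overlap between host and attached processor) are all assumptions made
for illustration.

```python
def attached_speedup(scalar_time, vector_fraction, vector_speedup, io_time):
    """Estimate overall speedup from offloading work to an attached
    vector processor (hypothetical additive model).

    scalar_time     -- run time of the whole program on the host alone
    vector_fraction -- fraction of that time spent in vectorizable work
    vector_speedup  -- how much faster the attached processor runs that work
    io_time         -- time spent in I/O operations moving data to and from
                       the attached processor
    """
    if not 0.0 <= vector_fraction <= 1.0:
        raise ValueError("vector_fraction must be between 0 and 1")
    if vector_speedup <= 0.0:
        raise ValueError("vector_speedup must be positive")

    # Work that stays on the host scalar processor.
    scalar_part = scalar_time * (1.0 - vector_fraction)
    # Vectorizable work, executed on the attached processor.
    vector_part = scalar_time * vector_fraction / vector_speedup
    # The I/O transfer time is pure overhead in this model.
    return scalar_time / (scalar_part + vector_part + io_time)


# Same program, same attached processor, different transfer overhead.
low_io = attached_speedup(100.0, 0.75, 5.0, 10.0)
high_io = attached_speedup(100.0, 0.75, 5.0, 60.0)
print(low_io, high_io)
```

In this model, once the transfer time grows to match the time saved on the
vector work, offloading gains nothing at all, which is why the vector
computation must be long relative to the I/O it requires.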
An integrated vector processor, on the other hand, consists of a
coprocessor that is tightly coupled with its host scalar processor; the
two processors are considered a pair. The scalar processor is specifically
designed to support its vector coprocessor, and the vector processor
instruction set is implemented as part of the host’s native instruction
set. The two processors share the same memory and transfer program
instructions and data over a dedicated high-speed internal path. They
may also share special CPU resources such as cache or translation buffer
entries. Since they share a common memory, no I/O operations are needed
to transfer data between them. Thus, programs with a high ratio of
data access to arithmetic operations will perform more efficiently on an
integrated vector processor than on an attached vector processor.
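To make the last point concrete, the sketch below compares the per-element
cost on the two kinds of vector processor as the amount of arithmetic per
data element changes. It is an illustrative model only: the function name,
the parameters, and the cost figures in the usage lines are hypothetical
and do not describe any particular hardware.

```python
def attached_to_integrated_ratio(ops_per_element, op_time,
                                 io_time_per_element, mem_time_per_element):
    """Return (attached time) / (integrated time) for one data element.

    ops_per_element      -- arithmetic operations performed per element
    op_time              -- time per arithmetic operation (same on both)
    io_time_per_element  -- cost of moving one element by I/O operations
                            (attached processor)
    mem_time_per_element -- cost of one shared-memory access
                            (integrated processor)
    """
    attached = ops_per_element * op_time + io_time_per_element
    integrated = ops_per_element * op_time + mem_time_per_element
    return attached / integrated


# Few operations per element: data access dominates, so the attached
# processor pays heavily for its I/O transfers.
data_bound = attached_to_integrated_ratio(1, 1.0, 9.0, 1.0)
# Many operations per element: arithmetic dominates and the gap narrows.
compute_bound = attached_to_integrated_ratio(100, 1.0, 9.0, 1.0)
print(data_bound, compute_bound)
```

Under these assumed costs, the attached processor's penalty is largest when
each element receives little arithmetic, and it shrinks toward parity as
the arithmetic per element grows.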
An integrated vector processor can run synchronously or asynchronously
with its host scalar processor, depending on the hardware implementation.
When the scalar processor fetches and decodes a vector instruction,
it passes the instruction to the vector processor. At that point, the
scalar processor can either wait for the vector processor to complete
the instruction, or it can continue executing and synchronize with the
vector processor at a later time. Integrated processors that have this