Technical data
VAX 6000 Series Vector Processor
2.3 VECTOR CONTROL UNIT
The vector control unit buffers the instructions it receives and controls
their issue to the other functional units in the vector module. It is
responsible for all scalar/vector communication and contains the register
scoreboarding needed to control instruction overlap. The scoreboard
implements the algorithms that permit chaining of arithmetic operations
into store operations.
To summarize, the vector control unit performs the following functions:
• Interface to the scalar processor. The vector control unit receives
instructions from the scalar module and returns status to it.
• Instruction issue. The vector control unit issues instructions to the
other functional units of the vector module and maintains a register
scoreboard for the detection of interinstruction dependencies.
• Cache data (CD) bus master control. It relinquishes partial control to
the load/store unit during execution of load/store instructions.
• Implementation of the Vector Count Register (VCR), Vector Processor
Status Register (VPSR), Vector Length Register (VLR), Vector
Arithmetic Exception Register (VAER), and Vector Memory Activity
Check Register (VMAC).
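The dependency detection mentioned above can be illustrated with a small
software model. This is only a sketch under stated assumptions, not the
scoreboard logic of the vector module: the Instr and Scoreboard classes, the
instruction names, and the hazard labels (RAW, WAW, WAR) are hypothetical and
chosen for illustration, and the model stalls on every dependency rather than
chaining arithmetic results into stores.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set, Tuple


@dataclass(frozen=True)
class Instr:
    """Hypothetical vector instruction: a name, one destination, some sources."""
    name: str
    dest: Optional[str]
    srcs: Tuple[str, ...] = ()


class Scoreboard:
    """Tracks which registers still have reads or writes in flight."""

    def __init__(self) -> None:
        self.pending_writes: Set[str] = set()
        self.pending_reads: Dict[str, int] = {}

    def hazards(self, instr: Instr) -> List[Tuple[str, str]]:
        """Return (kind, register) pairs that would block issue of instr."""
        found = []
        for src in instr.srcs:
            if src in self.pending_writes:
                found.append(("RAW", src))  # source not yet written
        if instr.dest is not None:
            if instr.dest in self.pending_writes:
                found.append(("WAW", instr.dest))  # earlier write outstanding
            if self.pending_reads.get(instr.dest, 0) > 0:
                found.append(("WAR", instr.dest))  # earlier read outstanding
        return found

    def issue(self, instr: Instr) -> bool:
        """Issue instr if it has no hazards; return whether it was issued."""
        if self.hazards(instr):
            return False
        for src in instr.srcs:
            self.pending_reads[src] = self.pending_reads.get(src, 0) + 1
        if instr.dest is not None:
            self.pending_writes.add(instr.dest)
        return True

    def retire(self, instr: Instr) -> None:
        """Release the registers held by a previously issued instruction."""
        for src in instr.srcs:
            self.pending_reads[src] -= 1
        if instr.dest is not None:
            self.pending_writes.discard(instr.dest)


sb = Scoreboard()
add = Instr("add", dest="V2", srcs=("V0", "V1"))
mul = Instr("mul", dest="V3", srcs=("V2", "V0"))
sb.issue(add)    # no outstanding work, so the add issues
sb.hazards(mul)  # the multiply reads V2, which the add has not yet written
sb.retire(add)
sb.issue(mul)    # the dependency is gone, so the multiply issues
```

A real scoreboard would track progress per element so that a dependent
instruction can start before its predecessor finishes; the model above
deliberately omits that to keep the dependency check itself visible.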
2.4 ARITHMETIC UNIT
All register-to-register vector instructions are handled by the arithmetic
unit. Each vector register file chip contains every fourth element of
the vector registers, thus permitting four-way parallelism. These chips
receive instructions from the vector control unit and data from the cache or
load/store unit, read operands from the registers, and write results back into
the registers or into the mask register. If two 32-bit operands come over
in a single 64-bit transfer, they can be read or written by two separate
register file chips.
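The every-fourth-element layout described above can be sketched as a simple
index mapping. The chip numbering (element i on chip i mod 4), the register
length of 64 elements, and all function names are assumptions made for
illustration; the text states only that each chip holds every fourth element.

```python
NUM_CHIPS = 4               # four-way parallelism, per the text
ELEMENTS_PER_REGISTER = 64  # assumed vector register length


def chip_for_element(i: int) -> int:
    """Chip assumed to hold element i of any vector register."""
    return i % NUM_CHIPS


def slot_on_chip(i: int) -> int:
    """Position of element i within its chip's share of the register."""
    return i // NUM_CHIPS


def elements_on_chip(chip: int, vector_length: int = ELEMENTS_PER_REGISTER):
    """Elements of a vector of the given length that reside on one chip."""
    return list(range(chip, vector_length, NUM_CHIPS))


def groups_needed(vector_length: int) -> int:
    """Number of four-element groups needed to cover the vector."""
    return -(-vector_length // NUM_CHIPS)  # ceiling division


for chip in range(NUM_CHIPS):
    print(chip, elements_on_chip(chip, vector_length=10))
```

Under this mapping, four consecutive elements always fall on four different
chips, so one group of four elements can be processed in parallel.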
The register set has four 64-bit ports (one read/write for memory data,
two for read operands, and one for writing results). While one instruction
is writing its results, a second can start reading its operands, thus hiding
the instruction pipeline delay. Variations in pipeline length between
instructions are smoothly handled so that no gaps exist in the flow
of write data. The register file can hold two outstanding arithmetic
instructions in its internal queue. The arithmetic unit executes two
arithmetic instructions in about the time it takes one load or store