Technical data

Optimizing with MACRO-32

3.3.4.2 Precise Exceptions

The vector processor produces precise exceptions for memory management
faults. When a memory management exception occurs, microcode and
operating system handlers are used to fix the exception.

The vector processor cannot service Translation Not Valid and Access
Control Violation faults. To handle these exceptions, the vector processor
passes sufficient state data back to the scalar CPU. Then, if a memory
management fault occurs, the microcode can build a vector exception
frame on the stack so that vector processor memory management
exceptions are handled precisely and the faulting instruction is restarted.

To enforce synchronous operation, after a vector load/store operation is
issued, the scalar CPU will not issue additional instructions until memory
management has completed. To reduce the delay from the issue of a
load/store instruction to the issue of the next instruction, the load/store
unit has special logic that predicts when load/store instructions can
proceed fault-free. When the load/store unit knows it can perform all
virtual-to-physical translations without incurring a memory management
fault, it issues the MMOK signal to the vector control unit. The scalar
CPU is then released to issue more instructions while the load/store unit
completes the remainder of the data transfers. This mechanism reduces
the overhead associated with providing precise memory management
faults.
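
The early-release behavior described above can be illustrated with a small
toy model. This is only a sketch under stated assumptions, not a description
of the actual load/store logic: the function names, the 512-byte page size,
the 8-byte element size, and the per-page and per-element cycle costs are all
hypothetical values chosen for illustration.

```python
# Toy model of early scalar-CPU release via an MMOK-style signal.
# All names, the page size, and the cycle costs are illustrative
# assumptions, not values taken from the hardware.

PAGE_SIZE = 512  # bytes; assumed page size for this sketch


def pages_touched(base, stride, count, elem_size=8):
    """Return the set of virtual page numbers a strided vector access touches."""
    pages = set()
    for i in range(count):
        first = base + i * stride
        last = first + elem_size - 1
        pages.update(range(first // PAGE_SIZE, last // PAGE_SIZE + 1))
    return pages


def issue_vector_load(base, stride, count, valid_pages,
                      check_cost=1, transfer_cost=1):
    """Simulate one vector load.

    Returns (outcome, scalar_release_cycle, transfer_done_cycle).
    """
    pages = pages_touched(base, stride, count)
    check_done = check_cost * len(pages)
    if not pages <= valid_pages:
        # Some translation would fault: no MMOK is raised, and the scalar
        # CPU stays synchronized so the exception can be reported precisely.
        return ("fault", None, None)
    # Every translation is known to succeed: MMOK is raised and the scalar
    # CPU is released while the element transfers continue.
    transfer_done = check_done + transfer_cost * count
    return ("mmok", check_done, transfer_done)
```

In this model, a 64-element load that stays within one valid page releases the
scalar CPU after the single translation check, long before the last element
is transferred; the same load shifted so that it crosses into an invalid page
never raises MMOK.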

3.4 INSTRUCTION FLOW

Vector instructions are read from the scalar CPU’s I-stream. The scalar
issue unit decodes the vector instructions and passes them to the vector
CPU. The instructions are decoded by the vector control unit and then
issued to the appropriate function unit through the internal bus. Before
instruction issue, the instruction is checked against a register scoreboard
to verify that it will not use a corrupted register or attempt to modify
a register that is already in use. Load, store, scatter, and gather
instructions are processed in the Load/Store chip. These instructions
either fetch data from memory and place it in the vector register file or
write data from the vector register file to memory.
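
The scoreboard check performed before issue can be sketched as follows. This
is a minimal illustration only: the `Scoreboard` class, its fields, and the
exact hazard rules are assumptions made for the example. In particular, it
interprets a "corrupted" register as one with a write still pending, and a
register "in use" as one that an in-flight instruction still reads or writes.

```python
from collections import Counter


# Toy register scoreboard. The class, its fields, and the hazard rules
# are illustrative assumptions, not the documented hardware design.
class Scoreboard:
    def __init__(self):
        self.pending_writes = set()     # registers with a write in flight
        self.pending_reads = Counter()  # in-flight reader count per register

    def can_issue(self, dest, sources):
        # A source with a pending write does not hold its final value yet.
        if any(src in self.pending_writes for src in sources):
            return False
        # The destination must not be read or written by an in-flight
        # instruction.
        if dest in self.pending_writes or self.pending_reads[dest] > 0:
            return False
        return True

    def issue(self, dest, sources):
        """Record the instruction as in flight; return False if it must wait."""
        if not self.can_issue(dest, sources):
            return False
        self.pending_writes.add(dest)
        self.pending_reads.update(sources)
        return True

    def retire(self, dest, sources):
        """Release the registers held by a completed instruction."""
        self.pending_writes.discard(dest)
        self.pending_reads.subtract(sources)
```

For example, after `issue("V2", ["V0", "V1"])`, an instruction that reads V2
or writes V0 is held back until `retire("V2", ["V0", "V1"])` is called, while
an independent instruction such as `issue("V5", ["V1"])` can proceed.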

Arithmetic instructions are passed to the arithmetic pipelines by way
of control registers in the register file chips. An arithmetic instruction
has a fixed startup latency. To minimize the effects of this latency,
the arithmetic pipelines support the ability to queue two arithmetic
instructions. This permits the arithmetic pipeline controller to start
the second instruction without any startup latency. The removal of
