Datasheet

1-8 MCF5407 User’s Manual

ColdFire Module Description

1.3.1.2 Operand Execution Pipeline (OEP)

The prefetched instruction stream is gated from the FIFO buffer into the five-stage OEP.

The OEP consists of two traditional two-stage RISC compute engines, each with a register file

access feeding an arithmetic/logic unit (ALU). The compute engine located at the top of the

OEP is typically used for operand memory address calculations (the address ALU), while

the compute engine located at the bottom of the pipeline is used for instruction execution

(the execution ALU). The resulting structure provides 3.9 Gbytes/s data operand

bandwidth at 162 MHz to the two compute engines and supports single-cycle execution

speeds for most instructions, including all load, store, and most embedded-load operations.

In response to comments from users and developers, the V4 design supports execution of the

ColdFire Revision B instruction set, which adds a small number of new instructions to

improve performance and code density.
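
As a rough sanity check on the operand bandwidth quoted above (3.9 Gbytes per second at 162 MHz), the figure can be converted into bytes delivered per clock cycle. The following is a minimal Python sketch of that arithmetic only; the function name is illustrative, and it assumes 1 Gbyte = 10^9 bytes.

```python
def bytes_per_cycle(bandwidth_bytes_per_s: float, clock_hz: float) -> float:
    """Return the average number of operand bytes delivered per clock cycle."""
    return bandwidth_bytes_per_s / clock_hz


# Figures quoted in the text: 3.9 Gbytes per second at a 162 MHz clock.
# Assumption: 1 Gbyte = 1e9 bytes (decimal prefix).
per_cycle = bytes_per_cycle(3.9e9, 162e6)
print(f"{per_cycle:.1f} bytes per cycle")  # prints "24.1 bytes per cycle"
```

The result is close to 24 bytes per cycle. That would be consistent with six 32-bit operands per cycle shared between the two compute engines, although the text does not state this breakdown explicitly.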

The OEP also implements two advanced performance features. It dynamically determines

the appropriate location of instruction execution (either in the address ALU or the execution

ALU) based on the pipeline state. The address compute engine, in conjunction with register

renaming resources, can be used to execute a number of heavily used opcodes and forward

the results to subsequent instructions without any pipeline stalls. Additionally, the OEP

implements instruction folding techniques involving MOVE instructions so that two

instructions can be issued in a single machine cycle. The resulting microarchitecture

approaches the performance of a full superscalar implementation, but at a much lower

silicon cost.

1.3.1.3 MAC Module

The MAC unit provides signal processing capabilities for the MCF5407 in a variety of

applications including digital audio and servo control. Integrated as an execution unit in the

processor’s OEP, the MAC unit implements a three-stage arithmetic pipeline optimized for

16 x 16 multiplies. The design supports both 16- and 32-bit input operands, with a full set

of extensions for signed and unsigned integers as well as signed, fixed-point

fractional input operands.
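
To illustrate the difference between integer and fixed-point fractional operands in a multiply-accumulate, here is a minimal Python sketch. It is an arithmetic model only, not a description of the MAC unit's pipeline. The helper names are hypothetical, the Q15 input and Q31 accumulator formats are assumptions chosen to match a 16 x 16 multiply, and overflow and saturation are deliberately not modeled.

```python
def to_q15(x: float) -> int:
    """Quantize a value in [-1, 1) to a signed 16-bit Q15 integer (clamped)."""
    q = int(round(x * (1 << 15)))
    return max(-(1 << 15), min((1 << 15) - 1, q))


def mac_int(acc: int, a: int, b: int) -> int:
    """Signed integer multiply-accumulate: acc + a * b."""
    return acc + a * b


def mac_frac(acc_q31: int, a_q15: int, b_q15: int) -> int:
    """Fractional multiply-accumulate.

    Q15 x Q15 yields a Q30 product; shifting left by one bit aligns it
    with a Q31 accumulator. Overflow is not modeled in this sketch.
    """
    return acc_q31 + ((a_q15 * b_q15) << 1)


def q31_to_float(v: int) -> float:
    """Convert a Q31 fixed-point value back to a float for inspection."""
    return v / (1 << 31)


# Integer mode: 10 + 3 * (-4)
int_result = mac_int(10, 3, -4)

# Fractional mode: 0.5 * 0.25 accumulated into an empty accumulator
frac_result = q31_to_float(mac_frac(0, to_q15(0.5), to_q15(0.25)))
```

In this model the same multiplier serves both modes; only the alignment of the product differs, which is why fractional support is often described as an extension of the integer multiply.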

1.3.1.4 Integer Divide Module

Integrated into the OEP, the divide module performs operations using signed and unsigned

integers. The module supports word and longword divides producing quotients and/or

remainders.
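
The relationship between quotient and remainder in a signed divide can be sketched as follows. This minimal Python example assumes the convention common to the 68K family, in which the quotient truncates toward zero and the remainder takes the sign of the dividend; that convention is not stated in this section and should be confirmed against the instruction set reference. The function name is hypothetical, and operand-width limits (word versus longword) are not modeled.

```python
from typing import Tuple


def div_trunc(dividend: int, divisor: int) -> Tuple[int, int]:
    """Signed divide returning (quotient, remainder).

    Assumption: quotient truncates toward zero, so the remainder has the
    sign of the dividend. Python's own // operator floors instead, which
    is why the magnitude is divided first and the sign applied afterwards.
    """
    if divisor == 0:
        raise ZeroDivisionError("divide by zero")
    quotient = abs(dividend) // abs(divisor)
    if (dividend < 0) != (divisor < 0):
        quotient = -quotient
    remainder = dividend - quotient * divisor
    return quotient, remainder


q, r = div_trunc(-7, 2)  # q is -3 and r is -1 under the truncating convention
```

Under this convention the identity dividend = quotient * divisor + remainder always holds, so software that needs only one of the two results can discard the other.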

1.3.2 Harvard Architecture

A Harvard memory architecture implements separate instruction and data buses to the

processor-local memories, removing conflicts between instruction fetches and operand

accesses.