
AMD Athlon Processor Microarchitecture 133
22007E/0 November 1999 AMD Athlon Processor x86 Code Optimization
return stack. Subsequent RETs pop a predicted return address
off the top of the stack.
Early Decoding
The DirectPath and VectorPath decoders perform
early decoding of instructions into MacroOPs. A MacroOP is a
fixed-length instruction which contains one or more OPs. The
outputs of the early decoders keep all (DirectPath or
VectorPath) instructions in program order. Early decoding
produces three MacroOPs per cycle from either path. The
outputs of both decoders are multiplexed together and passed
to the next stage in the pipeline, the instruction control unit.
When the target 16-byte instruction window is obtained from
the instruction cache, the predecode data is examined to
determine which type of basic decode should occur:
DirectPath or VectorPath.
DirectPath Decoder
DirectPath instructions can be decoded directly into a
MacroOP, and subsequently into one or two OPs in the final
issue stage. A DirectPath instruction is limited to those x86
instructions that can be further decoded into one or two OPs.
Instruction length does not determine whether an x86 instruction
is DirectPath. A maximum of three DirectPath x86 instructions
can occupy a given aligned 8-byte block. Sixteen bytes are fetched at
a time. Therefore, up to six DirectPath x86 instructions can be
passed into the DirectPath decode pipeline.
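
The block limit above can be checked with a small script. The following is a minimal sketch, not an AMD tool: it assumes that an instruction "occupies" the aligned 8-byte block containing its first byte, and the function name crowded_blocks and its list of instruction lengths are hypothetical, introduced only for illustration.

```python
from collections import Counter

BLOCK_BYTES = 8      # aligned block size stated in the text
WINDOW_BYTES = 16    # bytes fetched at a time, per the text
MAX_PER_BLOCK = 3    # DirectPath instructions per aligned 8-byte block

# Upper bound per fetch window: two blocks of three instructions each.
MAX_PER_WINDOW = (WINDOW_BYTES // BLOCK_BYTES) * MAX_PER_BLOCK


def crowded_blocks(start_address, lengths):
    """Return {block_base_address: count} for every aligned block that
    holds more than MAX_PER_BLOCK instruction start addresses.

    Assumption: an instruction is attributed to the block containing its
    first byte. `lengths` is a hypothetical list of instruction sizes in
    bytes, in program order, beginning at `start_address`.
    """
    counts = Counter()
    address = start_address
    for length in lengths:
        block_base = (address // BLOCK_BYTES) * BLOCK_BYTES
        counts[block_base] += 1
        address += length
    return {base: n for base, n in counts.items() if n > MAX_PER_BLOCK}
```

Under this model, four 2-byte instructions starting at an aligned address all begin in the same 8-byte block and exceed the limit, while a 3-3-2 byte pattern fills each block with exactly three instructions.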
VectorPath Decoder
Uncommon x86 instructions requiring two or more MacroOPs
proceed down the VectorPath pipeline. The sequence of
MacroOPs is produced by an on-chip ROM known as the MROM.
The VectorPath decoder can produce up to three MacroOPs per
cycle. Decoding a VectorPath instruction may prevent the
simultaneous decode of a DirectPath instruction.
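
The split between the two decode paths can be sketched as a toy model. The classification rule (one MacroOP is DirectPath, two or more is VectorPath) and the rate of three MacroOPs per cycle come from the text. Everything else is an assumption made for illustration: the model treats a VectorPath instruction as always decoding alone, although the text says only that it may prevent simultaneous DirectPath decode, and it ignores every other pipeline limit. The function names are hypothetical.

```python
import math

MACROOPS_PER_CYCLE = 3  # early-decode output rate stated in the text


def decode_path(macroops):
    """Classify an instruction: one MacroOP -> DirectPath, more -> VectorPath."""
    if macroops < 1:
        raise ValueError("an instruction needs at least one MacroOP")
    return "DirectPath" if macroops == 1 else "VectorPath"


def estimate_decode_cycles(macroop_counts):
    """Toy estimate of early-decode cycles for instructions in program order.

    Assumptions (not from the source): a VectorPath instruction always
    decodes alone and takes ceil(n / 3) cycles; DirectPath instructions
    pack up to three per cycle with no other restrictions.
    """
    cycles = 0
    pending_direct = 0  # DirectPath MacroOPs gathered for the current cycle
    for n in macroop_counts:
        if decode_path(n) == "DirectPath":
            if pending_direct == MACROOPS_PER_CYCLE:
                cycles += 1          # current cycle is full, start a new one
                pending_direct = 0
            pending_direct += 1
        else:
            if pending_direct:
                cycles += 1          # close the partly filled DirectPath cycle
                pending_direct = 0
            cycles += math.ceil(n / MACROOPS_PER_CYCLE)
    if pending_direct:
        cycles += 1
    return cycles
```

In this model, a VectorPath instruction placed between two DirectPath instructions costs more cycles than the same three instructions would if all were DirectPath, which is the effect the text warns about. Actual hardware behavior may differ.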