Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 1: Basic Architecture


INTEL 64 AND IA-32 ARCHITECTURES

• Advanced Dynamic Execution

— Deep, out-of-order, speculative execution engine

• Up to 126 instructions in flight

• Up to 48 loads and 24 stores in pipeline¹

— Enhanced branch prediction capability

• Reduces the misprediction penalty associated with deeper pipelines

• Advanced branch prediction algorithm

• 4K-entry branch target array

• New cache subsystem

— First level caches

• Advanced Execution Trace Cache stores decoded instructions

• Execution Trace Cache removes decoder latency from main execution loops

• Execution Trace Cache integrates path of program execution flow into a single line

• Low latency data cache

— Second level cache

• Full-speed, unified 8-way Level 2 on-die Advance Transfer Cache

• Bandwidth and performance increase with processor frequency

• High-performance, quad-pumped bus interface to the Intel NetBurst microarchitecture system bus

— Supports quad-pumped, scalable bus clock to achieve up to 4X effective speed

— Capable of delivering up to 8.5 GBytes of bandwidth per second

• Superscalar issue to enable parallelism

• Expanded hardware registers with renaming to avoid register name space limitations

• 64-byte cache line size (transfers data up to two lines per sector)
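
The numeric items in the list above can be sanity-checked with a few lines of arithmetic. The sketch below is illustrative only. The 64-byte line size, the two-lines-per-sector grouping, and the 4X (quad-pumped) transfer rate come from the list; the 8-byte data-bus width and the 266.67 MHz bus clock used in the usage note are assumptions that this page does not state, and all function names are hypothetical.

```python
# Illustrative arithmetic only; not a description of the actual hardware.
CACHE_LINE_BYTES = 64    # from the text: 64-byte cache line size
LINES_PER_SECTOR = 2     # from the text: up to two lines per sector
SECTOR_BYTES = CACHE_LINE_BYTES * LINES_PER_SECTOR  # 128 bytes per sector


def peak_bus_bandwidth(bus_clock_hz, transfers_per_clock=4, bus_width_bytes=8):
    """Peak bytes per second for a bus that moves `transfers_per_clock`
    transfers of `bus_width_bytes` each per bus clock.

    transfers_per_clock=4 models the quad-pumped bus from the text;
    bus_width_bytes=8 is an ASSUMED data-bus width.
    """
    return bus_clock_hz * transfers_per_clock * bus_width_bytes


def line_address(addr):
    """Base address of the 64-byte cache line containing `addr`."""
    return addr & ~(CACHE_LINE_BYTES - 1)


def sector_address(addr):
    """Base address of the 128-byte sector (two lines) containing `addr`."""
    return addr & ~(SECTOR_BYTES - 1)


def line_within_sector(addr):
    """0 if `addr` falls in the first line of its sector, 1 if in the second."""
    return (addr // CACHE_LINE_BYTES) % LINES_PER_SECTOR
```

Under the assumed 266.67 MHz bus clock, `peak_bus_bandwidth(266_666_667) / 1e9` evaluates to roughly 8.5, which is consistent with the 8.5 GBytes per second figure in the list. For the address 0x12345, `line_address` gives 0x12340 and `sector_address` gives 0x12300, so the address lies in the second line of its sector.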

Figure 2-2 is an overview of the Intel NetBurst microarchitecture. This microarchitecture pipeline is made up of three sections: (1) the front end pipeline, (2) the out-of-order execution core, and (3) the retirement unit.
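
The interaction of the three sections can be illustrated with a toy model. This is a sketch under stated assumptions, not a description of the actual pipeline: instructions enter a window in program order, complete whenever their latency has elapsed (possibly out of order), and leave the window in program order. The 126-entry window size is taken from the feature list; the one-instruction-per-cycle front end, the per-instruction latencies, the in-order retirement policy as modeled here, and all names are assumptions made for illustration.

```python
from collections import deque

MAX_IN_FLIGHT = 126  # from the text: up to 126 instructions in flight


def simulate(latencies, max_in_flight=MAX_IN_FLIGHT):
    """Toy model: in-order allocation, out-of-order completion, in-order retirement.

    latencies[i] is a HYPOTHETICAL execution time in cycles for instruction i.
    Returns (completion_order, retirement_order) as lists of instruction indices.
    """
    if max_in_flight < 1:
        raise ValueError("max_in_flight must be at least 1")

    window = deque()   # in-flight instructions, oldest first: (index, finish_cycle)
    done = set()       # indices that have finished executing
    completed = []     # order in which instructions finished executing
    retired = []       # order in which instructions left the window
    next_instr = 0
    cycle = 0

    while len(retired) < len(latencies):
        # (1) Front end: admit one instruction per cycle while the window has room.
        if next_instr < len(latencies) and len(window) < max_in_flight:
            window.append((next_instr, cycle + latencies[next_instr]))
            next_instr += 1

        # (2) Execution core: any in-flight instruction may complete once its
        #     latency has elapsed, regardless of its position in program order.
        for idx, finish in window:
            if finish <= cycle and idx not in done:
                done.add(idx)
                completed.append(idx)

        # (3) Retirement: only the oldest instruction may retire, and only
        #     after it has completed, so retirement follows program order.
        while window and window[0][0] in done:
            retired.append(window.popleft()[0])

        cycle += 1

    return completed, retired
```

For example, `simulate([5, 1, 1])` returns `([1, 2, 0], [0, 1, 2])`: the two short instructions finish before the long one, but all three retire in program order. Shrinking the window to a single entry with `simulate([5, 1, 1], max_in_flight=1)` forces completion into program order as well, which shows why a larger window exposes more parallelism.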

1. Intel 64 and IA-32 processors based on the Intel NetBurst microarchitecture at 90 nm process can handle more than 24 stores in flight.