Intel 64 and IA-32 Architectures Software Developers Manual Volume 1, Basic Architecture

INTEL® 64 AND IA-32 ARCHITECTURES
- Advanced Dynamic Execution
  - Deep, out-of-order, speculative execution engine
    - Up to 126 instructions in flight
    - Up to 48 loads and 24 stores in pipeline (see footnote 1)
  - Enhanced branch prediction capability
    - Reduces the misprediction penalty associated with deeper pipelines
    - Advanced branch prediction algorithm
    - 4K-entry branch target array
- New cache subsystem
  - First level caches
    - Advanced Execution Trace Cache stores decoded instructions
    - Execution Trace Cache removes decoder latency from main execution loops
    - Execution Trace Cache integrates path of program execution flow into a single line
    - Low latency data cache
  - Second level cache
    - Full-speed, unified 8-way Level 2 on-die Advanced Transfer Cache
    - Bandwidth and performance increase with processor frequency
- High-performance, quad-pumped bus interface to the Intel NetBurst microarchitecture system bus
  - Supports quad-pumped, scalable bus clock to achieve up to 4X effective speed
  - Capable of delivering up to 8.5 GBytes of bandwidth per second
- Superscalar issue to enable parallelism
- Expanded hardware registers with renaming to avoid register name space limitations
- 64-byte cache line size (transfers data up to two lines per sector)
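
The bus bandwidth and cache line figures in the list can be checked with a little arithmetic. The sketch below is illustrative only. The 4X multiplier, the 64-byte line size, and the two lines per sector come from the list. The 64-bit data bus width, the 266 MHz base bus clock, and the treatment of a sector as an aligned 128-byte pair of lines are assumptions made here for the example. The function names are hypothetical.

```python
# Values taken from the feature list.
TRANSFERS_PER_CLOCK = 4   # "quad-pumped": four transfers per bus clock
LINE_BYTES = 64           # cache line size
LINES_PER_SECTOR = 2      # up to two lines per sector

# Assumptions for this example (not stated in the list).
BUS_WIDTH_BYTES = 8            # assumed 64-bit data bus
BASE_CLOCK_HZ = 266_000_000    # assumed 266 MHz base bus clock


def peak_bandwidth_bytes_per_s(clock_hz=BASE_CLOCK_HZ):
    """Peak transfer rate: clock x transfers per clock x bytes per transfer."""
    return clock_hz * TRANSFERS_PER_CLOCK * BUS_WIDTH_BYTES


def locate(address):
    """Split a byte address into (sector index, line within sector, byte offset).

    Assumes a sector is an aligned group of LINES_PER_SECTOR adjacent lines.
    """
    sector_bytes = LINE_BYTES * LINES_PER_SECTOR
    sector = address // sector_bytes
    line_in_sector = (address % sector_bytes) // LINE_BYTES
    offset = address % LINE_BYTES
    return sector, line_in_sector, offset
```

Under these assumptions, `peak_bandwidth_bytes_per_s()` gives 8,512,000,000 bytes per second, which is consistent with the "up to 8.5 GBytes" figure, and `locate(0x12C5)` places that byte at offset 5 of the second line of sector 37.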
Figure 2-2 is an overview of the Intel NetBurst microarchitecture. This microarchitecture pipeline is made up of three sections: (1) the front end pipeline, (2) the out-of-order execution core, and (3) the retirement unit.
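
The division of labor among the three sections can be illustrated with a toy model. This is a heavily idealized sketch, not a description of the actual hardware. It assumes every operation is available at cycle 0, that execution units are unlimited, and that the retirement unit commits results in program order. The `Op` class, the `simulate` function, and the latencies used are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Op:
    name: str
    latency: int       # hypothetical execution latency in cycles
    deps: tuple = ()   # names of earlier ops whose results this op needs


def simulate(program):
    """Return (completion order, retirement cycle per op) for a list of Ops."""
    finish = {}  # op name -> cycle at which its result is ready

    # Front end: delivers ops in program order.
    # Out-of-order core: each op starts as soon as its inputs are ready,
    # regardless of whether older, unrelated ops have finished.
    for op in program:
        start = max((finish[d] for d in op.deps), default=0)
        finish[op.name] = start + op.latency

    # Order in which results become available (may differ from program order).
    completion_order = [o.name for o in sorted(program, key=lambda o: finish[o.name])]

    # Retirement unit: an op retires only once it and all older ops are done,
    # so retirement cycles never decrease in program order.
    retire = {}
    last = 0
    for op in program:
        last = max(last, finish[op.name])
        retire[op.name] = last

    return completion_order, retire
```

For example, with `program = [Op("load", 5), Op("add", 1, ("load",)), Op("mul", 2), Op("sub", 1, ("mul",))]`, the independent `mul` and `sub` complete before the slow `load`, yet they do not retire until `load` and `add` have retired.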
1. Intel 64 and IA-32 processors based on the Intel NetBurst microarchitecture on the 90 nm process can handle more than 24 stores in flight.