Zynq-7000 AP SoC Technical Reference Manual www.xilinx.com 65
UG585 (v1.11) September 27, 2016
Chapter 3: Application Processing Unit
Pipeline
The pipeline implemented in the Cortex-A9 CPU employs advanced fetching of instructions and 
branch prediction that decouples the branch resolution from potential memory latency-induced 
instruction stalls. In the Cortex-A9 CPU, up to four instruction-cache lines are pre-fetched to reduce 
the impact of memory latency on the instruction throughput. The CPU fetch unit can continuously 
forward two to four instructions per cycle to the instruction decode buffer to ensure efficient 
superscalar pipeline utilization. The CPU implements a superscalar decoder capable of decoding two 
full instructions per cycle, and any of the four CPU pipelines can select instructions from the issue 
queue. The parallel pipelines support concurrent execution across the two full arithmetic units, 
the load-store unit, and the resolution of any branch in each cycle.
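The effect of selecting ready instructions from an issue queue can be sketched with a small toy model. This is not a model of the Cortex-A9 itself: the names Instr, producers, and count_cycles are hypothetical, every instruction is assumed to complete in one cycle, only true (read-after-write) dependencies are tracked, and the default width of 2 simply mirrors the two-instructions-per-cycle decode rate mentioned above rather than any documented issue width.

```python
from typing import NamedTuple, Tuple


class Instr(NamedTuple):
    dst: str                # register written by the instruction
    srcs: Tuple[str, ...]   # registers read by the instruction


def producers(program):
    """For each instruction, return the indices of the earlier
    instructions whose results it reads (true dependencies only)."""
    last_writer = {}
    deps = []
    for idx, ins in enumerate(program):
        deps.append({last_writer[s] for s in ins.srcs if s in last_writer})
        last_writer[ins.dst] = idx
    return deps


def count_cycles(program, width=2):
    """Count cycles when up to `width` ready instructions are selected
    per cycle, oldest first (assumed single-cycle latency)."""
    deps = producers(program)
    finished = set()
    waiting = list(range(len(program)))
    cycles = 0
    while waiting:
        # An instruction is ready once all of its producers finished
        # in an earlier cycle. The oldest waiting one is always ready.
        selected = [i for i in waiting if deps[i] <= finished][:width]
        finished.update(selected)
        waiting = [i for i in waiting if i not in selected]
        cycles += 1
    return cycles


# Four instructions that all read r0 and are independent of each other.
independent = [Instr(f"r{n}", ("r0",)) for n in range(1, 5)]
# Four instructions where each one reads the result of the previous one.
chain = [Instr(f"r{n}", (f"r{n - 1}",)) for n in range(1, 5)]

print(count_cycles(independent), count_cycles(chain))
```

In this sketch the independent sequence needs two cycles and the dependent chain needs four, which is the gap that parallel pipelines can close only when the instruction stream exposes independent work.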
The Cortex-A9 CPU employs speculative execution of instructions enabled by dynamic renaming of 
physical registers into an available pool of virtual registers. The CPU employs this virtual register 
renaming to eliminate dependencies across registers without jeopardizing the correct execution of 
programs. This feature allows code acceleration through an effective hardware-based unrolling of 
Figure 3-3: Cortex-A9 Architecture
[Block diagram, UG585_c3_04_030712. Labeled blocks: instruction pre-fetch stage with instruction cache and branch prediction (global history buffer, branch target address cache (BTAC), return stack); instruction queue, prediction queue, and fast loop mode; dual instruction decode stage; register rename stage with virtual to physical register pool; instruction queue and 3 + 1 dispatch stage; out-of-order multi-issue with speculation (ALU/MUL, ALU, FPU/NEON, address); out-of-order write-back stage; load-store unit with store buffer; memory system (data cache, auto pre-fetcher, MMU, µTLB); program trace unit, profiling monitor block, CoreSight debug access port, and CoreSight trace.]
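The register renaming described in the Pipeline text can be illustrated with a minimal sketch. It is only an illustration of the general technique, not the Cortex-A9 implementation: the function name rename, the (destination, sources) tuple format, the v0, v1, ... register names, and the unlimited pool size are all assumptions made for the example. Following the wording of this manual, the registers named in the program are renamed into registers drawn from a virtual pool.

```python
from itertools import count


def rename(program):
    """Rewrite (dst, srcs) tuples so that every write targets a fresh
    register from a (here unlimited, hypothetical) virtual pool."""
    fresh = (f"v{n}" for n in count())
    current = {}   # original register name -> pool register holding its value
    renamed = []
    for dst, srcs in program:
        # Sources read whichever pool register currently holds the value;
        # registers never written in this program keep their own name.
        new_srcs = tuple(current.get(s, s) for s in srcs)
        new_dst = next(fresh)   # each write gets a new pool register
        current[dst] = new_dst
        renamed.append((new_dst, new_srcs))
    return renamed


program = [
    ("r1", ("r0",)),   # writes r1
    ("r2", ("r1",)),   # reads r1
    ("r1", ("r3",)),   # reuses r1: must not overtake the read above
    ("r4", ("r1",)),   # reads the second r1
]

renamed = rename(program)
for before, after in zip(program, renamed):
    print(before, "->", after)
```

After renaming, the third instruction writes v2 instead of reusing r1, so it no longer conflicts with the earlier write or read of r1. Only the true dependencies remain (the second instruction on the first, the fourth on the third), which is what allows the two pairs to proceed independently.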










