Zynq-7000 AP SoC Technical Reference Manual www.xilinx.com 66
UG585 (v1.11) September 27, 2016
Chapter 3: Application Processing Unit
loops, and increases the pipeline utilization by removing data dependencies between adjacent
instructions, which also indirectly reduces interrupt latency.
In the Cortex-A9 CPU, dependent load-store instructions can be forwarded for resolution within the
memory system to further reduce pipeline stalls. The core supports up to four data cache line fill
requests, which can originate from automatic or user-driven pre-fetching.
A key feature of this CPU is the out-of-order write back of instructions that enables the pipeline
resources to be released independent of the order in which the system provides the required data.
Load/store instructions can be issued speculatively before the condition of the instruction or a
preceding branch has been resolved, or before the data to be written has become available. If the
condition required for the execution of the load/store fails, any side effects, such as the action
to modify registers, are flushed.
Branch Prediction
To minimize the branch penalty in its highly pipelined CPU, the Cortex-A9 implements both static
and dynamic branch prediction. Static branch prediction is provided by the instructions and is
decided during compilation. Dynamic branch prediction uses the outcome of the previous
executions of a specific branch to determine whether the branch should be taken or not. The
dynamic branch prediction logic employs a global branch history buffer (GHB), a 4,096-entry table
that holds 2-bit prediction information for specific branches and is updated every time a branch
is executed.
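The following C sketch models a 4,096-entry table of 2-bit prediction values to illustrate how
such a table can track branch outcomes. It is a simplified software model, not a description of
the Cortex-A9 hardware: the saturating-counter update rule, the reset value, and the indexing by
low-order branch address bits are assumptions made for illustration, and all names (ghb_model_t,
ghb_index, ghb_reset, ghb_predict, ghb_update) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define GHB_ENTRIES 4096u /* table size stated in the text */

/* One 2-bit value per entry, stored in a byte for simplicity.
 * Assumed encoding: 0 and 1 predict not taken, 2 and 3 predict taken. */
typedef struct {
    uint8_t counter[GHB_ENTRIES];
} ghb_model_t;

/* Hypothetical index function: low bits of the word-aligned branch address.
 * The real hardware indexing scheme is not described in the text. */
unsigned ghb_index(uint32_t branch_addr)
{
    return (branch_addr >> 2) & (GHB_ENTRIES - 1u);
}

/* Put every entry into a known state (assumed here: weakly not taken). */
void ghb_reset(ghb_model_t *g)
{
    assert(g != NULL);
    for (unsigned i = 0; i < GHB_ENTRIES; i++) {
        g->counter[i] = 1u;
    }
}

/* Predict taken when the 2-bit value is in the upper half of its range. */
bool ghb_predict(const ghb_model_t *g, uint32_t branch_addr)
{
    return g->counter[ghb_index(branch_addr)] >= 2u;
}

/* Update the entry with the actual outcome, saturating at 0 and 3. */
void ghb_update(ghb_model_t *g, uint32_t branch_addr, bool taken)
{
    uint8_t *c = &g->counter[ghb_index(branch_addr)];
    if (taken) {
        if (*c < 3u) {
            (*c)++;
        }
    } else {
        if (*c > 0u) {
            (*c)--;
        }
    }
}
```

With this update rule, a single mispredicted iteration (for example, a loop exit) does not
immediately flip the prediction for a branch that is usually taken. At 2 bits per entry, a
4,096-entry table corresponds to 8,192 bits (1 KB) of prediction state.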
Branch execution and overall instruction throughput also benefit greatly from the
implementation of a branch target address cache (BTAC), which holds the target addresses of
recent branches. This 512-entry address cache is organized as 2-way × 256 entries and provides the
target address for a specific branch to the pre-fetch unit before the actual target address is
generated based on the calculation of the effective address and its translation to the physical
address. Additionally, if an instruction loop fits in four BTAC entries, instruction cache accesses are
turned off to lower power consumption.
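The 2-way × 256-entry organization can be illustrated with a small C model of a set-associative
target cache. This is a sketch under stated assumptions, not the hardware implementation: the set
index taken from low-order branch address bits, the use of the full branch address as the tag, and
the round-robin replacement policy are all assumptions, and the names (btac_model_t, btac_set,
btac_flush, btac_lookup, btac_record) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BTAC_WAYS 2u   /* 2-way, as stated in the text */
#define BTAC_SETS 256u /* 256 entries per way, 512 entries in total */

typedef struct {
    bool     valid;
    uint32_t branch_addr; /* full branch address used as the tag (assumed) */
    uint32_t target_addr; /* cached branch target */
} btac_entry_t;

typedef struct {
    btac_entry_t way[BTAC_SETS][BTAC_WAYS];
    uint8_t      next_victim[BTAC_SETS]; /* round-robin replacement (assumed) */
} btac_model_t;

/* Hypothetical set index: low bits of the word-aligned branch address. */
unsigned btac_set(uint32_t branch_addr)
{
    return (branch_addr >> 2) & (BTAC_SETS - 1u);
}

/* Invalidate every entry. */
void btac_flush(btac_model_t *b)
{
    memset(b, 0, sizeof *b);
}

/* Return true and write *target when the branch address hits in either way. */
bool btac_lookup(const btac_model_t *b, uint32_t branch_addr, uint32_t *target)
{
    assert(target != NULL);
    unsigned s = btac_set(branch_addr);
    for (unsigned w = 0; w < BTAC_WAYS; w++) {
        const btac_entry_t *e = &b->way[s][w];
        if (e->valid && e->branch_addr == branch_addr) {
            *target = e->target_addr;
            return true;
        }
    }
    return false;
}

/* Record the resolved target of a branch, replacing a victim way on a miss. */
void btac_record(btac_model_t *b, uint32_t branch_addr, uint32_t target)
{
    unsigned s = btac_set(branch_addr);
    for (unsigned w = 0; w < BTAC_WAYS; w++) {
        btac_entry_t *e = &b->way[s][w];
        if (e->valid && e->branch_addr == branch_addr) {
            e->target_addr = target; /* refresh an existing entry */
            return;
        }
    }
    unsigned v = b->next_victim[s];
    b->way[s][v].valid = true;
    b->way[s][v].branch_addr = branch_addr;
    b->way[s][v].target_addr = target;
    b->next_victim[s] = (uint8_t)((v + 1u) % BTAC_WAYS);
}
```

In this model, two branches that map to the same set can be held at the same time, one per way;
a third branch mapping to that set evicts one of them.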
Note: Both GHB and BTAC RAMs implement parity for protection; however, this support has limited
diagnostic value. Corruption in GHB data or BTAC data does not generate functional errors in the
Cortex-A9 processor. Corruption in GHB data or BTAC data results in faulty branch prediction that is
detected and corrected when the branch gets executed.
The Cortex-A9 CPU can predict conditional branches, unconditional branches, indirect branches,
PC-destination data-processing operations, and branches that switch between ARM and Thumb
states. However, the following branch instructions are not predicted:
• Branches that switch between states (except ARM to Thumb transitions, and Thumb to ARM
transitions)
• Instructions with the S suffix, which are typically used to return from exceptions and have
side effects that can change privilege mode and security state
• All mode-changing instructions
Users can enable program flow prediction by setting the Z bit in the CP15 c1 Control register to 1.
Refer to the System Control Register in the ARM Cortex-A9 Technical Reference Manual (see
Appendix A, Additional Resources). Before program flow prediction is switched on, a BTAC flush
operation must be performed; this has the additional effect of setting the GHB into a known state.
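The enable sequence can be sketched in C with GCC-style inline assembly as shown below. This is a
minimal sketch under stated assumptions, not a vendor-supplied routine: it assumes privileged
execution on an ARMv7-A core, a GCC-compatible compiler, and that the required BTAC flush
corresponds to the ARMv7-A "invalidate all branch predictors" CP15 operation (BPIALL). The Z bit
is bit 11 of the System Control Register in the ARMv7-A architecture. The function names
(sctlr_with_prediction, sctlr_prediction_enabled, enable_branch_prediction) are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Z bit (program flow prediction enable) in the CP15 c1 System Control Register. */
#define SCTLR_Z_BIT (UINT32_C(1) << 11)

/* Pure helper: return the register value with the Z bit set. */
uint32_t sctlr_with_prediction(uint32_t sctlr)
{
    return sctlr | SCTLR_Z_BIT;
}

/* Pure helper: report whether the Z bit is set in a register value. */
bool sctlr_prediction_enabled(uint32_t sctlr)
{
    return (sctlr & SCTLR_Z_BIT) != 0u;
}

#if defined(__arm__) && defined(__GNUC__)
/* Hypothetical enable routine; must run in a privileged mode. */
void enable_branch_prediction(void)
{
    uint32_t sctlr;
    uint32_t zero = 0u;

    /* Flush the branch predictor first (assumed here to be BPIALL). */
    __asm__ volatile("mcr p15, 0, %0, c7, c5, 6" : : "r"(zero) : "memory");
    __asm__ volatile("dsb" : : : "memory");
    __asm__ volatile("isb" : : : "memory");

    /* Read-modify-write the System Control Register to set the Z bit. */
    __asm__ volatile("mrc p15, 0, %0, c1, c0, 0" : "=r"(sctlr));
    sctlr = sctlr_with_prediction(sctlr);
    __asm__ volatile("mcr p15, 0, %0, c1, c0, 0" : : "r"(sctlr) : "memory");
    __asm__ volatile("isb" : : : "memory");
}
#endif
```

The bit manipulation is kept in separate pure functions so that it can be checked on a host
machine; the CP15 accesses are compiled only for 32-bit ARM targets.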










