User manual
Table Of Contents
- Zynq-7000 All Programmable SoC
- Table of Contents
- Ch. 1: Introduction
- Ch. 2: Signals, Interfaces, and Pins
- Ch. 3: Application Processing Unit
- Ch. 4: System Addresses
- Ch. 5: Interconnect
- Ch. 6: Boot and Configuration
- Ch. 7: Interrupts
- Ch. 8: Timers
- Ch. 9: DMA Controller
- Introduction
- Functional Description
- DMA Transfers on the AXI Interconnect
- AXI Transaction Considerations
- DMA Manager
- Multi-channel Data FIFO (MFIFO)
- Memory-to-Memory Transfers
- PL Peripheral AXI Transactions
- PL Peripheral Request Interface
- PL Peripheral - Length Managed by PL Peripheral
- PL Peripheral - Length Managed by DMAC
- Events and Interrupts
- Aborts
- Security
- IP Configuration Options
- Programming Guide for DMA Controller
- Programming Guide for DMA Engine
- Programming Restrictions
- System Functions
- I/O Interface
- Ch. 10: DDR Memory Controller
- Introduction
- AXI Memory Port Interface (DDRI)
- DDR Core and Transaction Scheduler (DDRC)
- DDRC Arbitration
- Controller PHY (DDRP)
- Initialization and Calibration
- DDR Clock Initialization
- DDR IOB Impedance Calibration
- DDR IOB Configuration
- DDR Controller Register Programming
- DRAM Reset and Initialization
- DRAM Input Impedance (ODT) Calibration
- DRAM Output Impedance (RON) Calibration
- DRAM Training
- Write Data Eye Adjustment
- Alternatives to Automatic DRAM Training
- DRAM Write Latency Restriction
- Register Overview
- Error Correction Code (ECC)
- Programming Model
- Ch. 11: Static Memory Controller
- Ch. 12: Quad-SPI Flash Controller
- Ch. 13: SD/SDIO Controller
- Ch. 14: General Purpose I/O (GPIO)
- Ch. 15: USB Host, Device, and OTG Controller
- Introduction
- Functional Description
- Programming Overview and Reference
- Device Mode Control
- Device Endpoint Data Structures
- Device Endpoint Packet Operational Model
- Device Endpoint Descriptor Reference
- Programming Guide for Device Controller
- Programming Guide for Device Endpoint Data Structures
- Host Mode Data Structures
- EHCI Implementation
- Host Data Structures Reference
- Programming Guide for Host Controller
- OTG Description and Reference
- System Functions
- I/O Interfaces
- Ch. 16: Gigabit Ethernet Controller
- Ch. 17: SPI Controller
- Ch. 18: CAN Controller
- Ch. 19: UART Controller
- Ch. 20: I2C Controller
- Ch. 21: Programmable Logic Description
- Ch. 22: Programmable Logic Design Guide
- Ch. 23: Programmable Logic Test and Debug
- Ch. 24: Power Management
- Ch. 25: Clocks
- Ch. 26: Reset System
- Ch. 27: JTAG and DAP Subsystem
- Ch. 28: System Test and Debug
- Ch. 29: On-Chip Memory (OCM)
- Ch. 30: XADC Interface
- Ch. 31: PCI Express
- Ch. 32: Device Secure Boot
- Appx. A: Additional Resources
- Appx. B: Register Details
- Overview
- Acronyms
- Module Summary
- AXI_HP Interface (AFI) (axi_hp)
- CAN Controller (can)
- DDR Memory Controller (ddrc)
- CoreSight Cross Trigger Interface (cti)
- Performance Monitor Unit (cortexa9_pmu)
- CoreSight Program Trace Macrocell (ptm)
- Debug Access Port (dap)
- CoreSight Embedded Trace Buffer (etb)
- PL Fabric Trace Monitor (ftm)
- CoreSight Trace Funnel (funnel)
- CoreSight Intstrumentation Trace Macrocell (itm)
- CoreSight Trace Packet Output (tpiu)
- Device Configuration Interface (devcfg)
- DMA Controller (dmac)
- Gigabit Ethernet Controller (GEM)
- General Purpose I/O (gpio)
- Interconnect QoS (qos301)
- NIC301 Address Region Control (nic301_addr_region_ctrl_registers)
- I2C Controller (IIC)
- L2 Cache (L2Cpl310)
- Application Processing Unit (mpcore)
- On-Chip Memory (ocm)
- Quad-SPI Flash Controller (qspi)
- SD Controller (sdio)
- System Level Control Registers (slcr)
- Static Memory Controller (pl353)
- SPI Controller (SPI)
- System Watchdog Timer (swdt)
- Triple Timer Counter (ttc)
- UART Controller (UART)
- USB Controller (usb)

Zynq-7000 AP SoC Technical Reference Manual www.xilinx.com 648
UG585 (v1.11) September 27, 2016
Chapter 22: Programmable Logic Design Guide
Power
An additional benefit of moving operations to the programmable logic is a reduction in power.
Depending on the operations, programmable logic can reduce power per OP by 10-100x. Thus it
might be useful to implement algorithms in the PL solely to reduce system power.
One issue to be aware of is that if the algorithm requires access to external memory, the energy cost
of accessing the external memory could dominate the energy budget making a reduction in the
operation power irrelevant.
Latency
Parallel logic in the PL has a low predictable delay, and cannot be interrupted. For this reason
algorithms which are used to respond to real time events originating the in PL might best be serviced
by algorithms in the programmable logic. This approach can reduce response time from thousands
of clocks to tens of clocks.
22.2.2 Designing PL Accelerators
Programmable logic accelerators are typically created in the RTL languages Verilog or VHDL.
Experienced RTL engineers can use C-code as a golden model to create an efficient hardware
implementation of the algorithm in programmable logic.
For software programmers who are more comfortable with the C language, C-to-Gates compilers
exist which can allow a user to build HW accelerators using the C language. Keeping in mind that C
is a sequential language, automated compiler methods can be used to map the sequential code to
parallel hardware without user intervention.
For instance, a FOR loop such as “for(i=0; i<10; i++){ x[i]=a[i]+b[i]; }” can be
unrolled to create 10 individual adders all operating in parallel.
For video and DSP algorithms, tools such as Matlab Simulink and Xilinx System Generator can be
used to directly create logic from algorithmic flowgraphs. A primary advantage of using Matlab
Simulink is the rich library of functions which can be used to model, simulate and verify the hardware
implementation.
Dataflow
Regardless of how an accelerator or offload engine is designed, once implemented it requires
efficient dataflow to and from the accelerator. In many cases, scheduling the dataflow between the
accelerator and DRAM can be more of a design challenge than implementing the actual algorithm.
The word dataflow is used to reference the motion of data between system memories and PL
functional units using AXI interconnect and local interconnect.










