Zynq-7000 AP SoC Technical Reference Manual www.xilinx.com 90
UG585 (v1.11) September 27, 2016
Chapter 3: Application Processing Unit
8-bit or 16-bit polynomial computation for single-bit coefficients
Structured data load capabilities
Dual issue with Cortex-A9 processor ARM or Thumb instructions
Independent pipelines for VFPv3 and advanced SIMD instructions
Large, shared register file, addressable as:
° Thirty-two 32-bit S (single) registers
° Thirty-two 64-bit D (double) registers
° Sixteen 128-bit Q (quad) registers
See the ARM Architecture Reference Manual for details of the advanced SIMD instructions and the
NEON MPE operation.
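The three register views listed above overlap in the same storage. The following is a minimal sketch of that aliasing, assuming the standard ARMv7 Advanced SIMD mapping in which S(2n) and S(2n+1) form the low and high halves of D(n) for n below 16, and D(2n) and D(2n+1) form the low and high halves of Q(n). The type neon_regfile_t and the helper names are hypothetical and model the register file as a byte array; they are not part of any Xilinx or ARM API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical software model of the shared register file: 256 bytes. */
typedef struct {
    uint8_t bytes[256];
} neon_regfile_t;

/* Write S register n (0..31). S registers cover only the lower 128 bytes,
 * that is, D0..D15. Bytes are stored least significant first. */
static void write_s(neon_regfile_t *rf, unsigned n, uint32_t value)
{
    assert(n < 32);
    for (unsigned i = 0; i < 4; i++) {
        rf->bytes[4 * n + i] = (uint8_t)(value >> (8 * i));
    }
}

/* Read D register n (0..31) from the same storage. */
static uint64_t read_d(const neon_regfile_t *rf, unsigned n)
{
    assert(n < 32);
    uint64_t value = 0;
    for (unsigned i = 0; i < 8; i++) {
        value |= (uint64_t)rf->bytes[8 * n + i] << (8 * i);
    }
    return value;
}

/* Read Q register n (0..15) as two 64-bit halves: out[0] is D(2n),
 * out[1] is D(2n+1). */
static void read_q(const neon_regfile_t *rf, unsigned n, uint64_t out[2])
{
    assert(n < 16);
    out[0] = read_d(rf, 2 * n);
    out[1] = read_d(rf, 2 * n + 1);
}
```

In this model, writing S2 and S3 changes D1, which in turn is the upper half of Q0. Code that mixes the three views must therefore treat them as one set of registers, not three.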
3.2.8 Performance Monitoring Unit
The Cortex-A9 processor includes a performance monitoring unit (PMU) which provides six counters
to gather statistics on the operation of the processor and memory system. Each counter can count
any of 58 events available in the Cortex-A9 processor. The PMU counters and their associated control
registers are accessible from the internal CP15 interface as well as from the DAP interface. For details,
refer to the Performance Monitoring Unit section in the ARM Cortex-A9 Technical Reference Manual.
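As an illustration of the CP15 access path, the sketch below decodes the number of implemented event counters from the ARMv7 Performance Monitor Control Register (PMCR) and builds the enable bit for one counter. The bit positions follow the ARMv7 PMU architecture. The function names are hypothetical, the sample PMCR value used in the note after the code is illustrative only, and the inline assembly is compiled only for 32-bit ARM targets with a GCC-compatible compiler. Consult the ARM Cortex-A9 Technical Reference Manual before relying on any of these details.

```c
#include <assert.h>
#include <stdint.h>

/* ARMv7 PMCR bit fields. */
#define PMCR_E        (1u << 0)  /* enable all counters */
#define PMCR_P        (1u << 1)  /* reset all event counters */
#define PMCR_C        (1u << 2)  /* reset the cycle counter */
#define PMCR_N_SHIFT  11u        /* N field: number of event counters */
#define PMCR_N_MASK   0x1Fu

#define PMU_NUM_EVENT_COUNTERS 6u /* from the text: six counters */

/* Extract the number of implemented event counters from a PMCR value. */
static unsigned pmcr_num_counters(uint32_t pmcr)
{
    return (pmcr >> PMCR_N_SHIFT) & PMCR_N_MASK;
}

/* Bit to write to PMCNTENSET to enable event counter idx (0..5).
 * The cycle counter uses bit 31 and is not covered by this helper. */
static uint32_t pmu_counter_enable_bit(unsigned idx)
{
    assert(idx < PMU_NUM_EVENT_COUNTERS);
    return 1u << idx;
}

#if defined(__arm__) && defined(__GNUC__)
/* Read PMCR through CP15 (MRC p15, 0, <Rt>, c9, c12, 0).
 * Requires privileged mode unless user-mode access has been enabled. */
static inline uint32_t pmcr_read(void)
{
    uint32_t value;
    __asm__ volatile("mrc p15, 0, %0, c9, c12, 0" : "=r"(value));
    return value;
}
#endif
```

For example, passing the illustrative value 0x41093000 to pmcr_num_counters() yields 6, which matches the six counters described above.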
3.3 Snoop Control Unit (SCU)
3.3.1 Summary
The SCU block connects the two Cortex-A9 processors to the memory subsystem and contains the
logic that manages data cache coherency between the two processors and the L2 cache. This
block is responsible for managing the interconnect arbitration, communication, cache and system
memory transfers, and cache coherence for the Cortex-A9 processors. The APU also exposes the
capabilities of the SCU to system accelerators that are implemented in the PL through the accelerator
coherency port (ACP) interface (see ACP Interface, page 103). This interface allows PL masters to
share and access the processor cache hierarchy. This hardware-managed coherence not only
improves performance but also reduces the software complexity otherwise involved in maintaining
coherency in software within each OS driver.
The SCU block communicates with each of the Cortex-A9 processors through a cache coherency bus
(CCB) and manages the coherency between the L1 and the L2 caches. The SCU supports MESI
snooping, which improves power efficiency and performance by avoiding unnecessary
system accesses. The block implements duplicated 4-way associative tag RAMs that act as a local
directory listing the coherent cache lines held in the CPU L1 data caches. The directory lets the SCU
check quickly whether data is in the L1 data caches without interrupting the processors.
It also lets the SCU direct coherency accesses only to the processor that shares the data.
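The directory lookup can be pictured with a small software model. The sketch below is a simplified illustration, not the SCU implementation: it keeps one 4-way tag array per processor and returns a bit mask of the processors whose L1 data cache holds a given line, so that only those processors need to be snooped. The 4-way associativity comes from the text. The 32-byte line size and 256 sets (a 32 KB L1 data cache) are assumptions, and scu_dir_t, scu_dir_record, and scu_dir_lookup are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_CPUS   2u    /* two Cortex-A9 processors */
#define NUM_WAYS   4u    /* from the text: 4-way associative tag RAMs */
#define NUM_SETS   256u  /* assumption: 32 KB L1 data cache */
#define LINE_SHIFT 5u    /* assumption: 32-byte cache lines */
#define TAG_SHIFT  13u   /* LINE_SHIFT + log2(NUM_SETS) */

/* Hypothetical model of the duplicated tag RAMs, one copy per processor. */
typedef struct {
    uint32_t tag[NUM_CPUS][NUM_SETS][NUM_WAYS];
    bool     valid[NUM_CPUS][NUM_SETS][NUM_WAYS];
} scu_dir_t;

static unsigned set_of(uint32_t addr) { return (addr >> LINE_SHIFT) % NUM_SETS; }
static uint32_t tag_of(uint32_t addr) { return addr >> TAG_SHIFT; }

/* Record that processor cpu holds the line containing addr in the given way.
 * In hardware the SCU updates its directory itself on line fills and
 * evictions; this function only stands in for that update. */
static void scu_dir_record(scu_dir_t *dir, unsigned cpu, unsigned way,
                           uint32_t addr)
{
    assert(cpu < NUM_CPUS && way < NUM_WAYS);
    dir->tag[cpu][set_of(addr)][way] = tag_of(addr);
    dir->valid[cpu][set_of(addr)][way] = true;
}

/* Return a bit mask of processors whose L1 data cache holds the line:
 * bit 0 for CPU0, bit 1 for CPU1. A result of 0 means no snoop is needed. */
static unsigned scu_dir_lookup(const scu_dir_t *dir, uint32_t addr)
{
    unsigned mask = 0;
    unsigned set = set_of(addr);
    for (unsigned cpu = 0; cpu < NUM_CPUS; cpu++) {
        for (unsigned way = 0; way < NUM_WAYS; way++) {
            if (dir->valid[cpu][set][way] &&
                dir->tag[cpu][set][way] == tag_of(addr)) {
                mask |= 1u << cpu;
            }
        }
    }
    return mask;
}
```

Because the lookup reads only the duplicated tags, the processors' own L1 tag RAMs are left undisturbed. A returned mask of 0 corresponds to the case where the SCU can skip the snoop entirely.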