Zynq-7000 AP SoC Technical Reference Manual www.xilinx.com 104
UG585 (v1.11) September 27, 2016
Chapter 3: Application Processing Unit
Note: The transaction can optionally allocate into the L2 cache if the write parameters are set
accordingly.
ACP non-coherent write requests: An ACP write request is non-coherent when AWUSER[0] = 0 or
AWCACHE[1] = 0 alongside AWVALID. In this case, the SCU does not enforce coherency and the write
request is forwarded directly to one of the available SCU AXI master ports.
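The rule above can be expressed as a small predicate. The following is a minimal sketch in C, assuming the coherent case is simply the complement of the non-coherent condition stated above; the function name and argument types are hypothetical and are not part of any Xilinx or ARM API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (illustrative only): classifies an ACP write request
 * from the AWUSER and AWCACHE values presented alongside AWVALID.
 * Per the text, the request is non-coherent when AWUSER[0] = 0 or
 * AWCACHE[1] = 0, so it is treated as coherent only when both bits are 1. */
static bool acp_write_is_coherent(uint8_t awuser, uint8_t awcache)
{
    bool awuser0  = (awuser  & 0x1u) != 0;  /* AWUSER[0]  */
    bool awcache1 = (awcache & 0x2u) != 0;  /* AWCACHE[1] */
    return awuser0 && awcache1;
}
```

For example, a request with AWUSER[0] = 1 but AWCACHE[1] = 0 is classified as non-coherent and would be forwarded directly to an SCU AXI master port.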
ACP Usage
The ACP provides a low-latency path between the PS and accelerators implemented in the PL,
compared with a legacy scheme of cache flushing and loading. An example PL-based accelerator
works through the following steps:
1. The CPU prepares input data for the accelerator within its local cache space.
2. The CPU sends a message to the accelerator using one of the general purpose AXI master
interfaces to the PL.
3. The accelerator fetches the data through the ACP, processes the data, and returns the result
through the ACP.
4. The accelerator sets a flag by writing to a known location to indicate that the data processing is
complete. The processor can poll this flag, or the flag can generate an interrupt.
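The four steps can be sketched as a host-side simulation in C. This is a sketch under stated assumptions, not driver code: the accelerator is replaced by an ordinary function, the message over the general purpose AXI master interface is replaced by a `start` flag, and the structure layout, names, and the summing operation are all hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_WORDS 4

/* Hypothetical job descriptor shared by the CPU and the accelerator model. */
typedef struct {
    uint32_t input[BLOCK_WORDS];  /* step 1: input data prepared by the CPU       */
    uint32_t result;              /* step 3: result returned by the accelerator   */
    volatile bool start;          /* step 2: stands in for the message to the PL  */
    volatile bool done;           /* step 4: completion flag at a known location  */
} acp_job_t;

/* Software stand-in for the PL accelerator. It sums the input block, which is
 * an arbitrary placeholder for the real processing. */
static void fake_accelerator_step(acp_job_t *job)
{
    if (!job->start || job->done) {
        return;                       /* nothing requested, or already finished */
    }
    uint32_t sum = 0;
    for (size_t i = 0; i < BLOCK_WORDS; i++) {
        sum += job->input[i];         /* step 3: fetch and process the data */
    }
    job->result = sum;                /* step 3: return the result */
    job->done = true;                 /* step 4: signal completion */
}

/* CPU side of the handshake. */
static uint32_t cpu_run_job(acp_job_t *job, const uint32_t *data)
{
    for (size_t i = 0; i < BLOCK_WORDS; i++) {
        job->input[i] = data[i];      /* step 1: prepare the input data */
    }
    job->done = false;
    job->start = true;                /* step 2: notify the accelerator */
    while (!job->done) {
        fake_accelerator_step(job);   /* simulation only: advance the model while polling */
    }
    job->start = false;
    return job->result;
}
```

Calling `cpu_run_job` with the block {1, 2, 3, 4} returns 10 in this model. On real hardware the accelerator runs concurrently in the PL, step 2 would be a register write over a general purpose AXI master interface, and the polling loop could be replaced by an interrupt handler.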
Table 3-7 shows ACP read and write behavior based on the current cache status. Access latency
is low when an access hits in the cache.
Compared with a tightly-coupled coprocessor, ACP access latencies are relatively long. Therefore,
the ACP is not recommended for fine-grained, instruction-level acceleration. For coarse-grained
acceleration such as video frame-level processing, the ACP has no clear advantage over traditional
memory-mapped PL acceleration, because the transaction overhead is small relative to the
transaction time, and ACP traffic might cause undesirable cache thrashing. The ACP is therefore
best suited to medium-grained acceleration, such as block-level crypto acceleration and video
macro-block level processing.
Table 3-7: ACP Read and Write Behavior

Action                      Description
ACP read – I (invalid)      SCU fetches data from external memory through one of two AXI
                            master interfaces. Data is forwarded to the ACP directly. It
                            does not affect the CPU L1 cache state.
ACP read – M (modified)     SCU fetches data from the L1 cache with M status. It does not
                            affect the L1 cache state.
ACP read – S (shared)       SCU fetches data from any L1 cache with S status. It does not
                            affect the L1 cache state.
ACP read – E (exclusive)    SCU fetches data from the L1 cache with E status. It does not
                            affect the L1 cache state.
ACP write – I (invalid)     Data is written to external memory through one of two AXI
                            master interfaces. It does not affect the CPU L1 cache state.
ACP write – M (modified)    Data in the L1 cache with M status is flushed out to external
                            memory first. After that, the ACP data is written to external
                            memory. The L1 cache line that previously had M status changes
                            to I status. If the SCU overwrites the entire cache line, the
                            L1 cache flush is skipped.