Nios II Processor Reference Handbook 101 Innovation Drive San Jose, CA 95134 www.altera.com NII5V1-7.
Copyright © 2007 Altera Corporation. All rights reserved. Altera, The Programmable Solutions Company, the stylized Altera logo, specific device designations, and all other words and logos that are identified as trademarks and/or service marks are, unless noted otherwise, the trademarks and service marks of Altera Corporation in the U.S. and other countries. All other product or service names are the property of their respective holders. Altera products are protected under numerous U.S.
Contents
Chapter Revision Dates
About This Handbook
Introduction
Prerequisites
Memory and I/O Organization
Instruction and Data Buses
Cache Memory
Tightly-Coupled Memory
Chapter 4. Instantiating the Nios II Processor in SOPC Builder
Introduction
Core Nios II Page
Core Selection
Instruction Performance
Exception Handling
JTAG Debug Module
Unsupported Features
R-Type
J-Type
Instruction Opcodes
Assembler Pseudo-instructions
Chapter Revision Dates
The chapters in this book, Nios II Processor Reference Handbook, were revised on the following dates. Where chapters or groups of chapters are available separately, part numbers are listed.
Chapter 1. Introduction
Revised: October 2007
Part number: NII51001-7.2.0
Chapter 2. Processor Architecture
Revised: October 2007
Part number: NII51002-7.2.0
Chapter 3. Programming Model
Revised: October 2007
Part number: NII51003-7.2.0
Chapter 4.
About This Handbook Introduction This handbook is the primary reference for the Nios® II family of embedded processors. The handbook describes the Nios II processor from a high-level conceptual description to the low-level details of implementation. The chapters in this handbook define the Nios II processor architecture, the programming model, the instruction set, and more. This handbook is part of a larger collection of documents covering the Nios II processor and its usage.
How to Find Further Information
This handbook is one part of the complete Nios II processor documentation. The following references are also available.
■ The Nios II Software Developer’s Handbook describes the software development environment, and discusses application programming for the Nios II processor.
Typographical Conventions
This document uses the typographic conventions shown below.
Visual Cue | Meaning
Bold Type with Initial Capital Letters | Command names, dialog box titles, checkbox options, and dialog box options are shown in bold, initial capital letters. Example: Save As dialog box.
Bold type | External timing parameters, directory names, project names, disk drive names, filenames, filename extensions, and software utility names are shown in bold type.
Section I. Nios II Processor This section provides information about the Nios® II processor.
1. Introduction NII51001-7.2.0 Introduction This chapter is an introduction to the Nios® II embedded processor family. This chapter helps hardware and software engineers understand the similarities and differences between the Nios II processor and traditional embedded processors.
Getting Started with the Nios II Processor
A Nios II processor system is equivalent to a microcontroller or “computer on a chip” that includes a processor and a combination of peripherals and memory on a single chip. The term “Nios II processor system” refers to a Nios II processor core, a set of on-chip peripherals, on-chip memory, and interfaces to off-chip memory, all implemented on a single Altera device.
Figure 1–1. Example of a Nios II Processor System
[Block diagram. Labeled elements: Reset, Clock, JTAG connection to software debugger, JTAG Debug Module, Nios II Processor Core (Data and Inst. ports), Avalon Switch Fabric, UART (TXD, RXD), SDRAM Controller and SDRAM Memory, On-Chip ROM, Tristate bridge to off-chip memory (Flash Memory, SRAM Memory), Timer1, Timer2, LCD Display Driver and LCD Screen, General-Purpose I/O (buttons, LEDs, etc.).]
Configurable Soft-Core Processor Concepts
Because the pins and logic resources in Altera devices are programmable, many customizations are possible:
■ You can rearrange the pins on the chip to simplify the board design. For example, you can move address and data pins for external SDRAM memory to any side of the chip to shorten board traces.
■ You can use extra pins and logic resources on the chip for functions unrelated to the processor.
Flexible Peripheral Set and Address Map
A flexible peripheral set is one of the most notable differences between Nios II processor systems and fixed microcontrollers. Because of the soft-core nature of the Nios II processor, you can easily build made-to-order Nios II processor systems with the exact peripheral set required for the target applications. A corollary of flexible peripherals is a flexible address map.
OpenCore Plus Evaluation
Because the processor is implemented on reprogrammable Altera FPGAs, software and hardware engineers can work together to iteratively optimize the hardware and test the results of software running on hardware. From the software perspective, custom instructions appear as machine-generated assembly macros or C functions, so programmers do not need to know assembly in order to use custom instructions.
Referenced Documents
This chapter references the following documents:
■ AN 320: OpenCore Plus Evaluation of Megafunctions.
Document Revision History
Table 1–1 shows the revision history for this document.
Table 1–1. Document Revision History
Date & Document Version | Changes Made
October 2007 v7.2.0 | Added OpenCore Plus section.
May 2007 v7.1.0 | ●
March 2007 v7.0.0 | No change from previous release.
November 2006 v6.1.0 | No change from previous release.
May 2006 v6.0.
2. Processor Architecture NII51002-7.2.0 Introduction This chapter describes the hardware structure of the Nios® II processor, including a discussion of all the functional units of the Nios II architecture and the fundamentals of the Nios II processor hardware implementation.
Processor Implementation
Figure 2–1. Nios II Processor Core Block Diagram
[Block diagram. Labeled elements: reset, clock, cpu_resetrequest, cpu_resettaken, JTAG interface to software debugger, JTAG Debug Module, Program Controller & Address Generation, General Purpose Registers r0 to r31, Instruction Cache, Tightly Coupled Instruction Memory, Exception Controller, irq[31..
instruction set, not a particular hardware implementation. A functional unit can be implemented in hardware, emulated in software, or omitted entirely. A Nios II implementation is a set of design choices embodied by a particular Nios II processor core. All implementations support the instruction set defined in the Nios II Processor Reference Handbook. Each implementation achieves specific objectives, such as smaller core size or higher performance.
Arithmetic Logic Unit
The Nios II arithmetic logic unit (ALU) operates on data stored in general-purpose registers. ALU operations take one or two inputs from registers, and store a result back in a register. The ALU supports the data operations shown in Table 2–1:
Table 2–1. Operations Supported by the Nios II ALU
Category | Details
Arithmetic | The ALU supports addition, subtraction, multiplication, and division on signed and unsigned operands.
Floating Point Instructions
The Nios II architecture supports single precision floating point instructions as specified by the IEEE Std 754-1985. These floating point instructions are implemented as custom instructions. Table 2–2 provides a detailed description of the conformance to IEEE 754-1985.
Table 2–2.
The floating point custom instructions can be added to any Nios II processor core. The Nios II software development tools recognize C code that can take advantage of the floating point instructions when they are present in the processor core.
Reset Signals
The Nios II processor core supports two reset signals.
■ reset - This is a global hardware reset signal that forces the processor core to reset immediately.
Software can enable and disable any interrupt source individually through the ienable control register, which contains an interrupt-enable bit for each of the IRQ inputs. Software can enable and disable interrupts globally using the PIE bit of the status control register.
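The interaction between the per-source enable bits and the global PIE bit can be sketched as a small Python model. This is an illustrative model only, not processor code: the function names are hypothetical, and the model assumes that a pending interrupt is an asserted IRQ input whose ienable bit is set.

```python
MASK32 = 0xFFFFFFFF  # the core has 32 IRQ inputs, irq[31..0]


def pending_interrupts(irq: int, ienable: int) -> int:
    """Model of ipending: asserted IRQ inputs that are individually enabled."""
    return irq & ienable & MASK32


def interrupt_generated(irq: int, ienable: int, pie: int) -> bool:
    """An interrupt is taken only when PIE is 1 and some enabled IRQ is asserted."""
    return pie == 1 and pending_interrupts(irq, ienable) != 0
```

For example, with irq1 and irq2 asserted but only irq2 enabled, the model reports one pending interrupt, and clearing PIE masks it globally.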
Memory and I/O Organization Description: The interrupt vector custom instruction accelerates interrupt vector dispatch. This custom instruction identifies the highest priority interrupt, generates the vector table offset, and stores this offset to rC. The instruction generates a negative offset if there is no hardware interrupt (that is, the exception is caused by a software condition, such as a trap). Usage: The interrupt vector custom instruction is used exclusively by the exception handler.
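A rough Python model of the behavior described above follows. Two details are assumptions made for illustration and are not stated in this text: that the lowest-numbered pending interrupt has the highest priority, and that each vector table entry occupies 8 bytes. The function name and the ENTRY_SIZE constant are hypothetical.

```python
ENTRY_SIZE = 8  # ASSUMPTION: bytes per vector table entry (hypothetical value)


def interrupt_vector_offset(ipending: int, entry_size: int = ENTRY_SIZE) -> int:
    """Return the table offset of the highest-priority pending interrupt.

    ASSUMPTION: the lowest-numbered pending IRQ has the highest priority.
    A negative result means no hardware interrupt is pending.
    """
    ipending &= 0xFFFFFFFF
    if ipending == 0:
        return -1
    # Isolate the lowest set bit, then convert it to a bit index.
    lowest_index = (ipending & -ipending).bit_length() - 1
    return lowest_index * entry_size
```

An exception handler built on this model would test the sign of the result first, and index the vector table only when the offset is non-negative.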
For details that affect programming issues, see the Programming Model chapter of the Nios II Processor Reference Handbook. Figure 2–2 shows a diagram of the memory and I/O organization for a Nios II processor core.
Figure 2–2.
Memory and I/O Organization Memory and Peripheral Access The Nios II architecture provides memory-mapped I/O access. Both data memory and peripherals are mapped into the address space of the data master port. The Nios II architecture is little endian. Words and halfwords are stored in memory with the more-significant bytes at higher addresses.
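The little-endian layout can be checked with a short Python sketch that uses the standard struct module to mimic how a word is laid out in memory. The helper names are hypothetical and the value 0x12345678 is only an example.

```python
import struct


def store_word_le(value: int) -> bytes:
    """Lay out a 32-bit word as four bytes, least-significant byte first."""
    return struct.pack("<I", value & 0xFFFFFFFF)


def load_halfword_le(memory: bytes, offset: int) -> int:
    """Read a 16-bit halfword stored in little-endian order at the given offset."""
    return struct.unpack_from("<H", memory, offset)[0]
```

Storing 0x12345678 places 0x78 at the lowest address and 0x12 at the highest, so the halfword at offset 2 holds the more-significant half, 0x1234.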
Data Master Port
The Nios II data bus is implemented as a 32-bit Avalon-MM master port. The data master port performs two functions:
■ Read data from memory or a peripheral when the processor executes a load instruction
■ Write data to memory or a peripheral when the processor executes a store instruction
Byte-enable signals on the master port specify which of the four byte lanes to write during store operations.
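As an illustration of byte-lane selection, the following Python sketch computes a 4-bit byte-enable mask for byte, halfword, and word stores. It is a simplified model, assuming naturally aligned accesses and that the byte at the lowest address maps to lane 0; the function name is hypothetical.

```python
def byte_enable(address: int, size: int) -> int:
    """Return a 4-bit mask of the byte lanes written by a store.

    size is the store width in bytes (1, 2, or 4).
    ASSUMPTION: accesses are naturally aligned and lane 0 carries the
    byte at the lowest address.
    """
    if size not in (1, 2, 4):
        raise ValueError("size must be 1, 2, or 4 bytes")
    if address % size != 0:
        raise ValueError("misaligned access")
    lane = address & 0x3              # position within the 32-bit word
    return ((1 << size) - 1) << lane  # 'size' adjacent lanes starting at 'lane'
```

A byte store to an address ending in 1 enables only lane 1 (0b0010), while a word store enables all four lanes (0b1111).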
core. The cache memories can improve the average memory access time for Nios II processor systems that use slow off-chip memory such as SDRAM for program and data storage. The instruction and data caches are enabled perpetually at run-time, but methods are provided for software to bypass the data cache so that peripheral accesses do not return cached data. Cache management and cache coherency are handled by software.
If an application always requires certain data or sections of code to be located in cache memory for performance reasons, the tightly-coupled memory feature might provide a more appropriate solution. Refer to “Tightly-Coupled Memory” on page 2–13 for details.
instruction and data access. Each tightly-coupled memory port connects directly to exactly one memory with guaranteed low, fixed latency. The memory is external to the Nios II core and is usually located on chip.
Accessing Tightly-Coupled Memory
Tightly-coupled memories occupy normal address space, the same as other memory devices connected via system interconnect fabric. The address ranges for tightly-coupled memories (if any) are determined at system generation time.
JTAG Debug Module
The Nios II architecture supports a JTAG debug module that provides on-chip emulation features to control the processor remotely from a host PC.
JTAG Debug Module Download and Execute Software Downloading software refers to the ability to download executable code and data to the processor’s memory via the JTAG connection. After downloading software to memory, the JTAG debug module can then exit debug mode and transfer execution to the start of executable code. Software Breakpoints Software breakpoints provide the ability to set a breakpoint on instructions residing in RAM.
Table 2–4. Trigger Conditions
Condition | Bus (1) | Description
Specific address | D, I | Trigger when the bus accesses a specific address.
Specific data value | D | Trigger when a specific data value appears on the bus.
Read cycle | D | Trigger on a read bus cycle.
Write cycle | D | Trigger on a write bus cycle.
Armed | D, I | Trigger only after an armed trigger event. See “Armed Triggers” on page 2–17.
Range | D | Trigger on a range of address values, data values, or both.
Trace Capture
Trace capture refers to the ability to record the instruction-by-instruction execution of the processor as it executes code in real time. The JTAG debug module offers the following trace features:
■ Capture execution trace (instruction bus cycles).
■ Capture data trace (data bus cycles).
■ For each data bus cycle, capture address, data, or both.
■ Start and stop capturing trace in real time, based on triggers.
■ Manually start and stop trace under host control.
Trace Frames
A “frame” is a unit of memory allocated for collecting trace data. However, a frame is not an absolute measure of the trace depth. To keep pace with the processor executing in real time, execution trace is optimized to store only selected addresses, such as branches, calls, traps, and interrupts. From these addresses, host-side debug software can later reconstruct an exact instruction-by-instruction execution trace.
Document Revision History Document Revision History Table 2–6 shows the revision history for this document. Table 2–6. Document Revision History Date & Document Version Changes Made October 2007 v7.2.0 No change from previous release. May 2007 v7.1.0 ● March 2007 v7.0.0 No change from previous release. November 2006 v6.1.0 Describe interrupt vector custom instruction. May 2006 v6.0.0 ● October 2005 v5.1.0 No change from previous release. May 2005 v5.0.0 Added tightly-coupled memory.
3. Programming Model NII51003-7.2.0 Introduction This chapter describes the Nios® II programming model, covering processor features at the assembly language level. Fully understanding the contents of this chapter requires prior knowledge of computer architecture, exception handling, and instruction sets. This chapter assumes you have a detailed understanding of the aforementioned concepts and focuses on how these concepts are specifically implemented in the Nios II processor.
accessed by call and ret instructions. C and C++ compilers use a common procedure-call convention, assigning specific meaning to registers r1 through r23 and r26 through r28.
Table 3–1.
When writing to control registers, all undefined bits must be written as zero. For details on the relationship between the control registers and exception processing, see Figure 3–1 on page 3–9.
Table 3–2.
Operating Modes bstatus The bstatus register holds a saved copy of the status register during break exception processing. One bit is defined: BPIE. This is the saved value of PIE, as defined in Table 3–3. When a break occurs, the value of the status register is copied into bstatus. Using bstatus, the debugger can restore the status register to the value prior to the break. The bret instruction causes the processor to copy bstatus back to status. See “Debug Mode” on page 3–5 for more information.
The following sections define the modes and the transitions between modes.
Normal Mode
In general, system and application code execute in normal mode. The processor is in normal mode immediately after processor reset. General-purpose registers bt (r25) and ba (r30) are reserved for debugging and are not available in normal mode. Programs are not prevented from storing values in these registers, but if they do, the debug mode could overwrite the values.
Exception Processing ■ Instruction-related exceptions Table 3–4 shows all possible Nios II exceptions in order of highest to lowest priority. For each exception, an exception vector along with any control register indications help determine the exception type. Table 3–4.
The reset state is undefined for all other system components, including but not limited to:
■ General-purpose registers, except for zero (r0) which is permanently zero.
■ Control registers, except for status which is reset to 0x0.
■ Instruction and data memory.
■ Cache memory, except for the instruction-cache line associated with the reset vector.
■ Peripherals. Refer to the appropriate peripheral data sheet or specification for reset conditions.
■ Custom instruction logic.
3. Writes the address of the instruction following the break to the ba register (r30).
4. Transfers execution to the break handler, stored at the break vector specified at system generation time.
Register Usage
The bstatus control register and general-purpose registers bt (r25) and ba (r30) are reserved for debugging. Code is not prevented from writing to these registers, but debug code might overwrite the values. The break handler can use bt (r25) to help save additional registers.
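The break sequence and its reversal by bret can be sketched as a small Python state model. Steps 1 and 2 are not shown in this part of the text: the model takes step 1 (copying status to bstatus) from the bstatus description earlier in the chapter, and it assumes that step 2 clears PIE. The Cpu class, the function names, and the addresses in the usage note are hypothetical.

```python
from dataclasses import dataclass

PIE = 0x1  # PIE bit of the status register (assumed to be bit 0)


@dataclass
class Cpu:
    pc: int = 0       # address of the instruction being executed
    status: int = 0
    bstatus: int = 0
    ba: int = 0       # r30, break return address


def take_break(cpu: Cpu, break_vector: int) -> None:
    cpu.bstatus = cpu.status            # step 1: save status
    cpu.status &= ~PIE                  # step 2 (ASSUMED): disable interrupts
    cpu.ba = (cpu.pc + 4) & 0xFFFFFFFF  # step 3: address after the break
    cpu.pc = break_vector               # step 4: enter the break handler


def bret(cpu: Cpu) -> None:
    cpu.status = cpu.bstatus            # restore the pre-break status
    cpu.pc = cpu.ba                     # resume after the break
```

In this model, a break taken at address 0x1000 with PIE set leaves ba at 0x1004 and PIE cleared, and bret restores both the status value and the program counter.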
Figure 3–1. Relationship Between ienable, ipending, PIE, and Hardware Interrupts
[Diagram. Labeled elements: external hardware interrupt request inputs irq[31..0] (irq0 through irq31); ienable register bits IENABLE0 through IENABLE31; ipending register bits IPENDING0 through IPENDING31.]
Exception Processing Instruction-Related Exceptions Instruction-related exceptions occur during execution of Nios II instructions and perform the steps outlined in “Processing Interrupt and Instruction-Related Exceptions” on page 3–11. The Nios II processor generates the following instruction-related exceptions. All instruction-related exceptions are precise.
Other Exceptions
The previous sections describe all of the exception types defined by the Nios II architecture at the time of publishing. However, some processor implementations might generate exceptions that do not fall into the above categories. For example, a future implementation might provide a memory management unit (MMU) that generates access violation exceptions.
Exception Processing Determining the Cause of Interrupt and Instruction-Related Exceptions The general exception handler must determine the cause of each exception and then transfer control to an appropriate exception routine. Figure 3–2 shows an example of the process used to determine the exception source. Figure 3–2.
access to the code memory to read this address). If the instruction is trap, the exception is a software trap. If the instruction at address ea-4 is one of the instructions that can be implemented in software, the exception was caused by an unimplemented instruction. See “Potential Unimplemented Instructions” on page 3–21 for details. If none of the above conditions apply, the exception type is unrecognized, and the exception handler should report the condition.
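The decision sequence can be summarized in a short Python sketch. Several details are assumptions made for illustration: the hardware-interrupt test (EPIE set in estatus and ipending non-zero) is not shown in this part of the text, the set of potentially unimplemented instructions is a placeholder for the list on page 3–21, and the instruction at ea-4 is represented by its mnemonic rather than by its encoding.

```python
EPIE = 0x1  # saved PIE bit in estatus (assumed to be bit 0)

# ASSUMPTION: placeholder for the instructions that may be emulated in software.
POTENTIALLY_UNIMPLEMENTED = {"mul", "muli", "mulxss", "mulxsu", "mulxuu", "div", "divu"}


def classify_exception(estatus: int, ipending: int, mnemonic_at_ea_minus_4: str) -> str:
    """Return the exception cause, checking the conditions in priority order."""
    # ASSUMED first test: interrupts were enabled and one is pending.
    if (estatus & EPIE) and ipending != 0:
        return "hardware interrupt"
    if mnemonic_at_ea_minus_4 == "trap":
        return "software trap"
    if mnemonic_at_ea_minus_4 in POTENTIALLY_UNIMPLEMENTED:
        return "unimplemented instruction"
    return "unrecognized"
```

A real handler would decode the 32-bit instruction word read from ea-4; the mnemonic string stands in for that decode step here.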
On the other hand, hardware interrupt exceptions must resume execution from the interrupted instruction itself. In this case, the exception handler must subtract 4 from ea to point to the interrupted instruction.
Memory and Peripheral Access
Nios II addresses are 32 bits, allowing access up to a 4 gigabyte address space. However, many Nios II core implementations restrict addresses to 31 bits or fewer.
Code written for a processor core with cache memory behaves correctly on a processor core without cache memory. The reverse is not true. Therefore, for a program to work properly on all Nios II processor core implementations, the program must behave as if the instruction and data caches exist. In systems without cache memory, the cache management instructions perform no operation, and their effects are benign.
The data transfer instructions in Table 3–6 support byte and half-word transfers.
Table 3–6. Narrow Data Transfer Instructions
Instruction | Description
ldb, ldbu, stb, ldh, ldhu, sth | ldb, ldbu, ldh and ldhu load a byte or half-word from memory to a register. ldb and ldh sign-extend the value to 32 bits, and ldbu and ldhu zero-extend the value to 32 bits. stb and sth store byte and half-word values, respectively.
ldbio, ldbuio, stbio, ldhio, ldhuio, sthio |
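The difference between sign extension and zero extension can be illustrated with a short Python sketch. The helper names are hypothetical; they model only the register value that results from a narrow load, not the memory access itself.

```python
def sign_extend(value: int, bits: int) -> int:
    """Sign-extend the low 'bits' of value to a 32-bit register image (ldb, ldh)."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    # XOR then subtract propagates the sign bit into the upper bits.
    return ((value ^ sign) - sign) & 0xFFFFFFFF


def zero_extend(value: int, bits: int) -> int:
    """Zero-extend the low 'bits' of value to 32 bits (ldbu, ldhu)."""
    return value & ((1 << bits) - 1)
```

Loading the byte 0x80 therefore yields 0xFFFFFF80 with a sign-extending load and 0x00000080 with a zero-extending load.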
Table 3–7. Arithmetic and Logical Instructions
Instruction | Description
addi, subi, muli | These instructions are immediate versions of the add, sub, and mul instructions. The instruction word includes a 16-bit signed value.
mulxss, mulxuu | These instructions provide access to the upper 32 bits of a 32x32 multiplication operation. Choose the appropriate instruction depending on whether the operands should be treated as signed or unsigned values.
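To show why the signed and unsigned variants differ, the following Python sketch models the upper 32 bits of the 64-bit product. The function names mirror the instruction mnemonics but are only a model operating on register values held as Python integers.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(x: int) -> int:
    """Interpret a 32-bit register value as a two's-complement signed integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def mulxuu(a: int, b: int) -> int:
    """Upper 32 bits of the product, both operands treated as unsigned."""
    return (((a & MASK32) * (b & MASK32)) >> 32) & MASK32


def mulxss(a: int, b: int) -> int:
    """Upper 32 bits of the product, both operands treated as signed."""
    return ((to_signed32(a) * to_signed32(b)) >> 32) & MASK32
```

With both operands equal to 0xFFFFFFFF, the unsigned variant returns 0xFFFFFFFE, while the signed variant treats the operands as -1 and returns 0.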
Table 3–9. Comparison Instructions (Part 2 of 2)
Instruction | Description
cmple | signed <=
cmpleu | unsigned <=
cmplt | signed <
cmpltu | unsigned <
cmpeqi, cmpnei, cmpgei, cmpgeui, cmpgti, cmpgtui, cmplei, cmpleui, cmplti, cmpltui | These instructions are immediate versions of the comparison operations. They compare the value of a register and a 16-bit immediate value. Signed operations sign-extend the immediate value to 32 bits. Unsigned operations fill the upper bits with zero.
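The effect of sign-extending versus zero-filling the 16-bit immediate can be seen in a small Python model of cmplti and cmpltui. The functions are illustrative stand-ins that return 1 or 0, as a comparison result would appear in the destination register.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(x: int) -> int:
    """Interpret a 32-bit register value as a signed integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def sign_extend16(imm: int) -> int:
    """Interpret a 16-bit immediate as a signed integer."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm


def cmplti(ra: int, imm16: int) -> int:
    """Signed less-than: the immediate is sign-extended."""
    return int(to_signed32(ra) < sign_extend16(imm16))


def cmpltui(ra: int, imm16: int) -> int:
    """Unsigned less-than: the immediate's upper bits are filled with zero."""
    return int((ra & MASK32) < (imm16 & 0xFFFF))
```

With a register value of 0 and an immediate of 0xFFFF, the signed form compares 0 with -1 and returns 0, while the unsigned form compares 0 with 65535 and returns 1.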
Program Control Instructions
The Nios II architecture supports the unconditional jump and call instructions listed in Table 3–11. These instructions do not have delay slots.
Table 3–11. Unconditional Jump and Call Instructions
Instruction | Description
call | This instruction calls a subroutine using an immediate value as the subroutine's absolute address, and stores the return address in register ra.
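A minimal Python sketch of the call behavior follows. It assumes, based on the J-type instruction format rather than on anything stated in this table, that the immediate is a 26-bit word address combined with the upper four bits of the program counter, and that the return address is the address of the instruction after the call. The function name and example addresses are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def call(pc: int, imm26: int) -> tuple[int, int]:
    """Return (target, ra) for a call instruction located at address pc.

    ASSUMPTION: target = upper 4 bits of pc combined with imm26 * 4.
    """
    ra = (pc + 4) & MASK32  # return address: the next instruction
    target = (pc & 0xF0000000) | ((imm26 & 0x3FFFFFF) << 2)
    return target, ra
```

Under these assumptions, a call at 0x10000100 with an immediate of 0x400 transfers control to 0x10001000 and leaves 0x10000104 in ra.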
The conditional-branch instructions do not have delay slots.
Table 3–12. Conditional-Branch Instructions
Instruction | Description
bge, bgeu, bgt, bgtu, ble, bleu, blt, bltu, beq, bne | These instructions provide relative branches that compare two register values and branch if the expression is true. See “Comparison Instructions” on page 3–17 for a description of the relational operations implemented.
Other Control Instructions
Table 3–13 shows other control instructions.
Table 3–13.
Custom Instructions
The custom instruction provides low-level access to custom instruction logic. The inclusion of custom instructions is specified at system generation time, and the function implemented by custom instruction logic is design dependent.
For further details, see the “Custom Instructions” section of the Processor Architecture chapter of the Nios II Processor Reference Handbook and the Nios II Custom Instruction User Guide.
■ Application Binary Interface chapter of the Nios II Processor Reference Handbook
■ Nios II Core Implementation Details chapter of the Nios II Processor Reference Handbook
■ Exception Handling chapter of the Nios II Software Developer’s Handbook
■ Cache and Tightly Coupled Memory chapter of the Nios II Software Developer’s Handbook
■ Processor Architecture chapter of the Nios II Processor Reference Handbook
■ Nios II Custom Instruction User Guide
Document Revision History
Tabl
4. Instantiating the Nios II Processor in SOPC Builder NII51004-7.2.0 Introduction This chapter describes the Nios® II Processor MegaWizard interface in SOPC Builder.
Core Nios II Page
The Core Nios II page presents the main settings for configuring the Nios II processor. Figure 4–1 shows an example of the Core Nios II page.
Figure 4–1.
The following sections describe the configuration settings available.
Core Selection
The main purpose of the Core Nios II page is to select the processor core. The core you select on this page affects other options available on this and other pages. Currently, Altera® offers three Nios II cores:
■ Nios II/f - The Nios II/f “fast” core is designed for fast performance.
■ None - This option conserves logic resources by eliminating multiply hardware. Multiply operations are implemented in software.
Turning on Hardware Divide includes LE-based divide hardware in the ALU. The Hardware Divide option achieves much greater performance than software emulation of divide operations.
Offset allows you to specify the location of the exception vector relative to the memory module’s base address. SOPC Builder calculates the physical address of the exception vector when you modify the memory module, the offset, or the memory module’s base address. For details on exceptions, see the Programming Model chapter of the Nios II Processor Reference Handbook.
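The address calculation amounts to adding the offset to the base address of the selected memory module. The following Python sketch shows the arithmetic; the function name, the span check, and the example values are hypothetical and do not describe how SOPC Builder itself is implemented.

```python
def exception_vector_address(base: int, offset: int, span: int) -> int:
    """Physical address of the exception vector inside a memory module.

    base and span describe the memory module; offset is relative to base.
    """
    if not 0 <= offset < span:
        raise ValueError("offset lies outside the memory module")
    return (base + offset) & 0xFFFFFFFF
```

For a hypothetical memory module based at 0x01000000 with a 64 KByte span and an offset of 0x20, the vector lands at 0x01000020. Changing either the base or the offset changes the result, which is why the address is recalculated after each of those modifications.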
Caches and Memory Interfaces Page
The Caches and Memory Interfaces page allows you to configure the cache and tightly-coupled memory usage for the instruction and data master ports. Figure 4–2 shows an example of the Caches and Memory Interfaces page.
Figure 4–2.
The following sections describe the configuration settings available.
Instruction Master Settings
The Instruction Master settings provide the following options for the Nios II/f and Nios II/s cores:
■ Instruction Cache - Specifies the size of the instruction cache. Valid sizes are from 512 bytes to 64 KBytes, or None. Choosing None disables the instruction cache, which also removes the Avalon-MM instruction master port from the Nios II processor.
Data Master Settings
The Data Master settings provide the following options for the Nios II/f core:
■ Data Cache - Specifies the size of the data cache. Valid sizes are from 512 bytes to 64 KBytes, or None. Depending on the value specified for Data Cache, the following options are available:
● Data Cache Line Size - Valid sizes are 4, 16, or 32 bytes.
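To give a feel for how the cache size and line size settings interact, the Python sketch below computes the number of cache lines and the address bits needed to select a byte within a line and a line within the cache. It assumes a direct-mapped cache and power-of-two sizes, neither of which is stated in this section; the function name is hypothetical.

```python
def cache_geometry(cache_bytes: int, line_bytes: int) -> tuple[int, int, int]:
    """Return (number of lines, byte-offset bits, line-index bits).

    ASSUMPTIONS: direct-mapped cache; sizes are powers of two.
    """
    if not 512 <= cache_bytes <= 64 * 1024:
        raise ValueError("cache size must be 512 bytes to 64 KBytes")
    if cache_bytes & (cache_bytes - 1):
        raise ValueError("cache size assumed to be a power of two")
    if line_bytes not in (4, 16, 32):
        raise ValueError("line size must be 4, 16, or 32 bytes")
    lines = cache_bytes // line_bytes
    offset_bits = line_bytes.bit_length() - 1  # log2(line size)
    index_bits = lines.bit_length() - 1        # log2(number of lines)
    return lines, offset_bits, index_bits
```

A 4 KByte cache with 32-byte lines has 128 lines under these assumptions, while a 512-byte cache with 4-byte lines also has 128 lines but uses a 2-bit byte offset instead of 5 bits.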
Advanced Features Page
The Advanced Features page allows you to enable specialized features of the Nios II processor. Figure 4–3 shows the Advanced Features page.
Figure 4–3.
Reset Signals
Include cpu_resetrequest and cpu_resettaken signals adds processor-only reset request signals to the Nios II processor. These signals let another device individually reset the Nios II processor without resetting the entire SOPC Builder system. The signals are exported to the top level of your SOPC Builder system.
Table 4–1 describes the debug features available to you for debugging your system.
Table 4–1. Debug Configuration Features
Feature | Description
JTAG Target Connection | Connects to the processor through the standard JTAG pins on the Altera FPGA. This provides the basic capabilities to start and stop the processor, and examine/edit registers and memory.
Download Software | Downloads executable code to the processor’s memory via the JTAG connection.
JTAG Debug Module Page The following sections describe the configuration settings available. Debug Level Settings There are five debug levels in the JTAG Debug Module page as shown in Figure 4–4. Figure 4–4.
Table 4–2 on page 4–13 is a detailed list of the characteristics of each debug level. Different levels consume different amounts of on-chip resources. Certain Nios II cores have restricted debug options, and certain options require debug tools provided by First Silicon Solutions (FS2) or Lauterbach.
For details on the debug features available from FS2, visit www.fs2.com, and from Lauterbach, see www.lauterbach.com.
Table 4–2.
Advanced Debug Settings
Debug levels 3 and 4 support trace data collection into an on-chip memory buffer. You can set the on-chip trace buffer size to sizes from 128 to 64K trace frames, using OCI Onchip Trace. Larger buffer sizes consume more on-chip M4K RAM blocks. Every M4K RAM block can store up to 128 trace frames. Debug level 4 also supports manual 2X clock signal specification.
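The M4K cost of a given trace buffer size follows from the 128-frames-per-block figure above. A minimal Python sketch of that arithmetic is shown here; the function name is hypothetical, and the result is only a lower bound derived from this text, since the debug module's other resource usage is not modeled.

```python
FRAMES_PER_M4K = 128  # each M4K RAM block stores up to 128 trace frames


def m4k_blocks_for_trace(frames: int) -> int:
    """Minimum number of M4K blocks needed to hold the given number of trace frames."""
    if not 128 <= frames <= 64 * 1024:
        raise ValueError("trace buffer size must be 128 to 64K frames")
    # Ceiling division without floating point.
    return -(-frames // FRAMES_PER_M4K)
```

By this estimate, the smallest buffer (128 frames) needs one M4K block and the largest (64K frames) needs 512 blocks.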
by implementing performance-critical operations in hardware using custom-instruction logic. Figure 4–5 shows an example of the Custom Instructions page.
Figure 4–5. Custom Instructions Page in the Nios II Processor MegaWizard
To add a custom instruction to the Nios II processor, select the custom instruction from the list at the left side of the page, and click Add. The added instruction appears on the right side of the page.
To display custom instructions in the table of active components on the SOPC Builder System Contents tab, click Filter in the lower-right of the System Contents tab, and turn on Nios Custom Instruction.
To create your own custom instruction using the component editor, click Import. After finishing in the component editor, click Refresh Component List on the File menu to add the new instruction to the list at the left side of the Custom Instructions page.
Floating Point Hardware Custom Instruction
The Nios II processor offers a set of optional predefined custom instructions that implement floating-point arithmetic operations. You can include these custom instructions to support computation-intensive floating-point applications. The basic set of floating-point custom instructions includes single precision (32-bit) floating-point addition, subtraction, and multiplication.
Figure 4–6. Nios II Floating Point Hardware Dialog Box
Turn on Use floating point division hardware to include floating-point division hardware. The floating-point division hardware requires more resources than the other instructions, so you might wish to omit it if your application does not make heavy use of floating-point division. Click Finish to add the floating point custom instructions to the Nios II processor.
Instantiating the Nios II Processor in SOPC Builder Bitswap Custom Instruction The Nios II processor core offers a bitswap custom instruction to reduce the time spent performing bit reversal operations. To add the bitswap custom instruction to the Nios II processor, select Bitswap from the list, and click Add. For details on integrating the bitswap custom instruction into your own algorithm, see the Nios II Custom Instruction User Guide.
Document Revision History Document Revision History Table 4–3 shows the revision history for this document. Table 4–3. Document Revision History Date & Document Version Changes Made October 2007 v7.2.0 Changed title to match other Altera documentation. May 2007 v7.1.0 ● ● ● ● Revised to reflect new MegaWizard interface. Added “Endian Converter Custom Instruction” on page 4–18 and “Bitswap Custom Instruction” on page 4–19. Added table of contents to Introduction section.
Section II. Appendices This section provides additional information about the Nios® II processor.
Appendices Nios II Processor Reference Handbook Section II–2 Altera Corporation
5. Nios II Core Implementation Details NII51015-7.2.0 Introduction This document describes all of the Nios® II processor core implementations available at the time of publishing. This document describes only implementation-specific features of each processor core. All cores support the Nios II instruction set architecture. For more information regarding the Nios II instruction set architecture, refer to the Instruction Set Reference chapter of the Nios II Processor Reference Handbook.
Introduction Table 5–1.
Nios II Core Implementation Details Device Family Support All Nios II cores provide the same support for target Altera device families.
Nios II/f Core Overview The Nios II/f core:
■ Has separate instruction and data caches
■ Can access up to 2 GBytes of external address space
■ Supports optional tightly-coupled memory for instructions and data
■ Employs a 6-stage pipeline to achieve maximum DMIPS/MHz
■ Performs dynamic branch prediction
■ Provides hardware multiply, divide, and shift options to improve arithmetic performance
■ Supports the addition of custom instructions
■ Supports the JTAG debug module
■ Supports optional JTAG debug modu
Nios II Core Implementation Details The performance of the embedded multipliers differs, depending on the target FPGA family. Table 5–3 lists the details of the hardware multiply and divide options. Table 5–3.
Nios II/f Core addi r1, r1, 100 ; r1 = r1 + 100 (Depends on result of mul) Shift and Rotate Performance The performance of shift operations depends on the hardware multiply option. When a hardware multiplier is present, the ALU achieves shift and rotate operations in one or two clock cycles. Otherwise, the ALU includes dedicated shift circuitry that achieves one-bit-per-cycle shift and rotate performance. Refer to Table 5–6 on page 5–11 for details.
Nios II Core Implementation Details Both the instruction and data cache addresses are divided into fields. Table 5–4 shows the cache byte address fields. Table 5–4. Cache Byte Address Fields 31 . . . . tag line . .
Nios II/f Core The Nios II/f core implements all the data cache bypass methods. For information regarding the data cache bypass methods, refer to the Processor Architecture chapter of the Nios II Processor Reference Handbook. Mixing cached and noncached accesses to the same cache line can result in invalid data reads. For example, the following sequence of events causes cache incoherency.
1. The Nios II core writes data to cache, creating a dirty data cache line.
2.
Nios II Core Implementation Details Accessing tightly-coupled memory bypasses cache memory. The processor core functions as if cache were not present for the address span of the tightly-coupled memory. Instructions for managing cache, such as initd and flushd, do not affect the tightly-coupled memory, even if the instruction specifies an address in tightly-coupled memory. Execution Pipeline This section provides an overview of the pipeline behavior for the benefit of performance-critical applications.
Nios II/f Core Only the A-stage and D-stage are allowed to create stalls. The A-stage stall occurs if any of the following conditions occur:
■ An A-stage memory instruction is waiting for Avalon-MM data master requests to complete. Typically this happens when a load or store misses in the data cache, or a flushd instruction needs to write back a dirty line.
■ An A-stage shift/rotate instruction is still performing its operation. This only occurs with the multi-cycle shift circuitry (i.e.
Nios II Core Implementation Details Execution performance for all instructions is shown in Table 5–6. Table 5–6. Instruction Execution Performance for Nios II/f Core Instruction Cycles Normal ALU instructions (e.g.
Nios II/s Core JTAG Debug Module The Nios II/f core supports the JTAG debug module to provide a JTAG interface to software debugging tools. The Nios II/f core supports an optional enhanced interface that allows real-time trace data to be routed out of the processor and stored in an external debug probe. Unsupported Features The Nios II/f core does not handle the execution of instructions with undefined opcodes.
Nios II Core Implementation Details The following sections discuss the noteworthy details of the Nios II/s core implementation. This document does not discuss low-level design issues, or implementation details that do not affect Nios II hardware or software designers. Arithmetic Logic Unit The Nios II/s core provides several ALU options to improve the performance of multiply, divide, and shift operations.
Nios II/s Core Table 5–7.
Nios II Core Implementation Details Instruction Cache The instruction cache for the Nios II/s core is nearly identical to the instruction cache in the Nios II/f core. The instruction cache memory has the following characteristics:
■ Direct-mapped cache implementation
■ The instruction master port reads an entire cache line at a time from memory, and issues one read per clock cycle.
■ Critical word first
Table 5–8 shows the instruction byte address fields. Table 5–8. Instruction Byte Address Fields 31 .
Nios II/s Core Accessing tightly-coupled memory bypasses cache memory. The processor core functions as if cache were not present for the address span of the tightly-coupled memory. Instructions for managing cache, such as initi and flushi, do not affect the tightly-coupled memory, even if the instruction specifies an address in tightly-coupled memory. Execution Pipeline This section provides an overview of the pipeline behavior for the benefit of performance-critical applications.
Nios II Core Implementation Details Pipeline Stalls The pipeline is set up so that if a stage stalls, no new values enter that stage or any earlier stages. No “catching up” of pipeline stages is allowed, even if a pipeline stage is empty. Only the M-stage is allowed to create stalls. The M-stage stall occurs if any of the following conditions occurs: ■ ■ ■ ■ An M-stage load/store instruction is waiting for Avalon-MM data master transfer to complete.
Nios II/s Core Table 5–10.
Nios II Core Implementation Details Nios II/e Core The Nios II/e “economy” core is designed to achieve the smallest possible core size. Altera designed the Nios II/e core with a singular design goal: Reduce resource utilization any way possible, while still maintaining compatibility with the Nios II instruction set architecture. Hardware resources are conserved at the expense of execution performance.
Nios II/e Core f For information regarding data cache bypass methods, refer to the Processor Architecture chapter of the Nios II Processor Reference Handbook. Instruction Execution Stages This section provides an overview of the pipeline behavior as a means of estimating assembly execution time. Most application programmers never need to analyze the performance of individual instructions.
Nios II Core Implementation Details Exception Handling The Nios II/e core supports the following exception types:
■ Hardware interrupt
■ Software traps
■ Unimplemented instruction
JTAG Debug Module The Nios II/e core supports the JTAG debug module to provide a JTAG interface to software debugging tools. The JTAG debug module on the Nios II/e core does not support hardware breakpoints or trace. Unsupported Features The Nios II/e core does not handle the execution of instructions with undefined opcodes.
Document Revision History Document Revision History Table 5–12 shows the revision history for this document. Table 5–12. Document Revision History Date & Document Version Changes Made October 2007 v7.2.0 Added jmpi instruction to tables. May 2007 v7.1.0 ● March 2007 v7.0.0 Add preliminary Cyclone III device family support November 2006 v6.1.0 Add preliminary Stratix III device family support May 2006 v6.0.
6. Nios II Processor Revision History NII51018-7.2.0 Introduction Each release of the Nios® II Embedded Design Suite (EDS) introduces improvements to the Nios II processor, the software development tools, or both. This document catalogs the history of revisions to the Nios II processor; it does not track revisions to development tools, such as the Nios II IDE.
Architecture Revisions Table 6–1 lists the version numbers of all releases of the Nios II processor.
Table 6–1. Nios II Processor Revision History
Version    Release Date     Notes
7.2        October 2007     Added the jmpi instruction.
7.1        May 2007         No changes.
7.0        March 2007       No changes.
6.1        November 2006    No changes.
6.0        May 2006         The name Nios II Development Kit describing the software development tools changed to Nios II Embedded Design Suite.
5.1 SP1    January 2006     Bug fix for Nios II/f core.
5.
Nios II Processor Revision History instruction to the instruction set, Altera consequently must update all Nios II cores to recognize the new instruction. Table 6–2 lists revisions to the Nios II architecture. Table 6–2. Nios II Architecture Revisions Version Release Date Notes 7.2 October 2007 7.1 May 2007 No changes. 7.0 March 2007 No changes. 6.1 November 2006 No changes. 6.0 May 2006 Added optional cpu_resetrequest and cpu_resettaken signals to all processor cores. 5.
Core Revisions Table 6–3. Nios II/f Core Revisions Version Release Date Notes 5.1 SP1 January 2006 Bug Fix: Back-to-back store instructions can cause memory corruption to the stored data. If the first store is not to the last word of a cache line and the second store is to the last word of the line, memory corruption occurs. 5.1 October 2005 No changes. 5.0 May 2005 ● ● ● ● 1.1 December 2004 ● ● ● Added optional tightly-coupled memory ports.
Nios II Processor Revision History Nios II/s Core Table 6–4 lists revisions to the Nios II/s core. Table 6–4. Nios II/s Core Revisions Version 7.2 Release Date October 2007 Notes Implemented the jmpi instruction. 7.1 May 2007 No changes. 7.0 March 2007 No changes. 6.1 November 2006 No changes. 6.0 May 2006 ● 5.1 October 2005 No changes. 5.0 May 2005 ● ● ● 1.1 December 2004 ● ● ● Cycle count for flushi and initi instructions changes from 1 to 4 cycles.
JTAG Debug Module Revisions Nios II/e Core Table 6–5 lists revisions to the Nios II/e core. Table 6–5. Nios II/e Core Revisions Version Release Date Notes 7.2 October 2007 7.1 May 2007 No changes. 7.0 March 2007 No changes. 6.1 November 2006 No changes. 6.0 May 2006 No changes. 5.1 October 2005 No changes. 5.0 May 2005 ● 1.1 December 2004 Added cpuid control register. 1.01 September 2004 ● 1.0 May 2004 Initial release of the Nios II/e core.
Nios II Processor Revision History Table 6–6 lists revisions to the JTAG debug module.
Table 6–6. JTAG Debug Module Revisions
Version    Release Date     Notes
7.2        October 2007     No changes.
7.1        May 2007         No changes.
7.0        March 2007       No changes.
6.1        November 2006    No changes.
6.0        May 2006         No changes.
5.1        October 2005     No changes.
5.0        May 2005         Full support for HardCopy devices (previous versions of the JTAG debug module did not support HardCopy devices).
1.
Document Revision History Document Revision History Table 6–7 shows the revision history for this document. Table 6–7. Document Revision History Date & Document Version Changes Made October 2007 v7.2.0 ● May 2007 v7.1.0 ● ● ● ● Summary of Changes Added jmpi instruction information. Added exception handling information. Updated tables to reflect no changes to cores. Added table of contents to Introduction section. Added Referenced Documents section. March 2007 v7.0.
7. Application Binary Interface NII51016-7.2.0 This section describes the Application Binary Interface (ABI) for the Nios® II processor.
Memory Alignment Contents in memory are aligned as follows:
■ A function must be aligned to a minimum of a 32-bit boundary.
■ The minimum alignment of a data element is its natural size. A data element larger than 32 bits need only be aligned to a 32-bit boundary.
■ Structures, unions, and strings must be aligned to a minimum of 32 bits. Bit-fields inside structures are always 32-bit aligned.
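The data-element rule above (natural size, capped at 32 bits) can be expressed as a short C sketch. The function names nios2_min_data_align and align_up are hypothetical helpers introduced only for illustration, and sizes are assumed to be powers of two.

```c
#include <assert.h>
#include <stddef.h>

/* Minimum alignment in bytes for a data element of `size` bytes:
 * its natural size, capped at 4 bytes (32 bits).
 * Assumption: `size` is a nonzero power of two. */
size_t nios2_min_data_align(size_t size)
{
    assert(size != 0 && (size & (size - 1)) == 0);
    return size < 4 ? size : 4;
}

/* Round `offset` up to the next multiple of `align` (a power of two). */
size_t align_up(size_t offset, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (offset + align - 1) & ~(align - 1);
}
```

For example, an 8-byte element needs only 4-byte alignment under this rule, so an element placed after 5 bytes of preceding data would start at offset 8.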
Application Binary Interface Table 7–2.
Stacks Figure 7–1. Stack Pointer, Frame Pointer and the Current Frame In Function a() Just prior to calling b() In Function b() Just after executing prolog Higher addresses fp and sp incoming stack arguments outgoing stack arguments Allocated and freed by a() (i.e. the calling function) saved registers space for stack temporaries fp and sp Allocated and freed by b() (i.e.
Application Binary Interface Further Examples of Stacks There are a number of special cases for stack layout, which are described in this section. Stack Frame for a Function With alloca() Figure 7–2 depicts what the frame looks like after alloca() is called. The space allocated by alloca() replaces the outgoing arguments and the outgoing arguments get new space allocated at the bottom of the frame.
Stacks Figure 7–3. Stack Frame Using Variable Arguments In Function a() Just Prior to Calling b() In Function b() Just after Executing Prolog Higher addresses fp and sp outgoing stack arguments incoming stack arguments Allocated and freed by a() (i.e. the calling function) copy of r7 copy of r6 copy of r5 copy of r4 saved registers space for stack temporaries Lower addresses fp and sp Allocated and freed by b() (i.e.
Application Binary Interface Debuggers can use the knowledge of how the function prologs work to disassemble the instructions to reconstruct state when doing a back trace. Preferably, debuggers can use information stored in the DWARF2 debugging information to find out what a prolog has done. The instructions found in a Nios II function prolog perform the following tasks:
■ Adjust the SP (to allocate the frame)
■ Store registers to the frame.
Arguments and Return Values This section discusses the details of passing arguments to functions and returning values from functions. Arguments The first 16 bytes of arguments to a function are passed in registers r4 through r7. The arguments are passed as if a structure containing the types of the arguments were constructed, and the first 16 bytes of the structure are located in r4 through r7.
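The "as if a structure" rule can be illustrated with a minimal C sketch that lays arguments out back to back and reports which register holds the first word of each one. The function names are hypothetical, and the sketch assumes every argument size is a multiple of 4 bytes (int, pointer, long long, double); it does not model smaller types or an argument that straddles r7 and the stack.

```c
#include <assert.h>
#include <stddef.h>

#define ARG_REG_BYTES 16u   /* r4..r7 hold the first 16 bytes */
#define FIRST_ARG_REG 4     /* r4 */

/* Compute the byte offset of each argument in the imaginary argument
 * structure. Assumption: every size is a nonzero multiple of 4 bytes,
 * so no padding is ever needed. */
void layout_args(const size_t *sizes, size_t n, size_t *offsets)
{
    size_t off = 0;
    for (size_t i = 0; i < n; i++) {
        assert(sizes[i] != 0 && sizes[i] % 4 == 0);
        offsets[i] = off;
        off += sizes[i];
    }
}

/* Register number holding the first word of an argument at `offset`,
 * or -1 if the argument starts beyond the first 16 bytes (on the stack). */
int arg_first_register(size_t offset)
{
    return offset < ARG_REG_BYTES ? FIRST_ARG_REG + (int)(offset / 4) : -1;
}
```

With argument sizes of 4, 8, 4, and 4 bytes, the sketch places the arguments at offsets 0, 4, 12, and 16, so the first three start in r4, r5, and r7 and the fourth starts on the stack.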
Application Binary Interface Example 7–2. Example: function a() calls function b(), which returns a struct.
/* b() computes a structure-type result and returns it */
STRUCT b(int i, int j)
{
    ...
    return result;
}

void a(...)
{
    ...
    value = b(i, j);
}
In this example, as long as the result type is no larger than 8 bytes, b() will return its result in r2 and r3. If the return type is larger than 8 bytes, the Nios II C/C++ compiler treats this program as if a() had passed a pointer to b().
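The "as if a() had passed a pointer to b()" behavior can be pictured with the following C sketch. The type big_t and the function b_hidden are hypothetical names used only for illustration; the sketch shows the conceptual equivalence at source level and makes no claim about the exact code the compiler generates.

```c
#include <assert.h>

/* Hypothetical 12-byte result type, larger than the 8 bytes
 * that fit in r2 and r3. */
typedef struct { int x, y, z; } big_t;

/* As written by the programmer: returns the structure by value. */
big_t b(int i, int j)
{
    big_t result = { i, j, i + j };
    return result;
}

/* Conceptually equivalent form: the caller supplies storage for the
 * result and passes a pointer to it. */
void b_hidden(big_t *result, int i, int j)
{
    assert(result != 0);
    result->x = i;
    result->y = j;
    result->z = i + j;
}
```

A caller that writes value = b(i, j) is then treated much like one that writes b_hidden(&value, i, j).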
Document Revision History Document Revision History Table 7–3 shows the revision history for this document. Table 7–3. Document Revision History Date & Document Version Changes Made October 2007 v7.2.0 No change from previous release. May 2007 v7.1.0 ● March 2007 v7.0.0 No change from previous release. November 2006 v6.1.0 No change from previous release. May 2006 v6.0.0 No change from previous release. October 2005 v5.1.0 No change from previous release. May 2005 v5.0.
8. Instruction Set Reference NII51017-7.2.0 Introduction This section introduces the Nios® II instruction-word format and provides a detailed reference of the Nios II instruction set.
Word Formats R-Type The defining characteristic of the R-type instruction-word format is that all arguments and results are specified as registers. R-type instructions contain:
■ A 6-bit opcode field OP
■ Three 5-bit register fields A, B, and C
■ An 11-bit opcode-extension field OPX
In most cases, fields A and B specify the source operands, and field C specifies the destination register. Some R-Type instructions embed a small immediate value in the low-order bits of OPX.
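The field widths listed above add up to 32 bits (6 + 3 × 5 + 11). As a minimal sketch, the following C helpers pack and unpack an R-type word. The bit positions used here (A in bits 31..27, B in 26..22, C in 21..17, OPX in 16..6, OP in 5..0) are stated as an assumption, because the format diagram is not reproduced in this text; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Pack an R-type instruction word.
 * Assumed layout: A[31:27] B[26:22] C[21:17] OPX[16:6] OP[5:0]. */
uint32_t rtype_encode(uint32_t a, uint32_t b, uint32_t c,
                      uint32_t opx, uint32_t op)
{
    assert(a < 32 && b < 32 && c < 32);   /* 5-bit register fields */
    assert(opx < 2048);                   /* 11-bit OPX field */
    assert(op < 64);                      /* 6-bit OP field */
    return (a << 27) | (b << 22) | (c << 17) | (opx << 6) | op;
}

/* Field extractors for the same assumed layout. */
uint32_t rtype_a(uint32_t w)   { return (w >> 27) & 0x1fu; }
uint32_t rtype_b(uint32_t w)   { return (w >> 22) & 0x1fu; }
uint32_t rtype_c(uint32_t w)   { return (w >> 17) & 0x1fu; }
uint32_t rtype_opx(uint32_t w) { return (w >> 6) & 0x7ffu; }
uint32_t rtype_op(uint32_t w)  { return w & 0x3fu; }
```

Encoding and then decoding a word returns the original field values, which is a quick way to check an assembler or disassembler routine built on this layout.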
Instruction Set Reference J-Type J-type instructions contain:
■ A 6-bit opcode field
■ A 26-bit immediate data field
J-type instructions, such as call and jmpi, transfer execution anywhere within a 256 MByte range.
Instruction Opcodes Instruction Opcodes The OP field in the Nios II instruction word specifies the major class of an opcode as shown in Table 8–1 and Table 8–2. Most values of OP are encodings for I-type instructions. One encoding, OP = 0x00, is the J-type instruction call. Another encoding, OP = 0x3a, is used for all R-type instructions, in which case, the OPX field differentiates the instructions. All undefined encodings of OP and OPX are reserved. Table 8–1.
Instruction Set Reference Table 8–2.
Assembler Pseudo-instructions Assembler Pseudoinstructions Table 8–3 lists pseudoinstructions available in Nios II assembly language. Pseudoinstructions are used in assembly source code like regular assembly instructions. Each pseudoinstruction is implemented at the machine level using an equivalent instruction. The movia pseudoinstruction is the only exception, being implemented with two instructions. Most pseudoinstructions do not appear in disassembly views of machine code. Table 8–3.
Instruction Set Reference Assembler Macros The Nios II assembler provides macros to extract halfwords from labels and from 32-bit immediate values. Table 8–4 lists the available macros. These macros return 16-bit signed values or 16-bit unsigned values depending on where they are used. When used with an instruction that requires a 16-bit signed immediate value, these macros return a value ranging from –32768 to 32767.
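The halfword extraction performed by these macros can be sketched in C. The three definitions below follow the commonly documented behavior of %lo, %hi, and %hiadj and are given here as an assumption, since Table 8–4 is not reproduced in this text; the function names are hypothetical. The point of %hiadj is that it compensates for the sign extension of the low halfword when the two halves are recombined by addition.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed macro behavior: */
uint32_t macro_lo(uint32_t imm32)    { return imm32 & 0xffffu; }
uint32_t macro_hi(uint32_t imm32)    { return (imm32 >> 16) & 0xffffu; }
uint32_t macro_hiadj(uint32_t imm32)
{
    /* Upper halfword plus bit 15 of the original value. */
    return ((imm32 >> 16) & 0xffffu) + ((imm32 >> 15) & 1u);
}

/* Sign-extend a 16-bit field to 32 bits. */
uint32_t sext16(uint32_t imm16)
{
    assert(imm16 <= 0xffffu);
    return (imm16 ^ 0x8000u) - 0x8000u;
}

/* Rebuild a constant from %hiadj and %lo: place the adjusted high half
 * in bits 31..16, then add the sign-extended low half (modulo 2^32). */
uint32_t rebuild_hiadj_lo(uint32_t imm32)
{
    return (macro_hiadj(imm32) << 16) + sext16(macro_lo(imm32));
}
```

When bit 15 of the value is set, the sign-extended low half subtracts 0x10000 from the sum, and the extra 1 added by %hiadj cancels that out.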
Instruction Set Reference Instruction Set Reference The following pages list all Nios II instruction mnemonics in alphabetical order. Table 8–5 shows the notation conventions used to describe instruction operation. Table 8–5.
add add add Operation: rC ← rA + rB Assembler Syntax: add rC, rA, rB Example: add r6, r7, r8 Description: Calculates the sum of rA and rB. Stores the result in rC. Used for both signed and unsigned addition. Usage: Carry Detection (unsigned operands): Following an add operation, a carry out of the MSB can be detected by checking whether the unsigned sum is less than one of the unsigned operands.
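The carry-detection rule described above can be tried out in C, where unsigned 32-bit addition wraps in the same way as add. This is a minimal sketch with a hypothetical function name; the internal assertion documents that comparing the sum against either operand gives the same answer.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if a + b carries out of bit 31, otherwise 0. */
int add_carry_out(uint32_t a, uint32_t b)
{
    uint32_t sum = a + b;               /* wraps modulo 2^32, like add */
    assert((sum < a) == (sum < b));     /* either operand may be used */
    return sum < a;                     /* unsigned less-than compare */
}
```

In assembly, the same test is an add followed by an unsigned less-than comparison of the sum against one of the operands.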
addi addi add immediate Operation: rB ← rA + σ (IMM16) Assembler Syntax: addi rB, rA, IMM16 Example: addi r6, r7, -100 Description: Sign-extends the 16-bit immediate value and adds it to the value of rA. Stores the sum in rB. Usage: Carry Detection (unsigned operands): Following an addi operation, a carry out of the MSB can be detected by checking whether the unsigned sum is less than one of the unsigned operands.
and and bitwise logical and Operation: rC ← rA & rB Assembler Syntax: and rC, rA, rB Example: and r6, r7, r8 Description: Calculates the bitwise logical AND of rA and rB and stores the result in rC.
andhi andhi bitwise logical and immediate into high halfword Operation: rB ← rA & (IMM16 : 0x0000) Assembler Syntax: andhi rB, rA, IMM16 Example: andhi r6, r7, 100 Description: Calculates the bitwise logical AND of rA and (IMM16 : 0x0000) and stores the result in rB.
andi andi bitwise logical and immediate Operation: rB ← rA & (0x0000 : IMM16) Assembler Syntax: andi rB, rA, IMM16 Example: andi r6, r7, 100 Description: Calculates the bitwise logical AND of rA and (0x0000 : IMM16) and stores the result in rB.
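The concatenation notation used by andi and andhi, (0x0000 : IMM16) and (IMM16 : 0x0000), places the 16-bit immediate in the low or high halfword of a 32-bit mask. A minimal C sketch of the two operations, with hypothetical function names:

```c
#include <assert.h>
#include <stdint.h>

/* andi: rB = rA & (0x0000 : IMM16). The upper halfword of the result
 * is always zero. */
uint32_t op_andi(uint32_t ra, uint32_t imm16)
{
    assert(imm16 <= 0xffffu);
    return ra & imm16;
}

/* andhi: rB = rA & (IMM16 : 0x0000). The lower halfword of the result
 * is always zero. */
uint32_t op_andhi(uint32_t ra, uint32_t imm16)
{
    assert(imm16 <= 0xffffu);
    return ra & (imm16 << 16);
}
```

With rA = 0x12345678 and IMM16 = 0xff00, andi yields 0x00005600 and andhi yields 0x12000000.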
beq beq branch if equal Operation: if (rA == rB) then PC ← PC + 4 + σ (IMM16) else PC ← PC + 4 Assembler Syntax: beq rA, rB, label Example: beq r6, r7, label Description: If rA == rB, then beq transfers program control to the instruction at label. In the instruction encoding, the offset given by IMM16 is treated as a signed number of bytes relative to the instruction immediately following beq.
bge bge branch if greater than or equal signed Operation: if ((signed) rA >= (signed) rB) then PC ← PC + 4 + σ (IMM16) else PC ← PC + 4 Assembler Syntax: bge rA, rB, label Example: bge r6, r7, top_of_loop Description: If (signed) rA >= (signed) rB, then bge transfers program control to the instruction at label. In the instruction encoding, the offset given by IMM16 is treated as a signed number of bytes relative to the instruction immediately following bge.
bgeu bgeu branch if greater than or equal unsigned Operation: if ((unsigned) rA >= (unsigned) rB) then PC ← PC + 4 + σ (IMM16) else PC ← PC + 4 Assembler Syntax: bgeu rA, rB, label Example: bgeu r6, r7, top_of_loop Description: If (unsigned) rA >= (unsigned) rB, then bgeu transfers program control to the instruction at label. In the instruction encoding, the offset given by IMM16 is treated as a signed number of bytes relative to the instruction immediately following bgeu.
bgt bgt branch if greater than signed Operation: if ((signed) rA > (signed) rB) then PC ← label else PC ← PC + 4 Assembler Syntax: bgt rA, rB, label Example: bgt r6, r7, top_of_loop Description: If (signed) rA > (signed) rB, then bgt transfers program control to the instruction at label. Pseudoinstruction: bgt is implemented with the blt instruction by swapping the register operands.
bgtu bgtu branch if greater than unsigned Operation: if ((unsigned) rA > (unsigned) rB) then PC ← label else PC ← PC + 4 Assembler Syntax: bgtu rA, rB, label Example: bgtu r6, r7, top_of_loop Description: If (unsigned) rA > (unsigned) rB, then bgtu transfers program control to the instruction at label. Pseudoinstruction: bgtu is implemented with the bltu instruction by swapping the register operands.
ble ble branch if less than or equal signed Operation: if ((signed) rA <= (signed) rB) then PC ← label else PC ← PC + 4 Assembler Syntax: ble rA, rB, label Example: ble r6, r7, top_of_loop Description: If (signed) rA <= (signed) rB, then ble transfers program control to the instruction at label. Pseudoinstruction: ble is implemented with the bge instruction by swapping the register operands.
bleu bleu branch if less than or equal unsigned Operation: if ((unsigned) rA <= (unsigned) rB) then PC ← label else PC ← PC + 4 Assembler Syntax: bleu rA, rB, label Example: bleu r6, r7, top_of_loop Description: If (unsigned) rA <= (unsigned) rB, then bleu transfers program control to the instruction at label. Pseudoinstruction: bleu is implemented with the bgeu instruction by swapping the register operands.
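The four pseudoinstructions bgt, bgtu, ble, and bleu all rely on the same identity: swapping the operands of a less-than or greater-than-or-equal comparison gives the opposite-direction comparison. The following C sketch models only the branch conditions, using hypothetical function names, and checks the swapped forms against the C operators.

```c
#include <assert.h>
#include <stdint.h>

/* Branch conditions of the real instructions. */
int taken_blt(int32_t ra, int32_t rb)    { return ra < rb; }
int taken_bge(int32_t ra, int32_t rb)    { return ra >= rb; }
int taken_bltu(uint32_t ra, uint32_t rb) { return ra < rb; }
int taken_bgeu(uint32_t ra, uint32_t rb) { return ra >= rb; }

/* Pseudoinstructions: the same comparison with the operands swapped. */
int taken_bgt(int32_t ra, int32_t rb)    { return taken_blt(rb, ra); }
int taken_ble(int32_t ra, int32_t rb)    { return taken_bge(rb, ra); }
int taken_bgtu(uint32_t ra, uint32_t rb) { return taken_bltu(rb, ra); }
int taken_bleu(uint32_t ra, uint32_t rb) { return taken_bgeu(rb, ra); }

/* Self-check: the swapped forms agree with the C operators. */
void check_swapped_forms(int32_t a, int32_t b)
{
    uint32_t ua = (uint32_t)a, ub = (uint32_t)b;
    assert(taken_bgt(a, b) == (a > b));
    assert(taken_ble(a, b) == (a <= b));
    assert(taken_bgtu(ua, ub) == (ua > ub));
    assert(taken_bleu(ua, ub) == (ua <= ub));
}
```

The signed and unsigned forms differ only in how the register bits are interpreted: the bit pattern 0xffffffff is less than 1 as a signed value but greater than 1 as an unsigned value.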
blt blt branch if less than signed Operation: if ((signed) rA < (signed) rB) then PC ← PC + 4 + σ (IMM16) else PC ← PC + 4 Assembler Syntax: blt rA, rB, label Example: blt r6, r7, top_of_loop Description: If (signed) rA < (signed) rB, then blt transfers program control to the instruction at label. In the instruction encoding, the offset given by IMM16 is treated as a signed number of bytes relative to the instruction immediately following blt.
bltu bltu branch if less than unsigned Operation: if ((unsigned) rA < (unsigned) rB) then PC ← PC + 4 + σ (IMM16) else PC ← PC + 4 Assembler Syntax: bltu rA, rB, label Example: bltu r6, r7, top_of_loop Description: If (unsigned) rA < (unsigned) rB, then bltu transfers program control to the instruction at label. In the instruction encoding, the offset given by IMM16 is treated as a signed number of bytes relative to the instruction immediately following bltu.
bne bne branch if not equal Operation: if (rA != rB) then PC ← PC + 4 + σ (IMM16) else PC ← PC + 4 Assembler Syntax: bne rA, rB, label Example: bne r6, r7, top_of_loop Description: If rA != rB, then bne transfers program control to the instruction at label. In the instruction encoding, the offset given by IMM16 is treated as a signed number of bytes relative to the instruction immediately following bne.
br br unconditional branch Operation: PC ← PC + 4 + σ (IMM16) Assembler Syntax: br label Example: br top_of_loop Description: Transfers program control to the instruction at label. In the instruction encoding, the offset given by IMM16 is treated as a signed number of bytes relative to the instruction immediately following br. The two least-significant bits of IMM16 are always zero, because instruction addresses must be word-aligned.
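The branch-target arithmetic shared by br and the conditional branches, PC ← PC + 4 + σ (IMM16), can be sketched in C as follows. The function name is hypothetical, and IMM16 is passed as the raw 16-bit field taken from the instruction word.

```c
#include <assert.h>
#include <stdint.h>

/* Compute the target of a taken branch located at address `pc`. */
uint32_t branch_target(uint32_t pc, uint32_t imm16)
{
    assert(imm16 <= 0xffffu);
    assert((imm16 & 3u) == 0);   /* two least-significant bits are zero */
    uint32_t offset = (imm16 ^ 0x8000u) - 0x8000u;   /* sign-extend */
    return pc + 4u + offset;     /* relative to the next instruction */
}
```

Because the offset is relative to the instruction after the branch, an IMM16 field of 0xfffc (that is, -4) branches back to the branch instruction itself.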
break break debugging breakpoint Operation: bstatus ← status PIE ← 0 U ← 0 ba ← PC + 4 PC ← break handler address Assembler Syntax: break break imm5 Example: break Description: Breaks program execution and transfers control to the debugger break-processing routine. Saves the address of the next instruction in register ba and saves the contents of the status register in bstatus. Disables interrupts, then transfers execution to the break handler.
bret bret breakpoint return Operation: status ← bstatus PC ← ba Assembler Syntax: bret Example: bret Description: Copies the value of bstatus into the status register, then transfers execution to the address in ba. Usage: bret is used by debuggers exclusively and should not appear in user programs, operating systems, or exception handlers.
call call call subroutine Operation: ra ← PC + 4 PC ← (PC31..28 : IMM26 × 4) Assembler Syntax: call label Example: call write_char Description: Saves the address of the next instruction in register ra, and transfers execution to the instruction at address (PC31..28 : IMM26 × 4). Usage: call can transfer execution anywhere within the 256 MByte range determined by PC31..28. The Nios II GNU linker does not automatically handle cases in which the address is out of this range.
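The target computation (PC31..28 : IMM26 × 4) and the resulting 256 MByte reach can be sketched in C. Both function names are hypothetical; call_reachable is a simple check a build script or code generator might apply before emitting call, given that the linker does not handle out-of-range targets automatically.

```c
#include <assert.h>
#include <stdint.h>

/* Target address of a call at `pc` with the given 26-bit immediate:
 * the top four bits come from PC, the rest from IMM26 × 4. */
uint32_t call_target(uint32_t pc, uint32_t imm26)
{
    assert(imm26 < (1u << 26));
    return (pc & 0xf0000000u) | (imm26 << 2);
}

/* Returns 1 if `target` is word-aligned and lies in the same 256 MByte
 * region as `pc`, so that a call at `pc` can reach it directly. */
int call_reachable(uint32_t pc, uint32_t target)
{
    return (target & 3u) == 0 &&
           (pc & 0xf0000000u) == (target & 0xf0000000u);
}
```

A target outside the region selected by PC31..28 can still be reached indirectly, for example by loading the address into a register and using callr.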
callr callr call subroutine in register Operation: ra ← PC + 4 PC ← rA Assembler Syntax: callr rA Example: callr r6 Description: Saves the address of the next instruction in the return-address register, and transfers execution to the address contained in register rA. Usage: callr is used to dereference C-language function pointers.
cmpeq cmpeq compare equal Operation: if (rA == rB) then rC ← 1 else rC ← 0 Assembler Syntax: cmpeq rC, rA, rB Example: cmpeq r6, r7, r8 Description: If rA == rB, then stores 1 to rC; otherwise, stores 0 to rC. Usage: cmpeq performs the == operation of the C programming language. Also, cmpeq can be used to implement the C logical-negation operator “!”.
cmpeqi cmpeqi compare equal immediate Operation: if (rA == σ (IMM16)) then rB ← 1 else rB ← 0 Assembler Syntax: cmpeqi rB, rA, IMM16 Example: cmpeqi r6, r7, 100 Description: Sign-extends the 16-bit immediate value IMM16 to 32 bits and compares it to the value of rA. If rA == σ (IMM16), cmpeqi stores 1 to rB; otherwise stores 0 to rB. Usage: cmpeqi performs the == operation of the C programming language.
cmpge cmpge compare greater than or equal signed Operation: if ((signed) rA >= (signed) rB) then rC ← 1 else rC ← 0 Assembler Syntax: cmpge rC, rA, rB Example: cmpge r6, r7, r8 Description: If rA >= rB, then stores 1 to rC; otherwise stores 0 to rC. Usage: cmpge performs the signed >= operation of the C programming language.
cmpgei cmpgei compare greater than or equal signed immediate Operation: if ((signed) rA >= (signed) σ (IMM16)) then rB ← 1 else rB ← 0 Assembler Syntax: cmpgei rB, rA, IMM16 Example: cmpgei r6, r7, 100 Description: Sign-extends the 16-bit immediate value IMM16 to 32 bits and compares it to the value of rA. If rA >= σ(IMM16), then cmpgei stores 1 to rB; otherwise stores 0 to rB. Usage: cmpgei performs the signed >= operation of the C programming language.
cmpgeu cmpgeu compare greater than or equal unsigned Operation: if ((unsigned) rA >= (unsigned) rB) then rC ← 1 else rC ← 0 Assembler Syntax: cmpgeu rC, rA, rB Example: cmpgeu r6, r7, r8 Description: If rA >= rB, then stores 1 to rC; otherwise stores 0 to rC. Usage: cmpgeu performs the unsigned >= operation of the C programming language.
cmpgeui cmpgeui compare greater than or equal unsigned immediate Operation: if ((unsigned) rA >= (unsigned) (0x0000 : IMM16)) then rB ← 1 else rB ← 0 Assembler Syntax: cmpgeui rB, rA, IMM16 Example: cmpgeui r6, r7, 100 Description: Zero-extends the 16-bit immediate value IMM16 to 32 bits and compares it to the value of rA. If rA >= (0x0000 : IMM16), then cmpgeui stores 1 to rB; otherwise stores 0 to rB. Usage: cmpgeui performs the unsigned >= operation of the C programming language.
cmpgt cmpgt compare greater than signed Operation: if ((signed) rA > (signed) rB) then rC ← 1 else rC ← 0 Assembler Syntax: cmpgt rC, rA, rB Example: cmpgt r6, r7, r8 Description: If rA > rB, then stores 1 to rC; otherwise stores 0 to rC. Usage: cmpgt performs the signed > operation of the C programming language. Pseudoinstruction: cmpgt is implemented with the cmplt instruction by swapping its rA and rB operands.
cmpgti cmpgti compare greater than signed immediate Operation: if ((signed) rA > (signed) IMMED) then rB ← 1 else rB ← 0 Assembler Syntax: cmpgti rB, rA, IMMED Example: cmpgti r6, r7, 100 Description: Sign-extends the immediate value IMMED to 32 bits and compares it to the value of rA. If rA > σ(IMMED), then cmpgti stores 1 to rB; otherwise stores 0 to rB. Usage: cmpgti performs the signed > operation of the C programming language. The maximum allowed value of IMMED is 32766.
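The limit of 32766 on IMMED is consistent with cmpgti being a pseudoinstruction that maps onto cmpgei with an immediate of IMMED + 1, which must still fit in a signed 16-bit field. That mapping is stated here as an assumption, since this text does not spell it out. A minimal C sketch with hypothetical function names:

```c
#include <assert.h>
#include <stdint.h>

/* Model of cmpgei: signed compare against a 16-bit signed immediate. */
int op_cmpgei(int32_t ra, int32_t imm16)
{
    assert(imm16 >= -32768 && imm16 <= 32767);
    return ra >= imm16;
}

/* Model of cmpgti, assuming it expands to cmpgei with IMMED + 1.
 * IMMED may not exceed 32766, or IMMED + 1 would overflow 16 bits. */
int pseudo_cmpgti(int32_t ra, int32_t immed)
{
    assert(immed >= -32768 && immed <= 32766);
    return op_cmpgei(ra, immed + 1);
}
```

For integers, rA > IMMED and rA >= IMMED + 1 are the same condition, which is why one hardware comparison can serve both forms.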
cmpgtu cmpgtu compare greater than unsigned Operation: if ((unsigned) rA > (unsigned) rB) then rC ← 1 else rC ← 0 Assembler Syntax: cmpgtu rC, rA, rB Example: cmpgtu r6, r7, r8 Description: If rA > rB, then stores 1 to rC; otherwise stores 0 to rC. Usage: cmpgtu performs the unsigned > operation of the C programming language. Pseudoinstruction: cmpgtu is implemented with the cmpltu instruction by swapping its rA and rB operands.
cmpgtui cmpgtui compare greater than unsigned immediate Operation: if ((unsigned) rA > (unsigned) IMMED) then rB ← 1 else rB ← 0 Assembler Syntax: cmpgtui rB, rA, IMMED Example: cmpgtui r6, r7, 100 Description: Zero-extends the immediate value IMMED to 32 bits and compares it to the value of rA. If rA > IMMED, then cmpgtui stores 1 to rB; otherwise stores 0 to rB. Usage: cmpgtui performs the unsigned > operation of the C programming language. The maximum allowed value of IMMED is 65534.
cmple cmple compare less than or equal signed Operation: if ((signed) rA <= (signed) rB) then rC ← 1 else rC ← 0 Assembler Syntax: cmple rC, rA, rB Example: cmple r6, r7, r8 Description: If rA <= rB, then stores 1 to rC; otherwise stores 0 to rC. Usage: cmple performs the signed <= operation of the C programming language. Pseudoinstruction: cmple is implemented with the cmpge instruction by swapping its rA and rB operands.
cmplei cmplei compare less than or equal signed immediate Operation: if ((signed) rA <= (signed) IMMED) then rB ← 1 else rB ← 0 Assembler Syntax: cmplei rB, rA, IMMED Example: cmplei r6, r7, 100 Description: Sign-extends the immediate value IMMED to 32 bits and compares it to the value of rA. If rA <= σ(IMMED), then cmplei stores 1 to rB; otherwise stores 0 to rB. Usage: cmplei performs the signed <= operation of the C programming language. The maximum allowed value of IMMED is 32766.
cmpleu cmpleu compare less than or equal unsigned Operation: if ((unsigned) rA <= (unsigned) rB) then rC ← 1 else rC ← 0 Assembler Syntax: cmpleu rC, rA, rB Example: cmpleu r6, r7, r8 Description: If rA <= rB, then stores 1 to rC; otherwise stores 0 to rC. Usage: cmpleu performs the unsigned <= operation of the C programming language. Pseudoinstruction: cmpleu is implemented with the cmpgeu instruction by swapping its rA and rB operands.
cmpleui cmpleui compare less than or equal unsigned immediate Operation: if ((unsigned) rA <= (unsigned) IMMED) then rB ← 1 else rB ← 0 Assembler Syntax: cmpleui rB, rA, IMMED Example: cmpleui r6, r7, 100 Description: Zero-extends the immediate value IMMED to 32 bits and compares it to the value of rA. If rA <= IMMED, then cmpleui stores 1 to rB; otherwise stores 0 to rB. Usage: cmpleui performs the unsigned <= operation of the C programming language. The maximum allowed value of IMMED is 65534.
cmplt cmplt compare less than signed Operation: if ((signed) rA < (signed) rB) then rC ← 1 else rC ← 0 Assembler Syntax: cmplt rC, rA, rB Example: cmplt r6, r7, r8 Description: If rA < rB, then stores 1 to rC; otherwise stores 0 to rC. Usage: cmplt performs the signed < operation of the C programming language.
cmplti cmplti compare less than signed immediate Operation: if ((signed) rA < (signed) σ (IMM16)) then rB ← 1 else rB ← 0 Assembler Syntax: cmplti rB, rA, IMM16 Example: cmplti r6, r7, 100 Description: Sign-extends the 16-bit immediate value IMM16 to 32 bits and compares it to the value of rA. If rA < σ (IMM16), then cmplti stores 1 to rB; otherwise stores 0 to rB. Usage: cmplti performs the signed < operation of the C programming language.
cmpltu cmpltu compare less than unsigned Operation: if ((unsigned) rA < (unsigned) rB) then rC ← 1 else rC ← 0 Assembler Syntax: cmpltu rC, rA, rB Example: cmpltu r6, r7, r8 Description: If rA < rB, then stores 1 to rC; otherwise stores 0 to rC. Usage: cmpltu performs the unsigned < operation of the C programming language.
cmpltui cmpltui compare less than unsigned immediate Operation: if ((unsigned) rA < (unsigned) (0x0000 : IMM16)) then rB ← 1 else rB ← 0 Assembler Syntax: cmpltui rB, rA, IMM16 Example: cmpltui r6, r7, 100 Description: Zero-extends the 16-bit immediate value IMM16 to 32 bits and compares it to the value of rA. If rA < (0x0000 : IMM16), then cmpltui stores 1 to rB; otherwise stores 0 to rB. Usage: cmpltui performs the unsigned < operation of the C programming language.
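The signed and unsigned immediate compares differ only in how the 16-bit immediate is extended and how rA is interpreted. The following C sketch models cmplti and cmpltui to show that the same IMM16 bit pattern can give different results. The function names mirror the mnemonics but are illustrative only.

```c
#include <stdint.h>

/* cmplti: sign-extend IMM16, then compare as signed 32-bit values. */
uint32_t cmplti(int32_t ra, uint16_t imm16)
{
    int32_t imm = (int32_t)(imm16 ^ 0x8000) - 0x8000;  /* sigma(IMM16) */
    return ra < imm ? 1u : 0u;
}

/* cmpltui: zero-extend IMM16, then compare as unsigned 32-bit values. */
uint32_t cmpltui(uint32_t ra, uint16_t imm16)
{
    uint32_t imm = (uint32_t)imm16;                    /* 0x0000 : IMM16 */
    return ra < imm ? 1u : 0u;
}
```

With IMM16 = 0xFFFF, cmplti compares against -1 while cmpltui compares against 65535.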
cmpne cmpne compare not equal Operation: if (rA != rB) then rC ← 1 else rC ← 0 Assembler Syntax: cmpne rC, rA, rB Example: cmpne r6, r7, r8 Description: If rA != rB, then stores 1 to rC; otherwise stores 0 to rC. Usage: cmpne performs the != operation of the C programming language.
cmpnei cmpnei compare not equal immediate Operation: if (rA != σ (IMM16)) then rB ← 1 else rB ← 0 Assembler Syntax: cmpnei rB, rA, IMM16 Example: cmpnei r6, r7, 100 Description: Sign-extends the 16-bit immediate value IMM16 to 32 bits and compares it to the value of rA. If rA != σ (IMM16), then cmpnei stores 1 to rB; otherwise stores 0 to rB. Usage: cmpnei performs the != operation of the C programming language.
custom custom custom instruction Operation: if c == 1 then rC ← fN(rA, rB, A, B, C) else Ø ← fN(rA, rB, A, B, C) Assembler Syntax: custom N, xC, xA, xB Where xA means either general purpose register rA, or custom register cA. Example: custom 0, c6, r7, r8 Description: The custom opcode provides access to up to 256 custom instructions allowed by the Nios II architecture. The function implemented by a custom instruction is user-defined and is specified at system generation time.
div div divide Operation: rC ← rA ÷ rB Assembler Syntax: div rC, rA, rB Example: div r6, r7, r8 Description: Treating rA and rB as signed integers, this instruction divides rA by rB and then stores the integer portion of the resulting quotient to rC. After attempted division by zero, the value of rC is undefined. There is no divide-by-zero exception. After dividing –2147483648 by –1, the value of rC is undefined (the number +2147483648 is not representable in 32 bits).
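Because div raises no exception in the two undefined cases, software that cannot rule them out may want to test the operands first. The C sketch below is one possible guard. The function name div_is_defined is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true when a signed 32-bit divide of ra by rb has a defined result. */
bool div_is_defined(int32_t ra, int32_t rb)
{
    if (rb == 0) {
        return false;               /* division by zero: rC undefined */
    }
    if (ra == INT32_MIN && rb == -1) {
        return false;               /* +2147483648 is not representable */
    }
    return true;
}
```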
divu divu divide unsigned Operation: rC ← rA ÷ rB Assembler Syntax: divu rC, rA, rB Example: divu r6, r7, r8 Description: Treating rA and rB as unsigned integers, this instruction divides rA by rB and then stores the integer portion of the resulting quotient to rC. After attempted division by zero, the value of rC is undefined. There is no divide-by-zero exception. Nios II processors that do not implement the divu instruction cause an unimplemented-instruction exception.
eret eret exception return Operation: status ← estatus PC ← ea Assembler Syntax: eret Example: eret Description: Copies the value of estatus into the status register, and transfers execution to the address in ea. Usage: Use eret to return from traps, external interrupts, and other exception-handling routines. Note that before returning from hardware interrupt exceptions, the exception handler must adjust the ea register.
flushd flushd flush data cache line Operation: Flushes the data cache line associated with address rA + σ (IMM16). Assembler Syntax: flushd IMM16(rA) Example: flushd -100(r6) Description: If the Nios II processor implements a direct mapped data cache, flushd flushes the cache line that is mapped to the specified address, regardless of whether the addressed data is currently cached.
flushda flushda flush data cache address Operation: Flushes the data cache line currently caching address rA + σ (IMM16) Assembler Syntax: flushda IMM16(rA) Example: flushda -100(r6) Description: If the addressed data is currently cached, flushda flushes the cache line mapped to that address.
flushi flushi flush instruction cache line Operation: Flushes the instruction-cache line associated with address rA. Assembler Syntax: flushi rA Example: flushi r6 Description: Ignoring the tag, flushi identifies the instruction-cache line associated with the byte address in rA, and invalidates that line. If the Nios II processor core does not have an instruction cache, the flushi instruction performs no operation.
flushp flushp flush pipeline Operation: Flushes the processor pipeline of any pre-fetched instructions. Assembler Syntax: flushp Example: flushp Description: Ensures that any instructions pre-fetched after the flushp instruction are removed from the pipeline. Usage: Use flushp before transferring control to newly updated instruction memory.
initd initd initialize data cache line Operation: Initializes the data cache line associated with address rA + σ (IMM16). Assembler Syntax: initd IMM16(rA) Example: initd 0(r6) Description: initd computes the effective address specified by the sum of rA and the signed 16-bit immediate value. Ignoring the tag, initd identifies the data cache line associated with the effective address, and then initd invalidates that line.
initi initi initialize instruction cache line Operation: Initializes the instruction-cache line associated with address rA. Assembler Syntax: initi rA Example: initi r6 Description: Ignoring the tag, initi identifies the instruction-cache line associated with the byte address in rA, and initi invalidates that line. If the Nios II processor core does not have an instruction cache, the initi instruction performs no operation.
jmp jmp computed jump Operation: PC ← rA Assembler Syntax: jmp rA Example: jmp r12 Description: Transfers execution to the address contained in register rA. Usage: It is illegal to jump to the address contained in register r31. To return from subroutines called by call or callr, use ret instead of jmp.
jmpi jmpi jump immediate Operation: PC ← (PC31..28 : IMM26 × 4) Assembler Syntax: jmpi label Example: jmpi write_char Description: Transfers execution to the instruction at address (PC31..28 : IMM26 × 4). Usage: jmpi can transfer execution anywhere within the 256 MByte range determined by PC31..28. The Nios II GNU linker does not automatically handle cases in which the address is out of this range.
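The target computation (PC31..28 : IMM26 × 4) can be modeled in C as follows. This is a sketch only: the function names are hypothetical, and pc is taken to be whatever PC value the operation above refers to.

```c
#include <stdint.h>

/* Target address: upper 4 bits from PC, lower 28 bits from IMM26 * 4. */
uint32_t jmpi_target(uint32_t pc, uint32_t imm26)
{
    return (pc & 0xF0000000u) | ((imm26 & 0x03FFFFFFu) << 2);
}

/* Nonzero if dest is word aligned and lies in the same 256 MByte
   region as pc, that is, if jmpi can encode it. */
int jmpi_in_range(uint32_t pc, uint32_t dest)
{
    return (pc & 0xF0000000u) == (dest & 0xF0000000u) && (dest & 3u) == 0u;
}
```

A destination that fails the range check needs a different sequence, for example loading the address into a register and using jmp.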
ldb / ldbio ldb / ldbio load byte from memory or I/O peripheral Operation: rB ← σ (Mem8[rA + σ (IMM16)]) Assembler Syntax: ldb rB, byte_offset(rA) ldbio rB, byte_offset(rA) Example: ldb r6, 100(r5) Description: Computes the effective byte address specified by the sum of rA and the instruction's signed 16-bit immediate value. Loads register rB with the desired memory byte, sign extending the 8-bit value to 32 bits.
ldbu / ldbuio ldbu / ldbuio load unsigned byte from memory or I/O peripheral Operation: rB ← 0x000000 : Mem8[rA + σ (IMM16)] Assembler Syntax: ldbu rB, byte_offset(rA) ldbuio rB, byte_offset(rA) Example: ldbu r6, 100(r5) Description: Computes the effective byte address specified by the sum of rA and the instruction's signed 16-bit immediate value. Loads register rB with the desired memory byte, zero extending the 8-bit value to 32 bits.
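The only difference between ldb and ldbu is how the loaded byte is widened to 32 bits. The C sketch below models both against a plain byte array. The mem parameter is a stand-in for memory and is purely illustrative.

```c
#include <stdint.h>

/* ldbu: zero-extend the addressed byte to 32 bits. */
uint32_t ldbu(const uint8_t *mem, uint32_t addr)
{
    return (uint32_t)mem[addr];
}

/* ldb: sign-extend the addressed byte (bit 7 is copied into bits 31..8). */
uint32_t ldb(const uint8_t *mem, uint32_t addr)
{
    uint32_t b = (uint32_t)mem[addr];
    return (b ^ 0x80u) - 0x80u;
}
```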
ldh / ldhio ldh / ldhio load halfword from memory or I/O peripheral Operation: rB ← σ (Mem16[rA + σ (IMM16)]) Assembler Syntax: ldh rB, byte_offset(rA) ldhio rB, byte_offset(rA) Example: ldh r6, 100(r5) Description: Computes the effective byte address specified by the sum of rA and the instruction's signed 16-bit immediate value. Loads register rB with the memory halfword located at the effective byte address, sign extending the 16-bit value to 32 bits.
ldhu / ldhuio ldhu / ldhuio load unsigned halfword from memory or I/O peripheral Operation: rB ← 0x0000 : Mem16[rA + σ (IMM16)] Assembler Syntax: ldhu rB, byte_offset(rA) ldhuio rB, byte_offset(rA) Example: ldhu r6, 100(r5) Description: Computes the effective byte address specified by the sum of rA and the instruction's signed 16-bit immediate value. Loads register rB with the memory halfword located at the effective byte address, zero extending the 16-bit value to 32 bits.
ldw / ldwio ldw / ldwio load 32-bit word from memory or I/O peripheral Operation: rB ← Mem32[rA + σ (IMM16)] Assembler Syntax: ldw rB, byte_offset(rA) ldwio rB, byte_offset(rA) Example: ldw r6, 100(r5) Description: Computes the effective byte address specified by the sum of rA and the instruction's signed 16-bit immediate value. Loads register rB with the memory word located at the effective byte address. The effective byte address must be word aligned.
mov mov move register to register Operation: rC ← rA Assembler Syntax: mov rC, rA Example: mov r6, r7 Description: Moves the contents of rA to rC. Pseudoinstruction: mov is implemented as add rC, rA, r0.
movhi movhi move immediate into high halfword Operation: rB ← (IMMED : 0x0000) Assembler Syntax: movhi rB, IMMED Example: movhi r6, 0x8000 Description: Writes the immediate value IMMED into the high halfword of rB, and clears the lower halfword of rB to 0x0000. Usage: The maximum allowed value of IMMED is 65535. The minimum allowed value is 0. To load a 32-bit constant into a register, first load the upper 16 bits using a movhi pseudoinstruction.
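As an illustration of building a 32-bit constant in two steps, the C sketch below models movhi for the upper halfword and then an ori for the lower halfword. The choice of ori for the second step is an assumption, based on the ori entry in this chapter. The helper load_const32 is hypothetical.

```c
#include <stdint.h>

/* movhi: rB <- (IMMED : 0x0000) */
uint32_t movhi(uint16_t immed)
{
    return (uint32_t)immed << 16;
}

/* ori: rB <- rA | (0x0000 : IMM16) */
uint32_t ori(uint32_t ra, uint16_t imm16)
{
    return ra | (uint32_t)imm16;
}

/* Split a 32-bit constant into halfwords and rebuild it in two steps. */
uint32_t load_const32(uint32_t value)
{
    uint16_t hi = (uint16_t)(value >> 16);
    uint16_t lo = (uint16_t)(value & 0xFFFFu);
    return ori(movhi(hi), lo);
}
```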
movi movi move signed immediate into word Operation: rB ← σ (IMMED) Assembler Syntax: movi rB, IMMED Example: movi r6, -30 Description: Sign-extends the immediate value IMMED to 32 bits and writes it to rB. Usage: The maximum allowed value of IMMED is 32767. The minimum allowed value is –32768. To load a 32-bit constant into a register, see the movhi instruction. Pseudoinstruction: movi is implemented as addi rB, r0, IMMED.
movia movia move immediate address into word Operation: rB ← label Assembler Syntax: movia rB, label Example: movia r6, function_address Description: Writes the address of label to rB.
movui movui move unsigned immediate into word Operation: rB ← (0x0000 : IMMED) Assembler Syntax: movui rB, IMMED Example: movui r6, 100 Description: Zero-extends the immediate value IMMED to 32 bits and writes it to rB. Usage: The maximum allowed value of IMMED is 65535. The minimum allowed value is 0. To load a 32-bit constant into a register, see the movhi instruction. Pseudoinstruction: movui is implemented as ori rB, r0, IMMED.
mul mul multiply Operation: rC ← (rA × rB) 31..0 Assembler Syntax: mul rC, rA, rB Example: mul r6, r7, r8 Description: Multiplies rA times rB and stores the 32 low-order bits of the product to rC. The result is the same whether the operands are treated as signed or unsigned integers. Nios II processors that do not implement the mul instruction cause an unimplemented-instruction exception.
muli muli multiply immediate Operation: rB ← (rA × σ(IMM16)) 31..0 Assembler Syntax: muli rB, rA, IMM16 Example: muli r6, r7, -100 Description: Sign-extends the 16-bit immediate value IMM16 to 32 bits and multiplies it by the value of rA. Stores the 32 low-order bits of the product to rB. The result is independent of whether rA is treated as a signed or unsigned number. Nios II processors that do not implement the muli instruction cause an unimplemented-instruction exception.
mulxss mulxss multiply extended signed/signed Operation: rC ← ((signed) rA) × ((signed) rB)) 63..32 Assembler Syntax: mulxss rC, rA, rB Example: mulxss r6, r7, r8 Description: Treating rA and rB as signed integers, mulxss multiplies rA times rB, and stores the 32 high-order bits of the product to rC. Nios II processors that do not implement the mulxss instruction cause an unimplemented-instruction exception.
mulxsu mulxsu multiply extended signed/unsigned Operation: rC ← ((signed) rA) × ((unsigned) rB)) 63..32 Assembler Syntax: mulxsu rC, rA, rB Example: mulxsu r6, r7, r8 Description: Treating rA as a signed integer and rB as an unsigned integer, mulxsu multiplies rA times rB, and stores the 32 high-order bits of the product to rC. Nios II processors that do not implement the mulxsu instruction cause an unimplemented-instruction exception.
mulxuu mulxuu multiply extended unsigned/unsigned Operation: rC ← ((unsigned) rA) × ((unsigned) rB)) 63..32 Assembler Syntax: mulxuu rC, rA, rB Example: mulxuu r6, r7, r8 Description: Treating rA and rB as unsigned integers, mulxuu multiplies rA times rB and stores the 32 high-order bits of the product to rC. Nios II processors that do not implement the mulxuu instruction cause an unimplemented-instruction exception.
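The three extended multiplies differ only in how each operand is widened before the 64-bit product is formed. The C sketch below models bits 63..32 of that product for each variant. The function names mirror the mnemonics but are illustrative only.

```c
#include <stdint.h>

/* Upper 32 bits of a 64-bit product held in two's-complement form. */
static uint32_t high_word(uint64_t product)
{
    return (uint32_t)(product >> 32);
}

/* mulxuu: both operands unsigned. */
uint32_t mulxuu(uint32_t ra, uint32_t rb)
{
    return high_word((uint64_t)ra * (uint64_t)rb);
}

/* mulxss: both operands signed. */
uint32_t mulxss(int32_t ra, int32_t rb)
{
    int64_t product = (int64_t)ra * (int64_t)rb;
    return high_word((uint64_t)product);
}

/* mulxsu: rA signed, rB unsigned. The product always fits in int64_t. */
uint32_t mulxsu(int32_t ra, uint32_t rb)
{
    int64_t product = (int64_t)ra * (int64_t)rb;
    return high_word((uint64_t)product);
}
```

Combined with mul for bits 31..0, any of these yields a full 64-bit product.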
nextpc nextpc get address of following instruction Operation: rC ← PC + 4 Assembler Syntax: nextpc rC Example: nextpc r6 Description: Stores the address of the next instruction to register rC. Usage: A relocatable code fragment can use nextpc to calculate the address of its data segment. nextpc is the only way to access the PC directly.
nop nop no operation Operation: None Assembler Syntax: nop Example: nop Description: nop does nothing. Pseudoinstruction: nop is implemented as add r0, r0, r0.
nor nor bitwise logical nor Operation: rC ← ~(rA | rB) Assembler Syntax: nor rC, rA, rB Example: nor r6, r7, r8 Description: Calculates the bitwise logical NOR of rA and rB and stores the result in rC.
or or bitwise logical or Operation: rC ← rA | rB Assembler Syntax: or rC, rA, rB Example: or r6, r7, r8 Description: Calculates the bitwise logical OR of rA and rB and stores the result in rC.
orhi orhi bitwise logical or immediate into high halfword Operation: rB ← rA | (IMM16 : 0x0000) Assembler Syntax: orhi rB, rA, IMM16 Example: orhi r6, r7, 100 Description: Calculates the bitwise logical OR of rA and (IMM16 : 0x0000) and stores the result in rB.
ori ori bitwise logical or immediate Operation: rB ← rA | (0x0000 : IMM16) Assembler Syntax: ori rB, rA, IMM16 Example: ori r6, r7, 100 Description: Calculates the bitwise logical OR of rA and (0x0000 : IMM16) and stores the result in rB.
rdctl rdctl read from control register Operation: rC ← ctlN Assembler Syntax: rdctl rC, ctlN Example: rdctl r3, ctl31 Description: Reads the value contained in control register ctlN and writes it to register rC.
ret ret return from subroutine Operation: PC ← ra Assembler Syntax: ret Example: ret Description: Transfers execution to the address in ra. Usage: Any subroutine called by call or callr must use ret to return.
rol rol rotate left Operation: rC ← rA rotated left rB4..0 bit positions Assembler Syntax: rol rC, rA, rB Example: rol r6, r7, r8 Description: Rotates rA left by the number of bits specified in rB4..0 and stores the result in rC. The bits that shift out of the register rotate into the least-significant bit positions. Bits 31–5 of rB are ignored.
roli roli rotate left immediate Operation: rC ← rA rotated left IMM5 bit positions Assembler Syntax: roli rC, rA, IMM5 Example: roli r6, r7, 3 Description: Rotates rA left by the number of bits specified in IMM5 and stores the result in rC. The bits that shift out of the register rotate into the least-significant bit positions. Usage: In addition to the rotate-left operation, roli can be used to implement a rotate-right operation.
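The rotate-right use of roli follows from the identity that rotating right by n bits equals rotating left by (32 - n) bits. The C sketch below demonstrates this. The function names are hypothetical.

```c
#include <stdint.h>

/* Rotate left by n bit positions; only the low 5 bits of n are used. */
uint32_t rotl32(uint32_t x, uint32_t n)
{
    n &= 31u;
    return (x << n) | (x >> ((32u - n) & 31u));
}

/* Rotate right by n, expressed as a rotate left by (32 - n). */
uint32_t rotr32_via_rotl(uint32_t x, uint32_t n)
{
    return rotl32(x, (32u - (n & 31u)) & 31u);
}
```

For example, a rotate right by 3 would be written as a rotate left by 29.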
ror ror rotate right Operation: rC ← rA rotated right rB4..0 bit positions Assembler Syntax: ror rC, rA, rB Example: ror r6, r7, r8 Description: Rotates rA right by the number of bits specified in rB4..0 and stores the result in rC. The bits that shift out of the register rotate into the most-significant bit positions. Bits 31–5 of rB are ignored.
sll sll shift left logical Operation: rC ← rA << (rB4..0) Assembler Syntax: sll rC, rA, rB Example: sll r6, r7, r8 Description: Shifts rA left by the number of bits specified in rB4..0 (inserting zeroes), and then stores the result in rC. sll performs the << operation of the C programming language.
slli slli shift left logical immediate Operation: rC ← rA << IMM5 Assembler Syntax: slli rC, rA, IMM5 Example: slli r6, r7, 3 Description: Shifts rA left by the number of bits specified in IMM5 (inserting zeroes), and then stores the result in rC. Usage: slli performs the << operation of the C programming language.
sra sra shift right arithmetic Operation: rC ← (signed) rA >> ((unsigned) rB4..0) Assembler Syntax: sra rC, rA, rB Example: sra r6, r7, r8 Description: Shifts rA right by the number of bits specified in rB4..0 (duplicating the sign bit), and then stores the result in rC. Bits 31–5 of rB are ignored. Usage: sra performs the signed >> operation of the C programming language.
srai srai shift right arithmetic immediate Operation: rC ← (signed) rA >> ((unsigned) IMM5) Assembler Syntax: srai rC, rA, IMM5 Example: srai r6, r7, 3 Description: Shifts rA right by the number of bits specified in IMM5 (duplicating the sign bit), and then stores the result in rC. Usage: srai performs the signed >> operation of the C programming language.
srl srl shift right logical Operation: rC ← (unsigned) rA >> ((unsigned) rB4..0) Assembler Syntax: srl rC, rA, rB Example: srl r6, r7, r8 Description: Shifts rA right by the number of bits specified in rB4..0 (inserting zeroes), and then stores the result in rC. Bits 31–5 of rB are ignored. Usage: srl performs the unsigned >> operation of the C programming language.
srli srli shift right logical immediate Operation: rC ← (unsigned) rA >> ((unsigned) IMM5) Assembler Syntax: srli rC, rA, IMM5 Example: srli r6, r7, 3 Description: Shifts rA right by the number of bits specified in IMM5 (inserting zeroes), and then stores the result in rC. Usage: srli performs the unsigned >> operation of the C programming language.
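The arithmetic and logical right shifts differ only in what fills the vacated high-order bits. The C sketch below models both on raw 32-bit patterns. It avoids applying >> to a negative signed value, because the C standard leaves that result implementation-defined. The function names are illustrative only.

```c
#include <stdint.h>

/* Logical right shift: vacated bits are filled with zeroes. */
uint32_t srl(uint32_t ra, uint32_t rb)
{
    return ra >> (rb & 31u);
}

/* Arithmetic right shift: vacated bits are copies of the sign bit. */
uint32_t sra(uint32_t ra, uint32_t rb)
{
    uint32_t n = rb & 31u;
    if (ra & 0x80000000u) {
        return ~(~ra >> n);   /* shift the complement, then complement back */
    }
    return ra >> n;
}
```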
stb / stbio stb / stbio store byte to memory or I/O peripheral Operation: Mem8[rA + σ (IMM16)] ← rB7..0 Assembler Syntax: stb rB, byte_offset(rA) stbio rB, byte_offset(rA) Example: stb r6, 100(r5) Description: Computes the effective byte address specified by the sum of rA and the instruction's signed 16-bit immediate value. Stores the low byte of rB to the memory byte specified by the effective address.
sth / sthio sth / sthio store halfword to memory or I/O peripheral Operation: Mem16[rA + σ (IMM16)] ← rB15..0 Assembler Syntax: sth rB, byte_offset(rA) sthio rB, byte_offset(rA) Example: sth r6, 100(r5) Description: Computes the effective byte address specified by the sum of rA and the instruction's signed 16-bit immediate value. Stores the low halfword of rB to the memory location specified by the effective byte address. The effective byte address must be halfword aligned.
stw / stwio stw / stwio store word to memory or I/O peripheral Operation: Mem32[rA + σ (IMM16)] ← rB Assembler Syntax: stw rB, byte_offset(rA) stwio rB, byte_offset(rA) Example: stw r6, 100(r5) Description: Computes the effective byte address specified by the sum of rA and the instruction's signed 16-bit immediate value. Stores rB to the memory location specified by the effective byte address. The effective byte address must be word aligned.
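The effective-address calculation and the alignment requirements shared by the load and store entries can be sketched in C as follows. The function names are hypothetical.

```c
#include <stdint.h>

/* Effective byte address: rA plus the sign-extended 16-bit immediate. */
uint32_t effective_address(uint32_t ra, uint16_t imm16)
{
    uint32_t offset = ((uint32_t)imm16 ^ 0x8000u) - 0x8000u;  /* sigma(IMM16) */
    return ra + offset;
}

/* Word accesses (ldw, stw) need an address that is a multiple of 4. */
int is_word_aligned(uint32_t ea)
{
    return (ea & 3u) == 0u;
}

/* Halfword accesses (ldh, ldhu, sth) need a multiple of 2. */
int is_halfword_aligned(uint32_t ea)
{
    return (ea & 1u) == 0u;
}
```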
sub sub subtract Operation: rC ← rA – rB Assembler Syntax: sub rC, rA, rB Example: sub r6, r7, r8 Description: Subtract rB from rA and store the result in rC. Usage: Carry Detection (unsigned operands): The carry bit indicates an unsigned overflow. Before or after a sub operation, a carry out of the MSB can be detected by checking whether the first operand is less than the second operand. The carry bit can be written to a register, or a conditional branch can be taken based on the carry condition.
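The carry check described above amounts to an unsigned less-than comparison of the two operands. A minimal C sketch, with hypothetical function names:

```c
#include <stdint.h>

/* 32-bit subtraction with wraparound, as sub computes it. */
uint32_t sub32(uint32_t ra, uint32_t rb)
{
    return ra - rb;
}

/* Carry (borrow) out of the MSB: set when the first operand is
   less than the second, treating both as unsigned. */
uint32_t sub_carry(uint32_t ra, uint32_t rb)
{
    return ra < rb ? 1u : 0u;
}
```

Because the check depends only on the operands, it can be made before or after the subtraction, provided the subtraction does not overwrite an operand first.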
subi subi subtract immediate Operation: rB ← rA – σ (IMMED) Assembler Syntax: subi rB, rA, IMMED Example: subi r8, r8, 4 Description: Sign-extends the immediate value IMMED to 32 bits, subtracts it from the value of rA and then stores the result in rB. Usage: The maximum allowed value of IMMED is 32768. The minimum allowed value is –32767.
sync sync memory synchronization Operation: None Assembler Syntax: sync Example: sync Description: Forces all pending memory accesses to complete before allowing execution of subsequent instructions. In processor cores that support in-order memory accesses only, this instruction performs no operation.
trap trap trap Operation: estatus ← status PIE ← 0 U ← 0 ea ← PC + 4 PC ← exception handler address Assembler Syntax: trap Example: trap Description: Saves the address of the next instruction in register ea, saves the contents of the status register in estatus, disables interrupts, and transfers execution to the exception handler. The address of the exception handler is specified at system generation time. Usage: To return from the exception handler, execute an eret instruction.
wrctl wrctl write to control register Operation: ctlN ← rA Assembler Syntax: wrctl ctlN, rA Example: wrctl ctl6, r3 Description: Writes the value contained in register rA to the control register ctlN.
xor xor bitwise logical exclusive or Operation: rC ← rA ^ rB Assembler Syntax: xor rC, rA, rB Example: xor r6, r7, r8 Description: Calculates the bitwise logical exclusive OR of rA and rB and stores the result in rC.
xorhi xorhi bitwise logical exclusive or immediate into high halfword Operation: rB ← rA ^ (IMM16 : 0x0000) Assembler Syntax: xorhi rB, rA, IMM16 Example: xorhi r6, r7, 100 Description: Calculates the bitwise logical exclusive OR of rA and (IMM16 : 0x0000) and stores the result in rB.
xori xori bitwise logical exclusive or immediate Operation: rB ← rA ^ (0x0000 : IMM16) Assembler Syntax: xori rB, rA, IMM16 Example: xori r6, r7, 100 Description: Calculates the bitwise logical exclusive OR of rA and (0x0000 : IMM16) and stores the result in rB.
Referenced Documents Referenced Documents This chapter references no other documents. Document Revision History Table 8–6 shows the revision history for this document. Table 8–6. Document Revision History Date & Document Version Changes Made October 2007 v7.2.0 Added jmpi instruction. May 2007 v7.1.0 ● March 2007 v7.0.0 No change from previous release. November 2006 v6.1.0 No change from previous release. May 2006 v6.0.0 No change from previous release. October 2005 v5.1.0 ● July 2005 v5.