The 32-bit CPU core is a high-performance processor with an FPU, ideal for integration into ASIC and/or FPGA designs with off-chip memories or large on-chip memories that require a cache. The 32-bit CPU core is suited to a wide range of applications, including running complex operating systems such as Linux and uClinux.
The processor features separate instruction and data caches that can be configured in size (from 1 to 64kB) and associativity (direct mapped, 2-way or 4-way associative) to increase performance when accessing off-chip memory. The optional paged memory management unit (MMU) enables the implementation of virtual memory and the ability to run operating systems such as Linux.
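As an illustration of how the configurable cache size and associativity described above trade off, the following minimal sketch computes the number of sets and the address bit split for a given configuration. The 32-byte line size, the power-of-two size restriction and the 32-bit address width are assumptions made for the example; they are not stated in this document, and the class and field names are hypothetical.

```python
from dataclasses import dataclass

ADDRESS_BITS = 32  # assumption: 32-bit addresses


@dataclass(frozen=True)
class CacheGeometry:
    size_bytes: int        # total cache size, 1kB to 64kB
    ways: int              # 1 = direct mapped, 2 or 4 = set associative
    line_bytes: int = 32   # assumption: line size is not given in the source

    def __post_init__(self):
        if self.ways not in (1, 2, 4):
            raise ValueError("associativity must be 1, 2 or 4 ways")
        if not 1024 <= self.size_bytes <= 64 * 1024:
            raise ValueError("cache size must be between 1kB and 64kB")
        # Assumption: sizes are powers of two so the index is a bit field.
        if self.size_bytes & (self.size_bytes - 1):
            raise ValueError("cache size must be a power of two")

    @property
    def sets(self) -> int:
        """Number of sets: total size divided by the bytes held per set."""
        return self.size_bytes // (self.ways * self.line_bytes)

    @property
    def offset_bits(self) -> int:
        return self.line_bytes.bit_length() - 1

    @property
    def index_bits(self) -> int:
        return self.sets.bit_length() - 1

    @property
    def tag_bits(self) -> int:
        return ADDRESS_BITS - self.index_bits - self.offset_bits


if __name__ == "__main__":
    for size_kb, ways in [(1, 4), (16, 2), (64, 1)]:
        g = CacheGeometry(size_kb * 1024, ways)
        print(size_kb, "kB,", ways, "way:", g.sets, "sets,",
              g.tag_bits, "tag bits,", g.index_bits, "index bits")
```

For a fixed cache size, doubling the associativity halves the number of sets and adds one bit to the tag, which is why higher associativity costs extra comparison logic.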
The 5-stage pipeline allows high clock frequencies to be achieved. While most instructions effectively execute in a single clock cycle, the pipeline also allows the C and C++ compiler to schedule independent instructions so that instructions that take multiple cycles to execute appear to take only one clock cycle. Static branch prediction is employed to minimize the cost of branch instructions.
The 32-bit CPU core's instruction set includes arithmetic and logical instructions (including barrel-shift, multiply and divide), comparisons, loads and stores, branches and calls, as well as system-level instructions to control interrupts and enter low-power states. The 32-bit CPU core supports a set of IEEE-754 compliant single-precision floating point instructions, including divide and square root. There are also a number of optional instructions and addressing modes that can be selected, should a specific application require them. For those applications that require extreme performance or ultra-low-power operation, user defined instructions can be implemented.
A number of instructions are reserved to allow the user to utilise user defined logic via a simple interface. User defined registers and condition codes are also supported, allowing the most complicated applications to be accelerated.
Instructions are encoded in either 16 or 32 bits, depending upon the size of the operands and the type of instruction. All of the commonly used instructions can be encoded in 16 bits. This ensures that high code density is achieved, which helps to increase performance by increasing the number of instructions that can be stored in the instruction cache. The processor supports both user and supervisor operating modes, with privileged instructions and memory areas, to allow an O/S kernel to be fully protected from user applications.
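The effect of mixed 16-bit and 32-bit encodings on code size and on instruction cache capacity can be estimated with simple arithmetic. The sketch below is illustrative only: the fraction of instructions that use the 16-bit encoding depends on the application and compiler, so `short_fraction` is a hypothetical input rather than a figure from this document, and both function names are made up for the example.

```python
def code_bytes(n_instructions: int, short_fraction: float) -> int:
    """Code size in bytes when a given fraction of instructions use 16 bits."""
    if not 0.0 <= short_fraction <= 1.0:
        raise ValueError("short_fraction must be between 0 and 1")
    n_short = round(n_instructions * short_fraction)  # 2 bytes each
    n_long = n_instructions - n_short                 # 4 bytes each
    return 2 * n_short + 4 * n_long


def instructions_in_cache(cache_bytes: int, short_fraction: float) -> int:
    """Approximate number of instructions that fit in an instruction cache."""
    if not 0.0 <= short_fraction <= 1.0:
        raise ValueError("short_fraction must be between 0 and 1")
    average_bytes = 2 * short_fraction + 4 * (1.0 - short_fraction)
    return int(cache_bytes / average_bytes)


if __name__ == "__main__":
    # Hypothetical mix: three quarters of instructions use the 16-bit form.
    print(code_bytes(1000, 0.75))                  # bytes for 1000 instructions
    print(instructions_in_cache(16 * 1024, 0.75))  # capacity of a 16kB cache
```

Under these assumptions, a program in which every instruction used the 16-bit form would fit twice as many instructions in the same cache as one using only 32-bit encodings.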
- 32-bit RISC architecture
- 32 general purpose registers
- 8, 16 or 32 floating point registers
- 104 basic instructions and 10 addressing modes
- IEEE 754 single-precision floating point unit (FPU)
- Supports up to 96 user-defined instructions
- 5-stage pipeline
- Optional memory management unit (MMU)
- Configurable instruction and data caches (1-64kB; direct mapped, 2-way or 4-way associative)
- AMBA AXI or AHB interconnect and APB peripheral bus
- User and supervisor operating modes
- Up to 32 interrupts plus NMI and system call
- Fast interrupt response time of 6-9 cycles
- JTAG or serial debug
- Delivers 2.71 CoreMark per MHz
- High code density
- Intermixed 16-bit and 32-bit instructions
- High quality IP:
- C and C++ software development using license-free GNU tools, under the industry-standard Eclipse IDE
- Easy migration path to cacheless implementation
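The per-MHz and per-cycle figures in the feature list can be converted into absolute numbers once a clock frequency is chosen. The following sketch does that conversion; the 100 MHz clock used in the example is a hypothetical value, since no target frequency is stated here, and the linear scaling assumes the memory system keeps pace with the core.

```python
COREMARK_PER_MHZ = 2.71        # from the feature list
IRQ_RESPONSE_CYCLES = (6, 9)   # from the feature list: best and worst case


def coremark_score(clock_mhz: float) -> float:
    """Estimated CoreMark score, assuming it scales linearly with clock."""
    if clock_mhz <= 0:
        raise ValueError("clock_mhz must be positive")
    return COREMARK_PER_MHZ * clock_mhz


def irq_response_ns(clock_mhz: float) -> tuple:
    """Best and worst case interrupt response time in nanoseconds."""
    if clock_mhz <= 0:
        raise ValueError("clock_mhz must be positive")
    period_ns = 1000.0 / clock_mhz  # one clock period in nanoseconds
    best, worst = IRQ_RESPONSE_CYCLES
    return (best * period_ns, worst * period_ns)


if __name__ == "__main__":
    clock_mhz = 100.0  # hypothetical clock frequency
    print("CoreMark estimate:", round(coremark_score(clock_mhz), 1))
    print("Interrupt response (ns):", irq_response_ns(clock_mhz))
```

At the hypothetical 100 MHz clock, the 6-9 cycle response corresponds to 60-90 ns.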
Block Diagram of the High-performance 32-bit Processor with cache and FPU