Wave Computing's MIPS I6500 multiprocessor core extends the variety and scalability of the company's “off-the-shelf” licensable IP cores based on the proven and respected MIPS64 architecture, and delivers a compelling solution for heterogeneous computing.
This IP core offering provides key features to deliver “heterogeneous inside and out” designs, many-core/multi-cluster scalable processing, and real-time deterministic execution – even when utilizing its support for hardware virtualization – making the I6500 family one of the most scalable, flexible and powerful IP cores in the industry.
Its flexibility and scalability make it ideal for the growing and varied requirements of heterogeneous computing applications, including advanced driver assistance systems (ADAS) and autonomous driving, high-performance networking, machine learning, drones, industrial automation, security, and video analytics.
Like the I6400 IP core family before it, the foundation of the I6500 family is a multi-threaded superscalar CPU core, and a single multi-core cluster can contain up to six of these cores. As the basis for its “heterogeneous inside” capabilities, each core in the cluster can now be individually configured at silicon design time to align the performance, area and power of the total solution with application requirements. The configurable options include the number of hardware threads, the size of each L1 cache, and the optional inclusion of a SIMD/FPU processing unit.
Data ScratchPad RAM (SPRAM) per core, up to four AXI ports for low-latency peripherals or cluster-level SPRAM, and inter-thread communication (ITC) support are added as optional features. These features support deterministic, low-latency operation and fast-path messaging in embedded systems, as well as high-performance networking and data processing applications, and they operate as a complement to the standard cached memory system.
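The per-core configurability described above can be illustrated with a small sketch. This is not a Wave Computing tool or file format; the `CoreConfig` class and the `validate_cluster` and `total_hardware_threads` functions are hypothetical names invented here. The limits they check (up to six cores per cluster, up to four threads per core, 32 KB or 64 KB L1 caches, up to 1 MB of data scratchpad) are the ones stated in this document.

```python
from dataclasses import dataclass

# Limits quoted in this document's feature lists.
MAX_CORES_PER_CLUSTER = 6
MAX_THREADS_PER_CORE = 4
L1_SIZES_KB = (32, 64)
MAX_DSPRAM_KB = 1024  # "up to 1 MB"


@dataclass(frozen=True)
class CoreConfig:
    """Hypothetical record of one core's design-time options."""
    threads: int = 1
    l1i_kb: int = 32
    l1d_kb: int = 32
    simd_fpu: bool = False   # optional SIMD/FPU unit
    dspram_kb: int = 0       # 0 means no data scratchpad RAM


def validate_cluster(cores):
    """Return a list of problems; an empty list means the mix is within limits."""
    problems = []
    if len(cores) > MAX_CORES_PER_CLUSTER:
        problems.append(
            f"{len(cores)} cores exceeds the limit of {MAX_CORES_PER_CLUSTER}"
        )
    for i, core in enumerate(cores):
        if not 1 <= core.threads <= MAX_THREADS_PER_CORE:
            problems.append(f"core {i}: {core.threads} threads is out of range")
        for name, size in (("L1I", core.l1i_kb), ("L1D", core.l1d_kb)):
            if size not in L1_SIZES_KB:
                problems.append(f"core {i}: {name} of {size} KB is not supported")
        if not 0 <= core.dspram_kb <= MAX_DSPRAM_KB:
            problems.append(f"core {i}: D-SPRAM of {core.dspram_kb} KB is out of range")
    return problems


def total_hardware_threads(cores):
    """Sum of hardware threads across all cores in the cluster."""
    return sum(core.threads for core in cores)


# A mixed cluster: one large core, one minimal core, one mid-sized core with scratchpad.
cluster = [
    CoreConfig(threads=4, l1i_kb=64, l1d_kb=64, simd_fpu=True),
    CoreConfig(threads=1),
    CoreConfig(threads=2, dspram_kb=256),
]
print(validate_cluster(cluster))        # prints []
print(total_hardware_threads(cluster))  # prints 7
```

The point of the sketch is that the cores in one cluster need not be identical: each entry carries its own thread count, cache sizes and optional units.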
The combination of simultaneous multi-threading with hardware virtualization in the I6500 processor enables multiple execution environments to run simultaneously, isolated from each other, with zero context switch overhead.
- MIPS I-class I6500 Base Core Features
- 64-bit MIPS64® Release 6 Instruction Set Architecture
- Proven, successful, well supported 64-bit architecture
- Superset of MIPS32 – runs MIPS32 software directly
- Balanced, 9-stage, dual-issue pipeline with Simultaneous Multi-Threading (SMT)
- Superscalar on a single thread or two threads simultaneously per cycle
- Up to four threads per core
- Instruction bonding – merges sequential integer or floating point loads or stores into one operation for up to 2x increase on memory-intensive data movement routines
- High-performance dual-issue FPU/SIMD Unit – optional
- 32 x 128-bit register set, 128-bit loads/stores to/from SIMD unit
- Native data types: 8-/16-/32-/64-bit integer and fixed point, 16-/32-/64-bit floating point
- IEEE-754 2008 compliant
- Full hardware virtualization
- Provides root and guest privilege levels for kernel and user space
- Supports multiple guests, with a full virtual CPU per guest, so guest OSs run unmodified
- Separate TLBs and COP0 contexts for root and guests, enabling full isolation, fast context switching, and exception and interrupt handling by root
- Complete SoC virtualization support (IOMMU and interrupt handling – see multi-core features)
- L1 caches
- Instruction and data caches of 32 KB or 64 KB each, with ECC, 4-way set associative
- Data ScratchPad RAM (D-SPRAM)
- Up to 1 MB with ECC, for deterministic low latency access and/or high performance data processing and movement outside of standard cached memory hierarchy (e.g. DMA directly into a core’s local D-SPRAM)
- Programmable Memory Management Unit (MMU)
- First and second level TLBs with arrays for variable and fixed page size support
- MIPS I-class I6500 Series Multi-Core & Multi-Cluster Features
- Coherent multi-core and multi-cluster platform, providing extensible implementations in support of both homogeneous and heterogeneous computing applications
- Flexibility in the mix of cores and I/O coherency unit (IOCU) ports enables compute and throughput optimization, matching heterogeneous performance to application needs
- Support for multi-cluster implementations of up to 64 compute clusters
- IP available as:
- Single cluster IP deliverable for use in combination with coherent fabric alternatives (ACE-compatible) for multi-cluster scalability, or
- Complete multi-cluster sub-system deliverable
- Per cluster multi-core system designed for maximum cluster-level bandwidth
- Coherence Manager (CMv3.5)
- Extensible to coherent multi-cluster implementations
- Within a single cluster, supports multi-port configurations of up to:
- Six cores in a single cluster (plus up to two hardware I/O coherency unit (IOCU) ports), or
- Eight IOCU ports for “clustering” hardware accelerators (even without a CPU core on the same cluster)
- New directory-based coherency scheme – improves power consumption, performance and scalability
- High-bandwidth 256-bit internal data paths and external system interface
- Integrated L2 cache (L2$): 16-way set associative, up to 8 MB
- Dual pipelines for maximizing bandwidth on L1$ misses
- ECC option on L2$ RAM for higher data reliability
- Configurable wait states to RAM for optimal L2$ design
- L2$ hardware pre-fetch for higher throughput and performance
- Up to four auxiliary AXI ports enable features such as:
- Separate path for non-coherent memory transactions
- Shared access to low latency peripherals
- Shared access to low latency and deterministic SPRAM (within a cluster, or even across clusters)
- Inter-Thread Communication (ITC)
- Fast path, higher efficiency alternative for messaging/data passing between threads within a core or a cluster
- Global interrupt controller (GIC) with 256 interrupts per cluster
- Advanced power management
- Core-level DVFS (dynamic voltage and frequency scaling) – each core can run at an independent clock frequency and voltage level
- Virtualization support at system and SoC level
- Up to 31 guest execution environments per cluster
- IOCUs include I/O MMU; GIC has virtualized interrupts
- Guest ID brought out on system i/f for integration into multi-cluster and virtualized SoC designs
- Advanced debug capabilities – Debug and Trace
- Debug unit (DBU) supporting JTAG or APB i/f for CoreSight compatibility
- Program and Data Trace (PDtrace), with on-chip or off-chip trace buffering
- Heterogeneous Inside: In a single cluster, designers can optimize power consumption with the ability to configure each CPU with different combinations of threads, different cache sizes, different frequencies, and even different voltage levels.
- Heterogeneous Outside: The latest MIPS Coherence Manager, with an AMBA ACE interface to popular ACE coherent fabric solutions such as those from Arteris and NetSpeed, lets designers mix different configurations of processing clusters on a chip, including PowerVR GPUs, for high system efficiency.
- Simultaneous Multi-threading (SMT): Based on a superscalar dual-issue design implemented across generations of MIPS CPUs, this proven feature enables execution of multiple instructions from multiple threads every clock cycle, providing higher utilization and CPU efficiency.
- Hardware virtualization (VZ): The I6500 builds on the real-time hardware virtualization capability pioneered in the MIPS I6400 core. Designers can save costs by safely and securely consolidating the work of multiple CPU cores onto a single core, save power where multiple cores are required, and dynamically and deterministically allocate CPU bandwidth per application.
- SMT + VZ: The combination of SMT with VZ in the I6500 offers “zero context switching” for applications requiring real-time response. This feature, alongside the provision of scratchpad memory, makes the I6500 ideal for applications which require deterministic code execution.
- Ideal for compute intensive, data processing and networking applications: The I6500 is designed for high-performance/high-efficiency data transfers to localized compute resources with data scratchpad memories per CPU, and features for fast path message/data passing between threads and cores.
- OmniShield-ready: the multi-domain security technology used across the MIPS processing families enables isolation of applications in trusted environments, providing a foundation for security by separation.
- Straightforward software development: The I6500 is based on the mature MIPS ISA, which is broadly supported in the development ecosystem by multiple vendors. Customers adopting the I6500 can enjoy a wide choice of compilers, debuggers, operating systems, hypervisors and application software, all optimized for the MIPS ISA.
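As a rough worked example of the cache figures in the feature lists above (L1 caches of 32 KB or 64 KB, 4-way set associative; L2 cache up to 8 MB, 16-way set associative), the sketch below computes how many sets each configuration implies. The 64-byte line size is an assumption made here for illustration, since this document does not state the line size, and `cache_sets` is a hypothetical helper name.

```python
KB = 1024
MB = 1024 * KB


def cache_sets(capacity_bytes, ways, line_bytes=64):
    """Number of sets in a set-associative cache.

    line_bytes defaults to 64, which is an assumption, not a figure
    taken from this document.
    """
    bytes_per_set = ways * line_bytes
    if capacity_bytes % bytes_per_set:
        raise ValueError("capacity is not a whole number of sets")
    return capacity_bytes // bytes_per_set


def index_bits(sets):
    """Address bits needed to select a set (sets must be a power of two)."""
    if sets <= 0 or sets & (sets - 1):
        raise ValueError("set count must be a power of two")
    return sets.bit_length() - 1


for label, capacity, ways in (
    ("L1 32 KB, 4-way", 32 * KB, 4),
    ("L1 64 KB, 4-way", 64 * KB, 4),
    ("L2 8 MB, 16-way", 8 * MB, 16),
):
    sets = cache_sets(capacity, ways)
    print(f"{label}: {sets} sets, {index_bits(sets)} index bits")
```

Under the 64-byte assumption this gives 128 and 256 sets for the two L1 sizes and 8192 sets for the largest L2; a different line size would scale these counts accordingly.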