MIPS' Aptiv: Will New Core Families Make the Company More Competitive?

Submitted by BDTI on Wed, 05/16/2012 - 19:30

Processor core provider MIPS Technologies has seemingly fallen on hard times in recent years. Consider, for example, a report published by the Linley Group just last week, which indicated that chief competitor ARM supplied 78% of the estimated 10 billion CPU cores in SoCs shipped last year. ARM's estimated per-core license price was 4.6 cents, versus 7 cents for MIPS. However, with MIPS licensees shipping only 665 million MIPS cores (6% of the market, in third place, behind Synopsys/ARC at 10%), MIPS’ revenues trail far behind ARM’s.
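To put rough numbers on that gap, here is a back-of-the-envelope sketch in Python. It assumes that per-core revenue is simply units shipped multiplied by the estimated per-core price quoted above, and it ignores any up-front license fees; the function and variable names are illustrative only, and the inputs are the Linley Group estimates cited in this article, not audited financials.

```python
# Rough per-core revenue estimate from the figures quoted in the article.
# Assumption: revenue = units shipped * estimated per-core price (no up-front fees).

TOTAL_CORES_SHIPPED = 10_000_000_000   # estimated CPU cores in SoCs shipped last year
ARM_SHARE = 0.78                       # ARM's share of those cores
ARM_PRICE_USD = 0.046                  # 4.6 cents per core
MIPS_CORES_SHIPPED = 665_000_000       # cores shipped by MIPS licensees
MIPS_PRICE_USD = 0.07                  # 7 cents per core


def per_core_revenue(units: float, price_usd: float) -> float:
    """Return estimated per-core revenue in US dollars."""
    return units * price_usd


arm_revenue = per_core_revenue(TOTAL_CORES_SHIPPED * ARM_SHARE, ARM_PRICE_USD)
mips_revenue = per_core_revenue(MIPS_CORES_SHIPPED, MIPS_PRICE_USD)

print(f"ARM:  ${arm_revenue / 1e6:.1f} million")
print(f"MIPS: ${mips_revenue / 1e6:.1f} million")
print(f"ARM/MIPS ratio: {arm_revenue / mips_revenue:.1f}x")
```

Under these assumptions, MIPS' higher per-core price is swamped by ARM's volume: roughly $359 million versus roughly $47 million.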

Or consider that MIPS hasn't released a notable new core since September 2010. That core was the MIPS32 1074K, an evolutionary follow-on to the 74K core, which was unveiled back in mid-2007. The company hopes to turn around its fortunes with last week's three-family announcement. Of the three new families, one (microAptiv) is already available for new designs and the other two will be coming to market by early Q3 (Figure 1).

Figure 1. After several quiet years, MIPS' core stable is revitalized with the three-family Aptiv series

This writeup will discuss the high-end proAptiv core; next month's InsideDSP will cover the mid-range interAptiv and entry-level microAptiv core architectures. And speaking of ARM, any sense of déjà vu you may be feeling right now is understandable, since that company also subdivides its core suite into three families: Cortex-M for microcontrollers, Cortex-R for real-time embedded applications, and Cortex-A for high-performance applications. However, as you'll see, Cortex-M and Cortex-A are specifically in MIPS' sights with its new offerings (Figure 2).

Figure 2. proAptiv, interAptiv and microAptiv target a diversity of applications, befitting their respective silicon area, performance and power specifications, with some inter-core overlaps at the application boundaries

MIPS' proAptiv core represents the evolutionary successor to the 1074K (Figure 3).

Figure 3. proAptiv is a superscalar, out-of-order architecture; up to six proAptiv cores can be used in a multi-core configuration in high-end SoC designs

The company claims that proAptiv delivers DMIPS/MHz benchmark results equivalent to those of ARM's latest Cortex-A15 at nearly half the die size (based on ARM's general public comments; no exact Cortex-A15 die size dimensions have yet been published), along with notably higher DMIPS/MHz results than those of the popular ARM Cortex-A9 at a roughly equivalent die size (Figure 4). Note that these die size comparisons encompass MIPS' DSP ASE (application-specific extension) SIMD function block but do not include ARM's optional NEON SIMD engine (64-bit on Cortex-A9, 128-bit on Cortex-A15). Note, too, that MIPS is not necessarily claiming that proAptiv will be able to achieve maximum clock speeds comparable to those of its ARM core competitors.

Figure 4. MIPS believes that proAptiv compares favorably to ARM's Cortex-A9 and Cortex-A15 in terms of performance/MHz based on the DMIPS and (especially) CoreMark benchmarks

MIPS feels that proAptiv's advantages versus ARM are even more evident when evaluated using the CoreMark benchmark. As a review, DMIPS (Dhrystone MIPS) is a simplistic, widely used, integer-only benchmark, originally developed in 1984, whose name was a play on that of the floating point-focused Whetstone benchmark. In going beyond rudimentary instruction counting, Dhrystone attempted to level the playing field between CISC and RISC CPUs. It counts the number of times a program loop executes in a particular amount of time, thereby favoring neither CISC processors (which might require fewer instructions to implement a given function, but at a lower clock speed) nor RISC architectures (potentially running at higher frequencies, but requiring more instructions to complete a particular task). The "MIPS" in DMIPS doesn't strictly refer to "millions of instructions per second;" instead it's an attempt to normalize a particular processor's performance to that of the 1 MIPS VAX 11/780.
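The normalization described above can be sketched in a few lines of Python. The divisor of 1757 Dhrystones per second is the commonly cited VAX 11/780 reference score; the function names and the example core's numbers are hypothetical and are not taken from MIPS' or ARM's published results.

```python
# Convert a raw Dhrystone score into DMIPS and DMIPS/MHz.
# 1757 Dhrystones/second is the commonly cited score of the "1 MIPS" VAX 11/780.

VAX_11_780_DHRYSTONES_PER_SEC = 1757


def dmips(dhrystones_per_sec: float) -> float:
    """Normalize a raw Dhrystone loop rate to the VAX 11/780 baseline."""
    return dhrystones_per_sec / VAX_11_780_DHRYSTONES_PER_SEC


def dmips_per_mhz(dhrystones_per_sec: float, clock_mhz: float) -> float:
    """Divide out clock speed so cores can be compared per MHz."""
    return dmips(dhrystones_per_sec) / clock_mhz


# Hypothetical core: 3,514,000 Dhrystone loops per second at 1000 MHz.
example_score = 3_514_000
example_clock_mhz = 1000

print(dmips(example_score))                             # DMIPS
print(dmips_per_mhz(example_score, example_clock_mhz))  # DMIPS/MHz
```

Because the per-MHz figure divides out clock speed, it says nothing about how fast a core can actually be clocked, which is why the maximum-frequency caveat matters when reading vendor comparisons.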

Dhrystone's developers attempted to develop a general-purpose CPU benchmark, an honorable goal and a largely achieved one, at least at first. However, the benchmark is nearly 30 years old at this point, and its shortcomings are becoming increasingly evident with modern architectures. It comfortably fits within the L1 cache of even entry-level modern microprocessor cores, for example, thereby not exposing the entire CPU (higher-level caches and controllers, external memory controllers, etc.) to analysis. Dhrystone also doesn't fully evaluate architecture characteristics such as the likelihood of burst access pattern misprediction.

Integer-based CoreMark, developed by EEMBC and intended to replace Dhrystone, represents an attempt to modernize the benchmark landscape. Although MIPS admits that CoreMark also fits within its cores' L1 caches, the benchmark comprises approximately 700,000 instructions, versus around 300 instructions in Dhrystone. CoreMark, derived from real-life algorithm code, implements list processing (find and sort), matrix manipulation, state machine, and CRC functions.
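To give a flavor of the kind of integer kernel involved, here is a minimal Python sketch of a bitwise CRC-16. It is a generic textbook implementation of the CRC-16/ARC variant (reflected polynomial 0xA001, zero initial value), chosen for illustration; it is not CoreMark's actual source code, and the function name is hypothetical.

```python
# Generic bitwise CRC-16/ARC: reflected polynomial 0xA001, initial value 0x0000.
# Illustrative only; not taken from the CoreMark sources.

def crc16_arc(data: bytes, crc: int = 0x0000) -> int:
    """Return the CRC-16/ARC of `data`, optionally continuing from `crc`."""
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift right one bit; apply the polynomial if a 1 bit fell off.
            if crc & 1:
                crc = (crc >> 1) ^ 0xA001
            else:
                crc >>= 1
    return crc & 0xFFFF


# The standard check string for CRC catalogs is the ASCII digits 1 through 9.
print(hex(crc16_arc(b"123456789")))  # 0xbb3d
```

Note the data-dependent branch inside the inner loop: whether the polynomial is applied depends on each input bit, which is the sort of control flow that exercises a core's branch predictor.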

CoreMark also makes heavy use of complex branching and control flow operations and is therefore particularly harsh on deep-pipeline processor cores that mispredict branches and consequently end up needing to flush and refill their pipelines. Conversely, deep-pipeline architectures tend to run at higher clock speeds than their short-pipeline competitors, all other factors being equal, and therefore deliver higher performance in scenarios where branch misprediction is infrequent. Keep this assessment in mind as you peruse the following table:

 

                          proAptiv   ARM Cortex-A9   Qualcomm Krait   ARM Cortex-A15

Pipeline depth (stages)   13         8               11               15

Comparison of pipeline stage counts for MIPS' proAptiv and various ARM cores
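The trade-off between pipeline depth and clock speed can be illustrated with a simple first-order model, sketched below in Python. It assumes that each mispredicted branch costs a flush penalty roughly equal to the pipeline depth (a crude simplification), and every numeric input other than the 8-stage and 15-stage depths borrowed from the table is hypothetical: the clock speeds, base CPI and branch fraction are made up for illustration and do not describe any specific core.

```python
# First-order model: deeper pipelines clock faster but pay more per misprediction.
# Assumption: flush penalty in cycles is approximately the pipeline depth.

def effective_cpi(base_cpi: float, branch_fraction: float,
                  mispredict_rate: float, pipeline_depth: int) -> float:
    """Cycles per instruction, including the average misprediction penalty."""
    return base_cpi + branch_fraction * mispredict_rate * pipeline_depth


def instructions_per_second(clock_hz: float, cpi: float) -> float:
    return clock_hz / cpi


def deep_vs_shallow_speedup(mispredict_rate: float) -> float:
    """Throughput of a hypothetical deep core relative to a shallow one."""
    base_cpi = 1.0          # hypothetical
    branch_fraction = 0.2   # hypothetical: one instruction in five is a branch
    shallow = instructions_per_second(
        1.0e9, effective_cpi(base_cpi, branch_fraction, mispredict_rate, 8))
    deep = instructions_per_second(
        1.2e9, effective_cpi(base_cpi, branch_fraction, mispredict_rate, 15))
    return deep / shallow


for rate in (0.01, 0.05, 0.10, 0.20):
    print(f"mispredict rate {rate:.0%}: deep/shallow = "
          f"{deep_vs_shallow_speedup(rate):.2f}")
```

In this toy model the deeper pipeline's clock speed advantage shrinks steadily as the misprediction rate rises, which is the effect a branch-heavy benchmark such as CoreMark is likely to expose.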

MIPS forecasts that the proAptiv core will run at 1.1 GHz or higher frequencies in TSMC's 40 nm "G" process, and projects a 1.15 GHz preliminary clock speed on the foundry's 28 nm HPM (high performance mobile) process. Speeds beyond that, up to an envisioned 2 GHz, will require dispensing with the DSP ASE SIMD engine, harnessing the higher power consumption 28 nm HP (high performance) process, using low threshold voltage transistors, employing OD (oxide diffusion) layer technology, and applying other specialized design and fabrication techniques.

proAptiv, currently implemented as an FPGA prototype with SoC design availability slated for next month or early Q3, is a superscalar out-of-order core with quad instruction fetch and triple bonded dispatch capabilities. If inter-operation dependencies aren't present, the core can simultaneously issue up to four integer and two floating-point operations. Translation look-aside buffer and branch prediction logic improvements are additional factors in proAptiv's 60-75% higher estimated performance versus the 1074K at equivalent clock speeds. proAptiv's floating-point unit is dual-issue, runs at the same speed as the remainder of the core and touts native double-precision support.
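The idea that operations issue together only when no inter-operation dependencies exist can be sketched with a toy scheduler in Python. This is an illustrative model, not MIPS' actual issue logic: it assumes single-cycle latency, tracks only read-after-write dependencies (as if register renaming removed the others), and uses hypothetical register names. The per-cycle limits of four integer and two floating-point operations come from the description above.

```python
# Toy out-of-order issue model: up to 4 integer and 2 floating-point ops per cycle,
# provided an op's sources are not produced by an earlier op still waiting to issue.
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    kind: str          # "int" or "fp"
    dest: str          # register written
    srcs: tuple = ()   # registers read

ISSUE_LIMITS = {"int": 4, "fp": 2}


def schedule(ops):
    """Return a list of cycles, each a list of indices of ops issued that cycle."""
    pending = list(range(len(ops)))
    cycles = []
    while pending:
        issued = []
        used = {"int": 0, "fp": 0}
        for idx in pending:
            op = ops[idx]
            # Results of earlier ops that had not issued before this cycle
            # are not yet available (read-after-write hazard).
            unavailable = {ops[j].dest for j in pending if j < idx}
            has_slot = used[op.kind] < ISSUE_LIMITS[op.kind]
            if has_slot and not (set(op.srcs) & unavailable):
                issued.append(idx)
                used[op.kind] += 1
        pending = [i for i in pending if i not in issued]
        cycles.append(issued)
    return cycles


program = [
    Op("int", "r1"), Op("int", "r2"), Op("int", "r3", ("r1", "r2")),
    Op("fp", "f1"), Op("fp", "f2", ("f1",)),
    Op("int", "r4"), Op("int", "r5"), Op("int", "r6"),
]
print(schedule(program))
```

In this example, the dependent operations (the third and fifth) wait one cycle for their inputs, while the last integer operation waits only because the four integer slots are already taken.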

MIPS claims the ability to include up to six proAptiv cores in a SoC; configurations above four cores will run at a lower clock speed. The Cluster Power Controller is exposed to the operating system and allows for per-core voltage domain management and gating, along with per-core clock gating. Speaking of operating systems, MIPS recently touted full support for its cores within the latest revision of the Android NDK (native development kit), enabling optimized MIPS architecture compilation for Android applications. However, although Android-based smartphones and tablets are key growth targets for MIPS going forward, the company does not offer dynamic recompilation support for existing NDK-developed and ARM-optimized applications, as Intel does with x86-based Medfield.

For existing MIPS licensees, proAptiv is a logical next-generation migration step beyond the MIPS32 1074K. But given the market share statistics noted at the beginning of this article, MIPS will need to begin stealing design wins away from ARM Cortex-A9 and Cortex-A15 in order to reverse its multi-year market share slide. proAptiv touts eye-catching comparative die size, performance and power consumption claims, albeit unproven ones at this time. And OEMs have a vested interest in continually assessing core architecture alternatives, if for no other reason than to keep ARM honest. But ARM and its licensees aren't standing still, either. And MIPS' graphics core strategy remains shaky, reliant on advocating partners such as Imagination Technologies and Vivante in lieu of internal technology a la ARM's Mali. It's therefore unclear whether or not MIPS' generational gains, particularly after several years' worth of evolutionary inactivity, are sufficient to ensure serious reconsideration.

Stay tuned for an in-depth discussion of MIPS' two other new core families, interAptiv and microAptiv, in next month's InsideDSP.
