Update: Cadence Completes Acquisition of Tensilica (Apr 24, 2013)
Provides the Highest Productivity Configurable Processor Design Environment with More Options, Flexibility and Automation
Santa Clara, CA, December 10, 2007 - Tensilica, Inc. today announced that it has upgraded its two Xtensa® configurable processor families (the Xtensa 7 and Xtensa LX2) with new hardware options and software tool enhancements that make them appeal to an even wider audience of SOC (system-on-chip) designers. Highlights of these capabilities include a new, smaller general purpose register file option, new integer multiplier and divider execution unit options, two new AMBA™ (Advanced Microcontroller Bus Architecture 3.0) bridge options, as well as an easy-to-use new configuration tool that analyzes source C/C++ code and automatically suggests VLIW (very long instruction word) instruction extensions that lead to 30-60 percent improvements in general purpose code performance. These new capabilities provide designers with the most productive configurable processor design environment, with automated features that ensure each processor design is correct by construction.
“This new product generation represents a significant enhancement of our Xtensa processor line in three dimensions – in support of even leaner deeply embedded ‘data engine’ configurations, in richer high-end system support, and in significantly enhancing our processor analysis, modeling, and software tools,” explained Chris Rowen, Tensilica’s president and CEO. “Our Xtensa processors are already widely demanded. These advances both help existing Xtensa users design more sophisticated SOCs and enable new designers to get the full benefits of configurable processors with less design effort and design time than ever before.”
Steve Roddy, Tensilica’s vice president of marketing, added, “The Xtensa configurable processor architecture is so flexible that it is currently in production in functions as varied as a simple cacheless controller, a mid-range Linux applications processor, a high-performance 3-issue VLIW general purpose processor, an audio DSP (digital signal processor), a video DSP engine, high performance image processors, and high performance network processors. No other processor architecture comes close to matching this versatility.”
New Hardware Options
The five most significant new hardware options introduced by Tensilica include: a 16-entry register file, a relocatable exception vector option, a low-area multiplier, an integer divider, and new AMBA-compatible bridges.
First, Tensilica added support for a smaller 16-entry main register file in the Xtensa processor, in addition to the existing support for 32-entry and 64-entry configuration options. This enables the instantiation of a very small processor core that competes in area and power with 8-bit and 16-bit microcontrollers and yet provides the performance, flexibility, and features of a 32-bit controller.
Second, by adding support for relocatable exception vectors, Tensilica is enabling customers to change the memory location of exception and interrupt handlers in software post-silicon. This gives more flexibility to the SOC designer and eases system design.
Third, Tensilica added a low-area, multi-cycle 32x32 multiplier configuration option, which enables the design of an Xtensa configuration that is very small in area, but still has good performance on multiply-rich applications, such as MP3 decoding. This gives designers a new choice that is more area efficient than the existing single-cycle, fully-pipelined 32-bit and 16-bit multiplier configuration choices and still much higher performance than pure software emulation of multiply instructions.
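The area-versus-latency trade-off behind a multi-cycle multiplier can be illustrated with a classic shift-and-add loop. This is only a sketch: the release does not describe the internal design of the Xtensa option, so the one-bit-per-cycle schedule and the name `multiply_iterative` are assumptions made for illustration.

```python
def multiply_iterative(a: int, b: int, width: int = 32) -> tuple[int, int]:
    """Shift-and-add multiply of two unsigned `width`-bit values.

    Returns (product, cycles). One adder is reused on every iteration,
    which is why this style is small in area but takes many cycles.
    The one-bit-per-cycle schedule is an illustrative assumption.
    """
    mask = (1 << width) - 1
    a &= mask
    b &= mask
    product = 0
    cycles = 0
    for bit in range(width):
        if (b >> bit) & 1:
            product += a << bit   # add the shifted multiplicand
        cycles += 1               # one iteration per multiplier bit
    return product, cycles


product, cycles = multiply_iterative(6, 7)
print(product, cycles)
```

A single-cycle, fully pipelined multiplier computes all partial products in parallel hardware instead, trading area for throughput.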
Fourth, Tensilica added a low-area divider configuration option, requiring only about 4000 gates. This provides a standardized and powerful way to boost performance on numerically intensive applications such as those running on GPS (Global Positioning System) controllers and real-time control code applications that are typical of servo, motor and engine control.
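As a rough picture of what a small iterative divider does, the sketch below uses restoring division, a textbook algorithm that produces one quotient bit per step. Whether the Xtensa divider uses this algorithm is not stated in the release; the function name and the unsigned 32-bit operands are assumptions.

```python
def divide_restoring(dividend: int, divisor: int, width: int = 32) -> tuple[int, int]:
    """Unsigned restoring division, one quotient bit per iteration.

    Returns (quotient, remainder). Textbook algorithm, shown only to
    illustrate a low-area iterative divider; not the vendor's design.
    """
    mask = (1 << width) - 1
    dividend &= mask
    divisor &= mask
    if divisor == 0:
        raise ZeroDivisionError("division by zero")
    quotient = 0
    remainder = 0
    for bit in range(width - 1, -1, -1):
        # Shift the next dividend bit into the partial remainder.
        remainder = (remainder << 1) | ((dividend >> bit) & 1)
        quotient <<= 1
        if remainder >= divisor:
            remainder -= divisor   # subtraction succeeds: quotient bit is 1
            quotient |= 1
    return quotient, remainder


print(divide_restoring(100, 7))
```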
Finally, Tensilica added the AMBA 3 AXI bridge as a click-box configuration option. This, in addition to the existing AMBA 2 AHB-lite (Advanced High-performance Bus-lite) bridge option, allows designers to seamlessly drop Xtensa processors into AMBA-based systems and eases the use of Xtensa processors with other AMBA peripherals.
Software Tools Enhancements
Tensilica made many significant enhancements to its software development toolkit and the Eclipse-based Xtensa Xplorer™ design environment to make processor configuration even easier and faster for designers who have never used configurable processors before. The first and most important of the processor configuration tool enhancements is the automated Flexible Length Instruction Extension (FLIX) generator for Xtensa LX2, which profiles a designer’s target C code and suggests VLIW instruction specifications that can significantly accelerate the most critical code. By allowing two or three instructions to execute simultaneously, FLIX allows an Xtensa LX2 processor to act as a 2- or 3-issue VLIW CPU.
Designers can accelerate general purpose code by 40 to 60 percent by using simple, general purpose VLIW instructions. This tool eliminates the need for the designer to analyze the code for areas that can be sped up in this way, significantly shortening the design process. After the processor core has been created using these new VLIW instructions, software developers programming the Xtensa core need only use the standard Xtensa C/C++ Compiler (XCC), which automatically extracts the instruction-level parallelism from C/C++ code and bundles operations into VLIW instructions whenever possible. As a result, the programmer does not have to modify the application C/C++ code to take advantage of the VLIW instruction extensions.
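The idea of bundling independent operations into VLIW instructions can be sketched with a greedy, in-order packer. This is a deliberate simplification and not XCC's scheduling algorithm: the `Op` record, the register names, and the rule of never reordering operations are all assumptions chosen to keep the example short.

```python
from typing import NamedTuple


class Op(NamedTuple):
    name: str
    dest: str
    srcs: tuple


def bundle(ops, issue_width=3):
    """Greedily pack operations, in program order, into VLIW bundles.

    An operation starts a new bundle when the current one is full or
    when it depends on a result produced inside the current bundle.
    Reading a register that a later slot of the same bundle overwrites
    is allowed, assuming all slots read their sources before any writes.
    """
    bundles = []
    current = []
    for op in ops:
        conflict = any(
            prev.dest in op.srcs       # read-after-write
            or prev.dest == op.dest    # write-after-write
            for prev in current
        )
        if conflict or len(current) == issue_width:
            bundles.append(current)
            current = []
        current.append(op)
    if current:
        bundles.append(current)
    return bundles


ops = [
    Op("add", "t0", ("a", "b")),
    Op("add", "t1", ("c", "d")),
    Op("mul", "t2", ("t0", "t1")),
    Op("add", "t3", ("e", "f")),
    Op("sub", "t4", ("t2", "t3")),
]
for slot_group in bundle(ops):
    print([op.name + " " + op.dest for op in slot_group])
```

Here five sequential operations fit into three bundles; a production compiler would also reorder operations and apply software pipelining to fill more slots.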
Second, Tensilica introduced the “Manual Fusion Editor,” a graphical tool that enables the SOC designer to quickly create chains or fusions of fundamental computation operations in order to improve performance. For example, basic ADD and SHIFT operations can be combined to form an ADD_SHIFT instruction that executes in one cycle. This ADD_SHIFT instruction could replace two sequentially issued instructions (ADD followed by SHIFT), thus saving a clock cycle and reducing code size. As with the FLIX Generator, the operation fusion instructions created by the Manual Fusion Editor are included in the finished processor core RTL and implemented in silicon by the SOC designer. Software developers take advantage of these new instructions merely by using the standard Tensilica software development environment. Fused instructions are automatically inferred by the Xtensa C/C++ Compiler (XCC), so that the application C/C++ code does not need to be modified.
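The ADD_SHIFT example can be modeled as a small peephole pass over a toy instruction list. Everything here is hypothetical: the tuple encoding, the left-shift semantics, and the assumption that a temporary is dead at the end of the program are illustrative choices, not a description of how XCC infers fused instructions.

```python
def _temp_is_live(program, start, temp):
    """True if `temp` is read from `start` onward before being rewritten."""
    for _op, dest, *srcs in program[start:]:
        if temp in srcs:
            return True
        if dest == temp:
            return False
    return False  # assumption: temporaries are dead at program end


def fuse_add_shift(program):
    """Replace ADD t,a,b ; SHIFT d,t,n with ADD_SHIFT d,a,b,n when safe."""
    fused = []
    i = 0
    while i < len(program):
        ins = program[i]
        nxt = program[i + 1] if i + 1 < len(program) else None
        can_fuse = (
            nxt is not None
            and ins[0] == "ADD" and nxt[0] == "SHIFT"
            and nxt[2] == ins[1]           # SHIFT consumes the ADD result
            and nxt[3] != ins[1]           # shift amount is not the temp
            and (nxt[1] == ins[1] or not _temp_is_live(program, i + 2, ins[1]))
        )
        if can_fuse:
            fused.append(("ADD_SHIFT", nxt[1], ins[2], ins[3], nxt[3]))
            i += 2
        else:
            fused.append(ins)
            i += 1
    return fused


def run(program, regs):
    """Tiny interpreter used to check that fusion preserves results."""
    regs = dict(regs)
    val = lambda x: regs[x] if isinstance(x, str) else x
    for op, dest, *srcs in program:
        if op == "ADD":
            regs[dest] = val(srcs[0]) + val(srcs[1])
        elif op == "SHIFT":
            regs[dest] = val(srcs[0]) << val(srcs[1])   # left shift assumed
        elif op == "ADD_SHIFT":
            regs[dest] = (val(srcs[0]) + val(srcs[1])) << val(srcs[2])
    return regs


program = [("ADD", "t", "a", "b"), ("SHIFT", "d", "t", 2)]
fused = fuse_add_shift(program)
print(fused, run(fused, {"a": 3, "b": 5})["d"])
```

The liveness check matters: if the ADD result is still needed later, fusing would drop a value the rest of the program reads.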
Third, designers will see C/C++ source files compile an average of 20 percent faster in the XCC compiler, the centerpiece of the Xtensa compiler tool chain. This optimizing compiler allows designers to run their C and C++ code on Xtensa processors, taking full advantage of all optimizations made to that processor. To increase code execution speed and reduce code size, XCC employs sophisticated multi-level optimizations such as function inlining, software pipelining, static single assignment (SSA) optimizations, and other code generation techniques. Tensilica’s enhancements make XCC not just one of the fastest compilers for 32-bit processors, but also the most efficient, with exceptional code density.
Fourth, Tensilica offers a new dynamic loader, a software tool that allows binaries to be loaded in different memory addresses at run time. This is useful, for example, for audio and video codecs, so that the same codec can be loaded into different memory locations at run-time depending on available memory.
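The core step of loading the same binary at different addresses is relocation: patching every word that holds an address so it reflects the chosen base. The sketch below assumes a made-up image format (a list of 32-bit words plus a list of word indices to patch); it is not the format used by Tensilica's loader.

```python
def load_relocatable(image_words, reloc_indices, base_address):
    """Return a copy of the image with address words rebased.

    image_words   : 32-bit words, with addresses stored relative to 0
    reloc_indices : positions of the words that hold addresses
    base_address  : where the image is placed at run time
    The image layout is a hypothetical one chosen for illustration.
    """
    loaded = list(image_words)
    for index in reloc_indices:
        loaded[index] = (loaded[index] + base_address) & 0xFFFFFFFF
    return loaded


codec_image = [0x00000010, 7, 0x00000040]   # words 0 and 2 hold addresses
print([hex(w) for w in load_relocatable(codec_image, [0, 2], 0x60000000)])
print([hex(w) for w in load_relocatable(codec_image, [0, 2], 0x3FFE0000)])
```

The same codec image yields different patched addresses for each load location, while non-address data (the `7`) is left untouched.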
With the fifth major enhancement, Tensilica sped up the execution of its cycle-accurate instruction set simulator (ISS) by 15 to 30 percent. Likewise, the TurboXim fast functional simulator, which already executes at 50x the speed of the ISS, has also received numerous enhancements that improve its execution speed.
Finally, this release of the Xtensa software tools also enhances the functionality of the Xenergy energy estimation tool by adding visualization and automatic cache energy search tools to the Xplorer Eclipse-based IDE. Xenergy generates an energy profile of the Xtensa processor and its memory sub-systems for application code by modeling each instruction and its memory accesses while the application is being executed. The additions in this release include not only the ability to view and compare the energy profiles, but also a tool that automatically sweeps over different instruction and data cache configurations and graphically charts the energy profile for the application code on the particular Xtensa configuration for each cache configuration. Xenergy is a valuable tool in the system designer’s toolkit for guiding energy-efficient choices while configuring an Xtensa processor, writing application-specific TIE instructions, and deciding on the right memory and cache configuration.
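A cache-configuration sweep of this kind can be approximated in a few lines: replay an address trace through each candidate cache and score it with an energy model. The direct-mapped cache, the 16-byte line size, and all energy costs below are invented placeholders for illustration; they are not Xenergy's model or real measurements.

```python
def simulate_direct_mapped(trace, num_lines, line_bytes=16):
    """Replay byte addresses through a direct-mapped cache.

    Returns (hits, misses).
    """
    tags = [None] * num_lines
    hits = misses = 0
    for addr in trace:
        block = addr // line_bytes
        index = block % num_lines
        if tags[index] == block:
            hits += 1
        else:
            tags[index] = block
            misses += 1
    return hits, misses


def sweep_cache_energy(trace, line_counts,
                       hit_cost=1.0, miss_cost=20.0, leak_per_line=0.01):
    """Estimate energy per cache size using hypothetical unit costs.

    hit_cost, miss_cost and leak_per_line are arbitrary placeholders.
    """
    results = {}
    for num_lines in line_counts:
        hits, misses = simulate_direct_mapped(trace, num_lines)
        leakage = num_lines * leak_per_line * len(trace)
        results[num_lines] = hits * hit_cost + misses * miss_cost + leakage
    return results


trace = [0, 64, 0, 64]                      # two blocks accessed alternately
energy = sweep_cache_energy(trace, [1, 8])
best = min(energy, key=energy.get)
print(energy, "lowest-energy line count:", best)
```

In this toy trace the one-line cache thrashes, so the larger cache wins despite its higher leakage term; with a different trace the balance can tip the other way, which is the reason to sweep.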
All of these enhancements just started shipping with the November 2007 release of Xtensa LX2 and Xtensa 7 processor cores and the Tensilica software development tools.
Tensilica offers the broadest line of controller, CPU and specialty DSP processors on the market today, in both an off-the-shelf format via the Diamond Standard Series cores and with full designer configurability with the Xtensa processor family. Tensilica’s low-power, benchmark proven processors have been designed into high-volume products at industry leaders in the digital consumer, networking and telecommunications markets. All Tensilica processor cores are complete with a matching software development tool environment, portfolio of system simulation models, and hardware implementation tool support. For more information on Tensilica's patented approach to the creation of application-specific building blocks for SOC design, visit www.tensilica.com.