Update: Cadence Completes Acquisition of Tensilica (Apr 24, 2013)
SANTA CLARA, Calif. - June 22, 2009 - Tensilica, Inc. today announced the first member of its new ConnX family of digital signal processor (DSP) cores for system-on-chip (SOC) design. The ConnX Baseband Engine enables efficient baseband processing for 3G, LTE (Long-Term Evolution) and 4G wireless equipment with its scalable, high-performance DSP architecture that provides industry-leading computational throughput of 16 18-bit MACs per cycle. The ConnX Baseband Engine features an optimized instruction set, high memory bandwidth, scalable clustering, and efficient compiler support with an easy programming model for SIMD (Single Instruction, Multiple Data) vectorization and other DSP functions. This high-performance core is also an effective solution for multi-standard fixed and mobile DTV broadcast demodulators.
The new ConnX Baseband Engine builds on Tensilica's customizable Xtensa LX dataplane processor (DPU) technology and leverages the proven Vectra LX DSP engine option to become one of the fastest DSP cores on the market. The ConnX Baseband Engine is ideal for emerging baseband PHY standards, especially those using Orthogonal Frequency Division Multiplexing (OFDM) modulation and Multiple Input Multiple Output (MIMO) transmission.
Easy programmability, including automatic vectorization for ANSI C programs, and optimized instructions for fast complex FFT (Fast Fourier Transform), FIR (Finite Impulse Response) filters, and complex matrix operations make the new architecture particularly suitable for low-cost base station designs, femto-cell projects, digital media broadcast receivers and software-defined radio handsets.
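As an illustration of the kind of ANSI C source that automatic vectorization targets, here is a minimal sketch of a real-valued FIR filter written as a plain loop nest. The function name `fir_real`, the argument layout, and the choice of 16-bit samples with a wider accumulator are assumptions made for this example; it does not use or represent any Tensilica API.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Plain ANSI C FIR filter (illustrative sketch, hypothetical name).
 *
 *   out[n] = sum over k of coef[k] * in[n + ntaps - 1 - k],  0 <= n < nout
 *
 * `in` must hold nout + ntaps - 1 samples. The inner loop is a simple
 * multiply-accumulate with unit-stride accesses, which is the loop shape
 * a vectorizing compiler typically looks for.
 */
void fir_real(const short *in, const short *coef, long *out,
              size_t nout, size_t ntaps)
{
    size_t n, k;

    assert(ntaps > 0);

    for (n = 0; n < nout; n++) {
        long acc = 0;  /* accumulator wider than the 16-bit inputs */
        for (k = 0; k < ntaps; k++) {
            acc += (long)coef[k] * (long)in[n + ntaps - 1 - k];
        }
        out[n] = acc;
    }
}
```

With input samples 1, 2, 3, 4, 5 and coefficients 1, 2, 3, the three valid outputs are 10, 16 and 22. Whether a given compiler vectorizes this loop depends on the target configuration and compiler options.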
"Our engineers have utilized the full extent of our customizable DPU technology to make the ConnX Baseband Engine," stated Chris Rowen, Tensilica's chief technology officer. "We've added over 200 baseband-specific instructions for compute-intensive functions that slow down other DSP cores. By making this into a 3-slot VLIW (Very Long Instruction Word) machine with up to two load/stores plus one MAC (Multiply Accumulate) and one ALU (Arithmetic Logic Unit) operation per cycle, we get outstanding performance for challenging 4G data throughput requirements."
Architected for 4G and Beyond
While the ConnX Baseband Engine can be used in 3G applications, the architecture of the ConnX Baseband Engine is designed anew for 4G and beyond. It is designed for 8-way SIMD and 3-way VLIW for maximum throughput. It has two 160-byte vector register files supporting 20b x 8 and 40b x 4 vector types for DSP operations. It is extremely efficient at matrix operations, and offers rich vector operations with complex arithmetic support. It can do four complex FIR taps per cycle and one Radix-4 FFT butterfly per cycle.
Optimized Instruction Set
The ConnX Baseband Engine achieves this efficiency with an application-specific instruction set optimized for DSP functions with native support for FFT, FIR filters, and complex matrix operations. By implementing many functions in hardware, the ConnX Baseband Engine gets the performance needed for 4G applications. Special features include:
- Aligned and unaligned vector load and store instructions for 16-bit (20b) and 32-bit (40b) data
- Vector operations: ADD, SUB, MIN, MAX, NEG, ABS, MUL, DIV
- Complex vector operations: CMUL (REAL, IMG), CMAG (magnitude squared), and complex conjugate functions
- Radix-4 FFT butterfly operations
- Multiple addressing modes: circular, bit-reverse addressing
- SIMD reciprocal square root - 4-way on 40-bit operations
- SIMD divide - 8-way on 20-bit operations
- Extended precision fast FIR - 4 x (40-bit,40-bit) complex taps/cycle
- Sixteen 18-by-18-bit multiply-add operations per cycle
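For readers unfamiliar with the operations named in the list above, the following is a minimal scalar sketch in portable C of a complex multiply, a magnitude-squared calculation, and a radix-4 FFT butterfly without twiddle factors. All type and function names here (`cplx`, `c_mul`, `c_mag2`, `radix4_butterfly`, and so on) are hypothetical and chosen for illustration; they are not the ConnX instruction names or intrinsics, and the sketch says nothing about how the hardware implements these operations.

```c
#include <assert.h>

/* Illustrative complex integer type (hypothetical, not a Tensilica type). */
typedef struct { long re; long im; } cplx;

static cplx c_make(long re, long im) { cplx r; r.re = re; r.im = im; return r; }
static cplx c_add(cplx a, cplx b) { return c_make(a.re + b.re, a.im + b.im); }
static cplx c_sub(cplx a, cplx b) { return c_make(a.re - b.re, a.im - b.im); }

/* Complex multiply: (a.re + j*a.im) * (b.re + j*b.im). */
static cplx c_mul(cplx a, cplx b)
{
    return c_make(a.re * b.re - a.im * b.im, a.re * b.im + a.im * b.re);
}

/* Magnitude squared: re^2 + im^2. */
static long c_mag2(cplx a) { return a.re * a.re + a.im * a.im; }

/* Multiply by -j: (re + j*im) * (-j) = im - j*re. */
static cplx c_mul_neg_j(cplx a) { return c_make(a.im, -a.re); }

/*
 * Radix-4 butterfly (forward 4-point DFT, no twiddles):
 *   y0 = x0 +   x1 + x2 +   x3
 *   y1 = x0 - j*x1 - x2 + j*x3
 *   y2 = x0 -   x1 + x2 -   x3
 *   y3 = x0 + j*x1 - x2 - j*x3
 */
static void radix4_butterfly(const cplx x[4], cplx y[4])
{
    cplx s02 = c_add(x[0], x[2]);
    cplx d02 = c_sub(x[0], x[2]);
    cplx s13 = c_add(x[1], x[3]);
    cplx jd  = c_mul_neg_j(c_sub(x[1], x[3]));  /* -j * (x1 - x3) */

    y[0] = c_add(s02, s13);
    y[1] = c_add(d02, jd);
    y[2] = c_sub(s02, s13);
    y[3] = c_sub(d02, jd);
}

/* Usage example: a single complex tone lands entirely in output bin 1. */
static void radix4_self_check(void)
{
    cplx x[4], y[4], p;

    x[0] = c_make(1, 0);   /*  1 */
    x[1] = c_make(0, 1);   /*  j */
    x[2] = c_make(-1, 0);  /* -1 */
    x[3] = c_make(0, -1);  /* -j */
    radix4_butterfly(x, y);

    assert(y[0].re == 0 && y[0].im == 0);
    assert(y[1].re == 4 && y[1].im == 0);
    assert(y[2].re == 0 && y[2].im == 0);
    assert(y[3].re == 0 && y[3].im == 0);

    p = c_mul(c_make(1, 2), c_make(3, 4));  /* (1+2j)(3+4j) = -5+10j */
    assert(p.re == -5 && p.im == 10);
    assert(c_mag2(c_make(3, 4)) == 25);
}
```

The butterfly needs only additions, subtractions, and a swap with sign change for the multiplication by -j, which is one reason radix-4 is a common building block for hardware FFT support.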
The ConnX Baseband Engine joins the proven quad-MAC Vectra DSP option, which is now re-branded as the ConnX Vectra Engine. The ConnX Vectra Engine has been used in many customer designs and is an integral part of the re-branded ConnX 545CK. The ConnX 545CK was previously known as the Diamond 545CK, which received a BDTIsimMark2000™ score of 3820. The BDTIsimMark is a summary measure of overall DSP processing speed based on BDTI's DSP Kernel Benchmarks™*, and the 545CK received the highest BDTIsimMark2000 score of all licensable cores evaluated by BDTI. The ConnX family of DSP engines includes other functions that are in high demand for next-generation compute-intensive tasks. Tensilica expects that it can quickly leverage its customizable processor technology to develop key functions that will significantly expand its business in several market areas.
High Memory Bandwidth
The ConnX Baseband Engine provides basic data memory bandwidth of 32 bytes per cycle. While this is adequate for many applications, the ConnX Baseband Engine also provides the highest memory and interconnect bandwidth in the industry, with direct pipeline access to multiple processors and memories through very wide ports using Tensilica's proven Queue (FIFO) capability. Queues can sustain data rates as high as one transfer every clock cycle, or up to 350 Gbits/sec. The high bandwidth and low control overhead of Queues allow the ConnX Baseband Engine to be used in applications with extreme data rates.
Optimizing Compiler with Support for SIMD Vectorization
For the ConnX Baseband Engine DSP, Tensilica offers efficient compiler support including automatic code scheduling, software pipelining, and SIMD vectorization of ANSI C code. In addition, the ConnX Baseband Engine compilers include direct access to all of the advanced architecture features via embedded functions. Together, these compiler features enable assembly code performance and code density without the time and risk of assembly code programming.
Toolkits Speed the Design Process
In addition to the optimizing compiler, Tensilica offers two Eclipse-based toolkits to speed the design process. The Xtensa Processor Developer's Toolkit is a set of powerful yet easy-to-use tools for processor customization. Designers can pick from several processor options (memories, interfaces, etc.), run the pipeline-modeled, cycle-accurate instruction set simulator or the high-speed instruction-accurate simulator, and perform system modeling with these proven tools.
The Xtensa Software Developer's Toolkit provides a comprehensive collection of code generation and analysis tools that speeds the development process. The Xtensa software development environment is generated - automatically - from the same database as the processor hardware description so designers are guaranteed a perfect match. All configuration options are supported, so there is no need to manually edit or extend the tools. This approach ensures correctness and consistency by construction. Designers get a compiler, linker, assembler, and debugger tuned exactly - and matched exactly - to their tailored processor hardware. The software development environment includes powerful source-level multi-core debugging with flexible data-type display and profiling of both performance and energy to support rapid development of optimal system solutions.
Scalable Architecture for Maximum Performance
While many designs will only require one ConnX Baseband Engine, this architecture can be scaled easily with up to eight instances, providing over 250 GOPS (Giga Operations Per Second) performance.
A major semiconductor company is the lead customer for the ConnX Baseband Engine and will sample first silicon in the third quarter of this year. Availability to other early access customers will begin in September 2009.
About Tensilica
Tensilica, Inc. - the leader in customizable dataplane processors - is a semiconductor IP licensor recognized by the Gartner Group as the fastest-growing semiconductor IP supplier in 2008. Dataplane Processor Units (DPUs) combine the best capabilities of CPUs and DSPs while delivering 10-to-100-times the performance because they can be customized using Tensilica's automated design tools to meet specific dataplane performance targets. Tensilica's DPUs power SOC designs at system OEMs and five out of the top 10 semiconductor companies for products including mobile phones, consumer electronics devices (including digital TV, Blu-ray Disc players, broadband set-top boxes and portable media players), computers, and storage, networking and communications equipment. For more information on Tensilica's patented, benchmark-proven DPUs, visit www.tensilica.com.
* BDTI DSP Kernel Benchmark conditions: All processors are benchmarked with 16-bit fixed-point data. All scores use worst-case clock speeds for the TSMC CL013G process and ARM Artisan SAGE-X library. Vendors can choose different speed/area/power tradeoffs; to understand the tradeoffs, please view all BDTI metrics for each core. BDTIsimMark2000™ scores may be based on projected clock speeds. For more information, see www.BDTI.com/benchmarks.html