Massively parallel processor array delivers 30 GigaMACs at 160MHz: dramatically out-performs legacy DSPs
picoChip Designs Ltd, the 3G wireless "Software System on Chip" (SSoC) provider, today announced it has sampled its first device, the PC101, which delivers a massive computational power of 30 Giga-MACs per second, twenty times more than traditional devices despite needing only a fraction of the clock rate (160MHz). Shipped with a complete toolchain and a comprehensive UMTS systems library, the PC101 SSoC dramatically cuts the cost of third generation mobile infrastructure and enables the strategic advantage of in-the-field flexibility - a reprogrammable basestation.
The PC101 has a heterogeneous array of 430 16-bit processors on a single die interconnected by a fast deterministic fabric, delivering 200 Giga-instructions per second. The SSoC delivers the performance and efficiency of a conventional fixed function System on Chip but is completely programmable from standard C or assembler. The architecture was developed with a strong emphasis on ease of design/verification and deterministic performance for embedded signal processing - especially wireless and 3G. Complementing the device is a complete development tool-chain and a comprehensive systems library, providing a complete baseband platform for 3G infrastructure.
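As a rough sanity check on these headline figures, the sketch below divides the quoted peak rates by the 160MHz clock to show the parallelism they imply per clock cycle. The function and constant names are illustrative only, and the results are simple averages derived from the numbers in this release, not statements about how individual processors in the array are built.

```python
# Figures quoted in this release for the PC101.
CLOCK_HZ = 160_000_000               # 160MHz clock
PEAK_MACS_PER_S = 30_000_000_000     # 30 Giga-MACs per second
PEAK_INSTR_PER_S = 200_000_000_000   # 200 Giga-instructions per second
NUM_PROCESSORS = 430                 # processors on the die


def per_cycle(rate_per_second, clock_hz=CLOCK_HZ):
    """Average number of operations completed across the chip per clock cycle."""
    return rate_per_second / clock_hz


def per_processor_per_cycle(rate_per_second, processors=NUM_PROCESSORS):
    """Average operations per processor per cycle (a plain average over the
    array, which the release describes as heterogeneous)."""
    return per_cycle(rate_per_second) / processors


if __name__ == "__main__":
    print(f"MACs per cycle, whole chip: {per_cycle(PEAK_MACS_PER_S):.1f}")
    print(f"Instructions per cycle, whole chip: {per_cycle(PEAK_INSTR_PER_S):.0f}")
    print(
        "Instructions per processor per cycle (average): "
        f"{per_processor_per_cycle(PEAK_INSTR_PER_S):.2f}"
    )
```

Read this way, the quoted throughput comes from doing many operations on every clock cycle across the array rather than from a high clock rate.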
Co-Founder and CTO, Doug Pulley, said, "picoChip has overcome the design challenges of 0.13 um through architectural innovation and given our customers a viable alternative to the pains of SoC design. With the introduction of the picoArray, systems companies can place their own algorithms onto groups of processors in the same way as they would like to be able to place hardware IP blocks onto silicon in an ASIC - the Software System on Chip, or SSoC. In one device, users can integrate different processing tasks and algorithms, including DSP, protocol handling and control all using one consistent design environment and toolchain."
CEO, Dr. Rodger Sykes, added, "We are proud of the PC101 - right first time 0.13um working silicon. We've implemented redundancy that provides very good yield despite the PC101's massive processing power. This provides our customers with a truly dramatic cost advantage compared to traditional approaches, delivering a compelling roadmap to the lowest cost/channel. As importantly, the tools and system library reduce development time and risk, accelerating time-to-market. Developing a 3G basestation can cost well over $100M - using our solution can reduce that significantly."
The PC101 is optimized for wireless communications tasks in two principal ways. First, the structure of the array and the arrangement of the different element types across the chip reflect the balance of requirements of a wireless system. Second, the characteristics and instruction set of the processing elements vary according to their role, to include specialist operations or larger memory complements for control-type operations.
Sean Lavey, an analyst from IDC, concludes, "Current WCDMA basestation designs are still not economic enough for mass deployment in the market. The need for flexibility in this evolving standard has made costly FPGAs the choice of most designs shipped over the past year, but we are now finding OEMs shifting most if not all of their baseband architectures to more cost effective flexible ASSP approaches. We expect this change will result in lower costs to the OEM sparking further cuts in pricing making WCDMA a more mainstream build-out issue for service providers over the next few years."
Comparing the performance of different processors is notoriously difficult, and most measures have their drawbacks. However, the picoArray is so dramatically faster than any alternative that there is little ground for debate.
Analog Devices recently announced the 300MHz version of their TigerSHARC TS101 with 2,400 MMACs (16 bit fixed point), roughly comparable to the 600MHz TI TMS320C6416. Later this year, Motorola will release their MSC8102 with 4,800 MMACs.
However, the PC101 operating at just 160MHz can deliver 30,000 MMACs: twelve times more than the TigerSHARC or C6416 and six times more than the announced MSC8102 - despite needing a lower clock rate.
Just to underline the significance of this, Motorola were quoted [Electronics Weekly 15/01/03] as predicting: "By early 2005 Motorola expect to see devices with more than 10,000 MMAC as a result of improving process technologies such as Silicon on Insulator [and] sub-100nm processes". The PC101 delivers three times this performance today, two years earlier, with a slower clock and without requiring any exotic process developments or (risky) lithography shrinks. It is achieved purely through architectural improvements.
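The comparisons above can be reproduced from the peak figures quoted in this release. The sketch below is only a convenience for checking the arithmetic: the table and function names are illustrative, and the Motorola entry uses the "more than 10,000 MMAC" prediction as a lower bound, so the ratio against it is an upper bound.

```python
# Peak 16-bit MAC throughput in MMACs, as quoted in this release.
PEAK_MMACS = {
    "PC101": 30_000,
    "TigerSHARC TS101": 2_400,
    "TMS320C6416": 2_400,          # described as roughly comparable to the TS101
    "MSC8102": 4_800,
    "Motorola 2005 prediction": 10_000,  # quoted as "more than 10,000"
}


def speedup(device, baseline, table=PEAK_MMACS):
    """Return the ratio of the device's peak MMACs to the baseline's."""
    return table[device] / table[baseline]


if __name__ == "__main__":
    for name in PEAK_MMACS:
        if name != "PC101":
            print(f"PC101 vs {name}: {speedup('PC101', name):.2f}x")
```

The exact ratios are 12.5, 6.25 and 3.0, which the text rounds to twelve, six and three times.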
Moreover, while a conventional device would be fully occupied with achieving this performance, the same workload uses only a portion of the picoArray. For example, a device could deliver an additional >100 Giga-instructions per second at the same time, completely independently, in the rest of the array. The heterogeneous structure of the parallel array means these tasks could be very different, integrating data-path and control/supervisory functions. Similarly, conventional processors cannot usually sustain their peak performance due to I/O bottlenecks, but the distributed nature of processing across a picoArray eliminates that constraint.
This ability to execute many distinct operations at the same time is at the heart of the PC101's performance. Because these tasks are completely independent (or orthogonal) and deterministic, development is very much easier than on a complex DSP - let alone a cluster of several devices.