A decade ago, the programmability of embedded microcontrollers and digital signal processors made them a popular and flexible alternative to the microprocessor. Since then, the programmability of FPGAs has been a boon to prototyping complex ASIC designs; at the same time, reconfigurable FPGAs allow end products to be customized. Now, in the system-on-chip era, programmable and configurable intellectual property will meet the equivalent challenges.
While SoC technology can and will enable the ever-more-advanced products demanded by customers, it never seems to do so fast enough. Either the market has moved on (typical of high-growth consumer applications, where standards evolve rapidly and long-term prediction is difficult and risky) or an already crowded market means reduced sales at lower prices and severely dented profits.
The reuse of predesigned hardware building blocks is speeding up the design process. But the rush to build a so-called virtual board on silicon is fraught with problems. Smaller geometries have changed some of the fundamental relationships between components, and today's design methodologies are struggling to cope. Also, the hard macro blocks can be difficult to port to new process technologies while soft macros are generally optimized for their previous use.
Meanwhile, the design costs associated with achieving the required efficiency and performance of an SoC are high. Specialized ASIC designers are needed, as are costly and complex tools. Most important, the process takes time. With communications protocols, audio and video standards and interface preferences changing so rapidly, few design teams can guarantee selection of the right one when silicon is finally delivered, say, 15 months hence.
However, deep-submicron capacity coupled with the ability to mix memory and logic on chip opens new possibilities for constructing platforms that do not exist in the board-level world. What is needed is a design approach that allows rapid development and redesign without significant upfront costs.
So consider a different approach. Software running on an embedded processor provides a high degree of reuse; designers can incorporate an embedded processor in their next design and gain the use of software developed for that processor in previous designs. Programmable hardware offers a rapid path to market, particularly if it can be integrated easily into a larger system. Configurable processors take this type of reuse to the next level. Not only does the designer gain the advantages of reuse but also he or she can configure (or deconfigure) and customize the processor. Taking this one step further, configurable platforms provide a development environment for the design to be configured and customized for the specific needs of the new application.
Configurable processors for SoC use are becoming available, including the Jazz processor from Improv Systems. It is a very long instruction word (VLIW) processor optimized for both high-performance signal-processing applications and advanced control processing.
The Jazz processor offers an innovative approach to SoC design, in particular to design reuse. The architecture was designed specifically for data- and computation-intensive applications such as video, audio and image processing and embedded connectivity. Such applications typically require a high-performance data path and a state machine to control it. Improv realized that this would be implemented best in a deep-submicron IC using a VLIW processor architecture and configurable logic interspersed with shared memories, since a central memory structure offers no advantages. This cascade memory architecture allows compute resources to consume and produce data without bandwidth limitation, giving all computational resources simultaneous access to memory and allowing a higher number of operations per cycle.
The processor's architecture was designed from the outset to be compiler programmable. Further, it has programmable I/O sections that can support serial and parallel interfaces such as SDRAM and 1394 with user-definable clocks. The interface definition can be programmed and changed after the hardware is done, thus speeding deployment of the product with the most recent or popular interface standard.
The modular architecture is scalable to suit particular needs, supporting the early prototype phase as well as optimizing for high-volume, low-power and reduced-cost production. It is licensed to be produced by a number of key semiconductor manufacturers.
A typical processor, the JazzE6 (six engines), has 256 user pins, 1.44 Mbits of data SRAM (dual port) and 2.88 Mbits of instruction memory. Fabricated in a 0.25-micron process, it measures about 11 x 11 mm and at 100 MHz draws 2.5 W to 4 W, depending on the application.
In addition to the processor architecture, Improv provides a fully configurable programmable platform called the Programmable System Architecture (PSA). The PSA is a combination of multiple processors, on-chip memories and programmable I/O modules tied together through a dedicated communication structure. Improv's Jazz Composer tool allows the designer to configure different features of the processor including the number and type of computation units in the data path, various aspects of the VLIW instruction, register availability and routing and external integration structures (for example, memory ports). In addition, designers can add their own computation units into the data path and access them in Java using Improv's Solo Compilation Environment.
The PSA platform provides a stable but customizable hardware platform that decouples application development from the hardware implementation. The Jazz PSA Platform can be implemented either as a scalable core or as a complete standalone chip in multiple geometries down to 0.18 micron and below.
Key to the programmability and reconfigurability of the Improv solution is the compiler. Called Solo, it provides the bridge between intuitive application development in the Java high-level language and the architecture. It combines classic compiler technology with behavioral synthesis and VLIW code-generation technology.
While the application designer uses the Java design environment to specify and verify the applications, Solo efficiently and automatically allocates tasks to processing engines to maximize performance and resource utilization, subject to the application developer's constraints. Solo generates a control and data flow graph along with the Java class files used by the compiler.
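The article does not describe how Solo decides which task goes to which engine, so the following is only a minimal sketch of the general idea, assuming each task has a known cycle estimate and ignoring developer constraints and inter-task communication. It uses a generic greedy heuristic (largest task first, onto the least-loaded engine). The names `EngineAllocator`, `Task` and `allocate` are hypothetical and are not part of any Improv tool.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch: greedy assignment of tasks to processing engines. */
class EngineAllocator {

    /** A task with an estimated cycle cost (illustrative only). */
    static final class Task {
        final String name;
        final int cycles;

        Task(String name, int cycles) {
            this.name = name;
            this.cycles = cycles;
        }
    }

    /**
     * Visits tasks in order of decreasing cost and assigns each one to the
     * engine with the smallest load so far (ties go to the lowest index).
     * Returns one task list per engine.
     */
    static List<List<Task>> allocate(List<Task> tasks, int engineCount) {
        if (engineCount < 1) {
            throw new IllegalArgumentException("need at least one engine");
        }
        List<List<Task>> engines = new ArrayList<>();
        for (int i = 0; i < engineCount; i++) {
            engines.add(new ArrayList<>());
        }
        int[] load = new int[engineCount];

        List<Task> sorted = new ArrayList<>(tasks);
        sorted.sort(Comparator.comparingInt((Task t) -> t.cycles).reversed());

        for (Task t : sorted) {
            int best = 0;
            for (int e = 1; e < engineCount; e++) {
                if (load[e] < load[best]) {
                    best = e;
                }
            }
            engines.get(best).add(t);
            load[best] += t.cycles;
        }
        return engines;
    }

    /** Total estimated cycles assigned to one engine. */
    static int load(List<Task> engine) {
        int sum = 0;
        for (Task t : engine) {
            sum += t.cycles;
        }
        return sum;
    }
}
```

With task costs of 7, 5, 4, 3 and 1 cycles on two engines, this heuristic ends with a load of 10 cycles on each engine. A production allocator would also have to weigh data movement between engines, which this sketch leaves out.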
In conventional SoC designs containing different on-chip processors such as microcontrollers, DSPs or other custom blocks, the system designer is left to partition an application into different pieces and manually allocate tasks to the processors on which they will run.
To optimize VLIW processor performance, it is essential to maximize the number of operations that can be executed in a single instruction (12 to 15 operations per cycle). Solo transforms a sequence of operations implementing a given task into a collection of VLIW instructions for the target processing engine. The compiler automatically minimizes the number of cycles required to execute the task, as well as the overall number of slots and instructions required. In addition, it aims to minimize overhead and maximize instruction-level parallelism by exploiting architectural features such as hardware loops, byte addressing, block updates, conditional execution and stacked results registers.
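To make the packing step concrete, here is a minimal sketch of greedy list scheduling, a textbook technique and not necessarily what Solo does. It assumes every operation takes one cycle, that any operation can occupy any slot, and that an operation may issue only after all of its inputs were produced in earlier instructions. The class `VliwPacker` and its methods are hypothetical names chosen for illustration.

```java
import java.util.Arrays;

/** Hypothetical sketch: pack dependent operations into VLIW instruction words. */
class VliwPacker {

    /**
     * deps[i] lists the operations whose results operation i consumes.
     * Returns, for each operation, the index of the instruction word
     * (cycle) it was placed in. Assumes unit latency and uniform slots.
     */
    static int[] schedule(int[][] deps, int slotsPerInstruction) {
        if (slotsPerInstruction < 1) {
            throw new IllegalArgumentException("need at least one slot");
        }
        int n = deps.length;
        int[] cycle = new int[n];
        Arrays.fill(cycle, -1); // -1 marks "not yet scheduled"
        int scheduled = 0;

        for (int c = 0; scheduled < n; c++) {
            int used = 0;
            for (int op = 0; op < n && used < slotsPerInstruction; op++) {
                if (cycle[op] != -1) {
                    continue;
                }
                boolean ready = true;
                for (int d : deps[op]) {
                    // An input must come from a strictly earlier instruction.
                    if (cycle[d] == -1 || cycle[d] >= c) {
                        ready = false;
                        break;
                    }
                }
                if (ready) {
                    cycle[op] = c;
                    used++;
                    scheduled++;
                }
            }
            if (used == 0) {
                throw new IllegalArgumentException("cyclic dependencies");
            }
        }
        return cycle;
    }

    /** Number of instruction words used by a schedule. */
    static int instructionCount(int[] cycle) {
        int max = -1;
        for (int c : cycle) {
            max = Math.max(max, c);
        }
        return max + 1;
    }
}
```

For example, take five operations: two loads, a multiply of the two loaded values, a third load, and an add of the product and the third value. With two slots per instruction they fit in three instruction words. With a single slot they need five. Widening beyond two slots does not help here, because the dependency chain is already three operations long.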
In the final analysis, Solo can generate results comparable to those of high-end DSP chips both in code density and cycle execution.
One of the trickiest aspects of SoC design today is verification. For example, existing system-level modeling methodologies like data flow, state charts and synchronous languages have each specialized in specific aspects of modeling but have not been able to scale up to the full requirements of system modeling. This has led to heterogeneous modeling, where different computational models are used for various aspects of an application. But that creates an overly complicated environment for compilation, analysis and synthesis tools.
Improv has focused on defining a single computational model that can handle the seemingly conflicting requirements of data flow and control. The company developed a modeling approach that pulled from the strengths of HDLs, C programming and object-oriented methods, paying particular attention to support for managing concurrent execution among multiple tasks. Improv's model is implemented as a Java class system called the Notation Framework. The Notation Debug tool uses Java features to provide functional simulation support directly in the Java class system implementing the system-level model. The Notation Framework has a built-in run-time system that interacts directly with the application developer's test bench.
Verification approach
Once the compiler has been run on the application code, the assembly-level code can be read into an instruction-set simulator to allow verification at the cycle-accurate level. It simulates all the processing engines running in parallel.
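The internals of the instruction-set simulator are not described here. As a rough illustration of what simulating several engines "in parallel" at cycle granularity can mean, the sketch below uses a common two-phase scheme: every engine computes its next state from a snapshot taken at the start of the cycle, and all results are committed together. The toy model gives each engine a single register; `LockstepSimulator` and `Engine` are hypothetical names, not part of Improv's tools.

```java
/** Hypothetical sketch: cycle-by-cycle lockstep simulation of several engines. */
class LockstepSimulator {

    /** Toy engine model: one register, updated once per cycle. */
    interface Engine {
        /**
         * Computes this engine's next register value from the
         * start-of-cycle register values of all engines.
         */
        int next(int self, int[] snapshot);
    }

    private final Engine[] engines;
    private int[] regs;
    private long cycle;

    LockstepSimulator(Engine[] engines, int[] initial) {
        if (engines.length != initial.length) {
            throw new IllegalArgumentException("one initial value per engine");
        }
        this.engines = engines.clone();
        this.regs = initial.clone();
    }

    /** Advances every engine by exactly one cycle. */
    void step() {
        int[] snapshot = regs.clone(); // phase 1 reads only this copy
        int[] next = new int[regs.length];
        for (int i = 0; i < engines.length; i++) {
            next[i] = engines[i].next(i, snapshot);
        }
        regs = next; // phase 2: commit all results at once
        cycle++;
    }

    void run(int cycles) {
        for (int i = 0; i < cycles; i++) {
            step();
        }
    }

    int reg(int engine) {
        return regs[engine];
    }

    long cycle() {
        return cycle;
    }
}
```

Because reads and writes are separated, the order in which the loop visits the engines does not affect the result. Two engines that each copy the other's register swap values every cycle, which a naive one-at-a-time update would get wrong.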
Final system-level verification involves running the programmed device in the system in which it is intended to operate. Before the application is mapped onto a physical chip, the ISS can be embedded inside an HDL (VHDL or Verilog) wrapper and run within an HDL simulation of the system.
Alternatively, after mapping, the chip can be run in-system on an evaluation board, with a debug driver interacting with a host to provide debug or timing information.
The configurable processor and the configurable platform bring the benefits of deep-submicron technology to the wider market. Users can save considerable time in developing new products, which can be customized later in the design cycle. Both design and verification are simpler (and thus less expensive), requiring fewer specialized IC design skills.
Foundries can produce cost-effective devices for lower-volume applications because the implementation, verification and manufacturing costs can be leveraged over multiple customer projects. Product respins, new product variants and functionality changes take weeks instead of months and can often be done on existing silicon.