Creating a Star IP Core
By Bertrand Noel-Baron, Infineon Technologies; Benedikt Schmaenk, Stefan Schmechtig and Robert Huber, Synopsys Inc., EE Times
August 30, 2002 (10:00 a.m. EST)
The Infineon C166 is a 16-bit microcontroller developed for embedded control systems and is typically used in wireless, automotive or hard-disk drive applications. The C166 is one of the leading architectures in the 16-bit embedded microcontroller market, with a total worldwide volume of more than 100 million devices in 2000.
Originally created in 1988, the C166 was implemented using schematic entry (first-generation full-custom design) and maintained until 1995 in this form, when the netlist was converted to a cell-based, non-synthesizable design (second-generation cell-based core), the C166CBC. Engineers found it increasingly difficult, unpredictable and time-consuming to port it to ever-shrinking process geometries. Furthermore, performance bottlenecks and enhancements were extremely difficult to address.
To overcome these drawbacks and to enable distribution of the microcontroller as part of Synopsys' Star IP (intellectual property) program, Infineon and Synopsys engineers worked together to create the third-generation implementation, the C166S. The project involved a complete redesign of the C166 microcontroller, its surrounding subsystem and its verification environment, and the packaging of all these items in a manner suitable for commercial delivery to customers.
The C166 core consists of a RISC-like CPU with a four-stage pipeline, flexible addressing modes covering up to 16 Mbytes and two clock cycles per machine cycle. It also contains an integrated interrupt controller with up to 112 nodes, a multichannel DMA controller, an external bus controller and an on-chip debug system. The core is extended with a standard set of peripherals, including serial ports, general-purpose I/O and a timer. An AMBA bus interface was added to allow the C166S to be connected to a large variety of third-party peripherals.
To give system-on-chip (SoC) designers a high degree of flexibility, the C166S is parameterized so that they can configure it to match their system requirements. For example, the number of interrupt nodes is configurable, and a multiply-accumulate (MAC) unit is optionally selectable.
Phased design approach
The Phased Design approach shown in the figure below partitions the design problem into a series of discrete phases, each having specific inputs and outputs that are tracked. This allows managers and team members to measure progress throughout the project.
The planning phase for this type of project is essential for success and entails three basic activities: understanding requirements and setting goals; assessing the existing intellectual property (IP); and planning the implementation and verification activities.
Design goals need to be established for area, speed and power dissipation. An estimate is made of what is possible and achievable given architectural limitations and customer demands. In this case, an initial customer was going to use the C166S in a 0.18-micron technology and had set goals for it to be 100 percent faster, use 30 percent less area and, most importantly, consume 50 percent less power than the 0.25-micron cell-based version.
After the design goals were defined, a detailed assessment of both the IP and the testbench was made. Weaknesses in the architectural implementation were identified and collected, and improvements or changes were then proposed. The existing verification strategy then had to be checked to ensure the availability of testbench models that were compatible with these architectural changes.
After analyzing the architecture, the C166S team saw that the aggressive timing target would not be reachable with the existing microarchitecture. Bottlenecks in the top-level bus structure prevented the processor from running at top speed. A new microarchitecture was developed that would allow the bus to run at a different speed from the main processor clock and also provide easier SoC integration due to its use of multiplexed instead of tri-state buses.
In addition, a critical path identified in the design led to the development of a new fetch mechanism. At issue was read data arriving late from memory. Registering this path added a pipeline stage. While throughput was reduced by one instruction after every jump, the architecture could run at twice the original frequency. The drawback of these mostly positive changes was that the new architecture was no longer 100 percent cycle-accurate to the old cell-based design, which complicated the verification effort.
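The three customer goals can be restated as ratios against the 0.25-micron cell-based baseline: 2.0x the clock frequency, 0.70x the area and 0.50x the power. The short Python sketch below expresses that arithmetic. The baseline is normalized to 1.0 because the article gives no absolute figures here, and the names Metrics, targets and meets_goals are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metrics:
    """Design metrics relative to the cell-based baseline (assumed units)."""
    frequency: float  # higher is better
    area: float       # lower is better
    power: float      # lower is better


# Baseline normalized to 1.0; absolute values are not given in the article.
BASELINE = Metrics(frequency=1.0, area=1.0, power=1.0)


def targets(baseline: Metrics) -> Metrics:
    """Apply the stated goals: 100% faster, 30% less area, 50% less power."""
    return Metrics(
        frequency=baseline.frequency * 2.0,
        area=baseline.area * 0.70,
        power=baseline.power * 0.50,
    )


def meets_goals(measured: Metrics, goal: Metrics) -> bool:
    """True only if every metric is at least as good as its goal."""
    return (
        measured.frequency >= goal.frequency
        and measured.area <= goal.area
        and measured.power <= goal.power
    )


GOAL = targets(BASELINE)
print(GOAL)
```

A measured result would be checked with meets_goals(measured, GOAL); missing any single metric fails the check, which reflects the goals being stated jointly.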
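The trade-off described here, one lost instruction slot after every jump in exchange for twice the clock frequency, can be estimated with a simple model. The Python sketch below assumes an otherwise ideal pipeline that issues one instruction per slot, and it treats the fraction of instructions that are jumps as a free parameter. Neither the model nor the example fractions come from the C166S project, and net_speedup is a hypothetical helper.

```python
def net_speedup(jump_fraction: float,
                clock_ratio: float = 2.0,
                penalty_slots: float = 1.0) -> float:
    """Estimate overall speedup of the new fetch mechanism.

    jump_fraction: share of executed instructions that are jumps (assumed).
    clock_ratio:   new clock frequency divided by old (2.0 per the article).
    penalty_slots: instruction slots lost after each jump (1 per the article).
    """
    if not 0.0 <= jump_fraction <= 1.0:
        raise ValueError("jump_fraction must be between 0 and 1")
    # Slots needed per instruction relative to the old pipeline.
    relative_slots = 1.0 + penalty_slots * jump_fraction
    return clock_ratio / relative_slots


# Illustrative jump fractions only, not measured C166S workloads.
for fraction in (0.0, 0.1, 0.25):
    print(f"jump fraction {fraction:.2f}: speedup {net_speedup(fraction):.2f}x")
```

Under this model the extra stage only cancels the doubled clock if every instruction is a jump, which suggests why the team accepted the penalty.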
In parallel, we reviewed the verification environment to see what elements could be reused for verification of the new architecture and which elements would need to be created or enhanced to enable new tests to be written for corner case conditions and new features.
Creating the plan
Our development plan covered both the implementation and verification aspects of the project. Implementation aspects included design partitioning, coding guidelines, detailed definition of all parameters and implementation details associated with the microarchitecture changes identified during the assessment. Verification aspects of the plan included a complete summary of all the existing tests that had to pass, and a list of new tests that would be needed to test the changes to the new microarchitecture.
Because the C166S would be used by a large number of design teams, extra care was taken to ensure the quality of the implementation above and beyond functional validation. A regression test was devised to verify that the test suite passed against multiple configurations of the C166S and to ensure that the verification environment worked with multiple simulators. Synthesis regressions were also established to test against multiple technology libraries to ensure that the core was truly technology independent.
Finally, the entire verification environment was developed in Vera, an open high-level verification language with object-oriented features that allowed many powerful verification capabilities to be used in both a Verilog and a VHDL environment.
Implementation and verification
We initially coded the C166S in Verilog following the guidelines set forth in the Reuse Methodology Manual and Infineon's Soft Core Quality Guidelines. This required us to code using a subset of the Verilog language so that the C166S could be easily translated to VHDL. This made the C166S available to customers in either format.
Each of the individual modules of the C166S was functionally validated through simulation and synthesized as a standalone unit. In addition, code reviews were conducted, and Synopsys' LEDA ProVerilog was used to check for violations of the Reuse Methodology Manual and other coding rules. Synopsys' CoverMeter tool was used to measure the effectiveness of the unit-level testbenches. Once all the modules were completed, they were assembled to create the top-level view of the C166S that could be plugged into a system-level testbench.
As the new C166S architecture was no longer 100 percent cycle-accurate to the older C166 architecture, a new verification environment was needed that could execute the existing test programs but not rely on vector-based comparisons to determine pass/fail success. The new environment used many of the powerful language constructs of Vera to create a self-checking testbench that had some capabilities for creating random stimuli to test the operation of the interrupt controller.
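The idea of checking results without cycle-accurate vector comparison can be illustrated in a few lines. The sketch below uses Python as a stand-in for Vera and is not the project's testbench. It assumes each run yields an ordered trace of architectural events (register and memory writes) and compares the traces while ignoring the cycle on which each event occurred. The Event record, check_run and random_interrupt_requests are hypothetical names.

```python
import random
from typing import List, NamedTuple, Optional


class Event(NamedTuple):
    cycle: int    # when the write happened (ignored by the checker)
    kind: str     # "reg" or "mem"
    target: str   # register name or address
    value: int


def check_run(reference: List[Event], dut: List[Event]) -> Optional[str]:
    """Compare architectural results in order, ignoring cycle timing.

    Returns None on a pass, or a description of the first difference.
    """
    for index, (ref, got) in enumerate(zip(reference, dut)):
        if ref[1:] != got[1:]:  # drop the cycle field before comparing
            return f"mismatch at event {index}: expected {ref[1:]}, got {got[1:]}"
    if len(reference) != len(dut):
        return f"event count differs: expected {len(reference)}, got {len(dut)}"
    return None


def random_interrupt_requests(seed: int, count: int, nodes: int) -> List[int]:
    """Seeded random interrupt node numbers, so a failing run can be replayed."""
    rng = random.Random(seed)
    return [rng.randrange(nodes) for _ in range(count)]


# Same results, different timing: the self-checking comparison still passes.
old_core = [Event(10, "reg", "R1", 5), Event(12, "mem", "0x0100", 7)]
new_core = [Event(11, "reg", "R1", 5), Event(15, "mem", "0x0100", 7)]
print(check_run(old_core, new_core))
```

Seeding the random generator is a design choice made for this sketch: it keeps random interrupt stimulus reproducible when a regression fails.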
In addition, different regressions were created to rerun the test suites across several different configurations of the processor. We chose three base configuration sets to validate across:
Minimum. This sets all parameters of the C166S to their minimum values to test the processor with 'bare bones' capabilities such as minimum number of interrupt nodes and without the MAC unit.
Typical. This sets all parameters of the C166S to their 'default' values that represent the typical configuration that most customers would choose.
Maximum. This sets all parameters of the C166S to their maximum values to test the processor with 'full-blown' capabilities such as maximum number of interrupt nodes and a MAC unit.
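The three configuration sets, combined with multiple simulators and both HDL deliverables, form a regression matrix. The Python sketch below shows one way such a matrix might be enumerated. Only the 112-node maximum and the optional MAC unit come from the article. The minimum and typical node counts, the MAC setting for the typical set and the simulator names are placeholders chosen for illustration.

```python
from itertools import product
from typing import Dict, List, Tuple

# Hypothetical parameter values, except the 112-node maximum from the article.
CONFIGS: Dict[str, Dict[str, object]] = {
    "minimum": {"interrupt_nodes": 16, "mac_unit": False},   # node count assumed
    "typical": {"interrupt_nodes": 64, "mac_unit": True},    # values assumed
    "maximum": {"interrupt_nodes": 112, "mac_unit": True},
}

SIMULATORS = ["simulator_a", "simulator_b"]  # placeholder names
LANGUAGES = ["verilog", "vhdl"]              # both formats were delivered


def regression_matrix() -> List[Tuple[str, str, str]]:
    """Every (configuration, simulator, language) combination to run."""
    return list(product(CONFIGS, SIMULATORS, LANGUAGES))


for config, simulator, language in regression_matrix():
    params = CONFIGS[config]
    print(f"{config:8s} {simulator:12s} {language:8s} {params}")
```

Enumerating the full cross product keeps coverage explicit: adding a simulator or a configuration set extends the matrix without touching the test suite itself.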
The C166S gate-level netlist was simulated on a Quickturn system running a real user application. This phase was coupled with the implementation and test phase, and several loops were necessary to develop new test cases that had not been considered during the planning phase. To support debugging, the C166S offers a 'trace interface' facility that gives information about the pipeline status and can be used, for example, to hook up a disassembler. Connecting a real debugger to the on-chip debug system in the Quickturn test environment helped us to find synchronization problems between clock domains (JTAG and CPU) during the prototyping phase.
The C166S results
The detailed planning undertaken for the design resulted in a greatly improved architecture that can easily integrate into modern SoC designs. The table below shows the C166S results that were achieved on this project along with the C166 cell-based design data that was the starting point of the project.
The above C166S is about 100k gates with 80 interrupt nodes. The power number is derived from a simulation using Synopsys' PowerMill and one reference pattern.
Not only was the project a success, achieving all of its technical goals, it was also a model example of how two companies can work together as a team, driven by a common goal. Today, the efforts of the team continue to pay off as the C166S is offered for commercial licensing through the Synopsys Star IP program. Companies with a Synopsys DesignWare license can download and evaluate the C166S (among other titles) to see if it meets their needs and if so, purchase a license from Infineon to use it in their product.
Copyright © 2003 CMP Media, LLC