By Warren Savage, IPextreme Inc. Jeff Haight, Sonics Campbell, CA
Abstract : Introduction
For the past decade, the march of Moore’s “Law” has witnessed the phenomenal growth in System on Chip (SoC) gate counts, allowing the implementation of a confluence of sophisticated algorithms at price points feasible for consumer electronics (e.g., HDTVs, DVD recorders, multi-functional mobile phones, etc.). Unfortunately, gate counts over the past decade have grown far faster than IC designers’ productivity. With so much pressure to launch high end consumer products before prices (and margins) erode, generations change, new standard features are added, and additional competition surfaces, every aspect of the design flow requires analysis to see how, if, when, and where time-to-silicon can be shortened.
Recent history indicates that the most successful and sophisticated product developments in the high-end consumer space rely ever more heavily on the rapid acquisition, integration, optimization, and verification of semiconductor intellectual property (IP) with a socket-based approach. With heterogeneous multi-processors (HMPs) cleverly integrated to enable the broadest diversity of desired tasks, fitting the function to the requirement simplifies the mapping but complicates the integration and verification.
Consider the highly successful TI OMAP processor for cellular telephones. A large team of extremely experienced individuals must rapidly integrate a large number of configurable IP cores, some acquired and some internally generated. Enormous test and verification suites need to be created and integrated, an effort that may exceed that of architectural and logic design. Bidirectional paths through different levels of abstraction are critical, because problems ranging from simple logic errors to last-stage difficulties with timing closure can require changes to the architecture or logic before tape-out. This matters all the more as mask sets approach a million dollars, and the lost market share and falling ASPs from an extra required turn can cost much more.
Linking these HMPs with an intelligent data flow management fabric simplifies multi-generational development, expedites optimization and design exploration at different layers of abstraction, and contributes to a methodology that runs from napkin sketches to tape-out, with faster bidirectional optimization and verification using standard third-party tools. Despite the time pressures, these designs must save power efficiently, support multi-threaded non-blocking data flows to the different processors, simplify test bench development and verification, and provide dynamic Quality of Service (QoS) across significantly different data rates, word sizes, endianness, etc. All of these factors must increasingly take into account simplified re-use for multi-generational designs. At the detailed block diagram level, with examples from wireless and mobile video plus related peripherals, this paper explains how and why this is so.
Many functional IP cores are complete subsystems in their own right, having a processor, memory, peripherals and logic. The design of these smart IP subsystems and their hardware and software interfaces greatly affects how quickly and easily they can be integrated into the system chip. What's worse, integration issues are compounded by the rise in popularity of configurable IP.
Making IP configurable and easy to integrate consists of far more than utilizing the configuration capabilities of a high level design language. It requires a holistic design approach that considers the whole IP product from the end user perspective and, to some extent, anticipates the configuration needs of the user’s end product. To ensure the easiest integration AND maximum configurability, a proper IP design approach must:
- Consider the Totality of Tasks
IC design has proceeded from a “six blind men and the elephant” type of challenge, with architects, library creators, EDA vendors, etc., perceiving their contribution as key (perhaps too uniquely so!). Now, we have an enormous AND-Gate paradigm, for which a single low input means unacceptably low output (i.e., non-shippable or non-yieldable at an economically viable price point). Consider the following design task summary figure:
Now, and even more so going forward, the task is becoming multi-faceted and complex. For multimedia quality verification, where no engineering metric corresponds fully to subjective video and audio quality, it is often necessary to push test sequences of motion video or streams of audio through the architecture to verify acceptable performance. The volume of data in the requisite video stream strongly motivates the creation of models at higher levels of abstraction to accelerate this verification by orders of magnitude. Consider the following model (courtesy of CoWare):
The creation of this model greatly simplifies collaborative architectural explorations as well.
A priori recognition of the need for multi-generational design has equally important implications. There may be a strong need for subsequent market segmentation, with features providing super-sets or sub-sets of the initial product. There is typically recognition, at least on senior management's part, of the need to port these HMPs to the next generation process technology, typically with additional integration. Clever IP choice and smart interconnect technology greatly simplify these tasks, requiring only marginal effort at each step from high-level modeling and architectural re-optimization and exploration through to successful timing closure.
- Isolate the cores with intelligent Interconnects
By isolating the cores with a well-architected fabric, data flow services and their associated verification and integration can be separated from the bulk of the design. Clearly, this has massive implications for multi-generational efforts by large, often geographically far-flung teams. The following figure illustrates the concept of smart interconnects.
- Plan configurability upfront.
This includes not only configuration of the RTL but also of the associated firmware, drivers and tests. Developers need to plan what should be configured in the hardware that will be realized in the final silicon and what will be configured in software that can be changed at the board level and over the course of the chip’s life.
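As a minimal sketch of this planning step, the hardware/software split can be written down explicitly and checked for consistency. All class, field, and function names below (`HardwareConfig`, `SoftwareConfig`, `validate`, the DMA and crypto parameters) are hypothetical and chosen only for illustration; they do not describe any particular IP product.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class HardwareConfig:
    """Hypothetical parameters fixed at synthesis time and realized in silicon."""
    num_dma_channels: int = 2
    data_width_bits: int = 32
    has_crypto_engine: bool = False


@dataclass
class SoftwareConfig:
    """Hypothetical settings that can change at board level or in the field."""
    active_dma_channels: int = 1
    crypto_enabled: bool = False


def validate(hw: HardwareConfig, sw: SoftwareConfig) -> List[str]:
    """Return a list of conflicts between software settings and the hardware."""
    errors: List[str] = []
    if sw.active_dma_channels > hw.num_dma_channels:
        errors.append("software enables more DMA channels than the hardware has")
    if sw.crypto_enabled and not hw.has_crypto_engine:
        errors.append("software enables a crypto engine that was configured out")
    return errors
```

Calling `validate(HardwareConfig(), SoftwareConfig())` returns an empty list, while a software setting that asks for a feature configured out of the hardware is reported as a conflict. Marking the hardware side `frozen` mirrors the point above: those choices cannot be revisited once the silicon exists.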
- Partition the architecture along standard interface boundaries.
Proprietary interfaces should be avoided wherever possible to maximize reusability.
- Partition the implementation to minimize manufacturing technology specific sections.
Ideally these are avoided altogether or at least isolated from the fully reusable sections of the IP. To keep the IP more generally applicable to the chips that use it, process-specific portions such as memory interfaces and clock gating logic should be brought to top-level ports of the core.
- Provide integration tests that follow the IP configuration.
It is much easier for the integrator if the test bench automatically tests only for the configured functions and just gives a simple pass/fail result, as ours does. This requires that the integration tests be based on the configuration settings.
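One way to picture configuration-driven integration tests is a small registry in which each test declares the configuration it applies to. The sketch below is illustrative only: the `IntegrationTest` type, the `has_dma` and `dma_channels` keys, and the lambda bodies standing in for real simulation runs are all assumptions, not the authors' test bench.

```python
from typing import Callable, Dict, List, NamedTuple

Config = Dict[str, int]


class IntegrationTest(NamedTuple):
    name: str
    applies: Callable[[Config], bool]  # is the feature configured in?
    run: Callable[[Config], bool]      # True on success (placeholder for a simulation)


# Hypothetical test list; the lambdas stand in for real test bench runs.
TESTS: List[IntegrationTest] = [
    IntegrationTest("register_access", lambda c: True, lambda c: True),
    IntegrationTest(
        "dma_transfer",
        lambda c: bool(c.get("has_dma", 0)),
        lambda c: c.get("dma_channels", 0) > 0,
    ),
]


def select_tests(tests: List[IntegrationTest], cfg: Config) -> List[IntegrationTest]:
    """Keep only the tests whose feature is present in this configuration."""
    return [t for t in tests if t.applies(cfg)]


def run_suite(tests: List[IntegrationTest], cfg: Config) -> str:
    """Run the selected tests and reduce the outcome to a single pass/fail."""
    selected = select_tests(tests, cfg)
    return "PASS" if all(t.run(cfg) for t in selected) else "FAIL"
```

With `has_dma` set to 0, the DMA test is never selected, so it cannot produce a spurious failure for a function the integrator configured out. The integrator sees only the single string returned by `run_suite`.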
In conclusion, IP cores and integration platforms that meet the above requirements enable SoC integration engineers to assemble great new consumer electronic systems faster and more economically than ever before.