SiPs enable new network processors
By Dave Sherman, Vice President of Engineering, Alpine Microsystems Inc., Campbell, Calif., EE Times
October 16, 2000 (3:42 p.m. EST)
For many applications the system-on-chip (SoC) approach introduces an unnecessary level of complexity and cost into chip design and development. A complete system requires the integration of disparate functions. Functions may be designed as full custom, synthesized from foundry or third-party libraries, or acquired as intellectual property (IP); embedded RISC processors are a typical example of the latter. The financial implications of licensing IP lie outside the scope of this article, but the technical issues loom large. The integration of functional blocks must be accomplished in such a way that there are no hidden interactions between blocks that can compromise functionality or yield. This places a heavy burden on the CAD and test design methodology employed.
After the chip design is complete, product managers will likely encounter higher manufacturing costs. For example, digital and memory functional blocks have different electronic structures and hence different design rules. In order to integrate these disparate technologies, the wafer foundry must provide a merged process that could add as many as ten additional mask levels and many more process steps, increasing the wafer price. Yield for a merged design is also compromised when all of the functional blocks are subjected to additional process steps. Test costs, proportional to tester cost and test time, are increasing as a fraction of total finished goods cost. The mixed functionality of an SoC increases tester time and may require multiple test equipment types to support the various functional blocks. Lastly, the trend is for SoC die sizes to be as large as practical, dramatically increasing yield loss from defects as well as reducing the available die per wafer. As a result of the compounding of these factors, many SoC die are burdened with a high cost overhead.
These problems have led system architects to explore the use of configurable ICs that may equal the complexity of an SoC, but provide the flexibility to extend the useful life of the product. This trend is apparent in the network processor market. However, there are issues with the configurable IC approach. In addition to the longer development time and higher manufacturing cost of an SoC, determining the right amount of flexibility to incorporate is difficult. Eventually, over the life cycle of the product, a compromise for upfront flexibility will necessitate a costly redesign.
Moreover, system designers need the flexibility to replace or upgrade functions to take advantage of new speed grades or die revisions, without forcing a redesign of the entire system. For instance, the control processor may go through a major revision every 12 to 18 months, memory functions will undergo process shrinks every six to 12 months, and the logic will change every time a new bus protocol or signaling convention emerges. In addition, while most policy and protocol changes may be accomplished in software, performance optimization usually entails upgrades to the hardware.
One example in network processing is caching packet data on the fly. Providing adequate buffer space (up to 400 Mbits today) forces the use of external memory components, regardless of the amount of embedded memory available on chip. The SoC approach to integration does not provide the necessary flexibility to transparently enhance or evolve the functionality of a network processor system.
Consequently, designers in communications applications are turning to new types of system-level integration based on modular approaches and high density interconnect (HDI). A new generation of system-level packaging called "system-in-package" (SiP) combines multiple ICs and associated discrete elements into a single component. This approach ensures the flexibility that designers require for isolated function upgrades, as well as reduces design and manufacturing complexity. And IP in chip form is the most prevalent and economical way to procure functions. But partitioning for flexibility and simplicity is only half the story. The real win is in performance. By designing ICs specifically to take advantage of the low-noise HDI environment in an SiP, system bandwidth can be dramatically improved over separately packaged IC solutions, and can even provide higher bandwidth and flexibility than single-chip solutions.
The benefits of the SiP approach to system-level partitioning are especially pronounced in the design of network processors. The explosive growth in Internet traffic is producing a huge market opportunity for suppliers of network processor ICs, but the ultimate survivors must be able to scale their solutions to support wire speeds of 10 Gbits/s and beyond. To accomplish this, there are several key requirements for successful deployment of a network processor system.
Network processing takes place between the physical layer interface (PHY) and the switch fabric. The PHY converts electrical or optical signals into logical information for processing, while the switch fabric takes processed network data (packets) from ingress (input) nodes and passes them to egress (output) nodes based on processed layer information and established protocols.
The network processing function includes framing, classification, modification, encryption/compression and traffic queuing. This function is evolving from a primarily software-based architecture (router) to hardware-based switches using ASICs and ASSPs, driven by the need to operate at the higher data speeds while performing deep processing in OSI Layers 3 through 7. The growth of broadband Internet traffic, such as video and voice-over-IP, requires quality of service (QoS) processing that significantly increases the performance demand on the network switch. With Internet traffic doubling every four months, the dramatic progress of silicon speed and integration typified by Moore's Law cannot keep pace. Therefore, new design paradigms are emerging that rely on multiple ICs and configurable hardware techniques.
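The gap described above can be made concrete with a bit of compound-growth arithmetic. The four-month traffic doubling is the article's figure; the 18-month silicon doubling period and the 24-month horizon are illustrative assumptions, not numbers from the text.

```python
# Compare growth of Internet traffic (doubling every 4 months, per the
# article) with silicon density under Moore's Law (assumed here to
# double every 18 months) over an assumed 24-month horizon.

def growth_factor(months: float, doubling_period_months: float) -> float:
    """Growth multiple after `months`, given a doubling period."""
    return 2 ** (months / doubling_period_months)

horizon = 24  # months
traffic = growth_factor(horizon, 4)    # 2^6 = 64x
silicon = growth_factor(horizon, 18)   # ~2.5x

print(f"Traffic growth over {horizon} months: {traffic:.0f}x")
print(f"Silicon (Moore's Law) growth:        {silicon:.1f}x")
```

Under these assumptions, traffic grows roughly 25 times faster than silicon density over two years, which is why architecture, not process technology alone, must close the gap.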
The main benefit of hardware-based solutions is performance. Configurable hardware provides the flexibility to build systems tailored for specific protocols, such as Multi-Protocol Label Switching (MPLS), and to support a variety of products and policies, while also promoting fast time-to-market. Thus, configurable hardware solutions based on field-programmable logic, programmable state machines and configurable microcode are emerging in the market.
The shift to OC-192 (10-Gbit/s) rates will require more processing and bandwidth than a single-chip solution can provide, so multichip approaches are needed. Unfortunately, traditional package and printed-circuit boards limit the achievable bandwidth. For instance, to provide full-duplex operation at OC-192 requires up to 40 Gbits/s of memory bandwidth. This may dictate the need for multiple Rambus channels, large amounts of fast SRAM or even embedded memory.
All of these approaches increase system cost and power dissipation. In addition, these approaches will not scale to OC-768 bit rates (40 Gbits/s), which require memory bandwidth exceeding 160 Gbits/s (40 Gbits/s x 2 [read and write] x 2 [duplex] = 160 Gbits/s).
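The memory-bandwidth figures above follow directly from the article's rule: line rate times two (each packet is written into the buffer and read back out) times two (full duplex). A minimal sketch:

```python
# Memory-bandwidth arithmetic from the article:
# line rate x 2 (read and write) x 2 (full duplex).

def memory_bandwidth_gbps(line_rate_gbps: float) -> float:
    """Required buffer-memory bandwidth for full-duplex store-and-forward."""
    return line_rate_gbps * 2 * 2

print(memory_bandwidth_gbps(10))  # OC-192: 40 Gbits/s
print(memory_bandwidth_gbps(40))  # OC-768: 160 Gbits/s
```

The same multiplier explains why the OC-192 figure of 40 Gbits/s in the preceding paragraph already strains conventional packaging.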
In addition to high memory bandwidth, deep-layer packet processing at OC-192 bit rates can require more than 10 million gates of logic and 400 Mbits of memory. Thus, the industry faces the dilemma of simultaneously providing high-performance solutions that are cost-effective and scalable with the demands of the Internet. Pipelining and multithreaded architectures can maximize the available bandwidth, but ultimately, new approaches to system-level integration are required.
Power, cost, performance
The system-level integration and partitioning of chip set components requires new packaging solutions. The emergence of SiP techniques offers a solution that can satisfy the performance, cost and power requirements of high-performance applications such as network processors, while maintaining the flexibility of multiple ICs.
The chief characteristic of SiP is low-noise, high-density chip-to-chip interconnect. If package limitations such as pin density, lead inductance and capacitance could be eliminated (that is, if I/O pins were free), it would be possible to design ultrahigh-bandwidth interfaces between unique ICs. For example, a 1-GHz data-rate cache SRAM with nonmultiplexed read and write buses could have a peak bandwidth greater than 100 Gbits/s with 72-bit buses. This system could be built with today's 0.18-micron processes for much lower costs than embedded-memory solutions, while also supporting deep-layer processing at OC-192 Sonet or 10-Gbit/s Ethernet line speeds. And the solution could scale to OC-768 as IC process technology transitions to 0.13 micron.
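The SRAM example can be checked with straightforward arithmetic: independent (nonmultiplexed) read and write buses mean both 72-bit buses transfer on every cycle at the 1-GHz data rate.

```python
# Peak bandwidth of the SRAM example: 1-GHz data rate, separate
# (nonmultiplexed) 72-bit read and write buses, both active per cycle.

data_rate_hz = 1e9
bus_width_bits = 72
buses = 2  # independent read + write

peak_gbps = data_rate_hz * bus_width_bits * buses / 1e9
print(f"Peak bandwidth: {peak_gbps:.0f} Gbits/s")  # 144 Gbits/s
```

At 144 Gbits/s, the configuration comfortably clears the "greater than 100 Gbits/s" claim, and also exceeds the 40-Gbit/s memory-bandwidth requirement for full-duplex OC-192.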
Alpine Microsystems' MicroBoard SiP technology provides a controlled-impedance, low-loss interconnect solution for multiple-chip integration. Moreover, as frequencies increase and wavelengths decrease, the integration of PHY electro-optical components with the framing and network-processing functions will become critical to maintain reflection-free connections with routing distances shorter than the characteristic wavelength. For instance, at 10 GHz the maximum routing distance is reduced to 3 mm, a requirement that cannot be met with polymer-based printed-circuit-board technology.
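The 3-mm figure is consistent with treating the maximum routing length as a small fraction of the in-dielectric wavelength. The lambda/5 fraction and the relative permittivity (er ~= 4, typical of FR-4) used below are assumptions chosen to reproduce the article's number, not figures from the text.

```python
# Rough estimate of maximum reflection-free routing distance, taken
# here as one-fifth of the signal wavelength inside the dielectric.
# ASSUMPTIONS: lambda/5 rule of thumb; relative permittivity er = 4.

C = 3e8  # speed of light in vacuum, m/s

def max_route_mm(freq_hz: float, er: float = 4.0, fraction: float = 0.2) -> float:
    """Max routing length (mm) as a fraction of the in-dielectric wavelength."""
    wavelength_m = C / (freq_hz * er ** 0.5)
    return wavelength_m * fraction * 1e3

print(f"{max_route_mm(10e9):.0f} mm at 10 GHz")  # ~3 mm
```

Under these assumptions the wavelength at 10 GHz is about 15 mm in the dielectric, so keeping routes to a fifth of that yields the 3-mm limit cited above.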
The SiP approach's key benefit is enabling the integration of components optimized specifically for a system-level package. For example, when ICs are optimized for the high speed and high I/O-density characteristics of Alpine's MicroBoard substrate, system bandwidth can increase dramatically, while reducing power consumption. Optimization at the die level not only provides significant system benefits, but can also reduce the cost of the die.
When a system is partitioned into separate components, compared to a single SoC for instance, those components can be optimized for manufacturing cost as well as for performance. The wafer fabrication process can be simplified, yields can be improved and die size reduced.
In addition to reducing die area by partitioning functions into separate die, changes to the interface circuitry can produce a more efficient die layout. I/O drivers sized for the low-capacitance HDI are significantly smaller in area, and consume far less power than a typical high-current, off-chip driver. Additional savings in size and power can be achieved by resizing the I/O pre-driver circuitry. The use of array pads allows small I/O drivers to be placed closer to their functional source or load, further reducing on-chip "river" routing. Array pads also allow a much larger number of I/O ports, eliminating on-chip multiplexing. Array pads for connections inside of the SiP are essentially free, since the incremental cost per pad is insignificant.
In the near future, system architecture will take advantage of SiP technology to achieve significant performance and time-to-market advantages over purely SoC approaches in compute-intensive applications such as network processing.
Copyright © 2003 CMP Media, LLC