Heading off test problems posed by SoC
By Ron Press and Janusz Rajski, EE Times
October 16, 2000 (3:49 p.m. EST)
Ron Press, Technical Marketing Manager, ATPG Products, Janusz Rajski, Chief Scientist and Business Unit Manager, ATPG Product Group, Mentor Graphics Corp., Wilsonville, Ore.
A system-on-chip is a horse of a different color, so why shouldn't testing it also take on a different hue? Unlike a big chip stuffed mainly with random logic, an SoC's personality sparkles with multiple facets: user-defined random logic, large memory arrays, cores, intellectual property, programmable logic, small macros and perhaps hundreds of different kinds of small memories dispersed throughout the design. And because an SoC springs from deep-submicron processes (ICs with gate lengths of 0.3 micron or smaller), it may exhibit defects that were not a problem with older-style processes.
Consequently, testing SoC devices calls for a different strategy. Because each of the functional components carries its own test needs, designers must be "up front": they must work out a test plan early, at the start of a design. Such a plan will also address the possibility of new faults, since the same minuscule feature sizes that yield tens of millions of transistors also allow clock-induced problems to crop up.
At the start, more transistors mean more test patterns and longer testing times on ATE systems. The cost of testing can soar unless some way is found around it. Higher frequencies put pressure on the testers to keep up with device clock rates and provide better accuracy and precision, which usually translates to richer price tags. And even if the cost of test were not an issue, the limited amount of chip test time now apportioned by some foundries is.
An up-front block-by-block testing game plan for SoC devices therefore must provide for several elements: properly equipped automatic test-pattern generation (ATPG) tools for logic testing, limited tester time, new kinds of at-speed fault models and several kinds of memory or small-array testing. A vital concern to any high-volume production line is diagnostics, which must not only find failures but isolate them to particular nodes. A further consideration: Whenever possible, test reuse must be deployed to save test development time.
The value of diagnostics in a production environment recently was revealed by engineers at Texas Instruments Inc. They showed how design-for-testability techniques, such as ATPG and IDDQ, when combined with electrical analysis information, can become a powerful failure-site isolation mechanism for highly integrated ICs.
Other nitty-gritty parameters that call for planning ahead include the number of pins needed for scanning, the amount of memory behind each pin and so on. More likely than not, designers will install boundary scan on an SoC, not just for interconnect testing down the road at the pc-board or multichip-module level, but to serve as a port through which built-in test logic and often scan circuitry can be accessed.
As shrinking geometries continue to pack millions of transistors onto one chip (the 100 million mark is currently being passed), the number of test patterns is soaring to unprecedented counts. That translates not only to longer periods spent in test sockets but also to the danger that, at some point, workstation resources will run out. For the first problem, about the only way out is to look for generators that can compress patterns by substantial amounts; 20 percent to 60 percent would not be unreasonable. To avoid the potential capacity problem, look for test software that runs on a 64-bit operating system, which, chances are, is necessary for today's large design sizes anyway.
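One way to get a feel for how pattern-count compression works is static compaction: patterns whose specified (care) bits do not conflict can be merged, with don't-care positions ('X') absorbing either value. The sketch below is a greedy toy version written for illustration; commercial ATPG compressors use far more elaborate techniques.

```python
# Illustrative static compaction of test patterns ('X' = don't-care).
# Hypothetical pattern format; real ATPG tools work on much larger vectors.

def compatible(p, q):
    """Two patterns can merge if every position agrees or one side is 'X'."""
    return all(a == b or 'X' in (a, b) for a, b in zip(p, q))

def merge(p, q):
    """Combine two compatible patterns, filling don't-cares from the other."""
    return ''.join(b if a == 'X' else a for a, b in zip(p, q))

def compact(patterns):
    """Greedily fold each pattern into the first compatible one already kept."""
    out = []
    for p in patterns:
        for i, q in enumerate(out):
            if compatible(p, q):
                out[i] = merge(q, p)
                break
        else:
            out.append(p)
    return out

# Four patterns collapse to two merged patterns, halving tester time for
# this (toy) set.
compacted = compact(['1XX0', 'X1X0', '0XX1', 'X0X1'])
```

Greedy compaction is order-dependent, which is one reason production tools iterate and reorder; the point here is only that unspecified bits are what make compression possible at all.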
On top of that, test software faces new kinds of faults stemming from the submicron feature sizes and escalating frequencies. For those faults, traditional ATPG test patterns, which usually target static stuck-at faults only, are no longer sufficient. Adding functional patterns in an attempt to catch the newer faults is an exercise in futility. Better to grade the initial set of functional patterns to determine which faults escape, then create ATPG patterns to catch the missing faults.
In fact, designers will rely less and less on functional patterns as design size escalates and the time available for testing each transistor shrinks. To catch speed-related problems and verify circuit timing, at-speed testing must become the norm.
Keep in mind that even small speed-related defects can dramatically subvert overall product quality. Thorough at-speed testing, then, must incorporate a variety of fault models-including transition, path delay and IDDQ.
Transition fault models represent a gross delay at a gate's I/O terminals. They are similar to stuck-at faults but include launch and capture events: the former initiates a transition through a terminal modeled with a delay; the latter samples the response to the launch event within a defined window.
If a terminal has a transition fault, the event will be slow to arrive at the capture point and will appear as if there were a stuck-at fault. Similarly, a path delay fault is modeled as a delay in which the entire path, from launch to capture, is defined by the user with the aid of timing-analysis tools. After that, the user creates patterns to isolate and test the paths at speed.
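The launch-and-capture behavior described above can be illustrated with a toy software model. All names and timing values below are invented for illustration; real transition-fault testing operates on gate-level netlists with tester- or PLL-generated clocks.

```python
# Toy model of a launch/capture transition test. A net is launched from an
# old value to a new one; the capture event samples it after a fixed window.

def capture_value(old, new, prop_delay_ns, capture_window_ns):
    """If the transition propagates within the capture window, the new value
    is seen; otherwise the old value persists, so a slow (transition-faulted)
    terminal looks exactly like a stuck-at fault to the tester."""
    return new if prop_delay_ns <= capture_window_ns else old

# Good device: a 0->1 launch propagates in 3 ns, captured in a 5 ns window.
good = capture_value(0, 1, prop_delay_ns=3, capture_window_ns=5)   # -> 1
# Transition fault: a gross 9 ns delay misses the window; the capture still
# sees 0, as if the terminal were stuck-at-0.
slow = capture_value(0, 1, prop_delay_ns=9, capture_window_ns=5)   # -> 0
```

The same comparison, applied to an entire user-defined path rather than a single terminal, is the essence of a path delay test.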
Scan logic provides control and observation points at every scan cell used to launch and capture events during transition and path delay tests. This high level of internal access enables easy transition and path delay pattern generation, even for logic that is deep within a circuit design.
Workers at Hewlett-Packard Co. and elsewhere have shown that a combination of stuck-at, functional and transition/path delay faults may be the most effective test strategy. For deep-submicron chips and high-frequency operation, transition and path delay testing becomes even more critical.
High frequencies bring yet another problem: Affordable ATE cannot supply sufficiently fast stimuli, nor is the stimulus always commensurate with the precision and accuracy of the on-chip clocks. Yet accurate at-speed testing is essential to uncover timing deviations.
Without resorting to multimillion-dollar ATE systems (which may stall out at around 100 MHz anyway), designers must seek out new capabilities. For example, to get around the ATE accuracy problem when testing cores at speed and still hold down costs, ways must be found to keep the tester interface simple yet ensure sufficiently accurate signals during testing (transition and path delay testing only requires accurate clocks at the tester interface).
Alternatively, when a chip uses a phase-locked loop (PLL) for clock generation, it is possible to deploy a PLL/BIST (built-in self-test) scheme that even the most expensive testers cannot match in frequency and accuracy, or to combine external ATE with the internal PLL clocking approach. Any of those methods should help shave the cost of test; at any rate, look for at-speed testing to fall more squarely on the shoulders of commercial ATPG software.
Recently, Motorola Inc. disclosed testing advances for its fourth-generation PowerPC microprocessor. The advances are based on the use of an internal PLL to generate the launch and capture events during scan testing, which further improves the clock accuracy. The Motorola test plan also included BIST for testing large memories, special techniques for testing small arrays and boundary scan.
Memory BIST circuitry, developed early in the design cycle, is ideal for large memory arrays. It can produce internal patterns and test multiple devices in parallel. Another great benefit is that the memory BIST can run at internal clock rates without reliance on the tester for generating clocks or holding long pattern sets.
Because there is a high chance of manufacturing flaws within an SoC memory block, diagnostic capabilities are a must for memory BIST. Once a problem is diagnosed, bad addresses can be remapped into redundant cells sitting in spare locations. The detected faulty locations are thrown out, not the expensive chip itself.
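The classic engine inside a memory BIST block is a march algorithm: sweep the array writing and reading fixed backgrounds, flagging any address whose readback disagrees. Below is a hedged, much-simplified software sketch of that idea plus the redundancy remapping step; the memory model, fault type and repair map are all hypothetical, and real BIST is a hardware state machine, not Python.

```python
# Simplified march-style memory test with redundancy repair (illustrative).

def march_test(mem):
    """Run a reduced march sequence; return the set of failing addresses."""
    n = mem.size
    bad = set()
    for a in range(n):                 # element 1: write 0 ascending
        mem.write(a, 0)
    for a in range(n):                 # element 2: read 0, write 1 ascending
        if mem.read(a) != 0:
            bad.add(a)
        mem.write(a, 1)
    for a in reversed(range(n)):       # element 3: read 1 descending
        if mem.read(a) != 1:
            bad.add(a)
    return bad

class FaultyMem:
    """RAM model with one cell stuck at 0 (a hypothetical defect)."""
    def __init__(self, size, stuck_at_zero):
        self.size = size
        self.cells = [0] * size
        self.stuck = stuck_at_zero
    def write(self, a, v):
        self.cells[a] = 0 if a == self.stuck else v
    def read(self, a):
        return self.cells[a]

failing = march_test(FaultyMem(16, stuck_at_zero=5))
# Remap each bad address to a spare row instead of scrapping the chip.
repair_map = {addr: spare for spare, addr in enumerate(sorted(failing))}
```

Notice that the stuck-at-0 cell sails through the read-0 element and is only caught on the read-1 pass, which is exactly why march tests combine complementary backgrounds and both address orders.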
Small embedded blocks can be tested without adding extra gating or control logic. For that, a testing technique called vector translation-used in Mentor's FastScan Macrotest-converts a functional pattern (used to test a specific block through the random logic of the surrounding circuitry) into a sequence of scan patterns.
Unlike BIST, extra logic is not required to bypass the block's functional inputs. This technique is ideal for shallow memories or small intellectual-property blocks that require a relatively small number of patterns. Because no extra test logic is needed, SoC developers can reuse the patterns previously formulated for the blocks when they were discrete components.
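Conceptually, vector translation turns each functional pattern on the block's pins into a scan load, a capture, and an expected unload. The sketch below assumes, purely for illustration, that each block pin has already been justified to a known scan-cell position; it is not FastScan's actual pattern format or algorithm.

```python
# Hedged sketch of functional-to-scan vector translation. The pin-to-cell
# maps and pattern dictionaries are invented for this example.

def translate(functional_pattern, in_cell_of, out_cell_of, chain_len):
    """Map one functional pattern on an embedded block into a scan sequence."""
    load = ['X'] * chain_len            # don't-cares elsewhere in the chain
    for pin, val in functional_pattern['inputs'].items():
        load[in_cell_of[pin]] = val     # justify each block input via scan
    expect = {out_cell_of[pin]: val
              for pin, val in functional_pattern['outputs'].items()}
    return {'scan_load': load,          # shift in, pulse the capture clock,
            'expected_unload': expect}  # then shift out and compare

seq = translate({'inputs': {'A': '1', 'B': '0'}, 'outputs': {'Q': '1'}},
                in_cell_of={'A': 0, 'B': 3}, out_cell_of={'Q': 5},
                chain_len=8)
```

Because the original functional vectors survive the translation intact, the same block-level patterns developed for the discrete component can be reused in the SoC, which is the reuse benefit the text describes.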
A sophisticated ATPG tool can not only test macros in parallel but also determine whether conflicts exist, defining which macros can be tested in parallel and why others cannot. Furthermore, even if the macro clock is the same as the scan clock (as with synchronous memory), the macro can still be tested effectively.
Given the insufficient number of test points on today's dense double-sided boards, every complex chip must carry boundary-scan circuitry. Without boundary scan, locating manufacturing faults at the board level becomes a daunting, almost impossible task; with boundary scan, board testing becomes almost trivial and largely independent of the logic within the chip. Boundary scan also can port ATPG patterns into a chip's scan chains at any production stage.
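The interconnect-test payoff of boundary scan can be sketched in a few lines: drive a walking-ones pattern from one chip's boundary cells and compare what a neighboring chip's cells capture. The board model below is a made-up stand-in for the real shift/update/capture mechanics of an IEEE 1149.1 chain.

```python
# Illustrative boundary-scan interconnect test using a walking-ones pattern.
# 'sample' stands in for the board: it returns what the receiving chip's
# boundary cells capture when the driving chip applies 'driven'.

def interconnect_test(n_nets, sample):
    """Return the indices of nets whose captured value differs from driven."""
    faults = []
    for i in range(n_nets):
        driven = [1 if j == i else 0 for j in range(n_nets)]
        if sample(driven) != driven:
            faults.append(i)
    return faults

# Hypothetical defect: net 2 is open and always reads 0 at the receiver,
# a fault a sparse bed-of-nails fixture might never probe.
broken_board = lambda d: [v if i != 2 else 0 for i, v in enumerate(d)]
faults = interconnect_test(4, broken_board)
```

Because the pattern is shifted through the chips' test-access ports rather than applied through physical probes, the same test works regardless of the logic inside each chip, which is the independence the text highlights.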