DFT for SoC : The Economic Myths
By Richard Illman, Chief Consulting Engineer, Tality, a subsidiary of Cadence Design Systems, San Jose, Calif., EE Times
October 4, 2002 (10:54 a.m. EST)
URL: http://www.eetimes.com/story/OEG20021001S0055
It is often said that the emergence of the System-on-Chip will require fundamental changes in the approaches to design for testability (DFT). These changes, it has been suggested, will take the form of a "test re-use" strategy or the adoption of logic BIST. However, analysis of the costs and design techniques associated with real SoCs shows that these approaches will offer no advantages in the majority of design scenarios. SoC design will require an evolution rather than a revolution in the world of DFT.
The first issue that must be addressed is simple: "What is the fundamental difference between an SoC design and a conventional ASIC design?" Most importantly, an SoC design usually contains a number of blocks provided by external IP suppliers. An SoC may also contain a mix of digital and analog blocks. Another frequently cited difference is that SoC designs are hierarchical, although in practice the design approach for any large chip will make extensive use of hierarchy. A further potential difference between SoC and ASIC design is shorter design times.
Cost models
To understand the tradeoffs between different DFT approaches it is important to understand the basic hierarchy of costs involved in a design. For example, consider an SoC design that will result in a $20 chip with a production volume of 1M parts. Such a design might have the following cost structure:
* A leading edge Unix workstation costs approx. $20K
* An EDA ATPG licence costs approx. $100K for a three-year licence
* ATE test time for a mid-range tester is around $0.10 per second. For a ten second test time at each of the wafer and package levels this would give a total tester cost of $2M.
* The silicon costs will be $20M. An additional DFT overhead of 5% would add $1M to the overall product cost.
From this simple analysis it is obvious that the processor time and EDA licences are some of the lowest cost items in the design process. So the DFT approach should always be to solve problems by use of CPU time and EDA tools rather than impacting the test time or die area. This argument might appear counter-intuitive, since DFT/ATPG are fundamentally based on the use of scan chains that have a significant impact on chip area and performance. However, in the case of scan chains no realistic alternative exists. No known ATPG algorithm, no matter how CPU time intensive, can handle unconstrained sequential logic. Even the use of partial scan has largely been abandoned despite years of development due to the lack of reliable algorithms.
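The arithmetic behind this comparison can be reproduced in a few lines. The sketch below uses only the figures quoted in the list above; the variable and function names are illustrative, and the $2M tester figure is read as ten seconds at each of two test insertions (wafer and package).

```python
# Figures quoted in the cost list above; names are illustrative only.
UNIT_PRICE_USD = 20.0
VOLUME = 1_000_000
WORKSTATION_USD = 20_000.0
ATPG_LICENCE_USD = 100_000.0       # three-year licence
ATE_USD_PER_SECOND = 0.10
TEST_SECONDS_PER_INSERTION = 10
TEST_INSERTIONS = 2                # assumption: wafer test plus package test
DFT_AREA_OVERHEAD = 0.05


def cost_breakdown():
    """Return each cost item in US dollars for the whole production run."""
    silicon = UNIT_PRICE_USD * VOLUME
    tester = (ATE_USD_PER_SECOND * TEST_SECONDS_PER_INSERTION
              * TEST_INSERTIONS * VOLUME)
    return {
        "workstation": WORKSTATION_USD,
        "atpg_licence": ATPG_LICENCE_USD,
        "dft_silicon_overhead": DFT_AREA_OVERHEAD * silicon,
        "tester_time": tester,
        "silicon": silicon,
    }


if __name__ == "__main__":
    # Print the items from cheapest to most expensive.
    for item, usd in sorted(cost_breakdown().items(), key=lambda kv: kv[1]):
        print(f"{item:22s} ${usd:>13,.0f}")
```

Sorted this way, the workstation and the ATPG licence sit at the bottom of the list, one to two orders of magnitude below the tester time and the silicon overhead.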
The relative typical costs in the different design phases for a 50K gate IP block also show dramatic differences:
* RTL code development and validation approx. 6 man months
* Place & Route, timing closure, physical verification approx. 4 man weeks
* Scan insertion and ATPG approx. 1 day
As a further example, on a recently completed design of around 4M gates the final ATPG run, using conventional "flat" ATPG, required 4 days of CPU time. The back-end activities such as place & route, timing closure and DRC checking required over 300 days of CPU time. The manpower required for DFT was around 10 percent of that required for the back-end work.
The trend in relative costs is illustrated in figure 1.
All these figures emphasize that the cost of ATPG is very small in the context of the overall design costs. Consequently impacting these other, more expensive, aspects of the design flow in order to reduce ATPG cost is not justified.
Core re-use and the P1500 standard
The proposed P1500 standard adds "wrappers" around IP blocks similar in nature to the 1149.1 "JTAG" standard. These provide isolation so that a block can be tested using standard vectors provided by the IP supplier. The isolation guarantees that the results of the test are not dependent upon the environment in which the block is used, provided a standard "test access" mechanism is implemented.
The problem with this approach is that the "wrappers" impose restrictions and overheads on the design. The most obvious of these is the increase in die size due to the extra gates needed for the wrappers. The wrappers also impose additional delays in the paths between the various IP blocks, the equivalent of two multiplexer delays in each path. Traditionally these top level paths can be some of the most critical due to their long lengths.
The wrappers also impose some less obvious, but equally important overheads. When performing synthesis the wrappers impose a constraint on the synthesis tools and do not allow functions to be merged across boundaries between IP blocks. The use of standardized pre-supplied vectors may also restrict the synthesis by not allowing scan chains to be reordered or to change inversions along the chains.
The impact of the wrappers on the different parts of the design flow is illustrated in figure 2.
The wrappers are also expected to increase final device test times by imposing additional levels of protocols when accessing the test structures built into the chip.
The traditional approach to ATPG, starting with a complete chip netlist and generating vectors, has a number of significant advantages. Only a limited number of data views are required and the task can normally be managed by a single person. This reduces the complexity of managing the task. In the event of vector debug being required the process is relatively straightforward. In the "hierarchical assembly" approach to generating vectors the process involves far more data views and sources of data. The process of debugging failing vectors will also be more difficult because the source of the problem may lie with the pre-supplied vectors, the surrounding access mechanism, the timing of the logic in test mode or the process of "expanding" the vectors to the chip level. As different parts of the process may be owned by different organisations the debug route is potentially fraught with problems.
Is Logic BIST the solution?
Logic BIST has been suggested as the best way forward for SoC test. It reduces the test data volumes and provides a simple mechanism to test blocks independently and in parallel. However, significant problems remain. Firstly, the design must be completely clean with regard to timing and signal integrity to prevent any "unknown" states from propagating into the signature registers and giving inconsistent results. As device geometries decrease and signal integrity becomes a greater problem this will be increasingly difficult to guarantee. In comparison, conventional scan vectors can be easily modified to mask inconsistent results.
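The contrast between a signature register and maskable scan vectors can be illustrated with a toy model. The sketch below is not taken from any BIST tool: the 4-bit register, its feedback taps and the response data are all assumptions chosen for illustration, and unknown values are modelled pessimistically with a three-valued XOR.

```python
X = None  # marker for an unknown ("X") logic value


def xor3(a, b):
    """Three-valued XOR: the result is unknown if either input is unknown."""
    return X if a is X or b is X else a ^ b


def misr_signature(responses, width=4, taps=(0, 3)):
    """Compact a list of response words into a signature.

    Toy multiple-input signature register: each stage takes the previous
    stage XOR one response bit, and the tapped stages also take the feedback
    from the last stage. Width and taps are illustrative assumptions.
    """
    state = [0] * width
    for word in responses:
        feedback = state[-1]
        nxt = []
        for i in range(width):
            prev = state[i - 1] if i > 0 else 0
            bit = xor3(prev, word[i])
            if i in taps:
                bit = xor3(bit, feedback)
            nxt.append(bit)
        state = nxt
    return state


def scan_compare(observed, expected):
    """Compare scan-out data bit by bit; an X in `expected` masks that bit."""
    return all(
        e is X or o == e
        for obs_word, exp_word in zip(observed, expected)
        for o, e in zip(obs_word, exp_word)
    )


clean = [[1, 0, 1, 1], [0, 1, 1, 0], [1, 1, 0, 0]]
noisy = [[1, 0, 1, 1], [0, X, 1, 0], [1, 1, 0, 0]]   # one unstable bit
masked_expected = [[1, 0, 1, 1], [0, X, 1, 0], [1, 1, 0, 0]]

print(misr_signature(clean))   # fully known signature
print(misr_signature(noisy))   # signature now contains unknown bits
print(scan_compare(noisy, masked_expected))
```

In this model a single unknown bit, once captured, stays in the signature register, so the final signature cannot be compared. The same unknown bit in a scan vector is handled by masking one position in the expected data.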
Logic BIST is also difficult to implement at the register transfer level since random pattern testability and test point insertion can only be performed at the gate level. The random pattern fault coverage of a function cannot be determined from a purely functional description. As most IP is developed at RTL and customers map it into different technologies, it is not practical for the IP developer to provide a standard logic BIST implementation.
What is the future for SoC test?
As shown, the commonly proposed techniques for core based test are unlikely to give any cost advantage. However, significant developments are happening.
Among IP developers, standards are emerging to demonstrate that IP blocks are "DFT friendly". For example, the Virtual Component Exchange (VCX) mandates that IP developers must demonstrate that synthesis, scan insertion and ATPG have been successfully run on IP blocks using an example target technology. Another example is the recently announced MIPS32 4Kc synthesizable processor, for which scan insertion, ATPG and fault grading scripts are provided.
Complex IP blocks frequently have large numbers of asynchronous clock regimes. To support these there are new ATPG dynamic compaction algorithms that generate very efficient vector sets without the need to use a single clock in test mode. The requirement to have a single test clock usually creates significant problems in fixing all possible hold violations and achieving timing closure.
The problem of growing vector sizes and test times will continue to be a key issue. To address this there are new DFT/ATPG techniques that add on-chip decompressors and compressors to reduce the volume of externally stored test vectors. These exploit the high percentage of "don't care" terms in generated vectors that have conventionally been filled with fixed or random values. These techniques have significantly less impact on the design than logic BIST and are now available in commercial DFT/ATPG tool sets.
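To make the "don't care" argument concrete, the sketch below counts how many bits in a set of generated test cubes are actually specified. The cubes are invented for illustration and the function name is hypothetical; it measures only the care-bit density and does not model any particular decompressor architecture.

```python
def care_bit_stats(cubes):
    """Return (total_bits, care_bits, care_fraction) for a list of test cubes.

    Each cube is a string over '0', '1' and 'X', where 'X' is a don't-care
    position that the ATPG tool left unspecified.
    """
    total_bits = sum(len(cube) for cube in cubes)
    care_bits = sum(1 for cube in cubes for bit in cube if bit in "01")
    return total_bits, care_bits, care_bits / total_bits


# Invented example cubes: most positions are unspecified.
example_cubes = ["1XXX0XXX", "XXXXXXX1", "X0XXXXXX"]

total, care, fraction = care_bit_stats(example_cubes)
print(f"{care} of {total} bits are specified ({fraction:.0%})")
```

When the unspecified positions are filled with fixed or random values, every bit has to be stored on the tester even though only the care bits were needed to detect the targeted faults. That gap is the redundancy the on-chip decompressors exploit.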
The requirement to handle steadily increasing chip sizes will also lead to increased use of large "server farms" to provide computational power. Parallel ATPG will also be introduced in the form of algorithms that split the fault list and generation task between processors. Alternatively, configurable scan will be used to split the design into a small number of separate blocks. Test generation can then be run for each block from the top level of the design with little or no loss of efficiency. A small number of additional vectors will then be generated to cover the interconnect between the blocks.
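The fault-list splitting idea can be sketched as a simple partitioning step. The function name and the fault names below are hypothetical, and the sketch covers only the division of work; it does not perform test generation or the merging and compaction of the resulting vectors.

```python
def split_fault_list(faults, workers):
    """Divide a fault list into `workers` near-equal, disjoint partitions.

    Round-robin slicing is used so that faults which are adjacent in the
    list (and often adjacent in the netlist) are spread across partitions.
    """
    if workers < 1:
        raise ValueError("workers must be at least 1")
    return [faults[i::workers] for i in range(workers)]


# Hypothetical stuck-at fault names: <net>/<stuck value>.
fault_list = [f"net{n}/sa{v}" for n in range(5) for v in (0, 1)]

partitions = split_fault_list(fault_list, workers=3)
for index, partition in enumerate(partitions):
    print(f"processor {index}: {partition}")
```

Each partition would be handed to a separate processor running its own test generation. Because a vector generated for one partition may also detect faults assigned to another, a practical flow would need some form of shared fault simulation or a final compaction pass to avoid redundant vectors; that step is outside this sketch.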
Overall, the future of SoC testing will be based upon the use of more sophisticated ATPG algorithms which impose fewer restrictions on the design. The back-end design process is becoming more difficult with signal integrity and timing closure issues. As this happens it is important to impose less DFT logic on the design and so minimize its impact. More sophisticated algorithms and more computational power will address the problems of generating vectors. Most importantly this task can be performed after the final chip netlist is available and does not need to be completed before tape-out to mask making. This will reduce the impact on the overall project timescales and costs.
The test community must finally learn that in the world of DFT, like the world of fashion, "less is more".
Richard Illman is a Chief Consulting Engineer with Tality, based in Livingston, Scotland. He has over 15 years' experience in the field of Design-for-Test, initially in the mainframe computer industry and for the last four years in the area of design services. He has published a number of papers on DFT and in 1989 he received the "Best paper" award from the International Test Conference. He is a senior member of the IEEE.