Radiation-Induced Soft Errors In Memories And Logic Eliminated On Chip
SANTA CLARA, Calif. -- April 26, 2002 -- At this week's 40th annual International Reliability Physics Symposium, Peter Rohr, vice president of business development for iRoC Technologies, described the need to fully protect both logic and memory circuits from transient errors and radiation-induced soft errors. His presentation included the solution iRoC is pioneering: embedding self-correcting functionality directly on the chip to eliminate the soft error problem before it occurs. This approach is the only semiconductor intellectual property (IP) available today for cost-effective, on-chip protection from transient and soft errors.
Speakers at the "Radiation Induced Soft Errors in Silicon Components and Computer Systems" session represented Texas Instruments, Infineon, HP, Compaq, Inovatia Labs, IBM, Intel, Sun Microsystems, Sandia Laboratories, JPL and iRoC Technologies.
Industry issues addressed included:
In memories, soft errors are a growing and well-known problem. For DRAMs, sensitivity seems to decrease somewhat at smaller layout geometries, thanks to clever designs that minimize the capture volume. For SRAMs, on the other hand, no such decrease in soft error sensitivity has been observed, so SRAMs are the biggest concern at the moment. For both types of memory, however, the biggest increase in Soft Error Rate (SER) will come from the rapidly growing level of integration and the resulting number of cells. "The biggest challenge in convincing designers and users of VLSI-based systems of the soft error threat is its total randomness in occurrence. This even though one soft error-induced crash or one non-traceable error in data per day of a high-reliability, multiple server system would be unacceptable," said Mr. Rohr of iRoC Technologies.
Soft errors in logic are increasingly viewed as a problem for VDSM designs, but few designers are implementing solutions.
SOI will bring limited benefits, especially as layout dimensions continue to shrink.
While alpha-particle-induced soft errors seem "under control" at present, alpha sensitivity is growing for smaller layout dimensions.
The best method to determine whether a random failure is indeed coming from soft errors is to measure total randomness of crashes. Unfortunately, this is often difficult to measure, especially in an un-accelerated test at sea level.
Cosmic rays can cause semi-hard errors, requiring power down.
Strong concern was expressed about the need for a better understanding of the soft error phenomenon, since design methodology, vendor selection, and architecture decisions need to be made now for designs up to 4 years into the future. This amplifies the need for a more fundamental understanding through better modeling.
Electronic systems designers are looking for a clear method to measure the existence and frequency of soft errors at various geometries in their designs, so that they can pursue the most cost-effective and scalable CMOS solutions.
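The point above about integration driving SER can be made concrete with simple arithmetic. The sketch below assumes the soft error rate scales linearly with the number of memory cells and expresses it in FIT (failures per 10^9 device-hours). The per-megabit rate of 1000 FIT is a purely hypothetical placeholder, not a figure from the symposium, and the function names are illustrative only.

```python
# Illustrative arithmetic only: SER is assumed to scale linearly with cell count.
HOURS_PER_FIT_UNIT = 1e9  # 1 FIT = one failure per 10^9 device-hours


def system_fit(fit_per_mbit: float, megabits: float) -> float:
    """Total soft error rate in FIT for a memory of the given size."""
    return fit_per_mbit * megabits


def mean_hours_between_errors(total_fit: float) -> float:
    """Average time between soft errors, in hours, for a given FIT rate."""
    return HOURS_PER_FIT_UNIT / total_fit


if __name__ == "__main__":
    assumed_fit_per_mbit = 1000.0  # hypothetical value, not measured data
    for megabits in (1, 16, 256, 4096):
        fit = system_fit(assumed_fit_per_mbit, megabits)
        hours = mean_hours_between_errors(fit)
        print(f"{megabits:5d} Mbit: {fit:12.0f} FIT, one error every {hours:10.1f} h")
```

Under these assumptions, the per-cell sensitivity can stay flat or even fall while the system-level error interval still shrinks in proportion to memory size.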
Transient and Soft Errors in VDSM Chips
It is traditionally thought that soft errors affect semiconductor memories, that alpha particles are the cause and that the problem is best eliminated through carefully selected processing and materials because other methods result in too much performance penalty and/or silicon overhead.
In his tutorial on April 7, Mr. Rohr explained, with actual test results, that in today's VDSM technologies the logic parts of an IC design are just as vulnerable to soft errors as the memory parts, and that design techniques exist that are extremely cost-effective, incur no performance penalty, and also eliminate signal integrity problems.
iRoC, a leader in the field of Integrated Circuits Reliability, provides the leading price/performance design solutions for Fault-Tolerant and Fail-Safe functionality. By embedding these functionalities at the chip level, designers are able to fill the reliability and security gap left by traditional system-level or software protection methods. iRoC's expertise covers SoC embedded buses, memories, CPUs and custom logic protection. Several IPs, including the SPARC and 8051 CPUs and an ultra-low-power microcontroller, are already available to minimize cost and accelerate technology adoption. The iRoC soft error protection solution improves system reliability, quality and security using standard CMOS processes while minimizing semiconductor overhead.
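As a generic illustration of what self-correcting storage means, the sketch below uses a textbook Hamming(7,4) code, which corrects any single flipped bit in a 7-bit word. This is an assumption chosen for illustration. The release does not describe iRoC's actual protection scheme, and the function names here are hypothetical.

```python
# Textbook Hamming(7,4) single-error correction; not iRoC's technique.
# Codeword layout (1-based positions): p1 p2 d1 p4 d2 d3 d4


def encode(data):
    """Encode four data bits into a 7-bit codeword with three parity bits."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def correct(word):
    """Return (corrected word, syndrome); syndrome 0 means no error seen."""
    w = list(word)
    s1 = w[0] ^ w[2] ^ w[4] ^ w[6]
    s2 = w[1] ^ w[2] ^ w[5] ^ w[6]
    s4 = w[3] ^ w[4] ^ w[5] ^ w[6]
    syndrome = s1 + 2 * s2 + 4 * s4  # 1-based position of the flipped bit
    if syndrome:
        w[syndrome - 1] ^= 1
    return w, syndrome


def decode(word):
    """Correct a single-bit upset, then extract the four data bits."""
    w, _ = correct(word)
    return [w[2], w[4], w[5], w[6]]


if __name__ == "__main__":
    stored = encode([1, 0, 1, 1])
    upset = list(stored)
    upset[4] ^= 1  # simulate one particle-induced bit flip
    print(decode(upset) == [1, 0, 1, 1])
```

A code like this handles one upset per word. Two flipped bits in the same word would be miscorrected, which is why practical designs often add an extra parity bit for double-error detection.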
SPARC Test Results
On January 28, iRoC announced results from radiation tests that prove embedded intellectual property (IP) for error protection can be added to a system-on-chip (SoC) without reducing performance. iRoC's code in memories and logic circuits added no extra delay compared to the original SPARC V8 die and enabled memory reliability with a 100% efficiency of its specific robustness architecture. The ROC S81 synthesis report is now available on the iRoC web site at www.iroctech.com.
When systems fail because of signal integrity or radiation issues, the failure is generally not a deterministic, predictable event but a random one, and may therefore be hard to pinpoint. For applications requiring a high level of reliability, such as high-end computing, secure networking systems and financial transactions such as those made with SmartCards, such failures can be dramatic and costly.
Critical Need To Protect ICs From Transient Faults
Most of the time, transient errors such as soft errors or cross-coupling effects are not fixed in silicon, and are not believed to create hard errors or system crashes at sea level. Memories are the first devices to be protected, if anything is protected at all, and logic networks are typically believed to be immune down to 0.13 µm technologies. Logic blocks take up much less area and are therefore less likely to be hit by a particle strike. Soft error characterization, simulation and measurement tools have focused on memories rather than on logic blocks, whose lack of protection stems more from a lack of tools and methodologies than from physics and design issues.