By Douglas Goodman, James Hofmeister, Justin Judkins, PhD
Ridgetop Group Inc., Tucson, AZ USA
Abstract:
Over 81% of new digital designs utilize Field Programmable Gate Arrays (FPGAs). With FPGA packages exceeding 1,000 pins and Ball Grid Array (BGA) solder bumps providing the interconnect, it is vitally important that every bump make solid contact with the Printed Circuit Board (PCB).
Unfortunately, these solder bumps are subject to the effects of aging: cracking of the solder ball, oxidation within the crack, and eventual failure of the connection. Although the failures are metallurgical in origin, their effects can be detected electrically using a Verilog softcore and an in-situ sensor during the power-on self-test sequence.
Ridgetop will describe its work in providing a self-contained built-in self-test (BIST) solution that is effective in detecting the degradation and eventual failure of the FPGA.
1. Introduction: Growth in Use of BGA Packages for FPGAs
In recent years, Field Programmable Gate Arrays (FPGAs) have advanced greatly in their capability. The corresponding need for I/O on these devices has compelled manufacturers to use high-density ball grid array (BGA) packages, now with configurations exceeding 1,000 pins and very narrow solder bump pitches. According to a recent EE Times survey, over 81% of new digital designs incorporate FPGAs.
BGAs fit the FPGAs into a smaller footprint, decreasing pitch spacing, by utilizing an array of solder ball connections. This allows for a higher density of I/O connections than other package types. The result is a considerably smaller finished package size.
While BGAs feature shorter electrical path lengths, which reduce inductance, the critical solder bump connections are subject to aging that progresses from cracking to oxidation to eventual failure. Ridgetop has developed a built-in self-test (BIST) core that can be used to detect impending problems on FPGAs, so that mitigating actions can be taken.
Figure 1: Typical Ball Grid Array Package (from Chip Scale Review)
The top-down view of a high-density package is shown in Figure 2, where extremely fine pitches are involved.
Figure 2: Top-down View of Solder Bump Mapping. The solder bumps can be 0.6 mm in diameter and placed on 1 mm centers.
2. Points of Failure: BGA Solder Joints
The following is a list of some points of failure associated with FPGAs in BGA packages: (1) the junction between the FPGA package and the solder ball, (2) the junction between the solder ball and the printed wire board, and (3) the wire bond lift off or break within the FPGA package.
The most typical failure point is the junction between the FPGA package and the solder ball, as depicted in Figure 3.
Figure 3: BGA Solder-Joint, Solder Balls, and Typical Failure Point
A typical failed joint, taken from results obtained at Auburn University, is shown in Figure 4.
Figure 4: Solder Bump Cracking (from Auburn University)
While the high-resistance spike profile shown in Figure 5 is simulated, the results have been verified by Highly Accelerated Life Test (HALT) experiments conducted at the Center for Automotive and Vehicle Electronics (CAVE) at Auburn University, with whom Ridgetop maintains a research partnership.
Figure 5: Solder-Joint Resistance Profile Showing Presence of High-Resistance Spikes
3. Definition of Failure
One industry standard for defining a BGA package failure during a test involving thermal cycles, with or without accompanying physical stresses, is given below:
An event is defined as the occurrence of a high-resistance spike of 300 ohms or higher for a duration period of 200 nanoseconds or longer.
A BGA package failure is defined as the occurrence of 10 or more events that occur within 10 percent of the time (number of thermal cycles) of the first event.
By this standard, a BGA assembly that exhibits intermittent high-resistance spikes as shown in Figure 5 is deemed not to have failed because (1) the first event occurred at about 700 cycles and (2) at no time during the remainder of the test did the BGA package exhibit 10 events within 70 cycles (10 percent of 700). In fact, only three events occurred in total, because five of the relatively high-resistance spikes do not, by definition, qualify as events. The Ridgetop solder-joint fault detectors are able to detect high-resistance spikes of tens of ohms.
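The event and failure definitions above can be expressed as a short Python sketch. It is illustrative only: the `Spike` record, the function names, and the sample data are hypothetical and are not read from Figure 5. The sketch also assumes that the 10-event window may begin at any event (a sliding window) and that the window width is 10 percent of the cycle count at the first event, which is how the worked example above reads.

```python
from dataclasses import dataclass
from typing import List

EVENT_MIN_OHMS = 300.0     # resistance threshold from the definition above
EVENT_MIN_NS = 200.0       # minimum spike duration from the definition above
FAILURE_EVENT_COUNT = 10   # events needed to declare a package failure
WINDOW_PERCENT = 10        # window width, as a percentage of the first event's cycle count


@dataclass
class Spike:
    """One observed resistance spike (hypothetical record layout)."""
    cycle: int           # thermal cycle at which the spike was seen
    ohms: float          # peak resistance of the spike
    duration_ns: float   # how long the spike lasted


def is_event(spike: Spike) -> bool:
    """A spike counts as an event only if it meets both thresholds."""
    return spike.ohms >= EVENT_MIN_OHMS and spike.duration_ns >= EVENT_MIN_NS


def package_failed(spikes: List[Spike]) -> bool:
    """Apply the failure definition with a sliding window (an assumption)."""
    event_cycles = sorted(s.cycle for s in spikes if is_event(s))
    if not event_cycles:
        return False
    # Window width in cycles, e.g. first event at 700 cycles -> 70 cycles.
    window = event_cycles[0] * WINDOW_PERCENT / 100
    for i, start in enumerate(event_cycles):
        in_window = sum(1 for c in event_cycles[i:] if c - start <= window)
        if in_window >= FAILURE_EVENT_COUNT:
            return True
    return False


# Illustrative values only, NOT taken from Figure 5:
# eight spikes, of which three meet both thresholds.
example_spikes = [
    Spike(700, 350.0, 250.0),    # event
    Spike(720, 120.0, 400.0),    # resistance too low
    Spike(810, 500.0, 90.0),     # duration too short
    Spike(905, 310.0, 220.0),    # event
    Spike(960, 80.0, 300.0),     # resistance too low
    Spike(1100, 200.0, 500.0),   # resistance too low
    Spike(1250, 420.0, 600.0),   # event
    Spike(1300, 290.0, 1000.0),  # resistance too low
]

event_count = sum(is_event(s) for s in example_spikes)
failed = package_failed(example_spikes)
```

With this sample data the package records three events and is not classified as failed, even though eight spikes were seen. That gap is what motivates detecting spikes well below the 300-ohm threshold.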
The occurrence of a single high-resistance spike is indicative of solder-joint damage, and the package has reduced reliability. The use of that component might not be prudent in a mission-critical situation. Because solder-joint fatigue damage is cumulative, the occurrence of just a couple of high-resistance spikes, and perhaps just one, indicates that the PCB assembly is likely to exhibit intermittent failures, especially when subjected to stresses associated with vibrations such as with automotive or vehicle applications.
4. Ridgetop’s BIST Approach
Ridgetop’s test approach was to provide a hardware/software BIST solution. The first step is to use otherwise unassigned pins on the FPGA in the test structure. These pins become the monitor pins, and a fault detected on one of them is a precursor to a more serious impending failure of the FPGA.
A Verilog softcore was written that can be incorporated into the built-in test backbone of the device (possibly for use on power-up), and a small capacitor accompanies each of the unused test points on the package, as shown in Figure 6. The softcore generates the test sequence shown in Figure 7.
Figure 6: Test Points - Test points are usually unassigned pins at the ends of the package, where the maximum bending occurs.
The test code generates a pulse train that is used to detect the 1’s and 0’s. If there is a high-resistance path through the solder ball, indicating a degraded or open joint, then the reflected pulse amplitude will be low, as shown in Figure 8. This is a detected fault at the outer sacrificial pin. It is inferred that the inner pins (where there are programmed cells critical to the FPGA operation) will fail later. This approach provides an early-warning electronic prognostic of the impending failure of the entire package.
Figure 7: Test Code Generation Example
Figure 8: Write/Read Pulse Analysis - A low response indicates a high-resistance path, which is indicative of a faulty solder ball.
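One way to see why a low read-back indicates a high-resistance joint is a first-order RC model, sketched below in Python. This is an assumed simplification, not Ridgetop's circuit: it treats the write pulse as charging the small test capacitor from 0 V through the solder-joint resistance alone and ignores the output impedance of the FPGA driver. The capacitance, pulse width, supply voltage, logic threshold, and function names are all hypothetical.

```python
import math


def readback_voltage(r_joint_ohms, c_farads, pulse_s, vdd=3.3):
    """Capacitor voltage at the end of the write pulse.

    First-order RC charging from 0 V: V = Vdd * (1 - exp(-t / (R * C))).
    """
    tau = r_joint_ohms * c_farads
    if tau == 0:
        return vdd  # ideal zero-resistance joint charges instantly
    return vdd * (1.0 - math.exp(-pulse_s / tau))


def joint_is_suspect(r_joint_ohms, c_farads=1e-9, pulse_s=50e-9,
                     vdd=3.3, threshold_fraction=0.5):
    """True if the read-back falls below the assumed logic threshold.

    All default values are hypothetical and chosen only for illustration.
    """
    v = readback_voltage(r_joint_ohms, c_farads, pulse_s, vdd)
    return v < threshold_fraction * vdd


healthy = joint_is_suspect(1.0)     # about 1 ohm: capacitor charges fully
degraded = joint_is_suspect(100.0)  # about 100 ohms: capacitor undercharges
open_joint = joint_is_suspect(1e6)  # effectively open
```

Under these assumed values the detection threshold falls at roughly 72 ohms (R = t / (C ln 2)). In such a model, the capacitor value and pulse width together would set the smallest resistance that registers as a fault.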
Actual measurement results are shown in Figure 9.
Figure 9: Measurement Results Showing Applied Strain and Intermittent Continuity through the Solder Ball
5. Electronic Prognostics and the Reliability Bathtub Curve
Failure probability for a device typically follows a bathtub curve (Figure 10), with three distinct regions:
Infant Mortality (Burn-In) Region. High probability of failure, related to manufacturing defects.
Useful Life Period. Minimum, almost constant, failure rate caused mostly by temperature, vibration, and voltage stresses.
Wearout (End-of-Life) Region. Due to wearout or fatigue mechanisms, probability of failure increases.
Figure 10: Bathtub Curve Showing Failure Rate with Time, along with Remaining Useful Life (RUL) Period
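The three regions described above are often approximated in reliability work by summing three Weibull hazard rates: one with shape parameter below 1 (falling), one equal to 1 (constant), and one above 1 (rising). The Python sketch below illustrates that general modeling idea. It is not Ridgetop's model, and every shape and scale value in it is hypothetical.

```python
def weibull_hazard(t, shape, scale):
    """Weibull hazard rate h(t) = (k / lam) * (t / lam) ** (k - 1), for t > 0."""
    return (shape / scale) * (t / scale) ** (shape - 1)


def bathtub_hazard(t):
    """Sum of three Weibull hazards; all parameter values are hypothetical."""
    infant = weibull_hazard(t, shape=0.5, scale=1000.0)    # falling: early defects
    useful = weibull_hazard(t, shape=1.0, scale=20000.0)   # constant: random stress failures
    wearout = weibull_hazard(t, shape=5.0, scale=10000.0)  # rising: fatigue and wearout
    return infant + useful + wearout


early = bathtub_hazard(10.0)     # infant mortality region
middle = bathtub_hazard(5000.0)  # useful life period
late = bathtub_hazard(15000.0)   # wearout region
```

With these assumed parameters the hazard is highest at the start, lowest in the middle, and rises again late in life, which matches the qualitative bathtub shape.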
The failure of an edge pin on the FPGA represents a precursor event to an impending failure of the FPGA. The fault-to-failure progression depends on the package type, the type of solder used, and the number of pins used. Ridgetop is continuing to gather statistics that will provide better estimates of the RUL period for a series of intermittent spikes occurring on the FPGA packages.
6. Summary
Ridgetop has developed an effective means of testing the integrity of Ball Grid Array (BGA) packages as used in FPGAs and embodied this technology into a product called InstaBIST™ BGA. This IP consists of a Verilog core and documentation to enable ongoing testing of the BGA packages that are deployed in critical applications. The BIST approach described is an effective means of detecting faults in BGA packages. Patents have been filed covering this technology and commercial licenses are available.
Acknowledgements
Ridgetop would like to acknowledge the assistance of the NAVAIR/JSF Program Office, Raytheon Corporation, and the Center for Automotive and Vehicle Electronics at Auburn University for their support of this work.
References
D. Love and D. Towne, BGA Reliability Characterization Project Temperature Cycling Tests, Sun Microsystems, Palo Alto, CA, Jan. 1999.
PEM Qualification Requirements for Radiation Hardened Non-hermetic Products Qualifiable for Space Flight Applications.
D. Goodman, et al., “Practical Application of PHM/Prognostics to COTS Power Converters,” IEEE Aerospace Conf., Big Sky, MT, Mar. 2005.
J. Hofmeister, et al., “In-Situ, Real-Time Detector of Faults in Solder-Joint Networks Belonging to Operational, Fully Programmed Field Programmable Gate Arrays (FPGAs),” IEEE Autotestcon, Anaheim, CA, Sep. 2006.