Quest for Reliability: Reliability Starts at the Bottom

Well, folks, it’s my time to shine—or, at a minimum, be less dull. This is the reliability issue, and that apparently is my quest, according to the title of my column. I’m starting to wish my column were titled “Quest to Eat All the Pizza you Want and Not Gain a Pound,” but reliability it is. 

For those who might be new to my column, I work for an independent electronics laboratory that deals with root-cause failure analysis and product qualification of electronic assemblies. That includes all the parts and materials that go into that process. It also means that, on a regular basis, we see failed electronics discovered at in-circuit testing all the way to a product that has been in the field for many years. 

In a nutshell, I can tell you it is much cheaper to perform product-specific reliability testing before the product goes into the field. If you find a problem after release, you have to work backward to discover the issue and determine whether everything in the field is at risk of failure and recall, and then you still have to go back and do the testing that should have been done in the first place. Once you add up repairing or replacing products, the monetary cost of a recall can be more than the project was worth, and that doesn’t even consider the other costs associated with a recall, like the loss of possible future business with that customer. Most important of all, your product may be related to something that people need to stay alive. 

One of the best examples of a recall being detrimental in all aspects is Takata automotive airbags. While it was not directly related to what we do in the world of electronic hardware, it speaks to the need for extensive reliability testing before release. The biggest cost associated with that recall is, of course, the loss of human life, but in the business sense, it cost Takata more than $24 billion and—in the end—the company itself. If the PCBA that controls your pizza oven goes out, the stakes are much lower (debatable) but most likely still could have been rooted out with proper upfront reliability testing. 

This month, I plan to share some testing recommendations based on failure analysis, as well as lessons learned from a few of our customers over the years using case studies and data on failed units. Make no mistake, I will focus a lot on cleanliness and how it relates to reliability. “Write what you know,” they said, so that’s my plan. 

What does reliability even mean? According to the all-knowing internet, reliability is “the quality of being trustworthy or of performing consistently well.” I think that pretty much sums it up from the 50,000-foot view. When we get a little closer to the ground, we need to expand that to refer more specifically to the class of product being manufactured. Reliability and Class 1 don’t really overlap in the big Venn diagram of quality; it will most likely work when it goes out the door. That’s about it in a lot of cases. 

When looking at Class 2 and Class 3 hardware, there is most certainly a need to focus on reliability. According to IPC-A-610, “Class 2 Dedicated Service Electronic Products include products where continued performance and extended life is required and for which uninterrupted service is desired but not critical. Typically, the end-use environment would not cause failures” and “Class 3 High-Performance Electronic Products include products where continued high performance or performance-on-demand is critical, equipment downtime cannot be tolerated, end-use environment may be uncommonly harsh, and the equipment must function when required, such as life support or other critical systems.”

What this tells me is that not all reliability is equal. When it comes down to it, there are minor differences between these two classes of electronics. Outside of some high-end exotic assemblies, most parts and assembly processes are used for both classes. The biggest difference is what happens if it fails. It’s literally a matter of life and death in some cases. Sorry, I didn’t mean to bring you down there, but it is important to remember that. The good news is that most companies building those types of electronics are on top of it with testing that would not be required for many Class 2 assemblies. 

Enough of the pseudo-philosophical electronics talk; let’s get down to it. Approving a new supplier for any part of your process is a major key to reliability because you need to know that the bare board and components aren’t going to also supply a surprise down the road. Let’s start at the bare board level. When it comes to guidance, anything that is agreed to between the user and the supplier will dominate any requirement from any other source. 

In lieu of any internal guidance, most companies lean on IPC-6012: Qualification and Performance Specification for Rigid Printed Boards. Looking at the applicable documents specific to PCB manufacturing, there are 23 test methods within the TM-650, 35 related documents, and another 18 joint industry and other association documents. That is a lot of information for those who need it and should cover pretty much every conceivable combination of materials. 

In no way am I suggesting you need to review each and every one of these documents, but they are there either way. If you start with IPC-6012, you can go pretty much anywhere in the testing realm, but not all tests are required—or even necessary—for new supplier approval. Some of the parameters to test for include plating thickness on PTH barrels and pads, solder mask cure, conductive anodic filament (CAF) resistance, and cleanliness, among others. Let’s look at what some of those tests are looking for and the possible reliability issues tied to those. 

I’m going to start at layer one of the PCB fab process. Quality really does start there, and each subsequent step adds another opportunity to screw it up. The CAF test is used on bare fabs to determine whether process chemistries are present on the inner layer of the PCB that will produce electrochemical migration. This is the same as dendrite growth found on a fully populated PCBA. No matter where it occurs, if you have conductive residue, moisture, and potential, you run an elevated risk for electrical leakage and dendrite growth. 

The CAF condition is greatly aided by poor resin flow, creating dry weave that will absorb plating chemistries and allow them to bridge anode and cathode. The IPC test for CAF is found in TM-650, 2.6.25. This is an environmental test that is normally done on test coupons manufactured by your PCB supplier using the same materials you plan to use for normal production. The test boards are subjected to elevated heat and humidity for at least 596 hours under bias. In Figures 1 and 2, you can see dry weave facilitating CAF that will render any subsequent processing steps meaningless.

Scanning electron microscopy (SEM) and energy-dispersive X-ray spectroscopy (EDS) are among the best analytical tools for investigating CAF. By using SEM/EDS, you can determine the composition of the material and compare it to the base metals being used for barrel plating. Figures 3 and 4 show SEM and EDS examples. If it matches, you have CAF and need to work with your PCB fabrication supplier to optimize the process.
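As a rough illustration of that comparison, here is a minimal Python sketch. The element names, weight percentages, and the 5 wt% cutoff are all made up for illustration; they are assumptions, not values from a real EDS report or any IPC document.

```python
# Hypothetical EDS result for a suspect filament: element -> weight percent.
# These numbers are invented for illustration only.
filament_wt_pct = {"Cu": 61.0, "O": 22.5, "Cl": 9.0, "C": 7.5}

# Assumed base metals used for barrel plating on this board (hypothetical).
barrel_plating_metals = {"Cu"}


def plating_metals_in_filament(eds_wt_pct, plating_metals, min_wt_pct=5.0):
    """Return plating metals detected in the filament at or above a cutoff.

    min_wt_pct is an arbitrary illustrative cutoff, not a standard value.
    """
    return sorted(
        element
        for element, wt in eds_wt_pct.items()
        if element in plating_metals and wt >= min_wt_pct
    )


matches = plating_metals_in_filament(filament_wt_pct, barrel_plating_metals)
print("Plating metals found in filament:", matches or "none")
```

A non-empty result only says the filament chemistry is consistent with the plating metals; the call on whether it is CAF still belongs to whoever is reading the SEM images.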

In your effort to optimize the bare fab process, you need to know which contaminants are facilitating the CAF. For that, you want to use ion chromatography (IC). That will tell you exactly what ions are present and at what concentrations. Those results can be matched back to chemistries used in the plating process, and then the optimization is focused and can happen a lot faster in most cases. The IC data in Table 1 shows typical ionic content from an inner layer cleanliness issue: high levels of acetate, sulfate, and sodium residues. These ions are normally found in plating chemistries and suggest that the final rinse is insufficient to completely remove all the residues.

Ion chromatography should also be used on normal production PCBs to determine the level of cleanliness on the outside surface. If IC is to be used for process monitoring, you will want to perform global extractions for baseline data. Localized extractions over concentrated distributions of plated through-holes, over plated pads, and over bare solder mask areas should all be considered to get the clearest idea of just how clean each of those parts of the process is. 
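To make the global-versus-localized idea concrete, here is a small Python sketch that compares localized extraction results against a global baseline. Every number, the area names, and the 2x ratio limit are hypothetical assumptions for illustration, not IPC limits or real lab data.

```python
# Hypothetical ion chromatography results in ug/cm^2 (invented numbers).
global_baseline = {"acetate": 0.4, "sulfate": 0.3, "sodium": 0.5}

# Hypothetical localized extractions over three areas of the same board.
localized = {
    "PTH cluster": {"acetate": 1.6, "sulfate": 0.9, "sodium": 0.6},
    "plated pads": {"acetate": 0.5, "sulfate": 0.3, "sodium": 0.4},
    "bare solder mask": {"acetate": 0.3, "sulfate": 0.2, "sodium": 0.5},
}


def elevated_ions(area_result, baseline, ratio_limit=2.0):
    """List ions in one localized extraction above ratio_limit x baseline.

    ratio_limit is an arbitrary illustrative value, not an IPC requirement.
    """
    return sorted(
        ion
        for ion, level in area_result.items()
        if level > ratio_limit * baseline[ion]
    )


flags = {
    area: elevated_ions(result, global_baseline)
    for area, result in localized.items()
}

for area, ions in flags.items():
    print(f"{area}: {', '.join(ions) if ions else 'no elevated ions'}")
```

With these made-up numbers, only the plated through-hole cluster stands out, which is the kind of pointer that tells you where to look first.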

Solder mask cure is another critical parameter that should be examined. When a mask is properly cured, it will exhibit a continuous smooth texture, like a marble countertop. If the solder mask is undercured, the surface will be rough with nooks and crannies, like an English muffin. The same way that muffin will hold delicious butter and jam, the solder mask will hold flux, wash chemistry, and other processing residues. 

The IPC test methods related to solder mask cure include TM-650 2.3.23B. These are chemical tests that use drops of methylene chloride or methyl chloroform on the solder mask, followed by using a wooden spudger to see if you can scratch the mask. If it easily scratches, give it a “cure bump” with either UV or thermal exposure and then repeat the test. If the mask is then unaffected, you can go back to your supplier and have them adjust their cure profiles. Uneven solder mask coverage can expose the base metals to less than optimal environments, and that alone can be enough to cause issues like corrosion (shown in Figure 5). There are many different tests specific to bare boards, so it’s a good idea to consider the end-use environment, warranty period, and any other product-specific details to determine which test is most applicable for your product.
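The solder mask cure check described above boils down to a short decision flow, sketched here in Python. The function name and return strings are hypothetical, and the last branch (the mask still scratches after the cure bump) is my assumption, since that outcome is not covered above.

```python
def solder_mask_cure_next_step(scratches_as_received,
                               scratches_after_cure_bump=None):
    """Suggest a next step from the solvent-and-scratch test results.

    Each argument is True if the wooden spudger scratches the mask easily.
    Leave scratches_after_cure_bump as None until the retest has been run.
    """
    if not scratches_as_received:
        return "mask appears cured; no action"
    if scratches_after_cure_bump is None:
        return "apply cure bump (UV or thermal) and repeat the test"
    if not scratches_after_cure_bump:
        return "undercure confirmed; ask supplier to adjust cure profile"
    # Assumption: not covered in the column text.
    return "cure bump did not help; investigate further with supplier"


print(solder_mask_cure_next_step(True))
print(solder_mask_cure_next_step(True, scratches_after_cure_bump=False))
```

It is not much code, but writing the flow down this way makes it harder to skip the retest step before calling the supplier.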

Many of the same processes used for the plating of bare boards are also used for component leads. Both processes use chemistries that can increase the risk of corrosion or issues related to electrochemical migration if not fully removed. This happens with components when those chemistries find a way up into the die area, causing corrosion and dendrite growth. This can easily happen when there is a small gap at the overmold/lead frame interface (Figures 6 and 7).

Even if the residues don’t make it all the way to the die, they can be present on the edge of the package body between the leads and propagate electrochemical migration (Figure 8). 

If you are using a fully no-clean assembly process, you can’t rely on an end-of-assembly wash process to remove any of those residues. This is when you can use IC analysis to determine the effectiveness of the component wash process. Standard bag extractions will detect any elevated levels of ionics on the outer surfaces. For internal surfaces, Parr Bomb analysis is a pressurized extraction for harvesting possible plating residues that have been absorbed into the component overmold material, down to the lead frame. This can be done without any physical damage to the component. Much like the bare boards, components bring their own inherent reliability risk before the first part is soldered. 

Now that you know how those raw parts can impact your reliability, let’s put all those pieces together. Per IPC J-STD-001 Section 8, you need to qualify an assembly process using surface insulation resistance (SIR) testing per TM-650 on test boards to show how well the contract manufacturer (CM) is processing the proposed set of materials and what impact elevated heat and humidity have on electrical resistance. This test is the bare minimum that needs to be completed to verify the assembly process. Contract manufacturers need to do this testing to generate objective evidence that can be applied to process monitoring analysis. As most people know by now, the historical acceptance criterion of 1.56 µg NaCl equivalence per square centimeter has been given the old heave-ho, and rightfully so. If you want some details on how and why that criterion was removed, I recommend reading IPC-WP-019: An Overview on Global Change in Ionic Cleanliness Requirement. (Spoiler alert: It should never have been used as it has been.) 

Here is an example of how a CM can generate objective evidence and use it for process monitoring. If a company needs to qualify a product with a new customer, and the plan needs to include monitoring the approved assembly process, they choose test coupons that are most representative of their final product based on the mix of SMT and PTH components. They then assemble boards using the proposed combination of materials and equipment to be used for the final product. Along with two bare reference samples, the assembled test boards are tested per the applicable IPC test method. If they pass that test, they are tested with ion chromatography to determine the average levels of specific anions, cations, and weak organic acids (WOA) to create baseline data. 
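The baseline step at the end of that example is just an average per ion species across the test boards. Here is a minimal Python sketch of it; the species list, the three boards, and every value are hypothetical placeholders rather than real IC data.

```python
from statistics import mean

# Hypothetical IC results (ug/cm^2) for three assembled test boards.
test_boards = [
    {"chloride": 0.20, "sulfate": 0.10, "sodium": 0.30, "WOA": 1.50},
    {"chloride": 0.30, "sulfate": 0.20, "sodium": 0.20, "WOA": 1.70},
    {"chloride": 0.25, "sulfate": 0.15, "sodium": 0.25, "WOA": 1.60},
]


def ic_baseline(boards):
    """Average each ion species across boards to form baseline data."""
    species = boards[0].keys()
    return {ion: mean(board[ion] for board in boards) for ion in species}


baseline = ic_baseline(test_boards)
for ion, level in baseline.items():
    print(f"{ion}: {level:.2f}")
```

In practice you would want more than three boards and would probably track the spread as well as the average, but the average is the number the later comparisons hang on.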

Next, they build a set of 20 samples of the actual product. One set of 10 boards is tested using ion chromatography with global extraction. The second set of 10 is tested in a ROSE tester. The average of the ROSE test results is the acceptance criterion used on a per-shift basis. Remember that the number is being derived from your ROSE tester and can differ from another machine of the same make and model. It doesn’t really matter if that number is 1 or 101 µg NaCl equivalence per square centimeter. That number has been verified with other testing. Often, IC testing is done on a quarterly basis for further evidence of process control. The quarterly test results are compared to the baseline. 
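Here is a short Python sketch of that monitoring scheme. The ROSE readings are invented, the pass rule (a reading at or below the criterion passes) is my assumption about how the criterion is applied, and the 1.5x margin for the quarterly IC comparison is an arbitrary illustrative value.

```python
from statistics import mean

# Hypothetical ROSE readings (ug NaCl eq./cm^2) from the second set of 10.
rose_readings = [0.8, 0.9, 1.1, 1.0, 0.7, 1.2, 0.9, 1.0, 1.1, 0.8]

# Acceptance criterion = average of the qualified boards on THIS tester.
rose_criterion = mean(rose_readings)


def shift_check(reading, criterion):
    """Assumed rule: a per-shift reading at or below the criterion passes."""
    return reading <= criterion


def quarterly_drift(quarterly_ic, baseline_ic, allowed_ratio=1.5):
    """Ions whose quarterly IC level exceeds allowed_ratio x baseline.

    allowed_ratio is an arbitrary illustrative margin, not an IPC value.
    """
    return sorted(
        ion
        for ion, level in quarterly_ic.items()
        if level > allowed_ratio * baseline_ic[ion]
    )


print(f"ROSE criterion for this tester: {rose_criterion:.2f}")
print("Shift reading 0.90 passes:", shift_check(0.90, rose_criterion))
print("Drifting ions:", quarterly_drift(
    {"chloride": 0.50, "sodium": 0.30},
    {"chloride": 0.25, "sodium": 0.25},
))
```

The point of the sketch is that the criterion is a property of your process on your tester, so it gets recomputed whenever the process is requalified rather than copied from someone else's line.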

Some customers will also perform elevated heat and humidity exposure testing with normal operating voltages to further validate the acceptance criteria. This is known as temperature-humidity-bias (THB) testing and is similar to SIR testing. THB testing is done on actual products using normal operating voltages and duty cycles. This is one of the most important tests to consider because while the test coupons are considered predictors of performance, a lot of things change when it’s the real deal.

A large percentage of reliability failures we see are tied to cleanliness. In this column, I have addressed bare board, raw component, and test board assembly cleanliness, but those are only three sources for contamination out of a much larger number of options. Anything that can come into contact with the PCBA, either directly or indirectly, is a possible source of contamination. You must consider testing everything around the PCBAs, such as housings, large connector bodies, and any other number of materials. 

We see a lot of failures from companies that have good objective evidence for their assembly process but, because they were only testing the PCBAs, don’t see the full picture. Materials like mold release on metal and plastic housings can be very ionic. If enough atmospheric moisture is available, it will collect at a low point and drip down on the board. That moisture can contain high levels of ionic content from the housing interior surface. 

We have seen vibration-dampening foam that was extremely high in ionic content pressed directly against the surface of the PCBA, with no cleanliness testing ever done on the material (Figure 9). This was used in an under-hood application and not hermetically sealed. This was done on purpose by someone getting paid to make those types of decisions. It can happen to the best of us. With any luck, someone reading this right now will start to think about every part of their product outside of just PCBA manufacturing that can impact their product reliability. 

I have barely scratched the surface on reliability testing, as so many tests are product-specific. Some products require a lot of vibration or extreme temperature exposure testing, but what I have covered applies to the foundation of every product. The title of my column this month is “Reliability Starts at the Bottom,” and by that, I mean at the base of every electronic product. At a minimum, you must have components, bare boards, and assembly materials to build an assembly. You must be able to confirm that you are building on a reliable base. 

I reference IPC test and guidance documents frequently for a good reason: they are compiled by industry experts and, when followed, will more than likely yield a reliable product. I also frequently say that IPC has no idea what you are building specifically, so it’s imperative that you own your product. By that, I mean you need to decide which tests are required for both initial acceptance and ongoing process monitoring. Being product-specific with your requirements might go above and beyond what IPC, or any other industry association, recommends, but it’s the best thing you can do for your product’s reliability. And isn’t that what it’s all about? Well, that, and pizza, of course.

This column originally appeared in the September issue of SMT007 Magazine.



It is much cheaper to perform product-specific reliability testing before the product goes into the field. Eric Camden shares some testing recommendations based on failure analysis, as well as lessons learned from a few of our customers over the years using case studies and data on failed units.

Copyright © 2020 I-Connect007. All rights reserved.