Accelerate Stress, Increase Productivity
Depending on whose job it is, accelerated stress testing (AST) can be thought of as preproduction design verification, production screening, temperature cycling or pro-active product monitoring. It is all those things and more. In any context, it is product specific and should be continually refined with historical data.
Cost-reduction and quality-assurance programs drive AST. As part of a production automation program, AST improves product consistency and reliability by reducing the number of uncontrolled manufacturing variables, such as manual tasks performed by operators of burn-in and stress-testing stations, which are often referred to as HALT/HASS test systems. In many plants that make multiple products, AST chambers and measurement equipment are required at the end of each production line. To hold down test costs, these widely distributed chambers can be controlled from a central location via an Ethernet data communications network. Remotely collected data is then stored, analyzed and redistributed to other users.
AST payoff
It seems logical to expect reduced warranty costs when fewer products must be replaced or repaired after shipment. When repair and replacement costs are high, it makes sense to perform rigorous AST on 100% of the products before shipment, especially mission-critical products that need to work reliably for long periods. For example, auto powertrain and chassis electronics must last for 10 years in harsh conditions in cars. Power supplies and AC-DC converters must operate reliably in computers and other equipment that run 24 hours per day, seven days per week. Hermetically sealed devices, such as heart pacemakers, must operate flawlessly in extreme humidity and temperature conditions to protect the life of the user. Aerospace and avionics equipment must function correctly to avoid putting an aircraft and the lives of its crew and passengers in jeopardy. In less dramatic applications, AST reduces customer cost and improves goodwill.
Effective AST programs should have a favorable impact on OEM development costs. By quickly uncovering problems associated with product and process designs, AST shortens engineering and manufacturing start-up times. This is accomplished with techniques that shorten the time needed to identify and correct potential causes of product failure. From a marketing perspective, AST provides a competitive edge by helping companies get reliable products into the hands of consumers ahead of other OEMs.
Because AST requires capital investments, it forces an evaluation of how engineering and production equipment is being used. A well thought-out AST program should identify ways to test more product with less equipment by using versatile instruments and switching systems on multiple devices under test (DUTs). This also leads to maximum utilization of expensive manufacturing and test systems, whether they exist already or are purchased as part of the AST program. Although dozens of test chambers may be distributed all over the factory floor, the net result is reduced costs and a better return on investment (ROI), which more than repays the initial AST investment.
A is for accelerated
The theory behind AST is embodied in a concept called the product reliability curve, also known as the bathtub curve because of its shape. The curve has three distinct failure rate regions. The early failure period, also called infant mortality, has a decreasing failure rate and is associated with built-in defects. These defects can often be identified by AST. Three major sources of early failure are design, components and manufacturing. Whether early product failure is caused by a flawed design, incorrect component selection or faulty production processes, removing these flaws has the effect of deepening and widening the bathtub curve. Choosing the appropriate stress level allows AST to accelerate failures. Taking frequent measurements during AST reveals these failures and helps pinpoint the cause. In general, the higher the stress applied to a product, the sooner it will fail because of built-in defects.
Standard end-of-line functional tests usually do not reveal subtle design and manufacturing flaws that cause early failures. Under actual use conditions, some of these flaws cause intermittent failures that are difficult to diagnose, resulting in costly warranty claims and bad customer relations. Taking sensitive voltage and current measurements during AST monitoring makes it possible to spot these “soft” failure modes.
The bottom of the bathtub curve is characterized by scattered failures attributed to random component failures, isolated cases of flawed assembly, occasional material defects and other random phenomena. Even so, AST can identify patterns that can help locate the sources of these flaws. Eventually, good products fail because of wear and tear, represented by the increasing failure rate at the right-hand side of the bathtub curve. Depending on a product’s wear-out mechanism, AST may or may not identify basic design and component characteristics that affect “normal” life.
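The three regions of the bathtub curve can be illustrated with a failure-rate model. The sketch below uses a Weibull hazard function, which is an assumption made for illustration; the article does not name a model, and the shape and scale values are hypothetical. A shape below 1 gives the decreasing rate of the early failure period, a shape of 1 gives the flat bottom, and a shape above 1 gives the rising wear-out rate.

```python
import math


def weibull_hazard(t, shape, scale):
    """Instantaneous failure rate h(t) = (shape/scale) * (t/scale)**(shape - 1)."""
    if t <= 0 or shape <= 0 or scale <= 0:
        raise ValueError("t, shape and scale must be positive")
    return (shape / scale) * (t / scale) ** (shape - 1)


def hazard_trend(shape, t_early=10.0, t_late=100.0, scale=1000.0):
    """Compare the failure rate at two times (hours) and name the trend.

    The times and the scale are hypothetical values, not data from the article.
    """
    early = weibull_hazard(t_early, shape, scale)
    late = weibull_hazard(t_late, shape, scale)
    if math.isclose(early, late):
        return "constant"
    return "decreasing" if late < early else "increasing"


if __name__ == "__main__":
    for label, shape in [("early failure", 0.5),
                         ("random failure", 1.0),
                         ("wear-out", 3.0)]:
        print(f"{label:>14}: failure rate is {hazard_trend(shape)}")
```

Fitting such a model to historical failure data is one way to check whether a screen is actually removing early failures, since the fitted shape for shipped product should move toward 1.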
To accelerate product failure, stress levels much higher than those found in normal product usage must be used. Also, the right kind of stress must be applied because different conditions such as temperature, vibration, humidity and thermal shock have varying levels of effectiveness in screening for early failures. Temperature, for instance, has a 40% effectiveness rating and humidity an 18% effectiveness level. Although humidity and vibration are significant, AST programs for many products gain little from combining these and other stress tests with temperature cycling.
An effective AST program may require experimentation to determine if integrated temperature-humidity or temperature-vibration cycling will reveal defects earlier or find different types of early failures that temperature alone would not reveal. The DUT must be thoroughly analyzed to make this determination. It is important to take advantage of the in-house expertise available by asking design and production engineers questions concerning product characteristics, applications and manufacturing methods. Also, consult outside sources, such as the Institute of Environmental Sciences and Technology (www.iest.org), which publishes guidelines and standards needed to set up an effective AST program.
Establish parameters
A test strategy should not only encompass stress parameters, but also:
- Stress chamber requirements.
- Measurement physics such as methods, parameter values, resolution and accuracy.
- Instrument features such as control, triggering, firmware and data communication interfaces.
- Switching assemblies for concurrent multipath testing.
- Other system elements such as power supplies and sources, PC controller, OS, I/O support, racks, fixturing, cabling, plumbing, software and documentation support.
When experimentation or analysis shows that temperature cycling alone (without humidity or vibration) will quickly reveal defects, this simpler test regimen reduces development time, results in less complicated test operations, and lowers capital costs. To make temperature cycling the principal focus of AST, it is necessary to choose a temperature high enough to cause failures within a few hours or less. One guideline commonly applied is the x °C rule, where x is the ambient temperature rise that cuts product life in half. A temperature rise of 2x cuts life to one-fourth, and so on. An Arrhenius model or similar statistical method is used to establish the best test temperature and predict product life. Whichever model is used, be sure to stick with it so that data remain comparable over time. The goal is to create a model that accurately describes how elevated temperatures affect product life, and then use this to minimize test time while assuring reliable products.
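The halving guideline and the Arrhenius model mentioned above can be sketched in a few lines. This is a minimal illustration, not a validated life model: the activation energy and the temperatures in the demo are hypothetical values, and a real program would fit them to failure data for the specific product.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def halving_rule_life_fraction(temp_rise_c, halving_rise_c):
    """Fraction of product life remaining if each halving_rise_c of rise halves life."""
    if halving_rise_c <= 0:
        raise ValueError("halving_rise_c must be positive")
    return 0.5 ** (temp_rise_c / halving_rise_c)


def arrhenius_acceleration(activation_energy_ev, use_temp_c, stress_temp_c):
    """Arrhenius acceleration factor between use and stress temperatures.

    AF = exp((Ea / k) * (1/T_use - 1/T_stress)), temperatures in kelvin.
    """
    t_use_k = use_temp_c + 273.15
    t_stress_k = stress_temp_c + 273.15
    exponent = (activation_energy_ev / BOLTZMANN_EV_PER_K) * (1.0 / t_use_k - 1.0 / t_stress_k)
    return math.exp(exponent)


if __name__ == "__main__":
    # Hypothetical inputs: 0.7 eV activation energy, 40 C use, 85 C stress.
    af = arrhenius_acceleration(0.7, 40.0, 85.0)
    print(f"Acceleration factor: {af:.1f}")
    print(f"Chamber hours equivalent to 1,000 field hours: {1000.0 / af:.1f}")
```

Keeping the model in one function makes it easier to follow the advice above and apply the same model to every data set over time.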
The temperature change profile is also important as it may affect failure rate. This can involve cycling with a slow ramp, a rapid step increase (temperature shock) or merely aging at elevated temperature. In any case, a test chamber with a temperature range and cycling features that provide the necessary profile must be purchased. It is important to consider future applications when making this purchase.
Some guidelines for AST temperature parameters and chamber specifications are:
- Basic Chamber Range: 2 to 60 °C per minute rate of change.
- Design Test: 10 to 60 °C per minute rate of change; the test should slightly exceed the design margin.
- Component Test: 2 to 20 °C per minute rate of change.
- Assembly Test: 2 to 10 °C per minute rate of change.
- Total span of at least 100 °C. As a rule of thumb, the wider the better.
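The ramp rates and span above translate directly into chamber time per cycle. The following sketch estimates it for a simple ramp-and-dwell profile; the temperature limits and dwell time in the demo are hypothetical, and the calculation assumes the product tracks the chamber air temperature, which real product mass will lag.

```python
def cycle_minutes(low_c, high_c, ramp_c_per_min, dwell_min):
    """Minutes for one cycle: ramp up, dwell hot, ramp down, dwell cold."""
    span_c = high_c - low_c
    if span_c <= 0 or ramp_c_per_min <= 0 or dwell_min < 0:
        raise ValueError("need high_c > low_c, a positive ramp rate, and dwell >= 0")
    ramp_time = span_c / ramp_c_per_min
    return 2 * ramp_time + 2 * dwell_min


def cycles_per_shift(low_c, high_c, ramp_c_per_min, dwell_min, shift_hours=8.0):
    """Whole cycles that fit into one shift of the given length."""
    return int((shift_hours * 60.0) // cycle_minutes(low_c, high_c, ramp_c_per_min, dwell_min))


if __name__ == "__main__":
    # Hypothetical assembly-level profile: -40 to 85 C, 10 C/min, 5 min dwells.
    minutes = cycle_minutes(-40.0, 85.0, 10.0, 5.0)
    print(f"One cycle takes {minutes:.1f} min")
    print(f"Cycles per 8-hour shift: {cycles_per_shift(-40.0, 85.0, 10.0, 5.0)}")
```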
Inside the test chamber, heated airflow elevates product temperatures. Therefore, make sure that products are mounted so they all reach the same temperature. To accomplish this, focus on airflow velocity rather than volumetric rate. Generally, an air velocity in the range of 600 to 1,000 fpm is desirable. This is high enough to “scrub off” the product’s surface barrier effectively, which could otherwise impede temperature change. Air velocity that is too high or too low affects the chamber’s ability to change temperatures—1,000 fpm is about the upper limit.
Chamber size and fixturing must be selected to conform to lot sizes and the physical size of DUTs. Other considerations are the mechanical design of the fixtures and racks, and connections that ensure the integrity of electrical signals.
Large electronics manufacturers need many stress chambers operating simultaneously to meet production requirements, so networking is a major AST consideration.
Historically, each chamber has required its own local PC to control instrumentation and gather data. Ethernet has become nearly universal as the data communication backbone in manufacturing and process plants, meaning most industrial PCs come with an Ethernet card. With Ethernet evolving as a de facto communications standard, look for instrumentation with this type of network interface. This eliminates the need for separate PCs at each chamber, which offers a number of advantages:
- One central PC can be used with multiple Ethernet-based instruments at local test stations.
- Without the local PC, there is less operator and engineering involvement at each station, and reduced PC maintenance and troubleshooting.
- More efficient monitoring and control—the central PC controller simultaneously collects data from all DUTs at all test chambers for central processing and distribution.
- Long-distance distribution and high data transfer speeds.
- Production managers can monitor data and make pass or fail decisions remotely, without going to HALT/HASS rooms.
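The central-controller arrangement described above can be sketched as one program that polls every chamber concurrently. In this sketch each chamber is represented by a hypothetical zero-argument reader function standing in for whatever network query a real instrument supports; no particular instrument protocol or address is assumed.

```python
from concurrent.futures import ThreadPoolExecutor


def poll_chambers(readers, max_workers=8):
    """Read all chambers concurrently from one central controller.

    readers maps a chamber name to a zero-argument callable that returns a
    reading. The callables are placeholders for real instrument queries.
    Returns a dict mapping each name to ("ok", value) or ("error", message),
    so one unreachable chamber does not stop collection from the others.
    """
    def safe_read(reader):
        try:
            return ("ok", reader())
        except Exception as exc:  # report the fault instead of aborting the poll
            return ("error", str(exc))

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {name: pool.submit(safe_read, reader)
                   for name, reader in readers.items()}
        return {name: future.result() for name, future in futures.items()}


if __name__ == "__main__":
    def unreachable():
        raise ConnectionError("no response")

    # Hypothetical chambers: two respond, one is offline.
    demo = {"line_1": lambda: 84.9, "line_2": lambda: 85.2, "line_3": unreachable}
    for name, (status, detail) in poll_chambers(demo).items():
        print(name, status, detail)
```

Returning a status with every reading lets the central PC separate communication faults from product data before anything is stored or redistributed.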
Good data, good results
AST requires repeatable, traceable measurements of parameters such as voltage, resistance and temperature over multiple channels for each test fixture in a chamber. Many burn-in test sequences require hours or days to complete, so long-term equipment reliability and data security are critical.
After determining the test requirements, the next steps are to analyze the test environment and then list all potential sources of measurement error. The aim is to limit the measurement uncertainty to an acceptable level.
Potential error sources include noise associated with the DUTs, external noise coupled into cabling, and the limitations of the instruments. More specifically, consider voltage offsets caused by thermoelectric effects (at connections of dissimilar metals) and voltages generated by rectification of radio frequency interference (RFI). Other external error sources include AC line cycle noise, Johnson noise, magnetic fields and ground loops.
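Of the error sources listed, Johnson noise is one that can be estimated from first principles, which helps when budgeting measurement uncertainty. The sketch below applies the standard thermal-noise relation, v = sqrt(4kTRB); the resistance, temperature and bandwidth in the demo are hypothetical values.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23  # Boltzmann constant in J/K


def johnson_noise_vrms(resistance_ohms, temperature_k, bandwidth_hz):
    """RMS thermal (Johnson) noise voltage of a resistance: sqrt(4*k*T*R*B)."""
    if resistance_ohms < 0 or temperature_k < 0 or bandwidth_hz < 0:
        raise ValueError("inputs must be non-negative")
    return math.sqrt(4.0 * BOLTZMANN_J_PER_K * temperature_k
                     * resistance_ohms * bandwidth_hz)


if __name__ == "__main__":
    # Hypothetical case: 1 megohm source at 358 K (85 C) in a 10 Hz bandwidth.
    noise = johnson_noise_vrms(1e6, 358.15, 10.0)
    print(f"Johnson noise: {noise * 1e9:.0f} nV rms")
```

Because the noise grows with the square root of resistance, temperature and bandwidth, high-resistance DUTs at elevated chamber temperatures are the cases where it is most likely to matter.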
When making resistance measurements, test lead resistance and self-heating of the DUT that causes a resistance shift are common error sources. Generally, lead resistance error should be minimized by using the Kelvin measurement method that uses four wires. Pulse testing may be a solution for self-heating.
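The size of the lead resistance error is easy to estimate. The sketch below compares an idealized two-wire reading with an idealized four-wire (Kelvin) reading; the DUT and lead resistances in the demo are hypothetical, and the four-wire case assumes the sense leads carry negligible current.

```python
def two_wire_reading(dut_ohms, lead_ohms_each):
    """Two-wire reading: test current flows through both leads and the DUT."""
    return dut_ohms + 2.0 * lead_ohms_each


def four_wire_reading(dut_ohms, lead_ohms_each):
    """Idealized Kelvin reading: voltage is sensed at the DUT terminals.

    Lead resistance is accepted as an argument only to show that, in this
    idealized model, it does not affect the result.
    """
    return dut_ohms


def lead_error_percent(dut_ohms, lead_ohms_each):
    """Percent error that lead resistance adds to a two-wire reading."""
    if dut_ohms <= 0:
        raise ValueError("dut_ohms must be positive")
    return 100.0 * (two_wire_reading(dut_ohms, lead_ohms_each) - dut_ohms) / dut_ohms


if __name__ == "__main__":
    # Hypothetical case: 10 ohm DUT, 0.5 ohm per lead through chamber cabling.
    print(f"Two-wire:  {two_wire_reading(10.0, 0.5):.2f} ohms")
    print(f"Four-wire: {four_wire_reading(10.0, 0.5):.2f} ohms")
    print(f"Two-wire error: {lead_error_percent(10.0, 0.5):.1f}%")
```

The error is proportionally largest for low-resistance DUTs, which is where the extra wiring and switch channels of the Kelvin method pay off.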
Successful AST requires instruments with adequate resolution and sensitivity to provide accurate, repeatable data. Usually, 6-1⁄2-digit resolution is adequate for most applications; required sensitivity depends on the smallest signal to be measured. Multiple measurements can be taken and averaged to remove the effects of random noise that is superimposed on the DUT signal. Results should be checked periodically to make sure they are credible and meet test objectives.
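The benefit of averaging can be checked with a short simulation. This sketch assumes purely random, Gaussian noise with a hypothetical amplitude; averaging does not remove systematic errors such as offsets or lead resistance.

```python
import random
import statistics


def spread_of_averages(true_value, noise_sd, n_avg, trials, seed=0):
    """Standard deviation of results when each result averages n_avg readings.

    Readings are simulated as true_value plus Gaussian noise. The seed makes
    the simulation repeatable.
    """
    rng = random.Random(seed)
    results = [
        statistics.fmean(rng.gauss(true_value, noise_sd) for _ in range(n_avg))
        for _ in range(trials)
    ]
    return statistics.stdev(results)


if __name__ == "__main__":
    # Hypothetical signal: 1 V with 10 mV rms of random noise.
    for n in (1, 10, 100):
        spread = spread_of_averages(1.0, 0.010, n, trials=200)
        print(f"{n:>3} readings averaged: spread is about {spread * 1e3:.2f} mV")
```

For random noise the spread falls roughly with the square root of the number of readings, so each further improvement costs progressively more test time.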
In most cases, a voltage or current source is applied to the DUT, and then its response to that signal is measured. Switching systems are used for multiple DUTs, so the signal path through switch hardware must not compromise the source or response signal accuracy. Therefore, measurement and switching instruments should be considered together and the following factors taken into account:
- Channel count, associated with lot size and number of signals to be measured.
- Signal levels, both applied and measured.
- Speed, bandwidth and throughput, including limitations imposed by data communications between different instruments.
- Cabling and connectors that are selected for applied signal levels, low noise characteristics and quick connect and disconnect.
- Synchronization and triggering among pieces of measuring equipment to optimize speed and accuracy.
Instrument integration and productivity
AST system integrators have to tackle several issues. For many, measuring AST voltage, current and resistance with a digital multimeter (DMM) might do the job, but it probably will not have all the switching and control functions needed for multiple DUTs. Therefore, separate source, measurement and switching systems would be required, posing integration problems associated with cabling, synchronization, triggering and software.
Fortunately, integrated multimeter and switching systems are available with plug-in modules that provide the flexibility to vary channel counts from 20 to 400, apply a stimulus to DUTs, route signals, control system components and make precision measurements over ranges wider than are possible with most standard DMMs. Instruments are available that provide 14 measurement functions with stable 6-1⁄2-digit accuracy that help reduce yield losses because of false failures.
Other features of multimeter and switching systems include per-channel programmable scan lists, large data buffers, battery-backed memory for secure data storage in case of power interruption, built-in signal conditioning, and scaling and math functions that allow the user to optimize system throughput in automated AST applications. Data communication options typically include Ethernet, GPIB and RS232. Per-channel costs for these systems are usually lower than those of “build-your-own” test systems with equivalent channel count and accuracy.
Data management is another productivity issue that AST system users and integrators must address. As mentioned earlier, many plants require AST chambers and measurement equipment at the end of each production line. Raw test data from all these AST stations typically is routed over the plant communications network to a central repository. The plant’s data management system must provide data mining and analysis tools to create context, particularly for cycle test failures because without meaningful context, this mountain of raw data would overwhelm a user.
For remote and distributed data sharing over LAN/WAN systems, Ethernet-based instruments should have software, usually firmware, which makes integration easy. For example, firmware should provide an embedded IP or home page address for identifying the instrument’s location on the network. It should also have Web browser functions, such as “Send” and “Read” buttons. Besides standard data communications, these functions make it easier to debug measurement and communication problems.
Test application software has a big impact on AST system productivity. This software should help minimize instrument set-up time, provide on-line help utilities and come with a library of instrument, I/O and other device drivers to make programming transparent to the user. An intuitive graphical user interface (GUI) should make it easy to set up and operate the system, and let the operator know what is happening as tests progress. For integrators, software should support commonly used programming languages and facilitate development of the test executive. For maintenance personnel, the software must support calibration and diagnostics for debugging and troubleshooting. Assorted runtime, administrative, security and variance utilities should also be available.
AST execution
Setup and runtime issues influence the payback period on an AST investment. To shorten the payback time and improve ROI, AST hardware and software should facilitate DUT loading and calibration. The operator GUI should be able to present loading instructions, and the software should verify that correct loading has been completed. The DUT loading subsystem should be able to recognize where a DUT needs to be loaded, minimize the chances of an operator or handler error, and initiate error messages when a loading fault occurs.
The first objective of runtime diagnostics is to let an operator know the system is running correctly. There must be a high level of confidence that when the system indicates a DUT failure, there indeed is one, and not a system failure. The diagnostic software should characterize all the major subassemblies in the system. At a minimum, this should include instrument self-tests, verification of mass interconnects and fixture tests.
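One way to build that confidence is to report a DUT failure only after the system's own checks have passed. The sketch below shows the idea; the check names and result strings are hypothetical and would be replaced by whatever self-tests a particular system provides.

```python
def classify_result(dut_passed, diagnostics):
    """Decide whether a failing result points at the DUT or at the test system.

    diagnostics maps a check name to True (passed) or False (failed). The
    names used here are placeholders for real self-test results.
    """
    failed_checks = sorted(name for name, ok in diagnostics.items() if not ok)
    if failed_checks:
        # A system fault makes the DUT verdict untrustworthy, so report it first.
        return "SYSTEM_FAULT: " + ", ".join(failed_checks)
    return "PASS" if dut_passed else "DUT_FAIL"


if __name__ == "__main__":
    healthy = {"instrument_self_test": True, "interconnect": True, "fixture": True}
    bad_fixture = {"instrument_self_test": True, "interconnect": True, "fixture": False}
    print(classify_result(False, healthy))      # failure attributed to the DUT
    print(classify_result(False, bad_fixture))  # failure attributed to the system
```

Checking the system before blaming the DUT guards against the false failures that erode yield.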
Clearly, setting up and operating an AST program involves many details, not the least of which are networking and data management. Using instruments with Ethernet data communications capabilities helps simplify matters. In many cases, an expert system integrator or test chamber manufacturer specializing in AST can shorten development and installation times. In the final analysis, this may be the most cost-effective route and provide the best results. When measured against yield improvements, lower warranty costs and higher customer satisfaction, AST usually is a bargain.