# Getting Started With Design of Experiments

March 25, 2011

Design of Experiments (DOE) is effective as a troubleshooting method because it helps you understand a process in more detail. Follow these seven steps for conducting experiments.

In a perfect world, we would solve every development problem quickly and with certainty. The results of our experiments would be easy to interpret, and we would always understand our process completely. Unfortunately, here in the real world, it just isn’t that easy. A process may have several inputs, known as factors, and we need to determine which of them are important in controlling the process. Factors often interact with each other: they work synergistically or antagonistically to change a measured process output, called a response. Establishing cause-and-effect relationships between the factors and responses using simple plots and graphs is confusing at best. To understand how factors affect each other and the process responses, we need a powerful yet easy-to-use tool to make our job more manageable.

One of the most important tools in the quality professional’s statistical toolbox for understanding a process is Design of Experiments (DOE). In a type of DOE called Response Surface Methodology (RSM), a process is approximated by a mathematical equation, more commonly known as a model. In most instances, software creates the model from data we generate by varying the input factors in a structured pattern called a design. Once the model has been fitted and verified, any combination of factor values within the design limits can be used to estimate the resulting process response without physically running the experiment again. The process can then also be optimized to meet multiple requirements.

Typical uses of DOE include finding minimum manufacturing costs while maintaining required product specifications, making processes robust to the variability caused by using multiple vendors, minimizing defects, determining the best combination of components in a mixture for a particular use, and finding the best process settings to maximize or minimize one or more product characteristics. Despite these strengths, DOE has not made the inroads that other statistical tools such as Statistical Process Control (SPC) or Gage Repeatability and Reproducibility (GR&R) studies have made. Some people think the statistics are too complex; others don’t know where to start. There are as many reasons for not using DOE as there are people who should use DOE but don’t.

It may come as a surprise to some, but a GR&R study is a designed experiment that has been refined into a common statistical method with standard calculations and interpretation rules. If you have successfully completed a GR&R study, you have already been successful using DOE. If you have process knowledge, follow a basic methodology, understand basic statistics, have the right software, and have the desire to learn, you are on the road to success in DOE and to finding your management the best possible solution to a problem at the least cost.

One of the most important strengths of DOE is that it efficiently covers the possible combinations of the process factors and shows how they affect a process response. Our intuition falsely tells us that there is a simpler way to explore these combinations: change One Factor At a Time (OFAT) and measure the result at each setting. Unfortunately, OFAT has two big weaknesses: it is not thorough, and it ignores interactions.
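To make the second weakness concrete, here is a minimal Python sketch that compares OFAT with a full two-level factorial on an invented process. The `process` function, its coefficients, and the factor names A and B are assumptions made up for illustration; levels are in coded units, where -1 is the low setting and +1 is the high setting.

```python
from itertools import product


def process(a, b):
    """Hypothetical process with an A*B interaction (coded units)."""
    return 50 + 5 * a + 3 * b + 4 * a * b


# OFAT: start at the baseline and change one factor at a time.
baseline = process(-1, -1)
effect_a = process(+1, -1) - baseline  # change A only
effect_b = process(-1, +1) - baseline  # change B only

# OFAT never runs (+1, +1); it can only add the two separate effects,
# which silently assumes the factors do not interact.
ofat_prediction = baseline + effect_a + effect_b

# Full factorial: run every combination of the two levels.
factorial_runs = {(a, b): process(a, b) for a, b in product((-1, +1), repeat=2)}

print("OFAT prediction at (+1, +1):", ofat_prediction)          # 46
print("Measured response at (+1, +1):", factorial_runs[(1, 1)])  # 62
```

In this invented case OFAT would suggest that raising B slightly lowers the response, yet the best result occurs with both factors high. Only the factorial design, which runs that combination, reveals it.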

In place of one-factor-at-a-time settings, DOE uses a study design: a set of strategically chosen factor values for you to run on your process so that you can record the corresponding process results. Because of the specific nature of your process, the design is most often generated by a computer program. The factor values in the design, called levels, and their corresponding results are then fitted to a model using software. Once the model is created, it is evaluated by running the factors at specific levels to verify that it predicts the correct values for the responses. Because the software creates a model, any combination of factor levels within the design limits may be evaluated without physically running additional experiments, which saves both time and money. The model also allows you to optimize certain characteristics of the process, such as minimizing cost, obtaining the maximum hardness, or whatever your goals for the experiment were. The resulting factor settings are called the sweet spot: the combination of factors that best meets your goals. DOE also provides another, equally important piece of information: the analysis will tell you if what you would like to obtain cannot be accomplished within the current boundaries for the factors or the process design. By using DOE and not OFAT, you can be confident that you did not miss the desired result when you analyze your data.
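As a rough sketch of how a design and its results become a model, the Python below fits an interaction model to a two-factor, two-level design and then predicts a setting that was never run. The factor names, the hardness values, and the confirmation measurement are all hypothetical. The simple sign-weighted averaging works only because the design is a balanced two-level factorial in coded units (-1 = low, +1 = high); DOE software uses more general regression.

```python
# Hypothetical results from a two-factor, two-level design in coded units.
# Keys are (temperature, time) levels; values are measured hardness.
runs = {
    (-1, -1): 60.0,
    (+1, -1): 72.0,
    (-1, +1): 54.0,
    (+1, +1): 90.0,
}
n = len(runs)

# For a balanced two-level design, each coefficient is the average of the
# responses weighted by the sign of the corresponding model term.
b0 = sum(runs.values()) / n
b_temp = sum(t * y for (t, _), y in runs.items()) / n
b_time = sum(s * y for (_, s), y in runs.items()) / n
b_inter = sum(t * s * y for (t, s), y in runs.items()) / n


def predict(temp, time):
    """Interaction model: intercept + main effects + two-factor interaction."""
    return b0 + b_temp * temp + b_time * time + b_inter * temp * time


# Predict a setting inside the design limits that was never run.
predicted = predict(0.5, -0.5)

# Hypothetical confirmation run at the same setting, used to verify the model.
confirmed = 71.4
print(f"predicted={predicted:.1f}, measured={confirmed:.1f}, "
      f"difference={confirmed - predicted:+.1f}")
```

Whether a difference of that size is acceptable depends on the noise in the process, which is why replicated runs matter when the data are collected.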

A general methodology developed by the late statistician Dave Doehlert will help assure your success in performing DOE. There are seven steps in this process.

## Step One: Ask the Right Question

This is the first and most important step. Discuss the question with the customer who requested the experiment to make sure it is accurately formulated and the results will meet their needs. The right question states the settings for the input factors as well as the resulting process values required to meet the objectives. Most people find an Ishikawa diagram and a team approach helpful for identifying factors.

During this step, you should also make sure your gauging process is adequate, select the factors, consider substitute responses if the response cannot be measured directly, and be sure any assumptions are correct.

Not all process factors need to be controlled, and many cannot be controlled. Uncontrolled factors can cause false response values if the design is not run in a random order, or can obscure the variation caused by the controlled factors.

## Step Two: Choose a Model

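The model should be able to predict the result for every combination of factor settings. Generally, models are categorized as ranging from simple to complex. The simplest model, a main effects model, fits the data to a flat plane without any curvature and is useful for screening designs. More complex models, such as the interaction and quadratic models, allow shapes such as saddles and domes, respectively. Usually, the more complex the model, the better it can fit the data, at the expense of having to gather more results at the various factor levels. If a simpler model does not fit the data, more factor combinations and their results may be added through augmentation, a model-building process that goes from simple to complex by using ever-increasing amounts of experimental data.

For all practical purposes, there are three types of models, ranging from simple to complex:

- The main effects model lets you select the most important factors from a long list of factors you think may be important, and it requires the least amount of data. This is often referred to as screening.
- The interaction model provides the capabilities of the main effects model with an important addition: it shows how one factor may affect another in an antagonistic or synergistic way. Interactions happen frequently among factors, so the interaction model is generally considered worth the extra cost and effort to make sure they are not missed.
- The quadratic model includes the capabilities of the simpler models plus the ability to represent curvature, which allows finding minimum and maximum values of a response. As might be expected, it requires more data than the interaction model. Use the quadratic model for your best work or if the simpler models do not fit the data.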
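Written out for k factors in coded units, the three model types usually take the standard forms below. The notation is introduced here for illustration and is not tied to any particular software: y is the response, x_1 through x_k are the factor levels, the β terms are the coefficients the software estimates, and ε is the noise.

```latex
\begin{align*}
\text{Main effects:} \quad & y = \beta_0 + \sum_{i=1}^{k} \beta_i x_i + \varepsilon \\
\text{Interaction:} \quad & y = \beta_0 + \sum_{i=1}^{k} \beta_i x_i
    + \sum_{i<j} \beta_{ij} x_i x_j + \varepsilon \\
\text{Quadratic:} \quad & y = \beta_0 + \sum_{i=1}^{k} \beta_i x_i
    + \sum_{i<j} \beta_{ij} x_i x_j + \sum_{i=1}^{k} \beta_{ii} x_i^2 + \varepsilon
\end{align*}
```

Counting coefficients shows why the more complex models need more data: the three forms have 1 + k, 1 + k + k(k-1)/2, and 1 + 2k + k(k-1)/2 coefficients, which for three factors is 4, 7, and 10. A design needs at least as many distinct runs as the model has coefficients.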

## Step Three: Choose an Experiment Design

The model dictates which design you choose. In the past, designs were published for standard experiments, and they seldom met real-world needs. Some software can create custom designs specific to your needs, and it is best to purchase software with that capability. Computer-generated designs often require less data to be collected than published designs and, perhaps more importantly, can be tailored to your situation. Custom design capability is a must-have in your software and adds negligible cost. If you have factors that you expect to affect the process but do not want to study, their effects can be removed mathematically using a technique called blocking. Blocking prevents nuisance factors from confusing your results.

## Step Four: Collect Your Data

From your design, run the experiment by collecting the data using the factor values required by the design table. You are trying to establish a cause-and-effect relationship: the cause of your results changing is your factor levels changing. To do this, you must account for all of the factors, even those you don't know about, by following all three of these rules when you collect your data:

1. Conduct your design in a random order. This will account for the factors you don't know about.
2. Hold every factor you are not varying in your design constant.
3. Execute every trial in your design.

You will also need to understand the uncertainty in your results. This uncertainty is caused by noise in your data. The only way to estimate it is to repeat a few of the runs, which is known as replication. If your repeated measurements show a large amount of variation, indicated by a large variance, you may not have accounted for an important factor in your design, and randomization has folded its effect into the unexplained variation. For example, perhaps you should have used blocking on a factor but did not.
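A randomized run sheet with a few replicated runs can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the factor names, the two-level full factorial design, the choice of two replicates, and the seed are all hypothetical, and real DOE software would normally produce the run sheet for you.

```python
import random
from itertools import product

factors = ("temperature", "time")  # hypothetical factor names
# Two-level full factorial in coded units: -1 = low, +1 = high.
design = list(product((-1, +1), repeat=len(factors)))

rng = random.Random(2011)  # fixed seed so the run sheet can be reproduced
replicates = rng.sample(design, k=2)  # repeat a few runs to estimate noise
run_sheet = design + replicates
rng.shuffle(run_sheet)  # randomize the order in which trials are executed

for run_number, settings in enumerate(run_sheet, start=1):
    print(run_number, dict(zip(factors, settings)))
```

Every trial on the sheet is then executed in the printed order, with all factors outside the design held constant.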

## Step Five: Analyze the Data

Modern software takes the complexity out of performing the data analysis. Occasionally the variation is not consistent across the study; in that case the data must be transformed mathematically, because most software requires the estimation errors to be normally distributed. Once this is done, the best combination of factor settings that meets your objectives can be easily found using the software.

## Step Six: Test Your Model and Sweet Spot

This is the moment of truth. Using various combinations of the factors, use the software model to predict what the resulting process value would be. Then actually run the same values on the process and compare the results. If the model does not predict well, you may need a more complex model, possibly switching from an interaction model to a quadratic model, for example. In sequential DOE, by carefully choosing some of the test combinations from the designs for higher-level models, you can gather the information for a more complex model at the same time as you test the current one. If an optimal value is desired, be sure to test the predicted sweet spot. If you had to transform the data, be sure to use an inverse transformation to obtain the original units if needed.

## Step Seven: Take Pride in a Job Well Done

Because you followed a systematic approach to your experiment, you have done the best job possible to find the answer to your questions. In some instances you may find that you need to “be bold” and expand your allowable factor settings because the analysis indicates that the goals cannot be met within the current ones.

DOE provides a systematic method of understanding a process with minimal cost and effort while assuring that the best settings for the factors are not missed. Once the model is validated, you can find the factor combinations that best fit your goals, knowing that all the factor combinations have been evaluated. Using the seven-step experiment methodology, you can be sure you did the best work possible. Success in experimental design does not require knowledge of advanced statistics, thanks to the power of the software, extensive literature, and available training.