A statistical process control (SPC) program is only as good as its data. Data can point out problems, tell their causes and how often they happen. Data can show how much variation is in the process, and when the process is out of control. It can lay the groundwork for action.

To do all this, it has to be the right kind of data for the purpose. It has to represent the population it is supposed to represent, and it has to be organized. If the data fails these criteria, it can lead to wrong conclusions and the wrong type of action.

The types of data most common for statistical process control are variable and attribute. Data falls into one of these groups based on the way it’s collected. Variables are data that’s measured, such as length, weight, temperature and diameter. They can be any whole number or fraction.

Attributes are counted data, such as the number of defects. They are often a record of go/no-go, pass/fail or yes/no. Either the part has a defect or it does not. Because they are tallies, they must be whole numbers such as 1, 154 or 68. The operator or QC manager would not record a half of a tally. The attribute value would be the total number of tallies. Data in either of these groups can also be classified by its purpose. This includes data for analysis, regulation, process control, and acceptance or rejection.

Data for analysis is used to study past results, make new tests and study the relationship between causes and their effects. Data for acceptance or rejection is point-of-inspection, go/no-go data. Data for regulation is used to adjust the process, and calls for direct action, such as temperature changes. Data for process control shows if the process is in or out of control, and shows process trends.

## Characteristics

To control the process, users need to collect, analyze and act on data from the characteristics that make up both the process and the part. A characteristic is a feature of a part or its process, such as the dimension, speed, hardness, smoothness, flatness or weight.

Before collecting the data, decide which characteristics are most important for improving product quality. Keep in mind that it is okay to change characteristics at any time. After these characteristics are brought under control and are consistently producing the output wanted, it is possible to improve overall quality by controlling other characteristics.

When looking at each characteristic, consider the type of data that can be obtained from it, how it will be measured and at what point of the process it will be measured. It is also important to know if the results can be proven, what can be learned from the data, and if it can be acted on.

Before collecting data, determine what the purpose is. Is it to control the process, correct a problem or analyze the process? The purpose points the way to the kind of data needed, where to collect it and how to organize it.

After identifying the purpose, decide the extent and the objectives of the study. Then decide what type of data is needed from which characteristic. Keep in mind that it isn’t enough just to collect data. To reach a conclusion, it needs to be understood. Therefore, it is important to know how to analyze the data and what data will make the analysis accurate before collecting it. It is equally important to decide how the data will be collected. Consider what collection method will most clearly show the problem’s cause or the process trends.

Because it is seldom feasible to test every item in a group, most studies are based on random samples. How users sample their universe determines their view of it, so the samples must be random. If they aren’t, the user won’t have an accurate picture of the universe. The only way to ensure random sampling is to develop a plan for sampling the data before beginning to collect it.

## A sampling plan

With sampling, collect data on a number of items in the group, and apply the results of this study to the whole group. When the plan is solid with enough random samples, the results of the study will accurately reflect the whole group.

There are several things to consider when developing a sampling plan. The goal of sampling is to get information that accurately reflects the population. First, identify what needs to be controlled, then decide what sampling method to use, how often to take samples, where they should come from, and how many will represent the group. For some studies, when to take a sample or the production order may be important. For example, if the goal is to detect a change that won’t last long, the time between samples should be short.

The sampling method used depends on the type of data needed. For attribute data, samples are lot-by-lot. Samples from each group are inspected and defects are tallied. Variable data comes from continuous process samples. This type of sampling involves taking measurements of random items in the process as it is running.

How often samples are taken depends on what is being studied. For attributes, samples should be taken for each lot. For variables, consider the nature of the process as well as the purpose of the study. Samples may be needed every five minutes, hourly, daily or during each shift. The goal is to take samples often enough to get an accurate picture for the study.

Where the samples come from refers to the point in the process where the measurements are taken. Again, the purpose of the study determines this. For a count of the defects, the samples will be post-production. For variable data, where the samples come from depends on what data will reveal the most information about the process. This depends on the purpose, the characteristic, and the process. If the sample consists of readings of consecutive parts, it captures that specific time in the process. If only a summary of events over time is needed, the readings can be from random parts.

The group of samples taken from a population must have all the characteristics that are in that population. Therefore, how many samples are taken depends on how many will give an accurate picture of the population.

In a random sample, every item in the population has an equal chance of being taken. In a biased sample, every item doesn’t have an equal chance. Only taking the items that can be easily reached will give a biased sample. So will selecting only those with obvious defects. If the bias is small, an accurate picture of the population can still be taken, but there is no way to know the amount of bias. Design sampling plans to avoid bias.
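The difference between a random and a biased draw can be sketched in a few lines of Python. This is only an illustration: the lot of 200 serial-numbered parts, the sample size of 10 and the helper name `draw_random_sample` are all assumptions made for the example, not part of any particular sampling plan.

```python
import random

def draw_random_sample(population, sample_size, seed=None):
    """Pick sample_size distinct items so every item has an equal chance."""
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return rng.sample(population, sample_size)

# Hypothetical lot of 200 parts identified by serial number.
lot = list(range(1, 201))

# Random sample: any part in the lot can be chosen.
random_sample = draw_random_sample(lot, 10, seed=42)

# Biased sample: only the ten parts that are easiest to reach.
biased_sample = lot[:10]

print("random:", sorted(random_sample))
print("biased:", biased_sample)
```

The biased draw can only ever see the first ten parts, so anything that changes later in the lot goes unnoticed.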

When developing a sampling plan, decide how many readings to take for each sample. The number of readings, or sample size, determines how much variation the control chart will reflect. An increase in the sample size causes a decrease in the variation between samples. A sample size increase also increases variation within a sample.

In an X-bar chart, variation decreases as the sample size increases. Because there is less variation, the control limits are tighter. Tighter control limits make the chart more sensitive.
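A short calculation shows why the limits tighten. The sketch below assumes conventional three-sigma limits and a process mean and standard deviation that are already known; in practice these are usually estimated from the samples themselves. The values 10.0 and 0.3 and the function name `xbar_limits` are hypothetical.

```python
import math

def xbar_limits(process_mean, process_sigma, sample_size):
    """Three-sigma limits for sample averages: mean +/- 3*sigma/sqrt(n)."""
    half_width = 3 * process_sigma / math.sqrt(sample_size)
    return process_mean - half_width, process_mean + half_width

# Hypothetical process: mean 10.0, standard deviation 0.3.
for n in (2, 4, 9):
    lcl, ucl = xbar_limits(10.0, 0.3, n)
    print(f"n={n}: LCL={lcl:.3f}  UCL={ucl:.3f}  width={ucl - lcl:.3f}")
```

Because the half-width is divided by the square root of the sample size, going from two readings per sample to nine cuts the distance between the limits by more than half.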

When first bringing a process under control, use a small sample size, such as two, and take samples frequently. This way, the chart will show some out-of-control points, but not enough to overwhelm the user. If a majority of the samples are outside the limits, cut the sample size. After the causes of the outside points have been eliminated and the process stabilized, increase the sample size to find more variation. As the control limits are tightened and problem causes eliminated, processes will improve.

## Problem-solving techniques

The first step toward solving a problem is defining it. This makes the objective clear for everyone involved so they can focus on finding a solution. The second step toward solving a problem is to determine its cause or causes.

After defining a problem and finding its causes, work can be started on correcting the causes. It’s important to consider the solution’s impact on other parts of the process before adopting the solution. Equally important is how to prevent the problem from happening again. This is the idea behind SPC: preventing problems instead of detecting and solving them.

There are several tools that make problems easier to define and solve.

- Pareto analysis
- Cause-and-effect diagrams
- Scatter diagrams
- Histograms
- Run charts

Sometimes the hardest part of solving problems is deciding which one to tackle first. Pareto analysis is a way to prioritize problems by looking at their cost and frequency, and it helps determine which causes are the biggest.

The theory behind a Pareto analysis is that a few production problems cause the most damage and a large number of problems do the rest. The goal of Pareto analysis is to clearly identify which problems could represent the largest potential savings. Project team members use Pareto to analyze problems and develop a schedule for attacking them. They also use it to show how the process has improved over time.

Pareto breaks problems into a series of categories, with a common denominator running through each. In most cases this denominator is dollars, because most problems reflect added costs for a company. However, if costs are about the same for each problem area, focus on how often each problem occurs.
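As a rough sketch, a Pareto table can be built by ranking the categories by cost and keeping a running percentage of the total. The problem names and dollar figures below are invented for illustration, as is the function name `pareto_table`.

```python
def pareto_table(cost_by_problem):
    """Rank problems by cost and add the cumulative percent of total cost."""
    total = sum(cost_by_problem.values())
    ranked = sorted(cost_by_problem.items(), key=lambda item: item[1], reverse=True)
    table = []
    running_cost = 0
    for problem, cost in ranked:
        running_cost += cost
        table.append((problem, cost, 100 * running_cost / total))
    return table

# Hypothetical monthly scrap and rework costs, in dollars.
scrap_costs = {
    "burrs": 1000,
    "porosity": 5000,
    "mislabeled": 400,
    "off-size bore": 3000,
    "scratches": 600,
}

for problem, cost, cumulative in pareto_table(scrap_costs):
    print(f"{problem:<14} ${cost:>5}  {cumulative:5.1f}%")
```

In this made-up data the top two categories account for 80 percent of the cost, which is the kind of pattern a Pareto analysis is meant to expose. If costs were about equal, the same function could be fed counts of occurrences instead of dollars.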

After a problem is defined, its causes must be determined. Cause-and-effect diagrams, also known as Ishikawa or fishbone diagrams, show how to sort out and relate factors affecting quality. By illustrating how each cause relates to the effect, this diagram guides problem-solving efforts to the disease, not the symptoms.

Cause-and-effect diagrams break the causes into several categories and then subdivide these further when they become too complex. Most major causes can be categorized as materials, equipment, workers, methods, measurement, management and the environment.

A group approach is the most effective way to create a cause-and-effect diagram. The goal is to identify all of the causes that relate to the effect, and groups usually come up with more ideas than individuals do.

Scatter diagrams show if there is a relationship between a cause and the effects, or between two causes. They can reveal if an increase in one variable increases, decreases, or has no effect on the other one.
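A scatter diagram is read by eye, but the same relationship can be summarized with a number. The sketch below uses the Pearson correlation coefficient, which is one common way to do this and is not something the diagram itself requires. The oven temperatures and hardness readings are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Correlation from -1 (inverse) through 0 (no linear link) to +1 (direct)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical paired readings: oven temperature and part hardness.
temperature = [150, 160, 170, 180, 190]
hardness = [41, 44, 46, 49, 52]

r = pearson_r(temperature, hardness)
print(f"r = {r:.3f}")
```

A value near +1 means the two variables rise together, a value near -1 means one falls as the other rises, and a value near zero suggests no straight-line relationship.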

Histograms illustrate how often a range of measurements has occurred. They also show how the data distribution relates to the specifications and if data falls outside the specification limits. By showing the shape, central value, and dispersion of a group of measurements, histograms can tell us a lot about the behavior of a process.
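A minimal sketch of the tallying behind a histogram follows. The readings are hypothetical diameters recorded as whole numbers of hundredths of a millimeter, the specification limits of 995 to 1005 are assumed, and the function names are made up for the example.

```python
def histogram_counts(measurements, low, bin_width, bin_count):
    """Tally how many readings fall in each equal-width bin starting at low."""
    counts = [0] * bin_count
    for value in measurements:
        index = (value - low) // bin_width
        if 0 <= index < bin_count:
            counts[index] += 1
    return counts

def count_out_of_spec(measurements, lower_spec, upper_spec):
    """Count readings that fall outside the specification limits."""
    return sum(1 for value in measurements if not lower_spec <= value <= upper_spec)

# Hypothetical diameters in units of 0.01 mm.
diameters = [998, 1001, 1003, 999, 1002, 1000, 1004, 1001, 1007, 1000, 996, 1002]

counts = histogram_counts(diameters, low=994, bin_width=2, bin_count=7)
outside = count_out_of_spec(diameters, lower_spec=995, upper_spec=1005)

for i, count in enumerate(counts):
    start = 994 + 2 * i
    print(f"{start}-{start + 1}: {'#' * count}")
print("out of spec:", outside)
```

The printed bars give a quick text version of the chart: the tallest bar marks the central value, the spread of bars shows the dispersion, and the out-of-spec count flags readings beyond the limits.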

How close the actual distribution is to a normal curve can tell a lot about a process. Although it is often apparent if distribution is close to normal, the subtle shifts that represent a process problem can’t always be identified. This is especially true when trying to compare histograms.

To address this, statisticians have developed several methods for testing the data for normality. Among these are tests for skewness and kurtosis, and Chi-square tests. With these tests, differences in data distributions that have the same mean and the same standard deviation can be detected. This type of analysis can show us if process improvements are effective.

Skew is the difference between the mean and the mode. Tests for skewness measure the symmetry of the curve. If the skew factor is zero, there is no skew. If the skew factor is positive, the mean is larger than the mode. With a negative skew, the mode is larger than the mean. The skew factor can show if the process has a tendency to lean toward upper or lower specification limits.

There are also situations where the standard deviation, the mean, and the skew are the same for two distributions, but one chart has a flat curve and the other a peaked curve. The degree of flatness of the curve is known as the kurtosis.
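Both shape measures can be computed directly from the data. Note that the sketch below swaps in the moment-based coefficients of skewness and excess kurtosis, which are widely used but are not the mean-minus-mode definition given above. The sample data and the function name `shape_statistics` are hypothetical.

```python
def shape_statistics(values):
    """Return (skewness, excess kurtosis) from the central moments of the data."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # variance
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0  # a normal curve scores zero
    return skewness, excess_kurtosis

symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 1, 1, 2, 10]

print(shape_statistics(symmetric))
print(shape_statistics(right_tailed))
```

With these coefficients, a skewness of zero indicates a symmetric curve and a positive value indicates a longer tail toward the high side. A negative excess kurtosis points to a curve flatter than normal, and a positive value to one that is more peaked.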

The Chi-square test shows how well the actual distribution fits the expected one. These tests are often used to determine the likelihood of a distribution.
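The statistic itself is simple arithmetic: for each histogram bin, square the gap between the observed and expected counts, divide by the expected count, and add up the results. The counts below are invented, and the comparison assumes a 5 percent significance level with three degrees of freedom (four bins minus one, with no parameters estimated from the data), for which the critical value is 7.815.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all bins."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical counts in four bins; the expected counts come from an
# assumed distribution that spreads 100 readings evenly across the bins.
observed = [18, 22, 20, 40]
expected = [25, 25, 25, 25]

statistic = chi_square_statistic(observed, expected)

# Assumed critical value: 5 percent level, 3 degrees of freedom.
CRITICAL_VALUE = 7.815
poor_fit = statistic > CRITICAL_VALUE

print(f"chi-square = {statistic:.2f}, poor fit: {poor_fit}")
```

A statistic above the critical value suggests that the actual distribution does not match the expected one. The degrees of freedom drop further when the expected counts depend on a mean or standard deviation estimated from the same data.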

Run charts provide a way to study the stability of a process and to detect process trends. Because they reflect the process over time, data is plotted on the chart in the order that it was produced.
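One simple signal to look for on a run chart is a long streak of points on the same side of the center line. The sketch below finds the longest such streak, using the median as the center. How long a streak must be before it counts as a signal varies by company and textbook, so no threshold is built in. The readings and the function name are hypothetical.

```python
import statistics

def longest_run_one_side(values, center):
    """Length of the longest streak of consecutive points above or below center."""
    longest = 0
    current = 0
    previous_side = 0
    for value in values:
        side = (value > center) - (value < center)  # +1 above, -1 below, 0 on the line
        if side != 0 and side == previous_side:
            current += 1
        elif side != 0:
            current = 1
        else:
            current = 0  # a point on the center line breaks the streak
        previous_side = side
        longest = max(longest, current)
    return longest

# Hypothetical readings, listed in the order they were produced.
readings = [10, 12, 9, 11, 10, 13, 14, 15, 14, 16]

center = statistics.median(readings)
print("median:", center)
print("longest run on one side:", longest_run_one_side(readings, center))
```

Here the first five readings sit below the median and the last five above it, a pattern that suggests the process level shifted partway through the run.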

## Why SPC?

Using these techniques and others can help a company succeed in a global market that depends on quality.

The emphasis should be on attaining consistent high quality. It isn’t enough to produce quality sporadically; one bad product can hurt a company’s future sales. Inconsistent quality is also more expensive because bad parts have to be reworked or even scrapped. On the other hand, when quality improves, productivity improves, costs fall and sales go up.

Excerpted from the book, *The Book of Statistical Process Control,* by Zontec Press (Cincinnati). The sponsoring editor for the book is Warren Ha and the editing supervisor is Richard Morris. To obtain a copy of the book or for more information on statistical process control, contact Zontec at (513) 648-0088 or via the Internet at www.zontec-spc.com.

## Sidebar: Tech tips

1. An SPC program is only as good as its data. Data can show variation and out-of-control processes.

2. To control the process, collect, analyze and act on data from the characteristics that make up the process and the part.

3. Most studies are based on random samples. The goal of sampling is to get information that accurately reflects the population.

4. The first step toward solving a problem is defining it. The second step is to determine its cause. Tools used to define and solve a problem include Pareto analysis, cause-and-effect diagrams, scatter diagrams, histograms and run charts.