I’m a Type-A personality with a sense of urgency to explain everything. Give me a little data, and I will throw every statistical tool I have at it to explain an observation. But here is something that I cannot explain: why do we tolerate such poor gages?

We should not comment on the quality of a process without accurate data, but we seem incapable of providing it. We have not come far from the days when we checked our oil one dipstick at a time.

Repeatability studies are used to validate the quality of a gage from a variation perspective. It’s a pretty low bar. Take a measurement, repeat the exercise under identical conditions, and if you arrive at the same value, the gage is deemed to pass the first test of reliability. It’s one reason we “measure twice, cut once,” and why we draw the dipstick from the engine block, wipe it off, then measure again.

But as simple as it might seem to pass this initial test, I find many gages fail miserably.

Let’s start with digital thermometers. I came down with the flu recently, and took three measurements from a “deluxe” version of the device, since I had long ago given up on the entry-level product. With this premier product, you sweep the device across your forehead. The first reading indicated my temperature was 96.8°F, below normal. Second attempt, I am now at 100.4°F. Third attempt, I am at 98.7°F. So, am I sick? Not sure, but I am certainly confused.
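
To put a number on that confusion, here is a minimal Python sketch that summarizes the three readings above. It assumes the readings are in degrees Fahrenheit, and the function name `repeatability_summary` is my own, not part of any standard. The range and sample standard deviation are only a rough first look at repeatability, not a formal study.

```python
from statistics import mean, stdev

# The three forehead readings reported above (assumed to be degrees Fahrenheit).
readings = [96.8, 100.4, 98.7]


def repeatability_summary(values):
    """Return the mean, range, and sample standard deviation of repeated readings."""
    return {
        "mean": mean(values),
        "range": max(values) - min(values),
        "stdev": stdev(values),
    }


summary = repeatability_summary(readings)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

The spread between the lowest and highest reading is 3.6 degrees, which is wider than the gap between a normal temperature and a fever. A gage with that much scatter cannot answer the question it was bought to answer.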

If we cannot trust our technology, can we trust our own eyes? As part of seminars I run on performance improvement, I challenge audiences to read a paragraph and to count the number of “f”s in it. This is a test of reproducibility, a higher bar than repeatability, which uses different operators under identical conditions to determine if the resulting measurement is the same. Taken together, repeatability and reproducibility are the two components assessed in what is known as a “gage R&R study.”
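
The two ideas can be separated with a little arithmetic. The sketch below is a simplified, single-item illustration, not a full gage R&R procedure: it assumes three operators each count the “f”s in the same paragraph three times. The counts are hypothetical numbers invented for the example, and the function name `gage_rr` is my own. Repeatability is estimated from the variation within each operator’s repeats, and reproducibility from the variation between operator averages.

```python
from statistics import mean, variance

# Hypothetical data (invented for illustration): three operators each count
# the "f"s in the same paragraph three times.
counts = {
    "operator_a": [6, 6, 5],
    "operator_b": [3, 4, 3],
    "operator_c": [5, 6, 6],
}


def gage_rr(trials_by_operator):
    """Estimate repeatability and reproducibility variance for one measured item.

    Assumes at least two operators, each with the same number of trials (>= 2).
    """
    trials = list(trials_by_operator.values())
    if len(trials) < 2:
        raise ValueError("need at least two operators")
    n_trials = len(trials[0])
    if n_trials < 2 or any(len(t) != n_trials for t in trials):
        raise ValueError("each operator needs the same number of trials (>= 2)")

    # Repeatability: average of the within-operator variances.
    repeatability = mean([variance(t) for t in trials])

    # Reproducibility: variance of the operator averages, minus the part
    # that repeatability alone would produce, floored at zero.
    operator_means = [mean(t) for t in trials]
    reproducibility = max(variance(operator_means) - repeatability / n_trials, 0.0)

    return {
        "repeatability": repeatability,
        "reproducibility": reproducibility,
        "total": repeatability + reproducibility,
    }


result = gage_rr(counts)
for name, value in result.items():
    print(f"{name} variance: {value:.3f}")
```

With these made-up counts, most of the variation comes from differences between operators rather than from any one operator’s inconsistency. A real study would also measure several different parts so that gage variation can be compared with part-to-part variation.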

No matter how much time I give the audience, the variation across participants is stunning. The audience would fail any reasonable standard for determining whether a pair of eyes is a reliable gage. This is one of the reasons I argue that inspection is fundamentally unproductive. We are destined to look right past defects that are screaming to be discovered.

It is important to recognize that measurement errors can cost your company money by rejecting perfectly good products as scrap, or worse, releasing defective products that reach your customers. It is easy to put these costs in financial terms if you have unit cost information and the results of a capability analysis showing your error rates. Once management understands the implications of a poor measurement system, you’ll get the support you need to conduct a competent gage R&R study, the first step toward determining whether your own system is reliable. Keep in mind the gage can be an instrument, it can be our eyes, and it can actually be both at the same time (unreliable eyes inaccurately reading inaccurate measurements, which is not as uncommon as you think).
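
Putting those costs in financial terms can be as simple as the sketch below. Every figure in it is a hypothetical placeholder, and the function name `measurement_error_cost` and its parameters are my own. It assumes you know the rate at which good units are wrongly rejected, the rate at which bad units are wrongly released, and what each mistake costs.

```python
def measurement_error_cost(units, unit_cost, false_reject_rate,
                           false_accept_rate, escape_cost):
    """Estimate the money lost to gage errors over a production run.

    false_reject_rate: share of all units that are good but scrapped.
    false_accept_rate: share of all units that are defective but shipped.
    escape_cost: assumed cost of one defective unit reaching a customer.
    """
    scrap_total = units * false_reject_rate * unit_cost
    escape_total = units * false_accept_rate * escape_cost
    return scrap_total, escape_total, scrap_total + escape_total


# All figures below are hypothetical placeholders, not data from any real process.
scrap, escapes, total = measurement_error_cost(
    units=10_000,
    unit_cost=12.50,
    false_reject_rate=0.02,
    false_accept_rate=0.005,
    escape_cost=200.0,
)
print(f"Good product scrapped: ${scrap:,.2f}")
print(f"Defects released:      ${escapes:,.2f}")
print(f"Total cost of errors:  ${total:,.2f}")
```

Even with a false-accept rate much smaller than the false-reject rate, released defects can dominate the total when each one is expensive, which is why both error types belong in the estimate you show management.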

At the end of the day, the real point of a measurement system analysis is to know if your measurement system is good enough for your needs. It’s worth considering what W. Edwards Deming had to say: “For what purpose are we measuring; if we are going to eat off this table maybe it is clean enough, but if we are going to perform an operation then clean takes on a different meaning.” To respect this view is to have clear and unambiguous definitions for what “clean” (or “within spec”) really looks like.

The fact that we rely on poor gages to measure performance can be demoralizing, but we are not powerless to overcome the limits they place on our decision making. So get out there and demand good data. You need it to do your job.