The 100% QA Myth: The More You Review, The Less You Catch

In 42 years working with calibration laboratories, I have yet to meet one that has solved one stubborn question: how do we verify our own work without verifying every single piece of it?
The honest answer most labs converge on is some version of 100% review — have a reviewer read every outgoing certificate before it leaves the lab. It feels safe. It looks defensible. And if you examine it carefully, it is usually neither.
This is the first article in a three-part series on outgoing-work quality assurance for calibration laboratories. This one makes a single claim: 100% review, as practiced in most calibration labs, is a myth. It does not do what it says on the package. The next two articles in this series explain what to do instead.
Manual review is not free of error
Start with the reviewer. When a human being performs a routine manual inspection task — reading a certificate, checking values against tolerances, verifying a signature — they do not operate at zero error probability. The human reliability literature puts the baseline at roughly 0.1% to 1% error per action under ideal conditions (Swain and Guttmann, 1983). Fatigue, time pressure, and inadequate training act as multipliers of 3× to 10× on that baseline (Williams, 1988).
Think about what this means in a calibration lab. A senior reviewer reading 100 certificates in a working day, at the degraded end of that range, is making errors at a rate comparable to what they are trying to detect. Every extra certificate they read is another chance to let something slip through — and because their attention degrades as the day goes on, the last ten certificates are reviewed with measurably less care than the first ten.
The uncomfortable truth is that volume itself is an error source. A reviewer who checks every certificate is operating in the area where fatigue-driven errors are most likely. Reducing what they have to read by 90% — but making sure they read it carefully — gives better error detection than exhaustive review at reduced attention.
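As a rough illustration of the ranges cited above, the sketch below multiplies the baseline per-action error probability by the fatigue and time-pressure multipliers. Treating the multiplier as a simple scaling of the baseline is an assumption made here for illustration, and the function name is hypothetical.

```python
def degraded_error_rate(baseline: float, multiplier: float) -> float:
    """Scale a baseline per-action error probability, capped at 1.0."""
    return min(1.0, baseline * multiplier)


# Ranges quoted in the text: baseline 0.1% to 1%, multipliers 3x to 10x.
best_case = degraded_error_rate(0.001, 3)    # lowest baseline, mildest multiplier
worst_case = degraded_error_rate(0.01, 10)   # highest baseline, strongest multiplier

print(f"Degraded error rate: {best_case:.1%} to {worst_case:.1%} per action")
```

At the upper end of that range, roughly one action in ten goes wrong, which is the sense in which a fatigued reviewer's own error rate can rival the defect rate being hunted.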
The arithmetic no one runs
Here is a calculation worth running. Take a mid-sized calibration lab with a typical mix of work across primary-standards, secondary electrical, mechanical, and dimensional disciplines. Per-technician throughput varies enormously by specialty: a primary-standards technician calibrating a highly accurate Multi-Function Calibrator may complete one certificate per day, while a secondary-electrical technician calibrating handheld meters may finish fifteen. Averaged across a staff of 20 technicians, a reasonable lab-wide total is 1,500 to 2,000 certificates per month. Call it 1,680 for this calculation. At 15 minutes of review time per certificate, which is thorough but not excessive, the review workload is 420 hours per month.
A working month contains about 168 hours per person. At 420 hours of review demand, you need approximately two and a half full-time-equivalent (FTE) reviewers. Not one. Two and a half, full-time, on review alone. An FTE is the labor-hours equivalent of one person working a standard full-time schedule, so 2.5 FTE could be two dedicated reviewers plus another at half-time, or any arrangement totaling the same hours. Larger commercial labs running multi-discipline services may see two or more times this volume, and review demand scales with it.
Most labs that claim to do 100% review do not employ two and a half FTE reviewers. They employ one person who is also doing other things. The math tells you exactly what is happening in those labs: the reviewer is either skipping certificates they trust, skimming the ones they read, or pushing through the work at 5 minutes per certificate instead of 15. None of those are 100% reviews in any meaningful sense. It is triage wearing a 100% label. And triage, by definition, lets some errors through.
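The arithmetic in this section is easy to rerun with your own numbers. The sketch below uses the figures from the text (1,680 certificates, 15 or 5 minutes each, 168 working hours per person per month); the function names are hypothetical.

```python
HOURS_PER_PERSON_MONTH = 168  # working hours per person per month, as in the text


def review_hours(certificates_per_month: int, minutes_per_certificate: float) -> float:
    """Total reviewer hours needed per month to read every certificate."""
    return certificates_per_month * minutes_per_certificate / 60


def reviewers_needed(hours: float) -> float:
    """Convert monthly review hours into full-time-equivalent reviewers."""
    return hours / HOURS_PER_PERSON_MONTH


thorough_hours = review_hours(1680, 15)  # thorough review
rushed_hours = review_hours(1680, 5)     # what one overloaded reviewer can manage

print(f"15 min each: {thorough_hours:.0f} h, {reviewers_needed(thorough_hours):.2f} FTE")
print(f" 5 min each: {rushed_hours:.0f} h, {reviewers_needed(rushed_hours):.2f} FTE")
```

Only the 5-minute case fits inside one person's month, and that person would have little time left for anything else.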
The cost that nobody lists
Earlier in my career, at an aerospace manufacturer, I managed a metrology engineer who, despite being a full-time employee, spent at least four hours per day on 100% QA review of incoming vendor calibration certificates — roughly $85,000 annually in loaded cost for one person doing redundant review on vendor-provided work. Not fraud detection. Not specialized analysis. Clerical double-checking.
That is half of one experienced engineer’s professional output, skills we were paying for in full, consumed by redundant checking. At that rate a full-time reviewer costs about $170,000 per year in loaded cost, so the 2.5 full-time-equivalent reviewers a mid-sized calibration lab actually requires cost roughly $425,000 per year in reviewer time alone.
That number gets worse when you consider what the reviewers could be doing instead. Every hour spent on 100% review is an hour not spent on calibration work, on uncertainty analysis, on training, on root-cause investigation when a real problem surfaces. The opportunity cost compounds over years.
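For a lab that wants to substitute its own salary figures, the scaling can be sketched as follows. It assumes that four hours per day is half of a standard eight-hour day, so the $85,000 figure represents 0.5 FTE; the function name is hypothetical.

```python
def annual_review_cost(observed_cost: float,
                       observed_fte_fraction: float,
                       required_fte: float) -> float:
    """Scale an observed review cost to the FTE count a lab actually needs."""
    cost_per_fte = observed_cost / observed_fte_fraction
    return cost_per_fte * required_fte


# Figures from the text: $85,000 for half an engineer, 2.5 FTE required.
cost = annual_review_cost(85_000, 0.5, 2.5)
print(f"Estimated annual reviewer cost: ${cost:,.0f}")
```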
What 100% review does not give you
Even if you had unlimited reviewer capacity — even if a team of perfectly rested, perfectly trained reviewers could genuinely read every certificate at full attention — 100% review would still be the wrong tool for the job. Because it does not answer the question your quality management system is actually asking.
100% review produces a count. You found eight errors this month, and you did not find any others. That number has no uncertainty attached. You cannot say whether this month is better than last month with any confidence. You cannot say whether a particular technician's error rate is genuinely elevated. You cannot project what next month will look like. You have an integer, not knowledge.
A statistical sampling plan produces something different. It produces an estimated laboratory error rate with a confidence interval: for example, 2.8% with 95% confidence that the true error rate is between 1.9% and 4.1%. That is testable. It can be compared month to month. It can be used to spot genuine signals amid statistical noise. It tells you what your process is doing, not just what happened in one sample. That interval is the difference between hoping your process is stable and knowing it—with the data to prove it to an auditor or a customer.
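To make the idea of an interval concrete, here is a minimal sketch using the Wilson score interval, one common method for a binomial proportion. The article does not say which interval method it has in mind, so the choice of Wilson is an assumption, as are the sample counts (24 errors in 860 reviewed certificates), which were picked only because they land close to the example figures above.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(errors: int, sampled: int,
                    confidence: float = 0.95) -> tuple[float, float]:
    """Wilson score interval for an error rate estimated from a random sample."""
    if sampled <= 0 or not 0 <= errors <= sampled:
        raise ValueError("need 0 <= errors <= sampled and sampled > 0")
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    p = errors / sampled
    denom = 1 + z**2 / sampled
    center = (p + z**2 / (2 * sampled)) / denom
    half_width = z * sqrt(p * (1 - p) / sampled + z**2 / (4 * sampled**2)) / denom
    return max(0.0, center - half_width), min(1.0, center + half_width)


# Hypothetical sample: 24 certificates with errors out of 860 reviewed.
low, high = wilson_interval(24, 860)
print(f"Estimate {24 / 860:.1%}, 95% interval {low:.1%} to {high:.1%}")
```

The interval narrows as the sample grows, which is what lets one month be compared with the next on a stated statistical basis.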
When an auditor asks how you chose which certificates to review carefully, “the reviewer reads all of them” is not actually a satisfying answer. Because the auditor knows — as you now do — that the reviewer did not read all of them with equal attention. They knew some of them were fine and skimmed. They paid more attention to unfamiliar technicians. They caught fewer errors in the last hour than in the first. The audit answer that actually satisfies ISO/IEC 17025 Clause 7.7 is a documented, reproducible, statistically grounded selection — not a claim of exhaustive review that does not survive scrutiny.
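One way a selection can be made documented and reproducible is to draw it with a recorded random seed, stratum by stratum. The sketch below is an illustration only and is not the sampling plan this series goes on to describe: the strata, the 10% fraction, the seed, and the function name are all hypothetical placeholders.

```python
import random
from collections import defaultdict
from math import ceil


def select_for_review(certificates, fraction=0.10, seed=202401):
    """Pick a reproducible random sample of certificate IDs from each stratum.

    certificates: iterable of (certificate_id, stratum) pairs.
    Returns a dict mapping each stratum to a sorted list of selected IDs.
    """
    rng = random.Random(seed)  # recording the seed makes the draw repeatable
    by_stratum = defaultdict(list)
    for cert_id, stratum in certificates:
        by_stratum[stratum].append(cert_id)

    selected = {}
    for stratum in sorted(by_stratum):       # fixed order keeps the draw stable
        ids = sorted(by_stratum[stratum])
        # Round first so float noise (0.1 * 30 = 3.0000000000000004) does not
        # push the count up by one; always review at least one per stratum.
        k = max(1, ceil(round(fraction * len(ids), 9)))
        selected[stratum] = sorted(rng.sample(ids, k))
    return selected


# Hypothetical month of work: 40 electrical, 20 dimensional, 1 primary-standards.
month = ([(f"E-{i:03d}", "electrical") for i in range(40)]
         + [(f"D-{i:03d}", "dimensional") for i in range(20)]
         + [("P-000", "primary")])
picks = select_for_review(month)
for stratum, ids in picks.items():
    print(stratum, len(ids), ids)
```

Because the seed and the input list are recorded, anyone can rerun the draw later and confirm that the certificates reviewed were the ones the procedure selected.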
Where this leaves us
The problem is not that calibration labs are careless. Every lab manager I have worked with wants their outgoing work to be correct. The problem is that the tool we have historically reached for — read everything — does not deliver what it promises and costs more than most labs realize. We have been using a safety blanket instead of an instrument.
There is a better way, and it comes from the same statistical tradition that gave us measurement uncertainty, traceability chains, and control charts. Stratified sampling, applied thoughtfully, lets you review 10% of outgoing work with a higher real error-detection rate than 100% review, and it produces defensible laboratory quality metrics an auditor will accept: metrics that actually improve over time rather than merely documenting yesterday’s triage. That is the subject of the next article in this series.
In the meantime, when someone in your lab tells you they are doing 100% review, ask them how many hours they spent on it this month. Do the arithmetic yourself. See if the numbers work. If the math does not fit in one person’s workweek, your 100% review is a myth — but the cost of believing in it is real.
Further reading
Swain, A.D. and Guttmann, H.E. (1983). Handbook of Human Reliability Analysis with Emphasis on Nuclear Power Plant Applications, NUREG/CR-1278. U.S. Nuclear Regulatory Commission.
Williams, J.C. (1988). "A data-based method for assessing and reducing human error to improve operational performance." Proceedings of the IEEE Fourth Conference on Human Factors and Power Plants.
ISO/IEC 17025:2017. General requirements for the competence of testing and calibration laboratories. Clause 7.7 (Ensuring the validity of results).
Acknowledgment: This article was developed in collaboration with AI assistants, Anthropic's Claude and xAI’s Grok, for drafting, structural editing, and prose refinement under the author's direction. All metrology and statistics content, claims about laboratory operations, the underlying methodology, and final editorial decisions are the author's own. The author is solely responsible for the final content.
This is Article 1 of 3 in The Calibration-QA Trilogy.
Article 2, "Sampling Smarter, Not Less," explains the statistical alternative in practical terms. Article 3, "When Sampling Beats Counting," covers the technical properties that make a sampling plan defensible under audit.