Before AI Can Help, the Data Has to Be Ready
Manufacturers who want their data to support AI in the next several years need to make decisions now about how they collect and store information.

Manufacturers increasingly want to use artificial intelligence (AI) and advanced analytics to improve quality and reduce waste. But experts who work with manufacturers on these projects say companies often can’t get useful results from AI because the data feeding it is incomplete, inconsistent or inaccessible — and many manufacturers don’t realize that until they’re already trying to implement it.
The problem starts at collection
The first challenge is that many manufacturers don’t collect enough data, or collect it in ways that introduce errors from the start. In some facilities, workers still record measurements by hand, which creates opportunities for mistakes that ripple through any analysis built on top of them. “Errors are intrinsic to manual data entry,” said Andy Duvall, sales director at MicroRidge. Bad handwriting, typos and data entered into the wrong fields all introduce inaccuracies. “A wrong value will negatively affect any AI models,” he said. “It can lead to incorrect analysis and send a team chasing a phantom issue. At best, your scrap rate increases; at worst, you proliferate non-conforming material, undetected.”
Where data gets collected also matters. When quality teams only measure a part after all operations are complete, they lose the ability to pinpoint where in the process something went wrong, Duvall said. Adding measurement points earlier and throughout the process gives analytical models more variables to work with and makes the resulting insights more actionable.
Anneke van der Linde, cofounder of HAI, a company that has developed an AI-ready operational digital twin for process industries, describes another common scenario: data that exists but is effectively out of reach. Lab results live in one portal, operator measurements in another, raw material data in an enterprise resource planning (ERP) system and process records on paper. “Data is locked in silos, scattered over multiple databases,” she said. Even when manufacturers have been collecting data for years, that fragmentation can make it nearly unusable for advanced analysis.
What good data actually looks like
Clean data is necessary but not sufficient. Van der Linde said good-quality data is validated, structured and, most importantly, contextualized. A viscosity measurement only becomes meaningful when a technician or system pairs it with information about where and when it was taken, which product was being produced, the relevant specifications and the associated batch or process order. Without that context, the number exists but doesn’t tell anyone anything useful. It should also be possible to correlate that measurement with other parameters, she said, such as temperature or the fill level of the final product.
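As a rough illustration of what "contextualized" can mean in practice, the sketch below stores a viscosity reading together with the context van der Linde lists. The class name, field names and sample values are hypothetical and chosen only for illustration; they do not come from HAI or any particular system.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Measurement:
    """One reading plus the context that makes it interpretable (hypothetical schema)."""
    parameter: str       # what was measured
    value: float
    unit: str
    taken_at: datetime   # when it was taken
    location: str        # where it was taken
    product: str         # which product was being produced
    batch_id: str        # associated batch or process order
    spec_low: float      # lower specification limit
    spec_high: float     # upper specification limit

    def in_spec(self) -> bool:
        # True when the value falls inside the specification limits (inclusive).
        return self.spec_low <= self.value <= self.spec_high


# Illustrative values only; not taken from any real process.
reading = Measurement(
    parameter="viscosity",
    value=412.0,
    unit="mPa·s",
    taken_at=datetime(2024, 1, 15, 8, 30, tzinfo=timezone.utc),
    location="mixing-tank-2",
    product="example-product",
    batch_id="B-0001",
    spec_low=380.0,
    spec_high=450.0,
)
print(reading.in_spec())
```

Because the batch, time and location travel with the value, the same record can later be joined to temperature or fill-level readings for that batch.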
Consistency matters just as much as completeness. The best data comes from sources that limit variables as much as possible, Duvall said: the same instrument, the same environment, the same method, measuring the same feature every time. Automated data collection helps achieve that by removing manual entry errors and letting operators focus on measuring correctly rather than on recording the result.
Nomenclature creates its own problems, said Christopher J. Campbell, CEO of AssetSmart. Software systems rarely name data fields the same way. One database might use “Manufacturer” where another uses “Make.” One might call it “Measurement Units” while another uses “UOM.” Those disparities make analysis and reporting much harder, particularly across large organizations where teams may describe the same item differently across different systems. Insufficient decimal precision compounds the problem, he said. Rounding or truncation of values can introduce errors that grow significant over time as they move through calculations.
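Both problems Campbell describes can be sketched in a few lines. The alias table below is a hypothetical mapping built from the field names mentioned above, and the rounding example uses made-up numbers to show how a small per-value loss accumulates; neither reflects a specific AssetSmart feature.

```python
from decimal import Decimal

# Hypothetical alias table: maps each system's field name to one canonical name.
FIELD_ALIASES = {
    "manufacturer": "manufacturer",
    "make": "manufacturer",
    "measurement units": "measurement_units",
    "uom": "measurement_units",
}


def canonicalize(record: dict) -> dict:
    """Rename known aliases to canonical field names; keep unknown fields as-is (lowercased)."""
    result = {}
    for key, value in record.items():
        name = key.strip().lower()
        result[FIELD_ALIASES.get(name, name)] = value
    return result


def rounding_drift(value: Decimal, places: Decimal, count: int) -> Decimal:
    """Difference between summing `count` exact values and summing their rounded copies."""
    exact_total = value * count
    rounded_total = value.quantize(places) * count
    return exact_total - rounded_total


print(canonicalize({"Make": "Acme", "UOM": "mm"}))
print(rounding_drift(Decimal("0.1234"), Decimal("0.01"), 1000))
```

In the second call, storing each value to two decimal places loses only 0.0034 per reading, but over 1,000 readings the totals differ by 3.4.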
Historical data presents a related challenge. Even when older data exists and is accessible, Phil Mason, vice president of business development at Hertzler Systems, said most older data lacks the traceability needed to give analysis real meaning. Without knowing the conditions under which a measurement was taken, the data may be technically accurate but analytically limited.
Where integrity breaks down
Even manufacturers with reasonably good collection practices lose data integrity between measurement and analysis. Duvall said the translation point between the physical and digital world is a common bottleneck. Gages and measurement instruments have improved substantially. Software for analysis has also advanced. But the layer in between, where physical measurements become digital records, often doesn’t receive the same attention. “A system is only as good as its weakest link,” he said, “and data collection must be as much a priority as automation, software, and gaging.”
Human factors contribute as well. Mason described a manufacturer that had resorted to reviewing security camera footage to verify whether workers were completing data checks at all. The issue wasn’t the sophistication of their systems, he said. It was integrity. “People want to know if they are getting better,” Mason said. “If it takes too long to know the score, the opportunity is lost and the system weakens.” AI won’t fix that.
The consequences of poor data integrity aren’t always obvious until something goes wrong. Van der Linde described a manufacturer that tried to use AI to determine optimal settings for a spray drying tower, drawing on moisture measurements stored in an ERP system. The AI outputs didn’t align with process logic, and further investigation revealed that the moisture measurements for the inlet and outlet streams had been swapped. The data existed, had been collected and was accessible, but it was wrong, and the error wasn’t apparent until the AI results made no sense.
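A simple plausibility check run before modeling can surface this kind of error earlier. The sketch below assumes the moisture values describe the material being dried, so inlet moisture should exceed outlet moisture; that physical assumption, the column names and the sample rows are all hypothetical and would need to be confirmed against the actual process.

```python
def find_suspect_rows(rows: list[dict]) -> list[int]:
    """Return the indices of rows where outlet moisture is not below inlet moisture.

    Assumes drying removes moisture from the material, so a row that violates
    inlet > outlet is flagged for review rather than silently corrected.
    """
    return [
        index
        for index, row in enumerate(rows)
        if row["outlet_moisture_pct"] >= row["inlet_moisture_pct"]
    ]


# Illustrative rows only; the second one looks swapped.
samples = [
    {"inlet_moisture_pct": 52.0, "outlet_moisture_pct": 3.5},
    {"inlet_moisture_pct": 3.8, "outlet_moisture_pct": 51.0},
    {"inlet_moisture_pct": 50.5, "outlet_moisture_pct": 4.1},
]
print(find_suspect_rows(samples))
```

The check only flags rows; deciding whether values were swapped at the source still takes someone who knows the process.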
Building toward AI readiness
Manufacturers who want their data to support AI in the next several years need to make decisions now about how they collect and store information, Campbell said. That means capturing more detail than seems immediately necessary, since analytical tools will keep evolving and future needs are hard to predict. It also means avoiding proprietary storage formats that could make data difficult to export or use with other tools, and ensuring systems support the modern integration protocols that AI platforms rely on.
A common mistake, van der Linde said, is building a data lake without first defining what data is needed and how it will be used. Storing data without a plan adds little value. For data to be useful it must be validated, structured and enriched with the metadata that gives it meaning. And once companies develop AI models, those models need to be embedded in the systems operators and technicians already use. Otherwise, adoption is unlikely regardless of how well the models perform.
Duvall said quality teams should also capture information about the measurement process itself, not just the measurement result. Details about the equipment used, the conditions at the time and the operator involved give analytical models more to work with. “Memory is cheap,” he said, “and the more data that can be captured about the manufacturing system, the better prepared an organization will be for automation and AI.”
Mason’s advice is more fundamental. Before asking how to implement AI, he said, manufacturers should establish a data strategy. “If they have no data strategy, AI is not going to be the magic elixir they may think it is,” he said. A concrete plan for what data needs to be collected and why is the prerequisite. Without it, the sophistication of the tools on top doesn’t matter much.
Campbell added a note of caution about AI tools themselves. Current systems have well-documented limitations, he said, including constraints on how much information they can process at one time and a tendency to produce errors or fabrications. Human oversight remains essential, particularly where mistakes could have safety implications or significant costs. “Great caution and human oversight is warranted using current-generation AI tools for quality analysis,” he said. Manufacturers who move toward AI without a sound data foundation and that oversight in place may find the results fall short of expectations.





