Every day I talk to customers who want to get off paper. They spend an inordinate amount of operators’ time manually capturing data on paper, and then often double the work by having someone transfer those paper check sheets into a digital spreadsheet. What they fail to realize is that they are falling even further behind as technology moves from digitized data collection to raw digital data from many different sources. Hourly checks turn into millisecond data streams. Startup inspections turn into a continuous flow of process data. The type and flow of data is increasing. How is a quality engineer supposed to stay on top of it all, make sense of it all, and continue to improve quality? 

Each new data source multiplies the processing required. Stream processing technology points to a solution for this increasing volume and variety of data. Parsing each event as it happens allows real-time processing and aggregation, making the data immediately accessible for process improvements, whether manual or automated, programmable or cognitive. 

Where is this increase in data coming from?

  • Integrating quality systems with other data silos like MRP, LIMS, and spreadsheets.
  • IoT devices enabling automated data collection everywhere.

One of the primary drivers for the amount of data available today stems from integrated systems. More and more data comes from PLCs, ERP, LIMS, custom-built databases, etc. Each of these silos of data contains important information that may affect the quality of a process and ultimately the satisfaction of the customer. An order number for a customer may need to be correlated to a product feature and the spec of a finished good. A paper system might tie this data together in days or weeks, which can mean a delayed delivery or even a product shipped before the correlation is made. 

In addition to integrated systems, the Internet of Things (IoT) has already shaken up the landscape and will continue to reshape the realities of data. Business Insider’s research department estimated that the 237 million IoT devices installed in 2015 will grow to an astounding 923 million by 2020. Every piece of equipment on the plant floor will either come with a device that collects and feeds streams of data or will be retrofitted with one. Either way, the pulse of the machine (as well as its blood pressure, blood panels, respiration, etc.) will be a stream of data accessible in real time from anywhere in the plant. 

The flood of data coming from existing technologies like manual operator checks, PLCs, ERP, and other systems will swell with the addition of IoT devices to levels never before imagined. Somewhere in the flood surge, important information flows past with no one even noticing until the customer calls to complain. The flood of data does allow for after-the-fact study and root-cause analysis, and the resulting process improvements can stop an issue from happening again. But by then it is already too late. 

Real-time Analytics and Alerts

Stream processing points to a possible solution. It was developed for high-volume workloads such as graphics processing and was later applied to market trading. If a data historian stores data with minimal processing, then stream processing is the opposite: it analyzes and aggregates data without storing the individual records (although part of the processing can be pushing the raw data to a historian). Similar to business workflows, the system defines data flows, processing each event as it hits the system. The individual data point is less important than the fingerprint it leaves on the system. The inputs are the raw data streams, the data flows are the processing of the data, and the outputs are the analytics, aggregations, and alerts derived from the data flows.
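As a rough sketch of the idea, the loop below processes one event at a time, updating running aggregates (here, Welford’s online algorithm) and evaluating alert logic without ever storing the individual readings. The readings, limit, and alert rule are illustrative assumptions, not a reference to any particular product.

```python
from dataclasses import dataclass

@dataclass
class RunningStats:
    """Aggregates a stream without keeping the individual records."""
    n: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations (Welford)

    def update(self, x: float) -> None:
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def variance(self) -> float:
        return self.m2 / (self.n - 1) if self.n > 1 else 0.0

def process_stream(readings, limit):
    """Handle each event as it arrives: aggregate it, then check alerts."""
    stats = RunningStats()
    alerts = []
    for x in readings:
        stats.update(x)       # the data point leaves its "fingerprint"...
        if x > limit:         # ...and alert logic runs in real time
            alerts.append(x)  # in practice: notify an operator or a system
    return stats, alerts

# A hypothetical trickle of sensor readings with one out-of-limit event
stats, alerts = process_stream([10.1, 9.9, 10.0, 12.5, 10.2], limit=12.0)
```

In a production stack a stream processing framework would replace the loop, but the shape is the same: inputs (the raw stream), a data flow (the per-event processing), and outputs (aggregates and alerts).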

Once the inputs and data flows are configured, you can unlock the power of your data by adding machine learning and artificial intelligence to your processing stack. The flood of data not only becomes manageable but also filters out into irrigation ditches and mill races that enable more powerful work to occur. The data flows of yesterday provided analytics, but the inputs and outputs had to be known and programmed into the system. We are now moving toward AI that learns from the streams of data and makes correlations and automated corrections at the speed of electrons.

Measurable Business Impact

The bottom line is how these streams of data can help a company save money, achieve growth, and reduce risk. The amount of data available will continue to increase; what matters is tying that data to a metric whose improvement actually impacts the business. 

I have talked to many people who are excited about a new machine because of all the data they can get out of it. My response is always the same: What is the value of the data? Just because it is easy to collect that data doesn’t mean it is worth collecting. Can the data be moved up the chain from operators on the plant floor to supervisors and managers to vice presidents? What are the dashboards or reports that link that raw data to something bigger in the company? Can you tie it all the way to a corporate initiative? 

  • Tie the project to a corporate initiative for faster approval and easier budgeting.
  • Determine one or two high-level metrics for focus.
  • Use the stream processing to correlate raw data with corrective actions that impact the key metrics.

For example, a company might have an initiative to have the best quality in its industry. This initiative can be met by improving first pass yield across the plant. First pass yield can be improved by reducing variation and tightening tolerances. The raw data streaming from that new equipment can be processed in real time to allow immediate analytics and alerting. Instead of waiting for an engineer to look at the data and make corrections to the process, an algorithm can spot a trend as it is happening and make the appropriate changes to correct the parameter. Instead of hundreds of pieces being affected, stream processing enables a cognitive system to notice the trend and make a correction before any bad pieces are manufactured. All of this takes some effort to set up and configure, but the corporate objective is well on its way to being met. 
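A minimal sketch of what that trend-spotting step might look like: a run of consecutive upward readings triggers a proportional correction back toward the target. The run-length rule, gain, and numbers below are illustrative assumptions, not a prescribed control strategy.

```python
def monitor(readings, target, run_length=4, gain=0.5):
    """Watch a stream for an upward trend; emit corrections as it happens."""
    run = 0            # consecutive increases seen so far
    prev = None
    corrections = []   # (event index, adjustment applied)
    for i, x in enumerate(readings):
        run = run + 1 if prev is not None and x > prev else 0
        prev = x
        if run >= run_length:             # trend caught mid-stream...
            adjust = gain * (target - x)  # ...nudge the parameter back
            corrections.append((i, round(adjust, 3)))
            run = 0                       # watch for the next trend
    return corrections

# Hypothetical readings drifting upward, then recovering after correction
corrections = monitor([5.0, 5.1, 5.2, 5.3, 5.4, 5.0], target=5.0)
```

A real system would likely use established control chart rules (such as the Western Electric run rules) rather than this toy threshold, but the principle is the same: the trend is caught while it is happening, not after the parts ship.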

This may be a trivial example, but my intent is to start thinking about how the raw data stream can be tied to something larger instead of just being collected because we can collect it. 

Nine hundred twenty-three million devices create a lot of data. How does this data help the company? I have two questions to ask when you start planning for more and more streams of data: What data do you want to collect? And what are you going to do with it when you have it? Q


Greenough, John. “How the Internet of Things Is Revolutionizing Manufacturing.” Business Insider. Business Insider, 12 Oct. 2016. Web. 06 July 2017.

Harrison, Rob. “Why Quality Management Leaders Need an IoT Strategy Now.” Quality Digest. Millennium 360 Inc., 18 Jan. 2015. Web. 06 July 2017.