Vision & Sensors

Reliable Identification with Deep Learning

Deep learning optimizes text recognition in machine vision.

Photo 1: OCR software reliably recognizes number combinations and letters. *Source: MVTec Software GmbH*

Photo 2: Machine vision makes it possible to identify characters on components in industrial environments. *Source: MVTec Software GmbH*

Photo 3: Deep learning technologies facilitate extremely high font recognition rates. *Source: MVTec Software GmbH*

Optical character recognition (OCR) is an important technique in industrial environments. After all, machine vision makes it possible to reliably identify workpieces and products throughout the entire value chain based on printed or stamped characters. With the aid of modern deep learning technologies and convolutional neural networks (CNNs), certain fonts can be trained in such a way that font recognition rates are significantly improved.

OCR techniques are especially well known from office communication. The scanning of paper documents such as invoices, delivery notes, and other records allows them to be quickly converted to a digital format, to extract relevant information from them and to integrate this data into a continuous, electronic flow of information. The method also plays a key role in industrial design and production processes, particularly in the context of Industry 4.0 or the Industrial Internet of Things. Printed letter or number combinations make it possible to clearly and quickly identify components and make them available for automated process chains.

With optical character recognition, image acquisition devices such as scanners and cameras record digital image information and turn it into raster graphics that accurately represent the text down to the last pixel. OCR software reads out these graphics, recognizes number combinations or letters, and combines them into words or even entire sentences. Machine vision techniques support the optical recognition of character combinations in design and production processes. These techniques include specific functions for the special requirements of industrial environments.

Robust text recognition thanks to artificial intelligence

For example, well thought-out classification techniques ensure very high recognition rates even under difficult conditions. Blurry or slanted text can also be identified without problems—even with distorted letters or characters printed onto or etched into reflective surfaces or highly textured color backgrounds. Since industrial applications require ever faster and more flexible processes, artificial intelligence techniques are also being incorporated into machine vision. Thus, deep learning algorithms and artificial neural networks (such as CNNs) ensure even more robust results from text recognition.

Deep learning technologies are characterized by analyzing large amounts of digital image data and thus training models of certain objects that must be recognized. This works both with respect to physical objects as well as letters or numbers. A label is attached to the data, which identifies the object, such as “dog” or “letter A.” Reliable statements on the content of newly recorded image information can now be made based on the trained models. The technology thus learns each time a new image is “labeled.” This increases the likelihood of reliably recognizing as many different versions of the image content as possible, such as other kinds of dogs or characters with varying fonts or shapes. The individual objects can thus be dependably divided into appropriate classes, which makes for fully automated, independent recognition without the system having to explicitly provide a sample image for each object.

Self-learning system for classification

To ensure high recognition rates, the algorithms take into account the features of all of the image information that is recorded. The system thus learns which properties represent a certain class. For example, the system automatically recognizes which features apply to the “dog” class and which external properties are typical for the particular letters or numbers. The technology also learns from errors. If the result is incorrect while a network is being trained, parameters are adapted and the process starts over. The error recognition is also based on the analysis of large amounts of digital image data, whereby the system automatically detects deviations. This iterative procedure continues to repeat until the model is ideally trained for the application. The main difference between classic machine learning and deep learning is also apparent here. Deep learning does not require features which must be laboriously and manually defined and verified by the developer, but instead, it uses learning algorithms to automatically find and extract the unique patterns for distinguishing between classes.

Modern machine vision solutions use the deep learning technologies described above for optical character recognition. New machine vision software contains an OCR classifier based on deep learning algorithms, which can be accessed with a wide range of pre-trained fonts. As a result, much higher reading rates can be achieved than with any previous classification method. The software also permits the robust recognition of dot print, SEMI, industrial, and document-based fonts with only one universal pre-trained font. Thus, a specifically pre-trained font does not have to be selected for each type of application. This is an enormous benefit for companies, since a very large number of images—as high as in the six-digit-range—is generally needed for training. Providing pre-trained fonts therefore saves a great deal of time. The improved character recognition was verified in test series using an extremely complex data set. It was even possible to cut the error rate in half in images with many different characters of varying sizes, shapes, fonts, and qualities. Objects can thus be identified more reliably, and objects with production or printing errors can be more clearly detected and therefore rejected.

Summary

OCR techniques are indispensable in industrial engineering and production processes for the automated recognition of objects with printed number or letter combinations. Innovative machine vision solutions support reliable identification with integrated deep learning technologies. With the aid of classifiers based on a large number of pre-trained fonts, higher recognition rates than ever before can now be achieved. Companies also save the time that is required for laboriously training fonts.