Suppose you want a machine vision system to automatically detect scratches on a plastic part such as a cell phone case. You know that the scratches will be nearly straight lines and that they can be low contrast and discontinuous. You can’t predict their orientations or lengths, but only scratches longer than some length must be reported.
You set up a machine vision system with part positioning, lighting, lens, camera, processor, and vision software, and do some experiments. As always, proper lighting is critical for showing the defects. For this task a low-angle ring light “catches” the raised edges of each scratch and so shows the scratch as brighter lines. This lighting also “catches” dust particles on the part.
TECH TIPS
Most machine vision algorithms don’t know about lines and use only local information.
So what we need is an algorithm that uses knowledge of lines to constrain which points are part of a line and a way to globally integrate local line segments into a reliable detection of scratches.
This is where the Hough Transform comes in. |
The vision software improves and thresholds the scratch signals, so you easily see scratches—but the vision system doesn’t reliably detect them. Scratches appear as low-contrast, discontinuous line segments while dust appears as random points. Your mind knows about lines and can globally integrate local, low-contrast line segments into the reliable perception of lines. This ability was described by the Gestalt psychologists as “The Law of Continuity.”
Most machine vision algorithms don’t know about lines and use only local information. So what we need is an algorithm that, like your mind, uses knowledge of lines to constrain which points are part of a line and a way to globally integrate local line segments into a reliable detection of scratches. This sounds like a job for the Hough Transform.
Introducing Mr. Hough
The Hough (pronounced “Huff”) Transform is a clever way to improve image structure detection when the structure can be parameterized, that is, described by set of parameters in an equation. In the case of straight line scratches, a straight line equation, y = m*x + b, fits a scratch line with parameters m and b, where m is the line’s slope and b is the line’s intercept with the Y axis. The algorithm “knows” about straight lines from this equation.
The Hough Transform generates parameter values m and b for all lines that could go through each detected (by a threshold, in this example) image point. Each possible line through each point then votes for its m and b values in a parameter space of possible m and b values. We limit and quantize this parameter space to get an accumulator space which accumulates votes for m and b values. After all possible lines through all detected points have voted, the accumulator space is searched for peaks that indicate which pairs of m and b parameters got the most votes. A peaks indicates the presence of line and gives its parameters and so its equation in the image.
The Hough Transform integrates weak local signals – scratch and noise points – over the image, using the line equation to select where in the accumulator space the weak local signals are summed. This gives us a way to integrate local image measures into a reliable, global detection of scratch lines.
A Simple Example
At this point if you aren’t confused, you haven’t been paying attention. Here’s a simple example of detecting a 3-point line at 45 degrees in an otherwise blank image (see Figure 2). The detected points (red dots) have coordinates {1,1},{2,2}, and {3,3}. We set up an accumulator space for b = 0,1,2, and 3 (magenta). The slope of the line, m, is computed from b and the position of each detected point, p = 1,2, and 3. We quantize m into size angle ranges (sectors) of 30 degrees each (blue text and dotted line).
There are 4 possible values for b and 6 for m, so our accumulator space has 4 x 6 = 24 cells. These are initially set to all be 0. With 3 points the peak vote value in the accumulator space will be 3 (yellow line in input image and yellow highlight in accumulator space). The total number of counts in the accumulator space is p x b = 3 x 4 = 12.
The algorithm outline is:
Do for each point, p // For all detected points in input image
Do for each intercept, b // For all quantized Y intercepts
m = arctan( (p-b)/p ) // Computes quantized slope, m, in degrees
accumulator[m][b]++ // Increment the accumulator cell at location m,b
Try a few point and intercepts to see how this works. For example, the drawing shows the line for point p = 1 and parameter b = 2 (in green). Using these values m = arctan(-1) = -45 degrees. The dotted green line at -45 degrees shows m is between -60 to -30 degrees. Then use the two parameters m = -60…-30 and b = 2 to increment the accumulator table. This is shown by the green dotted arrow from the line to the accumulator and the green number in the accumulator.
I’ve made this example simple by using the familiar form of the line equation and limiting points and intercepts to the first quadrant. In practice, b values can be negative and the line equation y = mx + b can’t represent vertical lines as m would go to infinity. Instead a rho, theta parameterization of the line is used, where rho is the perpendicular distance from line to the origin and theta is the angle rho makes with the X axis. This gives a line equation of the form: rho = x cos(theta) + y sin(theta). At every x,y detected point and for every quantized theta angle, a quantized rho is generated and the accumulator space at rho, theta is incremented. This produces characteristic sinusoidal patterns in the accumulator space that intersect at detected line parameters
To Infinity and Beyond
Encouraged by the success of the Hough Transform in detecting broken lines in noisy images, we consider applying this method to circles. Circles have three parameters: (x,y) center and radius, so our parameter space is three-dimensional. Let’s see how this works.
Let’s suppose you have an 800 by 600 pixel image and you choose a quantization of 1 pixel for the radius and assume a maximum radius of 800. That means there are 800 x 600 x 800 bins in your accumulator space, so 384,000,000 bins. The accumulator can fit in main memory but the voting will be slow because cell access violates the assumptions of memory caching in the CPU. Then you have additional time to search that space for peaks. Most of the accumulator space will be empty but there is no way to skip empty memory, at least on a standard computer.
Undeterred, we try ellipse detection using the Hough Transform. Now there are five parameters: (x,y) center, lengths of the major and minor axes, and major axis orientation, which we will quantize to 180 angles. On the same 800 by 600 pixel image we now have a five-dimensional space of 800 x 600 x 800 x 600 x 180 = 41,472,000,000,000. Needless to say, that won’t work on any vision system I’m aware of; perhaps the NSA has something that could handle this?
In short, the Hough Transform “explodes” as you increase the number of parameters. A line transform can be practical on a vision system, but a straightforward circle transform is usually not.
There are many ways to reduce the accumulator size and processing time. The simplest are to reduce the range of your accumulator and to increase quantization bin size. For example, if the radius of a circular object, such as a rivet or fastener, is limited to a small range and if the position of the center of the circular object is also restricted, then the circle Hough Transform is much faster.
Local edge structure can be used to reduce voting. For a line, we might vote for an m only when p and b generate a line that matches the local edge orientation as computed from a 3x3 (pixel) edge detector. This reduces accumulator access time and also increases the size of the desired peak in the accumulator relative to the off-peak values and so makes line detection more robust.
Similarly, for circles, the local curvature of circle edge points can limit voting to only points with a selected range of radii (radius = 1/curvature). In any case, you are betting that the local edge structure is representative of the image structure you want to detect. You are trading robustness for speed and smaller memory size.
The Generalized Hough Transform (GHT) extends the idea of using edge orientation to reduce time and accumulator dimensions and allows the detection of arbitrary object shapes. The basic GHT computes a table that lists object edge points by pairs of angles and radii from an object edge point to a center point as a function of the orientation of the object edge. When recognizing an object, the edge orientation at an object edge point is used to “look up” the angle, radii pairs from the table which vote for the position the object center point, as in the regular Hough Transform. The peak in the vote accumulation space gives the x,y object position.
This method is the basis for some machine vision “search” algorithms, which learn the outline of an arbitrary shape and then find that outline shape in new images, assuming the object is in the image.
What to Expect in Practice
The voting process increases the correct peak bin but also “sprays” votes to off-peak bins. Off-line noise will generally accumulate in off-peak bins. When the line is short so the correct peak has few counts, the noise and “spray” into the off-peak bins can cause false line detection. You already know how to reduce this problem: use good lighting and image preprocessing to get a strong signal before applying the Hough Transform.
Peak detection is usually done with a threshold that can be tricky to set because the correct peak bin might not be that much greater than the off-peak bin values and the correct peak and noise bins will vary from image to image. Local maximum peak detection sometimes helps and sometimes causes line detection from off-peak bins, that is, noise. As the threshold decreases you will often see multiple lines being detected through the same image (scratch) line, as bins around the correct peak are now being used, and they have similar parameters and so give similar lines.
When properly done, the Hough Transform can seem to have magical powers of using local features to pull out global image patterns, such as lines, from background noise.
References
Method and Means for Recognizing Complex Patterns, by Paul V. C. Hough, US Patent 3069654, December 1963.
Use of the Hough Transform to Detect Lines and Curves in Pictures, by Richard O. Duda and Peter E. Hart, Communications of the ACM, Vol. 15, Number 1., January 1972.
How the Hough Transform was Invented, by P.E. Hart, IEEE Signal Processing Magazine, Vol. 26, Issue 6, November 2009.
Hough Transform (A good introduction)
Generalizing the Hough Transform to Detect Arbitrary Shapes, by D.H. Ballard, Pattern Recognition, Vol.13, No.2, p.111-122, 1981.