Machine vision is an important and rapidly developing branch of artificial intelligence, and it continues to see breakthroughs as the technology matures.
It is generally believed that machine vision "is a device that automatically receives and processes an image of a real scene through optical devices and non-contact sensors, obtains the required information or controls the movement of the machine by analyzing the image." In other words, image information is collected through image sensors such as industrial cameras, the converted image data is analyzed and processed, and the results are used to control the subsequent actions of automated equipment.
1. Application of image processing technology
The image processing system of machine vision calculates and analyzes the digital image signals on site according to specific application requirements, and controls the actions of on-site equipment based on the processing results. Its common applications are as follows:
(1) Image acquisition
Image acquisition is the process of obtaining scene images from the work site and is the first step in machine vision. Most acquisition devices are still cameras or video cameras built around CCD or CMOS sensors.
Still cameras capture single images, while video cameras capture continuous live images. An image is a projection of a three-dimensional scene onto a two-dimensional image plane, and the color (brightness and chromaticity) of a point in the image reflects the color of the corresponding point in the scene. This correspondence is what allows captured images to stand in for the real scene.
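The projection from a three-dimensional scene onto a two-dimensional image plane can be illustrated with an ideal pinhole camera. This is a minimal sketch under that assumption only: the function name, the focal length in pixels, and the principal point are hypothetical values chosen for illustration, and lens distortion is ignored.

```python
def project_point(point_3d, focal_length_px, principal_point):
    """Project a 3D point (camera coordinates) onto the image plane, ideal pinhole model."""
    x, y, z = point_3d
    if z <= 0:
        raise ValueError("point must lie in front of the camera (z > 0)")
    cx, cy = principal_point
    u = focal_length_px * x / z + cx  # horizontal pixel coordinate
    v = focal_length_px * y / z + cy  # vertical pixel coordinate
    return u, v


# Two points on the same viewing ray, at different depths (hypothetical values).
near = project_point((0.1, 0.05, 1.0), focal_length_px=800, principal_point=(320, 240))
far = project_point((0.2, 0.10, 2.0), focal_length_px=800, principal_point=(320, 240))
print(near, far)
```

Both points land on the same pixel, which shows that a single image does not preserve depth.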
If the camera outputs analog signals, the analog image signals must be digitized before being sent to a computer (or embedded system) for processing. Most cameras now output digital image signals directly, eliminating the need for analog-to-digital conversion. In addition, camera output interfaces are now standardized, such as USB, IEEE 1394 (FireWire), HDMI, Wi-Fi, and Bluetooth, so image data can be sent directly to the computer without an image acquisition card between the camera and the computer. Subsequent image processing is usually performed in software on the computer or embedded system.
(2) Image preprocessing
Images captured on site are often degraded to varying degrees by equipment and environmental factors, for example by noise, geometric deformation, and color distortion, all of which hinder subsequent processing. The captured images must therefore be preprocessed. Common preprocessing steps include noise removal, geometric correction, and histogram equalization.
Usually, spatial-domain or frequency-domain filtering is used to remove noise; geometric transformations are used to correct geometric distortion; and histogram equalization, homomorphic filtering, and similar methods are used to correct contrast, uneven illumination, and color deviation.
In short, through this series of image preprocessing technologies, the captured images are "processed" to provide "better" and "more useful" images for machine vision applications.
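Two of the preprocessing steps named above, noise removal and histogram equalization, can be sketched in a few lines. This is an illustrative sketch, not a production implementation: it assumes 8-bit grayscale images stored as lists of rows, uses a 3x3 median filter as one possible spatial-domain noise filter, and all function names are hypothetical.

```python
from statistics import median


def median_filter_3x3(image):
    """Replace each interior pixel with the median of its 3x3 neighborhood.

    Border pixels are copied unchanged to keep the sketch short.
    """
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            window = [image[r + dr][c + dc] for dr in (-1, 0, 1) for dc in (-1, 0, 1)]
            out[r][c] = median(window)
    return out


def equalize_histogram(image, levels=256):
    """Spread the gray levels using the cumulative histogram."""
    flat = [p for row in image for p in row]
    hist = [0] * levels
    for p in flat:
        hist[p] += 1
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = min(v for v in cdf if v > 0)
    span = len(flat) - cdf_min
    if span == 0:  # constant image: nothing to spread
        return [row[:] for row in image]
    lut = [max(0, round((cdf[g] - cdf_min) * (levels - 1) / span)) for g in range(levels)]
    return [[lut[p] for p in row] for row in image]


# A flat gray patch with one impulse-noise pixel (255) and a slight blemish (110).
noisy = [
    [100, 100, 100, 100],
    [100, 255, 100, 100],
    [100, 100, 110, 100],
    [100, 100, 100, 100],
]
denoised = median_filter_3x3(noisy)

# A low-contrast patch whose gray levels occupy only a narrow range.
low_contrast = [[100, 110], [120, 130]]
equalized = equalize_histogram(low_contrast)
print(denoised, equalized)
```

The median filter is chosen here because it suppresses isolated outlier pixels without averaging them into their neighbors; other filters would suit other noise types.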
(3) Image segmentation
Image segmentation divides the image into regions with different characteristics according to the application requirements and extracts the target of interest. Common image features include gray level, color, texture, edges, and corner points. For example, an image of an automobile assembly line can be segmented into a background region and a workpiece region, which are passed to the subsequent processing unit that handles the part of the workpiece to be installed.
Image segmentation has been a difficult problem in image processing for many years. Many segmentation algorithms exist, but their results are often not ideal. More recently, deep learning methods based on neural networks have been applied to image segmentation and often outperform traditional algorithms.
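As a point of comparison for the traditional algorithms mentioned above, a gray-level split into background and workpiece can be sketched with Otsu's global threshold. The text does not name this method; it is used here as one well-known classical example. The sketch assumes an 8-bit grayscale image in which the workpiece is brighter than the background, and the sample pixel values are made up.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes between-class variance.

    Pixels <= t form one class (background here), pixels > t the other.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(g * hist[g] for g in range(levels))
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # no pixels in the lower class yet
        w1 = total - w0
        if w1 == 0:
            break  # all pixels are in the lower class
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mean0 - mean1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def segment(image, threshold):
    """Label pixels above the threshold as workpiece (1), the rest as background (0)."""
    return [[1 if p > threshold else 0 for p in row] for row in image]


# Dark background with a bright 2x2 workpiece (hypothetical gray levels).
image = [
    [20, 22, 21, 23],
    [21, 200, 205, 22],
    [20, 210, 202, 21],
    [22, 21, 20, 23],
]
threshold = otsu_threshold([p for row in image for p in row])
mask = segment(image, threshold)
print(threshold, mask)
```

A single global threshold only works when the two regions are well separated in gray level, which is one reason segmentation of real scenes remains difficult.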
(4) Target recognition and classification
In industries such as manufacturing and security, machine vision depends on recognizing and classifying targets in the input images so that subsequent judgments and operations can be carried out. Recognition and classification have much in common: once a target has been recognized, its category is often clear as well. Recent image recognition technology is moving beyond traditional methods toward intelligent methods built mainly on neural networks, such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs), which offer superior performance.
(5) Target positioning and measurement
In intelligent manufacturing, one of the most common tasks is installing a target workpiece. The target usually has to be positioned before installation and measured afterward. Both positioning and measurement demand high accuracy and speed, for example millimeter-level accuracy (or finer) and millisecond-level speed.
This high-precision, high-speed positioning and measurement is difficult to achieve by conventional mechanical or manual methods. In machine vision, image processing is used to process images of the installation site according to the complex mapping relationship between the target and the image, so as to quickly and accurately complete the positioning and measurement tasks.
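The mapping from image to physical measurement can be illustrated in its simplest form: a binary mask of the target, a centroid for position, and a fixed scale factor for size. This is a sketch under strong assumptions: the camera is taken to look straight down at a flat part, the calibration value `mm_per_pixel` is hypothetical, and real systems need a full camera calibration rather than a single constant.

```python
def locate_and_measure(mask, mm_per_pixel):
    """Return the centroid (in pixels) and bounding-box size (in mm) of the target in a binary mask."""
    coords = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    if not coords:
        return None  # no target pixels found
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return {
        "centroid_px": (sum(rows) / len(rows), sum(cols) / len(cols)),
        "width_mm": (max(cols) - min(cols) + 1) * mm_per_pixel,
        "height_mm": (max(rows) - min(rows) + 1) * mm_per_pixel,
    }


# A 3-pixel-wide, 2-pixel-high target; 0.5 mm per pixel is an assumed calibration value.
mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
result = locate_and_measure(mask, mm_per_pixel=0.5)
print(result)
```

With this scale, measurement resolution is limited to half a millimeter per pixel, so finer accuracy requires higher resolution, a narrower field of view, or sub-pixel techniques.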
(6) Target detection and tracking
Moving-target detection and tracking means detecting in real time whether a moving target is present in the scene images captured by the camera and predicting its next direction and trend of movement (tracking). The motion data is then passed promptly to subsequent analysis and control stages, which generate the corresponding control actions. Image acquisition generally uses a single camera; where necessary, two cameras can be used to imitate human binocular vision and obtain stereoscopic information about the scene, which makes target detection and tracking easier.
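The detect-then-predict loop described above can be sketched for the single-camera case. The sketch assumes a static background image is available, detects the target by background subtraction, and predicts the next position with a constant-velocity model. Both techniques are simple stand-ins chosen for illustration, and the frames are synthetic.

```python
def detect_target(frame, background, min_diff=30):
    """Background subtraction: centroid (row, col) of changed pixels, or None if nothing changed."""
    changed = [
        (r, c)
        for r, (frame_row, bg_row) in enumerate(zip(frame, background))
        for c, (p, b) in enumerate(zip(frame_row, bg_row))
        if abs(p - b) > min_diff
    ]
    if not changed:
        return None
    n = len(changed)
    return (sum(r for r, _ in changed) / n, sum(c for _, c in changed) / n)


def predict_next(prev_pos, curr_pos):
    """Constant-velocity prediction: next = current + (current - previous)."""
    return (2 * curr_pos[0] - prev_pos[0], 2 * curr_pos[1] - prev_pos[1])


def make_frame(target_col, width=6, height=3):
    """Synthetic frame: dark background with one bright pixel in the middle row."""
    frame = [[10] * width for _ in range(height)]
    frame[1][target_col] = 200
    return frame


background = [[10] * 6 for _ in range(3)]
pos_t0 = detect_target(make_frame(1), background)
pos_t1 = detect_target(make_frame(2), background)
predicted_t2 = predict_next(pos_t0, pos_t1)
print(pos_t0, pos_t1, predicted_t2)
```

The predicted position is what a downstream controller would act on; a practical tracker would also smooth the estimate over more than two frames.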
2. Challenges faced
Machine vision image processing technology still faces many technical bottlenecks. For example, a processing method may perform well in research and development yet run into problems in complex, changing application environments. A face recognition system, for instance, can reach a recognition rate of more than 95% when the subject cooperates, but in a real monitoring environment the recognition rate drops significantly.
Machine vision systems require image recognition and measurement accuracy close to 100%, and any slight error may lead to unpredictable consequences. For example, errors in target positioning may cause the assembled equipment to fail to meet requirements.
Visual inspection equipment must detect in real time at high throughput, and it acquires large volumes of image data. If image acquisition and processing are slow, and newly introduced deep learning algorithms add further computation, real-time processing becomes even harder and the system cannot keep up with the pace of machine operation and control. Increasing image processing speed is therefore very important.
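A short calculation makes the real-time constraint concrete. The figures below (a 1920x1080, 8-bit grayscale camera at 60 frames per second) are hypothetical and chosen only to show the arithmetic; the function names are likewise illustrative.

```python
def frame_budget(width, height, bytes_per_pixel, frames_per_second):
    """Return the raw data rate and the time available per frame and per pixel."""
    pixels_per_frame = width * height
    return {
        "data_rate_mb_s": pixels_per_frame * bytes_per_pixel * frames_per_second / 1e6,
        "frame_period_ms": 1000.0 / frames_per_second,
        "ns_per_pixel": 1e9 / (pixels_per_frame * frames_per_second),
    }


def keeps_up(processing_ms, frames_per_second):
    """True if one frame can be processed before the next one arrives."""
    return processing_ms <= 1000.0 / frames_per_second


budget = frame_budget(1920, 1080, bytes_per_pixel=1, frames_per_second=60)
print(budget)
print(keeps_up(10.0, 60), keeps_up(25.0, 60))
```

Under these assumed figures the system must absorb roughly 124 MB of raw data per second and has under 17 ms per frame, about 8 ns per pixel, for all processing. An algorithm that takes 25 ms per frame falls behind the camera.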
3. The main methods to improve image processing speed
There are currently two main methods to increase image processing speed.
The first is to improve and optimize the image processing algorithm: the algorithm should be simple and fast while still delivering acceptable results in practice. The second is to improve and optimize the means of implementing the algorithm.
The main implementation options for increasing the speed of machine vision inspection equipment are as follows:
(1) Application-Specific Integrated Circuit (ASIC)
An ASIC is a chip designed specifically for a fixed algorithm or application and offers strong real-time performance. In practice, however, it has drawbacks such as a relatively long development cycle, high cost, and poor adaptability and flexibility.
(2) Field Programmable Gate Array (FPGA)
An FPGA is a two-dimensional array of programmable basic logic units. The logic units are connected to one another and to the I/O units through programmable interconnect. FPGAs offer great design flexibility, and their integration density, operating speed, and achievable functionality keep improving. Their development cycle is also short, systems built on them are easy to maintain and expand, and they can greatly increase the processing speed of image data.
(3) General-purpose computer network parallel processing
This processing structure adopts the "multi-client + server" approach, where one image sensor corresponds to one client, the server realizes information synthesis, and most of the image processing work is completed by software. Although this structure is relatively large, it is easy to upgrade and maintain, and has good real-time performance.
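The "multi-client + server" division of work can be sketched in a single process. In this sketch, worker threads stand in for the networked client machines, which is a simplification made only to keep the example runnable. The sensor names, the bright-pixel count used as the per-client result, and the decision rule on the server side are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def client_process(sensor_id, image):
    """Client side: reduce one sensor's image to a small result (count of bright pixels)."""
    bright = sum(1 for row in image for p in row if p > 128)
    return {"sensor": sensor_id, "bright_pixels": bright}


def server_synthesize(results):
    """Server side: combine the per-sensor results into one decision."""
    total = sum(r["bright_pixels"] for r in results)
    return {"sensors": len(results), "total_bright_pixels": total, "flagged": total > 3}


# One image per sensor (made-up data); each sensor is handled by its own worker.
sensor_images = {
    "cam0": [[0, 200], [0, 0]],
    "cam1": [[255, 255], [0, 130]],
}
with ThreadPoolExecutor(max_workers=len(sensor_images)) as pool:
    results = list(pool.map(lambda item: client_process(*item), sensor_images.items()))
summary = server_synthesize(results)
print(summary)
```

Each client sends the server only a compact result rather than the full image, which is what lets the server combine many sensors without becoming the bottleneck.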
(4) Digital Signal Processor (DSP)
A DSP is a specialized microprocessor designed to process large amounts of information in digital form. It works by converting the received analog signal into a digital signal of 0s and 1s, then modifying, deleting from, and enhancing the digital signal, with other chips in the system converting the digital data back into analog data or a format suited to the actual environment. Its real-time operating speed is much higher than that of general-purpose microprocessors. However, a DSP system still executes instructions serially and only optimizes certain fixed operations in hardware, so it cannot meet the requirements of many algorithms.
In a real-time image processing system, low-level processing handles a large volume of signal data at high speed but has a relatively simple computational structure, which makes it suitable for hardware implementation on an FPGA. High-level processing handles relatively little data but has complex algorithms and control structures, which can be implemented on a DSP. Combining the two therefore balances real-time performance with flexibility.
