Proceedings Volume 9530

Automated Visual Inspection and Machine Vision

Jürgen Beyerer, Fernando Puente León

Volume Details

Date Published: 19 May 2015
Contents: 6 Sessions, 27 Papers, 0 Presentations
Conference: SPIE Optical Metrology 2015
Volume Number: 9530

Table of Contents


All links to SPIE Proceedings will open in the SPIE Digital Library.
  • Front Matter: Volume 9530
  • Measurement
  • Multispectral Inspection
  • Optical Sorting
  • Inspection, Monitoring and Detection
  • Poster Session
Front Matter: Volume 9530
Front Matter: Volume 9530
This PDF file contains the front matter associated with SPIE Proceedings Volume 9530, including the Title Page, Copyright information, Table of Contents, Introduction (if any), and Conference Committee listing.
Measurement
Camera series and parallel networks for deformation measurements of large scale structures
Qifeng Yu, Yang Shang, Banglei Guan, et al.
Videometrics is a technique for measuring displacement, deformation, and motion with precision, versatility, automation, and real-time capability. Videometrics with camera networks is a fast-developing area for the deformation measurement of large-scale structures. A conventional camera network is a parallel network, in which the cameras are independent of each other and the relations among them are calibrated from their target images. In recent years, we have proposed and developed two kinds of videometrics with camera series networks, in which the cameras are connected in series and the relations among them are relayed one by one, for the deformation measurement of large and super-large-scale structures. In this paper, our research on both camera series and camera parallel networks for the deformation measurement of large-scale structures is reviewed, and some new developments are introduced. First, our proposed camera series network methods are presented, including pose-relay videometrics and displacement-relay videometrics with camera series. Then our work on large-scale structure deformation measurement with camera parallel networks is reviewed. Videometrics with various types of camera networks has broad prospects for automatic, long-term, and continuous deformation measurement in engineering projects such as wind turbine blades, ships, railroad beds, and bridges.
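The relay idea behind a camera series network can be illustrated with a standard pose-composition sketch (not the authors' implementation; `rot_z` and the toy two-link chain are purely illustrative): each camera link contributes a relative rotation and translation, and chaining them yields the end-to-end pose.

```python
import numpy as np

def rot_z(theta):
    """Rotation about the z-axis by angle theta (radians)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def relay_pose(relative_poses):
    """Compose a chain of relative poses (R_i, t_i) into the
    end-to-end pose from the first camera to the last target."""
    R = np.eye(3)
    t = np.zeros(3)
    for R_i, t_i in relative_poses:
        # accumulate: x_first = R @ (R_i @ x_next + t_i) + t
        t = R @ t_i + t
        R = R @ R_i
    return R, t

# Two toy links of 30 degrees each compose to a 60-degree rotation.
chain = [(rot_z(np.pi / 6), np.array([1.0, 0.0, 0.0])),
         (rot_z(np.pi / 6), np.array([1.0, 0.0, 0.0]))]
R, t = relay_pose(chain)
```

In a real series network each (R_i, t_i) would come from calibrating one camera pair; the composition rule itself is standard rigid-body algebra.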
Extending critical dimension measurement for optical microlithography with robust SEM image contours
Dimensions of fine and complex structures printed by microlithography inside a photoresist are inspected with the help of a scanning electron microscope (SEM), which generates high-resolution, top-down greyscale digital images. The edges of the photoresist can be extracted from the images to produce contours that best exploit the relevant image information. Unfortunately, such contours are usually of poor quality and subject to edge detection errors due to the noise of the SEM image and the roughness of the photoresist. In this work, we introduce a new method for handling contours that easily accomplishes complex operations such as smoothing of contours or robust averaging of multiple contours, without the help of any reference contour layer. Our approach is to use level sets to represent any 2D contour as a 3D surface. We demonstrate that we can easily smooth complex 2D contours of critical structures of around 50 nm width. We also generated very reliable contours by averaging multiple contours of low quality. Finally, the level set method was used to locally derive confidence measures for the determination of the average contours.
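The level-set representation can be sketched as follows, assuming each contour is given as a binary shape mask: the signed distance transform turns each 2D contour into a 3D surface whose zero level set is the contour, and averaging the surfaces averages the contours (a minimal sketch, not the paper's algorithm).

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def signed_distance(mask):
    """Signed distance level-set of a binary shape mask:
    negative inside the shape, positive outside."""
    inside = distance_transform_edt(mask)
    outside = distance_transform_edt(~mask)
    return outside - inside

def average_contours(masks):
    """Average several noisy contours by averaging their signed
    distance surfaces; the zero level set of the mean surface is
    the average contour (returned as a binary mask)."""
    phi = np.mean([signed_distance(m) for m in masks], axis=0)
    return phi <= 0

# Three noisy versions of a disc; their average recovers the disc.
yy, xx = np.mgrid[:64, :64]
disc = (xx - 32) ** 2 + (yy - 32) ** 2 <= 15 ** 2
rng = np.random.default_rng(0)
noisy = []
for _ in range(3):
    m = disc.copy()
    # flip a few random pixels to simulate edge-detection errors
    idx = rng.integers(0, 64, size=(40, 2))
    m[idx[:, 0], idx[:, 1]] ^= True
    noisy.append(m)
avg = average_contours(noisy)
```

Averaging in the signed-distance domain is what makes operations like smoothing and multi-contour averaging well defined even when the individual contours are ragged.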
Towards one trillion positions
Tobias Haist, Marc Gronle, Duc Anh Bui, et al.
How accurately can positions be determined using an inexpensive imaging system? We demonstrate a system that has the potential to achieve position detection over a large measurement field (200 x 200 mm) for one million times one million 2D positions. Inexpensive telecentric imaging of the large object field is achieved using a large diffractive front element in combination with two small off-the-shelf lenses. The position measurement itself is considerably improved using a simple replication technique: the point to be measured is replicated N times and the centers of gravity of the N points are averaged. With this approach, discretization errors and camera noise are reduced by the square root of the number of points. We describe the system, discuss the error model, and show experimental results for the DOE-based telecentric imaging and the position detection sensing.
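The sqrt(N) error reduction claimed for the replication technique is easy to verify numerically (a toy 1D simulation with a hypothetical noise level, not the actual optical system):

```python
import numpy as np

rng = np.random.default_rng(42)

def centroid_error(n_points, n_trials=2000, noise=0.1):
    """Standard deviation of the averaged centre of gravity when a
    point is replicated n_points times, each replica's measured
    position carrying independent zero-mean noise."""
    est = rng.normal(0.0, noise, size=(n_trials, n_points)).mean(axis=1)
    return est.std()

e1 = centroid_error(1)
e100 = centroid_error(100)
# Averaging N replicated points reduces the error by about sqrt(N):
ratio = e1 / e100
```

With 100 replicas the ratio comes out close to 10, matching the square-root law, provided the per-replica errors are independent.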
Application of a reflectance model to the sensor planning system
This study describes a new sensor planning system for the automatic generation of scanning positions, based on a computer model of the part, for the digitization of sheet metal parts. The focus of this paper is the application of a reflectance model within this sensor planning system. The goal of the sensor planning system and of this model is to ensure fast, complete, and accurate digitization of parts for their inspection during serial-line production, especially in the automotive industry. The methodology of the sensor planning system consists of position planning, simulation of those positions for true visibility of the part elements using a reflectance model, and simulation of the positions for robot reachability. Compared to previous studies, the visual properties of the scanned part's surface can be simulated precisely. The Nayar model is used as the reflectance model. It is suitable for materials characterized by a combination of diffuse and specular reflections and uses three reflection components: diffuse, specular lobe, and specular spike. Scanning results obtained using an ATOS III Triple Scan fringe projection 3D scanner and a KUKA KR 60 HA industrial robot were compared to the simulation. The comparison, based on the correspondence of the polygon area acquired in each sensor position (in simulation and in scanning), shows that in the performed measurements the median difference between simulation and scanning is around 16%.
The application of vision measurement in aerodynamic testing combined with speckle correlation
Ding Chen, Jin-guo Zhang, Ye-hua Zhang, et al.
This paper presents the application of a visual measurement technique combined with the speckle correlation method to aerodynamic testing. Modal analysis and deformation measurement in aerodynamic testing are often very important but difficult to achieve; fortunately, the development of modern optical measurement techniques has made them possible. First, we conducted a modal analysis of an airfoil model and analyzed its deformation under certain conditions. Then the above technique was used for verification. The results of the aerodynamic test and the finite element analysis agree well. The novelty of the method lies in combining speckle correlation with model deformation measurement in aerodynamic testing. Processing the data with speckle correlation, combined with sub-pixel correlation, achieves very high precision and realizes true planar measurement. This non-contact, full-field optical metrology shows great potential in aerodynamic test applications.
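The sub-pixel correlation step mentioned above can be sketched in 1D: cross-correlate two speckle signals via the FFT and refine the integer peak with a parabolic fit (an illustrative sketch; the paper's planar processing is 2D and more elaborate).

```python
import numpy as np

def subpixel_shift_1d(f, g):
    """Estimate the shift of signal g relative to f by FFT
    cross-correlation with a parabolic sub-pixel peak fit."""
    n = len(f)
    corr = np.real(np.fft.ifft(np.fft.fft(f).conj() * np.fft.fft(g)))
    k = int(np.argmax(corr))
    # parabolic interpolation around the integer peak
    y0, y1, y2 = corr[(k - 1) % n], corr[k], corr[(k + 1) % n]
    delta = 0.5 * (y0 - y2) / (y0 - 2 * y1 + y2)
    shift = k + delta
    if shift > n / 2:          # unwrap circular shift
        shift -= n
    return shift

# A band-limited random "speckle" signal shifted by 3 samples.
rng = np.random.default_rng(1)
spec = np.fft.irfft(np.fft.rfft(rng.normal(size=256))[:20], 256)
shifted = np.roll(spec, 3)
est = subpixel_shift_1d(spec, shifted)
```

The parabolic fit is what lifts the estimate from integer-pixel to sub-pixel resolution; 2D DIC applies the same idea to image patches.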
Multispectral Inspection
Research of the fusion methods of the multispectral optoelectronic systems images
This article considers issues relating to the digital image fusion of multispectral optoelectronic systems. Image fusion methods are studied: a theoretical analysis of the methods was completed, and a mathematical simulation model of multispectral optoelectronic systems was developed. The effect of various factors on the fusion result was demonstrated by investigating this model. The paper also considers and suggests objective assessment methods for fused image quality. It describes the most widely used methods: averaging, masking-technique fusion, interlacing fusion, and fusion of image Fourier spectra. The quality of the resulting image was assessed by calculating the cross entropy, the brightness dispersion, and the excess (kurtosis) of the Fourier spectrum function. Based on the research findings, the images obtained by the masking technique, by averaging, and by the Fourier spectrum fusion methods have the highest information entropy. The best quality, in terms of brightness dispersion and the excess of the Fourier spectrum function, was demonstrated by the averaging method, which reduces the noise components of an image by smoothing its local brightness variations, thereby improving contrast.
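Two of the elements discussed above, averaging fusion and a histogram-based cross-entropy quality metric, can be sketched as follows (the toy images and bin count are illustrative assumptions):

```python
import numpy as np

def fuse_average(a, b):
    """Pixel-wise averaging fusion of two co-registered images."""
    return (np.asarray(a, float) + np.asarray(b, float)) / 2.0

def cross_entropy(src, fused, bins=64):
    """Cross entropy between the grey-level histograms of a source
    image and the fused image (lower means the fused image keeps
    more of the source's intensity statistics)."""
    p, _ = np.histogram(src, bins=bins, range=(0, 256))
    q, _ = np.histogram(fused, bins=bins, range=(0, 256))
    p = p / p.sum()
    q = q / q.sum()
    return -np.sum(p * np.log2(q + 1e-12))

# Two toy channel images occupying different grey ranges.
rng = np.random.default_rng(0)
vis = rng.integers(0, 128, size=(64, 64))
ir = rng.integers(128, 256, size=(64, 64))
fused = fuse_average(vis, ir)
```

The averaged image mixes the intensity statistics of both channels, which is exactly what the cross-entropy metric quantifies.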
A new indicator in early drought diagnosis of cucumber with chlorophyll fluorescence imaging
Heng Wang, Haifeng Li, Liang Xu, et al.
Crop population growth information can more fully reflect the state of crop growth, eliminate individual differences, and reduce errors in judgment. We have built a plant population growth information online monitoring system that uses chlorophyll fluorescence and spectral scanning imaging to obtain crop growth status. On the basis of fluorescence image detection, we have studied the early drought diagnosis of cucumber. The typical chlorophyll fluorescence parameters cannot reflect the degree of drought significantly. We therefore define a new indication parameter (DI). As drought deepens, DI declines. DI amplifies the early manifestation of cucumber drought (3-5 days) and provides a more significant indication in the early drought diagnosis of cucumber.
Optical Sorting
Spatial regularization for the unmixing of hyperspectral images
Sebastian Bauer, Florian Neumann, Fernando Puente León
For demanding sorting tasks, the acquisition and processing of color images does not provide sufficient information for successful discrimination between the different object classes that are to be sorted. An alternative to integrating three spectral regions of visible light into the three color channels is to sample the spectrum at up to several hundred evenly spaced points and acquire so-called hyperspectral images. Such images provide a complete image of the scene at each considered wavelength and contain much more information about the composition of the different materials. Hyperspectral images can also be acquired in spectral regions neighboring visible light, such as the ultraviolet (UV) and near-infrared (NIR) regions. From a mathematical point of view, it is possible to extract the spectra of the pure materials and the extent to which these spectra contribute to material mixtures. This process is called spectral unmixing. Spectral unmixing based on the most commonly used linear mixing model is a difficult task due to model ambiguities and distorting factors such as noise. Until a few years ago, the most characteristic property of hyperspectral images, namely the abundance correlation between neighboring pixels, was not used in unmixing algorithms. Only recently have researchers started to incorporate spatial information into the unmixing process, which is now known to improve the unmixing results. In this paper, we introduce two new methods and study the effect of these and two previously described methods on spectral unmixing, especially their ability to account for edges and other shapes in the abundance maps.
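Under the linear mixing model mentioned above, a pixel spectrum is a non-negative combination of endmember spectra, so per-pixel unmixing can be sketched with a non-negative least-squares solve (a minimal sketch without the spatial regularization that is the subject of the paper; the endmember matrix `E` is invented for illustration):

```python
import numpy as np
from scipy.optimize import nnls

# Endmember spectra as columns: two "pure materials", five bands.
E = np.array([[1.0, 0.1],
              [0.8, 0.2],
              [0.5, 0.5],
              [0.2, 0.8],
              [0.1, 1.0]])

def unmix_pixel(spectrum, endmembers):
    """Linear-mixing-model unmixing of one pixel spectrum with a
    non-negativity constraint; abundances are normalised to sum
    to one (the sum-to-one constraint applied a posteriori)."""
    a, _ = nnls(endmembers, spectrum)
    return a / a.sum()

true_a = np.array([0.3, 0.7])
pixel = E @ true_a            # noiseless mixed pixel
a_hat = unmix_pixel(pixel, E)
```

Spatially regularized methods extend this per-pixel solve with a term that couples the abundances of neighboring pixels.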
Identification and sorting of regular textures according to their similarity
Regardless of whether mosaics, material surfaces, or skin surfaces are inspected, their texture plays an important role. Texture is a property that is hard to describe in words but can easily be described in pictures. Furthermore, a huge amount of digital images containing a visual description of textures already exists. However, this information becomes useless if there are no appropriate methods to browse the data. In addition, depending on the given task, properties like scale, rotation, or intensity invariance are desired. In this paper we propose to analyze texture images according to their characteristic pattern. First, a classification approach is proposed to separate regular from non-regular textures. The second stage focuses on regular textures, suggesting a method to sort them according to their similarity. Different features are extracted from the texture in order to describe its scale, orientation, texel, and the texel's relative position. Depending on the desired invariance of the visual characteristics (like the texture's scale or the texel's form), the comparison of the features between images is weighted and combined to define the degree of similarity between them. Tuning the weighting parameters allows this search algorithm to be easily adapted to the requirements of the desired task. Not only can the total invariance of the desired parameters be adjusted; the weighting of the parameters may also be modified to adapt to an application-specific type of similarity. This search method has been evaluated using different textures and similarity criteria, achieving very promising results.
The influence of the design features on optical sorter effectiveness
Nikita A. Pavlenko, Valery V. Korotaev
This paper investigates factors that may have a significant negative impact on the enrichment of mineral raw materials by the optical sorting method. The studies were conducted using an experimental setup that is a prototype of an optical sorter for small mineral objects. Special software was developed in LabVIEW for controlling the experimental setup and analysing the experimental results. The impact of illumination features, as well as of the lens used, was analyzed.
Inspection, Monitoring and Detection
Investigation into the use of smartphone as a machine vision device for engineering metrology and flaw detection, with focus on drilling
Vikram Razdan, Richard Bateman
This study investigates the use of a smartphone and its camera vision capabilities in engineering metrology and flaw detection, with a view to developing a low-cost alternative to machine vision systems, which are out of reach for small-scale manufacturers. A smartphone has to provide a similar level of accuracy to machine vision devices such as smart cameras. The objective was to develop an app on an Android smartphone incorporating advanced computer vision algorithms written in Java. The app could then be used for recording measurements of twist drill bits and hole geometry and for analysing the results for accuracy. A detailed literature review was carried out for an in-depth study of machine vision systems and their capabilities, including a comparison between the HTC One X Android smartphone and the Teledyne DALSA BOA smart camera. A review of the existing metrology apps on the market was also undertaken. In addition, the drilling operation was evaluated to establish the key measurement parameters of a twist drill bit, especially flank wear and diameter. The methodology covers the software development of the Android app, including the use of image processing algorithms such as Gaussian blur, Sobel, and Canny, available from the OpenCV software library, as well as the design and development of the experimental set-up for carrying out the measurements. The results obtained from the experimental set-up were analysed for the geometry of twist drill bits and holes, including diametrical measurements and flaw detection. The results show that smartphones like the HTC One X have the processing power and the camera capability to carry out metrological tasks, although the dimensional accuracy achievable with the smartphone app is below the level provided by machine vision devices such as smart cameras.
A smartphone with mechanical attachments, capable of image processing and offering a reasonable level of accuracy in dimensional measurement, has the potential to become a handy low-cost machine vision system for small-scale manufacturers, especially in field metrology and flaw detection.
Range imaging behind semi-transparent surfaces by high-speed modulated light
D. Geerardyn, H. Ingelberts, R. Deleener, et al.
Range imaging is a measurement technique that generates an image containing the distance information from the camera to all points of a scene. This distance information can be captured by, amongst others, the Time-of-Flight principle, which measures the time a light pulse needs to travel back and forth between the camera and the scene and converts this time into a depth value. For proper operation of the Time-of-Flight principle, a high-power, fast-modulated light source is required. Currently, most 3D cameras use laser diodes or LEDs. Moreover, most systems use square-wave modulation of the light source, requiring high bandwidths of the optical driver. To enhance both bandwidth and optical power, we developed a light source consisting of 16 high-power (50 mW) laser diodes using GHz laser drivers combined with GHz buffers. Moreover, this light source can be integrated into a Time-of-Flight camera. Specifically, we designed and experimentally validated this new light source, based on ultra-fast laser diodes, allowing increased performance over current Time-of-Flight cameras. In this paper, we first discuss the development of a high-power illumination board with a large beam divergence, suitable for high-speed square-wave modulation with a chosen duty cycle. Our light source can be modulated faster than 1 GHz, which corresponds to optical pulses shorter than 500 ps. Moreover, the pulses can be shifted in time with sub-nanosecond precision. Secondly, we integrated this light source into a Time-of-Flight setup able to measure the distances of objects behind a semi-transparent surface. The resulting images are compared with the image quality of commercially available Time-of-Flight cameras. From these results, we conclude that our light source is suitable for Time-of-Flight measurements and provides a low-cost alternative for imaging purposes. Moreover, it can handle both pulsed and continuous-wave Time-of-Flight, allowing a broader range of applications.
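The basic pulsed Time-of-Flight relation behind such a setup is simple: distance is half the round-trip time multiplied by the speed of light, so a 500 ps pulse corresponds to roughly 7.5 cm of light travel (a back-of-the-envelope sketch, not the camera's actual processing chain):

```python
import numpy as np

C = 299_792_458.0  # speed of light in vacuum, m/s

def tof_distance(round_trip_s):
    """Distance corresponding to a pulsed Time-of-Flight
    round-trip delay (light travels out and back, hence 0.5)."""
    return 0.5 * C * np.asarray(round_trip_s)

# A 500 ps optical pulse spans about 7.5 cm of one-way range,
# which bounds how finely surfaces can be separated in depth.
pulse = 500e-12
resolution = tof_distance(pulse)
```

Separating returns from a semi-transparent surface and the object behind it requires resolving two such closely spaced echoes, which is why short, precisely shiftable pulses matter.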
Retroreflective microprismatic materials in image-based control applications
Mariya G. Serikova, Anton V. Pantyushin, Elena V. Gorbunova, et al.
This work addresses the accurate position measurement of reference marks made of retroreflective microprismatic materials by image-based systems. High-reflection microprismatic technology relies on tiny hermetically sealed pockets, which improve material reflectivity but result in a non-reflective preprinted netting pattern. The mark pattern to be used for measuring can simply be printed on the reflective material as an opaque area with a predefined shape. However, the non-reflecting netting pattern acts as a spatial filter that affects the resultant spatial reflectivity of the mark. When an image of the mark is taken, the desired mark shape can be deformed by the netting pattern. This deformation may prevent accurate estimation of the mark position in the image. In this paper, an experimental comparison of three image filtering approaches (median filtering, morphological closing, and filtering in the frequency domain) for minimizing the effect of the netting pattern is provided. These approaches were evaluated by processing images of the mark as it was translated in the camera's field of view, using an experimental setup comprising a camera with an LED backlight and the mark mounted on a translation stage. The experiment showed that median filtering provided better netting pattern elimination and higher accuracy of key-feature position estimation (approximately ±0.1 pix) under the conditions of the experiment. Future uses of reference marks based on microprismatic material in image-based control applications are discussed.
Object recognition in 3D point clouds with maximum likelihood estimation
A novel technique for object recognition and localization within a 3D point cloud has been developed by constructing a likelihood function for the pose vector of a known model in a measured scene. The function is based on surface features in the model and corresponding surface features detected in the scene. Using an optimization algorithm, the maximum of the function was found, corresponding to the six degree-of-freedom (DOF) pose of the model within the scene, even in the presence of significant clutter.
Automatic detection system of shaft part surface defect based on machine vision
Lixing Jiang, Kuoyuan Sun, Fulai Zhao, et al.
Surface physical damage detection is an important part of shaft part quality inspection, and the traditional detection methods mostly rely on human visual identification, which has many disadvantages such as low efficiency and poor reliability. In order to improve the automation level of shaft part quality inspection and establish the relevant industry quality standards, a machine vision inspection system connected to an MCU was designed to perform surface inspection of shaft parts. The system adopts a monochrome line-scan digital camera and uses dark-field, forward illumination to acquire high-contrast images. After image filtering and enhancement, the images were segmented into binary images using the maximum between-cluster variance (Otsu) method; the main contours were then extracted based on aspect-ratio and area criteria; next, the coordinates of the center of gravity of each defect area, i.e., the locating point coordinates, were calculated; finally, the defect areas were marked by a coding pen communicating with the MCU. Experiments show that no defect was missed and the false alarm rate was lower than 5%, demonstrating that the designed system meets the demands of on-line, real-time shaft part inspection.
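The maximum between-cluster variance segmentation step is the classic Otsu method; here is a minimal sketch on a synthetic image with a bright "defect" stripe (the test image and all parameters are invented for illustration):

```python
import numpy as np

def otsu_threshold(img, bins=256):
    """Maximum between-cluster variance (Otsu) threshold for a
    greyscale image with values in [0, 255]."""
    hist, _ = np.histogram(img, bins=bins, range=(0, 256))
    p = hist / hist.sum()
    omega = np.cumsum(p)                  # class-0 probability
    mu = np.cumsum(p * np.arange(bins))   # cumulative mean
    mu_t = mu[-1]
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_t * omega - mu) ** 2 / (omega * (1 - omega))
    sigma_b = np.nan_to_num(sigma_b)      # ignore empty classes
    return int(np.argmax(sigma_b))

# Bimodal image: dark background with a bright defect stripe.
rng = np.random.default_rng(0)
img = rng.normal(40, 5, size=(100, 100))
img[40:60, :] = rng.normal(200, 5, size=(20, 100))
img = np.clip(img, 0, 255)
t = otsu_threshold(img)
binary = img > t
```

After binarization, contours would be filtered by aspect ratio and area and their centroids reported as locating points, as described above.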
Poster Session
Modeling and analysis of the two-dimensional polystyrene aggregation process
Krzysztof Skorupski, Imre Horvath, Maria Rosaria Vetrano
Small particles tend to aggregate and create large fractal-like structures, which can be analysed using microscopy techniques. In this work we present an algorithm capable of measuring the basic morphological parameters of two-dimensional polystyrene layers. Our study was divided into two separate parts. The goal of the first was to create high-quality particle monolayers, whose purpose was to allow monitoring of the two-dimensional aggregation process by means of optical microscopy. In the next step, the microscopy images were analysed in more detail. The size distribution function and the total number of particles were calculated. When an aggregate was larger than a specified size, its fractal dimension was approximated using the box-counting technique. After retrieving the morphological parameters, fractal-like aggregate models were created using the most common tunable algorithms. Our study showed that real structures resemble geometries generated with cluster-cluster (CC) aggregation techniques. Initial clusters, i.e., those generated during the early stages of the aggregation process, are characterized by a slightly larger fractal dimension; however, this value decreases with aggregation time. The next step is to improve the algorithm further and use it in a fully automatic on-line monitoring process.
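The box-counting technique used to approximate the fractal dimension can be sketched as follows; a filled square should come out close to dimension 2 and a straight line close to 1 (a minimal sketch, not the authors' measurement pipeline):

```python
import numpy as np

def box_count_dimension(mask, sizes=(2, 4, 8, 16, 32)):
    """Box-counting estimate of the fractal dimension of a binary
    image: the slope of log N(s) versus log(1/s), where N(s) is
    the number of s-by-s boxes containing at least one set pixel."""
    counts = []
    n = mask.shape[0]
    for s in sizes:
        view = mask[: n - n % s, : n - n % s]
        blocks = view.reshape(n // s, s, -1, s).any(axis=(1, 3))
        counts.append(blocks.sum())
    slope, _ = np.polyfit(np.log(1 / np.asarray(sizes)),
                          np.log(counts), 1)
    return slope

# Sanity checks on known geometries.
n = 128
square = np.ones((n, n), dtype=bool)
line = np.zeros((n, n), dtype=bool)
line[n // 2, :] = True
d_square = box_count_dimension(square)
d_line = box_count_dimension(line)
```

Real aggregates fall between these extremes; cluster-cluster aggregates in 2D typically show non-integer dimensions below 2.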
Fast interframe transformation with local binary patterns
In this paper, we propose a background stabilization method for arbitrary camera movement. We investigate state-of-the-art algorithms for feature point detection and introduce a composite LBP descriptor to describe the feature points, together with an algorithm for matching feature points across a sequence of images. In addition, we propose an algorithm for constructing an affine transformation from the previous frame in the sequence to the new one for stabilization and image stitching tasks.
New opportunities of pegmatites enrichment by optical sorting
Aleksandr N. Chertov, Elena V. Gorbunova, Artem A. Alekhin, et al.
The paper presents the research results for pegmatites from Karelian deposits. The aim of this research was to find selective features of microcline, biotite, muscovite, quartz, and plagioclase in order to determine whether they can be separated from the original ore by the optical sorting method, which is based on color differences between the analyzed objects. Studies have shown that these minerals can be separated in three stages. In the first stage, the groups "microcline", "muscovite and biotite", and "quartz and plagioclase" are separated according to the values of the hue H and lightness L channels of the HLS color model. In the second stage, biotite and muscovite are separated from each other by the values of the hue H and saturation S channels. Finally, in the third stage, the pair "quartz - plagioclase" is separated. These two minerals are indistinguishable from each other by color, so it is proposed to separate them by the selective feature "surface structure".
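The staged color separation can be sketched as rules over HLS channels; the threshold ranges below are invented for illustration and are not the measured values from the paper:

```python
import colorsys

# Hypothetical first-stage hue/lightness ranges (illustrative only).
STAGE1 = {
    "microcline":             lambda h, l, s: 0.0 <= h < 0.10,
    "muscovite and biotite":  lambda h, l, s: 0.10 <= h < 0.40,
    "quartz and plagioclase": lambda h, l, s: h >= 0.40 or l > 0.85,
}

def classify(rgb):
    """First-stage separation of a mineral grain by its mean RGB
    colour, using the hue H and lightness L channels of HLS."""
    h, l, s = colorsys.rgb_to_hls(*rgb)
    for group, rule in STAGE1.items():
        if rule(h, l, s):
            return group
    return "reject"

grain = classify((0.9, 0.5, 0.4))  # a reddish grain
```

The second stage would apply analogous rules on H and S to split biotite from muscovite; the quartz/plagioclase pair needs the non-color "surface structure" feature instead.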
The algorithm for generation of panoramic images for omnidirectional cameras
Omnidirectional cameras are used in areas where a large field of view is important; they can give a complete 360° view along one direction. However, the distortion of omnidirectional cameras is great, which makes omnidirectional images hard to interpret. One way to view omnidirectional images in a readable form is to generate panoramic images from them; the panorama keeps the main advantage of the omnidirectional image, namely the large field of view. The algorithm for generating panoramas from omnidirectional images consists of several steps. Panoramas can be described as projections onto cylinders, spheres, cubes, or other surfaces that surround a viewing point; in practice, cylindrical, spherical, and cubic panoramas are most commonly used. In the first step, we describe the panorama's field of view by creating a virtual surface (cylinder, sphere, or cube) as a matrix of 3D points in virtual object space. Then we create a mapping table by finding the coordinates of the image points corresponding to those 3D points in the omnidirectional image, using the projection function. In the last step, we generate the panorama pixel by pixel from the original omnidirectional image using the mapping table. To find the projection function of the omnidirectional camera, we used the calibration procedure developed by Davide Scaramuzza, the Omnidirectional Camera Calibration Toolbox for Matlab. After calibration, the toolbox provides two functions that express the relation between a given pixel point and its projection onto the unit sphere. After the first run of the algorithm, we obtain the mapping table, which can then be used for real-time generation of panoramic images with minimal CPU cost.
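The mapping-table construction can be sketched as follows; a simple equidistant fisheye projection stands in for the calibrated Scaramuzza projection function, and all focal and sensor parameters are invented for illustration:

```python
import numpy as np

def fisheye_project(points):
    """Hypothetical equidistant fisheye projection mapping unit
    direction vectors (x, y, z) to pixel coordinates.  A real
    system would substitute the polynomial projection function
    obtained from the calibration toolbox."""
    x, y, z = points[..., 0], points[..., 1], points[..., 2]
    theta = np.arccos(np.clip(z, -1, 1))   # angle from optical axis
    phi = np.arctan2(y, x)
    r = 300.0 * theta / (np.pi / 2)        # 300 pixels per 90 deg
    return 320 + r * np.cos(phi), 320 + r * np.sin(phi)

def build_mapping_table(width=360, height=90):
    """Mapping table panorama(u, v) -> source (px, py): one ray per
    panorama pixel on a virtual viewing surface parameterised by
    azimuth and elevation."""
    az = np.deg2rad(np.linspace(0, 360, width, endpoint=False))
    el = np.deg2rad(np.linspace(5, 50, height))
    A, E = np.meshgrid(az, el)
    dirs = np.stack([np.cos(E) * np.cos(A),
                     np.cos(E) * np.sin(A),
                     np.sin(E)], axis=-1)
    return fisheye_project(dirs)

px, py = build_mapping_table()
```

Once `px, py` are computed, each frame is remapped with a lookup such as `pano = omni[np.round(py).astype(int), np.round(px).astype(int)]` (nearest-neighbour; bilinear in practice); the table is computed once and reused, which is what makes real-time generation cheap.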
Automatic online laser resonator alignment based on machine vision: analysis
Lizhi Dong, Wenjin Liu, Ping Yang, et al.
In order to maintain sufficient performance of a laser, proper alignment of the resonator is very important. We present online laser resonator alignment based on machine vision. In this method, a camera detects the displacement of the laser beam spot on the rear mirror from a reference location, and the displacement of this spot is used to indicate the misalignment of the cavity mirrors. The resonator could be automatically aligned using the displacement as feedback. We give a detailed analysis of the relation between tilt of resonator mirrors and the beam spot location on the rear mirror by calculating the modes of the resonator. Both a stable symmetric confocal resonator and a positive branch confocal unstable resonator are investigated. Calculation results show that the displacement of the beam spot on the rear mirror continuously grows with tilt of the resonator mirrors, and the direction of the displacement can well reflect the direction of the misalignment, indicating that the displacement can effectively denote resonator misalignment.
Control system of warehouse robots' position
Ivan A. Maruev, Eugene G. Lebedko, Anton V. Nikulin
The development of robotic vehicles has allowed their mass introduction into many spheres of activity, but a necessary condition for the functioning of such systems is often control of their movement. This paper describes an optoelectronic system that controls the spatial position of vehicles such as mobile robots. The system consists of reference marks installed on the vehicle and cameras that observe it. The paper presents a mathematical description of the system and a method of determining the coordinates of objects from their photographic projections using the cameras. A test layout was developed for testing the algorithms, with two cameras observing the movement of a vehicle mock-up realized on a Rover 5 chassis platform. A reference mark consisting of four LEDs, arranged as vertices of a cube, was fixed on the vehicle. The study found that the error does not exceed 1 mm at a distance of 2 meters.
A novel regularization method for optical flow-based head pose estimation
This paper presents a method for appearance-based 3D head pose tracking utilizing optical flow computation. The task is to recover the head pose parameters for extreme head pose angles based on 2D images. A novel method is presented that enables robust recovery of the full motion by employing a motion-dependent regularization term within the optical flow algorithm. Thereby, the rigid motion parameters are coupled directly with a regularization term in the image alignment method, affecting translation and rotation independently. The ill-conditioned, nonlinear optimization problem is stabilized by the proposed regularization term, yielding suitable conditioning of the Hessian matrix. It is shown that the regularization corresponding to the motion parameters can be extended to full 3D motion consisting of six parameters. Experiments on the Boston University head pose dataset demonstrate the enhanced robustness of the head pose estimation compared to conventional regularization methods. Using well-defined values for the regularization parameters, the proposed method shows significant improvement in head-tracking scenarios in terms of accuracy compared to existing methods.
Accurate invariant pattern recognition for perspective camera model
Mariya G. Serikova, Ekaterina N. Pantyushina, Vadim V. Zyuzin, et al.
In this work we present a pattern recognition method based on the geometry analysis of a flat pattern. The method provides reliable detection of the pattern when significant perspective deformation is present in the image. It relies on the fact that the collinearity of lines remains unchanged under perspective transformation, so the recognition feature is the presence of two lines containing four points each. The eight points form two squares for the convenience of applying corner detection algorithms. The method is suitable for automatic pattern detection in a dense environment of false objects. In this work we evaluate the proposed method's detection statistics and performance. To estimate detection quality, we simulated images with background clutter of random size and spatial frequency while adding both translational (range varied from 200 mm to 1500 mm) and rotational (up to 60°) changes to the given pattern position. The simulated measuring system included a camera (4000x4000 sensor with a 25 mm lens) and a flat pattern. Tests showed that the proposed method yields no more than 1% recognition error with up to 40 false targets.
Stereo sequences analysis for dynamic scene understanding in a driver assistance system
An improved stereo-based approach for dynamic road scene understanding in a Driver Assistance System (DAS) is presented. System calibration is addressed. Algorithms for road lane detection, road 3D model generation, obstacle pre-detection, and object (vehicle) detection are described. Lane detection is based on evidence analysis. The obstacle pre-detection procedure compares radial orthophotos obtained from the left and right stereo images. The object detection algorithm recognizes the rear of cars using histograms of oriented gradients. A Car Stereo Sequences (CSS) dataset was captured by a vehicle-based laboratory and published for testing DAS algorithms.
Image restoration using aberration taken by a Hartmann wavefront sensor on extended object, towards real-time deconvolution
In this paper we present the results of image restoration using data taken by a Hartmann sensor. The aberration is measured by a Hartmann sensor in which the object itself is used as the reference. The point spread function (PSF) is then simulated and used for image reconstruction with the Lucy-Richardson technique. A technique for quantitative evaluation of the Lucy-Richardson deconvolution is also presented.
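The Lucy-Richardson step can be sketched with the standard multiplicative update (a noiseless FFT-based sketch on a synthetic point target with a Gaussian PSF, not the sensor-derived PSF used in the paper):

```python
import numpy as np

def richardson_lucy(blurred, psf, iters=50):
    """Basic Richardson-Lucy deconvolution using FFT convolution
    (periodic boundary conditions assumed, noiseless sketch)."""
    psf = psf / psf.sum()
    otf = np.fft.rfft2(np.fft.ifftshift(psf))
    est = np.full_like(blurred, blurred.mean())
    for _ in range(iters):
        conv = np.fft.irfft2(otf * np.fft.rfft2(est), blurred.shape)
        ratio = blurred / np.maximum(conv, 1e-12)
        # multiply by the correlation of the PSF with the ratio
        est = est * np.fft.irfft2(otf.conj() * np.fft.rfft2(ratio),
                                  blurred.shape)
    return est

# Simulate a Gaussian PSF blur of a point target, then restore it.
n = 64
yy, xx = np.mgrid[:n, :n] - n // 2
psf = np.exp(-(xx ** 2 + yy ** 2) / (2 * 2.0 ** 2))
truth = np.zeros((n, n))
truth[32, 32] = 1.0
otf = np.fft.rfft2(np.fft.ifftshift(psf / psf.sum()))
blurred = np.fft.irfft2(otf * np.fft.rfft2(truth), truth.shape)
blurred = np.clip(blurred, 0, None)  # remove FFT round-off negatives
restored = richardson_lucy(blurred, psf)
```

Each iteration keeps the estimate non-negative and conserves total flux, which is one reason the algorithm is popular for restoring aberrated intensity images; the PSF here would be replaced by one simulated from the Hartmann wavefront measurement.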
A FragTrack algorithm enhancement for total occlusion management in visual object tracking
In recent years, "FragTrack" has become one of the most cited real-time algorithms for visual tracking of an object in a video sequence. However, this algorithm fails when the object model is not present in the image or is completely occluded, and in long-term video sequences. In such sequences, the target object's appearance changes considerably over time, and its comparison with the template established at the first frame is hard to compute. In this work we introduce improvements to the original FragTrack: the management of total object occlusions and the update of the object template. Basically, we use a voting map generated by a non-parametric kernel density estimation strategy that allows us to compute a probability distribution for the distances of the histograms between template and object patches. In order to automatically determine whether the target object is present in the current frame, an adaptive threshold is introduced. A Bayesian classifier establishes, frame by frame, the presence of the template object in the current frame. The template is partially updated at every frame. We tested the algorithm on well-known benchmark sequences, in which the object is always present, and on video sequences showing total occlusion of the target object, to demonstrate the effectiveness of the proposed method.