Proceedings Volume 7538

Image Processing: Machine Vision Applications III

View the digital version of this volume at the SPIE Digital Library.

Volume Details

Date Published: 28 January 2010
Contents: 9 Sessions, 32 Papers, 0 Presentations
Conference: IS&T/SPIE Electronic Imaging 2010
Volume Number: 7538

Table of Contents

  • Front Matter: Volume 7538
  • Industrial Inspection and Applications
  • Active Vision and Robotics I
  • Physical Imaging and Microscopy
  • Multispectral Imaging
  • 3D Vision and Range Imaging
  • Active Vision and Robotics II
  • Image Processing and Algorithms
  • Interactive Paper Session
Front Matter: Volume 7538
Front Matter: Volume 7538
This PDF file contains the front matter associated with SPIE Proceedings Volume 7538, including the Title Page, Copyright information, Table of Contents, and the Conference Committee listing.
Industrial Inspection and Applications
Rotating optical geometry sensor for inner pipe-surface reconstruction
Moritz Ritter, Christan W. Frey
The inspection of sewer or fresh water pipes is usually carried out by a remotely controlled inspection vehicle equipped with a high resolution camera and a lighting system. This operator-oriented approach, based on offline analysis of the recorded images, is highly subjective and prone to errors. Besides the subjective classification of pipe defects by the operator, standard closed circuit television (CCTV) technology is not suitable for detecting geometrical deformations resulting from, e.g., structural mechanical weakness of the pipe, corrosion of cast-iron material, or sedimentation. At Fraunhofer Institute of Optronics, System Technologies and Image Exploitation (IOSB) in Karlsruhe, Germany, a new Rotating Optical Geometry Sensor (ROGS) for pipe inspection has been developed which is capable of measuring the inner pipe geometry very precisely over the whole pipe length. This paper describes the developed ROGS system and the online adaptation strategy for choosing the optimal system parameters. These parameters are the rotation and traveling speed, which depend on the pipe diameter. Furthermore, a practicable calibration methodology is presented which guarantees identification of the internal sensor parameters. ROGS has been integrated in two different systems: a rod based system for small fresh water pipes and a standard inspection vehicle based system for large sewer pipes. These systems have been successfully applied to different pipe systems. With this measurement method the geometric information can be used efficiently for an objective, repeatable quality evaluation. Results and experiences in the area of fresh water pipe inspection will be presented.
Fully automatic leaf characterisation in heterogeneous environment of plant growing automation
In the last decade, we have seen a tremendous emergence of genome sequencing analysis systems. These systems are limited by the ability to phenotype numerous plants under controlled environmental conditions. To avoid this limitation, it is desirable to use an automated system designed with controlled plant growth in mind. For each experimental sequence, many parameters are subject to variation: illuminant, plant size and color, humidity, and temperature, to name a few. These parameter variations require the adjustment of classical plant detection algorithms. This paper presents an innovative and automatic imaging scheme for characterising plant leaf growth. By considering a plant growth sequence it is possible, using the color histogram sequence, to detect day color variations and then to set the algorithm parameters. The main difficulty is to take into account the automaton properties, since the plant is not photographed at exactly the same position and angle. There is also an important evolution of the plant background, such as moss, which needs to be taken into account. Ground truth experiments on several complete sequences will demonstrate the ability to identify the rosettes and to extract the plant characteristics whatever the culture conditions are.
3D vision for nuclear reactor retrofit tool docking
Jay Stavnitzky, Frederic Rivollier
This paper is a description of several vision applications utilizing close-range photogrammetric solutions to determine the positions of various components on a nuclear reactor face. The 3D position determination generated by the vision system is used to engage automated tools with the components of the nuclear reactor during a retubing retrofit of the reactor. A discussion of the vision algorithms and their performance in the system is presented. Specific challenges related to the use of a vision system in this environment will also be discussed.
Active Vision and Robotics I
Analysis of a multimodal-camera and its advantages for autonomous vehicles
Simon Hawe, Ulrich Kirchmaier, Klaus Diepold
For autonomous vehicles and robots a fast, complete, and reliable acquisition of the environment is crucial for almost every task they perform. To fulfil this, optical sensors with different spectral sensitivity are among the most important sensors as they provide very rich information about the scene. In outdoor environments, the scene dynamics are very high, arising on the one hand from object movements and self motion and on the other hand from changing lighting conditions due to varying weather conditions. These high dynamics hinder a reliable scene acquisition using conventional optical sensors as they only offer a limited sampling rate, resolution, and dynamic range. To overcome these limitations without using specialized hardware we propose an assembly of several cameras and beam-splitters which we call a multimodal-camera. The cameras take images of the same scene from slightly different viewpoints and with diverse parameters like exposure or shutter time, which are all adjustable. By combining these images and applying techniques from computer graphics, we are able to create an output by computation that covers the scene's high dynamics and can be used for a reliable scene analysis.
Fully automatic 3D digitization of unknown objects
Gabriel F. Rozenwald, Ralph Seulin, Yohan D. Fougerolle
This paper presents a complete system for 3D digitization of objects assuming no prior knowledge on its shape. The proposed methodology is applied to a digitization cell composed of a fringe projection scanner head, a robotic arm with 6 degrees of freedom (DoF), and a turntable. A two-step approach is used to automatically guide the scanning process. The first step uses the concept of Mass Vector Chains (MVC) to perform an initial scanning. The second step directs the scanner to remaining holes of the model. Post-processing of the data is also addressed. Tests with real objects were performed and results of digitization length in time and number of views are provided along with estimated surface coverage.
Automatic trajectory clustering for generating ground truth data sets
Julia Moehrmann, Gunther Heidemann
We present a novel approach towards the creation of vision based recognition tasks. A lot of domain specific recognition systems have been presented in the past which make use of the large amounts of available video data. The creation of ground truth data sets for the training of these systems remains difficult and tiresome. We present a system which automatically creates clusters of 2D trajectories. The results of this clustering can then be used to perform the actual labeling of the data, or rather the selection of events or features of interest by the user. The selected clusters can be used as positive training data for a user defined recognition task, without the need to adapt the system. The proposed technique reduces the necessary user interaction and allows the creation of application independent ground truth data sets with minimal effort. In order to achieve the automatic clustering we have developed a distance metric based on the Hidden Markov Model representations of three sequences (movement, speed and orientation) derived from the initial trajectory. The proposed system yields promising results and could prove to be an important step towards mining very large data sets.
VisNAV 100: a robust, compact imaging sensor for enabling autonomous air-to-air refueling of aircraft and unmanned aerial vehicles
Anup Katake, Heeyoul Choi
To enable autonomous air-to-air refueling of manned and unmanned vehicles, a robust high speed relative navigation sensor capable of providing high accuracy 3DOF information in diverse operating conditions is required. To help address this problem, StarVision Technologies Inc. has been developing a compact, high update rate (100Hz), wide field-of-view (90deg) direction and range estimation imaging sensor called VisNAV 100. The sensor is fully autonomous, requiring no communication from the tanker aircraft, and contains high reliability embedded avionics to provide range, azimuth, elevation (a 3 degrees of freedom, 3DOF, solution) and closing speed relative to the tanker aircraft. The sensor is capable of providing 3DOF with an error of 1% in range and 0.1deg in azimuth/elevation up to a range of 30m and 1 deg error in direction for ranges up to 200m at 100Hz update rates. In this paper we will discuss the algorithms that were developed in-house to enable robust beacon pattern detection, outlier rejection and 3DOF estimation in adverse conditions and present the results of several outdoor tests. Results from the long range single beacon detection tests will also be discussed.
Physical Imaging and Microscopy
Comparison of the ability of quantitative parameters to differentiate surface texture of Atomic Force Microscope (AFM) images
Bethany Niedzielski, Christine Caragianis Broadbridge, John S. DaPonte, et al.
The purpose of this study was to compare the ability of several texture analysis parameters to differentiate textured samples from a smooth control on images obtained with an Atomic Force Microscope (AFM). Surface roughness plays a major role in the realm of material science, especially in integrated electronic devices. As these devices become smaller and smaller, new materials with better electrical properties are needed. New materials with smoother surface morphology have been found to have electrical properties superior to those of their rougher counterparts. Therefore, in many cases surface texture is indicative of the electrical properties that material will have. Physical vapor deposition techniques such as Jet Vapor Deposition and Molecular Beam Epitaxy are being utilized to synthesize these materials as they have been found to create pure and uniform thin layers. For the current study, growth parameters were varied to produce a spectrum of textured samples. The focus of this study was the image processing techniques associated with quantifying surface texture. As a result of the limited sample size, there was no attempt to draw conclusions about specimen processing methods. The samples were imaged using an AFM in tapping mode. In the process of collecting images, it was discovered that roughness data was much better depicted in the microscope's "height" mode as opposed to "equal area" mode. The AFM quantified the surface texture of each image by returning RMS roughness and the first order histogram statistics of mean roughness, standard deviation, skewness, and kurtosis. Color images from the AFM were then processed on an off line computer running NIH ImageJ with an image texture plug in. This plug in produced another set of first order statistics computed from each image's histogram as well as second order statistics computed from each image's cooccurrence matrix.
The second order statistics, which were originally proposed by Haralick, include contrast, angular second moment, correlation, inverse difference moment, and entropy. These features were computed in the 0°, 45°, 90°, and 135° directions. The findings of this study propose that the best combination of quantitative texture parameters is standard deviation, 0° inverse difference moment, and 0° entropy, all of which are obtained from the NIH ImageJ texture plug in.
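As a rough illustration of the second order statistics named in this abstract, the sketch below builds a normalized co-occurrence matrix in the 0° direction (each pixel paired with its right-hand neighbor) and computes contrast, angular second moment, inverse difference moment, and entropy from it. It is a minimal sketch rather than the ImageJ plug in used in the study: the function names, the non-symmetric matrix, the base-2 logarithm, and the two toy images are all assumptions made for the example, and correlation is omitted for brevity.

```python
import math
from collections import Counter


def glcm(image, dx=1, dy=0):
    """Normalized co-occurrence matrix as {(i, j): probability}.

    The default offset (dx=1, dy=0) pairs each pixel with its right-hand
    neighbor, i.e. the 0 degree direction. Assumption: non-symmetric counts.
    """
    counts = Counter()
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[(image[r][c], image[r2][c2])] += 1
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}


def texture_features(p):
    """Four Haralick-style features from a normalized co-occurrence matrix."""
    contrast = sum((i - j) ** 2 * v for (i, j), v in p.items())
    asm = sum(v * v for v in p.values())  # angular second moment
    idm = sum(v / (1 + (i - j) ** 2) for (i, j), v in p.items())
    entropy = -sum(v * math.log2(v) for v in p.values() if v > 0)
    return {"contrast": contrast, "asm": asm, "idm": idm, "entropy": entropy}


# Hypothetical 4x4 grey-level images: a flat surface and a checkerboard.
smooth = [[3, 3, 3, 3] for _ in range(4)]
rough = [[0, 3, 0, 3], [3, 0, 3, 0], [0, 3, 0, 3], [3, 0, 3, 0]]

smooth_features = texture_features(glcm(smooth))
rough_features = texture_features(glcm(rough))
```

On these toy inputs the flat image has zero contrast and entropy, while the checkerboard has high contrast and a low inverse difference moment, which is the kind of separation between smooth and textured samples the abstract describes.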
Conformity of valuable spikes by ombroscopic imaging
This paper describes a methodology for thin spikes characterization. Nowadays, its evaluation is performed by visual control. We propose a method to measure these spikes at a micrometric scale by using ombroscopic image processing. A spike needs to be mainly conic and its tip must be ogival. The first aspect is evaluated by comparing the spike with an ideal cone based on spike's contour. To find lines supported by contours, we use the Radon transform. However, due to irregular contour, we develop an improvement of this transform based on morphological operators. This way, real segments are found and a correct estimation of an ideal cone can be done. The second aspect is controlled by measuring the radius of the tip which gives both sharpness and regularity of the tip. As the following of the curvature is problematic, we use a morphological skeleton on the contour to obtain a structure similar to a Y. The intersection of these three branches leads to a correct estimation of the circular gauge. An additional filling criterion validates the result. This study is successful as the production is correctly classified and precise measures were obtained both in terms of global characteristics and sharpness.
Segmentation of thermographic images of hands using a genetic algorithm
Payel Ghosh, Melanie Mitchell, Judith Gold
This paper presents a new technique for segmenting thermographic images using a genetic algorithm (GA). The individuals of the GA also known as chromosomes consist of a sequence of parameters of a level set function. Each chromosome represents a unique segmenting contour. An initial population of segmenting contours is generated based on the learned variation of the level set parameters from training images. Each segmenting contour (an individual) is evaluated for its fitness based on the texture of the region it encloses. The fittest individuals are allowed to propagate to future generations of the GA run using selection, crossover and mutation. The dataset consists of thermographic images of hands of patients suffering from upper extremity musculo-skeletal disorders (UEMSD). Thermographic images are acquired to study the skin temperature as a surrogate for the amount of blood flow in the hands of these patients. Since entire hands are not visible on these images, segmentation of the outline of the hands on these images is typically performed by a human. In this paper several different methods have been tried for segmenting thermographic images: Gabor-wavelet-based texture segmentation method, the level set method of segmentation and our GA which we termed LSGA because it combines level sets with genetic algorithms. The results show a comparative evaluation of the segmentation performed by all the methods. We conclude that LSGA successfully segments entire hands on images in which hands are only partially visible.
A system architecture for online data interpretation and reduction in fluorescence microscopy
Thorsten Röder, Matthias Geisbauer, Yang Chen, et al.
In this paper we present a high-throughput sample screening system that enables real-time data analysis and reduction for live cell analysis using fluorescence microscopy. We propose a novel system architecture capable of analyzing a large amount of samples during the experiment and thus greatly minimizing the post-analysis phase that is the common practice today. By utilizing data reduction algorithms, relevant information of the target cells is extracted from the online collected data stream, and then used to adjust the experiment parameters in real-time, allowing the system to dynamically react on changing sample properties and to control the microscope setup accordingly. The proposed system consists of an integrated DSP-FPGA hybrid solution to ensure the required real-time constraints, to execute efficiently the underlying computer vision algorithms and to close the perception-action loop. We demonstrate our approach by addressing the selective imaging of cells with a particular combination of markers. With this novel closed-loop system the amount of superfluous collected data is minimized, while at the same time the information entropy increases.
Multispectral Imaging
Motion estimation accuracy for visible-light/gamma-ray imaging fusion for portable portal monitoring
Thomas P. Karnowski, Mark F. Cunningham, James S Goddard, et al.
The use of radiation sensors as portal monitors is increasing due to heightened concerns over the smuggling of fissile material. Portable systems that can detect significant quantities of fissile material that might be present in vehicular traffic are of particular interest. We have constructed a prototype rapid-deployment gamma-ray imaging portal monitor that uses machine vision and gamma-ray imaging to monitor multiple lanes of traffic. Vehicles are detected and tracked by using point detection and optical flow methods as implemented in the OpenCV software library. Points are clustered together, but imperfections in the detected points and tracks cause errors in the vehicle position estimates. The resulting errors cause a "blurring" effect in the gamma image of the vehicle. To minimize these errors, we have compared a variety of motion estimation techniques including an estimate using the median of the clustered points, a "best-track" filtering algorithm, and a constant velocity motion estimation model. The accuracy of these methods is contrasted and compared to a manually verified ground-truth measurement by quantifying the root-mean-square differences in the times the vehicles cross the gamma-ray image pixel boundaries.
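Two of the quantities mentioned in this abstract are simple enough to sketch: a position estimate taken as the median of the clustered points, and a root-mean-square difference between estimated and ground-truth crossing times. The code below is an illustrative sketch only; the function names and sample numbers are assumptions, and the "best-track" filter and constant velocity model from the paper are not reproduced.

```python
import math
from statistics import median


def median_position(points):
    """Per-axis median of clustered (x, y) points, robust to stray points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (median(xs), median(ys))


def rms_difference(estimated, ground_truth):
    """Root-mean-square difference between paired measurements."""
    if len(estimated) != len(ground_truth):
        raise ValueError("sequences must have equal length")
    squared = [(e - g) ** 2 for e, g in zip(estimated, ground_truth)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical tracked points on one vehicle; the last one is a bad track.
cluster = [(10.0, 5.0), (11.0, 5.5), (10.5, 4.5), (40.0, 20.0)]
position = median_position(cluster)

# Hypothetical pixel-boundary crossing times in seconds.
error = rms_difference([1.0, 2.0, 3.5], [1.0, 2.5, 3.0])
```

The median keeps the estimate near the three consistent points even though one track is far off, which is why it is a natural baseline when detected points are imperfect.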
A novel region-based approach for the fusion of combined stereo and spectral series
I. Gheta, S. Höfer, M. Heizmann, et al.
This contribution proposes a novel approach for image fusion of combined stereo and spectral series acquired simultaneously with a camera array. To this purpose, nine cameras are equipped with spectral filters (50 nm spectral bandwidth) such that the visible and near infrared parts of the spectrum (400-900 nm) are observed. The resulting image series is fused in order to obtain two types of information: the 3D shape of the scene and its spectral properties. For the registration of the images, a novel region based registration approach which evaluates the gray value invariant features (e.g. edges) of regions in segmented images is proposed. The registration problem is formulated by means of energy functionals. The data term of our functional compares features of a region in one image with features of an area in another image, such that an additional independency of the form and size of the regions in the segmented images is obtained. As regularization, a smoothness term is proposed, which models the fact that disparity discontinuities should only occur at edges in the images. In order to minimize the energy functional, we use graph cuts. The minimization is carried out simultaneously over all image pairs in the series. Even though the approach is region based, a label (e.g. disparity) is assigned to each pixel. The result of the minimization approach consists of a disparity map. By means of calibration, we use the disparity map to compute a depth map. Once pixel depths are determined, the images can be warped to a common view, such that a pure spectral series is obtained. This can be used to classify different materials of the objects in the scene based on real spectral information, which cannot be acquired with a common RGB camera.
3D Vision and Range Imaging
Multiple range imaging camera operation with minimal performance impact
Time-of-flight range imaging cameras operate by illuminating a scene with amplitude modulated light and measuring the phase shift of the modulation envelope between the emitted and reflected light. Object distance can then be calculated from this phase measurement. This approach does not work in multiple camera environments as the measured phase is corrupted by the illumination from other cameras. To minimize inaccuracies in multiple camera environments, replacing the traditional cyclic modulation with pseudo-noise amplitude modulation has been previously demonstrated. However, this technique effectively reduced the modulation frequency, therefore decreasing the distance measurement precision (which has a proportional relationship with the modulation frequency). A new modulation scheme using maximum length pseudo-random sequences binary phase encoded onto the existing cyclic amplitude modulation, is presented. The effective modulation frequency therefore remains unchanged, providing range measurements with high precision. The effectiveness of the new modulation scheme was verified using a custom time-of-flight camera based on the PMD19-K2 range imaging sensor. The new pseudo-noise modulation has no significant performance decrease in a single camera environment. In a two camera environment, the precision is only reduced by the increased photon shot noise from the second illumination source.
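The distance calculation mentioned at the start of this abstract follows from the round trip of the light: a phase shift of φ radians at modulation frequency f corresponds to a distance d = c·φ / (4π·f). The snippet below is a minimal sketch of that standard relation; the function name and the 30 MHz modulation frequency are assumptions chosen for illustration, not values from the paper.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def phase_to_distance(phase_rad, modulation_hz):
    """Distance implied by the phase shift of the modulation envelope.

    The factor 4*pi (rather than 2*pi) accounts for the light travelling
    to the object and back.
    """
    return SPEED_OF_LIGHT * phase_rad / (4.0 * math.pi * modulation_hz)


# Assumed example: a quarter-cycle phase shift at 30 MHz modulation.
distance = phase_to_distance(math.pi / 2, 30e6)
```

With these assumed values the result is roughly 1.25 m.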
Calibration and control of a robot arm using a range imaging camera
Cameron B. D. Kelly, Adrian A. Dorrington, Michael J. Cree, et al.
Time of flight range imaging is an emerging technology that has numerous applications in machine vision. In this paper we cover the use of a commercial time of flight range imaging camera for calibrating a robotic arm. We do this by identifying retro-reflective targets attached to the arm, and centroiding on calibrated spatial data, which allows precise measurement of three dimensional target locations. The robotic arm is an inexpensive model that does not have positional feedback, so a series of movements is performed to calibrate the servo signals to the physical position of the arm. The calibration showed a good linear response between the control signal and servo angles. The calibration procedure also provided a transformation between the camera and arm coordinate systems. Inverse kinematic control was then used to position the arm. The range camera could also be used to identify objects in the scene. With the object location now known in the arm's coordinate system (transformed from the camera's coordinate system), the arm was able to move into position and grasp the object.
Resolving depth-measurement ambiguity with commercially available range imaging cameras
Shane H. McClure, Michael J. Cree, Adrian A. Dorrington, et al.
Time-of-flight range imaging is typically performed with the amplitude modulated continuous wave method. This involves illuminating a scene with amplitude modulated light. Reflected light from the scene is received by the sensor with the range to the scene encoded as a phase delay of the modulation envelope. Due to the cyclic nature of phase, an ambiguity in the measured range occurs every half wavelength in distance, thereby limiting the maximum useable range of the camera. This paper proposes a procedure to resolve depth ambiguity using software post processing. First, the range data is processed to segment the scene into separate objects. The average intensity of each object can then be used to determine which pixels are beyond the non-ambiguous range. The results demonstrate that depth ambiguity can be resolved for various scenes using only the available depth and intensity information. This proposed method reduces the sensitivity to objects with very high and very low reflectance, normally a key problem with basic threshold approaches. This approach is very flexible as it can be used with any range imaging camera. Furthermore, capture time is not extended, keeping the artifacts caused by moving objects at a minimum. This makes it suitable for applications such as robot vision where the camera may be moving during captures. The key limitation of the method is its inability to distinguish between two overlapping objects that are separated by a distance of exactly one non-ambiguous range. Overall the reliability of this method is higher than the basic threshold approach, but not as high as the multiple frequency method of resolving ambiguity.
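The ambiguity described in this abstract can be made concrete: with modulation frequency f, the measured range wraps every c / (2f), half the modulation wavelength, so a measurement only fixes the true range up to a whole number of wraps. The sketch below shows the wrapping and lists the candidate true ranges. Choosing among the candidates is the job of the segmentation and intensity analysis proposed in the paper, which is not reproduced here; the function names and the 30 MHz frequency are assumptions.

```python
SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def unambiguous_range(modulation_hz):
    """Largest range measurable without wrap-around: half a wavelength."""
    return SPEED_OF_LIGHT / (2.0 * modulation_hz)


def measured_range(true_range, modulation_hz):
    """Range the camera reports once the cyclic phase has wrapped."""
    return true_range % unambiguous_range(modulation_hz)


def candidate_ranges(measured, modulation_hz, max_wraps=2):
    """All true ranges consistent with a measurement, up to max_wraps wraps."""
    step = unambiguous_range(modulation_hz)
    return [measured + k * step for k in range(max_wraps + 1)]


# Assumed example: an object 7 m away, camera modulated at 30 MHz.
limit = unambiguous_range(30e6)
reported = measured_range(7.0, 30e6)
candidates = candidate_ranges(reported, 30e6)
```

Here the limit is about 5 m, so the 7 m object is reported at about 2 m, and the true range reappears as the second candidate.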
A novel 3D reconstruction approach by dynamic (de)focused light
Intuon Lertrusdachakul, Yohan D. Fougerolle, Olivier Laligant
In this paper, we propose a novel active 3D recovery method based on dynamic (de)focused light. The method combines both depth from focus (DFF) and depth from defocus (DFD) techniques. With this approach, an optimized illumination pattern is projected on the object in order to enforce a strong dominant texture on the surface. The imaging system is specifically constructed to keep the whole object sharp in all captured images. Consequently, only the projected patterns experience defocus deformation according to the object depth. Projected light pattern images are acquired within certain focus ranges, similar to the DFF approach, while the focus measures across these images are calculated for depth estimation in the manner of DFD. This guarantees that at least one focused or near-focus image within the depth of field exists in the computation. Therefore, the final reconstruction is expected to be superior to the one obtained from DFD and also less computationally expensive than DFF.
Active Vision and Robotics II
1000-fps real-time optical flow detection system
Idaku Ishii, Taku Taniguchi, Kenkichi Yamamoto, et al.
Real-time optical flow detection at 1000 fps was realized by implementing an improved optical flow detection algorithm as hardware logic on a high-speed vision platform. The improved gradient-based algorithm, which is based on the Lucas-Kanade algorithm, can select a pseudo variable frame rate adaptively according to the amplitude of optical flow to estimate the accurate optical flow for objects moving at high speeds and low speeds in the same scene. The high-speed vision platform on which the optical flow detection algorithm is implemented can be used to calculate optical flow at 1000 fps for images of 1024 x 1024 pixels; by considering real scenarios such as rapid human motion, the performance of our developed optical flow detection algorithm and system was verified.
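For readers unfamiliar with the Lucas-Kanade algorithm that this system builds on, the sketch below estimates a single flow vector for one small window by solving the 2x2 least-squares system formed from the spatial and temporal image gradients. It is a basic textbook formulation in plain Python, not the authors' improved algorithm: the pseudo variable frame rate selection and the hardware implementation are not modeled, and the synthetic frames and function names are assumptions.

```python
def lucas_kanade_window(frame0, frame1):
    """Estimate one (u, v) flow vector over the interior of two small frames.

    Solves  [sxx sxy] [u]   [-sxt]
            [sxy syy] [v] = [-syt]
    where the sums run over products of image gradients.
    """
    rows, cols = len(frame0), len(frame0[0])
    sxx = sxy = syy = sxt = syt = 0.0
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            ix = (frame0[y][x + 1] - frame0[y][x - 1]) / 2.0  # d/dx
            iy = (frame0[y + 1][x] - frame0[y - 1][x]) / 2.0  # d/dy
            it = frame1[y][x] - frame0[y][x]                  # d/dt
            sxx += ix * ix
            sxy += ix * iy
            syy += iy * iy
            sxt += ix * it
            syt += iy * it
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        raise ValueError("gradient matrix is singular (aperture problem)")
    u = (-syy * sxt + sxy * syt) / det
    v = (sxy * sxt - sxx * syt) / det
    return u, v


def make_frame(shift_x, shift_y, size=7):
    """Synthetic smooth image: a paraboloid centred at (shift_x, shift_y)."""
    return [[(x - shift_x) ** 2 + (y - shift_y) ** 2 for x in range(size)]
            for y in range(size)]


frame_a = make_frame(0.0, 0.0)
frame_b = make_frame(0.2, -0.1)  # pattern moved by (0.2, -0.1) pixels
u, v = lucas_kanade_window(frame_a, frame_b)
```

The estimate is close to the true sub-pixel shift but not exact, because the method linearizes the image. That linearization degrades as displacements grow, which is one reason the choice of frame interval matters for objects moving at very different speeds.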
Motion based situation recognition in group meetings
Julia Moehrmann, Xin Wang, Gunther Heidemann
We present an unobtrusive vision based system for the recognition of situations in group meetings. The system uses a three-stage architecture, consisting of one video processing stage and two classification stages. The video processing stage detects motion in the videos and extracts up to 12 features from this data. The classification stage uses Hidden Markov Models to first identify the activity of every participant in the meeting and afterwards recognize the situation as a whole. The feature extraction uses position information of both hands and the face to extract motion features like speed, acceleration and motion frequency, as well as distance based features. We investigate the discriminative ability of these features and their applicability to the task of interaction recognition. A two-stage Hidden Markov Model classifier is applied to perform the recognition task. The developed system classifies the situation in 94% of all frames in our video test set correctly, where 3% of the test data is misclassified due to contradictory behavior of the participants. The results show that unimodal data can be sufficient to recognize complex situations.
Motion detection with level set-based segmentation
Suk-ho Lee, Nam-seok Choi, Moon Gi Kang
In this paper, we propose a level set based object detection method for video surveillance which provides for a robust and real-time working object detection under various global illumination conditions. The proposed scheme needs no manual parameter settings for different illumination conditions, which makes the algorithm applicable to automatic surveillance systems. Two special filters are designed to eliminate the spurious object regions that occur due to the CCD noise, making the scheme stable even in very low illumination conditions. We demonstrate the effectiveness of the proposed algorithm experimentally with different illumination conditions, change of contrast, and noise level.
Ego-translation estimation from one straight edge in constructed scenes
The task of recovering the camera motion relative to the environment (ego-motion estimation) is fundamental to many computer vision applications and this field has witnessed a wide range of approaches to this problem. Usual approaches are based on point or line correspondences, optical flow or the so-called direct methods. We present an algorithm for determining 3D motion and structure from one line correspondence between two perspective images. Classical methods which use supporting lines need at least three images. In this work, however, we show that only one supporting line correspondence belonging to a planar surface in space is enough to estimate the camera ego-translation, provided the texture on the surface close to the line is sufficiently discriminative. It is not necessary that the two matched line segments contain the projection of a common part of the corresponding line segment in space. We first recover the camera rotation by matching vanishing points, based on methods that already exist in the literature, and then recover the camera translation. Experimental results on both synthetic and real images demonstrate the functionality of the proposed method.
Image Processing and Algorithms
A line detection and description algorithm based on swarm intelligence
Ulrich Kirchmaier, Simon Hawe, Klaus Diepold
In this work, we use the principles of Swarm Intelligence to establish a novel algorithm for detecting and describing straight edges in images. The algorithm uses a set of individual mobile agents with limited cognitive possibilities. Using their memory and communication abilities, the agents can establish fast and robust solutions. The agents initially move randomly in a two dimensional space defined by an arbitrary input image or image sequence. In every time step, each agent calculates the derivative values in x and y direction at its current position and subsequently thresholds these values. If an agent discovers an edge, or more specifically a straight edge, it follows this straight edge and stores its start point. When it reaches the straight edge's end, it marks its last position as its stop point. As a kind of indirect communication between the agents, each of them leaves important information at each new position discovered. Thus each agent can benefit from the calculations any other agent has done before, which speeds up the algorithm. This new approach is a fast alternative to classical line finding operations such as the Hough Transform.
A hybrid and adaptive segmentation method using color and texture information
C. Meurie, Y. Ruichek, A. Cohen, et al.
This paper presents a new image segmentation method based on the combination of texture and color information. The method first computes the morphological color and texture gradients. The color gradient is analyzed taking into account the different color spaces. The texture gradient is computed using the luminance component of the HSL color space. The texture gradient procedure is achieved using a morphological filter and a granulometric and local energy analysis. To overcome the limitations of a linear/barycentric combination, the two morphological gradients are then mixed using a gradient component fusion strategy (to fuse the three components of the color gradient and the unique component of the texture gradient) and an adaptive technique to choose the weighting coefficients. The segmentation process is finally performed by applying the watershed technique using different types of germ images. The segmentation method is evaluated in different object classification applications using the k-means algorithm. The obtained results are compared with other known segmentation methods. The evaluation analysis shows that the proposed method gives better results, especially with hard image acquisition conditions.
Hierarchical feature extraction and object recognition based on biologically inspired filters
Pankaj Mishra, B. Keith Jenkins
A key to solving the multiclass object recognition problem is to extract a set of features which accurately and uniquely capture the salient characteristics of different objects. In this work we modify a hierarchical model of the visual cortex that is based on the HMAX model. The first layer of the HMAX model convolves the image with a set of multi-scale, multi-oriented and localized filters, which in our case are learnt from thousands of image patches randomly extracted from natural stimuli. These filters emerge as a result of optimization based in part on approximate-L1-norm sparseness maximization. A key difference between these filters and the standard Gabor filters used in the HMAX model is that these filters are adapted to natural stimuli, and hence are more biologically plausible. Based on the modified model we extract a flexible set of features which are largely scale, translation and rotation invariant. This model is applied to extract features from the Caltech-5 and Caltech-101 datasets, which are then fed to a support vector machine classifier for the object recognition task. The overall performance successfully demonstrates the plausibility of using filters learned from natural stimuli for feature extraction in object recognition problems.
Blurred face recognition algorithm guided by a no-reference blur metric
Cécile Fiche, Patricia Ladret, Ngoc-Son Vu
The performance of face recognition systems drops drastically when blur is present in facial images. In this paper, we propose a new approach for blurred face recognition. Our method is based on a measure of the level of blur introduced in the image using a no-reference blur metric. The face recognition process can be performed with any facial feature descriptor to allow the combination of alternative methods for overcoming data acquisition problems introduced in an image. To assess its efficiency, the approach has been applied with Gabor wavelets, Local Binary Patterns (LBP) and Local Phase Quantization (LPQ) facial descriptors on the FERET data-set. Experimental results clearly show the strength of this method at overcoming the problem caused by various forms of blur, whichever facial feature descriptor is used.
2x1D image registration and comparison
Geng Zheng, Elisa H. Barney Smith, Nader Rafla, et al.
This paper presents a novel 2×1D phase correlation based image registration method for verification of printer emulator output. The method combines the basic phase correlation technique and a modified 2×1D version of it to achieve both high speed and high accuracy. The proposed method has been implemented and tested using images generated by printer emulators. Over 97% of the image pairs were registered correctly, accurately dealing with diverse images with large translations and image cropping.
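As a rough illustration of the underlying idea, the sketch below runs basic 1D phase correlation on the row and column projections of two images to recover a translation. Reading "2×1D" as "one 1D correlation per axis on projections" is an assumption made here, not a description of the authors' method; the function names are hypothetical, shifts are treated as circular, and the naive DFT is used only to keep the example dependency-free (an FFT library would be the practical choice).

```python
import cmath


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform of a sequence."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
           for k in range(n)]
    return [v / n for v in out] if inverse else out


def phase_correlate_1d(a, b):
    """Return the circular shift s for which b[t] is closest to a[(t - s) % n]."""
    fa, fb = dft(a), dft(b)
    cross = []
    for u, v in zip(fa, fb):
        p = v * u.conjugate()          # cross-power spectrum
        mag = abs(p)
        # Keep only the phase; drop bins that carry no energy.
        cross.append(p / mag if mag > 1e-12 else 0)
    corr = [c.real for c in dft(cross, inverse=True)]
    return max(range(len(corr)), key=corr.__getitem__)


def register_2x1d(img_a, img_b):
    """Estimate (dy, dx) from the row and column sum profiles of two images."""
    rows_a = [sum(r) for r in img_a]
    rows_b = [sum(r) for r in img_b]
    cols_a = [sum(c) for c in zip(*img_a)]
    cols_b = [sum(c) for c in zip(*img_b)]
    return phase_correlate_1d(rows_a, rows_b), phase_correlate_1d(cols_a, cols_b)
```

Working on two 1D profiles instead of a full 2D spectrum reduces the amount of transform work, which is consistent with the speed motivation stated in the abstract; the shifts returned here lie in the range 0 to n-1 and would need unwrapping for negative translations.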
Interactive Paper Session
Layer separation for material discrimination cargo imaging system
Kenneth Fu, Dale Ranta, Pankaj Das, et al.
We propose an approach to boost the accuracy of a high-energy x-ray material discrimination imaging system. The theory of using two energies of x-rays to scan objects to extract the atomic information has been well developed. Such an approach is known as dual-energy imaging. At the beginning of this century, mega-volt-level dual-energy systems began to be applied to extract information regarding the materials inside a cargo container. For a system that scans at two x-ray energies, the ratio between the attenuations of the two energies will be different for different materials. Using this property, we can classify the content of a cargo container from the attenuation ratio image. However, thick shielding can reduce the signal-to-noise ratio such that correct material identification with a low false alarm rate is infeasible without further image processing. We have developed a method for high atomic number discrimination that can more accurately identify a region of high atomic number. The pixels of each object are clustered using our proposed clustering approach. The thickness and the ratio of high- and low-energy attenuations of each object can then be more correctly calculated by separating it from its background. Our method can significantly improve the accuracy by suppressing false alarms and increasing the detection rate.
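The attenuation-ratio property mentioned in this abstract can be illustrated with a small sketch. It assumes an idealized monoenergetic Beer-Lambert model, under which the ratio depends on the material but not on its thickness; the function names and the attenuation coefficients are hypothetical values chosen for illustration, not data from the paper.

```python
import math


def transmitted(i0, mu, thickness):
    """Beer-Lambert law: intensity left after passing through the material."""
    return i0 * math.exp(-mu * thickness)


def attenuation_ratio(i_high, i_low, i0_high=1.0, i0_low=1.0):
    """Ratio of the high-energy attenuation to the low-energy attenuation."""
    att_high = -math.log(i_high / i0_high)
    att_low = -math.log(i_low / i0_low)
    return att_high / att_low


# Hypothetical attenuation coefficients of one material at the two energies.
MU_HIGH, MU_LOW = 0.3, 0.5

for thickness in (2.0, 10.0):
    r = attenuation_ratio(transmitted(1.0, MU_HIGH, thickness),
                          transmitted(1.0, MU_LOW, thickness))
    print(f"thickness {thickness}: ratio {r:.3f}")
```

In this idealized model both thicknesses give the same ratio, MU_HIGH / MU_LOW. Real systems use broad x-ray spectra and noisy measurements, which is where the shielding and signal-to-noise problems described in the abstract arise.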
The application of wavelet denoising in material discrimination system
Kenneth Fu, Dale Ranta, Clark Guest, et al.
Recently, it has become desirable for cargo inspection imaging systems to provide a material discrimination function. This is done by scanning the cargo container with x-rays at two different energy levels. The ratio of attenuations of the two energy scans can provide information on the composition of the material. However, with the statistical error from noise, the accuracy of such systems can be low. Because the moving source emits two energies of x-rays alternately, images from the two scans will not be identical. That means edges of objects in the two images are not perfectly aligned. Moreover, digitization creates blurry-edge artifacts. Different energy x-rays produce different edge spread functions. Those combined effects contribute to a source of false classification, namely the "edge effect." Other types of false classification are caused by noise, mainly Poisson noise associated with photons. The Poisson noise in x-ray images can be dealt with using either a Wiener filter or a wavelet shrinkage denoising approach. In this paper, we propose a method that uses the wavelet shrinkage denoising approach to enhance the performance of the material identification system. Test results show that this wavelet-based approach improves object detection and eliminates false positives due to the edge effect.
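Wavelet shrinkage in its simplest form transforms the signal, shrinks the small detail coefficients toward zero, and transforms back. The sketch below shows this on a 1D signal of even length; the single-level Haar wavelet, the soft-threshold rule, the fixed threshold, and the function names are assumptions for illustration, since the abstract does not specify the wavelet or the threshold selection used in the paper.

```python
import math


def haar_forward(x):
    """One level of the orthonormal Haar transform (len(x) must be even)."""
    s = math.sqrt(2)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Invert haar_forward."""
    s = math.sqrt(2)
    out = []
    for a, d in zip(approx, detail):
        out.extend(((a + d) / s, (a - d) / s))
    return out


def soft_threshold(v, t):
    """Shrink v toward zero by t; values with magnitude below t become zero."""
    return math.copysign(max(abs(v) - t, 0.0), v)


def denoise(x, t):
    """Shrink the detail coefficients and reconstruct the signal."""
    approx, detail = haar_forward(x)
    return haar_inverse(approx, [soft_threshold(d, t) for d in detail])
```

With a threshold larger than the noise-induced detail coefficients, each pair of neighbouring samples is replaced by its mean, while a large step between pairs (an object edge) is left in the approximation coefficients untouched.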
A practical DCT based blind image watermarking scheme for print-and-scan process
Jianming Jin, Huiman Hou, Yuhong Xiong
Blind image watermarking technologies allow information to be embedded in common digital images and then recovered from the watermarked images without the original images. However, the embedded information is often damaged by the print-and-scan process because of the randomly added noise, the altered color, and the rotation and scaling introduced in the process. In this paper, we present a practical blind image watermarking scheme in the DCT domain which can survive the print-and-scan process. The image is partitioned into blocks, and each block embeds one bit of watermark data. Two uncorrelated pseudo-random sequences are used to spread bits 0 and 1, respectively, in the middle frequency band of the block-DCT spectrum, which is done by adaptively adding the corresponding pseudo-random sequence to the middle frequency block-DCT coefficients. The embedded bit is recovered by comparing the correlations of the modified middle frequency coefficients with each pseudo-random sequence. Experiments show that the bit error ratio of the watermark is 2.26% after the print-and-scan process, which is robust enough for embedding visual objects. The robustness of the embedded data can be further improved by incorporating error correction coding and data repetition voting techniques. In conclusion, this scheme achieves good performance in both watermark robustness and watermark transparency for the print-and-scan process.
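The embed-and-correlate step for a single block might look like the sketch below. It is an illustration under several assumptions that do not come from the paper: an 8×8 block, a middle band defined as the coefficients with 3 <= u + v <= 6, a fixed embedding strength `alpha` instead of the adaptive strength the abstract describes, and a second sequence derived from the first by flipping every other sign so that the two are exactly orthogonal. All names are hypothetical.

```python
import math
import random

N = 8  # assumed block size

# Orthonormal DCT-II basis matrix, C[k][n].
C = [[(math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N))
      * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
      for n in range(N)] for k in range(N)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]


def transpose(a):
    return [list(row) for row in zip(*a)]


def dct2(block):
    """2D DCT of an N x N block: C * X * C^T."""
    return matmul(matmul(C, block), transpose(C))


def idct2(coeffs):
    """Inverse 2D DCT: C^T * Y * C."""
    return matmul(matmul(transpose(C), coeffs), C)


# Assumed middle-frequency band (22 coefficient positions).
MID = [(u, v) for u in range(N) for v in range(N) if 3 <= u + v <= 6]

rng = random.Random(0)
PN0 = [rng.choice((-1, 1)) for _ in MID]
# Flip every other sign so that PN1 is exactly orthogonal to PN0.
PN1 = [p if i % 2 == 0 else -p for i, p in enumerate(PN0)]


def embed_bit(block, bit, alpha=4.0):
    """Add the sequence for `bit` to the mid-band DCT coefficients."""
    coeffs = dct2(block)
    for (u, v), p in zip(MID, PN1 if bit else PN0):
        coeffs[u][v] += alpha * p
    return idct2(coeffs)


def extract_bit(block):
    """Blind detection: pick the sequence with the larger correlation."""
    coeffs = dct2(block)
    mid = [coeffs[u][v] for u, v in MID]
    c0 = sum(m * p for m, p in zip(mid, PN0))
    c1 = sum(m * p for m, p in zip(mid, PN1))
    return 1 if c1 > c0 else 0
```

Detection needs only the two sequences, not the original image, which is what makes the scheme blind. On textured blocks the host image's own mid-band energy interferes with the correlation, which is one reason an adaptive strength and error correction coding, as mentioned in the abstract, are useful.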
Synthesis of solid textures based on a 2D example: application to the synthesis of 3D carbon structures observed by transmission electronic microscopy
Jean-Pierre Da Costa, Christian Germain
We propose a novel parametric approach which aims at the synthesis of anisotropic solid textures from the analysis of a single 2D exemplar. This approach is an extension of the pyramidal scheme of Portilla and Simoncelli. It proceeds in three main steps. First, a 2D analysis of the exemplar is performed, which produces a set of reference statistics. Then, 3D reference statistics are inferred from the 2D ones using specific anisotropy assumptions. The final step is the synthesis itself: the 3D target statistics are imposed on a random 3D block according to a specific multiresolution pyramidal scheme. The approach is applied to the synthesis of solid textures representative of the structure of dense pre-graphitic carbons. The samples are lattice fringe images obtained by high resolution transmission electron microscopy (HRTEM). HRTEM samples with increasing structural order are used for the experimental evaluation. The produced solid textures exhibit anisotropy properties similar to those observed in the HRTEM samples. Such an approach can easily be extended to any 3D anisotropic structure showing stacks of layers, such as wood grain images, seismic data, etc.
Cigarette smoke detection from captured image sequences
Kentaro Iwamoto, Hironori Inoue, Toru Matsubara, et al.
We investigate the detection of smoke from captured image sequences. We propose to address the following two problems in order to attain this goal. The first problem is to estimate candidate areas of smoke. The second problem is to judge whether smoke exists in the scene. To solve the first problem, we apply a previously proposed framework in which image sequences are divided into small blocks and smoke detection is performed in each block. Within this framework, we propose to use the color and edge information of the scene. To solve the second problem, we propose a method for judging whether smoke exists in the scene by using the smoke areas obtained in the previous step. We propose several feature values for this judgment. Then, by simulation, we find the best combination of feature values. In addition, we study the effect of normalization, which provides better recognition performance.