Proceedings Volume 7878

Intelligent Robots and Computer Vision XXVIII: Algorithms and Techniques

Juha Röning, David P. Casasent, Ernest L. Hall
View the digital version of this volume at the SPIE Digital Library.

Volume Details

Date Published: 24 January 2011
Contents: 9 Sessions, 33 Papers, 0 Presentations
Conference: IS&T/SPIE Electronic Imaging 2011
Volume Number: 7878

Table of Contents


All links to SPIE Proceedings will open in the SPIE Digital Library.
  • Front Matter: Volume 7878
  • Invited Papers on Intelligent Robots
  • Stereovision and Applications
  • Novel People and Vehicle Tracking Approaches
  • Tracking Methods for Intelligent Robots
  • Human Robot Interaction and Manipulation
  • Vision Navigation and Target Detection
  • Visual Algorithms
  • Intelligent Ground Vehicle Competition
Front Matter: Volume 7878
This PDF file contains the front matter associated with SPIE Proceedings Volume 7878, including the Title Page, Copyright information, Table of Contents, and the Conference Committee listing.
Invited Papers on Intelligent Robots
Software framework for nano- and microscale measurement applications
Juha Röning, Ville Tuhkanen, Risto Sipola, et al.
Development of new instruments and measurement methods has advanced research in the field of nanotechnology. Measurement systems used in this research require support from reconfigurable software. Application frameworks can be used to develop domain-specific application skeletons; new applications are specialized from the framework by filling its extension points. This paper presents an application framework for nano- and micro-scale measurement applications. The framework consists of an implementation of a robotic control architecture and components that implement features common to measurement applications. To ease the development of user interfaces for measurement systems, the framework also contains ready-to-use user interface components. The goal of the framework is to ease the development of new applications for measurement systems. Features of the implemented framework were examined through two test cases. The benefits gained by using the framework were analyzed by determining the work needed to specialize new applications from the framework, and the degree of reusability of the specialized applications was also examined. The work shows that the developed framework can be used to implement software for measurement systems and that the major part of the software can be implemented using reusable components of the framework. When developing new software, a developer only needs to develop the components related to the hardware used and to performing the measurement task. Using the framework, developing new software takes less time, and the framework unifies the structure of the developed software.
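As a rough illustration of the extension-point idea described above, the sketch below shows a generic Python application skeleton (not the authors' framework, which is not published with this abstract): the workflow is fixed by the base class, and a new measurement application is specialized by filling in the abstract hooks.

```python
# A minimal, hypothetical sketch of an application framework with extension points.
from abc import ABC, abstractmethod

class MeasurementApplication(ABC):
    """Application skeleton; subclasses fill the extension points."""

    def run(self):
        self.connect_hardware()         # reusable workflow fixed by the framework
        data = self.perform_measurement()
        self.present_results(data)

    @abstractmethod
    def connect_hardware(self): ...     # extension point: device-specific setup

    @abstractmethod
    def perform_measurement(self): ...  # extension point: the measurement task

    def present_results(self, data):    # default UI component, overridable
        print(f"measured {len(data)} samples")

class DemoApplication(MeasurementApplication):
    def connect_hardware(self):
        pass                            # hypothetical stage/sensor initialization

    def perform_measurement(self):
        return [0.0, 0.1, 0.2]          # placeholder data

DemoApplication().run()
```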
A traffic situation analysis system
The observation and monitoring of traffic with smart vision systems has great potential to improve traffic safety. For example, embedded vision systems built into vehicles can be used as early warning systems, and stationary camera systems can modify the switching frequency of signals at intersections. Today the automated analysis of traffic situations is still in its infancy: the patterns of vehicle motion and pedestrian flow in an urban environment are too complex to be fully understood by a vision system. We present steps towards such a traffic monitoring system, which is designed to detect potentially dangerous traffic situations, especially incidents in which the interaction of pedestrians and vehicles might develop into safety-critical encounters. The proposed system is being field-tested at a real pedestrian crossing in the City of Vienna for the duration of one year. It consists of a cluster of three smart cameras, each of which is built from a very compact PC hardware system in an outdoor-capable housing. Two cameras run vehicle detection software including license plate detection and recognition; one camera runs a complex pedestrian detection and tracking module based on the HOG detection principle. As a supplement, all three cameras compute additional optical flow on a low-resolution video stream in order to estimate the motion path and speed of objects. This work describes the foundation for all three object detection modalities (pedestrians, vehicles, license plates) and explains the system setup and its design.
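For readers unfamiliar with the two building blocks named above, the following sketch combines OpenCV's stock HOG people detector with Farneback dense optical flow on a low-resolution stream, in the spirit of (but not identical to) the paper's pedestrian and motion modules; the video source and all parameters are illustrative.

```python
# Hedged sketch: HOG pedestrian detection + low-resolution dense optical flow.
import cv2

hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

cap = cv2.VideoCapture(0)                # any traffic video source
prev_gray = None
while True:
    ok, frame = cap.read()
    if not ok:
        break
    boxes, weights = hog.detectMultiScale(frame, winStride=(8, 8))
    gray = cv2.cvtColor(cv2.resize(frame, (320, 240)), cv2.COLOR_BGR2GRAY)
    if prev_gray is not None:
        # low-resolution dense flow to estimate motion path and speed of objects
        flow = cv2.calcOpticalFlowFarneback(prev_gray, gray, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
    prev_gray = gray
    for (x, y, w, h) in boxes:
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.imshow("pedestrians", frame)
    if cv2.waitKey(1) == 27:             # Esc to quit
        break
cap.release()
```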
The 18th Annual Intelligent Ground Vehicle Competition: trends and influences for intelligent ground vehicle control
The Intelligent Ground Vehicle Competition (IGVC) is one of four unmanned-systems student competitions founded by the Association for Unmanned Vehicle Systems International (AUVSI). The IGVC is a multidisciplinary exercise in product realization that challenges college engineering student teams to integrate advanced control theory, machine vision, vehicular electronics and mobile platform fundamentals to design and build an unmanned system. Teams from around the world focus on developing a suite of dual-use technologies to equip ground vehicles of the future with intelligent driving capabilities. Over the past 18 years, the competition has challenged undergraduate, graduate and Ph.D. students with real-world applications in intelligent transportation systems, the military and manufacturing automation. To date, teams from over 75 universities and colleges have participated. This paper describes some of the applications of the technologies required by this competition and discusses the educational benefits. The primary goal of the IGVC is to advance engineering education in intelligent vehicles and related technologies. The employment and professional networking opportunities created for students and industrial sponsors through a series of technical events over the four-day competition are highlighted. Finally, an assessment of the competition based on participation is presented.
Stereovision and Applications
Stereo matching based on two cameras and one 3D image sensor
Lu Yang, Xiaowei Shao, Ryosuke Shibasaki, et al.
Because of the problems of noise, textureless regions and depth discontinuities in stereo matching, a new matching method based on two cameras and one 3D image sensor is proposed in this paper. Although the 3D image sensor can provide a depth map, the map has low resolution and considerable noise, so it cannot be used directly for 3D reconstruction. However, combining two cameras with one 3D image sensor and treating the depth map as an initial sparse disparity map is advantageous in stereo matching. This method can greatly improve the matching accuracy and decrease the running time, and finally a dense disparity map can be obtained. The experimental results indicate that the proposed algorithm performs well and that the disparity map is more accurate than those of existing methods.
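A minimal sketch of the fusion idea: once the sensor's coarse depth map has been upsampled and converted into a per-pixel disparity prior, block matching only needs to search a narrow band around that prior, which is what cuts the running time. The SAD cost, block size and search margin below are illustrative choices, not the paper's.

```python
# Hedged sketch: SAD block matching constrained by a sparse disparity prior.
import numpy as np

def guided_block_match(left, right, prior, margin=3, block=5):
    """Search only +/-margin disparities around the prior at each pixel."""
    h, w = left.shape
    half = block // 2
    disp = np.zeros((h, w), np.float32)
    for y in range(half, h - half):
        for x in range(half, w - half):
            ref = left[y-half:y+half+1, x-half:x+half+1].astype(np.int32)
            best, best_d = None, 0
            d0 = int(prior[y, x])                 # disparity prior from the sensor
            for d in range(max(0, d0 - margin), d0 + margin + 1):
                if x - d - half < 0:
                    continue
                cand = right[y-half:y+half+1, x-d-half:x-d+half+1].astype(np.int32)
                cost = np.abs(ref - cand).sum()   # sum of absolute differences
                if best is None or cost < best:
                    best, best_d = cost, d
            disp[y, x] = best_d
    return disp
```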
Linear stereo vision based objects detection and tracking using spectral clustering
Safaa Moqqaddem, Y. Ruichek, R. Touahni, et al.
Object detection and tracking is a key function for many applications such as video surveillance, robotics and intelligent transportation systems. This problem is widely treated in the literature, in terms of both sensors (video cameras, laser range finders, radar) and methodologies. This paper proposes a new approach for detecting and tracking objects using stereo vision with linear cameras. After a matching process applied to edge points extracted from the images, the reconstructed points in the scene are clustered using spectral analysis. The obtained clusters are then tracked through their centers of gravity using a Kalman filter and a nearest-neighbour (NN) based data association algorithm. The approach is tested and evaluated on real data to demonstrate its effectiveness for obstacle detection and tracking in front of a vehicle. This work is part of a project that aims to develop advanced driving aid systems, supported by the CPER, STIC and Volubilis programs.
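A minimal sketch of the tracking stage under stated assumptions: cluster centroids arrive once per frame, each track runs a constant-velocity Kalman filter, and association is greedy nearest-neighbour with a distance gate. All matrices and thresholds are illustrative, not the paper's tuning.

```python
# Hedged sketch: constant-velocity Kalman tracking with NN data association.
import numpy as np

dt = 0.1
F = np.array([[1, 0, dt, 0], [0, 1, 0, dt], [0, 0, 1, 0], [0, 0, 0, 1]], float)
H = np.array([[1, 0, 0, 0], [0, 1, 0, 0]], float)
Q = np.eye(4) * 1e-2                      # process noise (illustrative)
R = np.eye(2) * 1e-1                      # measurement noise (illustrative)

class Track:
    def __init__(self, z):
        self.x = np.array([z[0], z[1], 0.0, 0.0])
        self.P = np.eye(4)

    def predict(self):
        self.x = F @ self.x
        self.P = F @ self.P @ F.T + Q

    def update(self, z):
        S = H @ self.P @ H.T + R
        K = self.P @ H.T @ np.linalg.inv(S)
        self.x += K @ (z - H @ self.x)
        self.P = (np.eye(4) - K @ H) @ self.P

def associate(tracks, centroids, gate=2.0):
    """Greedy NN: each track claims its closest unclaimed centroid (Nx2 array)."""
    used = set()
    for t in tracks:
        t.predict()
        if len(centroids) == 0:
            continue
        d = np.linalg.norm(centroids - t.x[:2], axis=1)
        j = int(np.argmin(d))
        if d[j] < gate and j not in used:
            t.update(centroids[j])
            used.add(j)
```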
Novel People and Vehicle Tracking Approaches
Real-time car detection system
This paper presents a car detection system that is able to work in close to real time on a smart camera. A cascade of histograms of oriented gradients was used as the detector. The algorithm and code were optimized for speed to meet the real-time constraints without losing too much detection quality. The system is now able to process 10 frames per second on the Atom Z530 (1.6 GHz) processor used in the smart camera. The application on which the paper is based is ready to detect cars in real-world scenarios. It is planned to extend it to also track and analyze driver behavior patterns.
Real-time people and vehicle detection from UAV imagery
Anna Gaszczak, Toby P. Breckon, Jiwan Han
A generic and robust approach for the real-time detection of people and vehicles from an Unmanned Aerial Vehicle (UAV) is an important goal within the framework of fully autonomous UAV deployment for aerial reconnaissance and surveillance. Here we present an approach for the automatic detection of vehicles based on multiple trained cascaded Haar classifiers with secondary confirmation in thermal imagery. Additionally, we present a related approach for people detection in thermal imagery based on a similar cascaded classification technique combined with additional multivariate Gaussian shape matching. The results presented show the successful detection of vehicles and people under varying conditions in both isolated rural and cluttered urban environments with minimal false positive detections. Performance of the detector is optimized to reduce the overall false positive rate by aiming to detect each object of interest (vehicle/person) at least once in the environment (i.e., per search-pattern flight path) rather than every object in each image frame. Currently the detection rate for people is ~70% and for cars ~80%, although the overall episodic object detection rate for each flight pattern exceeds 90%.
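The cascaded classification step could look roughly like the following sketch, assuming a cascade trained offline on thermal vehicle samples (the file name is hypothetical) and reducing the paper's secondary confirmation stage to a simple size gate.

```python
# Hedged sketch: cascaded detection in a thermal frame with a crude size gate.
import cv2

cascade = cv2.CascadeClassifier("thermal_vehicle_cascade.xml")   # hypothetical model
frame = cv2.imread("uav_thermal_frame.png", cv2.IMREAD_GRAYSCALE)
hits = cascade.detectMultiScale(frame, scaleFactor=1.1, minNeighbors=3)
# stand-in for the secondary confirmation: keep only plausibly sized detections
confirmed = [(x, y, w, h) for (x, y, w, h) in hits if 8 <= w <= 64]
```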
Real-time pose invariant logo and pattern detection
Oliver Sidla, Michal Kottmann, Wanda Benesova
The detection of pose-invariant planar patterns has many practical applications in computer vision and surveillance systems. The recognition of company logos is used in market studies to examine the visibility and frequency of logos in advertisements. Danger signs on vehicles could be detected to trigger warning systems in tunnels, or brand detection on transport vehicles can be used to count company-specific traffic. We present the results of a study on planar pattern detection based on keypoint detection and matching of distortion-invariant 2D feature descriptors. Specifically, we look at the following keypoint detectors: i) Lowe's DoG approximation from the SURF algorithm, ii) the Harris corner detector, iii) the FAST corner detector and iv) Lepetit's keypoint detector. Our study then compares the feature descriptors SURF and compact signatures based on Random Ferns: we use three sets of sample images to detect and match three logos of different structure in order to find out which combinations of keypoint detector and feature descriptor work well. A real-world test tries to detect vehicles with a distinctive logo in an outdoor environment under realistic lighting and weather conditions: a camera was mounted in a suitable location for observing the entrance to a parking area so that incoming vehicles could be monitored. In this two-hour recording we can successfully detect a specific company logo without false positives.
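A hedged sketch of the overall keypoint pipeline, using ORB as a freely available stand-in for the detector/descriptor combinations compared in the paper; the ratio threshold, inlier count and file names are illustrative.

```python
# Hedged sketch: keypoint matching + RANSAC homography for planar logo detection.
import cv2
import numpy as np

orb = cv2.ORB_create(nfeatures=1000)
logo = cv2.imread("logo.png", cv2.IMREAD_GRAYSCALE)     # hypothetical template
scene = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)   # hypothetical frame
kp1, des1 = orb.detectAndCompute(logo, None)
kp2, des2 = orb.detectAndCompute(scene, None)

matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
pairs = matcher.knnMatch(des1, des2, k=2)
# Lowe-style ratio test keeps only distinctive matches
good = [p[0] for p in pairs if len(p) == 2 and p[0].distance < 0.75 * p[1].distance]

if len(good) >= 10:
    src = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    # a RANSAC homography makes the detection invariant to planar pose changes
    Hm, mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    found = Hm is not None and int(mask.sum()) >= 10
    print("logo found" if found else "no logo")
```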
FirstAidAssistanceSystem (FAAS): improvement of first aid measures using Car2Car-communication technology
Sven Tuchscheerer, Tobias Hoppe, Christian Krätzer, et al.
This work's goal is the enhancement of first aid measures directly after car accidents by calling suitable first aiders via Car-to-Car (C2C) communication and assisting them with detailed multimedia support instructions. Our concept combines upcoming C2C communication technologies with established technology, in particular GPS and GSM. After a collision, the proposed FirstAidAssistanceSystem (FAAS) sends a broadcast message using C2C technology according to the IEEE 802.11p standard. All nearby cars (as potential first aiders) are located, and at least one of the nearest candidates (we suggest 3-5) driving towards the accident scene is chosen and notified as a first aider. A support guide on the first aider's multipurpose display (e.g., the navigation system) provides detailed instructions and illustrative tutorials. The paper presents our concept in detail with a discussion of practical evaluation criteria and an introduction of a first test implementation.
Tracking Methods for Intelligent Robots
Study of temporal modified-RANSAC based method for the extraction and 3D shape reconstruction of moving objects from dynamic stereo images and for estimating the camera pose
Naotomo Tatematsu, Jun Ohya
This paper proposes a Temporal Modified-RANSAC based method that can discriminate each moving object from the still background in stereo video sequences acquired by moving stereo cameras, compute the stereo cameras' egomotion, and reconstruct the 3D structure of each moving object and the background. We compute 3D optical flows from the depth map and the results of tracking feature points. We define a "3D flow region" as a set of connected pixels whose 3D optical flows share a common rotation matrix and translation vector. Our Temporal Modified-RANSAC segments the detected 3D optical flows into 3D flow regions and computes the rotation matrix and translation vector for each region. As opposed to the conventional Modified-RANSAC, which handles only two frames, the Temporal Modified-RANSAC can handle temporal sequences of arbitrary length by iteratively applying the Modified-RANSAC to the set of 3D flow regions classified in the latest frame and the new 3D optical flows detected in the current frame. Finally, the 3D points computed from the depth maps in all frames are registered using each 3D flow region's transformation to its initial position in the initial frame, so that the 3D structures of the moving objects and the still background are reconstructed. Experiments using multiple moving objects and real stereo sequences demonstrate promising results for the proposed method.
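The inner step that such a scheme iterates, hypothesizing a rigid (R, t) from sampled 3D flow correspondences and collecting the inliers into one flow region, can be sketched generically as Kabsch-plus-RANSAC; the thresholds and iteration counts below are illustrative, not the paper's.

```python
# Hedged sketch: RANSAC estimation of one rigid motion from 3D correspondences.
import numpy as np

def rigid_fit(P, Q):
    """Kabsch: least-squares R, t with Q ~ R @ P + t (P, Q are Nx3, N >= 3)."""
    cp, cq = P.mean(0), Q.mean(0)
    U, _, Vt = np.linalg.svd((P - cp).T @ (Q - cq))
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])  # fix reflection
    R = Vt.T @ D @ U.T
    return R, cq - R @ cp

def ransac_rigid(P, Q, iters=200, tol=0.05):
    """Inliers sharing one (R, t) correspond to one '3D flow region'."""
    best_inl = np.zeros(len(P), bool)
    for _ in range(iters):
        idx = np.random.choice(len(P), 3, replace=False)
        R, t = rigid_fit(P[idx], Q[idx])
        err = np.linalg.norm((R @ P.T).T + t - Q, axis=1)
        inl = err < tol
        if inl.sum() > best_inl.sum():
            best_inl = inl
    R, t = rigid_fit(P[best_inl], Q[best_inl])   # refit on all inliers
    return R, t, best_inl
```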
A multiple feature based particle filter using mutual information maximization
Kihyun Hong, Kyuseo Han
In designing a tracking algorithm, utilizing several different features, e.g., color histograms, gradient histograms and other object descriptors, is preferable in order to increase the robustness of tracking. In this paper, we propose a multiple-feature fusion framework that improves tracking by assigning appropriate weights to the individual features. The feature weights are optimally obtained by a water-filling procedure that maximizes the mutual information between target object features and query features. In particular, we focus on a particle filter tracking implementation of the multiple-feature fusion framework. Our experiments show that object tracking with multiple features outperforms single-feature tracking methods and illustrate that the proposed optimal feature weighting increases the robustness of multiple-feature tracking.
High precision object segmentation and tracking for use in super resolution video reconstruction
T. Nathan Mundhenk, Rashmi Sundareswara, David R. Gerwe, et al.
Super-resolution image reconstruction allows for the enhancement of images in a video sequence beyond the original pixel resolution of the imager. Difficulty arises when there are foreground objects that move differently than the background; a common example is a car in motion in a video. Given the common occurrence of such situations, super-resolution reconstruction becomes non-trivial. One method for dealing with this is to segment out foreground objects and quantify their pixel motion separately. First we estimate local pixel motion using a standard block motion algorithm common to MPEG encoding. This is then combined with the image itself in a five-dimensional mean-shift kernel density estimation based image segmentation with mixed motion and color image feature information. This results in a tight segmentation of objects in terms of both motion and visible image features. The next step is to combine segments into a single master object: statistically common motion and proximity are used to merge segments into master objects. To account for inconsistencies that can arise when tracking objects, we compute statistics over each object and fit it with a generalized linear model. Using the Kullback-Leibler divergence, we obtain a metric for the goodness of an object's track between frames.
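As a small illustration of the last step, the KL divergence between two multivariate Gaussians fitted to an object's statistics in consecutive frames has the closed form below; which feature vector the Gaussians are fitted over is left to the tracker, and this is a generic sketch rather than the paper's exact model.

```python
# Hedged sketch: closed-form KL divergence between two Gaussian track models.
import numpy as np

def gauss_kl(mu0, S0, mu1, S1):
    """KL( N(mu0, S0) || N(mu1, S1) ) for d-dimensional Gaussians."""
    d = len(mu0)
    S1_inv = np.linalg.inv(S1)
    dm = mu1 - mu0
    return 0.5 * (np.trace(S1_inv @ S0) + dm @ S1_inv @ dm - d
                  + np.log(np.linalg.det(S1) / np.linalg.det(S0)))
```

A small divergence between frames indicates a consistent track; a spike flags a likely tracking failure.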
Robust pedestrian detection and tracking from a moving vehicle
Nguyen Xuan Tuong, Thomas Müller, Alois Knoll
In this paper, we address the problem of multi-person detection, tracking and distance estimation in a complex scenario using multiple cameras. Specifically, we are interested in a vision system for supporting the driver in avoiding unwanted collisions with pedestrians. We propose an approach using Histograms of Oriented Gradients (HOG) to detect pedestrians in static images and a particle filter as a robust tracking technique to follow targets from frame to frame. Because a full depth map requires expensive computation, we extract depth information of targets using the Direct Linear Transformation (DLT) to reconstruct the 3D coordinates of corresponding points found by running Speeded Up Robust Features (SURF) on the two input images. Using the particle filter, the proposed tracker can efficiently handle target occlusions in a simple background environment. However, to achieve reliable performance in complex scenarios with frequent target occlusions and complex cluttered backgrounds, results from the detection module are integrated to create feedback and recover the tracker from tracking failures due to the complexity of the environment and the variability of the target appearance model. The proposed approach is evaluated on different data sets, both in a simple background scenario and in a cluttered background environment. The results show that, by integrating detector and tracker, reliable and stable performance is possible even if occlusion occurs frequently in a highly complex environment. A vision-based collision avoidance system for an intelligent car can thus be achieved.
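The distance-estimation step can be sketched as follows, assuming calibrated 3x4 projection matrices and SURF correspondences are already available; cv2.triangulatePoints implements the DLT referred to above.

```python
# Hedged sketch: DLT triangulation of matched points to estimate target depth.
import cv2
import numpy as np

def target_depth(P1, P2, x1, x2):
    """Triangulate matched image points (Nx2 arrays) and return their mean depth.

    P1, P2: 3x4 camera projection matrices from calibration.
    """
    X = cv2.triangulatePoints(P1, P2,
                              x1.T.astype(np.float64),
                              x2.T.astype(np.float64))
    X = (X[:3] / X[3]).T               # de-homogenize to Nx3 world points
    return float(X[:, 2].mean())       # mean Z of the matched points
```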
Target matching based on multi-view tracking
Yahui Liu, Changsheng Zhou
A feature matching method is proposed based on Maximally Stable Extremal Regions (MSER) and the Scale Invariant Feature Transform (SIFT) to solve the problem of matching the same target across multiple cameras. The target foreground is extracted by applying frame differencing twice, and a bounding box regarded as the target region is calculated. Extremal regions are obtained by MSER. After being fitted to elliptical regions, these regions are normalized to unit circles and represented with SIFT descriptors. Initial matches are obtained where the ratio of the closest distance to the second-closest distance is less than a threshold, and outlier points are eliminated by RANSAC. Experimental results indicate that the method effectively reduces computational complexity and is also robust to affine transformation, rotation, scale and illumination.
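A hedged sketch of the region-description stage (frame differencing and the RANSAC step omitted): MSER regions are fitted with ellipses, and SIFT descriptors are computed on a normalized circular support. The input file is hypothetical, and regions too small for an ellipse fit are skipped.

```python
# Hedged sketch: MSER regions -> ellipse fit -> SIFT descriptors.
import cv2

img = cv2.imread("view.png", cv2.IMREAD_GRAYSCALE)    # hypothetical frame
mser = cv2.MSER_create()
regions, _ = mser.detectRegions(img)

sift = cv2.SIFT_create()
kps = []
for pts in regions:
    if len(pts) < 5:                   # cv2.fitEllipse needs at least 5 points
        continue
    (cx, cy), (w, h), angle = cv2.fitEllipse(pts)
    # normalized circular support: keypoint size from the ellipse's major axis
    kps.append(cv2.KeyPoint(float(cx), float(cy), float(max(w, h)), float(angle)))
kps, des = sift.compute(img, kps)      # one SIFT descriptor per region
```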
A target detection method in multimodal images with complex backgrounds and different views
Target detection in multimodal (multisensor) images is a difficult problem, especially under the impact of different views and complex backgrounds. In this paper, we propose a target detection method based on ground region matching and spatial constraints to solve it. First, the extrinsic parameters of the camera are used to transform the images to reduce the impact of viewpoint differences. Then the stable ground object regions are extracted by MSER. These regions are used to build a graph model that describes the reference image with spatial constraints to reduce the impact of multimodality and complex backgrounds. Finally, ground region matching and registration of the model with the sensed images are used to find the target. Using this method, we overcome these difficulties and obtain satisfactory experimental results; the final detection rate is 94.34% on our data set of visible reference images in top views and infrared sensed images in side views.
Human Robot Interaction and Manipulation
Design and evaluation of multimedia security warnings for the interaction between humans and industrial robots
In this document a multimedia security warning design approach for automated production scenarios with industrial robots is introduced. This first approach is based on and adapts design principles of common security programs and a German VDI standard for safety warning design. We focus on direct human-to-robot interaction scenarios, e.g., the online programming of industrial robots, because of their potential indirect safety impacts, which could be caused by malicious code infecting a robot's control computer. We designed ten different multimedia security warnings composed of visual and acoustic information. Visual information is conveyed via a traffic light metaphor (symbolizing three different threat levels), different warning icons (symbolizing properties of the malicious code), instruction icons for programmers or operators, and additional textual information. With an acknowledgment button in the middle of the warning, the programmer's confirmation of the reception of the warning is verified. Additionally, three different acoustic signals indicate the threat level of the warning. Furthermore, an evaluation is presented that uses concepts known from usability testing (think-aloud method, questionnaire, time measurement). The aim is to evaluate general design criteria of our security warnings and tendencies in user perception for further advancement of the warning design.
Fingertip guiding manipulator for blind persons to create mental images by switch passive/active hybrid line-drawing explorations
The authors developed a fingertip guiding system which consists of a haptic manipulator (PHANTOM Omni) to help a blind person create mental images of a pre-planned trajectory. When using this system, the person grasps the stylus (a pen-shaped stick) of the haptic manipulator with his/her fingertips. The system is equipped with a dual-mode fingertip guiding function which allows switching between a passive and an active mode in recognizing the image provided. In passive mode, the system guides and pulls the person's hand and provides force feedback to the fingertips while constraining the fingertips' motion along the trajectory. In active mode, on the other hand, the system guides and provides force feedback at the fingertips but allows the person to move his/her fingertips freely along the trajectory. The main objective of the research is to create a hybrid exploration system which combines the active and passive modes. In this paper, we focus on the active-mode system. We examined the effectiveness in understanding the trajectory when the fingertips are moved with the haptic manipulator either on a flat surface or in free aerial space (without any surface provided). It was confirmed that the active system with a flat surface expedites the understanding of the layout's trajectory.
Augmented reality user interface for mobile ground robots with manipulator arms
Steven Vozar, Dawn M. Tilbury
Augmented Reality (AR) is a technology in which real-world visual data is combined with an overlay of computer graphics, enhancing the original feed. AR is an attractive tool for teleoperated unmanned ground vehicle (UGV) user interfaces (UIs), as it can improve communication between robots and users via an intuitive spatial and visual dialogue, thereby increasing operator situational awareness. The successful operation of UGVs often relies upon both chassis navigation and manipulator arm control, and since the existing literature usually focuses on one task or the other, there is a gap in mobile robot UIs that take advantage of AR for both applications. This work describes the development and analysis of an AR UI system for a UGV with an attached manipulator arm. The system supplements a video feed shown to an operator with information about geometric relationships within the robot task space to improve the operator's situational awareness. Previous studies on AR systems and preliminary analyses indicate that such an implementation of AR for a mobile robot with a manipulator arm should improve operator performance. A full user study can determine whether this hypothesis is supported by performing an analysis of variance on common test metrics associated with UGV teleoperation.
An embedded omnidirectional vision navigator for automatic guided vehicles
Omnidirectional vision is of definite significance because of its advantage of acquiring a full 360° horizontal field of view simultaneously. In this paper, an embedded omnidirectional vision navigator (EOVN) based on a fish-eye lens and embedded technology is presented. A fish-eye lens is one of the special ways to establish omnidirectional vision; however, it suffers from an unavoidable, inherent and severe distortion. We propose a unique integrated navigation method conducted on the basis of target tracking, composed of multi-target recognition and tracking, distortion rectification, spatial location and navigation control; it is called RTRLN. In order to adapt to different indoor and outdoor navigation environments, we embed mean-shift and dynamic threshold adjustment into the particle filter algorithm to improve the efficiency and robustness of tracking. RTRLN has been implemented on an independently developed embedded platform; the EOVN resembles a smart camera based on a CMOS+FPGA+DSP architecture. It can guide various vehicles in outdoor environments by tracking diverse marks hanging in the air. The experiments show that the EOVN is particularly suitable for guidance applications with high requirements on precision and repeatability, and the research achievements have been validated in practical application.
Gender classification system in uncontrolled environments
Pingping Zeng, Yu-Jin Zhang, Fei Duan
Most face analysis systems available today operate mainly on restricted databases of images in terms of size, age and illumination. In addition, it is frequently assumed that all images are frontal and unconcealed. In practice, in non-guided real-time surveillance, the face pictures taken may often be partially covered and exhibit varying degrees of head rotation. In this paper, a system intended for real-time surveillance with un-calibrated cameras and non-guided photography is described. It mainly consists of five parts: face detection, non-face filtering, best-angle face selection, texture normalization and gender classification. Emphasis is placed on the non-face filtering and best-angle face selection parts as well as texture normalization. Best-angle faces are selected by PCA reconstruction, which amounts to an implicit face alignment and results in a large increase in gender classification accuracy. A dynamic skin model and a masked PCA reconstruction algorithm are applied to filter out faces detected in error. In order to fully include facial-texture and shape-outline features, a hybrid feature combining Gabor wavelets and PHoG (pyramid histogram of gradients) is proposed to balance inner texture and outer contour. A comparative study of the effects of different non-face filtering and texture masking methods on gender classification by SVM is reported through experiments on a set of UT (a company name) face images, a large number of internet images and the CAS (Chinese Academy of Sciences) face database. Some encouraging results are obtained.
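The PCA-reconstruction idea used for both non-face filtering and best-angle selection can be sketched generically as follows: crops that the learned face space reconstructs poorly are rejected or ranked lower. The training data, crop size and component count below are placeholders, not the paper's configuration.

```python
# Hedged sketch: PCA reconstruction error as a face-quality / best-angle score.
import numpy as np
from sklearn.decomposition import PCA

# train the face space on aligned frontal crops, flattened to vectors
faces = np.random.rand(200, 32 * 32)      # placeholder for a real training set
pca = PCA(n_components=40).fit(faces)

def reconstruction_error(crop):
    v = crop.reshape(1, -1)
    rec = pca.inverse_transform(pca.transform(v))
    return float(np.linalg.norm(v - rec))  # large error -> non-face or bad angle

def best_angle_face(candidates):
    """Among crops of one person, keep the one the face space explains best."""
    return min(candidates, key=reconstruction_error)
```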
Vision Navigation and Target Detection
Detecting stationary human targets in FLIR imagery
Alex Lipchen Chan
In the military arena, intelligent unmanned ground vehicles (UGVs), weighing 10 tons or more, may be designed and used for transportation or combat purposes. To ensure safe operations among civilians and friendly combatants, it is crucial for these UGVs to detect and avoid humans who might otherwise be injured unintentionally. In this paper, a multi-stage detection algorithm for stationary humans in forward-looking infrared (FLIR) imagery is proposed. This algorithm first applies an efficient feature-based anomaly detection algorithm to search the entire input image, followed by an eigen-neural-based clutter rejecter that examines only the portions of the input image identified by the first stage, and culminates with a simple evidence integrator that combines the results from the two previous stages. The proposed algorithm was evaluated using a large set of challenging FLIR images, and the results support the usefulness of this multi-stage architecture.
Spectrally queued feature selection for robotic visual odometry
David M. Pirozzo, Philip A. Frederick, Shawn Hunt, et al.
Over the last two decades, research in Unmanned Vehicles (UVs) has rapidly progressed and become increasingly influenced by the biological sciences. Researchers have investigated mechanical aspects of various species to improve the intrinsic mobility of air and ground UVs, explored the computational aspects of the brain for the development of pattern recognition and decision algorithms, and studied the perception capabilities of numerous animals and insects. This paper describes a three-month exploratory applied research effort performed at the US Army Research, Development and Engineering Command's (RDECOM) Tank Automotive Research, Development and Engineering Center (TARDEC) in the area of biologically inspired, spectrally augmented feature selection for robotic visual odometry. The motivation for this applied research was to develop a feasibility analysis of multi-spectrally queued feature selection, with improved temporal stability, for the purposes of visual odometry. The intended application is future semi-autonomous Unmanned Ground Vehicle (UGV) control, as the richness of the data sets required to enable human-like behavior in these systems has yet to be defined.
Intuitive control of robotic manipulators
David Rusbarsky, Jeremy Gray, Douglas Peters
As part of the Modular Intelligent Manipulation system with Intuitive Control program, industry is working with the U.S. Army to explore technologies that will allow a user to intuitively control multiple degree of freedom robotic arms and maintain better awareness of the operating environment through haptic feedback. In addition to reporting resistance, haptic feedback can help make operators feel like they are actually there with the robot, opening doors, unscrewing blast caps, cutting wires, or removing batteries. Coupled with intuitive controls and advanced video feedback, the goal of this project is to provide users with the sensation that the robot is an extension of their body, all from a safe distance.
Lane marking detection by extracting white regions with predefined width from bird's-eye road images
Sadayuki Abe, Kenji Shoji, Fubito Toyama, et al.
Detecting lane markings on roads in in-vehicle camera images is very important because it is one of the fundamental tasks for autonomous driving technology and safety driving support systems. There are several lane marking detection methods that use width information, but most of them are considered insufficient for oblique markings. The primary intent of this paper is therefore to propose a lane marking detection method that is robust to the orientation of the markings. In this work, we focus on the width of lane markings standardized by the road act in Japan, and propose a method for detecting white lane markings by extracting white regions of constant predefined width from bird's-eye road images after a segmentation step such as categorical color area segmentation. The proposed method is based on the constrained Delaunay triangulation. It has the merit that it can measure an exact width even for oblique markings on the bird's-eye images, because the width perpendicular to each edge can be obtained. The effectiveness of the proposed method was shown by experimental results on 187 actual road images taken from an in-vehicle camera.
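A hedged sketch of the surrounding pipeline: warp to a bird's-eye view, threshold white pixels, and keep only regions whose width lies in a band around the standardized value. For brevity the width test below is approximated with isotropic morphological opening (which is likewise orientation-independent) rather than the constrained Delaunay triangulation the paper actually uses; all coordinates and sizes are illustrative.

```python
# Hedged sketch: bird's-eye warp + width-banded white region extraction.
import cv2
import numpy as np

frame = cv2.imread("road.png")                        # hypothetical image
src = np.float32([[420, 300], [860, 300], [1180, 720], [100, 720]])
dst = np.float32([[0, 0], [400, 0], [400, 800], [0, 800]])
M = cv2.getPerspectiveTransform(src, dst)             # calibration-dependent
bird = cv2.warpPerspective(frame, M, (400, 800))

gray = cv2.cvtColor(bird, cv2.COLOR_BGR2GRAY)
white = cv2.threshold(gray, 200, 255, cv2.THRESH_BINARY)[1]

w_px = 15                                             # marking width in pixels
k_min = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (w_px - 4, w_px - 4))
k_max = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (w_px + 4, w_px + 4))
wide_enough = cv2.morphologyEx(white, cv2.MORPH_OPEN, k_min)  # drop thin noise
too_wide = cv2.morphologyEx(white, cv2.MORPH_OPEN, k_max)     # bright road patches
markings = cv2.subtract(wide_enough, too_wide)        # width within the band
```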
Visual Algorithms
Dotted and curved line character segmentation
This paper presents a new methodology for addressing curvature and segmentation in text as applied to dotted-line images. Several improvements are provided, including methods and procedures for handling (a) curvature in text detection and segmentation, (b) rotation angle determination, (c) isolation of only the character pixels in the image, (d) use of a fill algorithm to solidify the dotted characters, and (e) discernment of segmented characters which may be printed too closely or touch each other. Applying the new algorithms under various lighting conditions to 90 water bottle images (from Ozarka® and Dasani®) with differing text yielded a 93% accuracy rate. Additionally, the fill algorithm presented here improved recognition by more than 20% compared to non-filled characters.
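The fill step can be approximated with a morphological closing whose kernel bridges the gaps between dots, so each character becomes one solid connected component; the sketch below assumes a dot pitch of a few pixels and a hypothetical input image, and is a generic stand-in rather than the paper's exact algorithm.

```python
# Hedged sketch: solidify dotted characters, then isolate connected components.
import cv2

binary = cv2.imread("dotted_text.png", cv2.IMREAD_GRAYSCALE)   # hypothetical
binary = cv2.threshold(binary, 0, 255,
                       cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))  # ~dot pitch
solid = cv2.morphologyEx(binary, cv2.MORPH_CLOSE, kernel)      # bridge the dots
n, labels, stats, _ = cv2.connectedComponentsWithStats(solid)
# stats[:, cv2.CC_STAT_WIDTH] can then drive segmentation of touching characters
```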
Accelerating robust 3D pose estimation utilizing a graphics processing unit
Adam R. Gerlach, Bruce K. Walker
The spin-image pose estimation algorithm is an accurate method for estimating the pose of three-dimensional objects that is robust to both clutter and sensor noise. Unfortunately, the algorithm has high computational complexity, which prevents its use in applications that require a robotic system to interact with a dynamic environment. Upon inspection, the spin-image algorithm can be broken down into five portions, where a single portion called spin-image matching accounts for 96% of the computation time in estimating pose. Because individual spin-images can be matched independently and in any order, this portion of the algorithm is ideal for the massively parallel architecture of the graphics processing unit (GPU). This paper introduces a GPU implementation of the spin-image matching portion of the algorithm which makes no modifications to the spin-image algorithm, thus compromising neither its robustness nor its accuracy. This implementation results in a speed-up in spin-image matching of 515x and a total algorithmic speed-up of 24.6x, out of a theoretical maximum of 26.0x, over a MATLAB implementation. This GPU implementation extends the use of the spin-image algorithm towards practical real-time robotic applications.
Calibration and rectification research for fish-eye lens application
This paper aims to promote the application of fish-eye lenses. Accurate parameter calibration and effective distortion rectification of an imaging device are of utmost importance in machine vision. A fish-eye lens produces a hemispherical field of view of an environment, which is of definite significance because of its advantage of capturing a panoramic sight in a single compact visual scene. However, fish-eye images exhibit an unavoidable, inherent and severe distortion. A precise optical center is the precondition for calibrating the other parameters and for distortion correction; therefore, three different optical center calibration methods have been investigated for diverse applications. A Support Vector Machine (SVM) and a Spherical Equidistance Projection Algorithm (SEPA) are integrated to replace traditional rectification methods. SVM is a machine learning method based on statistical learning theory with good capabilities for fitting, regression and classification. In this research, the SVM provides a mapping table between the fish-eye image and a standard image for human eyes; two novel training models have been designed. SEPA is applied to improve the rectification of the edge of the fish-eye image. The validity and effectiveness of our approach are demonstrated by processing real images.
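A minimal sketch of equidistance-projection rectification under stated assumptions (the optical center and both focal lengths are taken as known from the calibration step): each pixel of a virtual pinhole image is mapped back into the fish-eye image via r = f·θ. This is a generic sketch of the projection model, not the paper's SVM-based mapping.

```python
# Hedged sketch: rectify a fish-eye image using the equidistance model r = f*theta.
import cv2
import numpy as np

def equidistant_rectify(fish, cx, cy, f_fish, f_out, out_size=512):
    """Map a fish-eye image to a virtual pinhole view of out_size x out_size."""
    xs = np.arange(out_size, dtype=np.float32) - out_size / 2.0
    u, v = np.meshgrid(xs, xs)
    rho = np.sqrt(u * u + v * v)            # radius in the pinhole image
    theta = np.arctan2(rho, f_out)          # angle from the optical axis
    r = f_fish * theta                      # equidistance projection radius
    scale = np.where(rho > 0, r / np.maximum(rho, 1e-9), 0.0)
    map_x = (cx + u * scale).astype(np.float32)
    map_y = (cy + v * scale).astype(np.float32)
    return cv2.remap(fish, map_x, map_y, cv2.INTER_LINEAR)
```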
A hardware-software co-design approach to a JPEG encoder design for a planetary micro-rover application
S. Sarma, S. Udupa, K. M. Bhardwaj, et al.
Micro-rovers aimed at the planetary exploration of moons and other heavenly bodies are becoming the focus of many space missions. These micro-rover missions face the hard challenges of harsh environments and resource constraints such as power and transmission bandwidth. The image data collected by the on-board cameras are often impossible to transmit to the ground due to low bandwidth or inadequate transmission duration. The JPEG image compression standard, developed by the Joint Photographic Experts Group committee for compressing digital and full-color photographic images, is ubiquitous and is a useful solution to this problem. In this paper, a hardware-software co-design approach is presented with the aim of implementing a JPEG encoder to reduce the transmission bandwidth requirement of a planetary micro-rover. A pipelined hardware architecture of the JPEG encoder requiring reduced hardware resources and power is designed for PowerPC and MIL-1750 processor interfaces, and its performance and resource utilization are compared for both processor architectures using standard images of various sizes and quality settings. Results are substantiated using extensive simulation and an RTL implementation in an FPGA. Based on these studies, an efficient architecture is arrived at for use in a planetary micro-rover for future exploration by an Indian moon mission.
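The per-block stage that such a pipeline parallelizes is the 8x8 DCT followed by quantization; the sketch below uses the standard JPEG luminance quantization table and omits zig-zag scanning and entropy coding, so it illustrates the pipeline stage rather than the paper's full encoder.

```python
# Hedged sketch: the 8x8 DCT + quantization core of a JPEG encoder pipeline.
import numpy as np
from scipy.fftpack import dct

# standard JPEG luminance quantization table (quality 50)
Q50 = np.array([
    [16, 11, 10, 16,  24,  40,  51,  61],
    [12, 12, 14, 19,  26,  58,  60,  55],
    [14, 13, 16, 24,  40,  57,  69,  56],
    [14, 17, 22, 29,  51,  87,  80,  62],
    [18, 22, 37, 56,  68, 109, 103,  77],
    [24, 35, 55, 64,  81, 104, 113,  92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103,  99]], dtype=np.float32)

def encode_block(block, scale=1.0):
    """One 8x8 pipeline stage: level shift, 2D DCT, then quantization."""
    shifted = block.astype(np.float32) - 128.0
    coeffs = dct(dct(shifted, axis=0, norm='ortho'), axis=1, norm='ortho')
    return np.round(coeffs / (Q50 * scale)).astype(np.int16)
```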
Selective locality preserving projections for face recognition
F. Dornaika, A. Assoum
Recently, a graph-based method was proposed for Linear Dimensionality Reduction (LDR) based on Locality Preserving Projections (LPP). LPP is a typical linear graph-based dimensionality reduction (DR) method that has been successfully applied to many practical problems such as face recognition. LPP is essentially a linearized version of Laplacian Eigenmaps. When dealing with face recognition problems, LPP is preceded by a Principal Component Analysis (PCA) step in order to avoid possible singularities. Both PCA and LPP are computed by solving an eigendecomposition problem. In this paper, we propose a novel approach called "Selective Locality Preserving Projections" that performs eigenvector selection for LPP; consequently, the problem of dimension estimation for LPP is solved. Moreover, we propose a selective approach that performs eigenvector selection for the case where the mapped samples are formed by concatenating the outputs of PCA and LPP. We have tested the proposed approaches on several public face data sets. Experiments on the ORL, UMIST and YALE face databases show significant improvements in recognition over classical LPP. The proposed approach lends itself nicely to many biometric applications.
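For reference, the classical LPP step that the proposed selection operates on solves the generalized eigenproblem X^T L X a = λ X^T D X a over a neighbourhood graph. The sketch below is a generic implementation (heat-kernel weights, k-NN graph) and assumes the PCA preprocessing mentioned above has already made X^T D X non-singular; it is not the authors' selective variant.

```python
# Hedged sketch: classical Locality Preserving Projections.
import numpy as np
from scipy.linalg import eigh
from scipy.spatial.distance import cdist

def lpp(X, k=5, t=1.0, dim=10):
    """Return a d x dim projection matrix for data X (n x d rows = samples)."""
    d2 = cdist(X, X, 'sqeuclidean')
    W = np.exp(-d2 / t)                        # heat-kernel affinities
    far = np.argsort(d2, axis=1)[:, k + 1:]    # all but self + k nearest
    for i, row in enumerate(far):
        W[i, row] = 0.0
    W = np.maximum(W, W.T)                     # symmetrize the k-NN graph
    D = np.diag(W.sum(axis=1))
    L = D - W                                  # graph Laplacian
    A = X.T @ L @ X
    B = X.T @ D @ X                            # positive definite after PCA step
    _, vecs = eigh(A, B)                       # generalized eigenproblem
    return vecs[:, :dim]                       # smallest eigenvalues first
```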
Intelligent Ground Vehicle Competition
Phobetor: Princeton University's entry in the 2010 Intelligent Ground Vehicle Competition
Joshua Newman, Han Zhu, Brenton A. Partridge, et al.
In this paper we present Phobetor, an autonomous outdoor vehicle originally designed for the 2010 Intelligent Ground Vehicle Competition (IGVC). We describe new vision and navigation systems that have yielded a 3x increase in obstacle detection speed through parallel processing, together with robust lane detection. Phobetor also uses probabilistic local mapping to learn about its environment and Anytime Dynamic A* (AD*) to plan paths to its goals. Our vision software is based on color stereo images and uses robust, RANSAC-based algorithms while running fast enough to support real-time autonomous navigation on uneven terrain. AD* allows Phobetor to respond quickly in all situations, even when optimal planning takes more time, and uses incremental replanning to increase search efficiency. We augment the cost map of the environment with a potential field that addresses the problem of "wall-hugging" and smooths generated paths to allow safe and reliable path following. In summary, we present innovations on Phobetor that are relevant to real-world robotics platforms in uncertain environments.
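The potential-field augmentation can be sketched generically: obstacles radiate a cost that decays with distance, so the planner prefers paths with a margin from walls. The decay constant and peak below are illustrative, not Phobetor's tuning.

```python
# Hedged sketch: augmenting an occupancy cost map with an obstacle potential field.
import numpy as np
from scipy.ndimage import distance_transform_edt

def augment_costmap(occupancy, base_cost=1.0, peak=50.0, sigma=4.0):
    """occupancy: 2D bool grid, True = obstacle. Returns planner-ready cell costs."""
    dist = distance_transform_edt(~occupancy)    # distance to nearest obstacle
    potential = peak * np.exp(-dist / sigma)     # decays away from obstacles
    cost = base_cost + potential
    cost[occupancy] = np.inf                     # obstacles stay untraversable
    return cost
```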
Application of parallelized software architecture to an autonomous ground vehicle
Rahul Shakya, Adam Wright, Young Ho Shin, et al.
This paper presents improvements made to Q, an autonomous ground vehicle designed to participate in the Intelligent Ground Vehicle Competition (IGVC). For the 2010 IGVC, Q was upgraded with a new parallelized software architecture and a new vision processor, and improvements were made to the power system, reducing the number of batteries required for operation from six to one. In previous years, a single state machine was used to execute the bulk of the processing activities, including sensor interfacing, data processing, path planning, navigation algorithms and motor control. This inefficient approach led to poor software performance and made the code difficult to maintain or modify. For IGVC 2010, the team implemented a modular parallel architecture using the National Instruments (NI) LabVIEW programming language. The new architecture divides all the necessary tasks (motor control, navigation, sensor data collection, etc.) into well-organized components that execute in parallel, providing considerable flexibility and facilitating efficient use of processing power. Computer vision is used to detect white lines on the ground and determine their location relative to the robot. With the new vision processor and some optimization of the image processing algorithm used the previous year, two frames can be acquired and processed in 70 ms. With all these improvements, Q placed 2nd in the autonomous challenge.
WOAH: an obstacle avoidance technique for high speed path following
Nat Tuck, Michael McGuinness, Fred Martin
This paper presents WOAH, a method for real-time mobile robot path following and obstacle avoidance. WOAH provides reactive speed and turn instructions based on obstacle information sensed by a laser range finder. Unlike many previous techniques, this method allows a robot to move quickly past obstacles that are not directly in its path, avoiding slowdowns in path following encountered by previous obstacle avoidance techniques.