Proceedings Volume 6764

Intelligent Robots and Computer Vision XXV: Algorithms, Techniques, and Active Vision

David P. Casasent, Ernest L. Hall, Juha Röning
View the digital version of this volume at the SPIE Digital Library.

Volume Details

Date Published: 9 September 2007
Contents: 10 Sessions, 35 Papers, 0 Presentations
Conference: Optics East 2007
Volume Number: 6764

Table of Contents

All links to SPIE Proceedings will open in the SPIE Digital Library.
  • Front Matter: Volume 6764
  • Invited Papers I
  • Invited Papers II
  • Multiple Pedestrian Tracking
  • Multiple Vehicle Tracking
  • Novel Learning and Vision
  • Image Processing
  • Unmanned Vehicles
  • Obstacle Avoidance
  • Poster Session
Front Matter: Volume 6764
This PDF file contains the front matter associated with SPIE Proceedings Volume 6764, including the Title Page, Copyright information, Table of Contents, and the Conference Committee listing.
Invited Papers I
New design method for a hierarchical SVM-based classifier
Yu-Chiang Frank Wang, David Casasent
We propose to use new SVM-type classifiers in a binary hierarchical tree classification structure to efficiently address the multi-class classification problem. A new hierarchical design method, WSV (weighted support vector) K-means clustering, is presented; it automatically selects the classes to be separated at each node in the hierarchy. Our method is able to visualize and cluster high-dimensional support vector data and therefore improves upon prior hierarchical classifier designs. At each node in the hierarchy, we apply an SVRDM (support vector representation and discrimination machine) classifier, which offers generalization and good rejection of unseen false objects; rejection is not achieved with the standard SVM classifier. We provide the theoretical basis for, and insight into, the choice of the Gaussian kernel that gives the SVRDM its rejection ability. New classification and rejection test results are presented on a real infrared (IR) database.
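The rejection behavior described here can be illustrated with a minimal sketch. This is not the authors' SVRDM: it is an ordinary Gaussian-kernel SVM whose decision values are thresholded, and the threshold value is an assumption made for illustration.

```python
# Minimal sketch: rejection at one node of a binary hierarchical tree.
# An RBF-kernel SVM stands in for the SVRDM; the 0.2 rejection threshold
# is an assumed value, not from the paper.
from sklearn.svm import SVC

def train_node(X, y, gamma=0.5):
    """Train one tree-node classifier with a Gaussian (RBF) kernel."""
    return SVC(kernel="rbf", gamma=gamma).fit(X, y)

def classify_or_reject(clf, x, threshold=0.2):
    """Reject inputs whose decision value is too close to the boundary;
    this is one way unseen false objects can be filtered out."""
    score = clf.decision_function(x.reshape(1, -1))[0]
    if abs(score) < threshold:
        return "reject"
    return clf.predict(x.reshape(1, -1))[0]
```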
Eclectic theory of intelligent robots
The purpose of this paper is to introduce a concept of eclecticism for the design, development, simulation, and implementation of a real-time controller for intelligent, vision-guided robots. The use of an eclectic perceptual, creative controller that can select its own tasks and perform autonomous operations is illustrated. This eclectic controller is a new paradigm for robot controllers and is an attempt to simplify the application of intelligent machines in general and robots in particular. The idea is to use a task control center and a dynamic programming approach. However, the information required for an optimal solution may only partially reside in a dynamic database, so some tasks are impossible to accomplish; a decision must therefore be made about the feasibility of a solution to a task before the task is attempted. Even when tasks are feasible, an iterative learning approach may be required, and the learning could go on indefinitely. The dynamic database stores both global environmental information and local information, including the kinematic and dynamic models of the intelligent robot. The kinematic model is very useful for position control and simulations. However, models of the dynamics of the manipulators are needed for tracking control of the robot's motions. Such models are also necessary for sizing the actuators, tuning the controller, and achieving superior performance. Simulations of various control designs are shown. Much of the model has also been used for the actual prototype Bearcat Cub mobile robot. This vision-guided robot was designed for the Intelligent Ground Vehicle Competition. A novel feature of the proposed approach is that it applies both to robot arm manipulators and to mobile robots such as wheeled mobile robots. This generality should encourage the development of more mobile robots with manipulator capability, since both models can easily be stored in the dynamic database. The multi-task controller also permits a wide range of applications. The use of manipulators and mobile bases with high-level control is potentially useful for space exploration, certain rescue robots, defense robots, medical robotics, and robots that aid older people in daily living activities.
Emerging directions in lunar and planetary robotics
We overview National Aeronautics and Space Administration (NASA) objectives for future robotic exploration of the Moon, planets, and small bodies of the Solar System, and present several examples of supporting robotics R&D. The scope of development spans autonomous surface exploration, typified by the Mars Exploration Rovers (MER) and sequel Mars surface missions; autonomous aerial and subsurface robotic exploration of the outer planets' moons; and recently initiated efforts under the Vision for Space Exploration (VSE) toward a sustained human-robotic presence on the Earth's Moon.
Invited Papers II
Embedded object concept: case balancing two-wheeled robot
This paper presents the Embedded Object Concept (EOC) and a telepresence robot system that serves as a test case for the EOC. The EOC applies common object-oriented methods from software to combined Lego-like software-hardware entities. These entities represent objects in object-oriented design methods, and they are the building blocks of embedded systems. The goal of the EOC is to make the design of embedded systems faster and easier. The concept enables people without comprehensive knowledge of electronics design to create new embedded systems, and for experts it shortens the design time of new embedded systems. We present the current status of a telepresence robot created with Atomi-objects, the name for our implementation of the embedded objects. The telepresence robot is a relatively complex test case for the EOC. The robot has been constructed using incremental device development, which is made possible by the architecture of the EOC. The robot contains video and audio exchange capability and a control system for driving with two wheels. The robot consists of Atomi-objects, demonstrating the suitability of the EOC for prototyping and easy modification, and proving the capabilities of the EOC by realizing a function that normally requires a computer. The computer counterpart is a regular PC with audio and video capabilities running a robot control application. The robot is functional and has been successfully tested.
What the human eye tells the brain: a new approach toward a hardware-based modeling of mental functions
A better understanding of intelligent information processing in human vision can be reached through a closer look at the macro- and micro-hardware available in the hierarchy of cortical processors along the main visual pathway connecting the retina, the CGL (corpus geniculatum laterale), and area V1 (cortical visual area 17). The building of the eye is driven by the brain, and the engineering of the main visual pathway back to V1 seems to be driven by the eyes. The human eye offers the brain much more intelligent information about the outer visible world than a camera producing flat 2D images on a CCD. Intelligent processing of visual information in human vision - a strong cooperation between eyes and brain - relies on axis-related symmetry operations relevant for navigation in 4D spectral space-times, on a hierarchy of dynamically balanced equilibrium states, on diffractive-optical transformation of the visible spectrum into RGB space, on range mapping based on RGB data (monocular and binocular 3D vision), on illuminant-adaptive optical correlation of local onto global RGB data (color constancy performance), and on invariant Fourier-optical log-polar processing of image data (generic object classification; identification of objects). These performances are more compatible with the optical processing of modern diffractive-optical sensors and interference-optical correlators than with cameras. The R&D project NAMIROS (Nano- and Micro-3D Gratings for Optical Sensors) [8], coordinated by an interdisciplinary team of specialists at Corrsys 3D Sensors AG, describes the roadmap toward a technical realization of these high-tech performances corresponding to human eye-brain co-processing.
The 15th Annual Intelligent Ground Vehicle Competition: intelligent ground robots created by intelligent students
The Intelligent Ground Vehicle Competition (IGVC) is one of three unmanned-systems student competitions founded by the Association for Unmanned Vehicle Systems International (AUVSI) in the 1990s. The IGVC is a multidisciplinary exercise in product realization that challenges college engineering student teams to integrate advanced control theory, machine vision, vehicular electronics, and mobile platform fundamentals to design and build an unmanned system. Teams from around the world focus on developing a suite of dual-use technologies to equip ground vehicles of the future with intelligent driving capabilities. Over the past 15 years, the competition has challenged undergraduate, graduate, and Ph.D. students with real-world applications in intelligent transportation systems, the military, and manufacturing automation. To date, teams from over 50 universities and colleges have participated. This paper describes some of the applications of the technologies required by this competition and discusses the educational benefits. The primary goal of the IGVC is to advance engineering education in intelligent vehicles and related technologies. The employment and professional networking opportunities created for students and industrial sponsors through a series of technical events over the four-day competition are highlighted. Finally, an assessment of the competition based on participation is presented.
Multiple Pedestrian Tracking
HOG pedestrian detection applied to scenes with heavy occlusion
This paper describes the implementation of a pedestrian detection system which is based on the Histogram of Oriented Gradients (HOG) principle and which tries to improve the overall detection performance by combining several part based detectors in a simple voting scheme. The HOG feature based part detectors are specifically trained for head, head-left, head-right, and left/right sides of people, assuming that these parts should be recognized even in very crowded environments like busy public transportation platforms. The part detectors are trained on the INRIA people image database using a polynomial Support Vector Machine. Experiments are undertaken with completely different test samples which have been extracted from two imaging campaigns in an outdoor setup and in an underground station. Our results demonstrate that the performance of pedestrian detection degrades drastically in very crowded scenes, but that through the combination of part detectors a gain in robustness and detection rate can be achieved at least for classifier settings which yield very low false positive rates.
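The part-based voting scheme can be sketched as follows. This is an illustration only: the part windows, the 64x64 patch normalization, and the vote threshold are assumptions rather than the authors' trained settings.

```python
# Hedged sketch of combining HOG part detectors in a simple voting scheme.
# Part windows and thresholds are illustrative assumptions.
from skimage.feature import hog
from skimage.transform import resize

PART_WINDOWS = {                 # fractional sub-windows of a candidate box
    "head":       (0.25, 0.0, 0.75, 0.4),
    "left_side":  (0.0,  0.0, 0.5,  1.0),
    "right_side": (0.5,  0.0, 1.0,  1.0),
}

def is_pedestrian(patch, detectors, min_votes=2):
    """detectors: one trained SVM (with decision_function) per part name."""
    h, w = patch.shape
    votes = 0
    for name, (x0, y0, x1, y1) in PART_WINDOWS.items():
        sub = resize(patch[int(y0*h):int(y1*h), int(x0*w):int(x1*w)], (64, 64))
        feat = hog(sub, orientations=9, pixels_per_cell=(8, 8),
                   cells_per_block=(2, 2))
        if detectors[name].decision_function([feat])[0] > 0:
            votes += 1                    # this part detector fires
    return votes >= min_votes             # simple voting combination
```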
Robust pedestrian detection and tracking in crowded scenes
This paper presents a vision-based tracking system developed for very crowded situations such as underground or railway stations. Our system consists of two main parts: searching for people candidates in single frames, and tracking them frame to frame across the scene. This paper concentrates mostly on the tracking part and describes its core components in detail. These are trajectory prediction using KLT vectors or a Kalman filter, adaptive active shape model adjustment, and texture matching. We show that the combination of the presented algorithms leads to robust people tracking even in complex scenes with persistent occlusions.
Multiple pedestrian detection using IR LED stereo camera
Bo Ling, Michael I. Zeifman, David R.P. Gibson
As part of the U.S. Department of Transportation's Intelligent Vehicle Initiative (IVI) program, the Federal Highway Administration (FHWA) is conducting R&D in vehicle safety and driver information systems. There is an increasing number of applications where pedestrian monitoring is of high importance. Vision-based pedestrian detection in outdoor scenes is still an open challenge. People dress in very different colors that sometimes blend with the background, wear hats or carry bags, and stand, walk, and change direction unpredictably. Backgrounds vary, containing buildings, moving or parked cars, bicycles, street signs, signals, etc. Furthermore, existing pedestrian detection systems perform only during daytime, making it impossible to detect pedestrians at night. Under FHWA funding, we are developing a multi-pedestrian detection system using an IR LED stereo camera. This system, without using any templates, detects pedestrians through statistical pattern recognition utilizing 3D features extracted from the disparity map. A new IR LED stereo camera is being developed, which can help detect pedestrians during daytime and nighttime. Using image differencing and denoising, we have also developed new methods to estimate the disparity map of pedestrians in near real time. Our system will have a hardware interface to the traffic controller through wireless communication. Once pedestrians are detected, traffic signals at the street intersection will change phase to alert the drivers of approaching vehicles. Initial test results using images collected at a street intersection show that our system can detect pedestrians in near real time.
A fixed-point Kanade Lucas Tomasi tracker implementation for smart cameras
This work presents the implementation of the Kanade-Lucas-Tomasi (KLT) tracking algorithm on a digital signal processor with a 40-bit fixed-point arithmetic logic unit built into a smart camera. The main goal of this work was to obtain real-time frame processing performance while losing as little tracking accuracy as possible. The task was motivated by the increasing demand for smart cameras as the main data processing units in large surveillance systems, where factors such as cost and space requirements exclude PCs from this role. First, the Kanade-Lucas-Tomasi algorithm was converted to integer arithmetic; then the influence of this modification on stability and accuracy was investigated. It is demonstrated how changing the numeric data type of intermediate results within the algorithm from float to integer, and decreasing the number of bits used to store variables, affects tracking accuracy. The DSP implementation can nevertheless be used where the computation of optical flow based on a tracking algorithm needs to be done in real time on an embedded platform and limited sub-pixel accuracy can be tolerated. As a further result of this implementation, we conclude that a DSP with a fixed-point arithmetic logic unit can be applied very effectively to complex computer vision tasks and is able to deliver good performance even compared with high-end PC architectures.
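The float-to-fixed-point conversion studied in the paper can be emulated on a PC to see the accuracy trade-off. The Q-format below (16 fractional bits within a wider word) is an assumption, not the paper's exact representation.

```python
# Hedged sketch: emulate fixed-point arithmetic with a chosen number of
# fractional bits. Fewer FRAC_BITS means cheaper DSP code but a larger
# deviation of tracked positions from the floating-point reference.
FRAC_BITS = 16
SCALE = 1 << FRAC_BITS

def to_fixed(x: float) -> int:
    return int(round(x * SCALE))

def fixed_mul(a: int, b: int) -> int:
    return (a * b) >> FRAC_BITS      # rescale after multiplication

def to_float(a: int) -> float:
    return a / SCALE

# A KLT step accumulates gradient products (GxGx, GxGy, GyGy, ...) in this
# representation; the quantization error of one product looks like:
err = abs(to_float(fixed_mul(to_fixed(0.123), to_fixed(4.567))) - 0.123 * 4.567)
```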
Detection and tracking using multi-color target models
Hiroshi Oike, Toshikazu Wada, Takeo Iizuka, et al.
This paper presents an object detection and tracking algorithm that can adapt to object color shift. In this algorithm, we train and build multiple target models using colors observed under different illumination conditions. Each model is called a Color Distinctiveness lookup Table (CDT). The color distinctiveness is a value integrating 1) similarity with target colors and 2) dissimilarity with non-target colors, and it represents how distinctively a color can be classified as a target pixel. Color distinctiveness can be used for pixel-wise target detection, because it takes the value 0.5 for colors on the decision boundary of a nearest-neighbor classifier in color space. It can also be used for target tracking by continuously finding the most distinctive region. By selecting the CDT most suitable for the camera direction, lighting condition, and camera parameters, the system can adapt to target and background color changes. We implemented this algorithm on a pan-tilt stereo camera system. Through experiments with this system, we confirmed that the algorithm is robust against color shift caused by illumination change and that it can measure the target's 3D position at video rate.
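The distinctiveness value can be sketched directly from its description: it equals 0.5 exactly on the nearest-neighbor decision boundary in color space. The formula below is one plausible realization of that property, not necessarily the paper's exact definition.

```python
# Hedged sketch of a color distinctiveness value in [0, 1];
# values above 0.5 indicate "target". The exact integration used to
# build the paper's CDTs may differ.
import numpy as np

def distinctiveness(c, target_colors, nontarget_colors):
    """c: RGB triple; target/nontarget_colors: (N, 3) training colors."""
    d_t = np.min(np.linalg.norm(target_colors - c, axis=1))
    d_n = np.min(np.linalg.norm(nontarget_colors - c, axis=1))
    return d_n / (d_t + d_n + 1e-12)   # 0.5 when d_t == d_n (boundary)

# A CDT would then be a lookup table of this value over the quantized
# RGB cube, one table per illumination condition.
```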
Multiple Vehicle Tracking
A real time vehicle tracking system for an outdoor mobile robot
This paper describes progress toward a street-crossing system for an outdoor mobile robot. The system can detect and track vehicles in real time. It reasons about extracted motion regions to decide when it is safe to cross.
An occlusion robust likelihood integration method for multi-camera people head tracking
Yusuke Matsumoto, Takekazu Kato, Toshikazu Wada
This paper presents a novel method for human head tracking using multiple cameras. Most existing methods estimate the 3D target position from 2D tracking results at different viewpoints. This framework is easily affected by inconsistent tracking results on the 2D images, which leads to 3D tracking failure. To solve this problem, an extension of Condensation using multiple images has been proposed. The method generates many hypotheses on a target (human head) in 3D space and estimates the likelihood of each hypothesis by integrating viewpoint-dependent likelihood values of 2D hypotheses projected onto the image planes. In theory, viewpoint-dependent likelihood values should be integrated by multiplication; however, multiplication is easily affected by occlusions. We therefore investigate this problem, propose a novel likelihood integration method, and implement a prototype system consisting of six PC-camera pairs. We confirmed the method's robustness against occlusions.
Articulated motion analysis via axis-based representation
Human motion analysis is one of the active research areas in computer vision. The trend is shifting from computing motion fields to determining actions. We present an action coding scheme based on trajectories of features defined with respect to a part-based coordinate system. The method requires neither a prior human model nor special motion capture hardware. The features are extracted from images segmented in the form of silhouettes. The feature extraction step ignores 3D effects such as self-occlusion or motion perpendicular to the viewing plane; these effects are later revealed in the trajectory analysis. We present preliminary experiments.
Novel Learning and Vision
An automatic quality system for injection molding
Pasi Koikkalainen, Michael Haranen, Anssi Lensu, et al.
This paper describes a fully automatic quality system for injection molding. The proposed system includes an on-line measurement platform with a digital camera, a methodology for adaptive design of experiments (DOE), statistical modeling, process monitoring, and a closed loop process control. The system has been tested in the manufacturing of plastic parts for mobile phones.
New experimental diffractive-optical data on E. Land's Retinex mechanism in human color vision: Part II
A better understanding of the color constancy mechanism in human color vision [7] can be reached through analyses of photometric data of all illuminants and patches (Mondrians or other visible objects) involved in visual experiments. In Part I [3] and in [4, 5, 6], the integration in the human eye of the geometrical-optical imaging hardware and the diffractive-optical hardware has been described and illustrated (Fig. 1). This combined hardware is the main topic of the NAMIROS research project (Nano- and Micro-3D Gratings for Optical Sensors) [8], promoted and coordinated by Corrsys 3D Sensors AG. The hardware relevant to (photopic) human color vision can be described as a diffractive or interference-optical correlator transforming incident light into diffractive-optical RGB data and relating local RGB data to global RGB data in the near field behind the 'inverted' human retina. The relative differences in local/global RGB interference-optical contrasts are available to the photoreceptors (cones and rods) only after this optical pre-processing.
Creative learning for intelligent robots
This paper describes a methodology for creative learning that applies to both man and machines. Creative learning is a general approach used to solve optimal control problems. The creative controller for intelligent machines integrates a dynamic database and a task control center into the adaptive critic learning model. The task control center can function as a command center, decomposing tasks into sub-tasks with different dynamic models and criteria functions, while the dynamic database acts as an information system. To illustrate the theory of creative control, several experimental simulations for robot arm manipulators and mobile wheeled vehicles are included. The simulation results showed that the best performance among all controllers was obtained with the adaptive critic controller. By changing the paths of the robot arm manipulator in the simulation, it was demonstrated that the learning component of the creative controller adapted to a new set of criteria. The Bearcat Cub robot was another experimental example used for testing creative control learning. The significance of this research is to generalize adaptive control theory in a direction toward the highest level of human learning: imagination. In doing so, it is hoped to better understand adaptive learning theory and to build more human-intelligence-like components and capabilities into intelligent robots. It is also hoped that a greater understanding of machine learning will motivate similar studies to improve human learning.
A concept for ubiquitous robotics in industrial environment
Mikko Sallinen, Juhani Heilala, Sauli Kivikunnas
In this paper a concept for industrial ubiquitous robotics is presented. The concept combines two different approaches to manage agile, adaptable production: first, the human operator is kept firmly in the production loop; second, the robot workcell is made more autonomous and smarter at managing production. Such an autonomous robot cell can be called a production island. Communication with the human operator working in this kind of smart industrial environment can be divided into two levels: body-area communication and operator-infrastructure communication, the latter including devices, machines, and infrastructure. Body-area communication is supportive in two directions: data is recorded by measuring physical actions such as hand movements and body gestures, and information such as operating guides or manuals is provided to the user. Body-area communication can be carried out using short-range communication technologies such as NFC (Near Field Communication), an RFID-type technology. For operator-infrastructure communication, WLAN or Bluetooth can be used. Going beyond current human-machine interaction (HMI) systems, the presented system concept is designed to fulfill the requirements of hybrid, knowledge-intensive manufacturing in the future, where humans and robots operate in close co-operation.
Image Processing
High accuracy 2D sub-pixel matching method skillfully managing error characteristics
Hitoshi Nishiguchi, Yoshihiko Nomura, Ryota Sakamoto, et al.
In computer vision, many algorithms have been developed for image registration based on image pattern matching. However, no single method suits all applications; each has advantages and disadvantages, so the method best suited to each task must be selected. A representative sub-pixel registration method uses one-dimensional parabola fitting over the similarity measurements at three positions. The parabola fitting method can be applied in two dimensions by assuming that the horizontal and vertical displacements are independent. Although this method has been widely used because of its simplicity and practical usability, it involves large errors. To avoid these errors, which depend on the spatial structure of the image pattern, "two-dimensional simultaneous sub-pixel estimation" was proposed. However, it needs conditional branching procedures such as scan-field expansion and exception handling, which make the estimation unstable and slow down processing. The authors therefore employ paraboloid fitting: using the least-squares method, a paraboloid is fitted to the image similarity values at nine points, and the best matching point is obtained at sub-pixel resolution. This approach is robust to the image pattern and enables a speed-up, but a margin of error remains. The authors analyzed the error characteristics of sub-pixel estimation using paraboloid fitting. The error can be characterized by a bias (a systematic error) and a dispersion (a random error), and the magnitude of each was found to differ according to the sub-pixel value of the best matching position. In this paper, based on this analysis, the authors propose a novel, accurate algorithm for 2D sub-pixel matching. The method needs no iteration or exception handling at runtime, so it is easy to implement in software and hardware. Experimental results demonstrate the advantage of the proposed algorithm.
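The paraboloid fit over nine similarity values has a closed-form sub-pixel peak. The sketch below shows that baseline step (the least-squares fit and the stationary point); the paper's contribution, the bias/dispersion correction, is not reproduced here.

```python
# Hedged sketch: least-squares paraboloid through the 3x3 grid of
# similarity values around the best integer match; the extremum is
# found in closed form.
import numpy as np

def subpixel_peak(S):
    """S: 3x3 similarity values, rows = y offsets, cols = x offsets.
    Returns (dx, dy) sub-pixel offsets of the fitted extremum."""
    xs, ys = np.meshgrid([-1, 0, 1], [-1, 0, 1])
    x, y, s = xs.ravel(), ys.ravel(), S.ravel()
    # Model s(x, y) = a*x^2 + b*y^2 + c*x*y + d*x + e*y + f
    A = np.column_stack([x * x, y * y, x * y, x, y, np.ones(9)])
    a, b, c, d, e, _ = np.linalg.lstsq(A, s, rcond=None)[0]
    # Stationary point: gradient of the quadratic surface equals zero
    dx, dy = np.linalg.solve([[2 * a, c], [c, 2 * b]], [-d, -e])
    return dx, dy
```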
Dynamic template size control in digital image correlation based strain measurements
Janne Koljonen, Olli Kanniainen, Jarmo T. Alander
Image matching is a common procedure in computer vision. Usually the size of the image template is fixed. If matching is done repeatedly, as e.g. in stereo vision, object tracking, and strain measurements, it is beneficial, in terms of computational cost, to use as small a template as possible. On the other hand, larger templates usually give more reliable matches, unless e.g. projective distortions become too great. If the template size is controlled locally and dynamically, computational efficiency and reliability can be achieved simultaneously. Adaptive template sizing requires, though, that a larger template can be sampled at any time. This paper introduces a method to adaptively control the template size in a digital image correlation based strain measurement algorithm. The control inputs are measures of confidence of match. Some new measures are proposed in this paper, and those found in the literature are reviewed. The measures of confidence are tested and compared with each other as well as with a reference method using templates of fixed size. The comparison is done with respect to the computational complexity and accuracy of the algorithm. Due to complex interactions among the free parameters of the algorithm, random search is used to find an optimal parameter combination to attain a more reliable comparison. The results show that with some confidence measures the dynamic scheme outperforms the static reference method. However, in order to benefit from the dynamic scheme, optimization of the parameters is needed.
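The control loop can be sketched in outline: grow the template while the match confidence is poor, shrink it when confidence is comfortably high. The confidence measure, thresholds, and size bounds below are assumptions; the paper compares several concrete confidence measures.

```python
# Hedged sketch of confidence-driven template-size control.
# get_template(center, size) samples a template of the given size on demand;
# matcher(image, tpl) returns (position, confidence), e.g. a peak NCC value.
def match_with_dynamic_template(image, center, get_template, matcher,
                                size=16, min_size=8, max_size=64,
                                lo=0.6, hi=0.9):
    while True:
        tpl = get_template(center, size)
        pos, confidence = matcher(image, tpl)
        if confidence >= lo or size >= max_size:
            break
        size *= 2                      # unreliable match: enlarge template
    if confidence >= hi and size > min_size:
        size //= 2                     # very reliable: try smaller next time
    return pos, size
```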
Hand gesture recognition by analysis of codons
Poornima Ramachandra, Neelima Shrikhande
The problem of recognizing gestures from images using computers can be approached by closely understanding how the human brain tackles it. A full-fledged gesture recognition system could substitute for the mouse and keyboard completely. Humans can recognize most gestures by looking at the characteristic external shape or silhouette of the fingers. Many previous techniques for recognizing gestures dealt with motion and geometric features of hands. In this work, gestures are recognized by the Codon-list pattern extracted from the object contour. All edges of an image are described in terms of sequences of Codons. The Codons are defined in terms of the relationship between the maxima, minima, and zeros of curvature encountered as one traverses the boundary of the object. We have concentrated on a catalog of 24 gesture images from the American Sign Language alphabet (the letters J and Z are ignored, as they are represented using motion) [2]. The query image given as input to the system is analyzed and tested against the Codon-lists, which are shape descriptors for the external parts of a hand gesture. We use the Weighted Frequency Indexing Transform (WFIT) approach, employed in DNA sequence matching, to match the Codon-lists. The matching algorithm consists of two steps: 1) the query sequences are converted to short sequences and assigned weights, and 2) all the sequences of query gestures are pruned into match and mismatch subsequences by the frequency indexing tree based on the weights of the subsequences. The Codon sequences with the greatest weight are used to determine the most precise match. Once a match is found, the identified gesture and corresponding interpretation are shown as output.
Object and pose recognition with cellular genetic algorithms
We have studied the use of cellular automata and cellular genetic algorithms for object recognition, pose recognition, and image classification. The cellular genetic algorithm is a genetic algorithm that has some similarities with cellular automata. The preliminary results seem to support the hypothesis that, in principle, this kind of object and pose recognition and image classification method works relatively well. The drawback of the proposed method is the large amount of computation needed to test an unknown object against the objects in the comparison set.
Searching strain field parameters by genetic algorithms
This paper studies the applicability of genetic algorithms and imaging to measuring deformations. Genetic algorithms are used to search for the strain field parameters of images from a uniaxial tensile test. The non-deformed image is artificially deformed according to the estimated strain field parameters, and the resulting image is compared with the true deformed image; the mean difference of intensities is used as the fitness function. Results are compared with a node-based strain measurement algorithm developed by Koljonen et al. The reference method slightly outperforms the genetic algorithm in terms of the mean difference of intensities, and the root-mean-square difference of the displacement fields is less than one pixel. However, with some improvements suggested in this paper, the genetic algorithm based method may be worth considering in other similar applications as well: surface matching, instead of individual landmarks, could be used in camera calibration and image registration, and the search for deformation parameters by genetic algorithms could be applied to pattern recognition tasks, e.g. in robotics, object tracking, and remote sensing, where the objects are subject to deformation. In addition, other transformation parameters could be sought simultaneously.
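The fitness evaluation can be sketched as follows. The affine strain parameterization (two normal strains, a shear term, and a translation) is an assumption for illustration; the paper's parameterization may differ.

```python
# Hedged sketch of the GA fitness function: warp the undeformed image
# with candidate strain parameters and compare against the deformed image.
import numpy as np
from scipy.ndimage import map_coordinates

def fitness(params, ref_img, def_img):
    """Mean absolute intensity difference; lower is fitter."""
    exx, eyy, exy, tx, ty = params
    h, w = ref_img.shape
    yy, xx = np.mgrid[0:h, 0:w].astype(float)
    xs = (1 + exx) * xx + exy * yy + tx    # deformed sampling positions
    ys = exy * xx + (1 + eyy) * yy + ty
    warped = map_coordinates(ref_img, [ys, xs], order=1, mode="nearest")
    return np.mean(np.abs(warped - def_img))
```

A GA would minimize this fitness over the parameter vector; the comparison in the paper uses exactly this kind of intensity-difference criterion.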
Unmanned Vehicles
Terminal phase visual position estimation for a tail-sitting vertical takeoff and landing UAV via a Kalman filter
Allen C. Tsai, Peter W. Gibbens, R. Hugh Stone
Computer vision has been an active field of research for many decades and has become widely used in airborne applications in the last decade or two. Much airborne computer vision research has focused on navigation for Unmanned Air Vehicles (UAVs); this paper presents a method to estimate the full 3D position of a UAV by integrating visual cues from a single image with data from an Inertial Measurement Unit under a Kalman filter formulation. Previous work on visual 3D position estimation for UAV landing has used two or more frames of feature-rich image data; however, raw vision state estimates are highly susceptible to image noise. This paper uses a rather conventional type of landing pad, with visual features extracted for use in the Kalman filter to obtain optimal 3D position estimates. The methodology promises state estimates better suited to the guidance and control of a UAV, and also promises to enable autonomous landing of UAVs without GPS information. Results of the implementation, tested on flight images, are presented.
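One cycle of such a fusion filter can be sketched as a linear Kalman filter: the IMU propagates a position/velocity state, and the vision-derived 3D position from the landing pad corrects it. The constant-velocity model and the noise levels are assumptions, not the paper's tuned filter.

```python
# Hedged sketch: IMU propagation + vision position update in a linear KF.
import numpy as np

def kf_step(x, P, accel, z_vision, dt, q=0.5, r=0.2):
    """x = [px, py, pz, vx, vy, vz]; z_vision = 3D position fix (vision)."""
    F = np.eye(6)
    F[:3, 3:] = dt * np.eye(3)                      # constant-velocity model
    u = np.concatenate([0.5 * dt**2 * accel, dt * accel])
    x = F @ x + u                                   # propagate with IMU accel
    P = F @ P @ F.T + q * np.eye(6)                 # inflate covariance
    H = np.hstack([np.eye(3), np.zeros((3, 3))])    # vision measures position
    S = H @ P @ H.T + r * np.eye(3)
    K = P @ H.T @ np.linalg.inv(S)                  # Kalman gain
    x = x + K @ (z_vision - H @ x)                  # correct with vision fix
    P = (np.eye(6) - K @ H) @ P
    return x, P
```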
Visual navigation aid for planetary UAV risk reduction
C. A. McPherson, M. S. Bottkol, R. W. Madison, et al.
Unlike navigation for Earth operations, the precise navigation of a vehicle in a remote planetary environment presents a challenging problem for either absolute or relative navigation. There are no GPS/INS solutions, due to the lack of a GPS constellation; there are few or no accurately surveyed markers for use in terminal sensing measurements; and the terrain elevation maps used by a TERCOM system are highly uncertain. These and other issues prompted the investigation of a visual navigation aid to supplement the Inertial Navigation System (INS) and radar altimeter suite of a planetary airplane, for the purpose of identifying the potential benefit of visual measurements to the overall navigation solution. The mission objective used in the study described herein requires precise relative navigation of the airplane over uncertain terrain. Unlike the previously successful employment of vision-aided navigation on the MER [1] landing vehicle, the mission objectives require the airplane to traverse a precise flight pattern over the objective terrain at relatively low altitudes for hundreds of kilometers, and the problem is more akin to a velocity correlator application than to a terminal fix problem. The results of the investigation indicate that good knowledge of aircraft altitude is required to obtain the desired velocity estimate accuracy; however, the direction of the velocity vector can be obtained without a highly accurate height estimate. The characterization of the dependency of velocity estimate accuracy on the various factors involved is the primary focus of this report. The report describes the approach taken to define the architecture of the solution for minimal impact on payload requirements, and the analysis of the potential gains to the overall navigation problem. Also described as part of the problem definition are the initially assumed sources of visual measurement error and some additional constraints that limit the choice of solutions.
Virtual force field based obstacle avoidance and agent based intelligent mobile robot
Saurabh Sarkar, Scott Reynolds, Ernest Hall
This paper presents a modified virtual-force-based obstacle avoidance approach suited to a laser range finder. The modified method takes advantage of the polar data delivered by the laser sensor by mapping the environment in a polar coordinate system. The method also utilizes Gaussian-function-based certainty values to detect obstacles. The method successfully navigates through complex obstacle configurations and reaches target GPS waypoints.
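The force computation can be sketched directly on one polar scan. The particular Gaussian certainty weighting and the gain constants below are assumptions; the paper's exact formulation may differ.

```python
# Hedged sketch: virtual-force steering from one polar laser scan.
import numpy as np

def steering_force(ranges, angles, goal_bearing,
                   k_rep=1.0, k_att=2.0, sigma=2.0):
    """ranges (m) and angles (rad): one scan; returns a 2D steering force."""
    # Certainty of an obstacle falls off with distance (Gaussian weighting)
    certainty = np.exp(-ranges**2 / (2.0 * sigma**2))
    rep = k_rep * certainty / np.maximum(ranges, 0.1)**2
    f_rep = -np.array([np.sum(rep * np.cos(angles)),    # push away from
                       np.sum(rep * np.sin(angles))])   # obstacle bearings
    f_att = k_att * np.array([np.cos(goal_bearing), np.sin(goal_bearing)])
    return f_rep + f_att          # steer along the resultant vector
```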
Lane detection system for autonomous vehicle navigation
This paper presents the vision processing solution used for lane detection by the Insight Racing team for the DARPA Grand Challenge 2007. The problem involves detecting lane markings, at a usable frame rate, in order to maintain the position of the autonomous vehicle within the lane. This paper describes a method based on color interpretation and scanning-based edge detection for quick and reliable results. First, the color information is extracted from the image using an RGB-to-HSV transform and mapped to the Munsell color system. Next, the regions of useful color are scanned adaptively to perform the equivalent of single-pixel edge detection in one stage. These edges are then processed using the Hough transform to yield lines, which are segmented, grouped, and approximated to reduce the number of lines representing straight and curved lane markings. The final lines are numbered and sent to the master controller for each frame. This allows the master controller to pick the bounding lane markings, center the vehicle accordingly, and navigate autonomously. OpenGL is used to display the results. The solution has been tested and is being used by the Insight Racing team for their entry in the DARPA Grand Challenge 2007.
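The color and line stages of such a pipeline can be sketched with OpenCV. The HSV thresholds and Hough parameters are assumptions, and the Munsell mapping and adaptive scanning steps are omitted.

```python
# Hedged sketch: HSV color gating followed by probabilistic Hough lines.
import cv2
import numpy as np

def detect_lane_lines(bgr):
    hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)
    # Keep low-saturation, bright pixels as candidate white lane paint
    mask = cv2.inRange(hsv, (0, 0, 180), (180, 60, 255))
    edges = cv2.Canny(mask, 50, 150)
    lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180,
                            threshold=40, minLineLength=30, maxLineGap=10)
    return [] if lines is None else [l[0] for l in lines]
```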
Obstacle Avoidance
Single-camera stereo vision for obstacle detection in mobile robots
William Lovegrove, Ben Brame
Stereo vision is attractive for autonomous mobile robot navigation, but the cost and complexity of stereo camera systems and the computational requirements make full stereo vision impractical. A novel optical system allows the capture of a pair of short, wide stereo images from a single camera, which are then processed to detect vertical edges and infer obstacle positions within the planar field of view, providing real-time obstacle detection. Our optical system involves a pair of right-angle prisms stacked vertically, splitting the camera field of view vertically in half. Right-angle mirrors on either side redirect the image forward but at a horizontally displaced location, creating two virtual cameras. Tilting these mirrors provides an overlapping image area; alternatively, tilting the prisms produces the same effect. This image area is wide but not very tall. However, in a mobile robot scenario the majority of obstacles of interest intersect this field of view.
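Once the two virtual cameras are calibrated, obstacle range follows from the standard stereo relation. The focal length and baseline values below are placeholders, not the authors' hardware parameters.

```python
# Hedged sketch: range from disparity for the mirror-made virtual camera pair.
def obstacle_range(disparity_px, focal_px=700.0, baseline_m=0.20):
    """Depth Z = f * B / d for two horizontally displaced virtual cameras."""
    if disparity_px <= 0:
        return float("inf")       # no measurable disparity
    return focal_px * baseline_m / disparity_px
```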
Dynamic omni-directional vision localization using a beacon tracker based on particle filter
Zuoliang Cao, Shiyu Liu
Omni-directional vision navigation for AGVs is clearly attractive because of its advantage of panoramic sight within a single compact visual scene. This unique guidance technique involves target recognition, vision tracking, object positioning, and path programming. An algorithm for omni-vision based global localization, which utilizes two overhead features as a beacon pattern, is proposed in this paper. An approach for geometric restoration of omni-vision images has to be considered, since an inherent distortion exists. The mapping between image coordinates and the physical space parameters of the targets can be obtained by means of the imaging principle of the fisheye lens, and the localization of the robot can then be achieved by geometric computation. Dynamic localization employs a beacon tracker to follow the landmarks in real time during the arbitrary movement of the vehicle. A coordinate transformation is devised for path programming based on time-sequence image analysis. Beacon recognition and tracking are key procedures for an omni-vision guided mobile unit; conventional image processing techniques such as shape decomposition, description, and matching are not directly applicable to omni-vision. The particle filter (PF) has been shown to be successful for several nonlinear estimation problems, and a beacon tracker based on a particle filter, which offers a probabilistic framework for dynamic state estimation in visual tracking, has been developed. We use two independent particle filters to track the two landmarks, while a composite multiple-object tracking algorithm performs vehicle localization. We have implemented the tracking and localization system and demonstrated the relevance of the algorithm.
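One cycle of a beacon-tracking particle filter can be sketched as follows; the paper runs two such filters, one per landmark. The likelihood function and the motion-noise scale are assumptions.

```python
# Hedged sketch of one particle-filter cycle for a single beacon.
import numpy as np

def pf_step(particles, weights, likelihood, motion_std=5.0):
    """particles: (N, 2) image positions; likelihood(p) scores one position."""
    n = len(particles)
    # 1) Resample proportionally to the previous weights
    idx = np.random.choice(n, n, p=weights)
    particles = particles[idx]
    # 2) Predict: diffuse particles with random motion
    particles = particles + np.random.normal(0, motion_std, particles.shape)
    # 3) Update: re-weight by the observation likelihood
    weights = np.array([likelihood(p) for p in particles])
    weights = weights / weights.sum()
    estimate = (weights[:, None] * particles).sum(axis=0)
    return particles, weights, estimate
```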
Testing imaging systems with genetic algorithms - case: error diffusion methods
This paper studies the testing of imaging systems and algorithms with genetic algorithms. We test whether there are inherent weaknesses in an image processing algorithm or system, and whether they can be searched for and found with evolutionary algorithms. In this paper, we test the weaknesses of error diffusion halftoning methods. We also take a closer look at the method, and identify why these weaknesses appear and why they are relatively easy to find with synthetic test images. Moreover, we discuss the importance of comprehensive testing before the results of an image processing method can be trusted. The results suggest that the error diffusion methods do not have inherent problems as apparent as those of, for example, the dispersed-dot method, but GA testing does reveal some other problems, such as delayed response to image tone changes. The different error diffusion methods have similar problems, but with different severity.
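For reference, the kind of error diffusion under test can be sketched with the classic Floyd-Steinberg weights; the paper compares several variants, and a GA would then evolve input images that maximize some visible-artifact measure of the output.

```python
# Hedged sketch: Floyd-Steinberg error diffusion halftoning.
import numpy as np

def floyd_steinberg(gray):
    """gray: 2D float array in [0, 1]; returns a binary halftone."""
    img = gray.astype(float).copy()
    h, w = img.shape
    for y in range(h):
        for x in range(w):
            old = img[y, x]
            new = 1.0 if old >= 0.5 else 0.0
            img[y, x] = new
            err = old - new                          # diffuse quantization error
            if x + 1 < w:               img[y, x + 1]     += err * 7 / 16
            if y + 1 < h and x > 0:     img[y + 1, x - 1] += err * 3 / 16
            if y + 1 < h:               img[y + 1, x]     += err * 5 / 16
            if y + 1 < h and x + 1 < w: img[y + 1, x + 1] += err * 1 / 16
    return img
```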
Calibration of the intensity-related distance error of the PMD TOF-camera
Marvin Lindner, Andreas Kolb
A growing number of modern applications, such as position determination, online object recognition, and collision prevention, depend on accurate scene analysis. A low-cost and fast alternative to standard techniques like laser scanners or stereo vision is distance measurement with modulated, incoherent infrared light based on the Photo Mixing Device (PMD) technique. This paper describes an enhanced calibration approach for PMD-based distance sensors, for which highly accurate calibration techniques have not yet been widely investigated. Compared with other known methods, our approach incorporates additional deviation errors related to the variation of the active illumination incident on the sensor pixels. The resulting calibration yields significantly more precise distance information. Furthermore, we present a simple-to-use, vision-based approach for acquiring the reference data required by any distance calibration scheme, yielding a lightweight, on-site calibration system with little expenditure on equipment.
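An intensity-related correction of this kind can be sketched as a least-squares fit of the distance error against measured distance and received intensity. The polynomial form, the degree, and the intensity terms below are assumptions; the paper's calibration model may differ.

```python
# Hedged sketch: fit distance error = f(measured distance, intensity),
# then apply it as a correction. Arrays are per-pixel calibration samples.
import numpy as np

def design_matrix(d_meas, intensity, deg=3):
    return np.column_stack([d_meas**k for k in range(deg + 1)] +
                           [intensity, intensity * d_meas])

def fit_correction(d_meas, intensity, d_true, deg=3):
    A = design_matrix(d_meas, intensity, deg)
    coeffs, *_ = np.linalg.lstsq(A, d_true - d_meas, rcond=None)
    return coeffs

def correct(d_meas, intensity, coeffs, deg=3):
    return d_meas + design_matrix(d_meas, intensity, deg) @ coeffs
```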
Poster Session
Analysis of kernel distortion-invariant filters
David Casasent, Rohit Patnaik
Kernel techniques have been used in support vector machines (SVMs), feature spaces, etc. In kernel methods, the well-known kernel trick is used to implicitly map the input data to a higher-dimensional feature space. If all terms can be written as a kernel function, one can then use data in the higher-dimensional space without actually computing the higher-dimensional features or knowing the mapping function Φ. In this paper, we address kernel distortion-invariant filters (DIFs). Standard DIFs are synthesized in a linear feature space (in the image or Fourier domain); they are fast because they use FFT-based correlations. If the data is mapped to a higher-dimensional feature space before filter synthesis and before performing correlations, kernel filters result and performance can be improved. Kernel versions of several DIFs (OTF, SDF, and MACE) have been presented in prior work. However, several key issues were ignored in all prior work: the unrealistic assumption of centered data in tests; the significantly larger storage and on-line computation time required; and the proper type of energy minimization in filter synthesis, which is necessary to reduce false peaks when the filters are applied to target scenes and has yet to be addressed. In addition, prior kernel DIF work used test-set data to select the value of the kernel parameter. In this paper, we analyze these issues, present supporting test results on two face databases, and present several improvements to prior kernel DIF work.
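The kernel trick named here can be made concrete with a minimal sketch: with a Gaussian kernel, every computation is expressed through pairwise kernel evaluations, so Φ is never formed. This also illustrates the storage/computation issue raised in the abstract, since a kernel filter's response is a weighted sum over stored training images.

```python
# Hedged sketch: Gaussian kernel matrix K[i, j] = exp(-||x_i - y_j||^2 /
# (2 * sigma^2)) computed without ever constructing Phi(x).
import numpy as np

def gaussian_kernel_matrix(X, Y, sigma=1.0):
    """X: (n, d), Y: (m, d) flattened images or feature vectors."""
    sq = (np.sum(X**2, 1)[:, None] + np.sum(Y**2, 1)[None, :]
          - 2.0 * X @ Y.T)
    return np.exp(-sq / (2.0 * sigma**2))

# A kernel filter response to a test vector is then a weighted sum of
# kernel evaluations against all stored training vectors, which is why
# storage and on-line cost grow with the training set size.
```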
Fruit shape classification using support vector machine
A new method with a shape descriptor using a support vector machine to classify fruit shape is developed. The image is first subjected to a normalization process using its regular moments to obtain scale and translation invariance. Rotation-invariant Zernike features are then extracted from the scale- and translation-normalized images, and the number of features is decided by principal component analysis (PCA). Finally, these features are input to a support vector machine (SVM) classifier and compared against different classifiers. Experiments verify that this method, using a support vector machine as the classifier, performs better than traditional approaches.
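The classification stage can be sketched as a PCA-plus-SVM pipeline. It assumes a feature matrix of Zernike moments has already been extracted (e.g. with mahotas.features.zernike_moments); the component count and SVM parameters are assumptions.

```python
# Hedged sketch: PCA-reduced Zernike features fed to an RBF-kernel SVM.
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

def build_fruit_classifier(n_components=12):
    """X rows are Zernike feature vectors; y holds fruit-shape labels."""
    return make_pipeline(PCA(n_components=n_components),
                         SVC(kernel="rbf", C=10.0, gamma="scale"))

# Usage (X_train, y_train, X_test, y_test assumed prepared):
# clf = build_fruit_classifier()
# clf.fit(X_train, y_train)
# accuracy = clf.score(X_test, y_test)
```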
On generating automatic-object-extractable images
Object segmentation and extraction play an important role in computer vision and recognition problems. Unfortunately, with current computing technologies, fully automatic object segmentation is not possible; human intervention is needed to outline the rough boundary of the object to be segmented. The goal of this paper is to make object extraction automatic after the first semi-automatic segmentation. That is, once a semantically meaningful object such as a house or a human body is extracted from the image under human guidance, an image manipulation technique is applied. There is no noticeable difference between the original and the manipulated images; however, the signature embedded by the manipulation can be detected automatically and used to differentiate the object from the background. The manipulated images, called automatic-object-extractable images, can be used to provide training images containing the same object against various backgrounds.