Proceedings Volume 9025

Intelligent Robots and Computer Vision XXXI: Algorithms and Techniques

Juha Röning, David Casasent
View the digital version of this volume at the SPIE Digital Library.

Volume Details

Date Published: 2 February 2014
Contents: 8 Sessions, 29 Papers, 0 Presentations
Conference: IS&T/SPIE Electronic Imaging 2014
Volume Number: 9025

Table of Contents


All links to SPIE Proceedings will open in the SPIE Digital Library.
  • Front Matter: Volume 9025
  • Intelligent Mobile Robot Methods and Advancements
  • Computer Vision Algorithms and Applications
  • Mobile Cognitive System
  • Localization, Tracking, and Scene Analysis
  • 3D Vision
  • Outdoor Robotics
  • Interactive Paper Session
Front Matter: Volume 9025
Front Matter: Volume 9025
This PDF file contains the front matter associated with SPIE Proceedings Volume 9025, including the Title Page, Copyright information, Table of Contents, Introduction (if any), and Conference Committee listing.
Intelligent Mobile Robot Methods and Advancements
Adaptation of human routines to support a robot's tasks planning and scheduling
Antti Tikanmäki, Sandra T. Feliu, Juha Röning
Service robots usually share their workspace with people, and a robot's tasks typically require knowing when and where people are in order to schedule requested tasks. Planning a robot's actions therefore requires taking the presence of humans into account and having knowledge of the robot's environment; in practice, this means knowing when (times and event durations) and where (in the workspace) the robot's tasks can be performed. This research paper takes steps towards obtaining the spatial and temporal information required by software that plans the tasks a robot performs. To this end, we created a program that defines meaningful areas or zones in the robot's workspace by clustering, tied to statistically reasoned time slots in which to perform each task. The software is tested using real data obtained from different cameras located along the corridors of the CSE Department of the University of Oulu.
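As a concrete illustration of the zoning-by-clustering idea, the sketch below clusters observed person positions into zones and builds per-zone hourly occupancy statistics a scheduler could consult; the function and parameter names are illustrative assumptions, not the authors' code.

```python
# Hypothetical sketch: derive workspace "zones" by clustering observed person
# positions, then build per-zone hourly occupancy histograms for scheduling.
# Detections are assumed to be (x, y, unix_time) tuples.
import numpy as np
from sklearn.cluster import DBSCAN

def build_zones(detections, eps_m=1.5, min_samples=10):
    data = np.asarray(detections)                    # shape (N, 3): x, y, t
    labels = DBSCAN(eps=eps_m, min_samples=min_samples).fit_predict(data[:, :2])
    zones = {}
    for z in set(labels) - {-1}:                     # label -1 marks noise
        pts = data[labels == z]
        hours = ((pts[:, 2] // 3600) % 24).astype(int)
        zones[z] = {
            "centroid": pts[:, :2].mean(axis=0),
            "occupancy_by_hour": np.bincount(hours, minlength=24) / len(pts),
        }
    return zones  # schedule robot tasks in a zone's low-occupancy hours
```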
A novel lidar-driven two-level approach for real-time unmanned ground vehicle navigation and map building
Chaomin Luo, Mohan Krishnan, Mark Paulik, et al.
In this paper, a two-level LIDAR-driven hybrid approach is proposed for real-time unmanned ground vehicle navigation and map building. The top level is a newly designed enhanced Voronoi Diagram (EVD) method that plans a global trajectory for the unmanned vehicle. The bottom level employs a Vector Field Histogram (VFH) algorithm based on LIDAR sensor information to locally guide the vehicle in complicated workspaces, in which it autonomously traverses from one node to another within the planned EVD with obstacle avoidance. To find the least-cost path within the EVD, novel distance- and angle-based search heuristics are developed, in which the cost of an edge is the risk of traversing it. An EVD is first constructed from the environment and used to generate the initial global trajectory with obstacle avoidance; the VFH algorithm is then employed to guide the vehicle along the path locally. The effectiveness and efficiency of real-time navigation and map building for unmanned vehicles have been successfully validated by simulation studies and experiments. The proposed approach was also demonstrated on an actual unmanned vehicle, which follows a very stable path while navigating through various obstacles.
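The bottom-level VFH step can be summarized in a few lines: build a polar obstacle-density histogram from the LIDAR scan and steer toward the free sector closest to the goal bearing. The following is a minimal sketch of the standard VFH idea with assumed parameter names, not the authors' implementation.

```python
# Minimal VFH-style steering sketch: nearer LIDAR returns contribute more
# "density" to their angular sector; the robot heads for the free sector
# whose bearing is closest to the goal. Thresholds are assumptions.
import numpy as np

def vfh_steering(ranges, angles, goal_bearing, max_range=5.0,
                 n_sectors=72, density_thresh=0.3):
    ranges = np.minimum(np.asarray(ranges, dtype=float), max_range)
    weights = (max_range - ranges) / max_range       # nearer = denser
    sectors = ((np.asarray(angles) + np.pi) /
               (2 * np.pi) * n_sectors).astype(int) % n_sectors
    hist = np.bincount(sectors, weights=weights, minlength=n_sectors)
    free = hist < density_thresh
    if not free.any():
        return None                                  # fully blocked: replan
    centers = -np.pi + (np.arange(n_sectors) + 0.5) * (2 * np.pi / n_sectors)
    candidates = centers[free]
    diff = np.angle(np.exp(1j * (candidates - goal_bearing)))  # wrapped error
    return candidates[np.argmin(np.abs(diff))]       # steering angle (rad)
```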
The 21st annual intelligent ground vehicle competition: robotists for the future
The Intelligent Ground Vehicle Competition (IGVC) is one of four unmanned-systems student competitions founded by the Association for Unmanned Vehicle Systems International (AUVSI). The IGVC is a multidisciplinary exercise in product realization that challenges college engineering student teams to integrate advanced control theory, machine vision, vehicular electronics and mobile platform fundamentals to design and build an unmanned system. Teams from around the world focus on developing a suite of dual-use technologies to equip ground vehicles of the future with intelligent driving capabilities. Over the past 21 years, the competition has challenged undergraduate, graduate and Ph.D. students with real-world applications in intelligent transportation systems, the military and manufacturing automation. To date, teams from over 80 universities and colleges have participated. This paper describes some of the applications of the technologies required by this competition and discusses the educational benefits. The primary goal of the IGVC is to advance engineering education in intelligent vehicles and related technologies. The employment and professional networking opportunities created for students and industrial sponsors through a series of technical events over the four-day competition are highlighted. Finally, an assessment of the competition based on participation is presented.
Self-localization for an autonomous mobile robot based on an omni-directional vision system
Shu-Yin Chiang, Kuang-Yu Lin, Tsorng-Lin Chia
In this study, we designed an autonomous mobile robot based on the rules of the Federation of International Robot-soccer Association (FIRA) RoboSot category, integrating the techniques of computer vision, real-time image processing, dynamic target tracking, wireless communication, self-localization, motion control, path planning, and control strategy to achieve the contest goal. The self-localization scheme of the mobile robot is based on algorithms applied to the images from its omni-directional vision system. In previous works, we used the image colors of the field goals as reference points, combining either dual-circle or trilateration positioning of the reference points to achieve self-localization of the autonomous mobile robot. However, because the image of the game field is easily affected by ambient light, positioning systems based exclusively on color-model algorithms cause errors. To reduce environmental effects and achieve self-localization of the robot, the proposed algorithm assesses the corners of field lines imaged by the omni-directional vision system. Particularly in the mid-size league of the RoboCup soccer competition, self-localization algorithms based on extracting white lines from the soccer field have become increasingly popular. Moreover, white lines are less influenced by light than the color model of the goals. Therefore, we propose an algorithm that transforms the omni-directional image into an unwrapped transformed image, enhancing the extracted features. The process is as follows. First, radial scan-lines were used to process the omni-directional images, reducing the computational load and improving system efficiency. The lines were arranged radially around the center of the omni-directional camera image, resulting in a shorter computational time compared with the traditional Cartesian coordinate system. However, the omni-directional image is distorted, which makes it difficult to recognize the position of the robot, so an image transformation was required to implement self-localization. Second, we transformed the omni-directional images into panoramic images, so that the distortion of the white lines can be corrected through the transformation. The interest points that form the corners of the landmark were then located using the features from accelerated segment test (FAST) algorithm, in which a circle of sixteen pixels surrounding the corner candidate is examined; FAST is a high-speed feature detector suited to real-time frame-rate applications. Finally, the dual-circle, trilateration, and cross-ratio projection algorithms were applied to the corners obtained from the FAST algorithm to localize the position of the robot. The results demonstrate that the proposed algorithm is accurate, exhibiting a 2-cm position error on a soccer field measuring 600 cm × 400 cm.
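For readers unfamiliar with the unwrapping step, the following sketch converts an omni-directional image into a panoramic strip with a polar-to-Cartesian remap, after which a FAST detector can locate field-line corners; the mirror geometry parameters (center, inner/outer radii) are assumptions.

```python
# Assumed-geometry sketch: unwrap an omni-directional image into a panorama
# with cv2.remap, so white lines become (approximately) straight.
import cv2
import numpy as np

def unwrap_omni(img, cx, cy, r_in, r_out, out_w=720):
    out_h = r_out - r_in
    theta = np.linspace(0, 2 * np.pi, out_w, dtype=np.float32)
    radius = np.linspace(r_in, r_out, out_h, dtype=np.float32)
    t, r = np.meshgrid(theta, radius)                 # output-pixel grid
    map_x = (cx + r * np.cos(t)).astype(np.float32)   # source x per output pixel
    map_y = (cy + r * np.sin(t)).astype(np.float32)   # source y per output pixel
    return cv2.remap(img, map_x, map_y, cv2.INTER_LINEAR)

# Corners on the unwrapped image, as in the paper's pipeline:
# corners = cv2.FastFeatureDetector_create(threshold=30).detect(panorama)
```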
Computer Vision Algorithms and Applications
High-speed object matching and localization using gradient orientation features
In many robotics and automation applications, it is often required to detect a given object and determine its pose (position and orientation) from input images with high speed, high robustness to photometric changes, and high pose accuracy. We propose a new object matching method that improves efficiency over existing approaches by decomposing orientation and position estimation into two cascade steps. In the first step, an initial position and orientation is found by matching with Histogram of Oriented Gradients (HOG), reducing orientation search from 2D template matching to 1D correlation matching. In the second step, a more precise orientation and position is computed by matching based on Dominant Orientation Template (DOT), using robust edge orientation features. The cascade combination of the HOG and DOT feature for high-speed and robust object matching is the key novelty of the proposed method. Experimental evaluation was performed with real-world single-object and multi-object inspection datasets, using software implementations on an Atom CPU platform. Our results show that the proposed method achieves significant speed improvement compared to an already accelerated template matching method at comparable accuracy performance.
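A minimal sketch of the first cascade step, under the assumption that a global gradient-orientation histogram is used: rotating an image circularly shifts its orientation histogram, so the relative rotation between template and window can be recovered by 1D circular correlation instead of 2D matching against many rotated templates.

```python
# Sketch (assumed details) of reducing rotation search to 1D correlation:
# compare the cyclic shifts of the template's orientation histogram against
# the window's histogram and take the best-scoring shift as the rotation.
import numpy as np

def orientation_histogram(gray, bins=36):
    gy, gx = np.gradient(gray.astype(np.float32))   # np.gradient: rows first
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), 2 * np.pi)
    idx = (ang / (2 * np.pi) * bins).astype(int) % bins
    return np.bincount(idx.ravel(), weights=mag.ravel(), minlength=bins)

def estimate_rotation(template_gray, window_gray, bins=36):
    h_t = orientation_histogram(template_gray, bins)
    h_w = orientation_histogram(window_gray, bins)
    # Circular correlation over all cyclic shifts (candidate rotations).
    scores = [np.dot(np.roll(h_t, s), h_w) for s in range(bins)]
    return np.argmax(scores) * (360.0 / bins)       # coarse angle in degrees
```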
Automatic lip reading by using multimodal visual features
Shohei Takahashi, Jun Ohya
Speech recognition has been researched for a long time, but it does not work well in noisy places such as cars or trains. In addition, hearing-impaired people and those with hearing difficulties cannot benefit from speech recognition. Visual information is also important for recognizing speech automatically: people understand speech not only from audio information but also from visual information such as temporal changes in lip shape. A vision-based speech recognition method could work well in noisy places and could also be useful for people with hearing disabilities. In this paper, we propose an automatic lip-reading method for recognizing speech using multimodal visual information, without any audio information. First, an Active Shape Model (ASM) is used to track and detect the face and lips in a video sequence. Second, the shape, optical flow, and spatial-frequency features of the lips are extracted from the lip region detected by the ASM. Next, the extracted multimodal features are ordered chronologically, and a Support Vector Machine is trained to classify the spoken words. Experiments classifying several words show promising results for the proposed method.
A Viola-Jones based hybrid face detection framework
Thomas M. Murphy, Randy Broussard, Robert Schultz, et al.
Improvements in face detection performance would benefit many applications. The OpenCV library implements a standard solution, the Viola-Jones detector, with a statistically boosted rejection cascade of binary classifiers. Empirical evidence has shown that Viola-Jones underdetects in some instances. This research shows that a truncated cascade augmented by a neural network can recover these undetected faces. A hybrid framework is constructed, with a truncated Viola-Jones cascade followed by an artificial neural network that refines the face decision. The truncation stage is chosen so that it captures all faces, leaving the neural network to remove the false alarms. A feedforward backpropagation network with one hidden layer is trained to discriminate faces based upon the thresholding (detection) values of intermediate stages of the full rejection cascade. A clustering algorithm is used as a precursor to the neural network, to group significant overlappings. Evaluated on the CMU/VASC Image Database, comparison with an unmodified OpenCV approach shows: (1) a 37% increase in detection rates if constrained by the requirement of no increase in false alarms, (2) a 48% increase in detection rates if some additional false alarms are tolerated, and (3) an 82% reduction in false alarms with no reduction in detection rates. These results demonstrate improved face detection and could address the need for such improvement in various applications.
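One plausible way to realize a truncated cascade in OpenCV is shown below: detectMultiScale3 with minNeighbors=0 keeps weak candidates along with the cascade's confidence scores, which a separately trained network could then accept or reject. This is an assumed reconstruction of the idea, not the authors' code.

```python
# Hedged sketch: collect weak Viola-Jones candidates plus their cascade
# confidence scores, to be filtered afterwards by a learned classifier.
import cv2
import numpy as np

cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def weak_candidates(gray):
    rects, reject_levels, level_weights = cascade.detectMultiScale3(
        gray, scaleFactor=1.1, minNeighbors=0, outputRejectLevels=True)
    return rects, np.asarray(level_weights, dtype=float)

# A network trained offline on these confidence values (the paper uses
# intermediate-stage detection values) would replace the final hard decision:
# keep = trained_net.predict(features_from(rects, weights)) == 1
```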
Towards automatic identification of mismatched image pairs through loop constraints
Armagan Elibol, Jinwhan Kim, Nuno Gracias, et al.
Obtaining image sequences has become easier and easier thanks to rapid progress in optical sensors and robotic platforms. Processing of image sequences (e.g., mapping, 3D reconstruction, Simultaneous Localisation and Mapping (SLAM)) usually requires 2D image registration. Typically, image registration is accomplished by detecting salient points in two images and then matching their descriptors. To eliminate outliers and to compute a planar transformation (homography) between the coordinate frames of the images, robust methods such as Random Sample Consensus (RANSAC) and Least Median of Squares (LMedS) are employed. However, the image registration pipeline can sometimes yield a sufficient number of inliers within the error bounds even when the images do not overlap. Such mismatches occur especially when the scene has repetitive texture and exhibits structural similarity. In this study, we present a method to identify mismatches using closed-loop (cycle) constraints. The method exploits the fact that when all the homographies between the images forming a cycle are multiplied together, the result should be the identity mapping. Cycles appear when the camera revisits an area that was imaged before, which is common practice, especially for mapping purposes. Our proposal extracts several cycles to obtain error statistics for each matched image pair, and then searches for image pairs whose error histograms are extreme compared with those of the other pairs. We present experimental results with artificially added mismatched image pairs on real underwater image sequences.
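The cycle constraint itself is a few lines of linear algebra; a minimal sketch under the paper's stated premise is given below, scoring a loop by how far a few reference points drift under the composed mapping.

```python
# Minimal sketch of the cycle constraint: composing the homographies around a
# closed loop of images should give (up to scale) the identity, so the drift
# of the image corners under the composed mapping scores the loop.
import numpy as np

def cycle_error(homographies, img_w, img_h):
    """homographies: list of 3x3 arrays H_{i->i+1} closing back to image 0."""
    H = np.eye(3)
    for Hi in homographies:
        H = Hi @ H                              # compose around the cycle
    corners = np.array([[0, 0, 1], [img_w, 0, 1],
                        [img_w, img_h, 1], [0, img_h, 1]], dtype=float).T
    mapped = H @ corners
    mapped /= mapped[2]                         # dehomogenize
    return np.linalg.norm(mapped[:2] - corners[:2], axis=0).mean()

# A large cycle_error implicates at least one mismatched pair in the loop;
# accumulating errors over many cycles isolates the offending pair.
```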
Using short-wave infrared imaging for fruit quality evaluation
Dong Zhang, Dah-Jye Lee, Alok Desai
Quality evaluation of agricultural and food products is important for processing, inventory control, and marketing. Fruit size and surface quality are two important quality factors for high-quality fruit such as Medjool dates. Fruit size is usually measured by length, which can be done easily with simple image processing techniques. Surface quality evaluation, on the other hand, requires a more complicated design, both in image acquisition and in image processing. Skin delamination is considered a major factor affecting fruit quality and value. This paper presents an efficient histogram analysis and image processing technique designed specifically for real-time surface quality evaluation of Medjool dates. This approach, based on short-wave infrared imaging, provides excellent image contrast between the fruit surface and delaminated skin, which allows significant simplification of the image processing algorithm and reduces computational power requirements. The proposed quality grading method requires a very simple training procedure to obtain a grayscale image histogram for each quality level. Using histogram comparison, each date is assigned to one of four quality levels, and an optimal threshold is calculated for segmenting skin delamination areas from the fruit surface. The percentage of the fruit surface with skin delamination can then be calculated for quality evaluation. This method has been implemented, used in commercial production, and proven efficient and accurate.
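The grading step described above can be sketched with standard OpenCV histogram calls; the level names and thresholds below are assumptions standing in for the paper's trained values.

```python
# Hedged sketch of the grading step: compare a date's grayscale histogram
# against one stored reference histogram per quality level, pick the best
# match, then threshold to isolate delaminated skin (bright under SWIR).
import cv2
import numpy as np

def grade_date(gray, level_histograms, delam_thresh):
    h = cv2.calcHist([gray], [0], None, [256], [0, 256])
    h = cv2.normalize(h, None).flatten().astype(np.float32)
    scores = [cv2.compareHist(h, ref.astype(np.float32), cv2.HISTCMP_CORREL)
              for ref in level_histograms]
    level = int(np.argmax(scores))                   # assigned quality level
    delam = gray > delam_thresh[level]               # delaminated-skin mask
    delam_pct = 100.0 * delam.sum() / delam.size     # % of surface affected
    return level, delam_pct
```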
Mobile Cognitive System
Planning perception and action for cognitive mobile manipulators
Andre Gaschler, Svetlana Nogina, Ronald P. A. Petrick, et al.
We present a general approach to perception and manipulation planning for cognitive mobile manipulators. Rather than hard-coding single-purpose robot applications, a robot should be able to reason about its basic skills in order to solve complex problems autonomously. Humans intuitively solve tasks in real-world scenarios by breaking down abstract problems into smaller sub-tasks and using heuristics based on their previous experience. We apply a similar idea to planning perception and manipulation for cognitive mobile robots. Our approach is based on contingent planning and run-time sensing, integrated in our "knowledge of volumes" planning framework, called KVP. Using the general-purpose PKS planner, we model information-gathering actions at plan time that have multiple possible outcomes at run time. As a result, perception and sensing arise as necessary preconditions for manipulation, rather than being hard-coded as tasks themselves. We demonstrate the effectiveness of our approach on two scenarios covering visual and force sensing on a real mobile manipulator.
Localization, Tracking, and Scene Analysis
Motion lecture annotation system to learn Naginata performances
Daisuke Kobayashi, Ryota Sakamoto, Yoshihiko Nomura
This paper describes a learning assistant system that uses motion capture data and annotations to teach "Naginata-jutsu" (the art of the Japanese halberd). Video annotation tools such as YouTube's exist, but these video-based tools offer only a single angle of view. Our approach, which uses motion-captured data, allows viewing from any angle, and a lecturer can write annotations tied to parts of the body. We compared the effectiveness of YouTube's annotation tool and the proposed system; the experimental results showed that our system triggered more annotations than YouTube's annotation tool.
Illumination-robust people tracking using a smart camera network
Nyan Bo Bo, Peter Van Hese, Junzhi Guan, et al.
Many computer vision based applications require reliable tracking of multiple people under unpredictable lighting conditions. Many existing trackers do not handle illumination changes well, especially sudden changes in illumination. This paper presents a system that tracks multiple people reliably even under rapid illumination changes, using a network of calibrated smart cameras with overlapping views. Each smart camera extracts foreground features by detecting texture changes between the current image and a static background image. The foreground features belonging to each person are tracked locally on each camera, and these local estimates are sent to a fusion center which combines them to generate more accurate estimates. The final estimates are fed back to all smart cameras, which use them as prior information for tracking in the next frame. The texture-based approach makes our method very robust to illumination changes. We tested the performance of our system on six video sequences, some containing sudden illumination changes and up to four walking persons. The results show that our tracker can track multiple people accurately, with an average tracking error as low as 8 cm, even when the illumination varies rapidly. A performance comparison shows that our method outperforms a state-of-the-art tracking system.
Image-based indoor localization system based on 3D SfM model
Guoyu Lu, Chandra Kambhamettu
Indoor localization is an important research topic for both the robotics and signal processing communities. In recent years, image-based localization has also been employed in indoor environments because the necessary equipment is readily available. After capturing an image and sending it to an image database, the best matching image is returned with the navigation information. By additionally allowing camera pose estimation, an image-based localization system using a Structure-from-Motion (SfM) reconstruction model can achieve higher accuracy than methods that search a 2D image database. However, this emerging technique has so far been applied only in outdoor environments. In this paper, we introduce the 3D SfM model based image-based localization system to the indoor localization task. We capture images of the indoor environment and reconstruct the 3D model. For the localization task, we simply match images captured by a mobile phone against the 3D reconstructed model to localize the image. In this process, we use visual words and approximate nearest neighbor methods to accelerate the search for the query features' correspondences; within each visual word, we conduct a linear search to detect the correspondences. Our experiments show that the image-based localization method based on a 3D SfM model gives good localization results in terms of both accuracy and speed.
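The correspondence search described above might look like the following sketch: query descriptors are routed to visual words, matched linearly against the 3D points stored under the same word, and the resulting 2D-3D matches are passed to cv2.solvePnPRansac for pose. Here word_index (e.g., a fitted sklearn KMeans) and points3d_by_word are assumed, pre-built structures.

```python
# Hedged sketch of 2D-to-3D matching via visual words, then pose from PnP.
import cv2
import numpy as np

def localize(query_kp, query_desc, word_index, points3d_by_word, K):
    pts2d, pts3d = [], []
    words = word_index.predict(query_desc)          # visual word per descriptor
    for kp, desc, w in zip(query_kp, query_desc, words):
        candidates = points3d_by_word.get(w, [])    # [(xyz, descriptor), ...]
        if not candidates:
            continue
        dists = [np.linalg.norm(desc - d) for _, d in candidates]  # linear search
        best = int(np.argmin(dists))
        pts2d.append(kp.pt)
        pts3d.append(candidates[best][0])
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        np.float32(pts3d), np.float32(pts2d), K, None)
    return rvec, tvec                               # camera pose of the query
```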
Using probabilistic model as feature descriptor on a smartphone device for autonomous navigation of unmanned ground vehicles
There has been significant research on the development of feature descriptors in the past few years, but most of it does not emphasize real-time applications. This paper presents the development of an affine-invariant feature descriptor for low-resource applications such as UAVs and UGVs equipped with an embedded system built around a small microprocessor, a field programmable gate array (FPGA), or a smartphone device. UAVs and UGVs have proven suitable for many promising applications such as unknown-environment exploration and search-and-rescue operations. These applications require on-board image processing for obstacle detection, avoidance, and navigation. All these real-time vision applications require a camera to grab images and match features using a feature descriptor. A good feature descriptor uniquely describes a feature point, allowing it to be correctly identified and matched with its corresponding feature point in another image. A few feature description algorithms are available for resource-limited systems, but they either require too many of the device's resources or simplify the algorithm too much, which reduces performance. This research aims to meet the needs of these systems without sacrificing accuracy. This paper introduces a new feature descriptor called PRObabilistic model (PRO) for UGV navigation applications. It is a compact and efficient binary descriptor that is hardware-friendly and easy to implement.
Classification and segmentation of orbital space based objects against terrestrial distractors for the purpose of finding holes in shape from motion 3D reconstruction
T. Nathan Mundhenk, Arturo Flores, Heiko Hoffman
3D reconstruction of objects via Shape from Motion (SFM) has made great strides recently. Utilizing images from a variety of poses, objects can be reconstructed in 3D without knowing the camera pose a priori. These feature points can then be bundled together to create large-scale scene reconstructions automatically. A shortcoming of current methods of SFM reconstruction is in dealing with specular or flat, low-feature surfaces. The inability of SFM to handle these places creates holes in a 3D reconstruction, which can cause problems when the reconstruction is used for proximity detection and collision avoidance by a space vehicle working around another space vehicle. As such, we would like to automatically recognize when a hole in a 3D reconstruction is in fact not a hole, but a place where reconstruction has failed. Once we know about such a location, methods can be used to either fill in that region more vigorously or instruct a space vehicle to proceed with more caution around that area. Detecting such areas in earth-orbiting objects is non-trivial, since we need to parse out complex vehicle features from complex earth features, particularly when the observing vehicle is directly above the target vehicle. To do this, we have created a Space Object Classifier and Segmenter (SOCS) hole finder. The general principle is to classify image features into three categories (earth, man-made, space); classified regions are then clustered into probabilistic regions which can be segmented out. Our categorization method augments a state-of-the-art bag of visual words method for object categorization. It works by first extracting PHOW (dense SIFT-like) features, which are computed over an image and then quantized via KD tree. The quantization results are binned into histograms and classified by the PEGASOS support vector machine solver. This gives the probability that a patch in the image corresponds to one of three categories: earth, man-made or space; here man-made refers to artificial objects in space. To categorize a whole image, a common sliding window protocol is used. We utilized 90 high-resolution images from space shuttle servicing missions of the International Space Station, extracted 9000 128x128 patches from the images, and hand-sorted them into the three categories. We then trained our categorizer on a subset of 6000 patches; testing on 3000 patches yielded 96.8% accuracy. This is sufficient in practice because detection returns a probabilistic score (e.g., the probability of man-made), and detections can be spatially pooled to smooth out statistical blips. Spatial pooling can be done by creating a three-channel image where each channel is the probability of one of the three classes at that location. The probability image can then be segmented, or co-segmented with the visible image, using a classical segmentation method such as Mean Shift, yielding contiguous regions of classified image. Holes can be detected when SFM does not fill in a region segmented as man-made. Results are shown of the SOCS implementation finding and segmenting man-made objects in pictures containing space vehicles very different from the training set, such as Skylab, the Hubble Space Telescope, and the Death Star.
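The bag-of-visual-words pipeline at the heart of SOCS can be sketched compactly, with sklearn's MiniBatchKMeans and LinearSVC standing in for the paper's KD-tree quantizer and PEGASOS solver (assumed substitutions): dense features per patch, quantize to a vocabulary, histogram, linear SVM.

```python
# Minimal bag-of-visual-words sketch: vocabulary -> word histograms -> SVM.
# patch_features: one (n_i, d) array of local descriptors per training patch.
import numpy as np
from sklearn.cluster import MiniBatchKMeans
from sklearn.svm import LinearSVC

def fit_bovw(patch_features, labels, vocab_size=256):
    vocab = MiniBatchKMeans(n_clusters=vocab_size).fit(np.vstack(patch_features))
    hists = [np.bincount(vocab.predict(f), minlength=vocab_size) / len(f)
             for f in patch_features]              # one histogram per patch
    svm = LinearSVC().fit(hists, labels)           # earth / man-made / space
    return vocab, svm

def classify_patch(features, vocab, svm):
    h = np.bincount(vocab.predict(features),
                    minlength=vocab.n_clusters) / len(features)
    return svm.predict([h])[0]                     # predicted category
```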
3D Vision
Discrete and continuous curvature computation for real data
Dirk Colbry, Neelima Shrikhande
This paper describes two methods for estimating the minimum and maximum curvatures for a 3D surface and compares the computational efficiency of these approaches on 3D sensor data. The classical method of Least Square Fitting (LSF) finds an approximation of a cubic polynomial fit for the local surface around the point of interest P and uses the coefficients to compute curvatures. The Discrete Differential Geometry (DDG) algorithm approximates a triangulation of the surface around P and calculates the angle deficit at P as an estimate of the curvatures. The accuracy and speed of both algorithms are compared by applying them to synthetic and real data sets with sampling neighborhoods of varying sizes. Our results indicate that the LSF and DDG methods produce comparable results for curvature estimations but the DDG method performs two orders of magnitude faster, on average. However, the DDG algorithm is more susceptible to noise because it does not smooth the data as well as the LSF method. In applications where it is not necessary for the curvatures to be precise (such as estimating anchor point locations for face recognition) the DDG method yields similar results to the LSF method while performing much more efficiently.
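The DDG angle-deficit estimate admits a compact implementation: the discrete Gaussian curvature at P is the deficit 2π minus the sum of the triangle angles incident at P, normalized by one third of the incident triangle area. The sketch below assumes P's one-ring neighbors are given in fan order around P.

```python
# Angle-deficit estimate of Gaussian curvature at an interior mesh vertex.
# ring: one-ring neighbors of p, ordered around p (closed fan assumed).
import numpy as np

def angle_deficit_curvature(p, ring):
    p, ring = np.asarray(p, float), np.asarray(ring, float)
    angle_sum, area = 0.0, 0.0
    for i in range(len(ring)):
        a = ring[i] - p
        b = ring[(i + 1) % len(ring)] - p
        cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
        angle_sum += np.arccos(np.clip(cos, -1.0, 1.0))   # angle at p
        area += 0.5 * np.linalg.norm(np.cross(a, b))      # triangle area
    return (2 * np.pi - angle_sum) / (area / 3.0)  # discrete Gaussian curvature
```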
An evaluation of attention models for use in SLAM
In this paper we study the application of visual saliency models for the simultaneous localization and mapping (SLAM) problem. We consider visual SLAM, where the location of the camera and a map of the environment can be generated using images from a single moving camera. In visual SLAM, the interest point detector is of key importance. This detector must be invariant to certain image transformations so that features can be matched across different frames. Recent work has used a model of human visual attention to detect interest points, however it is unclear as to what is the best attention model for this purpose. To this aim, we compare the performance of interest points from four saliency models (Itti, GBVS, RARE, and AWS) with the performance of four traditional interest point detectors (Harris, Shi-Tomasi, SIFT, and FAST). We evaluate these detectors under several different types of image transformation and find that the Itti saliency model, in general, achieves the best performance in terms of keypoint repeatability.
3D vision system for intelligent milking robot automation
In a milking robot, the correct localization and positioning of the milking teat cups is very important. Milking robot technology has not changed in a decade and is based primarily on laser profiles for estimating approximate teat positions. This technology has reached its limit and does not allow optimal positioning of the milking cups; moreover, in the presence of occlusions, the milking robot fails to milk the cow. These problems have economic consequences for producers and for animal health (e.g., the development of mastitis). To overcome the limitations of current robots, we have developed a new system based on 3D vision, capable of efficiently positioning the milking cups. A prototype of an intelligent robot system based on 3D vision for real-time positioning of a milking robot has been built and tested under various conditions on a synthetic udder model (in static and moving scenarios). Experimental tests were performed using 3D Time-of-Flight (TOF) and RGBD cameras. The proposed algorithms permit the online segmentation of teats by combining 2D and 3D visual information, from which the 3D positions of the teats are computed. This information is then sent to the milking robot for teat cup positioning. The vision system runs in real time and monitors the optimal positioning of the cups even in the presence of motion. The results obtained with both TOF and RGBD cameras show the good performance of the proposed system; the best performance was obtained with RGBD cameras, and this latter technology will be used in future real-life experimental tests.
SDTP: a robust method for interest point detection on 3D range images
Shandong Wang, Lujin Gong, Hui Zhang, et al.
In the fields of intelligent robots and computer vision, the capability to select a few points representing salient structures has long been investigated. In this paper, we present a novel interest point detector for 3D range images, which can be used with good results in surface registration and object recognition applications. A local shape description around each point in the range image is first constructed, based on the distribution map of the signed distances to the tangent plane in the point's local support region. Using this shape description, an interest value is computed indicating the probability of a point being an interest point. Finally, a non-maxima suppression procedure is performed to select stable interest points at positions with large surface variation in the vicinity. Our method is robust to noise, occlusion, and clutter, as shown by higher repeatability values compared with state-of-the-art 3D interest point detectors in our experiments. In addition, the method is easy to implement and requires little computation time.
Real time moving object detection using motor signal and depth map for robot car
Hao Wu, Wan-Chi Siu
Moving object detection from a moving camera is a fundamental task in many applications. For moving robot car vision, the background movement is inherently a 3D motion structure, and conventional moving object detection algorithms cannot handle 3D background modeling effectively and efficiently in this situation. In this paper, a novel scheme is proposed that utilizes the motor control signal and a depth map obtained from a stereo camera to model the perspective transform matrix between different frames under a moving camera. In our approach, the coordinate relationship between frames during camera motion is modeled by a perspective transform matrix obtained from the current motor control signals and the pixel depth values. Hence, a static background pixel and the apparent foreground motion induced by the camera can be related by a perspective matrix. To enhance the robustness of classification, we allow a tolerance range during the perspective transform matrix prediction and use multiple reference frames to classify the pixels of the current frame. The proposed scheme has been found to detect moving objects for our moving robot car efficiently. Unlike conventional approaches, our method can model the moving background as a 3D structure without online model training. More importantly, the computational complexity and memory requirements are low, making a real-time implementation possible, which is especially valuable for a robot vision system.
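The geometric core of this scheme can be written compactly under a pinhole-camera assumption: back-project a pixel using its depth, apply the inter-frame camera motion (R, t) derived from the motor signals, and re-project; pixels observed far from their prediction are candidate moving objects. A minimal sketch:

```python
# Predict where a static background pixel should appear in the current frame,
# given its depth and the camera motion (R, t) between frames (assumed known
# from motor signals). K is the 3x3 camera intrinsic matrix.
import numpy as np

def predict_pixel(u, v, depth, K, R, t):
    ray = np.linalg.inv(K) @ np.array([u, v, 1.0])
    X = depth * ray                      # 3D point in the previous camera frame
    Xc = R @ X + t                       # same point in the current camera frame
    uvw = K @ Xc                         # re-project
    return uvw[:2] / uvw[2]              # predicted pixel; compare with observed
```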
Outdoor Robotics
Research and development of Ro-boat: an autonomous river cleaning robot
Ro-boat is an autonomous river-cleaning intelligent robot combining mechanical design and computer vision algorithms to achieve autonomous river cleaning and a sustainable environment. Ro-boat is designed in a modular fashion, with design details covering mechanical structural design, hydrodynamic design, and vibrational analysis. It incorporates a stable mechanical system with air and water propulsion, robotic arms, and a solar energy source, and it is made autonomous by using computer vision. Both HSV color space features and SURF features are used as measurements in a Kalman filter, resulting in extremely robust pollutant tracking. The system has been tested with successful results in the Yamuna River in New Delhi. We foresee that a fleet of Ro-boats working autonomously 24x7 could clean a major river in a city in about six months, which is unmatched by alternative methods of river cleaning.
Real-time, resource-constrained object classification on a micro-air vehicle
Louis Buck, Laura Ray
A real-time embedded object classification algorithm is developed through the novel combination of binary feature descriptors, a bag-of-visual-words object model and the cortico-striatal loop (CSL) learning algorithm. The BRIEF, ORB and FREAK binary descriptors are tested and compared to SIFT descriptors with regard to their respective classification accuracies, execution times, and memory requirements when used with CSL on a 12.6 g ARM Cortex embedded processor running at 800 MHz. Additionally, the effect of χ² feature mapping and opponent-color representations used with these descriptors is examined. These tests are performed on four data sets of varying sizes and difficulty, and the BRIEF descriptor is found to yield the best combination of speed and classification accuracy. Its use with CSL achieves accuracies between 67% and 95% of those achieved with SIFT descriptors and allows for the embedded classification of a 128x192 pixel image in 0.15 seconds, 60 times faster than classification with SIFT. χ² mapping is found to provide substantial improvements in classification accuracy for all of the descriptors at little cost, while opponent-color descriptors offer accuracy improvements only on colorful datasets.
New vision system and navigation algorithm for an autonomous ground vehicle
Hokchhay Tann, Bicky Shakya, Alex C. Merchen, et al.
Improvements were made to the intelligence algorithms of an autonomously operating ground vehicle, Q, which competed in the 2013 Intelligent Ground Vehicle Competition (IGVC). The IGVC required the vehicle to first navigate between two white lines on a grassy obstacle course, then pass through eight GPS waypoints, and finally traverse an obstacle field. Modifications to Q included a new vision system with a more effective image processing algorithm for white line extraction. The path-planning algorithm was adapted to the new vision system, creating smoother, more reliable navigation. With these improvements, Q successfully completed the basic autonomous navigation challenge, finishing tenth out of over 50 teams.
An effective trace-guided wavefront navigation and map-building approach for autonomous mobile robots
Chaomin Luo, Mohan Krishnan, Mark Paulik, et al.
This paper presents a trace-guided real-time navigation and map building approach for an autonomous mobile robot. A wavefront-based global path planner is developed to generate a global trajectory for the robot. A modified Vector Field Histogram (M-VFH) algorithm is employed, based on LIDAR sensor information, to guide the robot locally so that it traverses autonomously with obstacle avoidance, following the traces provided by the global path planner. A local map composed of square grids is created by the local navigator as the robot traverses with limited LIDAR sensory information; from the measured sensory information, a map of the robot's immediate surroundings is dynamically built for robot navigation. The real-time wavefront-based navigation and map building methodology has been successfully demonstrated in a Player/Stage simulation environment. With the wavefront-based global path planner and the M-VFH local navigator, a safe, short, and reasonable trajectory is planned in the majority of situations without any templates, without explicitly optimizing any global cost function, and without any learning procedures. The effectiveness, feasibility, efficiency, and simplicity of the proposed approach have been validated by simulation and comparison studies, which demonstrate that the proposed method plans more reasonable and shorter collision-free trajectories autonomously than other path planning approaches.
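A minimal wavefront planner on an occupancy grid looks like the following sketch (the grid conventions are assumptions, not the paper's code): a breadth-first search from the goal labels each free cell with its wave distance, and a path is recovered by descending those values from the start.

```python
# Wavefront propagation on an occupancy grid: 1 = obstacle, 0 = free.
# After propagation, a path follows strictly decreasing wave values from
# the start cell to the goal.
import numpy as np
from collections import deque

def wavefront(grid, goal):
    wave = np.full(grid.shape, -1, dtype=int)   # -1 = unreached
    wave[goal] = 0
    q = deque([goal])
    while q:
        r, c = q.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < grid.shape[0] and 0 <= nc < grid.shape[1]
                    and grid[nr, nc] == 0 and wave[nr, nc] < 0):
                wave[nr, nc] = wave[r, c] + 1
                q.append((nr, nc))
    return wave  # from the start, repeatedly step to a lower-valued neighbor
```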
Interactive Paper Session
An intelligent hybrid behavior coordination system for an autonomous mobile robot
Chaomin Luo, Mohan Krishnan, Mark Paulik, et al.
In this paper, the development of a low-cost PID controller with an intelligent behavior coordination system is described for an autonomous mobile robot equipped with IR sensors, ultrasonic sensors, a regulator, and RC filters, built on an HCS12 microcontroller and embedded systems. A novel hybrid PID controller and behavior coordination system is developed for wall-following navigation and obstacle avoidance. The adaptive control used in this robot is a hybrid PID algorithm associated with template and behavior coordination models. The software comprises motor control, the behavior coordination intelligent system, and sensor fusion, and a module-based programming technique is adopted to improve the efficiency of integrating the hybrid PID, template, and behavior coordination algorithms. The hybrid model synthesizes PID control algorithms, templates, and behavior coordination techniques for wall-following navigation with obstacle avoidance, and the motor control, obstacle avoidance, and wall-following navigation algorithms propel and steer the robot. The hardware configuration and the module-based technique are described in this paper, and experiments validate that the robot is successfully guided by the hybrid PID controller and behavior coordination system to perform wall-following navigation with obstacle avoidance.
Increasing signal-to-noise ratio of registered images by using light spatial noise portrait of camera's photosensor
Nikolay N. Evtikhiev, Pavel A. Cheremkhin, Vitaly V. Krasnov, et al.
Increasing the signal-to-noise ratio (SNR) of registered images is an important task in fields such as image encryption, digital holography, and pattern recognition. A method of increasing image SNR by using the light spatial noise portrait (LSNP) of a camera's photosensor is presented. The proposed LSNP compensation method is especially effective after applying other SNR-increasing methods that suppress temporal noise. The procedure for LSNP measurement is described, and the LSNP of a Canon EOS 400D camera was measured. Analytical expressions for estimating the achievable SNR increase were derived, and numerical experiments estimating the SNR increase were performed using the characteristics of the obtained LSNP. Typically, frame averaging alone increases SNR by up to 2 times; subsequent application of the LSNP compensation method leads to a 10-fold SNR increase. These numerical results confirm the derived analytical expressions. With a more accurate LSNP than the one obtained, SNR can be increased by up to 50 times. Test experiments were performed using Canon EOS 400D and MegaPlus II ES11000 cameras, and the experimental results are in good agreement with the numerical ones: the experimentally obtained SNR increased 5 times compared to the original.
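A hedged sketch of the compensation idea as described: averaging many flat-field frames suppresses temporal noise and leaves the photosensor's fixed spatial pattern, which can then be divided out of newly registered frames. The normalization below is an assumption about how the pattern would be applied, not the paper's exact procedure.

```python
# Estimate a per-pixel fixed spatial pattern from flat-field frames, then
# divide it out of new frames to suppress the sensor's spatial noise.
import numpy as np

def estimate_lsnp(flat_frames):
    mean_frame = np.mean(flat_frames, axis=0)   # temporal noise averages out
    return mean_frame / mean_frame.mean()       # normalized spatial pattern

def compensate(frame, lsnp, eps=1e-6):
    return frame / np.maximum(lsnp, eps)        # remove the fixed pattern
```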
Color back projection for fruit maturity evaluation
Dong Zhang, Dah-Jye Lee, Alok Desai
In general, fruits and vegetables such as tomatoes and dates are harvested before they fully ripen. After harvesting, they continue to ripen and their color changes. Color is a good indicator of fruit maturity. For example, tomatoes change color from dark green to light green and then pink, light red, and dark red. Assessing tomato maturity helps maximize shelf life, and color is used to determine the length of time the tomatoes can be transported. Medjool dates change color from green to yellow, and then orange, light red, and dark red. Assessing date maturity helps determine the length of the drying process needed to ripen the dates. Color evaluation is an important step in the processing and inventory control of fruits and vegetables that directly affects profitability. This paper presents an efficient color back projection and image processing technique designed specifically for real-time maturity evaluation of fruits. This color processing method requires a very simple training procedure to obtain the frequencies of the colors that appear at each maturity stage. These color statistics are used to back-project colors to predefined color indexes. Fruit maturity is then evaluated by analyzing the back-projected color indexes. This method has been implemented and used in commercial production.
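OpenCV provides histogram back projection directly; the sketch below builds a hue-saturation histogram from a training image of one maturity stage and back-projects it onto a query image, so each query pixel receives the frequency of its color in that stage (the bin counts are assumptions).

```python
# Color back projection with OpenCV: train a hue-saturation histogram on one
# maturity stage, then score each query pixel by that color's frequency.
import cv2

def maturity_backproject(train_bgr, query_bgr):
    train_hsv = cv2.cvtColor(train_bgr, cv2.COLOR_BGR2HSV)
    query_hsv = cv2.cvtColor(query_bgr, cv2.COLOR_BGR2HSV)
    hist = cv2.calcHist([train_hsv], [0, 1], None, [30, 32],
                        [0, 180, 0, 256])                 # H and S channels
    cv2.normalize(hist, hist, 0, 255, cv2.NORM_MINMAX)
    return cv2.calcBackProject([query_hsv], [0, 1], hist,
                               [0, 180, 0, 256], scale=1)  # per-pixel score map
```

Running this once per maturity stage and comparing the resulting score maps is one way the stage assignment could be made.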
Unmanned ground vehicle: controls and dynamics
Ebrahim Attarwala, Pranav Maheshwari, Kumar Keshav, et al.
We have developed a ground vehicle capable of maneuvering in an open environment, autonomously negotiating an outdoor obstacle course while carrying a payload by finding colored lanes, detecting obstructions, and navigating through GPS waypoints. In this paper, we discuss hardware components such as the mechanical and electrical design and the various sensors used, as well as software components of the vehicle such as image processing, environment mapping, and navigation algorithms. The vehicle uses its sensors to develop a limited understanding of the environment, which the control algorithms then use to determine the next action to take in the context of a human-provided mission goal.