Proceedings Volume 0281

Techniques and Applications of Image Understanding

James J. Pearson

Volume Details

Date Published: 12 November 1981
Contents: 1 Sessions, 40 Papers, 0 Presentations
Conference: 1981 Technical Symposium East
Volume Number: 0281

Table of Contents

All Papers
Summary Of The DARPA Image Understanding Research Program
Larry E. Druffel
The DARPA Image Understanding Program provides support for a variety of fundamental research efforts aimed at the derivation of information from images. This paper highlights the research and describes the approach used to develop synergism between researchers engaged in complementary sciences contributing to computer vision. Possible applications of Image Understanding to the Department of Defense are discussed. This paper is an update of a paper previously presented at SPIE in August 1979.
Defense Mapping Agency (DMA) Overview Of Mapping, Charting, And Geodesy (MC&G) Applications Of Digital Image Pattern Recognition
William C. Mahoney
In the world of mapping, charting and geodesy (MC&G) image processing, the amount of knowledge a person must bring to bear is one of the deepest intellectual questions today. The human being looks for meaning wherever possible and develops ways to organize things perceptually, even if he has to invent ways of doing it. This process leads him into a wide range of information processing activities and technologies to assist him in achieving image exploitation goals in the most effective and efficient manner. Within a given area of MC&G interest, any set of procedures developed must give him the capability to process all required information within a scene, regardless of its diversity. His range of image processing therefore cannot be limited to techniques of image manipulation alone; it must involve a total system concept that starts from the particular attributes and capabilities of the human mind, i.e., the processes and paradigms that mind uses to accomplish its tasks, the equipment and methods by which it interacts with source image materials, and the computer processes used to extract information. All of this must be accomplished at rates commensurate with mapping large regions of the world, within relatively fixed periods of time, at a variety of scales and detail densities, using input photography that itself varies over a wide range of scales and ground resolutions.
Application Of Image Understanding To Automatic Tactical Target Acquisition
Arden R. Helland, Thomas J. Willett, Glenn E. Tisdale
Real-time equipment has been developed and is now being tested for automatic recognition of targets on an individual basis. The recent use of frame-to-frame integration techniques has significantly improved the classification performance with this equipment to the point where the human interpreter can sometimes be surpassed. For some imagery, however, initial target segmentation remains unsatisfactory, causing targets to be missed, and the level of false alarms may be too high. As a result, more sophisticated image processing techniques are now being addressed which could provide a comprehensive understanding of overall image content. These include the use of such scene analysis operations as the derivation of motion vectors for passive ranging, false alarm discrimination, and detection of target motion. Additional areas of interest lie in the "intelligent" tracking of multiple targets, and the autonomous handoff of targets between sensors. The paper discusses the evolution of these areas, and their probable impact on the target acquisition process. It also addresses their impact on hardware implementation.
Intelligent Control Of Tactical Target Cueing
O. Firschein, C. M. Bjorklund, M. J. Hannah, et al.
For the past two years a study of the navigation of a small, low flying vehicle based on passively sensed imagery has been carried out. The dead reckoning is based on motion stereo, with corrections made periodically based on analysis of landmarks found in images. Because the landmark subsystem requires a database of knowledge concerning salient features of the region, it is natural to consider extension of the database to include data relevant to target cueing. This paper describes the passive navigation system and then discusses the role of the database for target cueing. Preliminary findings using this concept are described.
Algorithms And Hardware Technology For Real-Time Classification And Target Detection Of Military Vehicles
James M. Graziano, Grant R. Gerhart, Joseph F. Hannigan
This paper addresses the problem of the surveillance and counter-surveillance classification of military vehicles using one-dimensional analysis of the target images. A two-dimensional image is digitized into an n-by-n pixel matrix which is summed along each row and column to produce a pair of n-component vectors which are invariant under image translation or rotation. The Fourier Transform of this one-dimensional image representation is analogous to the spectra which are produced by the Direct Electronic Fourier Transform (DEFT) acousto-optical real-time devices. The digital pixel vector formalism simulates the DEFT device for the purpose of bandwidth extension and signal-to-noise improvement. Computer algorithms are presented to preprocess the image data for edge enhancement and feature extraction. This data is further processed to extract a single number which identifies the vehicle image. Results are presented for several different types of US Army vehicles in a number of distinct background scenarios.
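The row- and column-summation step the abstract describes is easy to sketch. The following is an illustrative reconstruction (the function name is ours, not the authors'), not the DEFT simulation itself:

```python
def projection_vectors(image):
    """Collapse an n-by-n pixel matrix into its row-sum and column-sum
    vectors, the one-dimensional image representation described above."""
    n = len(image)
    rows = [sum(image[i][j] for j in range(n)) for i in range(n)]
    cols = [sum(image[i][j] for i in range(n)) for j in range(n)]
    return rows, cols

# A bright 2x2 blob in a dark 4x4 frame:
rows, cols = projection_vectors([[0, 0, 0, 0],
                                 [0, 9, 9, 0],
                                 [0, 9, 9, 0],
                                 [0, 0, 0, 0]])
# rows == cols == [0, 18, 18, 0]; total intensity is preserved in each vector
```

A Fourier transform of these 2n values would then stand in for the DEFT spectra the abstract compares against, at a fraction of the data volume of the full n-by-n image.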
Fast Adaptive Algorithms For Low-Level Scene Analysis: Applications Of Polar Exponential Grid (PEG) Representation To High-Speed, Scale-And-Rotation Invariant Target Segmentation
P. S. Schenker, K. M. Wong, E. G. Cande
This paper presents results of experimental studies in image understanding. Two experiments are discussed, one on image correlation and another on target boundary estimation. The experiments are demonstrative of polar exponential grid (PEG) representation, an approach to sensory data coding which the authors believe will facilitate problems in 3-D machine perception. Our discussion of the image correlation experiment is largely an exposition of the PEG-representation concept and approaches to its computer implementation. Our presentation of the boundary finding experiment introduces a new robust, parallel stochastic segmentation algorithm, the PEG-Parallel Hierarchical Ripple Filter (PEG-PHRF).
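The paper's exact PEG construction is not given here; as a rough sketch of the general polar-exponential (log-polar) idea, radii sampled on an exponential scale turn scaling and rotation about the image center into shifts along the two grid axes, which is what makes scale-and-rotation-invariant matching cheap. All names and sampling choices below are our assumptions:

```python
import math

def log_polar_sample(image, n_r, n_theta):
    """Resample a square image onto a polar exponential grid centered on
    the image: ring index grows with log radius, spoke index with angle.
    A generic log-polar sketch; the paper's PEG details may differ."""
    n = len(image)
    cx = cy = (n - 1) / 2.0
    r_max = min(cx, cy)
    out = []
    for i in range(n_r):
        r = r_max ** ((i + 1) / n_r)       # exponentially spaced radii
        ring = []
        for j in range(n_theta):
            t = 2 * math.pi * j / n_theta  # evenly spaced angles
            x = int(round(cx + r * math.cos(t)))
            y = int(round(cy + r * math.sin(t)))
            ring.append(image[y][x] if 0 <= x < n and 0 <= y < n else 0)
        out.append(ring)
    return out

# An 8x8 checkerboard resampled onto 4 rings x 8 spokes:
peg = log_polar_sample([[(x + y) % 2 for x in range(8)] for y in range(8)], 4, 8)
```

In this coordinate system a rotation of the input about the center cyclically shifts each ring, and a uniform scaling shifts samples between rings, so ordinary shift-based correlation covers both.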
Cellular Array Processing Simulation
Harry C. Lee, Earl W. Preston
The Cellular Array Processing Simulation (CAPS) system is a high-level image language that runs on a multiprocessor configuration. CAPS is interpretively decoded on a conventional minicomputer with all image operation instructions executed on an array processor. The synergistic environment that exists between the minicomputer and the array processor gives CAPS its high-speed throughput, while maintaining a convenient conversational user language. CAPS was designed to be both modular and table driven so that it can be easily maintained and modified. CAPS uses the image convolution operator as one of its primitives and performs this cellular operation by decomposing it into parallel image steps that are scheduled to be executed on the array processor. Among its features is the ability to observe the imagery in real time as a user's algorithm is executed. This feature reduces the need for image storage space, since it is feasible to retain only original images and produce resultant images when needed. CAPS also contains a language processor that permits users to develop re-entrant image processing subroutines or algorithms.
Autonomous Acquisition Simulator System: A Testbed For Tactical Image Processing Algorithm Development
H. Mack, A. W. Mathe, L. E. Chall, et al.
The Autonomous Acquisition Simulator System (AAS) is a modular real-time image processing system that is currently being developed by the Ford Aerospace & Communications Corporation (FACC) for the U.S. Army Night Vision & Electro-Optics Laboratory (NV&EOL). As a cooperative effort between FACC and NV&EOL, the AAS Program has as its primary objective the development of a flexible real-time image processing algorithm simulation device that may be used as a tool for the evaluation of artificial intelligence and image processing algorithms for application to tactical systems over the next few years. This paper summarizes the operational and system requirements for a testbed that would function as such a simulation device. The architecture of both the hardware and software designs for the AAS is then described in the context of the resulting data structures and computational requirements for implementing these algorithms in real time. Also included is an approach for providing a compiler-level language as a highly desirable feature to facilitate the development of software for implementing new algorithms.
Reconnaissance Applications Of Image Understanding
Jon L. Roberts
Numerous DARPA-sponsored and privately pursued Image Understanding research efforts are related to the various phases of reconnaissance image interpretation. These efforts can directly contribute to the subtasks associated with each phase of interpretation. The phases are briefly outlined, along with sample contributing IU research.
Computer-Assisted Photo Interpretation Research At United States Army Engineer Topographic Laboratories (USAETL)
George E. Lukes
A program in computer-assisted photo interpretation research (CAPIR) has been initiated at the U.S. Army Engineer Topographic Laboratories. In a new laboratory, a photo interpreter (PI) analyzing high-resolution, aerial photography interfaces directly to a digital computer and geographic information system (GIS). A modified analytical plotter enables the PI to transmit encoded three-dimensional spatial data from the stereomodel to the computer. Computer-generated graphics are displayed in the stereomodel for direct feedback of digital spatial data to the PI. Initial CAPIR capabilities include point positioning, mensuration, stereoscopic area search, GIS creation and playback, and elevation data extraction. New capabilities under development include stereo graphic superposition, a digital image workstation, and integration of panoramic Optical Bar Camera photography as a primary GIS data source. This project has been conceived as an evolutionary approach to the digital cartographic feature extraction problem. As a working feature extraction system, the CAPIR laboratory can serve as a testbed for new concepts emerging from image understanding and knowledge-based systems research.
Automatic Reconnaissance-Based Target-Coordinate Determinations
J. G. Hardy, A. T. Zavodny
The imaging reconnaissance sensors being increasingly used today for intelligence gathering activities provide data more rapidly than it can be exploited for many intelligence purposes. One exploitation task typically performed on reconnaissance imagery is the determination of the geographical coordinates of targets and features of interest. Because the current techniques used to perform this task rely heavily on time-consuming and error-prone manual methods, a totally automatic technique for determining target coordinates is needed. The automatic technique proposed herein relies on current image correlation/matching technologies.
Hough Transform For Target Detection In Infrared Imagery
Tsutomu Shibata, Werner Frei
An algorithm is described for detecting and recognizing targets in infrared imagery in real time. Suspicious areas are selected from a large frame by a very fast pre-processor which flags suspicious locations based upon local statistics in the frame. For finding the outlines of the target, the Sobel operator is used to extract edge gradients and orientations, which are then mapped into parameter space by the Hough transformation. Normalization and sharpening operations applied in the parameter space subsequently enhance straight boundaries associated with possible targets. A discriminant function for the recognition of the targets is formed based upon the assumption that the four boundaries of the target produce four sharp peaks in the parameter space. The above algorithm successfully detected targets in 25 subimages selected from a large frame.
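The mapping of edge points into parameter space can be sketched with the standard line Hough transform. This is a minimal version that votes with unit weight; the paper implies gradient-based voting using the Sobel magnitudes and orientations:

```python
import math

def hough_lines(points, n_theta=180):
    """Vote each edge point into a (theta, rho) accumulator for the line
    parameterization rho = x*cos(theta) + y*sin(theta); sharp peaks in the
    accumulator mark straight boundaries."""
    acc = {}
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            rho = int(round(x * math.cos(theta) + y * math.sin(theta)))
            acc[(t, rho)] = acc.get((t, rho), 0) + 1
    return acc

# Edge points on the vertical line x = 5: every point votes for theta = 0, rho = 5.
acc = hough_lines([(5, y) for y in range(10)])
# the bin (0, 5) collects all 10 votes, the maximum in the accumulator
```

A rectangular target contributes four such collinear point sets, hence the four sharp peaks the discriminant function looks for.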
Registration Of A Synthetic Aperture Radar (SAR) Reconnaissance Image With A Map Reference Data Base
Garo K. Kiremidjian
The problem of registering a reconnaissance side-looking synthetic aperture radar (SAR) image to a three-dimensional reference map is examined by developing a technique which is based on computing an image-to-database correspondence in terms of a SAR sensor model as a function of such parameters as altitude, range, scale, etc. If their exact values are known, registration is accomplished by utilizing the model to predict the two-dimensional image coordinates for any three-dimensional data base point. In general, the platform ephemeris (PE) data provides only coarse model parameter estimates. The objective, then, is to refine them so that the model can predict the image location of any data base point within a desired accuracy range. The results obtained in this paper demonstrate the feasibility of achieving location accuracy within 50 m.
Dimensional Automatic Target Classification
Harold W. Rose, James C. Rachal
High resolution electro-optical sensors are emerging from technology, offering a capability to remotely determine the geometric shape of objects. In terms of military reconnaissance and surveillance needs, this capability enhances the potential to detect, classify, and identify enemy targets automatically, and record or report only their type and location. The objective of this paper is to outline the general nature of an automatic classifier for processing high resolution, active electro-optical sensor data on the basis of target dimensional features. A simple geometric analysis is used to demonstrate the predominant features of the data, and to suggest approaches for cueing potential targets by masking out extraneous background data. The goal is to provide an early data reduction so that potential target subframes can be processed in the classification processor at lower data rates. Both line-scan and raster-scan data formats are considered, in forward- and down-looking configurations.
Applications Of Moments To Image Understanding
Norman B. Nill
Several problems in image understanding can be solved by applying the mathematical concept of moments. In this presentation a review of the moment concept will be given, followed by a discussion of the applications of moments to image pattern recognition, change detection, quality evaluation and data compression.
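As a minimal illustration of the moment concept the paper reviews, central moments are taken about the intensity centroid and are therefore unchanged by translation of the pattern, which is what makes them useful for pattern recognition and change detection. The helper below is a generic sketch, not code from the paper:

```python
def central_moment(image, p, q):
    """mu_pq: moment of order (p, q) of the intensity distribution taken
    about its centroid, hence invariant to translation of the pattern."""
    h, w = len(image), len(image[0])
    m00 = sum(sum(row) for row in image)
    m10 = sum(x * image[y][x] for y in range(h) for x in range(w))
    m01 = sum(y * image[y][x] for y in range(h) for x in range(w))
    xbar, ybar = m10 / m00, m01 / m00
    return sum((x - xbar) ** p * (y - ybar) ** q * image[y][x]
               for y in range(h) for x in range(w))

# The same pattern at two positions has identical central moments
# (up to floating-point rounding):
a = [[0, 0, 0, 0],
     [0, 1, 2, 0],
     [0, 0, 0, 0],
     [0, 0, 0, 0]]
b = [[0, 0, 0, 0],
     [0, 0, 0, 0],
     [0, 0, 1, 2],
     [0, 0, 0, 0]]
```

Normalizing these moments further buys scale invariance, and the classical Hu combinations add rotation invariance, which is the route to the recognition applications mentioned above.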
Overview Of Applications Of Image Understanding To Industrial Automation
Robert C. Bolles
An overview of possible applications of image-understanding (IU) research to industrial automation (IA) problems is presented. Current IA and IU vision systems are characterized and compared. A selection of IU techniques and their potential for increasing the competence of IA systems is then briefly described.
Geometric Modeling In Vision For Manufacturing
Rodney A. Brooks, Thomas O. Binford
We describe a geometric modeling and geometric reasoning system which is being developed to model real-world industrial parts and to carry out examples of programmable inspection and picking parts from bins. It models both specific objects and generic classes of industrial parts. We concentrate on the problem of programming complex vision tasks. The models are used to guide a vision system interpreting industrial images. It identifies instances of objects and object classes, and extracts three-dimensional information from monocular images. It can make use of depth information from stereo or projection ranging. The vision system relies on propagating symbolic algebraic constraints, both to predict image features and to extract three-dimensional information from image feature measurements. This research is integrated with extensive research on finding image intensity boundaries in images and on surface description from stereo.
Real-Time Pattern Recognition: An Industrial Example
Gary M. Fitton
Rapid advancements in cost effective sensors and micro computers are now making practical the on-line implementation of pattern recognition based systems for a variety of industrial applications requiring high processing speeds. One major application area for real time pattern recognition is in the sorting of packaged/cartoned goods at high speed for automated warehousing and return goods cataloging. While there are many OCR and bar code readers available to perform these functions, it is often impractical to use such codes (package too small, adverse esthetics, poor print quality) and an approach which recognizes an item by its graphic content alone is desirable. This paper describes a specific application within the tobacco industry, that of sorting returned cigarette goods by brand and size.
Visual Control For Robot Handling Of Unoriented Parts
John R. Birk, Robert B. Kelley, Jean-Daniel Dessimoz
Robots guided by image analysis have successfully acquired parts from bins, computed part position and orientation, and performed manipulation to bring parts to the same position and orientation. This task is common to many workstations in manufacturing industries since parts are often supplied in bins and must be oriented before insertion into a machine. Systems which are currently technically demonstrated as feasible have profitable applications now, but the number of suitable applications now is small when compared to those which will be appropriate in several years. More research is needed to make such robot systems fast, easily programmed, inexpensive, reliable, and applicable to a wider variety of parts. The roles of vision include directing a hand into a bin to grasp a single piece, computing the orientation of a piece in a robot's hand, piece guidance into a fixture, verification of placement in a fixture, and inspection for piece integrity.
Automatic Visual Inspection Of Metal Surfaces
G. B. Porter, J. L. Mundy
Applications in automatic visual inspection require careful control of image contrast in the presence of varying part surface conditions. This paper models the part imaging problem as a stochastic light scattering phenomenon. Several equations are presented which provide a relation between a visual image and surface roughness, reflectance, and position.
Nonreference Optical Inspection Of Complex And Repetitive Patterns
Warren M. Sterling
A system designed for non-reference real time optical inspection and analysis of complex or repetitive patterns is described. The primary application of the system has been the inspection of etched and plated patterns. For printed wiring board (PWB) inspection, the system can inspect a 20 x 24 inch board in approximately four minutes, utilizing one mil (0.001 inch) imaging resolution. Imaging resolution of 0.5 mil (0.0005 inch) can be achieved. The system employs a non-reference inspection technique in that it does not compare the article under test to an article known to be good. Rather, the acceptability criteria are based on design rules and material parameters. Video rate image analysis is attained by initial encoding of the video data, and utilization of analysis packages which operate directly on the encoded data. System implementation lends itself to easy modification for diverse patterns and testing criteria. This paper will discuss all aspects of the system, including the linear CCD array based sensor and scanner assembly, hardware and firmware encoding of the video data, and the defect detection technique. Application examples are presented, including inspection of inner layers of multi-layer PWB's, inspection of plated repetitive structures, and optical character recognition (OCR).
Data Base Support For Automated Photo Interpretation
David M. McKeown Jr., Takeo Kanade
This paper is concerned with the use of a database to support automated photo interpretation. The function of the database is to provide an environment in which to perform photo interpretation utilizing software tools, and represent domain knowledge about the scenes being interpreted. Within the framework of the database, image interpretation systems use knowledge stored as map, terrain, or scene descriptions to provide structural or spatial constraints to guide human and machine processing. We describe one such system under development, MAPS (Map Assisted Photo interpretation System), and give some general rationales for its design and implementation.
Determining The Instantaneous Direction Of Motion From Optical Flow Generated By A Curvilinearly Moving Observer
K. Prazdny
A method is described capable of decomposing the optical flow into its rotational and translational components. The translational component is extracted implicitly by locating the focus of expansion associated with the translational component of the relative motion. The method is simple, relying on minimizing an (error) function of 3 parameters. As such, it can also be applied, without modification, in the case of noisy input information. Unlike the previous attempts at interpreting optical flow to obtain elements, the method uses only relationships between quantities on the projection plane. No 3D geometry is involved. Also outlined is a possible use of the method for the extraction of that part of the optical flow containing information about relative depth directly from the image intensity values, without extracting the "retinal" velocity vectors.
Relaxation Matching Applied To Aerial Images
K. E. Price
We have developed a symbolic matching system which can be used for a variety of matching tasks in scene analysis. The system is designed to handle many of the problems encountered in the analysis of real scenes: noisy feature values, missing elements, extra pieces of objects, many features, many objects. At the heart of this system is a relaxation based matching scheme. A variety of relaxation procedures have been used with varying results.
Line Finding With Subpixel Precision
P. J. MacVicar-Whelan, T. O. Binford
An intermediate-level vision system that utilizes grey-scale levels (typically 8 bits, or 256 levels, in our case) has been implemented which locates and links intensity discontinuities in a digitized image to subpixel precision. The discontinuities are located and localized by utilizing the zero crossings in the laterally inhibited image of the digitized picture.
Multiresolution Pixel Linking For Image Smoothing And Segmentation
Teresa Silberberg, Shmuel Peleg, Azriel Rosenfeld
When an image is smoothed using small blocks or neighborhoods, the results may be somewhat unreliable due to the effects of noise on small samples. When larger blocks are used, the samples become more reliable, but they are more likely to be mixed, since a large block will often not be contained in a single region of the image. A compromise approach is to use several block sizes, representing versions of the image at several resolutions, and to carry out the smoothing by means of a cooperative process based on links between blocks of adjacent sizes. These links define "block trees" which segment the image into regions, not necessarily connected, over which smoothing takes place. In this paper, a number of variations on the basic block linking approach are investigated, and some tentative conclusions are drawn regarding preferred methods of initializing the process and of defining the links, yielding improvements over the originally proposed method.
Interpretation Of Geometric Structure From Image Boundaries
David G. Lowe, Thomas O. Binford
General constraints on the interpretation of image boundaries are described and implemented. We illustrate the use of these constraints to carry out geometric interpretation of images up to the volumetric level. A general coincidence assumption is used to derive suggestive but incomplete interpretations for local features. A reasoning system is described which can use these suggestive hypotheses to derive consistent global interpretations, while maintaining the ability to remove the implications of hypotheses which are disproved in the face of further evidence. An important aspect of interpretation is the classification of image boundaries (intensity discontinuities) into those caused by geometric, reflectance, or illumination discontinuities. These interact with other hypotheses regarding occlusion by solid objects, the direction of illumination, aspects of object geometry, and the production of illumination discontinuities by geometric discontinuities. Although only a subset of the constraints and system design features have been implemented to date, we demonstrate the successful interpretation of some simulated image boundaries up to the volumetric level, including the construction of a three-space model.
Relative Depth And Local Surface Orientation From Image Motions
K. Prazdny
A simple mathematical formalism is presented suggesting a mechanism for computing relative depth of any two texture elements characterized by the same relative motion parameters. The method is based on a ratio of a function of the angular velocities of the projecting rays corresponding to the two texture elements. The angular velocity of a ray cannot, however, be computed directly from the instantaneous characterization of motion of a "retinal" point. It is shown how it can be obtained from the (linear) velocity of the image element on the projection surface and the first time derivative of its direction vector. A similar analysis produces a set of equations which directly yield local surface orientation relative to a given visual direction. The variables involved are scalar quantities directly measurable on the projection surface but, unlike the case of relative depth, the direction of (instantaneous) motion has to be computed by different means before the method can be applied. The relative merits of the two formalisms are briefly discussed.
Structural Analysis Of Natural Textures
Felicia Vilnrotter, Ramakant Nevatia, Keith E. Price
A technique to analyze patterns in terms of individual texture primitives and their spatial relationships is described. The technique is applied to natural textures. The descriptions consist of the primitive sizes and their repetition pattern, if any. The derived descriptions can be used for recognition or reconstruction of the pattern.
Computational Models For Texture Analysis And Synthesis
David D. Garber, Alexander A. Sawchuk
In this paper, binary and gray-level natural textures are synthesized using several methods. The quality of the natural texture simulations depends on the computation time for data collection, computation time for generation, and storage used in each process. Many textures are adequately simulated using simple models thus providing a potentially great information compression for many applications. Other textures with macrostructure and nonstationary characteristics require more extensive computation to synthesize visually pleasing results. Although the success of texture synthesis is highly dependent on the texture itself and the modeling method chosen, general conclusions regarding the performance of various techniques are given.
Two Hierarchical Linear Feature Representations: Edge Pyramids And Edge Quadtrees
Michael Shneier
Two related methods for hierarchical representation of curve information are presented. First, edge pyramids are defined and discussed. An edge pyramid is a sequence of successively lower resolution images, each image containing a summary of the edge or curve information in its predecessor. This summary includes the average magnitude and direction in a neighborhood of the preceding image, as well as an intercept in that neighborhood and a measure of the error in the direction estimate. An edge quadtree is a variable-resolution representation of the linear information in the image. It is constructed by recursively splitting the image into quadrants based on magnitude, direction and intercept information. Advantages of the edge quadtree representation are its ability to represent several linear features in a single tree, its registration with the original image, and its ability to perform many common operations efficiently.
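A reduced sketch of the pyramid construction, assuming 2x2 non-overlapping blocks and tracking only the average-magnitude summary (the paper's cells also carry direction, intercept, and direction-error estimates):

```python
def reduce_level(mag):
    """One reduction step: each output cell summarizes a 2x2 block of the
    level below by its average edge magnitude."""
    n = len(mag)
    return [[(mag[2 * i][2 * j] + mag[2 * i][2 * j + 1]
              + mag[2 * i + 1][2 * j] + mag[2 * i + 1][2 * j + 1]) / 4.0
             for j in range(n // 2)]
            for i in range(n // 2)]

def edge_pyramid(mag):
    """Successively lower-resolution summaries, down to a single cell."""
    levels = [mag]
    while len(levels[-1]) > 1:
        levels.append(reduce_level(levels[-1]))
    return levels

# An edge cluster in the top-left quadrant of a 4x4 magnitude map:
levels = edge_pyramid([[1, 1, 0, 0],
                       [1, 1, 0, 0],
                       [0, 0, 0, 0],
                       [0, 0, 0, 0]])
# levels have sizes 4, 2, and 1
```

Coarse levels localize where curve activity lives, which is what lets searches descend only into promising quadrants, the same intuition the edge quadtree exploits.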
Edge Evaluation Using Local Edge Coherence
Les Kitchen, Azriel Rosenfeld
A method of evaluating edge detector output is proposed, based on the local good form of the detected edges. It combines two desirable qualities of well-formed edges -- good continuation and thinness. The measure has the expected behavior for known input edges as a function of their blur and noise. It yields results generally similar to those obtained with measures based on discrepancy of the detected edges from their known ideal positions, but it has the advantage of not requiring ideal positions to be known. It can be used as an aid to threshold selection in edge detection (pick the threshold that maximizes the measure), as a basis for comparing the performances of different detectors, and as a measure of the effectiveness of various types of preprocessing operations facilitating edge detection.
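One simplified reading of such a measure (not Kitchen and Rosenfeld's exact formula; the neighborhood tests and the equal weighting below are our assumptions) combines a continuation term with a thinness term:

```python
import math

def edge_coherence(edges, dirs):
    """Score a binary edge map by local good form.  Continuation: an edge
    pixel's direction agrees with at least one 8-neighbor edge pixel.
    Thinness: an edge pixel has at most two edge neighbors, as on a thin
    curve.  The 0.5/0.5 weighting is an illustrative assumption."""
    n = len(edges)
    pts = [(i, j) for i in range(n) for j in range(n) if edges[i][j]]
    if not pts:
        return 0.0
    cont = thin = 0
    for i, j in pts:
        nbrs = [(i + di, j + dj)
                for di in (-1, 0, 1) for dj in (-1, 0, 1)
                if (di, dj) != (0, 0)
                and 0 <= i + di < n and 0 <= j + dj < n
                and edges[i + di][j + dj]]
        if any(abs(math.cos(dirs[i][j] - dirs[a][b])) > 0.9 for a, b in nbrs):
            cont += 1
        if len(nbrs) <= 2:
            thin += 1
    return 0.5 * cont / len(pts) + 0.5 * thin / len(pts)

dirs = [[0.0] * 5 for _ in range(5)]
line = [[1 if i == 2 else 0 for _ in range(5)] for i in range(5)]  # thin edge
blob = [[1 if 1 <= i <= 3 and 1 <= j <= 3 else 0 for j in range(5)]
        for i in range(5)]                                         # thick clump
# the well-formed thin line outscores the thick blob
```

Because the score needs no ground truth, sweeping a detector threshold and keeping the value that maximizes it is exactly the threshold-selection use the abstract suggests.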
Towards A Real Time Implementation Of The Marr And Poggio Stereo Matcher
H. K. Nishihara, N. G. Larson
This paper reports on research--primarily at Marr and Poggio's [9] mechanism level--to design a practical hardware stereo-matcher, and on the interaction this study has had with our understanding of the problem at the computational theory and algorithm levels. The stereo-matching algorithm proposed by Marr and Poggio [10] and implemented by Grimson and Marr [3] is consistent with what is presently known about human stereo vision [2]. Their research has been concerned with understanding the principles underlying the stereo-matching problem. Our objective has been to produce a stereo-matcher that operates reliably at near real-time rates, as a tool to facilitate further research in vision and for possible application in robotics and stereo-photogrammetry. At present the design and construction of the camera and convolution modules of this project have been completed, and the design of the zero-crossing and matching modules is progressing. The remainder of this section provides a brief description of the Marr and Poggio stereo algorithm. We then discuss our general approach and some of the issues that have come up concerning the design of the individual modules.
Bootstrap Stereo Error Simulations
Marsha Jo Hannah
Over the past three years, Lockheed has been working on navigation of an autonomous aerial vehicle using passively sensed images. One technique which has shown promise is bootstrap stereo, in which the vehicle's position is determined from the perceived locations of known ground control points. Successive pairs of known vehicle camera positions are then used to locate corresponding image points on the ground, creating new control points. This paper describes a series of error simulations which have been performed to investigate the error propagation as the number of bootstrapping iterations increases.
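The flavor of such a simulation can be conveyed with a deliberately simplified one-dimensional Monte-Carlo model (an illustration only, assuming each bootstrap cycle adds independent measurement noise on top of the inherited position error; the paper's simulations model the full imaging geometry):

```python
import numpy as np

def bootstrap_error(n_iters=10, n_trials=5000, meas_sigma=0.1, seed=0):
    """Toy Monte-Carlo of bootstrap-stereo error growth: each cycle the
    new control points inherit the current position error plus fresh
    measurement noise, and the next position fix inherits theirs."""
    rng = np.random.default_rng(seed)
    err = np.zeros(n_trials)          # position error of each trial
    rms = []
    for _ in range(n_iters):
        err = err + rng.normal(0.0, meas_sigma, n_trials)  # random walk
        rms.append(float(np.sqrt(np.mean(err ** 2))))
    return rms
```

Under this random-walk assumption the RMS position error grows like sigma * sqrt(iterations), which is the kind of propagation behavior an error simulation is designed to quantify.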
Model-Based Three-Dimensional Interpretations Of Two-Dimensional Images
Rodney A. Brooks
ACRONYM is a comprehensive, domain-independent model-based system for vision and manipulation tasks. Many of its sub-modules and representations have been described elsewhere. Here the derivation and use of invariants for image feature prediction is described. We describe how predictions of image features and their relations are made, and how instructions are generated which tell the interpretation algorithms how to use image feature measurements to derive three-dimensional sizes and structural and spatial constraints on the original three-dimensional models. Some preliminary examples of ACRONYM's interpretations of aerial images are shown.
Determining Optical Flow
Berthold K.P. Horn, Brian G. Schunck
Optical flow cannot be computed locally, since only one independent measurement is available from the image sequence at a point, while the flow velocity has two components. A second constraint is needed. A method for finding the optical flow pattern is presented which assumes that the apparent velocity of the brightness pattern varies smoothly almost everywhere in the image. An iterative implementation is shown which successfully computes the optical flow for a number of synthetic image sequences. The algorithm is robust in that it can handle image sequences that are quantized rather coarsely in space and time. It is also insensitive to quantization of brightness levels and additive noise. Examples are included where the assumption of smoothness is violated at singular points or along lines in the image.
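The iterative scheme follows directly from the abstract: the smoothness constraint couples each flow estimate to its neighborhood average, and each iteration corrects that average toward the brightness-constancy constraint. A compact NumPy sketch (with simplified derivative estimates and wrap-around averaging, not the authors' exact implementation):

```python
import numpy as np

def _neighbor_avg(f):
    """4-neighbor average (toroidal wrap keeps the sketch short)."""
    return (np.roll(f, 1, 0) + np.roll(f, -1, 0) +
            np.roll(f, 1, 1) + np.roll(f, -1, 1)) / 4.0

def horn_schunck(im1, im2, alpha=1.0, n_iter=100):
    """Iterative optical flow under a global smoothness assumption."""
    im1 = im1.astype(float)
    im2 = im2.astype(float)
    Ix = np.gradient(im1, axis=1)      # simplified derivative estimates
    Iy = np.gradient(im1, axis=0)
    It = im2 - im1
    u = np.zeros_like(im1)
    v = np.zeros_like(im1)
    for _ in range(n_iter):
        ubar, vbar = _neighbor_avg(u), _neighbor_avg(v)
        # Correct the local average toward Ix*u + Iy*v + It = 0.
        t = (Ix * ubar + Iy * vbar + It) / (alpha ** 2 + Ix ** 2 + Iy ** 2)
        u = ubar - Ix * t
        v = vbar - Iy * t
    return u, v
```

On a brightness ramp whose values shift uniformly between frames, the iteration converges to the brightness-constancy solution u = -It/Ix; the smoothness term becomes essential once the flow field is not uniform.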
Geometric Constraints For Interpreting Images Of Common Structural Elements: Orthogonal Trihedral Vertices
Sidney Liebes Jr.
A simple analytical procedure is introduced for exploiting a ubiquitous engineering and architectural structural subelement, the orthogonal trihedral vertex (OTV), to facilitate automatic cuing, monoscopic inference of surface structure and orientation, and resolution of stereo correspondences. OTVs occur in profusion indoors and out. They are identifiable, and are a rich source of information regarding relative surface conformation and orientation. Practical considerations often constrain OTVs to be vertically aligned. General oblique perspective properties of OTVs are examined. The especially important case of nadir-viewing aerial stereophotogrammetry is developed in detail. An object-space vertex labeling convention incorporates vertex type and orientation. A set of image-space junction signature rules, based upon the object-space invariance of OTV edge vanishing points, enables unambiguous vertex label assignment for interior and exterior OTVs. An independent application of the labeling scheme to both members of a stereo pair, taken at an arbitrarily wide convergence angle, identically labels corresponding junctions. An illustrative example is presented. Algorithmic implementation has not yet been undertaken.
Procedure For Camera Calibration With Image Sequences
Kenneth L. Clarkson
A procedure for calibration of the stereo camera transform is described which uses a variable projection minimization algorithm, applied to an error function whose dependence on the five camera model parameters is separable into linear and non-linear components. The result is a non-linear minimization over three variables rather than five. The procedure has been implemented in MACLISP, with good preliminary results.
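The separable structure can be illustrated on a toy model (a hypothetical example, not the paper's five-parameter camera model): in y = a*exp(-b*t) + c the parameters a and c enter linearly, so for every trial value of the nonlinear parameter b they follow from a linear least-squares solve, leaving a one-dimensional nonlinear search.

```python
import numpy as np

def varpro_fit(t, y, b_grid):
    """Variable-projection fit of y ~ a*exp(-b*t) + c.
    For each candidate b the linear parameters (a, c) are recovered by
    linear least squares; only b is searched nonlinearly (a coarse grid
    here, where a real implementation would use a proper minimizer)."""
    best = None
    for b in b_grid:
        A = np.column_stack([np.exp(-b * t), np.ones_like(t)])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        r = y - A @ coef
        sse = float(r @ r)
        if best is None or sse < best[0]:
            best = (sse, b, coef)
    _, b, (a, c) = best
    return a, b, c
```

Reducing a five-parameter minimization to a three-variable one, as the paper reports, follows the same principle: project out the linear parameters and minimize only over the nonlinear ones.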
General Purpose Very Large Scale Integration (VLSI) Chip For Computer Vision With Fault-Tolerant Hardware
Michael R. Lowry, Allan Miller
This article describes a VLSI NMOS chip suitable for parallel implementation of computer vision algorithms. The chip contains a two-dimensional array of processors, each connected to its four neighbors. Each processor currently has 32 bits of internal storage in three shift registers, and can compute arbitrary boolean functions as well as serial bit arithmetic. Our objective is to make a vision processor with one processor for each pixel. This will require a very high density VLSI implementation, filling an entire wafer. We will need fault-tolerant hardware to deal with the fabrication errors present in such large circuits. We plan to do this by incorporating redundant links in the processor interconnections and routing the links around faulty processors. Current work focuses on testing a prototype chip with one processor, redesigning the chip for a more compact and regular layout, and designing the redundant link interconnections and hardware support for picture-sized arrays of processors.
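The programming model of such a mesh can be sketched in simulation (an illustrative sketch assuming toroidal edge links, not the actual chip's instruction set): every processor applies the same boolean function of its own bit and its four neighbors' bits in lock-step.

```python
import numpy as np

def mesh_step(grid, fn):
    """One SIMD step of a 4-connected processor mesh: each cell computes
    fn(centre, north, south, east, west) simultaneously."""
    n = np.roll(grid, 1, axis=0)
    s = np.roll(grid, -1, axis=0)
    w = np.roll(grid, 1, axis=1)
    e = np.roll(grid, -1, axis=1)
    return fn(grid, n, s, e, w)

# Example program: binary erosion -- a pixel survives only if it and all
# four of its neighbors are set.
def erode(c, n, s, e, w):
    return c & n & s & e & w
```

Per-pixel boolean operations of this kind are exactly the workload the abstract targets: one step of the whole image costs one instruction cycle of the array, independent of image size.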
Residue-Based Image Processor For Very Large Scale Integration (VLSI) Implementation
S. D. Fouse, G. R. Nudd, G. M. Thorne-Booth, et al.
This paper describes recent work undertaken at Hughes Research Laboratories, Malibu, California, in support of the DARPA Image Understanding (IU) program. The principal goal of the work is to investigate the application of VLSI technologies to IU systems and to identify processor candidates well suited to VLSI implementation. One candidate that is very well suited to VLSI technology is a programmable local-area processor whose computations are based on residue arithmetic. The design and development of this processor, which operates on a 5x5 kernel, are described. Of significant interest is a custom LSI circuit we are developing, which will perform the bulk of the residue computations. In addition, an interface that will permit this processor to be controlled by a general-purpose host computer (e.g., a PDP 11/34) is described.
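The arithmetic idea can be sketched in a few lines (a generic residue-number-system demo with arbitrarily chosen moduli, not the moduli or word lengths of the Hughes design): each operand is split into independent small residues, operations proceed channel-by-channel with no carries between channels, and the Chinese remainder theorem reconstructs the result.

```python
from math import prod

MODULI = (5, 7, 9, 11, 13, 16)        # pairwise coprime; range = 720720

def to_residues(x):
    return tuple(x % m for m in MODULI)

def rns_add(a, b):
    return tuple((x + y) % m for x, y, m in zip(a, b, MODULI))

def rns_mul(a, b):
    return tuple((x * y) % m for x, y, m in zip(a, b, MODULI))

def from_residues(r):
    """Chinese-remainder reconstruction of the integer mod prod(MODULI)."""
    M = prod(MODULI)
    x = 0
    for ri, m in zip(r, MODULI):
        Mi = M // m
        x += ri * Mi * pow(Mi, -1, m)  # pow(..., -1, m): modular inverse
    return x % M
```

Because each channel needs only small-word adders or lookup tables and no inter-channel carry propagation, residue arithmetic maps naturally onto compact parallel LSI hardware, which is part of what makes it attractive for a local-area kernel processor.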