Proceedings Volume 1001

Visual Communications and Image Processing '88: Third in a Series

T. Russell Hsing
View the digital version of this volume at the SPIE Digital Library.

Volume Details

Date Published: 25 October 1988
Contents: 1 Sessions, 130 Papers, 0 Presentations
Conference: Visual Communications and Image Processing III 1988
Volume Number: 1001

Table of Contents

All links to SPIE Proceedings will open in the SPIE Digital Library.

All Papers
Simultaneous Image Identification And Restoration Using The Em-Algorithm
Reginald L. Lagendijk, Jan Biemond, Dick E. Boekee
In many situations of interest the degradation an image has suffered is unknown prior to the restoration process. For this reason the point-spread function of the degrading system has to be estimated directly from the available noisy blurred image. We present a maximum likelihood approach to this image identification problem, and employ the computationally efficient and flexible EM-algorithm to solve the resulting highly nonlinear optimization problem. This approach results in an efficient iterative algorithm, which simultaneously identifies and restores noisy blurred images.
Fast Adaptive Iterative Image Restoration Algorithms
Serafim N. Efstratiadis, Aggelos K. Katsaggelos
In this paper fast adaptive iterative image restoration algorithms are proposed. These algorithms are based on a class of nonadaptive restoration algorithms which exhibit a first or higher order of convergence, some of which consist of an on-line and an off-line computational part. Since only the linear algorithm can take a computationally feasible adaptive formulation, an iterative algorithm which combines the linear adaptive and the higher order nonadaptive algorithms is proposed. An adaptive window size method is followed in the implementation of the linear adaptive algorithm, which improves the restoration results. A method to update the computation of the measure of the local activity only in the near-edge areas is also proposed, thus resulting in great computational savings. Finally, experimental results are presented.
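As a rough illustration of the kind of non-adaptive first-order iteration such schemes build on (not the authors' adaptive algorithm), the following Python sketch runs a Landweber-type restoration; the blur kernel, step size, and iteration count are illustrative assumptions.

```python
# Minimal sketch of a first-order (Landweber-type) iterative restoration;
# the adaptive window and local-activity updates of the paper are not reproduced.
import numpy as np
from scipy.signal import convolve2d

def landweber_restore(blurred, psf, beta=1.0, n_iter=50):
    """Iterate x <- x + beta * H^T (y - H x), where H is convolution with psf."""
    psf_flip = psf[::-1, ::-1]            # adjoint of the convolution operator
    x = blurred.copy()
    for _ in range(n_iter):
        residual = blurred - convolve2d(x, psf, mode="same", boundary="symm")
        x = x + beta * convolve2d(residual, psf_flip, mode="same", boundary="symm")
    return x

# Hypothetical usage with a uniform 3x3 blur:
# y = convolve2d(image, np.full((3, 3), 1/9), mode="same", boundary="symm")
# restored = landweber_restore(y, np.full((3, 3), 1/9), n_iter=100)
```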
Applications Of Adaptive Least Square Filtering To Image Restoration
Mohamed Chellali, Vinay K. Ingle
Images in real life are inhomogeneous while transmission channels are time varying; hence, adaptive schemes are suitable for tracking slow variations in system identification and inverse filtering. In this paper we report a theoretical and experimental study of the application of adaptive filtering to two-dimensional signals. In particular, we focus on the recursive least square transversal filter applied to image restoration. A multichannel extension of the well known one-dimensional recursive least square (RLS) algorithm is presented. Finally, simulation results on real-world images are given.
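The multichannel extension itself is not reproduced here; as background, a minimal one-dimensional RLS transversal filter of the kind the paper generalizes might look as follows (forgetting factor, order, and initialization are illustrative assumptions).

```python
import numpy as np

def rls_filter(x, d, order=8, lam=0.99, delta=100.0):
    """Adapt w so that w . [x[n], ..., x[n-order+1]] tracks the desired signal d[n]."""
    w = np.zeros(order)
    P = np.eye(order) * delta              # inverse correlation matrix estimate
    y = np.zeros(len(d), dtype=float)
    for n in range(order - 1, len(x)):
        u = x[n - order + 1:n + 1][::-1]   # most recent samples first
        k = P @ u / (lam + u @ P @ u)      # gain vector
        y[n] = w @ u
        e = d[n] - y[n]                    # a priori error
        w = w + k * e
        P = (P - np.outer(k, u @ P)) / lam
    return y, w
```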
Regularization Theory In Discrete Image Restoration
Nicolaos B. Karayiannis, Anastasios N. Venetsanopoulos
This paper presents several aspects of the application of Regularization Theory in image restoration. This is accomplished by extending the applicability of the stabilizing functional approach to 2-D ill-posed inverse problems. Image restoration is formulated as the constrained minimization of a stabilizing functional. The analytical study of this optimization problem results in a variety of regularized solutions. A relationship between these regularized solutions and optimal Wiener estimation is identified. The resulting algorithms are evaluated through experimental results.
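The paper derives a family of regularized solutions; as a representative member of that family (not the authors' exact algorithm), the sketch below computes the standard Tikhonov / constrained least squares restoration in the frequency domain, assuming circular convolution and a discrete Laplacian as the stabilizing operator.

```python
import numpy as np

def tikhonov_restore(blurred, psf, alpha=0.01):
    """Frequency-domain regularized solution X = H* Y / (|H|^2 + alpha |C|^2),
    with a discrete Laplacian C as the smoothness (stabilizing) operator.
    Circular convolution is assumed for simplicity."""
    rows, cols = blurred.shape
    H = np.fft.fft2(psf, s=(rows, cols))
    lap = np.zeros((rows, cols))
    lap[0, 0] = 4.0
    lap[0, 1] = lap[1, 0] = lap[0, -1] = lap[-1, 0] = -1.0
    C = np.fft.fft2(lap)
    Y = np.fft.fft2(blurred)
    X = np.conj(H) * Y / (np.abs(H) ** 2 + alpha * np.abs(C) ** 2)
    return np.real(np.fft.ifft2(X))
```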
Comparison Study Of Non-Linear Filters In Image And Signal Processing Applications
Yu-Shan Fong, Carlos A. Pomalaza-Raez, Xiao-Hua Wang
This study presents a summary of applications of non-linear filters in signal processing, based on robust estimation theory. Non-linear filters are used in many applications including speech and image processing, mainly because of their ability to suppress noise and preserve signal features such as edges. This is a property that linear filters do not have. In the past several years, many types of non-linear filters have been proposed. However, each of these filters shows optimal smoothing efficiency only for a specific type of noise or a specific type of image. In order to further apply these filters in speech and image processing properly, it is important to conduct a review and analysis of all these filters and give an objective evaluation of their performance. Based upon point estimation theory, three classes of estimators can be distinguished, namely L-estimators, which are based on linear combinations of order statistics; R-estimators, which are derived from rank tests; and M-estimators, which are maximum likelihood estimators. The first part of the work carried out in this study is the classification of the various non-linear filters into these three types of estimators according to the behavior of the filter itself. The second part is the computer implementation of all the filters discussed. Finally, a summary of experimental results is presented.
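As one concrete L-estimator example (an assumption for illustration, not a filter singled out by the study), an alpha-trimmed mean filter interpolates between the moving average and the median:

```python
import numpy as np

def alpha_trimmed_filter(img, size=3, trim=1):
    """Alpha-trimmed mean (an L-estimator): sort the window, drop the `trim`
    smallest and largest samples, and average the rest.  trim=0 gives the
    moving average; trim=(size*size - 1)//2 gives the median."""
    pad = size // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.empty(img.shape, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            window = np.sort(padded[i:i + size, j:j + size].ravel())
            out[i, j] = window[trim:window.size - trim].mean()
    return out
```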
A Fast Noise Reduction Technique Of Binary Images From An Image Scanner
R. H. Park, G. S. Jung, S. K. Kim, et al.
A binary image sometimes shows random noisy dots and blobs that are independent of the input data, depending on the illumination conditions and the characteristics of the image scanner itself. We propose a binary image enhancement algorithm that smooths out these noise elements while preserving thin contours. The proposed algorithm is fast enough to be applicable to nearly real-time services. Computer simulation shows that the proposed method gives better performance than existing methods in terms of computational complexity and the effectiveness of noise reduction while preserving thin contours. It also increases the data compression ratio of the image data based on the G4 facsimile encoding schemes.
The Importance Of Localized Phase In Vision And Image Representation
J. Behar, M. Porat, Y. Y. Zeevi
It is well known that Fourier phase is sufficient for image representation and reconstruction within a scale factor, under a variety of conditions. For a finite-length sequence, i.e., a sampled signal, a finite number of Fourier coefficients suffices to reconstruct the sequence. Several algorithms exist for this purpose. A closed-form solution involves solving a large set of linear equations. Another approach is an iterative technique which involves repeated transformation between the frequency and the spatial domains. The application of these techniques to image reconstruction from global (Fourier) phase is, however, rather limited in practice due to the computational complexity. In this paper we present a new approach to image representation by means of which, similarly to the biological processing at the level of the cortex, the partial information is defined by localized phase. Also, like processing in vision, the dc is first extracted from the signal and signaled separately. Computational results and theoretical analysis indicate that image reconstruction from localized phase only is more efficient than image reconstruction from global (Fourier) phase. It also lends itself to hardware implementation of fast algorithms using highly parallel architectures.
Generalization Of The Radix Method Of Finding The Median To Weighted Median, Order Statistic And Weighted Order Statistic Filters
Olli Yli-Harja, Jaakko Astola, Yrjo Neuvo
This paper describes the generalization of the Radix method of finding the median to Weighted Median (WM), Order Statistic (OS) and Weighted Order Statistic (WOS) filters. The method requires that the input signal is discretized to 2^M possible magnitude levels. Supposing that N is the filter window width, the time complexities for different one-dimensional filters are O(M) for Standard Median (SM) filters, O(NM) for WM filters, O(M) for OS filters and O(NM) for WOS filters. A comparison of time complexities with other methods is performed.
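The Radix machinery itself is not reproduced here; purely as a reference for what a weighted median filter computes, the following sketch uses the direct sample-replication definition (window and weights are illustrative).

```python
import numpy as np

def weighted_median(window, weights):
    """Weighted median by the replication definition: repeat each sample
    `weight` times (integer weights) and take the ordinary median.
    With an odd total weight the result is always one of the input samples."""
    expanded = np.repeat(np.asarray(window), np.asarray(weights))
    return np.median(expanded)

def wm_filter(signal, weights):
    """1-D weighted median filter; window width equals len(weights)."""
    n = len(weights)
    half = n // 2
    padded = np.pad(signal, half, mode="edge")
    return np.array([weighted_median(padded[i:i + n], weights)
                     for i in range(len(signal))])

# e.g. wm_filter(x, [1, 2, 3, 2, 1]) weights the centre sample most heavily.
```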
Applications Of Fractals And Chaos Models In Visual Computing
Robert F. Brammer
Fractals and related mathematical models of chaotic phenomena have become active areas in research and applications in many diverse fields. These include weather forecasting, structural analysis, mapping, entertainment, biology, and many others. These fields appear to be a significant part of an emerging interdisciplinary science of complexity and complex systems. The uses of these models in the rapidly developing field of visual computing (i.e., the computational aspects of image sciences: image processing, analysis, and synthesis) are leading to many innovations. This paper will describe a range of activities at TASC and elsewhere illustrating significant new developments, and it describes a trend toward unification in visual computing using these techniques.
Fractals, Fuzzy Sets And Image Representation
D. R. Dodds
This paper addresses some uses of fractals, fuzzy sets and image representation as they pertain to robotic grip planning and autonomous vehicle navigation (AVN). The robot/vehicle is assumed to be equipped with multimodal sensors including an ultrashort-pulse imaging laser rangefinder. With a temporal resolution of 50 femtoseconds, a time-of-flight laser rangefinder can resolve distances within approximately half an inch or 1.25 centimeters. (Fujimoto88)
Mathematical Morphology On Graphs
Luc Vincent
In several fields of Image Processing (e.g. geography, histology, 2-D electrophoretic gels), the space turns out to be partitioned into zones whose neighborhood relationships are of interest. More generally, a great number of phenomena may be modelled by graphs. Now, a graph being a lattice, it can be approached by Mathematical Morphology. This is the purpose of the present study: after introducing the theoretical framework, it deals with various graphs that can be defined on a given set of objects, depending on the intensity of the desirable relationships (e.g. Delaunay triangulations, Gabriel graphs, Relative neighborhood graphs...). The computational aspect is especially emphasized and new digital algorithms, based on Euclidean zones of influence, are introduced. The main operators of Mathematical Morphology are then defined and implemented for graphs (erosions and dilations, morphological filters, distance function, skeletons, labelling, geodesic operators, reconstruction, etc.) in both binary and decimal cases. A series of fast computational algorithms is developed, leading to a complete software package.
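A minimal sketch of the basic idea, under the assumption that the structuring element at each vertex is "the vertex together with its neighbours" (the paper's zone-of-influence graph construction is not reproduced):

```python
def graph_dilate(values, adjacency):
    """values: dict vertex -> number; adjacency: dict vertex -> list of neighbours.
    Dilation on the graph is a local maximum over each vertex and its neighbours."""
    return {v: max([values[v]] + [values[u] for u in adjacency[v]])
            for v in values}

def graph_erode(values, adjacency):
    """Erosion is the dual local minimum."""
    return {v: min([values[v]] + [values[u] for u in adjacency[v]])
            for v in values}

def graph_open(values, adjacency):
    """Morphological opening = erosion followed by dilation."""
    return graph_dilate(graph_erode(values, adjacency), adjacency)
```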
Threshold Parallelism In Morphological Feature Extraction, Skeletonization, And Pattern Spectrum
Petros Maragos, Robert D. Ziff
In this paper it is shown that many composite morphological systems, such as morphological edge detection, peak/valley extraction, skeletonization, and shape-size distributions, obey a weak linear superposition, called threshold-linear superposition: namely, the output graytone image is the sum of outputs due to input binary images, which result from thresholding the input graytone image at all levels. These results are then generalized to a vector space formulation, e.g., to any linear combination of simple morphological systems such as erosion, dilation, rank-order filters, and their cascade or max/min combinations. Thus many such systems processing graytone images are reduced to corresponding binary image processing systems, which are easier to analyze and implement.
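A small numerical check of threshold decomposition for the simplest rank-order filter, the 3-sample median (the signal values here are arbitrary illustration data):

```python
import numpy as np

def median3(x):
    """Ordinary 3-sample median filter (boundary samples left unchanged)."""
    y = x.copy()
    y[1:-1] = np.median(np.stack([x[:-2], x[1:-1], x[2:]]), axis=0)
    return y

# Threshold-linear superposition: filtering the graytone signal equals summing
# the filtered binary slices obtained by thresholding at every level.
x = np.array([3, 1, 4, 1, 5, 2, 2, 0, 3])
direct = median3(x)
slices = [(x >= t).astype(int) for t in range(1, x.max() + 1)]
stacked = sum(median3(s) for s in slices)
assert np.array_equal(direct, stacked)
```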
Holographic Grids
Walter Schempp
The holographic transform H is thought of as the sesquilinear mapping which assigns to the elements of the tensor product vector space L²(R) ⊗ L²(R) the matrix coefficient function of the linear Schrödinger representation of the three-dimensional real Heisenberg nilpotent Lie group. It is the purpose of the present paper to study the quantum mechanical approach to optical holography including optical phase conjugation, electron holography, and computerized tomographic methods in high resolution imaging radar, for instance synthetic-aperture imaging radar like side-looking airborne SAR and space shuttle SAR. The aspect common to these techniques is the geometric encoding of the two-wave mixing u ⊗ v by the interference pattern H(u,v;.,.) in the holographic plane R ⊕ R, performed simultaneously for the amplitude and the phase by the holographic transform H.
Application Of Recurrent Iterated Function Systems To Images
Michael F. Barnsley, Arnaud E. Jacquin
A new fractal technique for the analysis and compression of digital images is presented. It is shown that a family of contours extracted from an image can be modelled geometrically as a single entity, based on the theory of recurrent iterated function systems (RIFS). RIFS structures are a rich source for deterministic images, including curves which cannot be generated by standard techniques. Control and stability properties are investigated. We state a control theorem - the recurrent collage theorem - and show how to use it to constrain a recurrent IFS structure so that its attractor is close to a given family of contours. This closeness is not only perceptual; it is measured by means of a min-max distance, for which shape and geometry is important but slight shifts are not. It is therefore the right distance to use for geometric modeling. We show how a very intricate geometric structure, at all scales, is inherently encoded in a few parameters that describe entirely the recurrent structures. Very high data compression ratios can be obtained. The decoding phase is achieved through a simple and fast reconstruction algorithm. Finally, we suggest how higher dimensional structures could be designed to model a wide range of digital textures, thus leading our research towards a complete image compression system that will take its input from some low-level image segmenter.
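The recurrent collage theorem itself is not reproduced here; as a purely illustrative sketch of how a recurrent IFS attractor is rendered, the "chaos game" below applies affine maps in an order governed by a Markov transition matrix (the maps and matrix are invented examples, not taken from the paper).

```python
import numpy as np

# Illustrative maps: three contractions toward the corners of a triangle.
maps = [
    lambda p: 0.5 * p,
    lambda p: 0.5 * p + np.array([0.5, 0.0]),
    lambda p: 0.5 * p + np.array([0.0, 0.5]),
]
# Row i gives the probabilities of the next map given that map i was just used,
# so only the allowed map-to-map transitions of the recurrent structure occur.
transition = np.array([[0.0, 0.5, 0.5],
                       [0.5, 0.0, 0.5],
                       [0.5, 0.5, 0.0]])

def rifs_attractor(n_points=20000, seed=0):
    rng = np.random.default_rng(seed)
    points = np.empty((n_points, 2))
    p, state = np.array([0.1, 0.1]), 0
    for k in range(n_points):
        state = rng.choice(len(maps), p=transition[state])
        p = maps[state](p)
        points[k] = p
    return points
```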
Nonparametric Estimation Of Fractal Dimension
Michael C. Stein, Keith D. Hartt
Nonparametric techniques for estimating fractal dimension are discussed. These techniques are shown to be more robust and efficient when compared with the standard least-squares estimation techniques currently in use. In particular, two alternative nonparametric schemes are presented and their use in image modeling applications is discussed. Theoretical and practical reasons are given to indicate why nonparametric methods lead to improved dimension estimation algorithms. Test results are provided that demonstrate the potential of these methods for use in dimension estimation problems, such as image modeling, where small sample sizes are necessary.
A Fast Algorithm For The Morphological Coding Of Binary Images
Dan Schonfeld, John Goutsias
In this paper we investigate the computational performance of a number of morphological skeletons. We also develop the reduced-cardinality geometric-step morphological skeleton which is shown to achieve logarithmic computational complexity, compared to the linear computational complexity of the uniform-step morphological skeleton, when implemented on a large neighborhood pipelined morphological processor. Furthermore, an upper bound on the skeleton cardinality is derived.
Geometric Interpolation Based On Mathematical Morphology
Fang-Kuo Sun
This paper presents an interpolation technique emphasizing the preservation of the original geometric impression. The algorithm is based on the concept of a distance function defined between a point in a connected set and its neighboring sets. It is shown that the distance function for any given point in a set to its neighbor can be evaluated by successive morphological operations, typically a dilation followed by a set intersection. An application of this algorithm to a simple example is discussed.
Morphological Algorithms For Analysis Of Geological Phase Structure
Michael Skolnick, Eric Brechner, Peter Marineau
Morphological algorithms for the analysis of the microstructure and estimation of the physical properties of volcanic geological specimens are described. Various properties of materials, such as bulk viscosity, rigidity, tensile and compressive strengths, etc., depend on the microstructure of the material, which is typically composed of a mosaic of interlocking particles of different phases. The images of the geological microstructure are first separated into their constituent phases. The phase images can then be processed to estimate the percentage contribution of each phase, the total boundary per phase, phase-phase contact boundaries, particle size distributions for each phase and angle of contact between particles. These morphological image measurements can then be used to characterize the physical properties of magma.
Technical Issues In Low Rate Transform Coding
Hamed Amor, Dietmar Biere, Andrew G. Tescher
The transmission of video information through a narrowband channel requires a significant degree of image compression. In addition to the actual coding procedure, the frame rate is reduced, the spatial sample rate is lowered and the color components are further filtered. Via the review of a practical and flexible image compression configuration, the important technical aspects of video coding are presented. The complexity of the currently developed system is discussed for operation over a 64 kbit/sec communication channel. The specific and primary considerations include the following: after a reduction of temporal and spatial resolution, the luminance and color difference signals are divided into pixel blocks of 16 x 16 and 8 x 8, respectively. They are then passed on to a motion compensated DPCM coding system with a subsequent discrete cosine transform of the prediction error. The appropriate quantization is performed via several Huffman tables.
Region Based Image Representation With Variable Reconstruction Quality
Uwe Franke, Rudolf Mester
Region based image representation is a very promising approach in the class of second generation image coding techniques. In this paper we describe a new approach that is based on a spectral description of the texture signals instead of the polynomial approximation which has been proposed by other authors. The fundamental problems in transforming arbitrarily shaped regions due to the window effect are overcome using the algorithm of Selective Deconvolution [FRA87a]. The spectral representation of the texture signals and the polygonal approximation used for the region contours enable a hierarchical organization of the image description. Utilizing the linearity of the generated data structure, our scheme combines very high data compression ratios (if only the reconstruction of the most important image content is required) with the option of image reconstruction with increasing resolution up to high quality, if detail information is transmitted.
Non-Separable Two-Dimensional Perfect Reconstruction Filter Banks
Gunnar Karlsson, Martin Vetterli, Jelena Kovacevic
Non-separable filter banks are studied for general two-dimensional sub-band coding. The developed results are valid for any sub-sampling pattern. A synthesis structure is given which enables cascade-structure design of all two-dimensional paraunitary perfect reconstruction filter banks. The resulting structure is evaluated with regard to free variables and computational complexity. The synthesis structure is also extended to embrace non-paraunitary perfect reconstruction systems. Finally, conditions are derived to test filter banks for linear phase in the polyphase domain.
A New Efficient Hybrid Coding For Progressive Transmission Of Images
Ali N. Akansu, Richard A. Haddad
The hybrid coding technique developed here combines two concepts: progressive interactive image transmission coupled with transform differential coding. There are two notable features in this approach. First, a local average of an m x m (typically 5 x 5) pixel array is formed, quantized and transmitted to the receiver for a preliminary display. This initial pass provides a crude but recognizable image before any further processing or encoding. Upon request from the receiver, the technique then switches to an iterative transform differential encoding scheme. Each iteration progressively provides more image detail at the receiver as requested. Secondly, this hybrid coding technique uses a computationally efficient, real, orthogonal transform, called the Modified Hermite Transform (MHT) [1], to encode the difference image. This MHT is then compared with the Discrete Cosine Transform (DCT) [2] for the same hybrid algorithm. For the standard images tested, we found that the progressive differential coding method performs comparably to the well-known direct transform coding methods. The DCT was used as the standard in this traditional approach. This hybrid technique was within 5% of peak-to-peak SNR for the "LENA" image. Comparisons between the MHT and the DCT as the transform vehicle for the hybrid technique were also conducted. For a transform block size N=8, the DCT requires 50% more multiplications than the MHT. The price paid for this efficiency is modest. For the example tested ("LENA"), the DCT performance gain was 4.2 dB while the MHT was 3.8 dB.
Vector Quantization Of Images Based Upon A Neural-Network Clustering Algorithm
Nasser M. Nasrabadi, Yushu Feng
A neural-network clustering algorithm proposed by Kohonen is used to design a codebook for the vector quantization of images. This neural-network clustering algorithm, better known as the Kohonen Self-Organizing Feature Map, is a two-dimensional array of extensively interconnected nodes or processing units. The synaptic strengths between the input and the output nodes represent the centroids of the clusters after the network has been adapted to the input vector patterns. Input vectors are presented one at a time, and the weights connecting the input signals to the neurons are adaptively updated such that the point density function of the weights tends to approximate the probability density function of the input vectors. Results are presented for a number of coded images using the codebook designed by the Self-Organizing Feature Map. The results are compared with coded images for which the codebook is designed by the well known Linde-Buzo-Gray (LBG) algorithm.
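A minimal sketch of such a codebook design, assuming a generic SOM with linearly shrinking learning rate and Gaussian neighbourhood (the map size, schedules, and training details here are illustrative, not the authors' settings):

```python
import numpy as np

def som_codebook(training_vectors, grid=(8, 8), n_iter=10000, seed=0):
    """Train a 2-D Kohonen map on image block vectors and return its weights
    as a VQ codebook of grid[0]*grid[1] codevectors."""
    rng = np.random.default_rng(seed)
    dim = training_vectors.shape[1]
    weights = rng.random((grid[0], grid[1], dim))
    gy, gx = np.mgrid[0:grid[0], 0:grid[1]]
    for t in range(n_iter):
        x = training_vectors[rng.integers(len(training_vectors))]
        dist = np.linalg.norm(weights - x, axis=2)
        by, bx = np.unravel_index(np.argmin(dist), dist.shape)   # best-matching unit
        lr = 0.5 * (1 - t / n_iter)                              # shrinking learning rate
        radius = max(grid) / 2 * (1 - t / n_iter) + 1            # shrinking neighbourhood
        h = np.exp(-((gy - by) ** 2 + (gx - bx) ** 2) / (2 * radius ** 2))
        weights += lr * h[..., None] * (x - weights)
    return weights.reshape(-1, dim)
```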
Address-Vector Quantization : An Adaptive Vector Quantization Scheme Using Interblock Correlation
Yushu Feng, Nasser M. Nasrabadi
Memoryless vector quantizers exploit the high correlation and the spatial redundancy between neighboring pixels, but they totally ignore the spatial redundancy between blocks. In this paper a new vector quantization scheme called the Address-Vector Quantizer (A-VQ) is proposed. It is based on exploiting the inter-block correlation to encode a group of blocks together by using an address-codebook. The address-codebook consists of a set of address codevectors, where each represents a combination of blocks, with each of its elements being the address of an LBG-codebook entry representing a vector quantized block. The address-codebook consists of two regions: one is the active (addressable) region, and the other is the inactive (non-addressable) region. During the encoding process the codevectors in the address-codebook are reordered adaptively in order to bring the most probable address codevectors into the active region. When encoding an address combination, the active region of the address-codebook is checked; if such an address combination exists, its index is transmitted to the receiver, otherwise the address of each block is transmitted individually. The performance (SNR value) of the proposed A-VQ method is the same as that of a memoryless vector quantizer, but the bit rate is reduced by a factor of two when compared with a memoryless vector quantizer.
Bandwidth Reduction For The Transmission Of Sign Language Over Telephone Lines
Mansouria Boubekker
This paper describes a system for the transmission of American Sign Language (ASL) over standard telephone lines. Video quality digital images require a transmission bandwidth at least 4000 times wider than that allowable by the standard voice telephone network. This study presents compression, segmentation and coding techniques that reduce the information content of visual images while preserving the meaning of the transmitted message. The resulting processed information can be transmitted at digital rates within the range of the commercially available 9600 baud modems. The simulated real-time animations of the decoded cartoon images used for the experimental data represent ASL and finger spelling with an intelligibility approximating 85%. This result is within the range of the intelligibility provided by speech communication over telephone lines.
Quad-Tree Modelling Of Colour Image Regions
R. W. McColl, G. R. Martin
A model is presented which describes and codes the regions within a segmented image by means of quad-tree analysis of each region. A simple criterion is used to determine the importance of any node in the tree as a function of its perceptual relevance. This approach is an improvement on the parametric model which is frequently applied, by preserving textures at high compression. Chromaticity quantisation involves supplying only an average in each region. The additional use of dynamic entropy coding further improves the compression without additional distortion. In this way, compression to 0.5 bits/pel is achievable for colour images.
Hybrid Image Coding Based On Local-Variant Source Models
G. Tu, L. Van Eycken, A. Oosterlinck
A hybrid coding method based on estimation and detection of local features and classification of image data is presented in this paper. The local edge orientations as well as the statistical properties are detected and estimated prior to the vector and scalar quantization of the DCT coefficients. In a specific "feature prediction" process, both the local average grey level, which determines the DC coefficient, and the local variance, which contributes to the ac energy distribution, are estimated for each 8x8 image block by using the surrounding pixels of the block. In this way, the nonstationary image data are locally classified into sub-sources, with each sub-source containing its own specific characteristics. The generally existent statistical dependences between neighboring transform blocks are also exploited due to the feature prediction operation. The classification information, which contains the local edge orientation and the local variance, determines the ac energy distribution and consequently the vector forming of the ac coefficients in the vector quantization process. An adaptive scalar quantizer which is controlled by both the classification information and the channel buffer then follows; clearly, the properties of the human visual system can be incorporated into this supplementary scalar quantization process to improve the coding performance. At a bit rate around 1 bit/pixel, very good image quality can be observed with a high signal-to-noise ratio (up to 40 dB).
Classifying Objects Of Continuous Feature Variability: When Do We Stop Classifying?
Takashi Okagaki
Pattern recognition by a computer assumes that there is a correct answer in classifying the objects, to which we can make reference for correctness of recognition. Classification of a set of objects may have absolutely correct answers when the objects are artifacts (e.g. bolts vs nuts) or highly evolved biological species. However, classification of many other objects is arbitrary (e.g. color, clouds), and is frequently a subject of cultural bias. For instance, traffic lights consist of red, yellow and green in the U.S.A.; they are perceived as red, yellow and blue by Japanese observers. When human bias is involved in classification, a natural solution is to set up a panel of human "experts", and the consensus of the panel is assumed to be the correct classification. For instance, expert interior decorators can define the classification of different colors and hues, and the performance of a machine is tested against the reference set by the human experts.
A New Stochastic Model-Based Image Segmentation Technique For X-Ray CT Image
Tianhu Lei, Wilfred Sewchand
This manuscript demonstrates that X-ray CT images can be modeled by a finite normal mixture. The number of image classes in the observed image is detected by information criteria (AIC or MDL). Parameters of the model are estimated by a modified K-means algorithm, and a Bayesian decision criterion is the basis for this image segmentation approach. The use of simulated and real image data demonstrates the very promising results of this proposed technique.
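A rough sketch of the overall idea, with plain K-means standing in for the paper's modified version and a classification-likelihood approximation of the mixture fit for the AIC score (both are simplifying assumptions for illustration):

```python
import numpy as np

def kmeans_1d(x, k, n_iter=50, seed=0):
    """Plain K-means on pixel intensities (the paper uses a modified version)."""
    rng = np.random.default_rng(seed)
    centers = rng.choice(x, size=k, replace=False).astype(float)
    for _ in range(n_iter):
        labels = np.argmin(np.abs(x[:, None] - centers[None, :]), axis=1)
        centers = np.array([x[labels == j].mean() if np.any(labels == j)
                            else centers[j] for j in range(k)])
    return labels, centers

def aic_for_k(x, k):
    """AIC of a k-component normal fit built from the K-means partition."""
    labels, _ = kmeans_1d(x, k)
    loglik = 0.0
    for j in range(k):
        xj = x[labels == j]
        if len(xj) == 0:
            continue
        var = xj.var() + 1e-6
        loglik += -0.5 * len(xj) * (np.log(2 * np.pi * var) + 1)
        loglik += len(xj) * np.log(len(xj) / len(x))   # mixing-proportion term
    n_params = 3 * k - 1                               # means, variances, weights
    return -2 * loglik + 2 * n_params

# Choose the number of tissue classes as the AIC minimizer over a candidate range:
# best_k = min(range(2, 6), key=lambda k: aic_for_k(pixels.ravel(), k))
```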
An Algorithm For Segmenting Chest Radiographs
Dexin Cheng, M. Goldberg
In this paper, an algorithm for segmenting chest radiographs is presented where the left and right lungs are analysed separately. For each lung, a minimum rectangular frame is first determined, which completely encloses the object to be processed. The vertical edges of the minimum rectangular frame are determined by using the horizontal signature. The horizontal edges of the rectangular frame are determined by exploiting both the vertical signatures and the maximum vertical gradient sums. The image enclosed within the rectangular frame is called the subimage and its histogram is calculated. Segmentation into two classes is based upon grey level histogram thresholding, where the pixels less than a prefixed threshold are set to the background and those above to the foreground. The choice of threshold value is based upon a priori knowledge about the chest radiograph. Finally, a noise removal process is applied to smooth the boundaries: the top borders are refined by parabolas and the side borders by straight lines. Simulations are carried out on several chest radiographs and demonstrate the effectiveness of the proposed segmentation algorithm.
Error Free Compression Of Medical X-Ray Images
Charles G. Boncelet Jr., Joseph R. Cobbs, Allan R. Moser
We present the results of a study of error free compression of digitized medical x-ray images. This study used high resolution (2048 x 1684), high quality (8 to 11 bits per pixel) images, each approximately 5 Mbytes before compression. Since the medical community is very reluctant to introduce unnecessary noise into their imaging, we require error-free compression or, equivalently, perfect reconstruction. In particular, we investigated the following error free methods: Huffman coding, arithmetic coding, Lempel-Ziv coding, runlength coding, ordinary quadtree coding, and a new quadtree method, multi-window quadtree.
Noncausal Parametric Models For The Restoration Of Angiograms
An-Loong Kok, Dimitris Manolakis
Digital Subtraction Angiography (DSA) is a medical imaging modality used to study the blood vessels and the cardiac chambers. The primary signal processing goal in DSA is the removal of interfering effects from secondary structures in the image in order to enhance clinically significant details. A combination of various image processing techniques, including image subtraction, contrast enhancement and matched filtering, is currently used in DSA systems. In this paper we mainly investigate the applicability of noncausal zero-phase parametric two-dimensional (2-D) modeling to the restoration of angiograms.
Quantitative Multispectral Analysis Of Discrete Subcellular Particles By Digital Imaging Fluorescence Microscopy (DIFM)
C. Kathleen Dorey, David B. Ebenstein
Subcellular localization of multiple biochemical markers is readily achieved through their characteristic autofluorescence or through use of appropriately labelled antibodies. Recent development of specific probes has permitted elegant studies of calcium and pH in living cells. However, each of these methods measured fluorescence at one wavelength; precise quantitation of multiple fluorophores at individual sites within a cell has not been possible. Using DIFM, we have achieved spectral analysis of discrete subcellular particles 1-2 µm in diameter. The fluorescence emission is broken into narrow bands by an interference monochromator and visualized through the combined use of a silicon intensified target (SIT) camera, a microcomputer-based framegrabber with 8-bit resolution, and a color video monitor. Image acquisition, processing, analysis and display are under software control. The digitized image can be corrected for the spectral distortions induced by the wavelength-dependent sensitivity of the camera, and the displayed image can be enhanced or presented in pseudocolor to facilitate discrimination of variations in pixel intensity of individual particles. For rapid comparison of the fluorophore composition of granules, a ratio image is produced by dividing the image captured at one wavelength by that captured at another. In the resultant ratio image, a granule which has a fluorophore composition different from the majority is selectively colored. This powerful system has been utilized to obtain spectra of endogenous autofluorescent compounds in discrete cellular organelles of human retinal pigment epithelium, and to measure immunohistochemically labelled components of the extracellular matrix associated with the human optic nerve.
Vision Guided Laryngoscopic Analysis
H. C. Yung, C. R. Allen, R. Beresford, et al.
This paper describes a vision guided endoscopic computer system applied to the analysis of static vocal fold images of patients. The system comprises a flexible nasal endoscope and a PC-compatible computer supported by a fully interactive menu-driven software environment which performs the analysis tasks. It is the intention of this paper to present the philosophy of the approach, the construction of the prototype, the results of the investigation and the merits and drawbacks of the prototype at this stage of development. The cost effectiveness of such a system being used in private clinics and general hospitals as an expert adviser to the examining doctor will also be discussed.
Surface Detection In Dynamic Tomographic Myocardial Perfusion Images By Relaxation Labelling
Tracy L. Faber, Ernest M. Stokely, James R. Corbett
A relaxation labelling model was implemented to detect 3-D endo- and epicardial surfaces in ECG gated single photon emission computed tomographic perfusion studies of the heart. The model was tested using studies from normal volunteers and from patients with coronary artery disease. LV volumes calculated from the abnormal patient studies correlated well with those from contrast ventriculography, r=0.82 and r=0.89 for end systole and end diastole, respectively. The ejection fractions correlated well with those from radionuclide ventriculograms (r=0.93). For normal volunteers, left ventricular endocardial volumes calculated using relaxation labelling correlated with those computed from user-traced surfaces with r=0.98. The correlation between epicardial volumes computed using relaxation labelling and hand-traced edges was also 0.98.
A Pictorial Data Representation And Compaction Algorithm For Networked Server Applications
C. R. Allen, R. Booth, J. S. Sheblee, et al.
The transmission and automatic compaction of complex imagery on local area networks is a rapidly developing field, driven by the needs of industry, commerce and medicine, which require image-based data on existing LAN computing systems. This paper describes a prototype workstation/server architecture and the algorithm used to code, archive and present pictorial data, illustrated using representative real images. The data are derived from on-line endoscopic examination of subjects from Ear, Nose and Throat (ENT) clinics.
Ophthalmic Image Acquisition And Analysis System
Ulrich Klingbeil, Andreas Plesch, Wolfgang Rappl, et al.
The concept of an ophthalmic workstation will be presented. A Scanning Laser Ophthalmoscope was developed as a recording device for its superior retinal imaging characteristics. The system architecture provides general electronic image acquisition and analysis. The software includes digital image archiving and development tools for image analysis. Applications will be discussed.
POPART - Performance Optimized Algebraic Reconstruction Technique
K. M. Hanson
A method for optimizing image-recovery algorithms is presented that is based on how well a specified visual task can be performed using the reconstructed images. Visual task performance is numerically assessed by a Monte Carlo simulation of the complete imaging process including the generation of scenes appropriate to the desired application, subsequent data taking, image recovery, and performance of the stated task based on the final image. This method is used to optimize the Algebraic Reconstruction Technique (ART), which reconstructs images from their projections, by varying the relaxation factor employed in the updating procedure. In some of the imaging situations studied, it is found that the optimization of constrained ART, in which a nonnegativity constraint is invoked, can vastly increase the detectability of objects. There is little improvement attained for unconstrained ART.
Systolic Architectures For Hidden Markov Models
J. N. Hwang, J. A. Vlontzos, S. Y. Kung
This paper proposes a unidirectional ring systolic architecture for implementing hidden Markov models (HMMs). This array architecture maximizes the strength of VLSI in terms of intensive and pipelined computing and yet circumvents the limitation on communication. Both the scoring and learning phases of an HMM are formulated as a consecutive matrix-vector multiplication problem, which can be executed in a fully pipelined fashion (100% utilization efficiency) by using a unidirectional ring systolic architecture. By appropriately scheduling the algorithm, which combines the operations of the backward evaluation procedure and the reestimation algorithm at the same time, we can use this systolic HMM in a most efficient manner. The systolic HMM can also be easily adapted to the left-to-right HMM by using bidirectional semi-global links with significant time saving. This architecture can also incorporate the scaling scheme, with little extra effort in the computations of the forward and backward evaluation variables, to prevent the frequently encountered numerical underflow problems. We also discuss a possible implementation of this proposed architecture using the Inmos transputer (T-800) as the building block.
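As a software reference for the matrix-vector formulation of the scoring phase (the systolic mapping itself is not reproduced), the scaled forward algorithm can be written as one matrix-vector product per observation:

```python
import numpy as np

def forward_scaled(A, B, pi, obs):
    """Scoring phase of an HMM as repeated matrix-vector products.
    A: (N, N) transition matrix, B: (N, M) emission matrix, pi: (N,) priors,
    obs: sequence of observation symbol indices.  Per-step scaling prevents
    the numerical underflow mentioned above."""
    alpha = pi * B[:, obs[0]]
    c = alpha.sum()
    alpha /= c
    log_prob = np.log(c)
    for o in obs[1:]:
        alpha = (A.T @ alpha) * B[:, o]   # one matrix-vector product per frame
        c = alpha.sum()
        alpha /= c
        log_prob += np.log(c)
    return log_prob                        # log P(observations | model)
```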
Algorithms And Architecture For Image Adaptive Vector Quantization
S. Panchanathan, M. Goldberg
In this paper, we present two algorithms for vector quantization of images and an architecture to implement these algorithms. In vector quantization (VQ), the image vectors are usually coded with a "universal" codebook; however, for a given image, only a subset of the codewords in the universal codebook may be needed. This means that effectively a smaller label size can be employed at the expense of a small amount of overhead information to indicate to the receiver the codewords used. Simulation results demonstrate the superior coding performance of this technique. VQ using a universal codebook (VQUC) is computationally less demanding, but its performance is poor for images outside the training sequence. Image adaptive techniques, where new codebooks are generated for each input image (VQIAC), can improve the performance but at the cost of increased computational complexity. A technique which combines the advantages of VQUC and VQIAC is presented in this paper. Simulation results demonstrate that the technique gives a coding performance close to that obtained with image adaptive VQ at a substantially reduced computational complexity. A systolic array architecture to implement the algorithms in real time is also presented. The regular and iterable structure makes possible the VLSI implementation of the architecture.
Making Curvature Estimates Of Range Data Amenable To A VLSI Implementation
M. E. Malowany, A. S. Malowany
This paper describes a method for computing Gaussian and mean curvature maps from range data and a modification to this method aimed at facilitating its implementation as a VLSI circuit. The curvature computations constitute a surface characterization step in range-data processing. A comparison of the performance of the original and modified algorithms on artificial range images is presented. Results of applying the modified algorithm to real range data are also presented.
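The VLSI-oriented modification is not reproduced here; as a reference for the underlying computation, the standard Monge-patch formulas give the Gaussian and mean curvature from finite-difference derivatives of the range image:

```python
import numpy as np

def curvature_maps(z, spacing=1.0):
    """Gaussian (K) and mean (H) curvature of a range image z(x, y) treated
    as a Monge patch, from first and second finite-difference derivatives."""
    zy, zx = np.gradient(z, spacing)       # derivatives along rows (y) and columns (x)
    zyy, zyx = np.gradient(zy, spacing)
    zxy, zxx = np.gradient(zx, spacing)
    g = 1.0 + zx ** 2 + zy ** 2
    K = (zxx * zyy - zxy ** 2) / g ** 2
    H = ((1 + zx ** 2) * zyy - 2 * zx * zy * zxy + (1 + zy ** 2) * zxx) / (2 * g ** 1.5)
    return K, H
```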
VLSI Architectures For High Speed Two-Dimensional State-Space Recursive Filtering
Jin Yun Zhang, Willem Steenaart
This paper describes the problem of 2-D state-space filter algorithms and presents some high speed VLSI implementations. State-space filters are known to be capable of minimizing finite-word-length effects, but at the cost of increased computation. By exploiting concurrency in two-dimensional state-space systems, the following speed-up architectures are obtained. The local speed-up processors realize the matrix-vector multiplications and decrease the processing time for each pixel. The global speed-up structures in addition use the inherent spatial concurrency and decrease the total processing time in a global sense. These architectures feature a high degree of parallelism and pipelining. They can work on multiple columns or multiple lines of an image concurrently. The throughput rate can be up to one column or one line of an image per clock time. Another high speed architecture, based on the 2-D block-state update technique, is then presented. It is shown that the throughput rate can be adjusted by varying the block size. Finally, comparisons among the different architectures are given in terms of hardware complexity, throughput rate, latency and efficiency.
A 50Mbit/Sec. CMOS Video Linestore System
Yeun C. Jeung
This paper reports the architecture, design and test results of a CMOS single chip programmable video linestore system which has 16-bit data words with 1024-bit depth. The delay is fully programmable from 9 to 1033 samples by a 10-bit binary control word. The large 16-bit data word width makes the chip useful for a wide variety of digital video signal processing applications such as DPCM coding, High-Definition TV, and video scramblers/descramblers. For those applications, the conventional large fixed-length shift register or static RAM scheme is not very popular because of its lack of versatility, high power consumption, and required support circuitry. The very high throughput of 50 Mbit/sec is made possible by a highly parallel, pipelined dynamic memory architecture implemented in a 2-µm N-well CMOS technology. The basic cell of the programmable video linestore chip is a four-transistor dynamic RAM element. This cell comprises the majority of the chip's real estate, consumes no static power, and gives good noise immunity to the simply designed sense amplifier. The chip design was done using Bellcore's version of the MULGA virtual grid symbolic layout system. The chip contains approximately 90,000 transistors in an area of 6.5 x 7.5 square mm and the I/Os are TTL compatible. The chip is packaged in a 68-pin leadless ceramic chip carrier package.
GaAs Supercomputing: Architecture, Language, And Algorithms For Image Processing
John T. Johl, Nick C. Baker
The application of high-speed GaAs processors in a parallel system matches the demanding computational requirements of image processing. The architecture of the McDonnell Douglas Astronautics Company (MDAC) vector processor is described along with the algorithms and language translator. Most image and signal processing algorithms can utilize parallel processing and show a significant performance improvement over sequential versions. The parallelization performed by this system is within each vector instruction. Since each vector has many elements, each requiring some computation, useful concurrent arithmetic operations can easily be performed. Balancing the memory bandwidth with the computation rate of the processors is an important design consideration for high efficiency and utilization. The architecture features a bus-based execution unit consisting of four to eight 32-bit GaAs RISC microprocessors running at a 200 MHz clock rate for a peak performance of 1.6 BOPS. The execution unit is connected to a vector memory with three buses capable of transferring two input words and one output word every 10 nsec. The address generators inside the vector memory perform different vector addressing modes and feed the data to the execution unit. The functions discussed in this paper include basic MATRIX OPERATIONS, 2-D SPATIAL CONVOLUTION, HISTOGRAM, and FFT. For each of these algorithms, assembly language programs were run on a behavioral model of the system to obtain performance figures.
Effective Realization Of Systolic Functions In Logic Programs
Albert C. Chen, Fenfen Shueh, Chuan-Lin Wu
This paper presents methods to concurrently exploit AND and OR parallelism in logic programs. The execution sequence of the system is driven by the demand to reduce complexity. The concept of environment copying is applied to achieve modularity when OR parallelism is exploited. As a result, multiple independent structures can be dynamically constructed according to the data dependency to solve various systolic algorithms.
Design Of A Special-Purpose Computer For Locating Targets Within An Image
L. M. Napolitano, Jr., P. R. Bryson, K. R. Berry, et al.
Starloc (Sandia TARget LOcation Computer) is a special-purpose computer designed for target location in an image. It is based on a generalized correlation filter algorithm. Correlation of the input image and a small number (10-20) of specially-designed filter kernels is performed in the Fourier domain. The correlation of the input image with the filter set produces an encoding at each input image pixel location; targets are located at pixels with specific codes. This target signature is not sensitive to target variations such as rotation, range, brightness, and angle of view. Starloc is now under development and when completed will process two 256 pixel by 256 pixel input images per second. Starloc's basic architecture consists of ten pipelined processing stages (eight for Fast Fourier Transform operations and two for pixel by pixel operations) arranged in a ring-like structure. Within each stage are a controller, image memory, address generator, and register file with parallel floating-point processors. Starloc is designed to be fault tolerant by including two hot standby stages that can be switched into the data path when other stages fail and by incorporating a comprehensive set of error checkers. Using currently available floating-point processors, Starloc will run at a sustained rate of 188 MFLOPS (million floating-point operations per second). At this rate, it is performing 36 complex 256 pixel by 256 pixel two-dimensional Fast Fourier Transforms per second.
Entertainment Video-On-Demand At T1 Transmission Speeds (1.5 Mb/S )
C. N. Judice
With the emergence of high speed packet networks as proposed in CCITT Study Group XVIII, metropolitan area networks as defined in IEEE 802.6, and multimedia, high capacity storage such as CD-ROMs, there is renewed interest among image processing specialists in achieving high bit rate compressions without severe compromises in image quality. Advances in signal processing techniques and their implementation in VLSI have led to the point where it is now possible to digitally encode full motion video sequences at less than 1.5 Mbps with a spatial and temporal resolution comparable to that commonly viewed on a standard VHS video tape recorder. Work at the David Sarnoff Labs on DVI encoding for CD-ROMs has demonstrated impressive image quality for scenes compressed to an average of 1.5 Mbps. Lippman and his colleagues at the MIT Media Lab are also making significant progress in improving the image quality while maintaining the same 1.5 Mbps rate (important because it is the transfer rate of CD-ROMs). Recently, work at Bellcore analyzed a broad spectrum of video material and determined the feasibility of achieving 1.5 Mbps while maintaining near-VCR quality.
Eye Movements And Coding Of Video Sequences
Bernd Girod
The influence of eye movements on the perception of spatiotemporal impairments and their relevance for the encoding of video sequences is discussed comprehensively. Two simple experiments show that it is neither permissible to generally blur the video signal in moving areas, nor is it justified to introduce more noise in moving areas. Eye movements slant the spatiotemporal frequency response of the HVS. The influence of eye movements on spatial and temporal masking is demonstrated by a computational model of visual perception. Smooth pursuit eye movements reduce or eliminate temporal masking, but they increase spatial masking effects. A coding system is considered that utilizes an eye tracker to pick up the point of regard of a single viewer. Such a system has an enormous potential for data compression, but the usefulness of the approach is limited because of the delay introduced by encoding and transmission.
A Fixed Quality, Variable Rate Finite State Vector Quantizer For Encoding Video Over Packet Networks
Hsiao-hui Shen, Richard L. Baker
The advent of packet-switched networks has renewed interest in variable rate video coding schemes which can deliver a specified level of image quality. This paper describes one such scheme, using a variable rate Finite State Vector Quantizer (FSVQ). The codec alternates between an intraframe mode, for encoding foreground (moving) areas, and an interframe mode, for background areas. Separate FSVQs are used for the two modes, each with its own set of super-codebooks. Background blocks having low residual energies are conditionally replenished. The codec maintains a fixed SNR by using sub-codebooks having various numbers of codevectors and dimensions, making it an adaptive multi-rate FSVQ. Image quality is controlled by a set of SNR status variables which dynamically select classes of sub-codebooks. Each status variable is a temporally filtered measure of the SNR in a pre-assigned region of the frame that is updated each frame time. The codec may also be configured to include a rate buffer so that image quality can be degraded in a controlled fashion, should the attached network become congested. Distortion during non-congested operation is maintained to within about 1 dB of the desired value over a range of SNRs up to about 45 dB, with rates as high as 1.5 bpp.
Adaptive Vector Quantization In Transform Domain With Variable Block Size
H. Sun, M. Goldberg
A novel method for adaptive vector quantization has been proposed to address the problem of edge degradation in image coding. This technique utilizes variable block sizes, in our example 4x4 and 2x2. The basic idea is to partition the image into two parts: non-active and active. For the non-active area, a large block size is used, whereas for the active area a small block size is employed. The quality of the reconstructed images is improved using this algorithm because the bits saved in the non-active areas are allocated to the active areas where edges may exist. To decrease the complexity of the encoder and decoder, one codebook is used for both active and non-active areas. First, both non-active blocks (large block size) and active blocks (small block size) undergo a two-dimensional cosine transform. The same number of coefficients is retained in both cases; for the large blocks these tend to be the lower order coefficients, whereas for the small blocks some of the higher order coefficients are retained. A single codebook is then generated. The experimental results show that this algorithm obtains good reconstructed image quality at bit rates as low as 0.8 bits/pixel.
Adaptive Methods Applied To Intra And Interframe Predictive Coding
J. C. Pesquet, G. Tziritas
One way to achieve image compression is to use predictive coding schemes. Furthermore, as the signal to process is not stationary, such schemes can even be improved by adopting adaptive strategies. In this paper, it will be proven that it is possible to use a continuous adaptation. The coding protocol can thus be simplified because there is no need to transmit any signal other than the prediction error. The predictor used is a transversal filter with continuously adapted coefficients. To arrive at this, one-dimensional Least Mean Square and adaptive Least Squares are generalized to the special case of digital image processing. The methods described are of great interest and could be applied to many fields other than image coding. A method of adapting the levels of a quantizer to the dynamic range of the two-dimensional signal will also be developed. This is based on a 2-D backward estimation of the prediction error variance. The practical implementation of these algorithms will then be analyzed for the intraframe and interframe cases. Comparison with a fixed encoder clearly shows the interest of such techniques. With no excessive computational complexity, we obtain a 40 percent improvement in distortion by using an LMS-adapted predictor and a three-level adaptive quantizer. The numerous simulations made also give good visual results.
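A rough sketch of a continuously adapted intraframe predictor of this general kind (not the authors' exact filter); a normalized LMS update and a four-pixel causal support are assumptions for illustration, and the quantizer and decoder-side mirroring of the adaptation are omitted.

```python
import numpy as np

def lms_dpcm(img, mu=0.05):
    """2-D LMS-adapted intraframe predictor: each pixel is predicted from its
    west, north-west, north and north-east neighbours, and the coefficients are
    updated continuously from the prediction error, so no predictor side
    information needs to be transmitted."""
    w = np.full(4, 0.25)                       # initial predictor coefficients
    err = np.zeros(img.shape, dtype=float)
    for i in range(1, img.shape[0]):
        for j in range(1, img.shape[1] - 1):
            context = np.array([img[i, j - 1], img[i - 1, j - 1],
                                img[i - 1, j], img[i - 1, j + 1]], dtype=float)
            e = img[i, j] - w @ context        # prediction error to be coded
            err[i, j] = e
            # normalized LMS coefficient update for stability on 8-bit data
            w = w + mu * e * context / (context @ context + 1e-8)
    return err, w
```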
Local Motion-Adaptive Interpolation Technique
Joon-Seek Kim, Rae-Hong Park
In this paper, we propose a local motion-adaptive interpolation technique which can be used in a codec based on MCC-BMA (motion compensated coding with a block matching algorithm) and a simple four-region segmentation algorithm. In addition, we propose a new BMA using integral projections to make it amenable to real-time processing, and simulate the proposed local motion-adaptive interpolation technique combined with the conventional three-step search algorithm and the fast BMA using integral projections.
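A minimal sketch of block matching with integral projections (search range and matching criterion are illustrative assumptions): candidate blocks are compared through their 1-D row and column sums rather than the full 2-D block, which reduces the per-candidate cost.

```python
import numpy as np

def match_by_projections(cur_block, ref, top, left, search=7):
    """Full-search block matching using integral (row/column) projections.
    Returns the motion vector (dy, dx) of the best match in the reference frame."""
    n = cur_block.shape[0]
    row_proj = cur_block.sum(axis=1)
    col_proj = cur_block.sum(axis=0)
    best, best_mv = np.inf, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + n > ref.shape[0] or x + n > ref.shape[1]:
                continue
            cand = ref[y:y + n, x:x + n]
            err = (np.abs(cand.sum(axis=1) - row_proj).sum()
                   + np.abs(cand.sum(axis=0) - col_proj).sum())
            if err < best:
                best, best_mv = err, (dy, dx)
    return best_mv
```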
An Edge Preserving Filter With Variable Window Size For Real Time Video Signal Processing
O. Vainio, H. Tenhunen, T. Korpiharju, et al.
Fast pipelined CMOS integrated circuit implementation of the FIR Median Hybrid filter has been designed. The filter structure combines linear averaging subfilters to a median operation in a way that leads to a very efficient implementation. The filter removes noise while preserving sharp edges in the signals. The circuit can operate up to 80 MHz clock rates and it is capable of filtering real time video signals including future HDTV applications. The window size is adjustable from 3 to 257 samples.
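The hardware design itself is not reproduced; as a reference for what the basic FIR Median Hybrid structure computes, a software sketch (a fixed window with averaging half-windows of length k is assumed here, not the chip's adjustable window):

```python
import numpy as np

def fmh_filter(x, k=2):
    """Basic FIR Median Hybrid filter: the output is the median of the average
    of the k samples to the left, the centre sample, and the average of the k
    samples to the right (total window width 2k + 1)."""
    y = x.astype(float).copy()
    for i in range(k, len(x) - k):
        left = x[i - k:i].mean()
        right = x[i + 1:i + k + 1].mean()
        y[i] = np.median([left, x[i], right])
    return y
```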
VLSI Architecture For Generalized 2-D Convolution
Yu-Chung Liao
This paper proposes a VLSI architecture for the parallel processing of the generalized 2-D convolution. The processor consists of a shift-buffer pipeline, an array of multipliers and a tree of adders. The image data enter the processor in a raster scan format and are stored and shifted in the pipeline. The multiplier array takes data from the pipeline and performs the multiplications in parallel, then sends the partial products to the adder tree to complete the computation. The simple architecture and control strategy make the proposed scheme suitable for VLSI implementation.
Adaptive Sensitivity / Intelligent Scan Image Processor
Ran Ginosar, Yehoshua Y. Zeevi
Novel image processor architectures are presented. The architectures are designed to take advantage of the advanced concepts of Adaptive Sensitivity and Intelligent Scan, and, in particular, of the new sensors described in the companion paper. The processors are capable of manipulating selective visual data sets, which are generated by Intelligent Scan sensors. They are organized in feedback adaptive forms, exhibiting performance superior to conventional, feedforward architectures.
Adaptive Sensitivity / Intelligent Scan Imaging Sensor Chips
Ran Ginosar, Yehoshua Y. Zeevi
CMOS VLSI imaging sensors are described which employ a novel approach to image acquisition and processing. Principles of biological visual systems, such as adaptation and selective attention, are applied to achieve a high performance imaging system. The Adaptive Sensitivity mechanism enables the acquisition of an extremely wide dynamic range, much wider than that of conventional semiconductor sensors, while preserving image details and dynamically adapting to the image contents. The Intelligent Scan mechanism filters in only the areas of interest, thus reducing the amount of transmitted visual data in an intelligent, image-dependent, adaptive manner. The sensors are complemented by an image processing system, described in a companion paper.
A High Performance Convolution Processor
J. F. Cote, C. Collet, D. D. Haule, et al.
This paper summarizes the design of a convolution processor card that is very low in cost, easy to use and, most importantly, performs a 9x9 convolution in less than a second. Its high performance is attributed to a VLSI systolic convolution cell designed in our laboratories and to an efficient supporting data path architecture. The new Intel 82258 Advanced DMA Controller is used to perform each pixel transfer to and from the host computer's memory. Due to the DMA's software programmability, pictures of any size can be processed. The circuit is assembled on a 36-column MULTIBUS card and is installed on an Intel System 310 running the iRMX 286 real-time multitasking operating system.
Kanji Character Recognition Unit With Hand-Scanner Using SIMD Processor
Toshio Kondo, Shunkichi Tada, Sueharu Miyahara
A prototype OCR is constructed using a very compact parallel processing unit. This unit, designed for interactive character recognition applications, is equipped with a hand-scanner for input and a personal computer for word and/or image processing. The heart of this unit is a bit-serial Single Instruction Multiple Data stream (SIMD) array processor constructed with four identical cellular array LSIs (AAP2). The processor is fully programmable, and the complex process of Japanese character recognition can be carried out with a single program package. Its architecture permits flexible and high-speed SIMD operations to process bitline data such as local fields of scanned documents. The processor components were integrated onto one board and confirmed, through various image processing tests, to be more than ten times faster than present image processors of the same size. High character recognition performance is obtained at a reading speed of 8 Japanese characters per second, which is sufficient for hand-scanning data input operations. The recognition rate is higher than 98% for about 3,300 Japanese characters.
Image Rotation Correction With CORDIC Array Processor
Keh-Hwa Shyu, Bor-Shenn Jeng, I-Chang Jou, et al.
In document analysis and understanding systems [1,2], rotation of the document image causes optical character recognition errors, and the document must then be scanned and recognized again. This phenomenon degrades the performance of the automatic document input system. In this paper, we propose a method to estimate the unexpected rotation angle of the image, and we suggest using a pipelined CORDIC array processor architecture to rotate the image back quickly. The performance of the automatic document input system is thereby increased.
A High Performance VLSI Computer Architecture For Computer Graphics
Chi-Yuan Chin, Wen-Tai Lin
A VLSI computer architecture consisting of multiple processors is presented in this paper to satisfy modern computer graphics demands, e.g., high resolution, realistic animation, and real-time display. All processors share a global memory which is partitioned into multiple banks. Through a crossbar network, data from one memory bank can be broadcast to many processors. Processors are physically interconnected through a hyper-crossbar network (a crossbar-like network). By programming the network, the topology of communication links among processors can be reconfigured to satisfy the specific dataflows of different applications. Each processor consists of a controller, arithmetic operators, local memory, a local crossbar network, and I/O ports to communicate with other processors, memory banks, and a system controller. Operations in each processor are characterized into two modes, i.e., object domain and space domain, to fully utilize the data-independency characteristics of graphics processing. Special graphics features such as 3D-to-2D conversion, shadow generation, texturing, and reflection can be easily handled. With current high density interconnection (MI) technology, it is feasible to implement a 64-processor system achieving 2.5 billion operations per second, a performance needed in most advanced graphics applications.
The Visibility Of Moiré Patterns
Joseph Shou-Pyng Shu, Robert Springer, Chia Lung Yeh
Unacceptable Moiré distortion may result when images that include periodic structures such as halftone dots are scanned. In the frequency domain, Moiré patterns are the result of visible aliased frequencies. In the spatial domain, the aliased frequency components correspond to cyclic changes in the size of halftone dots which are visible as periodic "beat" patterns. Moiré formation depends on the following factors: (1) halftone screen frequency, (2) scan frequency, (3) angle between the scan direction and the halftone screen, (4) the scanner aperture size and shape, (5) quantization errors from the thresholding operation, (6) scanner and printer noise, and (7) ink flow on paper during printing. This paper analyzes the visibility of Moiré patterns in terms of these factors. Moreover, the paper describes an approach to reduce the visibility of Moiré patterns by manipulating the Moiré formation factors directly rather than by post-scan processing. Computer-simulated and actual scanned images are presented to illustrate the approach.
A Multiple Resolution Approach To Regularization
Charles Bouman, Bede Liu
Regularization of optical flow estimates and the restoration of noisy images are examples of problems which may be solved by modeling the unknown field as a Markov random field (MRF) and calculating the maximum a posteriori (MAP) estimate. This paper presents a multiple resolution algorithm for maximizing the a posteriori probability associated with a class of MRF's. These MRF's combine the smooth regions found in Gaussian random fields with the abrupt boundaries characteristic of discrete valued MRF's. This makes them well suited for modeling image properties which vary smoothly over large regions but change abruptly at object boundaries. The multiple resolution algorithm first performs the maximization at coarse resolution and then proceeds to finer resolutions until the pixel level is reached. Since coarser resolution solutions are used to guide maximization at finer resolutions, the algorithm is resistant to local maxima and has performance equivalent to simulated annealing, but with dramatically reduced computation. In fact, the multiple resolution algorithm has been found to require less computation than local greedy algorithms because constraints propagate more rapidly at coarse resolutions. Regularization of optical flow problems and the restoration of fields corrupted with additive white Gaussian noise are explicitly treated and simulation results are presented.
Contrast In Images
Eli Peli, Robert B. Goldstein
The contrast of simple images such as sinusoidal gratings or a single patch of light on a uniform background is well defined, but this is not the case for complex images. When the Michelson definition, used for sinusoidal test patterns, is applied to complex scenes, the contrast of the whole picture may be defined based on only one point in the image. Human contrast sensitivity is known to be a function of the spatial frequency; therefore, the spatial frequency content of an image should also be considered in the definition of contrast. We propose a definition of local band-limited contrast in images that assigns a contrast value to every point in the image as a function of the spatial frequency band. For each frequency band, the contrast is defined as the ratio of the band-pass filtered image at that frequency to the low-pass filtered image filtered to an octave below the same frequency. This definition is useful for understanding the effects of enhancement techniques on image contrast. The definition can be implemented in the design of recognizable, minimal-contrast images, thus enabling optimal use of the available dynamic range of the display, and more efficient coding algorithms.
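A minimal sketch of the proposed contrast measure, assuming Gaussian low-pass filters as stand-ins for the band-limiting filters used by the authors: the band-pass component at a given scale is divided, point by point, by the low-pass image one octave below that scale.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def local_bandlimited_contrast(image, sigma, eps=1e-6):
    """Local band-limited contrast for one octave band: the band-pass image
    at a given scale divided by the low-pass image one octave below it."""
    img = np.asarray(image, dtype=np.float64)
    low = gaussian_filter(img, sigma)        # low-pass at the band's lower edge
    lower = gaussian_filter(img, 2 * sigma)  # low-pass one octave below
    band = low - lower                       # band-pass component for this octave
    return band / (lower + eps)              # per-pixel contrast for the band
```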
Pyramidal Edge Detection And Image Representation
A. Schrift, Y. Y. Zeevi, M. Porat
We present a scheme of extracting edge information from parallel spatial frequency bands. From these we create an integrated image of the most significant edges at different scales. The frequency bands are realized using the formalism of a Gaussian pyramid in which the levels represent a bank of spatial lowpass filters. The integrated edge image is created in a top-down algorithm, starting from the smallest version of the image. The sequential algorithm uses mutual edge information of two consecutive levels to control the processing in the lower one. This edge detection algorithm constitutes an image-dependent nonuniform processing scheme. Computational results show that only 20%-50% of the operations are needed to create an edge pyramid, compared to the number required in the regular scheme. The proposed generic scheme of image-dependent processing can also be implemented with operators other than edge detectors to exploit the advantages inherent in biological processing of images.
Orthographic Projection Views Generation From The Intensity Image
Chung Lin Huang
This paper presents an approach to the automatic generation of three orthographic projection (O.P.) views of 3-D objects from intensity images. Given images taken from different viewpoints of the designated object, an algorithm is developed to generate the same O.P. views.
Fusion Of Edge Maps In Color Images
C. J. Delcroix, M. A. Abidi
In this paper, a new analytic method for the detection of edges in color images is presented. This method focuses on the integration of three edge maps in order to increase one's confidence about the presence or absence of edges in a depicted scene. The integration process utilizes an algorithm developed by the authors under a broader research topic, the integration of registered multisensory data. It is based on the interaction between two constraints: the principle of existence, which tends to maximize the value of the output edge map at a given location if one input edge map features an edge, and the principle of confirmability, which adjusts this value according to the edge contents of the other input edge map at the same location by maximizing the similarity between them. The latter two maximizations are achieved using the Euler-Lagrange equations of the calculus of variations. This algorithm, which optimally fuses two correlated edge maps with regard to the above principles, is extended to the simultaneous fusion of three edge maps. Experiments were conducted using not only the red, green, and blue representation of color information but also other bases.
Mutual Information Of Images: A New Approach To Pyramidal Image Analysis
B. Hason, Y. Y. Zeevi
The similarity measure is defined as the amount of information necessary to transform one image into another. Application of this measure between pyramid elements reveals that significant data-compression, subject to controllable error, can be achieved by saving only the necessary information to pass from the lower-resolution image to its higher-resolution version. Further, this data can be estimated without any side information to obtain a super-resolution effect in images.
Pyramidal Image Representation In Nonuniform Systems
Y. Y. Zeevi, N. Peterfreund, E. Shlomot
In this paper we present a formalism for representation and processing of images which are not band-limited. Motivated by the properties of the visual system, we elaborated an example of a sampling scheme wherein the sampling rate decreases as a function of the distance from the center of the visual field. The broader class of images under consideration, obtained by the application of an appropriate projection filter, constitutes a Reproducing Kernel space which is characterized by "locally band-limited" properties. Sequential half-band filtering of such images generates a pyramidal scheme in the context of nonuniform systems and images.
A Study On Description And Recognition Of Objects Based On Segment And Normal Vector Distributions
Osamu Nakamura, Kouji Kobayashi, Hiroshi Nagata, et al.
Three dimensional (3-D) object recognition holds an important position in the study of artificial intelligence and has been investigated from various viewpoints. From an application standpoint, the visual sensors used in current industrial robots use only two dimensional image information in order to increase processing speed [1]. However, to advance factory automation it is necessary to improve the functions of robots and to introduce 3-D object recognition as a fundamental technique.
Unsupervised Segmentation Of Texture Images
Xavier Michel, Riccardo Leonardi, Allen Gersho
Past work on unsupervised segmentation of a texture image has been based on several restrictive assumptions to reduce the difficulty of this challenging segmentation task. Typically, a fixed number of different texture regions is assumed and each region is assumed to be generated by a simple model. Also, different first order statistics are used to facilitate discrimination between different textures. This paper introduces an approach to unsupervised segmentation that offers promise for handling unrestricted natural scenes containing textural regions. A simple but effective feature set and a novel measure of dissimilarity are used to accurately generate boundaries between an unknown number of regions without using first order statistics or texture models. A two stage approach is used to partition a texture image. In the first stage, a set of sliding windows scans the image to generate a sequence of feature vectors. The windowed regions providing the highest inhomogeneity in their textural characteristics determine a crude first-stage boundary, separating textured areas that are unambiguously homogeneous from one another. These regions are used to estimate a set of prototype feature vectors. In the second stage, supervised segmentation is performed to obtain an accurate boundary between different textured regions by means of a constrained hierarchical clustering technique. Each inhomogeneous window obtained in the first stage is split into four identical subwindows for which the feature vectors are estimated. Each of the subwindows is assigned to a homogeneous region to which it is connected. This region is chosen according to the closest prototype vector in the feature space. Any two adjacent subwindows that are assigned to different regions will in turn be considered as inhomogeneous windows and each is then split into four subwindows. The classification scheme is repeated in this hierarchical manner until the desired boundary resolution is achieved. The technique has been tested on several multi-texture images yielding accurate segmentation results comparable or superior to the performance obtained by human visual segmentation.
On-Line Recognition Of Cursive Korean Characters By Descriptions Of Basic Character Patterns And Their Connected Patterns
Heedong Lee, Masayuki Nakajima, Takeshi Agui
The present paper reports an on-line recognition method for cursive Korean characters. In the present method, we treat a Korean character pattern as a finite sequence of basic character patterns. After extracting candidate basic character patterns from an input character pattern, we determine the basic character patterns making up a Korean character from the candidates by connection processing. The basic character patterns and their connected patterns used in the present method are described according to their features. By extracting and connecting basic character patterns based on these descriptions, we improve both the descriptive power of the patterns and the processing speed. Precise pattern description and candidate extraction can absorb unstable writing movements and separate strokes reliably.
An Algorithm For Normalisation Of The Fourier Descriptor
A. Abo Zaid, E. Horne
This paper proposes a new method for normalisation of the Fourier descriptor which requires only simple calculation, thus saving computational time and load. Results obtained from simulation studies show that the method is both efficient and accurate.
Region Extraction Using A Dynamic Thresholding Pyramid
C. Horne, M. Spann
An algorithm is presented that is capable of extracting homogeneous regions from an image based on multi-resolution processing. The algorithm applies local clustering at each resolution level of a linked pyramid data structure, allowing seed nodes to be defined. These seed nodes are compact descriptions (root nodes) of regions at the base of the pyramid, appearing in the multi-resolution data structure at a level appropriate to their size. Without a priori knowledge, accurate segmentations are obtained by applying a merging process followed by a classification step. Results show the accuracy of the algorithm for a range of images, even at low signal-to-noise ratios.
Method Of Detecting Face Direction Using Image Processing For Human Interface
Kazunori Ohmura, Akira Tomono, Yukio Kobayashi
A method of non-contact detection of human face direction using a TV image and its application to pointing operations are presented. First, an algorithm for detecting the direction of a human face is described. In this method, the human face is defined as a plane having three feature points. The 2-D coordinates of these points projected on a TV image, together with the distances between the points measured on the face, are used to obtain the 3-D positions of the feature points. The normal vector of the plane, i.e., the face direction, can be calculated from these 3-D positions. Next, the accuracy of this algorithm is evaluated by experiment. Three blue marks are pasted on the human face as the feature points. These marks are extracted from the TV image by the chroma key method and the face direction is calculated by the above algorithm. This result is compared with the direction obtained from a high accuracy 3-D magnetic sensor attached to the subject's head. For real time detection, a fast method of finding the marks on a TV image is described. A window is set around each mark and the search process is applied within this window. The position of the window follows the marks' movement. Finally, to determine whether the processing speed is real time, a pointing operation that uses the face direction detected by this method is tested with a menu selection task. Results of these experiments show that 3-D face direction can be detected in real time, and that this method is effective in human interfaces such as menu selection.
Omnifont Character Recognition Based On Fast Feature Vectorization
H. C. Yung, I. M. Green
This paper presents major achievements made towards the development of a high-speed optical character recognition (OCR) workstation for characters of various fonts and sizes. The system is based upon an efficient feature extraction concept centred around an edge-vectorization technique. The resulting edges are mapped into a feature space from where a binary feature vector is built and subsequently fed to a standard statistical Bayesian classifier. The technique has been demonstrated on an IBM-PC/XT (without coprocessor) to operate at least 25 times the speed of conventional OCR techniques, achieving a 100% recognition rate with learned characters and 87% with unlearned.
Extracting Road Networks From Low Resolution Aerial Imagery
David Izraelevitz, Mark Carlotto
We present a new algorithm for the enhancement and detection of linear features such as roads in satellite imagery. A cost inversely related to the response of a local linear feature operator is associated with each pixel, and an analysis based on a dynamic programming procedure is performed to determine if each pixel under study is part of a "low cost" path of predominantly linear pixels. The technique has been applied to several SPOT and Landsat images and shown to reduce the false alarm rate (percentage of pixels incorrectly classified as roads) by 50% over techniques which rely on local information alone.
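The dynamic-programming step can be sketched as follows for a per-pixel cost image in which low cost corresponds to a strong linear-feature response; the left-to-right scan direction and the 8-connected path constraint are simplifying assumptions, not the exact procedure of the paper.

```python
import numpy as np

def min_cost_path(cost):
    """Dynamic-programming accumulation of a left-to-right minimum-cost path
    through a per-pixel cost image (lower cost = stronger linear-feature
    response). Returns the accumulated cost map and the best path rows."""
    h, w = cost.shape
    acc = cost.astype(np.float64).copy()
    back = np.zeros((h, w), dtype=int)
    for j in range(1, w):
        for i in range(h):
            lo, hi = max(0, i - 1), min(h, i + 2)          # 8-connected predecessors
            k = int(np.argmin(acc[lo:hi, j - 1])) + lo
            acc[i, j] = cost[i, j] + acc[k, j - 1]
            back[i, j] = k
    # trace the best path back from the cheapest endpoint in the last column
    path = [int(np.argmin(acc[:, -1]))]
    for j in range(w - 1, 0, -1):
        path.append(int(back[path[-1], j]))
    return acc, path[::-1]
```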
Mathematical Transform Of (R, G, B) Color Data To Munsell (H, V, C) Color Data
Makoto Miyahara, Yasuhiro Yoshida
In studying new-generation color image codings, it is very effective 1) to code signals in the space of the inherent tri-attributes of human color perception, and 2) to relate coding errors to the perceptual degree of deterioration. For these purposes, we have adopted the Munsell Renotation System, in which color signals of the tri-attributes of human color perception (Hue, Value and Chroma) and psychometric color differences are defined. In the Munsell Renotation System, however, intertransformation between (RGB) data and the corresponding color data is very cumbersome, because the intertransformation depends on a look-up table. This article presents a new method of mathematical transformation. The transformation is obtained by multiple regression analysis of 250 color samples, uniformly sampled from the whole color range that a conventional NTSC color TV camera can present. The new method can transform (RGB) data to Munsell Renotation System data far better than the conventional method given by the CIE(1976) L*a*b*.
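A sketch of the multiple-regression idea, assuming paired (R,G,B) and (H,V,C) samples taken from a renotation look-up table and a second-order polynomial regressor; the actual regressors and sample set used in the paper are not reproduced here.

```python
import numpy as np

def _design_matrix(rgb):
    """Second-order polynomial terms in R, G, B (an assumed regressor set)."""
    r, g, b = np.atleast_2d(rgb).reshape(-1, 3).T
    return np.column_stack([np.ones_like(r), r, g, b,
                            r * r, g * g, b * b, r * g, g * b, r * b])

def fit_rgb_to_hvc(rgb, hvc):
    """Least-squares multiple regression from (R,G,B) to (H,V,C).
    `rgb` and `hvc` are N x 3 arrays of paired training samples."""
    coeffs, *_ = np.linalg.lstsq(_design_matrix(rgb), hvc, rcond=None)
    return coeffs

def apply_rgb_to_hvc(coeffs, rgb):
    """Apply the fitted regression to new (R,G,B) values."""
    return _design_matrix(rgb) @ coeffs
```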
Postprocessing For Edge Enhancement In Low Bit Rate Coded Images
Ken Sauer
Low bit rate image coding results in artifacts at reconstruction whose characteristics depend on the form of the coding/decoding algorithm used. A very common, and visually objectionable effect of most coding methods is distortion of edges. Classical spatially-invariant filtering techniques are of little use in removing this signal-dependent reconstruction error. This paper presents an algorithm for locating and tracing perceptually important edges, and for enhancement through nonstationary filtering. Edge detection and enhancement are reduced to a sequence of simple one-dimensional processes. The algorithm yields encouraging experimental results when applied to images coded by the block list transform and vector quantization.
Reconstruction Of Non-Minimum Phase Multidimensional Signals Using The Bispectrum
M. R. Raghuveer, S. A. Dianat, G. Sundaramoorthy
While the bispectrum has been used for the reconstruction of non-minimum phase 1-D signals, not much work of a similar nature has been done for multidimensional signals. Here we present a technique for reconstructing 2-D non-minimum phase signals from samples of their bispectra. The reconstruction procedure involves recovering the magnitude of the Fourier transform of the signal from the bispectrum magnitude and its phase from the bispectrum phase. By using the bispectrum, the ambiguity regarding phase is considerably reduced (up to a linear phase factor) compared to the spectral factorization approach.
Parallel Image Sequence Coding By Adaptive Vector Quantization
H. D. Cheng, H. Sun, J. Y. Zhang
A vector quantizer is used to map a sequence of continuous or discrete vectors into a digital sequence for storage or communication, with the advantage of data compression, and it has received a great deal of attention from researchers. In the image sequence case, however, the computational complexity is very high and must be reduced in order to have practical applications. In this paper, a new codebook updating algorithm is proposed and its VLSI implementation is discussed. The VLSI architecture has several processing arrays which can reduce the time complexity from O(M x N x K) to O(M + N + K), where N is the number of vectors per frame, M is the dimension of the vector and K is the size of the codebook. It will be very useful for real-time information processing.
Adaptive Parameter Restricted Hough Transformation -Application To Extraction Of Car Number Plates-
Hyungjin Choi, Takeshi Agui, Masayuki Nakajima
Two methods of extracting the region of a car number plate, using the parameter restricted Hough transformation (PRH) and the hierarchical parameter restricted Hough transformation (HPRH), were previously proposed (1). These methods reduce the computational time and storage requirements. In the present report, an adaptive parameter restricted Hough transformation (APRH) is proposed. The APRH algorithm, in which the range of the parameter plane is adaptively restricted, gives higher performance than the previous two methods. As an example of the application of this method, the extraction of car number plates is described together with the results.
A Chinese Outline Font Generator With An Intelligence-Based Learning System
Bor-Shenn Jeng, Kuang-Yao Chang
A novel approach to generating Chinese outline characters is described in this paper. Our breakthrough method comprises two systems, an intelligence-based learning system and a real-time generating system. The former can automatically encode dot-matrix characters into outline characters through digitization, scale transformation and quality enhancement processes. The latter, through zooming, quality enhancement and outline-filling processes, can decode the outline characters in real time and generate high quality Chinese characters.
Goodness-Of-Fit Testing In The Presence Of Nuisance Parameters With Applications To Feature Selection And Pattern Recognition In Digital Image Processing
Nicholas A. Nechval
The objective of this paper is to focus attention on a new practicable statistical approach to goodness-of-fit testing which is based on the notion of sufficiency and provides a unified, efficient approach to the problem of test construction in the presence of nuisance parameters. The general strategy of this approach is to transform a set of random variables into a smaller set of independently and identically distributed uniform random variables on the interval (0,1), i.e., i.i.d. U(0,1) under the null hypothesis H0. Under the alternative hypothesis this set of random variables will, in general, not be i.i.d. U(0,1). In other words, we replace the composite hypotheses by equivalent simple ones. Any statistic which measures a distance from uniformity in the transformed sample can be used as a test statistic; for instance, standard goodness-of-fit procedures such as those based on the Kolmogorov-Smirnov and Cramér-von Mises statistics can be used. The results obtained are applicable to feature selection and pattern recognition. According to the proposed approach, the best subset of feature measurements is the subset which maximizes the likelihood function of the statistic that measures the distance from uniformity in the transformed sample. Examples are given for the sake of illustration.
A HVS-Weighted Cosine Transform Coding Scheme With Adaptive Quantization
King N. Ngan, Kin S. Leong, Harcharan Singh
An adaptive cosine transform coding scheme capable of real-time operation is described. It employs an adaptive quantization scheme in which the quantizer range is dynamically scaled by a parameter fed back from the rate buffer. To take human visual characteristics into account, human visual system (HVS) properties are incorporated into the coding scheme. Results show that the subjective quality of the processed images is significantly improved even at a low bit rate of 0.15 bit per pixel (bpp). Two images were coded with the adaptive scheme, achieving an average of 0.2 bpp with very little perceivable degradation.
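The buffer-feedback scaling can be sketched as a simple rule in which the quantizer step grows with rate-buffer occupancy; the linear scaling law and the gain k below are assumptions for illustration only, not the scaling used by the authors.

```python
def scaled_quantizer_step(base_step, buffer_fill, capacity, k=4.0):
    """Scale the quantizer step by rate-buffer occupancy: a fuller buffer
    gives a coarser quantizer and hence fewer bits per coded block."""
    fullness = min(max(buffer_fill / capacity, 0.0), 1.0)  # clamp to [0, 1]
    return base_step * (1.0 + k * fullness)
```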
Multistage Adaptive Vector Quantization With Least Squares Approximation
H. Sun, H. Hsu, C. N. Manikopoulos
In this paper, we propose an image coding scheme: multistage adaptive vector quantization with least squares approximation. In this scheme, an image of size 2^n x 2^n is first partitioned into 2^m x 2^m blocks (m < n). Each pixel in a block is normalized by subtracting the block mean and then dividing by the block deviation. The resulting values of each block are approximated by a 2 x 2 matrix using the least squares method. If the approximation error is greater than a preset threshold value, the 2^m x 2^m block is further partitioned into four 2^(m-1) x 2^(m-1) blocks, and each of those blocks is also approximated by a 2 x 2 matrix using the least squares method. This procedure is continued from the top level of 2^m x 2^m blocks to the bottom level of 4 x 4 blocks. To achieve more efficient transmission, vector quantization is applied to the 4- or 8-dimensional vector set on a level-by-level basis. The residual errors due to quantization at each level are fed forward and included in the next level; thus, it is possible to reprocess at the lower levels the information lost at the higher levels. The simulation results demonstrate that very well reconstructed images can be obtained at bit rates as low as 0.7 bits/pixel.
Adaptive Two-Dimensional Neighborhood Sensitivity Control By A One-Dimensional Process
Oliver A. Hilsenrath, Yehoshua Y. Zeevi
The adaptive characteristics of the retina, which operates over an extremely wide range of light intensities, exceed by far the performance of serially accessed sensors such as vidicons or C.C.D.'s. The latter, operating over the entire optical field (or sampled grid) as a single neighborhood, are easily cut off by small areas of high intensity, resulting in the well known effect of "obscured images". Implementation of the biological solution of two-dimensional neighborhood processing, in which sensitivity at a given point is determined by the surrounding (receptive) field, would be the best solution. This approach requires, however, a new generation of still unavailable detector arrays with random access, neighborhood processing and local sensitivity feedback. The approach proposed in this paper is devised for sequentially accessed detectors/detector-arrays with one-dimensional processing, in which a Peano-Hilbert-like scan-path acquires visual data, thereby preserving a compact two-dimensional neighborhood relation. Neighborhood suppression is controlled by the expected value of a one-dimensional string, mapped onto the relevant two-dimensional neighborhood and subtracted, in turn, from the intensity of the delayed central element of the confined data string.
Two-Dimensional Systolic Arrays For Two-Dimensional Convolution
Hon Keung Kwan, T. S. Okullo-Oballa
In this paper, two two-dimensional systolic arrays are derived from matrix-vector formulations of two-dimensional convolution. One array has a serial input and a serial output and uses a minimum number of multipliers, while the other array has parallel inputs and parallel outputs and is suitable for high-speed processing using slow processing elements. Both arrays are modular with nearest neighbour communications and are suitable for VLSI implementation.
Textural Analysis Of 3D Objects By A Fractal-Preserving Approach
C. Dambra, S. Dellepiane, C. Regazzoni, et al.
The natural world is typically made up of 3D objects characterized by more or less complex internal structure. The purpose of this work is to investigate some 3D natural objects by fractal geometry and to discuss the results of texture-based segmentation. We suggest extending the use of fractals to 3D (voxel) data by a fractal-preserving approach. This approach achieves two opposing goals: reduction of noise effects and homogeneity of fractal dimension (F) values. To calculate F we select the most homogeneous neighbourhood (in terms of fractal dimension) around each voxel using a focusing mechanism and a minimum-variance criterion. The validity of the proposed methodology has been confirmed by analyzing a Nuclear Magnetic Resonance (NMR) spatial sequence of a human head; in fact, different complex structures have been enhanced.
Universal Systolic Architecture For Morphological And Convolutional Image Filters
Edward R. Dougherty, Charles R. Giardina
Systolic arrays are employed in a number of computational settings where a simple computation is repeated a large number of times and where the algorithm exhibits a high degree of concurrency. The present paper demonstrates the relationship between a universal algebraic paradigm appearing in image algebra and a corresponding universal systolic scheme that can be employed upon algorithms that fit the paradigm.
The Compression Of Digital Images Using Graphic Signal Processor
David R. Ahlgren, Hasan Gadjali
The need for a compression system to significantly reduce the size of high resolution gray-scale and full-color image (picture) files, which can range from 100 Kbytes to over 3 Mbytes, stimulated an extensive research program that resulted in the development of new image compression systems. These new compression systems, based on well established compression schemes and running on a Graphic Signal Processor (GSP), can reduce image file sizes by a factor of more than 8:1 with only minor detectable image degradation. Image compression of this kind is of interest because the GSP is now an integral part of display control; the need for a separate and expensive DSP for discrete cosine transform computation is eliminated. This paper discusses the importance of this compression scheme in the GSP world in terms of cost, speed and performance.
Design Of Perimeter Estimators For Digitized Planar Shapes
J. Koplowitz, A. Bruckstein
Measurement of perimeters of planar shapes from their digitized images is a common task of computer vision systems. A general methodology for the design of simple and accurate perimeter estimation algorithms is described. It is based on minimizing the maximum estimation error for digitized straight edges over all orientations. A new perimeter estimator is derived and its performance is tested on digitized circles using computer simulations. The experimental results may be used to predict the performance of the algorithms on shapes with arbitrary contours of continuous curvature. The simulations also show that fast and accurate perimeter estimation is possible even for objects that are small relative to pixel size.
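A weighted chain-code perimeter estimator of the kind discussed above can be written as follows; the naive weights 1 and sqrt(2) are used here purely as placeholders, whereas the paper's contribution is to choose the weights so as to minimise the maximum error over all digitized straight-edge orientations.

```python
import numpy as np

def perimeter_estimate(chain, w_axial=1.0, w_diag=2.0 ** 0.5):
    """Weighted chain-code perimeter estimator. `chain` is a Freeman
    8-direction code (values 0..7): even codes are axis-aligned moves,
    odd codes are diagonal moves. Calibrated weights would replace the
    naive (1, sqrt(2)) pair used here."""
    chain = np.asarray(chain)
    n_axial = int(np.sum(chain % 2 == 0))
    n_diag = int(np.sum(chain % 2 == 1))
    return w_axial * n_axial + w_diag * n_diag
```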
On-Line Visual Data Compression Along A One-Dimensional Scan
Noam Sorek, Yehoshua Y. Zeevi
A new method for on-line visual data compression is presented, implementing nonuniform data selection according to gray-level neighborhood relations computed along a one-dimensional Hilbert scanning trajectory. Selection of a newly-scanned point (sample) is based on comparison of its gray level with a threshold determined by neighboring points previously scanned along the trajectory. Inherent in this algorithm is a mechanism for image pre-emphasis over areas (such as edges) exhibiting rapid changes in gray level, whereas the reconstructed gray-level distribution over such areas is less accurate. Image reconstruction from the resulting (partial) subset of data is effected with the aid of a zero-order-hold algorithm. At a rate of 0.38 bit/pixel obtained over a monochrome image of 512 by 512 pixels, the quality of the reconstructed image appears to be satisfactory. Although it is not as good as that obtained with some of the best compression techniques, there is a clear advantage in using the proposed technique in cases where on-line processing and/or compression is important.
HDTV And The Emerging Broadband ISDN Network
Jules A. Bellisio, Kou-Hu Tzou
The Broadband Integrated Services Digital Network (BISDN) based on lightwave technology is expected to become the platform for all-purpose exchange area communications. It is likely that for residential applications high-quality television transmission will initially present the majority of traffic to the BISDN network. The transport of High-Definition Television (HDTV) must be achieved economically, and this requirement has an important influence on the overall BISDN design. In this paper, we present an overview of the emerging BISDN, examine its architecture, networking and control capabilities, and address various issues related to the transport of HDTV on this new broadband network.
Single-Channel High-Definition Television Systems, Compatible And Noncompatible
William F. Schreiber, Andrew B. Lippman
We describe the efforts to develop improved television systems, particularly for broadcasting and cable use. Both technical and regulatory issues are discussed. The questions of compatibility with the large number of existing receivers and with current channelization schemes are stressed. The desirable properties of new systems are given, and methods of improvement outlined. Two systems under development are presented, a receiver-compatible system suitable for immediate over-the-air applications, and a noncompatible (but channel-compatible) scheme suitable for immediate use on cable.
Embedding And Recovering Auxiliary Video Information Using Cooperative Paired Processing
Michael A. Isnardi, Terrence R. Smith, Barbara J. Roeder
The problem of transmitting widescreen HDTV signals to the home in a bandwidth-efficient manner has elicited many novel proposals, each having its own set of trade-offs. Most proposals assume some degree of compatibility with the present NTSC transmission standard. A small number of these proposals, such as the Advanced Compatible Television system,1,2 claim "full single-channel" compatibility by requiring the encoded signal to "look like" a normal picture on a conventional NTSC receiver and to occupy no more bandwidth than a standard 6 MHz RF channel. This paper discusses how cooperative paired processing can be used in video systems to simultaneously embed auxiliary video information within a main signal and to insure crosstalk-free recovery at the decoder.
A Hierarchical HDTV Coding System Using A DPCM-PCM Approach
T. C. Chen, Kou-Hu Tzou, P. E. Fleischer
A compatible high-definition television (HDTV) coding system is described in this paper. This scheme allows the HDTV signal to be transmitted at a channel rate of 135 Mbps while an extended-quality television (EQTV) signal is encoded into a 45 Mbps subchannel. Three major coding strategies are used to achieve this goal: first, a hierarchical one-stage pyramid coding structure is adopted to separate the HDTV signal into two paths; second, a DPCM scheme with run-length coding and a novel modified quantization scheme is used to compress the first path into a 45 Mbps stream for EQTV; and third, a simple PCM scheme is used to compress the second, residual path into a 90 Mbps stream. At the receiver, the first path alone is used to reconstruct the EQTV signal, or the two paths are combined to generate the HDTV signal. This scheme has the advantage of relatively low hardware complexity, since the minimum number of channels is used (for a complementary approach) and, since it is implementable with concurrent processing, no scan conversion buffers are required. Also, the proposed pyramid structure incorporates a feedback loop for the coding of the complementary high-definition residual signal. The system design details are investigated and computer simulation results are presented. Both the HDTV and EQTV reconstructed picture quality were found to be very good.
Quadtree-Structured Interframe Coding Of HDTV Sequences
Peter Strobach
A new coding technique for high definition television (HDTV) image sequences is proposed. This technique makes use of the fact that the prediction error image in a motion-compensated temporal DPCM structure is weakly correlated in the spatial dimensions. As a consequence, it can be shown that the prediction error image can be directly coded using an adaptive quadtree mean decomposition coder. This new scheme is termed quadtree-structured difference pulse code modulation (QSDPCM). As an important property, the QSDPCM method does not rely on computationally burdensome spatial decorrelation techniques; hence, the decoder complexity of the QSDPCM algorithm is extremely low. This makes the QSDPCM algorithm an attractive candidate for application in future broadcast HDTV networks, where a large number of high-quality but low-cost decoders may be an issue. Simulations have been carried out for typical HDTV sequences at 125 Mbit/s and 250 Mbit/s, where the primary data rate was 827 Mbit/s. A significant improvement of this coder compared to an earlier version was achieved by incorporating half-pixel-accuracy motion compensation. Besides the practical considerations, four new and interesting theorems are obtained.
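The quadtree mean decomposition of a prediction-error block can be sketched recursively: a block that is well represented by its mean becomes a leaf, otherwise it is split into four quadrants. The splitting criterion, the threshold and the minimum block size below are illustrative assumptions, not the exact coder of the paper.

```python
import numpy as np

def quadtree_mean_decompose(block, thresh, min_size=2, origin=(0, 0)):
    """Recursively decompose a square, power-of-two sized error block: if the
    block is well represented by its mean, emit a leaf (row, col, size, mean);
    otherwise split it into four quadrants and recurse."""
    size = block.shape[0]
    mean = float(block.mean())
    if size <= min_size or np.abs(block - mean).max() <= thresh:
        return [(origin[0], origin[1], size, mean)]
    half = size // 2
    r, c = origin
    leaves = []
    for dr, dc in ((0, 0), (0, half), (half, 0), (half, half)):
        sub = block[dr:dr + half, dc:dc + half]
        leaves += quadtree_mean_decompose(sub, thresh, min_size, (r + dr, c + dc))
    return leaves
```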
Feedforward Control Of Hybrid Transform / Predictive Coder
Takashi Mochizuki, Jun-ichi Ohki
A new feedforward method for controlling encoding parameters is proposed for low bit rate video encoding. At low transmission bit rates, not all input video frames can be transmitted, and the encoding frame interval varies with the amount of information in each encoded frame. Subjective video quality is evaluated from both picture degradation and encoding frame intervals. Because picture degradation and encoding frame intervals are traded off against each other, encoding parameter control is very important. The proposed method is based on a hybrid transform/predictive coding scheme with adaptive coefficient selection. The method estimates the amount of information for several sets of encoding parameters and selects the parameters so that given control characteristics can be achieved. Computer simulations of both the proposed method and a conventional feedback method show the effectiveness of the new method. This method will be applied to an NETEC series low bit rate video codec.
Overlapping Block Transform For Image Coding Preserving Equal Number Of Samples And Coefficients
H. Schiller
An approach to transform coding of images is presented which uses overlapping blocks without increasing the number of coefficients. This is achieved by subsampling in the frequency domain, thus causing spatial domain aliasing in each block. This spatial domain aliasing is, however, cancelled out by an overlapping addition of the backward transform output of neighboring blocks. Conditions are stated under which this aliasing cancellation property holds even when a spatial window function is incorporated into the transform. By the design of the window function, the block boundary artefacts typical of conventional transform coding can be completely avoided. The application of the 2-dimensional overlapping block transform (2D-OBT) to picture coding is examined, and the nature and visibility of quantization errors in the case of lossy coding are analyzed. The incorporation of the 2D-OBT in a low bitrate hybrid image sequence coding environment is described. Simulation results are presented, showing the achievable picture quality at 64 kBit/s using the 2D-OBT as compared to the DCT.
A Novel Method For Compressing Images Using Discrete Directional Transforms
G. Bjontegaard
The main purpose of this paper is to present a coding method based on a new class of transforms which will be named Directional Transforms (DT). The idea behind using DTs is that the picture content determines which one of several transforms (16 are used here) describes the picture most efficiently. Most of the other aspects of the coding method are well known from hybrid coding schemes [2,3], such as two dimensional block transform, prediction by blockwise motion compensation and variable length coding (VLC) of the transform coefficients. It is shown that the use of DTs both saves bits in describing natural images and gives subjectively better looking pictures due to cleaner reconstruction of edges and lines.
Very Low Bit Rate Image Sequence Coding Using Object Based Approach
Ari Nieminen, Murat Kunt, Maurice Gisler
A very low bit rate image sequence coding scheme is introduced. It is based on directional decomposition, which was used previously in static image coding. An image sequence is decomposed into a low-frequency sequence and a set of directional high-frequency sequences. Very high compression is obtained by detecting and coding the movements of directional edges extracted from directional high-frequency sequences. At this stage of research, data rates from 50 to 60 kbit/s can be achieved, though by refining the method even larger compression is expected.
A High Quality Videophone Coder Using Hierarchical Motion Estimation And Structure Coding Of The Prediction Error
Michael Gilge
For the realization of a videophone for use with the ISDN at data rates as low as 64 kbit/s, powerful coding algorithms are required. In this paper, new concepts for motion estimation and residual error coding, which are the most important parts of a hybrid codec, are introduced. Fast motion estimation algorithms found in many approaches assume a monotonic behavior of the matching criterion with respect to the displacement; this assumption does not hold in reality. Therefore, a hierarchical block matching motion estimation algorithm is introduced which approaches the performance of a full search at a fraction of the computational cost. The algorithm combines a full search with a vector quantization as a first stage. In a second stage, a full search with sub-pel accuracy is performed only over the small region corresponding to the codeword of the first stage. The residual error image is obtained by motion compensated prediction. A statistical analysis of this image shows that methods for data compression of natural images, e.g. transform coding, are not suitable for coding the prediction error images. Instead, a structure coding scheme is proposed: the special signal statistics of the error signal allow a coarse quantization in the original domain, and the dominant block types are coded using a quadtree. The scheme offers the additional advantage of easy implementation, unlike the DCT. Computer simulations show an improved picture quality even for strong scene movement, without the usual artifacts such as blocking.
A Realization Of MC/DCT By Video Signal Processors
Yasuhiro Kosugi, Kiyoshi Sakai, Takahiro Hosokawa, et al.
A video signal processor is introduced in this paper, for which the multi-processor configuration and the required operations are studied. The hybrid MC/DCT coding scheme is noted as a highly efficient algorithm for low bit rate coding. However, it has the problem that the decoded pictures at the receiver are spoiled if the inverse-DCT calculations are not identical in the transmitter and the receiver. Two approaches, DC coefficient separation and block-by-block conversion of the transfer range, are therefore taken to solve this IDCT mismatch problem by improving the precision of the IDCT calculation. Three methods of deciding the transfer range are studied. Computer simulations are performed to verify the performance of these methods, and the application of the processor to each method is also studied.
VLSI Architectures For Block Matching Algorithms
P. Pirsch, T. Komarek
This paper discusses architectures for realization of block matching algorithms with emphasis on highly concurrent systolic array processors. A three step mapping methodology for systolic arrays known from the literature is applied to block matching algorithms. Examples of 2-dimensional and 1-dimensional systolic arrays are presented. The needed array size, the transistor count and the maximum frame rate for processing video telephone and TV signals have been estimated.
VLSI Implementation Of Motion Compensation Full-Search Block-Matching Algorithm
Kun-Min Yang, Lance Wu, Hyonil Chong, et al.
In this paper, we describe a single chip VLSI implementation of a motion compensation algorithm for a very low bit-rate motion video codec. Our design aims at implementing the Block Matching Algorithm (BMA). The novel features of our design are the following: 1. It has full-search capability. 2. It allows sequential data inputs but performs parallel processing with 100% efficiency. 3. Common buses are used for data transfer. 4. It is a highly modular design and easy to expand. 5. It contains testing circuitry. The design has been laid out. Simulation results show that, using a double-metal 1.2µ CMOS process, the design will be able to run at up to 25 MHz. The schematic design for a fractional-precision block matching system is also described in this paper.
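Functionally, the chip computes the exhaustive block match sketched below in software form; the block size, search range and sum-of-absolute-differences criterion are the usual choices and are assumed here for illustration of the algorithm rather than taken from the paper.

```python
import numpy as np

def full_search_bma(prev, curr, bsize=16, search=8):
    """Exhaustive (full-search) block matching: for every block of the current
    frame, find the displacement into the previous frame that minimises the
    sum of absolute differences (SAD)."""
    h, w = curr.shape
    vectors = {}
    for bi in range(0, h - bsize + 1, bsize):
        for bj in range(0, w - bsize + 1, bsize):
            cur = curr[bi:bi + bsize, bj:bj + bsize].astype(np.int32)
            best, best_sad = (0, 0), np.inf
            for di in range(-search, search + 1):
                for dj in range(-search, search + 1):
                    i, j = bi + di, bj + dj
                    if i < 0 or j < 0 or i + bsize > h or j + bsize > w:
                        continue
                    ref = prev[i:i + bsize, j:j + bsize].astype(np.int32)
                    sad = np.abs(cur - ref).sum()
                    if sad < best_sad:
                        best_sad, best = sad, (di, dj)
            vectors[(bi, bj)] = best
    return vectors
```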
Bit-Serial Architecture For Real Time Motion Compensation
Raffi Dianysian, Richard L. Baker
We describe a bit-serial VLSI architecture for a real time motion estimation chip. The chip can search windows of arbitrary size with integer displacement resolution. Using 3 micron CMOS, it is projected to perform up to 6 million matches per second. This would permit real time exhaustive motion estimation of 8 x 8 blocks on 16 x 16 windows at NTSC resolution and 20 frames/sec. A short design time, without silicon assemblers or compilers, for the high speed chip is made possible by its bit-serial architecture.
Architecture Of A Programmable Real-Time Processor For Digital Video Signals Adapted To Motion Estimation Algorithms
Thomas Wehberg, Hans Volkers
Real-time processing of digital pictures from a video sequence requires several hundred mega-operations per second (MOPS) at the picture element (pel) level and a correspondingly high data rate for the transport of pels to the processing unit. A real-time video processor (RTVP) can achieve the necessary processing speed and data rate by employing an adapted architecture. By analyzing distinct algorithms from the picture coding area, with special consideration of motion estimation (ME) methods, features of the algorithms have been extracted that impose requirements on the RTVP architecture. In order to achieve the required throughput, an architecture utilizing parallel and pipelined processing, with programmability restricted to a set of algorithms, has been derived by adapting to the algorithm requirements. This contribution focuses on the algorithm analysis and on the resulting architectural concepts. Some detail on an implementation of the RTVP architecture being designed in standard-cell CMOS technology is also presented.
A Pel-Recursive Wiener-Based Algorithm For The Simultaneous Estimation Of Rotation And Translation
J. Biemond, J. N. Driessen, A. M. Geurtz, et al.
In this paper we derive a pel-recursive Wiener-based algorithm for the estimation of rotation and translation parameters in consecutive frames of an image sequence. The estimator minimizes the transformed frame difference (tfd), and the derivation is based on a linearization of this tfd and a Wiener solution of the resulting stochastic linear observation equations. The use of a causal window and a segmentation algorithm allows the algorithm to work in a pel-recursive coding environment. Experiments on synthetic transformed data show that the algorithm is capable of estimating the motion parameters. Experiments on real coding data show a small gain in coding efficiency compared with existing pel-recursive displacement estimators. However, performance can be improved by a better segmentation algorithm and by taking into account the effect of motion boundaries.
Adaptive Schemes For Motion-Compensated Coding
A. Puri, H. M. Hang
Several variations on the popular motion-compensated interframe block-coding schemes are proposed. The idea behind these schemes is to handle each image block with a different parameter and/or algorithm based on the contents of that individual block. They are extensions of the basic nonadaptive algorithm and are designed to reduce the coding bit rate and to improve the reconstructed picture quality. Our goal in this paper is to explore the potential merits offered by these adaptive techniques. Three schemes are described in this paper: (1) adaptive block-size motion compensation and coding, (2) multiple-transform coding, and (3) DCT and cluster (pel-domain) hybrid coding. As with many other adaptive algorithms, each of the above schemes has a number of parameters that need to be chosen carefully for optimum performance. We make no such exhaustive attempt; rather, the results presented here are based on a limited set of experiments using a few preselected parameters. However, we find that the experiments conducted still provide us with some insight into issues concerning adaptive image-sequence coding schemes, such as the performance-versus-complexity trade-off.
Displacement Estimation For Image Predictive Coding And Frame Motion-Adaptive Interpolation
George Tziritas
We present and investigate methods of displacement vector estimation which can be used in predictive image coding or in motion-adaptive frame interpolative coding. In both cases, a differential approach is adopted, meaning that the displacement vector estimation is based on the measurement of the spatiotemporal gradient of the image sequence. In the case of predictive coding, the motion estimator is composed of three parts. The first is a spatial predictor of the displacement vector, based on a spatial autoregressive relation of the velocity field; the coefficients of this relation are made intensity-dependent. The presence of discontinuities, inherent to the motion and to the instabilities of the estimation algorithm, makes a detection stage for all types of discontinuities necessary. Finally, an a posteriori estimator, of iterative form, carries out the displacement vector estimation. The application of this algorithm to a very noisy image sequence yielded a gain of about 40% in the absolute value of the difference with respect to the predicted displacement vector, and 65% with respect to the estimated one after two iterations. The same structure of displacement field estimation can be used to make frame interpolation motion-adaptive. We present such an estimator, which is slightly different from the one proposed for predictive coding, since a non-causal a priori estimation can be realized. We also propose a completely different algorithm, which uses a hierarchical representation of the displacement field; a dynamic hierarchical 4-tree estimation algorithm is presented. The motivation for this proposal comes from the fact that, in a great number of image sequences, large areas are non-moving or have homogeneous two-dimensional (translational) motion. Considering this redundancy in the displacement field, a significant gain in computational complexity can be obtained if the estimation is carried out over blocks rather than pel by pel. We have applied the latter algorithm to a sequence with moderate motion, and some numerical results are given in this article.
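The differential (spatio-temporal gradient) principle underlying such estimators can be illustrated with a standard least-squares gradient solver over a small window. This is a generic textbook-style estimator, not the intensity-dependent autoregressive predictor or the a posteriori iterative estimator of the paper, and the window size and boundary assumption are illustrative choices.

```python
import numpy as np

def gradient_displacement(prev, curr, i, j, half=4):
    """Differential displacement estimate for the window centred at (i, j),
    assumed to lie fully inside both frames: solve the least-squares system
    built from the spatial and temporal gradients of the image sequence."""
    win = (slice(i - half, i + half + 1), slice(j - half, j + half + 1))
    p = prev.astype(np.float64)
    c = curr.astype(np.float64)
    gy, gx = np.gradient(p)      # spatial gradients of the previous frame
    gt = c - p                   # temporal gradient (frame difference)
    A = np.column_stack([gx[win].ravel(), gy[win].ravel()])
    b = -gt[win].ravel()
    d, *_ = np.linalg.lstsq(A, b, rcond=None)
    return d                     # estimated (dx, dy) in pels
```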
Displacement Estimation By Hierarchical Blockmatching
M. Bierling
A hierarchical blockmatching algorithm for the estimation of displacement vector fields in digital television sequences is presented. Known blockmatching techniques frequently fail as a result of using a fixed measurement window size. By using distinct measurement window sizes at the different levels of a hierarchy, the presented blockmatching technique yields reliable and homogeneous displacement vector fields which are close to the true displacements, rather than merely a match in the sense of a minimum mean absolute luminance difference. In the environment of a low bit rate hybrid coder for image sequences, the hierarchical blockmatching algorithm is well suited for both motion compensating prediction and motion compensating interpolation. Compared to other highly sophisticated displacement estimation techniques, the computational effort is decreased drastically. Due to its regularity and the very small number of necessary operations, the presented hierarchical blockmatching algorithm can be implemented in hardware very easily.
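A coarse-to-fine sketch of hierarchical block matching, assuming simple 2x subsampling between levels, a SAD criterion, and a fixed small refinement range at each level; the actual window sizes and matching criteria of the paper's hierarchy are not reproduced here.

```python
import numpy as np

def _sad_search(prev, curr, bi, bj, bsize, center, srange):
    """SAD block match around a predicted displacement `center`."""
    h, w = prev.shape
    cur = curr[bi:bi + bsize, bj:bj + bsize].astype(np.int32)
    best, best_sad = center, np.inf
    for di in range(center[0] - srange, center[0] + srange + 1):
        for dj in range(center[1] - srange, center[1] + srange + 1):
            i, j = bi + di, bj + dj
            if i < 0 or j < 0 or i + bsize > h or j + bsize > w:
                continue
            sad = np.abs(cur - prev[i:i + bsize, j:j + bsize].astype(np.int32)).sum()
            if sad < best_sad:
                best_sad, best = sad, (di, dj)
    return best

def hierarchical_bm(prev, curr, bi, bj, levels=3, bsize=16, srange=4):
    """Estimate the displacement of the block at (bi, bj): match first on
    2x-subsampled copies of both frames (large effective window), then refine
    the propagated vector at each finer level with a small local search."""
    pyr = [(prev, curr)]
    for _ in range(levels - 1):
        p, c = pyr[-1]
        pyr.append((p[::2, ::2], c[::2, ::2]))
    d = (0, 0)
    for lvl in range(levels - 1, -1, -1):
        p, c = pyr[lvl]
        scale = 2 ** lvl
        d = _sad_search(p, c, bi // scale, bj // scale,
                        max(4, bsize // scale), d, srange)
        if lvl > 0:
            d = (2 * d[0], 2 * d[1])   # propagate to the next finer level
    return d
```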
Variable Rate Video Transport In Broadband Packet Networks
S. H. Lee, L. T. Wu
Broadband packet networking techniques based on high speed electronics and lightwave technology promise the fully integrated network of the future. Among other features, broadband packet networks could provide an efficient transport capability for variable rate real-time video traffic. This requires a careful tradeoff between network and terminal design complexity. In this paper, terminal and network design alternatives are investigated for end-to-end variable rate video transport in a broadband packet network. In particular, two different types of transport for a single video connection are presented using a hierarchical video coding scheme. This would allow efficient bandwidth sharing and minimum degradation in video quality.
Variable Bit Rate Video Coding In ATM Networks
W. Verbiest, L. Pinnoo, B. Voeten
Several aspects of the coding of video services in an ATM network environment have been studied. Variable Bit Rate (VBR) coding is discussed and the conditions for its effective application are explained. A method for the synchronization of VBR coded video services under the constraints of cell delay jitter is proposed. The role of the input buffer at the decoder side is explained, and the size is calculated for VBR connections. The influence of cell loss on picture quality is discussed and it is explained how cell loss can be concealed using a layered coding model.
Television Compression Algorithms And Transmission On Packet Networks
R. C. Brainard, J. H. Othmer
Wide-band packet transmission is a subject of strong current interest. The transmission of compressed TV signals over such networks is possible at any quality level. There are some specific advantages in using packet networks for TV transmission: any fixed data rate can be chosen, or a variable data rate can be utilized. On the negative side, however, packet loss must be considered and differential delay in packet arrival must be compensated for. The possibility of packet loss has a strong influence on the choice of compression algorithm. Differential delay of packet arrival is a new problem in codec design. Some issues relevant to the mutual design of transmission networks and compression algorithms will be presented. An assumption is that the packet network will maintain packet sequence integrity. For variable-rate transmission, a reasonable definition of peak data rate is necessary. Rate constraints may be necessary to encourage instituting a variable-rate service on the networks. The charging algorithm for network use will have an effect on the selection of compression algorithm. Values for, and procedures for implementing, packet priorities are discussed. Packet length has only a second-order effect on packet-TV considerations. Some examples of a range of codecs for differing data rates and picture quality are given; these serve to illustrate sensitivities to the various characteristics of packet networks. Perhaps more importantly, we discuss what we do not know about the design of such systems.
Fixed Distortion, Variable Rate Subband Coding Of Images
John C. Darragh, Richard L. Baker
Most image coding systems are designed for transmission over constant bit rate channels. Such systems usually achieve varying coded image quality, depending on the inherent "compressibility" of the transmitted image. From a user's perspective it is desirable to have a codec which achieves a prescribed level of fidelity regardless of the image sent. The encoder output in this case is necessarily variable rate, making it difficult to reconcile with constant bit rate channels. However, the impending availability of high performance packet switched networks compatible with variable rate sources will make variable rate coding feasible. The codec design in this paper is based on a new method of allocating distortion rather than bits among the subbands. Judicious selection of subband quantizers compatible with the allocation procedure produces a simple four band encoder structure. Several four band structures are then nested in a hierarchical fashion for better compression performance; the resulting image codec achieves mean square distortion within 1.5 dB of the user specified value for a variety of images and over a wide range of distortions. Its rate-distortion performance rivals fixed rate systems of similar complexity. The design can be configured to take best advantage of datagram networks or prioritized packet-switched networks and the encoder output is suitable for progressive image transmission applications.
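A minimal sketch of fixed-distortion subband quantisation is given below: each subband is quantised with a uniform step chosen so that its mean-square error approximates a user-specified target (step^2/12 under the usual high-rate model). This is a generic illustration of distortion allocation, not the authors' allocation procedure or four-band nested structure.

```python
# Fixed-distortion uniform quantisation of one subband (illustrative model).
import numpy as np

def quantise_subband(band, target_mse):
    step = np.sqrt(12.0 * target_mse)            # high-rate uniform-quantiser model
    indices = np.round(band / step).astype(int)  # symbols to be entropy coded / sent
    recon = indices * step                       # decoder reconstruction
    actual_mse = np.mean((band - recon) ** 2)
    return indices, recon, actual_mse
```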
Variable-Bit-Rate Coding Capable Of Compensating For Packet Loss
Kazunori Shimamura, Yasuhito Hayashi, Fumio Kishino
Asynchronous Transfer Mode (ATM) is expected to be one of the important variable-bit-rate methods for video transmission. Packet loss has the greatest influence on picture quality in video networks. This paper proposes a layered coding technique suitable for ATM using discrete cosine transform (DCT) coding. The proposed layered coding separates coded information into most significant parts (MSPs) and least significant parts (LSPs) and gives MSP packets priority over LSP packets to reduce the influence of packet loss on picture quality. The influence of packet loss on picture quality is also described, and the effectiveness of the proposed layered coding is confirmed with decoded pictures.
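As a rough illustration of such a two-layer split, the sketch below orders the DCT coefficients of an 8x8 block from low to high frequency and assigns the first few to the MSP (high-priority cells) and the rest to the LSP (low-priority cells). The frequency ordering and the cut-off value are assumptions, not the paper's exact rule.

```python
# Illustrative MSP/LSP split of 8x8 DCT coefficients for prioritised packets.
import numpy as np
from scipy.fftpack import dct

def layered_split(block, n_msp=10):
    """Return (MSP, LSP) lists of (coefficient index, value) pairs."""
    c = dct(dct(block.astype(float), axis=0, norm='ortho'), axis=1, norm='ortho')
    u, v = np.meshgrid(np.arange(8), np.arange(8), indexing='ij')
    low_freq_first = np.argsort((u + v).ravel(), kind='stable')  # rough zig-zag order
    flat = c.ravel()
    msp = [(int(i), flat[i]) for i in low_freq_first[:n_msp]]    # high-priority cells
    lsp = [(int(i), flat[i]) for i in low_freq_first[n_msp:]]    # low-priority cells
    return msp, lsp
```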
Investigation Of Threshold Dependence In Adaptive Vector Quantization For Image Transmission In Packet Switched Networks
C. N. Manikopoulos, H. Sun, H. Hsu
In heavily congested packet-switched networks, instead of trying, a priori, to predict the load offered to the network by a particular set of users, a preferred approach may be to control that load. In order to effect dynamic control of the network load, the user (video encoder) and the service provider (packet-switched network) must be interactively linked, in marked departure from the traditional source-network separation. Moreover, the resulting source-network system must be capable of dynamically adjusting the values of the average bit rate (B) and/or distortion (D) of the variable bit rate transmission source. Adaptive vector quantization has been employed to encode the image source with constant local picture quality. An activity index, A, has been used to classify image areas into two groups, active and non-active, according to whether A>T or A<T, respectively; T is a heuristic threshold value. For non-active areas a large block size is used, while for active areas the block size is small. Two codebooks are generated, corresponding to each of the two groups of blocks formed. Linearized parametric expressions for B and D have been derived for 4x4 blocks and for several 256x256 8-bit images, of the form B^(-m) = a + b*T and D = c + d*T. The value of m has been found to be approximately equal to 4.0 for all images studied, while a, b, c, and d vary with the image and can be initially estimated by sampling. By allowing the determination of the proper operating-point value for T, these relations may provide the required "handle" for the source-network system to effect dynamic adjustments to the distortion, and thus the average picture quality level, and/or the average bit rate, and thus the source load to the network.
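The block classification step described above can be sketched as follows: blocks whose activity index A exceeds the threshold T are coded with the small-block (active) codebook, and the rest with the large-block (non-active) codebook. The variance-based activity index used here is an assumption for illustration.

```python
# Activity-index block classification for threshold-adaptive VQ (sketch).
import numpy as np

def classify_blocks(image, block=4, threshold=100.0):
    """Return lists of (y, x) positions of active and non-active blocks."""
    h, w = image.shape
    active, non_active = [], []
    for y in range(0, h - block + 1, block):
        for x in range(0, w - block + 1, block):
            blk = image[y:y + block, x:x + block].astype(float)
            activity = blk.var()                    # activity index A
            (active if activity > threshold else non_active).append((y, x))
    return active, non_active
```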
Segmentation-Based Image Coding In A Packet-Switched Network Environment
Sarah A. Rajala, Wonrae M. Lee
In this paper, a segmentation-based image coding technique incorporating properties of the human visual system is presented. The approach taken in this research is to first segment an image into regions with spatial similarity and then to define an efficient method for encoding the segmented image data. Furthermore, specific attention is given in this paper to defining the requirements for using a segmentation-based image coder for video transmission over a packet-switched network.
Statistical Analysis Of The Output Rate Of A Sub-Band Video Coder
Paul Douglas, Gunnar Karlsson, Martin Vetterli
The output rates of a sub-band video coder are studied. The coder has been used to compress ten sequences of two minutes each. The resulting bit-rates are analyzed and the issues of statistical multiplexing, source modeling and resource allocation are discussed in light of the simulation results. A model is suggested in which a bit-rate sequence is approximated by the sum of a slowly varying non-stationary process and a stationary noise-like one. A dynamic resource allocation could thus be based on the slowly varying process while the noise-like process is used to set an error margin on the short-term variations.
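A minimal sketch of the suggested source model is given below: an observed bit-rate sequence is decomposed into a slowly varying component (here a moving average over an assumed window length) plus a stationary, noise-like residual.

```python
# Decompose a bit-rate trace into a slow trend and a noise-like residual.
import numpy as np

def decompose_rate(rate, window=30):
    kernel = np.ones(window) / window
    slow = np.convolve(rate, kernel, mode='same')   # slowly varying component
    noise = rate - slow                             # stationary, noise-like part
    return slow, noise
```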
Variable Rate, Adaptive Transform Tree Coding Of Images
William A. Pearlman
A tree code, asymptotically optimal for stationary Gaussian sources and squared error distortion [2], is used to encode transforms of image sub-blocks. The variance spectrum of each sub-block is estimated and specified uniquely by a set of one-dimensional auto-regressive parameters. The expected distortion is set to a constant for each block and the rate is allowed to vary to meet the given level of distortion. Since the spectrum and rate are different for every block, the code tree differs for every block. Coding simulations for target block distortion of 15 and average block rate of 0.99 bits per pel (bpp) show that very good results can be obtained at high search intensities at the expense of high computational complexity. The results at the higher search intensities outperform a parallel simulation with quantization replacing tree coding. Comparative coding simulations also show that the reproduced image with variable block rate and average rate of 0.99 bpp has 2.5 dB less distortion than a similarly reproduced image with a constant block rate equal to 1.0 bpp.
Orientation-Selective VLSI Retina
Tim Allen, Carver Mead, Federico Faggin, et al.
In both biological and artificial pattern-recognition systems, the detection of oriented light-intensity edges is an important preprocessing step. We have constructed a silicon VLSI device containing an array of photoreceptors with additional hardware for computing center-surround (edge-enhanced) response as well as edge orientation at every point in the receptor lattice. Because computing the edge orientations in the array local to each photoreceptor would have made each pixel-computation unit too large (thereby reducing the resolution of the device), we devised a novel technique for computing the orientations outside of the array. All the transducers and computational elements are analog circuits made with a conventional CMOS process.
Neural Networks For Visual Telephony
A. M. Gottlieb, J. Alspector, P. Huang, et al.
By considering how an image is processed by the eye and brain, we may find ways to simplify the task of transmitting complex video images over a telecommunication channel. Just as the retina and visual cortex reduce the amount of information sent to other areas of the brain, electronic systems can be designed to compress visual data, encode features, and adapt to new scenes for video transmission. In this talk, we describe a system inspired by models of neural computation that may, in the future, augment standard digital processing techniques for image compression. In the next few years it is expected that a compact low-cost full motion video telephone operating over an ISDN basic access line (144 KBits/sec) will be shown to be feasible. These systems will likely be based on a standard digital signal processing approach. In this talk, we discuss an alternative method that does not use standard digital signal processing but instead uses electronic neural networks to realize the large degree of compression necessary for a low bit-rate video telephone. This neural network approach is not being advocated as a near term solution for visual telephony. However, low bit rate visual telephony is an area where neural network technology may, in the future, find a significant application.
Relaxation Neural Network For Complete Discrete 2-D Gabor Transforms
John G. Daugman
It is often desirable in image processing to represent image structure in terms of a set of coefficients on a family of expansion functions. For example, familiar approaches to image coding, feature extraction, image segmentation, statistical and spectral analysis, and compression involve such methods. It has invariably been necessary that the expansion functions employed comprise an orthogonal basis for the image space, because the problem of obtaining the correct coefficients on a non-orthogonal set of expansion functions is usually arduous if not impossible. Oddly enough, image coding in biological visual systems clearly involves non-orthogonal expansion functions. The receptive field profiles of visual neurons with linear response properties have large overlaps and large inner products, and are suggestive of a conjoint (spatial and spectral) "2-D Gabor representation" (Daugman 1980, 1985). The 2-D Gabor transform has useful decorrelating properties and provides a conjoint image description resembling a speech spectrogram, in which local 2-D image regions are analyzed for orientation and spatial frequency content, but its expansion functions are non-orthogonal. This paper describes a three-layered relaxation "neural network" that efficiently computes the correct coefficients for this and other non-orthogonal image transforms. Examples of applications which are illustrated include: (1) image compression to below 1.0 bit/pixel, and (2) textural image segmentation based upon the statistics of the 2-D Gabor coefficients found by the relaxation network.
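The fixed point such a relaxation converges to can be illustrated with plain gradient descent on the squared reconstruction error between the image and its expansion on a non-orthogonal basis. The sketch below is generic: the basis matrix, learning rate, and iteration count are assumed placeholders, and it does not reproduce the three-layered network of the paper.

```python
# Gradient-descent relaxation for coefficients on a non-orthogonal basis.
import numpy as np

def relax_coefficients(image_vec, basis, lr=1e-3, iters=500):
    """basis: (n_pixels, n_functions) matrix of expansion functions."""
    a = np.zeros(basis.shape[1])
    for _ in range(iters):
        residual = image_vec - basis @ a       # reconstruction error
        a += lr * (basis.T @ residual)         # gradient step on squared error
    return a
```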
Digital Image Halftoning Using Neural Networks
Dimitris Anastassiou, Stefanos Kollias
A novel technique for digital image halftoning is presented, performing nonstandard quantization subject to a fidelity criterion. Massively parallel artificial symmetric neural networks are used for this purpose, minimizing a frequency weighted mean squared error between the continuous-tone input and the bilevel output image. The weights of these networks can be selected, so that the generated halftoned images are of good quality. A symmetric formulation of the error diffusion halftoning technique is also presented in the form of a massively parallel network. This network contains a nonmonotonic nonlinearity in lieu of the sigmoid function and is shown to be appropriate for effective halftoning of images.
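For reference, the sketch below shows the standard raster-scan error-diffusion halftoner with Floyd-Steinberg weights, which is the conventional technique the paper reformulates as a symmetric, massively parallel network; the paper's network itself is not reproduced here.

```python
# Standard Floyd-Steinberg error diffusion (baseline, not the paper's network).
import numpy as np

def error_diffusion(gray):
    """gray: 2-D array with values in [0, 1]; returns a bilevel image."""
    img = gray.astype(float).copy()
    h, w = img.shape
    out = np.zeros_like(img)
    for y in range(h):
        for x in range(w):
            out[y, x] = 1.0 if img[y, x] >= 0.5 else 0.0
            err = img[y, x] - out[y, x]              # diffuse the quantisation error
            if x + 1 < w:               img[y, x + 1]     += err * 7 / 16
            if y + 1 < h and x > 0:     img[y + 1, x - 1] += err * 3 / 16
            if y + 1 < h:               img[y + 1, x]     += err * 5 / 16
            if y + 1 < h and x + 1 < w: img[y + 1, x + 1] += err * 1 / 16
    return out
```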
Principal Components Analysis Of Images Via Back Propagation
Garrison W. Cottrell, Paul Munro
The recent discovery of powerful learning algorithms for parallel distributed networks has made it possible to program computation in a new way, by example rather than by algorithm. The back propagation algorithm is a gradient descent technique for training such networks. The problem posed to the researcher using such an algorithm is to discover how it solved the problem, if a solution is found. In this paper we apply back propagation to a well understood problem in image analysis, i.e., bandwidth compression, and analyze the internal representation developed by the network. The network used consists of nonlinear units that compute a sigmoidal function of their inputs. It is found that the learning algorithm produces a nearly linear transformation of a Principal Components Analysis of the image, and the units in the network tend to stay in the linear range of the sigmoid function. The particular transform found departs from the standard Principal Components solution in that near-equal variance of the coefficients results, depending on the encoding used. While the solution found is basically linear, such networks can also use the nonlinearity to solve encoding problems where the Principal Components solution is degenerate.
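A minimal sketch of this kind of experiment is an N-H-N "bottleneck" network trained by gradient descent to reproduce its input patches; with (near-)linear units, the hidden layer learns a transform spanning the principal-components subspace. For brevity the units below are linear rather than sigmoidal, and the patch size, hidden-layer width, and learning rate are assumed values.

```python
# Linear autoencoder trained by gradient descent (illustrative sketch).
import numpy as np

rng = np.random.default_rng(0)
N, H, lr = 64, 16, 1e-3                 # 8x8 patches, 16 hidden units (assumed)
W_in = rng.normal(0, 0.1, (H, N))       # encoder weights
W_out = rng.normal(0, 0.1, (N, H))      # decoder weights

def train_step(patch):
    """One gradient-descent step on the squared reconstruction error."""
    global W_in, W_out
    h = W_in @ patch                    # hidden (bottleneck) activity
    recon = W_out @ h
    err = recon - patch
    W_out -= lr * np.outer(err, h)                  # decoder gradient
    W_in -= lr * np.outer(W_out.T @ err, patch)     # encoder gradient
    return float(err @ err)
```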
Mapping Rule-Based And Stochastic Constraints To Connection Architectures: Implication For Hierarchical Image Processing
Michael I. Miller, Badrinath Roysam, Kurt R. Smith
Essential to the solution of ill-posed problems in vision and image processing is the need to use object constraints in the reconstruction. While Bayesian methods have shown the greatest promise, a fundamental difficulty has persisted in that many of the available constraints are in the form of deterministic rules rather than probability distributions and are thus not readily incorporated as Bayesian priors. In this paper, we propose a general method for mapping a large class of rule-based constraints to their equivalent stochastic Gibbs distribution representation. This mapping allows us to solve stochastic estimation problems over rule-generated constraint spaces within a Bayesian framework. As part of this approach we derive a method based on Langevin's stochastic differential equation and a regularization technique based on the classical autologistic transfer function that allows us to update every site simultaneously regardless of the neighbourhood structure. This allows us to implement a completely parallel method for generating the constraint sets corresponding to the regular grammar languages on massively parallel networks. We illustrate these ideas by formulating the image reconstruction problem based on a hierarchy of rule-based and stochastic constraints, and derive a fully parallel estimator structure. We also present results computed on the AMT DAP500 massively parallel digital computer, a mesh-connected 32x32 array of processing elements configured in a Single-Instruction, Multiple-Data stream architecture.
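For illustration, a discretized Langevin update of the generic kind referred to above perturbs every site simultaneously with a gradient step on the Gibbs energy plus Gaussian noise. The energy gradient and step size below are placeholders; this sketch does not include the paper's autologistic regularization or rule-based constraint generation.

```python
# Generic discretized Langevin update for sampling p(x) proportional to exp(-U(x)).
import numpy as np

def langevin_step(x, grad_U, step=1e-2, rng=np.random.default_rng()):
    """Update all sites of x in parallel; grad_U returns the energy gradient."""
    noise = rng.normal(size=x.shape)
    return x - step * grad_U(x) + np.sqrt(2.0 * step) * noise
```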
A Neural Network Implementation For Real-Time Scene Analysis
R. Booth, C. R. Allen
A prototype neural network of LSI electronic logic, coupled with a matrix of photodetectors, is proposed to implement the pre-processing functions of image capture and region extraction. The structure is described, together with its likely implementation strategy, and the full on-line scene analysis algorithm is explained, with some simulation results obtained on real and 3-D computer-generated images.
The Neural Analog Diffusion-Enhancement Layer (NADEL) And Early Visual Processing
Allen M. Waxman, Michael Seibert, Robert Cunningham, et al.
We describe a new class of neural network aimed at early visual processing; we call it a Neural Analog Diffusion-Enhancement Layer or NADEL. The network consists of two levels which are coupled through nonlinear feedback. The lower level is a two-dimensional diffusion map which accepts binary visual features as input (e.g. edges and points), and spreads activity over larger scales as a function of time. The upper layer is fed the activity from the diffusion layer and serves to locate local maxima in it (an extreme form of contrast enhancement). These local maxima are fed back to the diffusion layer using an on-center/off-surround shunting anatomy. The maxima are also available as output of the network. The network dynamics serves to cluster features on multiple scales as a function of time, and can be used to support a large variety of early visual processing tasks such as: extraction of corners and high curvature points along edge contours, line end detection, filling gaps and completing contour boundaries, generating saccadic eye motion sequences, perceptual grouping on multiple scales, correspondence and path impletion in long-range apparent motion, and building 2-D shape representations that are invariant to location, orientation and scale on the visual field. The NADEL is now being designed for implementation in Analog VLSI wafer-scale circuits.
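A crude two-level sketch in the spirit of these dynamics is shown below: binary feature activity is spread by discrete diffusion, and local maxima of the diffused map are extracted as the enhanced output. The diffusion coefficient, number of iterations, and 3x3 neighbourhood test are illustrative assumptions, and the nonlinear on-center/off-surround feedback of the NADEL is not modelled.

```python
# Diffusion layer plus local-maxima enhancement (illustrative sketch only).
import numpy as np

def diffuse_and_enhance(features, steps=20, k=0.2):
    """features: binary 2-D map; returns diffused activity and its local maxima."""
    a = features.astype(float)
    for _ in range(steps):                       # discrete diffusion on the layer
        lap = (np.roll(a, 1, 0) + np.roll(a, -1, 0) +
               np.roll(a, 1, 1) + np.roll(a, -1, 1) - 4 * a)
        a += k * lap
    neigh = np.stack([np.roll(np.roll(a, dy, 0), dx, 1)
                      for dy in (-1, 0, 1) for dx in (-1, 0, 1)])
    maxima = (a >= neigh.max(axis=0)) & (a > 0)  # local maxima of activity
    return a, maxima
```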
A Circuit For Evaluating 64Kbit/S Encoding Procedures
J. C. Candy, R. L. Schmidt, H. M. Hang, et al.
A prototype encoder-decoder that incorporates special VLSI circuits has been set up to demonstrate various interframe encoding techniques in real time. A video processing chip converts the signal from R-G-B to Y-U-V format and then filters the signal both horizontally and vertically before subsampling it in various ways. A memory control chip reforms the raster into a sequence of 8x8 blocks, at 15 fields per second. A predictor chip derives frame-to-frame difference signals for each block in every other field and interpolates the remaining fields from the predicted ones. Both prediction and interpolation error signals can be encoded using block encoding techniques in a digital signal processor, or by quantizing their DCTs that are generated in a special orthogonal-transform chip. Equivalent inverse processing is available at the receiver. The system is controlled by general purpose microcomputers, one at the transmitter and one at the receiver. The circuit provides a means for evaluating various block encoding techniques and serves as a base for finding suitable subcircuits that can be integrated.
VLSI Architectures For Image Filtering
F. Jutand, A. Artieri, G. Concordel, et al.
This paper presents a study of the problems encountered when implementing real-time image filters. First, the complexity of the algorithms is studied according to the peculiarities of the applications. Then the problem of system design and its organization is presented, and different chip architectures are described for various tradeoffs between the architectural parameters. A new architecture, also used for motion estimation, is mentioned. Finally, the problem of computation-operator design is addressed, with elements on different choices for the internal organization of the computation, pipelining, and computation format. This study is based on many applications for which VLSI architectures have already been designed and reported; for some of them, chips are currently being implemented in our laboratories. Linear filtering has been selected for illustration purposes, but the study also concerns many other problems of real-time image processing, all of which share the feature that the global management of data communication and storage is of great importance in the architecture.