Proceedings Volume 6064

Image Processing: Algorithms and Systems, Neural Networks, and Machine Learning


View the digital version of this volume at the SPIE Digital Library.

Volume Details

Date Published: 7 February 2006
Contents: 12 Sessions, 55 Papers, 0 Presentations
Conference: Electronic Imaging 2006
Volume Number: 6064

Table of Contents


All links to SPIE Proceedings will open in the SPIE Digital Library.
  • Image Processing Algorithms
  • Efficient Algorithms
  • Image Processing Systems
  • Image Processing Methods
  • Biomedical Image Processing
  • Algorithms and Systems
  • Kernel-based Learning for Detection and Shape Analysis
  • Fuzzy Clustering
  • Independent Component Analysis, Adaboost for Recognition
  • Neural Networks Applications for Manifold Learning, Recognition, Color Perception, and Compression
  • Support Vector Machine and Neural Networks for Face Recognition, Detection, and Classification
  • Applications of Neural Networks for Text Spotting, Character, and Face Recognition
Image Processing Algorithms
Affine invariant surface evolutions for 3D image segmentation
Yogesh Rathi, Peter Olver, Guillermo Sapiro, et al.
In this paper we present an algorithm for 3D medical image segmentation based on an affine invariant flow. The algorithm is simple to implement and semi-automatic. The technique is based on active contours evolving in time according to intrinsic geometric measures of the image. The surface flow is obtained by minimizing a global energy with respect to an affine invariant metric. Affine invariant edge detectors for 3D objects are also computed; these have the same qualitative behavior as their Euclidean counterparts. Results on artificial images and real MRI data show that the algorithm performs well, both in terms of accuracy and robustness to noise.
Iterative Markovian estimation of mass functions in Dempster Shafer evidence theory: application to multisensor image segmentation
Layachi Bentabet, Maodong Jiang
Mass function estimation is a key issue in evidence-theory-based segmentation of multisensor images. In this paper, we generalize the statistical mixture modeling and the Bayesian inference approach in order to quantify the confidence level in the context of Dempster-Shafer theory. We demonstrate that our model assigns confidence levels in a relevant manner. Contextual information is integrated using a Markovian field adapted to handle compound hypotheses. The multiple sensors are assumed to be corrupted by different noise models; in this case, we show the interest of using a flexible Dirichlet distribution to model the data. The effectiveness of our method is demonstrated on synthetic images as well as on radar and SPOT images.
Progressive halftoning by Perona-Malik error diffusion and stochastic flipping
Halftoning has been a significant topic in image processing due to many emerging applications, diverse approaches, and challenging theoretical analysis. Inspired by the rich literature on halftoning, as well as the recent PDE (partial differential equation) approach in image processing, the current work proposes a novel progressive halftoning algorithm by employing the celebrated anisotropic diffusion model of Perona and Malik (IEEE Trans. Pattern Anal. Machine Intell., 12:629-639, 1990) and a properly designed stochastic strategy for binary flipping. The halftone outputs from the proposed model are typical samples of some random fields, which share many virtues of existing deterministic halftone algorithms and show many interesting features such as blue-noise behavior. The new model is independent of traditional windows, tiles, or paths, and allows direct parallel implementation.
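The Perona-Malik model the abstract cites can be sketched as an explicit finite-difference scheme. The edge-stopping function below is the rational one from the 1990 paper; `kappa`, `dt`, and the iteration count are illustrative choices, not the paper's settings, and the stochastic flipping step is omitted.

```python
import numpy as np

def perona_malik(u, n_iter=20, kappa=0.1, dt=0.2):
    """Explicit scheme for Perona-Malik anisotropic diffusion.

    g is the rational edge-stopping function from the 1990 paper;
    kappa, dt, and n_iter are illustrative values only.
    """
    def g(d):
        return 1.0 / (1.0 + (d / kappa) ** 2)

    u = u.astype(float).copy()
    for _ in range(n_iter):
        # Differences toward the four nearest neighbours, zero flux at borders.
        dN = np.roll(u, 1, axis=0) - u; dN[0, :] = 0.0
        dS = np.roll(u, -1, axis=0) - u; dS[-1, :] = 0.0
        dW = np.roll(u, 1, axis=1) - u; dW[:, 0] = 0.0
        dE = np.roll(u, -1, axis=1) - u; dE[:, -1] = 0.0
        u += dt * (g(dN) * dN + g(dS) * dS + g(dW) * dW + g(dE) * dE)
    return u
```

With zero-flux borders the neighbour fluxes cancel pairwise, so the scheme preserves the image mean while smoothing away noise between edges.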
Multiple wavelet coherence analysis
Sofia C. Olhede, Georgios Metikas
We propose a method for analysis of localised relationships between multiple two-dimensional signals, or images, that naturally treats the local phase structure and local orientation of any variation in the observed images. The method is based on using several non-separable wavelet decompositions of the images. The set of mother wavelets used are optimally concentrated isotropic orthogonal wavelet functions extended to a triplet of functions using the Riesz transform, so that directional structure can be detected. The full set of triplet wavelet transform coefficients of two images can be used to extract local oscillatory components of the images present at a given spatial and scale point, and subsequently used to determine the local coherence and phase shift between the two images. The determination of the local phase and orientation involves calculating the continuous wavelet transform (CWT) of the images, then forming the scalogram matrix from these CWTs, and calculating the wavelet coherence. Robust estimates can be constructed by averaging over wavelet coefficients, extending Thomson's method to isotropically localised two-dimensional decompositions.
Interpolation using cosine transforms generalized to Lie groups
J. Patera, A. Zaratsyan, H. Zhu
Interpolation methods are often used in many applications for image generation and processing, such as image compression and resampling. This paper introduces a new family of interpolation algorithms in dimensions n ≥ 1. Each version of the method is based on a compact semisimple Lie group of rank n, although here we explore mainly the case n=2. The approach can be viewed as a generalization of the discrete and continuous cosine transforms.
Optimization procedures for the estimation of phase portrait parameters of orientation fields
Fábio J. Ayres, Rangaraj M. Rangayyan
Oriented patterns in an image often convey important information regarding the scene or the objects contained. Given an image presenting oriented texture, the orientation field of the image is a map that depicts the orientation angle of the texture at each pixel. Rao and Jain developed a method to describe oriented patterns in an image based on the association between the orientation field of a textured image and the phase portrait generated by a pair of linear first-order differential equations. The estimation of the model parameters is a nonlinear, nonconvex optimization problem, and practical experience shows that irrelevant local minima can lead to convergence to inappropriate results. We investigated the performance of four optimization algorithms for the estimation of the optimal phase portrait parameters for a given orientation field. The investigated algorithms are: nonlinear least-squares, linear least-squares, iterative linear least-squares, and simulated annealing. The algorithms are evaluated and compared in terms of the error between the estimated parameters and the parameters known by design, in the presence of noise in the orientation field and imprecision in the initialization of the parameters. The computational effort required by each algorithm is also assessed. Individually, the simulated annealing procedure yielded low fixed-point and parameter errors over the entire range of noise tested, whereas the performance of the other methods deteriorated with higher levels of noise. The use of the result of simulated annealing for the initialization of the nonlinear least-squares method led to further improvement upon the simulated annealing results.
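The linear least-squares formulation mentioned in the abstract can be sketched as follows: writing the Rao-Jain model orientation as the direction of v = A[x, y]ᵀ + b and cross-multiplying tan θ = v_y/v_x yields a system that is linear and homogeneous in the six parameters, solvable up to scale by SVD. Names and structure here are illustrative, not the paper's code.

```python
import numpy as np

def fit_phase_portrait(x, y, theta):
    """Linear least-squares estimate of phase-portrait parameters.

    Model (after Rao and Jain): the orientation theta at (x, y) is the
    direction of v = A [x, y]^T + b.  The identity
    sin(theta)*v_x - cos(theta)*v_y = 0 is linear and homogeneous in
    p = (a11, a12, b1, a21, a22, b2); the best fit up to scale is the
    right singular vector of the smallest singular value.
    """
    s, c = np.sin(theta), np.cos(theta)
    M = np.column_stack([s * x, s * y, s, -c * x, -c * y, -c])
    p = np.linalg.svd(M)[2][-1]          # null-space direction
    A = np.array([[p[0], p[1]], [p[3], p[4]]])
    b = np.array([p[2], p[5]])
    return A, b
```

On noise-free synthetic orientation fields this recovers the generating parameters exactly up to a global scale, which is the baseline the paper's noisy comparisons start from.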
Optimized gradient filters for hexagonal matrices
Digital images are nowadays represented on square lattices. Everyday items, such as digital cameras and displays, as well as many systems for vision or image processing, use square lattices to represent an image. However, as the distance between adjacent pixels is not constant, any filter based on square lattices presents inherent anisotropy. Ando introduced consistent gradient filters to cope with this problem, with filters derived so as to attain minimum inconsistency. Square lattices are not, however, the only way to arrange pixels. Another arrangement can be found, for example, in the human retina, where receptors adopt a hexagonal structure. In contrast to square lattices, the distance between adjacent pixels is constant in such structures. The principal advantage of filters based on hexagonal matrices is, then, their isotropy. In this paper, we derive consistent gradient filters for hexagonal matrices following Ando's method for square matrices. The resulting hexagonal consistent gradient filters are compared with square ones. The results indicate that the hexagonal filters derived in this paper are superior to square ones in consistency, in the proportion of consistency to output power, and in localization.
Noisy image enhancement on smooth function spaces
We introduce a novel method for restoring noisy images that requires no noise model. The proposed method is inspired by a wavelet-based switching smoothness description of Hölder function spaces. Specifically, we lower the smoothness of the blurred image locally or globally to obtain an enhanced image. Despite the simplicity of our method, significant improvement is reported in the experimental results in terms of image fidelity measures and visual effect.
Color transient improvement via range detection
In broadcast systems, image information is transmitted in the form of luminance and color-difference signals. The color-difference signals are usually blurred for several reasons, resulting in smooth transitions. It is important for a color transient improvement (CTI) algorithm to sharpen these transitions without producing color mismatch. In this paper, a new CTI algorithm that only needs to determine the transition range is proposed. Since the corrected signal does not rely on high-frequency values, it does not exhibit over- and undershoot near edges. To prevent color mismatch, the transition range is found on only one color-difference channel. Experimental results show that our algorithm corrects blurred color edges well and is robust across input images.
Segmentation of microspheres in ultrahigh density multiplexed microsphere-based assays
Abhishek Mathur, David M. Kelso
We have developed a method to identify and localize luminescent microspheres in dense images of microsphere-based assays. Application of this algorithm to images of densely packed microspheres would help increase the number of assays per unit target sample volume by several orders of magnitude. We immobilize or sediment microspheres on microscope slides and read luminescence from these randomly arrayed microspheres with a digital imaging microscope equipped with a cooled CCD camera. Our segmentation algorithm, which is based on the marker-controlled watershed transformation, is then used to segment the microsphere clusters in the luminescent images acquired at different wavelengths. This segmentation algorithm is fully automated, requires no manual intervention or training sets for optimizing the parameters, and is much more accurate than previously proposed algorithms. Using this algorithm, we have accurately segmented more than 97% of the microspheres in dense images.
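A minimal marker-controlled watershed can be written as a priority flood from the markers. This is a generic sketch of the transform the algorithm builds on, not the authors' implementation; it ignores marker extraction and the multi-wavelength steps, and assumes regions should grow from markers in order of increasing intensity (a luminescence image would be inverted first).

```python
import heapq
import numpy as np

def marker_watershed(image, markers):
    """Marker-controlled watershed by priority flooding: unlabelled pixels
    are absorbed by a neighbouring label in order of increasing intensity.
    markers: integer array, 0 = unlabelled, k > 0 = seed of region k."""
    labels = markers.copy()
    heap = [(image[i, j], i, j) for i, j in zip(*np.nonzero(markers))]
    heapq.heapify(heap)
    while heap:
        _, i, j = heapq.heappop(heap)
        for ni, nj in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)):
            if (0 <= ni < image.shape[0] and 0 <= nj < image.shape[1]
                    and labels[ni, nj] == 0):
                labels[ni, nj] = labels[i, j]
                heapq.heappush(heap, (image[ni, nj], ni, nj))
    return labels
```

Flooding a one-row "valley" image from two seed pixels splits it at the ridge between them, which is the behaviour the clustered-microsphere segmentation relies on.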
A multiscale approach to contour detection by texture suppression
Giuseppe Papari, Patrizio Campisi, Nicolai Petkov, et al.
In this paper we propose a multiscale, biologically motivated technique for contour detection by texture suppression. Standard edge detectors react to all local luminance changes, irrespective of whether they are due to the contours of the objects in the scene or to natural texture such as grass, foliage, or water. Moreover, edges due to texture are often stronger than edges due to true contours. This implies that further processing is needed to discriminate true contours from texture edges. In this contribution we exploit the fact that, in a multiresolution analysis, at coarser scales only the edges due to object contours are present, while texture edges disappear. This is used in combination with surround inhibition, a biologically motivated technique for texture suppression, in order to build a contour detector that is insensitive to texture. The experimental results show that our approach is also robust to additive noise.
Phase unwrapping by means of finite differences
L. I. Olivos-Pérez, Enrique de la Rosa Miranda, Luis Raúl Berriel Valdos, et al.
Many problems in metrology and optical tomography require recovering information from the wrapped phase. In most cases the phase, which is associated with a physical quantity, is continuous and generally varies smoothly; the problem then reduces to finding a continuous phase. Many solutions of this kind have been proposed, from the use of local planes to the implementation of more robust algorithms. However, these methods are also very slow, which is why phase unwrapping remains an open research subject in optics. We propose a phase unwrapping method based on finite differences that is fast, robust, and easy to program.
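In one dimension the finite-difference idea reduces to Itoh's classical method: wrap the first differences back into (−π, π] and integrate them. The sketch below shows this 1D base case (the paper addresses the harder 2D problem) and assumes the true phase changes by less than π between samples.

```python
import numpy as np

def unwrap_1d(wrapped):
    """Itoh's finite-difference phase unwrapping in 1D.
    Assumes |true phase difference| < pi between adjacent samples."""
    d = np.diff(wrapped)
    d = (d + np.pi) % (2 * np.pi) - np.pi   # rewrap differences into (-pi, pi]
    return np.concatenate([[wrapped[0]], wrapped[0] + np.cumsum(d)])
```

Under the sampling assumption, the rewrapped differences equal the true differences exactly, so cumulative summation recovers the continuous phase.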
Efficient Algorithms
Super-fast Fourier transform
In this paper, we develop recursive fast orthogonal mapping algorithms for fast Fourier transforms. In particular, we introduce a new fast Fourier transform algorithm with linear multiplicative complexity. The proposed algorithms not only reduce the multiplicative complexity but are also comparable to existing methods, such as those of Duhamel, Heideman, Burrus, Vetterli, and Wang [11,15,16,19,21,27,28], in the total number of operations (arithmetic complexity: the number of multiplications and additions).
A high-speed rotation method for binary document images based on coordinate operation of run data
Yoshihiro Shima, Hiroshi Ohya
The rotation of an image is one of the fundamental functions in image processing and is applied to document image processing in the office. A method of image rotation based on digital image data has been developed. This paper assumes binary digital data and proposes a method different from the traditional one based on pixel data: it executes high-speed rotation of a binary image based on coordinate data for the start and end of each run. Using the proposed method, image rotation at an arbitrary angle can be realized by real-number operations on the run data, which is suited to general-purpose processors. It is a practically useful method, since the processing is fast and requires little memory. In this paper, a discussion is first given of the format of the run data, the number of runs, and the data complexity for binary data. Then the newly devised rotation for the binary image is described. The rotation method successively performs skew coordinate transformations in the vertical and horizontal directions to determine the rotated image. Finally, a document image is actually rotated on a computer, and the processing time is examined to demonstrate experimentally the usefulness of the proposed method.
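The successive skew (shear) transformations can be sketched on a single coordinate pair. This is the classical three-shear decomposition of a rotation, shown as a generic illustration rather than the paper's run-data implementation (which applies such transforms to run start/end coordinates).

```python
import math

def rotate_via_shears(x, y, theta):
    """Rotate (x, y) by theta using three shears:
    R(theta) = Shear_x(-tan(theta/2)) * Shear_y(sin(theta)) * Shear_x(-tan(theta/2)).
    Each step changes only one coordinate, which is what makes skew-based
    rotation attractive for run-based raster data."""
    t = -math.tan(theta / 2.0)
    x = x + t * y                    # horizontal shear
    y = y + math.sin(theta) * x      # vertical shear
    x = x + t * y                    # horizontal shear again
    return x, y
```

The product of the three shear matrices is exactly the rotation matrix, so the result matches the usual cosine/sine rotation to floating-point precision.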
A hardware implementation of the discrete Pascal transform for image processing
Thomas J. Goodman, Maurice F. Aburdene
The discrete Pascal transform is a polynomial transform with applications in pattern recognition, digital filtering, and digital image processing. It already has been shown that the Pascal transform matrix can be decomposed into a product of binary matrices. Such a factorization leads to a fast and efficient hardware implementation without the use of multipliers, which consume large amounts of hardware. We recently developed a field-programmable gate array (FPGA) implementation to compute the Pascal transform. Our goal was to demonstrate the computational efficiency of the transform while keeping hardware requirements at a minimum. Images are uploaded into memory from a remote computer prior to processing, and the transform coefficients can be offloaded from the FPGA board for analysis. Design techniques like as-soon-as-possible scheduling and adder sharing allowed us to develop a fast and efficient system. An eight-point, one-dimensional transform completes in 13 clock cycles and requires only four adders. An 8x8 two-dimensional transform completes in 240 cycles and requires only a top-level controller in addition to the one-dimensional transform hardware. Finally, through minor modifications to the controller, the transform operations can be pipelined to achieve 100% utilization of the four adders, allowing one eight-point transform to complete every seven clock cycles.
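The multiplier-free structure the FPGA design exploits can be sketched in software: each binary factor of the signed Pascal matrix is one pass of suffix differences, so the whole transform uses only adders/subtractors. This is an illustrative software analogue of the factorization, not the hardware design itself.

```python
def pascal_transform(x):
    """Multiplier-free discrete Pascal transform via repeated differences.

    Each outer pass applies one bidiagonal binary factor of the signed
    Pascal matrix P, P[i][j] = (-1)**j * C(i, j) for j <= i.  P is an
    involution, so applying the routine twice returns the input."""
    y = list(x)
    n = len(y)
    for k in range(1, n):                  # one binary factor per pass
        for i in range(n - 1, k - 1, -1):  # suffix differences: adders only
            y[i] = y[i - 1] - y[i]
    return y
```

The involution property gives a convenient self-check: a forward transform followed by another forward transform is the identity, with no separate inverse routine needed.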
The discrete Gould transform and its applications
We present a new discrete transform, the discrete Gould transform (DGT). The transform has many interesting mathematical properties: for example, the forward and inverse transform matrices are both lower triangular, with constant diagonals and sub-diagonals, and both can be factored into products of binary matrices. The forward transform can be used to detect edges in digital images. If G is the forward transform matrix and y is the image, then the two-dimensional DGT, GyG^T, can be used directly to detect edges. One way to improve the edge-detection technique is to use the "combination of forward and backward differences", G^T(Gy), to better identify the edges. For images that tend to have vertical and horizontal edges, we can further improve the technique by shifting rows (or columns) and then applying the transform, essentially applying it in the diagonal directions.
Image Processing Systems
Using clustering for document reconstruction
Anna Ukovich, Alessandra Zacchigna, Giovanni Ramponi, et al.
In the forensics and investigative science fields, the need may arise to reconstruct documents that have been destroyed by a shredder. In a computer-based reconstruction, the pieces are described by numerical features that represent the visual content of the strips. Usually, the pieces of different pages have been mixed. We propose an approach that performs a first clustering of the strips to ease the subsequent matching, be it manual (with the help of a computer) or automatic. A number of features, extracted by means of image processing algorithms, have been selected for this purpose. The results show the effectiveness of the features and of the proposed clustering algorithm.
Automatic detection and tracking of reappearing targets in forward-looking infrared imagery
Target detection and tracking algorithms deal with the recognition of a variety of target images obtained from a multitude of sensor types, such as forward-looking infrared (FLIR), synthetic aperture radar, and laser radar. Temporary disappearance and subsequent reappearance of targets in the field of view may be encountered during tracking. To accommodate this problem, training-based techniques have been developed that combine two methods: tuned basis functions (TBF) and correlation-based template matching (TM). The TBFs are used to detect possible tentative target images. The detected candidate target images are then introduced into a second algorithm, called the clutter rejection module, to determine the frame in which the target re-enters and the location of the target. The performance of the proposed TBF-TM based reappeared-target detection and tracking algorithm has been tested using real-world forward-looking infrared video sequences.
Robust human motion detection via fuzzy-set-based image understanding
This paper presents an image understanding approach to monitor human movement and identify abnormal circumstances by robust motion detection, for the care of the elderly in a home-based environment. In contrast to conventional approaches, which apply either a single feature extraction scheme or a fixed object model for motion detection and tracking, we introduce a multiple-feature extraction scheme for robust motion detection. The proposed algorithms include 1) multiple image feature extraction, including fuzzy-compactness-based detection of interesting points and fuzzy blobs, 2) adaptive image segmentation via multiple features, 3) hierarchical motion detection, 4) a flexible model of human motion adapted to both rigid and non-rigid conditions, and 5) fuzzy decision making via multiple features.
K-max: segmentation based on selection of max-tree deep nodes
Alexandre G. Silva, Siovani C. Felipussi, Roberto de Alencar Lotufo, et al.
This work proposes the segmentation of grayscale images from their hierarchical region-based representation. The Max-tree structure has proved useful for this purpose, offering a semantic view of the image and thus reducing the number of elements to process relative to a pixel-based representation. In this way, a particular search in this tree can determine regions of interest with less computational effort. A generic peak-detection application is proposed by searching for nodes k steps up from the leaves of the Max-tree (an operator called k-max), where each node corresponds to a connected component. The results are compared with optimal thresholding and the H-maxima technique.
Image Processing Methods
Shape-adaptive DCT for denoising and image reconstruction
Alessandro Foi, Kostadin Dabov, Vladimir Katkovnik, et al.
The shape-adaptive DCT (SA-DCT) can be computed on a support of arbitrary shape, but retains a computational complexity comparable to that of the usual separable block DCT. Despite the near-optimal decorrelation and energy compaction properties, application of the SA-DCT has been rather limited, targeted nearly exclusively to video compression. It has been recently proposed by the authors8 to employ the SA-DCT for still image denoising. We use the SA-DCT in conjunction with the directional LPA-ICI technique, which defines the shape of the transform's support in a pointwise adaptive manner. The thresholded or modified SA-DCT coefficients are used to reconstruct a local estimate of the signal within the adaptive-shape support. Since supports corresponding to different points are in general overlapping, the local estimates are averaged together using adaptive weights that depend on the region's statistics. In this paper we further develop this novel approach and extend it to more general restoration problems, with particular emphasis on image deconvolution. Simulation experiments show a state-of-the-art quality of the final estimate, both in terms of objective criteria and visual appearance. Thanks to the adaptive support, reconstructed edges are clean, and no unpleasant ringing artifacts are introduced by the fitted transform.
Anisotropic filtering with nonlinear structure tensors
Carlos-Alberto Castaño-Moraga, Juan Ruiz-Alzola
We present an anisotropic filtering scheme that uses a nonlinear version of the local structure tensor to dynamically adapt the shape of the neighborhood used to perform the estimation. In this way, only the samples along the direction orthogonal to that of maximum signal variation are chosen to estimate the value at the current position, which helps to better preserve boundaries and structure information. This idea sets the basis of an anisotropic filtering framework that can be applied to different kinds of linear filters, such as Wiener or LMMSE, among others. In this paper, we describe the underlying idea using anisotropic Gaussian filtering, which at the same time allows us to study the influence of nonlinear structure tensors in filtering schemes, as we compare the performance to that obtained with classical definitions of the structure tensor.
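The classical (linear) structure tensor that the nonlinear version generalizes can be sketched as the locally averaged outer product of the gradient. Below, a box filter stands in for the customary Gaussian window, and all names are illustrative rather than taken from the paper.

```python
import numpy as np

def structure_tensor_orientation(img, win=5):
    """Classical linear structure tensor J = smooth(grad u . grad u^T).
    The dominant local orientation of variation follows from the tensor
    components as 0.5 * atan2(2*Jxy, Jxx - Jyy)."""
    gy, gx = np.gradient(img.astype(float))

    def box(a):                      # box filter as a stand-in for a Gaussian
        r = win // 2
        p = np.pad(a, r, mode='edge')
        out = np.zeros_like(a)
        for di in range(win):
            for dj in range(win):
                out += p[di:di + a.shape[0], dj:dj + a.shape[1]]
        return out / win ** 2

    Jxx, Jxy, Jyy = box(gx * gx), box(gx * gy), box(gy * gy)
    return 0.5 * np.arctan2(2.0 * Jxy, Jxx - Jyy)
```

For an image varying only horizontally (vertical stripes), the estimated direction of maximum variation is the x-axis everywhere, which is the structure information the anisotropic filter uses to orient its neighborhood.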
A modified wavelet-transformation-based-method of linear object extraction
Edges can be characterized through the evolution of a wavelet transformation at different scale levels. A two-dimensional wavelet transformation of a given image is proportional to the gradient of a corresponding smoothed image. Each component of a normal two-dimensional wavelet transformation is in fact a one-dimensional wavelet transformation in one variable followed by a smoothing process in the other variable. The modified wavelet transformation drops the smoothing process in each component, since with smoothing the magnitude of the wavelet transformation in the center part of a linear object may be increased by the large magnitudes along the edges, making it hard to isolate the centerline of the linear object. The modified wavelet transformation gives high magnitudes along the edges and low magnitudes in the center part of the linear objects in the wavelet-transformed image. In the image showing the magnitude of the wavelet transformation, there are high ridges along the edges of the linear objects and low grey-level valleys bounded by the ridges. A suitable threshold can be used to extract the low grey-level part of the image, such that the center parts of the linear objects are included. Since they are separated from other objects, they can easily be extracted in a post-processing step.
2D approaches to 3D watermarking: state of the art and perspectives
M. Mitrea, S. Duţă, F. Prêteux
With the advent of the Information Society, video, audio, speech, and 3D media represent the source of huge economic benefits. Consequently, there is a continuously increasing demand for protecting the related intellectual property rights. The solution can be provided by robust watermarking, a research field which has exploded in the last 7 years. However, the largest part of the scientific effort has been devoted to video and audio protection, 3D objects being quite neglected. In the absence of any standardisation attempt, the paper starts by summarising the approaches developed in this respect and by identifying the main challenges to be addressed in the next years. It then describes an original oblivious watermarking method devoted to the protection of 3D objects represented by NURBS (Non-Uniform Rational B-Spline) surfaces. Applied to both free-form objects and CAD models, the method exhibited very good transparency (no visible differences between the marked and the unmarked model) and robustness (with respect both to traditional attacks and to NURBS processing).
Region-based perceptual grouping: a cooperative approach based on the Dempster-Shafer theory
Nicolas Zlatoff, Bruno Tellez, Atilla Baskurt
As the segmentation step does not recover semantic objects, perceptual grouping is often used to overcome its shortcomings. Perceptual grouping refers to the ability of the human visual system to impose structure and regularity on signal-based data. Gestalt psychologists have exhibited properties that seem to be at work in perceptual grouping, and some implementations have been proposed in computer vision. However, few of these works model the use of several properties together to trigger a grouping, even though this can increase robustness. We propose a cooperative approach to perceptual grouping that combines the influence of several Gestalt properties for each hypothesis. We make use of the Dempster-Shafer formalism, as it can prevent conflicting hypotheses from jamming the grouping process.
Fast classification and segmentation of high-resolution images of multiple and complicated colonies
Weixing Wang, Lei Li
This paper presents a methodology for high-resolution image classification and segmentation. The size and information volume of images taken by a high-resolution digital camera can be tens to hundreds of times those of images taken by an ordinary CCD camera. In order to speed up the segmentation of such large images, we first classify the images using a low-resolution version, and then segment them with a fast segmentation algorithm. The algorithm is based mainly on multi-resolution techniques and on the fusion of edge-detection and similarity-segmentation results. With this methodology, the whole segmentation process is tens of times faster than traditional segmentation methods, without decreasing segmentation accuracy.
Spatially adaptive multi-resolution multi-spectral image fusion based on Bayesian approach
In this paper, we propose two new spatially adaptive image fusion algorithms, based on a Bayesian approach, for merging remotely sensed panchromatic and multi-spectral images. The two complementary images are modeled as correlated two-dimensional stochastic signals, and the high-resolution multi-spectral image is estimated by minimizing the mean squared error between the original high-resolution image and the estimated image. We assume that the estimator is locally linear and obtain the local linear minimum mean square error (MMSE) estimator for image fusion. Two MMSE image fusion algorithms are derived under different assumptions on the images. If we assume that pixels in the images are uncorrelated with their neighbors, the estimator becomes a point processor controlled by an adaptive gain expressed as the ratio of the local cross-covariance between the two images to the local variance of the panchromatic image. On the other hand, if we assume that pixels in a small block are stationary and correlated with one another, the estimator uses the locally stationary cross-covariance matrix between the two images and the auto-covariance matrix of the panchromatic image. For the second algorithm, we take a Fast Fourier Transform (FFT) based approach in order to avoid complex matrix computations and achieve a fast algorithm. Experimental results show that the proposed algorithms are superior to conventional algorithms according to visual and quantitative comparisons.
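The first, point-processor estimator can be sketched directly from its description: the fused pixel is the local multi-spectral mean plus an adaptive gain, the ratio of local cross-covariance to local panchromatic variance, times the panchromatic deviation. Function names, the window size, and the box-filter moments below are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def mmse_fuse(pan, ms, win=5, eps=1e-12):
    """Point-wise locally linear MMSE fusion (the 'uncorrelated neighbours'
    case):  x_hat = E[ms] + Cov(ms, pan)/Var(pan) * (pan - E[pan]),
    with all moments taken over a local win x win window."""
    def local_mean(a):
        r = win // 2
        p = np.pad(a, r, mode='edge')
        out = np.zeros_like(a, dtype=float)
        for di in range(win):
            for dj in range(win):
                out += p[di:di + a.shape[0], dj:dj + a.shape[1]]
        return out / win ** 2

    m_pan, m_ms = local_mean(pan), local_mean(ms)
    cov = local_mean(ms * pan) - m_ms * m_pan
    var = local_mean(pan * pan) - m_pan ** 2
    gain = cov / (var + eps)           # adaptive per-pixel gain
    return m_ms + gain * (pan - m_pan)
```

A sanity check on the gain: when a band is exactly a linear function of the panchromatic image, the estimator reproduces that band.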
Biomedical Image Processing
Study of muscular deformation based on surface slope estimation
M. Carli, M. Goffredo, M. Schmid, et al.
During contraction and stretching, muscles change shape and size, producing a deformation of skin tissues and a modification of the body segment shape. In human motion analysis, it is very important to take these phenomena into account. The aim of this work is the evaluation of skin and muscular deformation, and the modeling of body segment elastic behavior, obtained by analysing video sequences that capture a muscle contraction. The soft tissue modeling is accomplished by using triangular meshes that automatically adapt to the body segment during the execution of a static muscle contraction. The adaptive triangular mesh is built on reference points whose motion is estimated using nonlinear operators. Experimental results, obtained by applying the proposed method to several video sequences featuring isometric contraction of the biceps brachii, show the effectiveness of this technique.
Variational segmentation of x-ray image with overlapped objects
Image segmentation is a classical and challenging problem in image processing and computer vision. Most segmentation algorithms, however, do not consider overlapped objects. Due to the special characteristics of X-ray imaging, the overlapping of objects is very commonly seen in X-ray images and needs to be carefully dealt with. In this paper, we propose a novel energy functional to solve this problem. The Euler-Lagrange equation is derived, and the segmentation is converted to a front-propagation problem that can be efficiently solved by level set methods. We note that the proposed energy functional has no unique extremum and the solution relies on the initialization; thus, an initialization method is proposed to obtain satisfying results. Experiments on real data validate the proposed method.
Image segmentation for automated dental identification
Dental features are one of the few biometric identifiers that qualify for postmortem identification; therefore, the creation of an Automated Dental Identification System (ADIS), with goals and objectives similar to those of the Automated Fingerprint Identification System (AFIS), has received increased attention. As a part of ADIS, teeth segmentation from dental radiograph films is an essential step in the identification process. In this paper, we introduce a fully automated approach for teeth segmentation whose goal is to extract at least one tooth from the dental radiograph film. We evaluate our approach on both theoretical and empirical grounds, and we compare its performance with that of other approaches introduced in the literature. The results show that our approach exhibits the lowest failure rate and the highest optimality among all fully automated approaches introduced in the literature.
Ad hoc segmentation pipeline for microarray image analysis
Microarrays are a new class of biotechnology that help biologists extract new knowledge from biological experiments. Image analysis is devoted to extracting, processing, and visualizing image information, and for this reason it has also found application in microarray technology, where segmentation is a crucial step. In this paper we describe MISP (Microarray Image Segmentation Pipeline), a new segmentation pipeline for microarray image analysis. The pipeline uses a recent segmentation algorithm based on statistical analysis coupled with the K-means algorithm. The spot masks produced by MISP are used to determine spot information and quality measures. A software prototype system has been developed; it includes visualization, segmentation, and extraction of information and quality measures. Experiments show the effectiveness of the proposed pipeline both in terms of visual accuracy and measured quality values. Comparisons with existing solutions (e.g. Scanalyze) confirm the improvement with respect to previously published works.
A heuristic approach for the extraction of region and boundary of mammalian cells in bio-electric images
A robust segmentation and boundary tracking technique for the extraction of the region and boundary of mammalian cells in bioelectric images is presented. The proposed algorithm consists of four steps. The first step is an image enhancement process composed of low-pass filtering and local contrast enhancement. The second step employs a recursive global adaptive thresholding method, based on the statistical information of the contrast-enhanced image, to separate cells and some other image features from the background. Due to the effective image enhancement produced in the previous step, global adaptive thresholding is sufficient to provide satisfactory thresholding results. The third step in the segmentation process is composed of boundary tracking and morphological measurement for cell detection; a new efficient boundary tracking scheme is proposed. In the last step, non-cell objects are found and removed from the segmented image based on the morphological information obtained in the previous step.
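The recursive global adaptive thresholding step can be illustrated with an ISODATA-style iteration, in which the threshold is repeatedly reset to the midpoint of the means of the two classes it currently separates. This is a generic sketch under assumed names and tolerance, not the authors' implementation.

```python
import numpy as np

def adaptive_threshold(img, tol=0.5):
    """Recursive global thresholding: iterate
    t <- midpoint of the means of the two classes split by t,
    until the threshold stops moving (ISODATA-style)."""
    t = img.mean()  # initial guess: global mean intensity
    while True:
        lo, hi = img[img <= t], img[img > t]
        if lo.size == 0 or hi.size == 0:
            return t  # degenerate split: keep current threshold
        t_new = 0.5 * (lo.mean() + hi.mean())
        if abs(t_new - t) < tol:
            return t_new
        t = t_new
```

On a bimodal intensity histogram (cells vs. background after contrast enhancement), the fixed point lands between the two modes.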
An efficient multi-resolution GA approach to dental image alignment
Diaa Eldin Nassar, Mythili Ogirala, Donald Adjeroh, et al.
Automating the process of postmortem identification of individuals using dental records is receiving increased attention in forensic science, especially with the large volume of victims encountered in mass disasters. Dental radiograph alignment is a key step required for automating the dental identification process. In this paper, we address the problem of dental radiograph alignment using a Multi-Resolution Genetic Algorithm (MR-GA) approach. We use location and orientation information of edge points as features, and we assume that affine transformations suffice to restore geometric discrepancies between two images of a tooth. We efficiently search the 6D space of affine parameters using a GA applied progressively across multi-resolution image versions, and we use a Hausdorff distance measure to compute the similarity between a reference tooth and a query tooth subject to a candidate alignment transform. Testing results based on 52 teeth-pair images suggest that our algorithm converges to reasonable solutions in more than 85% of the test cases, with most of the error in the remaining cases due to excessive misalignments.
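A minimal sketch of the similarity measure inside the GA fitness: the symmetric Hausdorff distance between two sets of edge points, with the query set first mapped through a candidate affine transform. The function names and the n x 2 point representation are assumptions, not the paper's code.

```python
import numpy as np

def apply_affine(pts, A, t):
    """Apply an affine transform (2x2 matrix A plus translation t)
    to an n x 2 array of edge-point coordinates."""
    return pts @ np.asarray(A, dtype=float).T + np.asarray(t, dtype=float)

def hausdorff(a, b):
    """Symmetric Hausdorff distance between two n x 2 point sets:
    max over both directed distances."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    return max(d.min(axis=1).max(), d.min(axis=0).max())
```

A GA individual encodes the six affine parameters; its fitness would be `hausdorff(reference_pts, apply_affine(query_pts, A, t))`, minimized progressively from coarse to fine resolution.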
Algorithms and Systems
Deblending of the UV photometry in GALEX deep surveys using optical priors in the visible wavelengths
M. Guillaume, A. Llebaria, D. Aymeric, et al.
NASA's GALEX mission is collecting an unprecedented set of astronomical UV data in the far- and near-UV range. The telescope measures the full sky in a continuous automatic scan. Knowing the attitude data, local images are simultaneously extracted and corrected for smearing and instrumental effects. Final UV images show, by far, a lower resolution than their visible counterparts, which gives rise to blends, ambiguities, and misidentifications of the astronomical sources. Our purpose is to deduce from the UV image the UV photometry of the visible objects through a Bayesian approach, using the visible data (catalog and image) as the starting reference for the UV analysis. For feasibility reasons, as the deep-field images are very large, a segmentation procedure has been defined to manage the analysis in a tractable form. The present paper discusses all these aspects and details the full method and its performance.
Comparative study of logarithmic enhancement algorithms with performance measure
Performance measures of image enhancement are traditionally subjective and have difficulty quantifying the improvement made by an algorithm. In this paper, we present image enhancement measures and show how utilizing logarithmic-arithmetic-based addition, subtraction, and multiplication provides better results than previously used measures. In addition, to illustrate the performance of the developed measures, we present a comprehensive study of several image enhancement algorithms from all three domains: spatial, transform, and logarithmic.
Image denoising with block-matching and 3D filtering
Kostadin Dabov, Alessandro Foi, Vladimir Katkovnik, et al.
We present a novel approach to still image denoising based on effective filtering in a 3D transform domain, combining sliding-window transform processing with block-matching. We process blocks within the image in a sliding manner and utilize the block-matching concept by searching for blocks similar to the currently processed one. The matched blocks are stacked together to form a 3D array, and due to the similarity between them, the data in the array exhibit a high level of correlation. We exploit this correlation by applying a 3D decorrelating unitary transform and effectively attenuate the noise by shrinkage of the transform coefficients. The subsequent inverse 3D transform yields estimates of all matched blocks. After repeating this procedure for all image blocks in a sliding manner, the final estimate is computed as a weighted average of all overlapping block estimates. A fast and efficient algorithm implementing the proposed approach is developed. The experimental results show that the proposed method delivers state-of-the-art denoising performance, both in terms of objective criteria and visual quality.
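The core grouping step, block-matching followed by shrinkage in a decorrelating 3D transform, can be sketched with numpy. Here a 3D DFT stands in for the paper's unitary transform, and the function names, block size, and search radius are illustrative assumptions, not the authors' parameters.

```python
import numpy as np

def match_blocks(img, ref_xy, bs=8, search=8, n_best=4):
    """Collect the n_best blocks most similar (smallest L2 distance) to the
    reference block within a local search window, stacked into a 3D array."""
    H, W = img.shape
    y0, x0 = ref_xy
    ref = img[y0:y0 + bs, x0:x0 + bs]
    cands = []
    for y in range(max(0, y0 - search), min(H - bs, y0 + search) + 1):
        for x in range(max(0, x0 - search), min(W - bs, x0 + search) + 1):
            cands.append((np.sum((img[y:y + bs, x:x + bs] - ref) ** 2), y, x))
    cands.sort(key=lambda c: c[0])  # best (smallest distance) first
    return np.stack([img[y:y + bs, x:x + bs] for _, y, x in cands[:n_best]])

def shrink_stack(stack, thr):
    """Hard-threshold the 3D DFT coefficients of the stack and invert;
    small (noise-dominated) coefficients are suppressed."""
    c = np.fft.fftn(stack)
    c[np.abs(c) < thr] = 0
    return np.real(np.fft.ifftn(c))
```

In the full method, the inverse transform's block estimates are returned to their original positions and aggregated by a weighted average over all overlaps.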
An algorithm for the enhancement of images of large dynamic range
This paper introduces a novel algorithm for the enhancement of images of large dynamic range. In addition to dynamic range compression, the method provides control of brightness and contrast. The dynamic range compression is done in the spatial domain using the log transformation. Brightness and contrast control are done in the biorthogonal 9/7 wavelet transform domain. The algorithm can be easily added on to a JPEG2000 image compression system with only a modest increase in the computational complexity. Experimental results have shown comparable visual quality with the previously published Retinex algorithm.
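The spatial-domain log transformation used for dynamic range compression can be sketched as follows. This is a generic form, with the rescaling constant chosen so the maximum input maps back to full scale; it is not claimed to be the paper's exact formulation.

```python
import numpy as np

def log_compress(img, full_scale=255.0):
    """Dynamic range compression by log transformation:
    out = c * log(1 + in), with c chosen so max(in) -> full_scale."""
    img = np.asarray(img, dtype=float)
    c = full_scale / np.log1p(img.max())
    return c * np.log1p(img)
```

The mapping is monotone, so ordering of intensities is preserved while dark regions are expanded and highlights compressed; the paper then adjusts brightness and contrast separately in the biorthogonal 9/7 wavelet domain.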
Nonlinear image enhancement to improve face detection in complex lighting environment
Li Tao, Ming-Jung Seow, Vijayan K. Asari
A robust and efficient image enhancement technique has been developed to improve the visual quality of digital images that exhibit dark shadows due to the limited dynamic ranges of imaging and display devices, which are incapable of handling high dynamic range scenes. The proposed technique processes images in two separate steps: dynamic range compression and local contrast enhancement. Dynamic range compression is a neighborhood-dependent intensity transformation able to enhance the luminance in dark shadows while keeping the overall tonality consistent with that of the input image. In this manner, image visibility can be largely and properly improved without creating unnatural rendition. A neighborhood-dependent local contrast enhancement method is then used to enhance the image contrast following the dynamic range compression. Experimental results demonstrate that the proposed image enhancement technique has a strong capability to improve the performance of the convolutional face finder, compared to histogram equalization and multiscale Retinex with color restoration, without compromising the false alarm rate.
The application of image filters combined with the nonlinear regression analysis on optical autofocusing
Meng-En Lee, Wen-Jun Hsu, Tsung-Nan Lin
This paper presents an optical auto-focusing system implemented by integrating a real-time auto-focusing algorithm, an image capturing and processing module, and a stepper motor. Several image filters are tested and compared through the system for their effect on suppressing noise to accelerate the auto-focusing procedure. In addition, a nonlinear regression method is applied in the data analysis so that the system can quickly move the stepper motor to the focus. The concise and effective algorithm can be applied to digital cameras for auto-focusing with noise reduction.
Lip segmentation and tracking for facial palsy
MinJae Park, JongMo Seo, KwangSuk Park
We developed an asymmetry-analysis system for studying the rehabilitation progress of facial palsy patients. Using a standard PC imaging device, a captured 640x480 RGB image is converted into HSV space, and a lip-shape mask is extracted by thresholding. By taking five regions, one on the lip and four on the facial skin, reasonable thresholds are determined by fuzzy c-means clustering. The extreme points of the lip-shape mask are extracted to obtain the seeds for tracking. The segmented seed points are tracked by the iterative Lucas-Kanade method in pyramids at 30 fps and recorded simultaneously. To reduce the disk-writing load on the computer, we use asynchronous-mode file writing; the recordings are then transferred to and reviewed by a clinician. Tracking shows quite reliable results, but sometimes the tracked points drift along the lip line because of similar contrast. Therefore, the first strategy to improve the reliability of tracking is to use high-contrast points, such as the left and right extreme points of the lip shape. The second is to cluster points near these extreme points and eliminate outlying tracked points. The third is to recheck the lip shape, using lip segmentation, when the operator confirms the subject's maximal lip movement. The left and right tracking points are compared in the form of trajectory plots.
Kernel-based Learning for Detection and Shape Analysis
Nonlinear shape prior from kernel space for geometric active contours
Samuel Dambreville, Yogesh Rathi, Allen Tannenbaum
The Geometric Active Contour (GAC) framework, which utilizes image information, has proven to be quite valuable for performing segmentation. However, the use of image information alone often leads to poor segmentation results in the presence of noise, clutter, or occlusion. The introduction of shape priors in the contour evolution has proven to be an effective way to circumvent this issue. Recently, an algorithm was proposed in which linear PCA (principal component analysis) was performed on training sets of data and the shape statistics thus obtained were used in the segmentation process. This approach was shown to convincingly capture small variations in the shape of an object. However, linear PCA assumes that the distribution underlying the variation in shapes is Gaussian, an assumption that can be over-simplifying when shapes undergo complex variations. In the present work, we derive the steps for using kernel PCA in the GAC framework to introduce prior shape knowledge. Several experiments were performed using different training sets of shapes. Starting with any initial contour, we show that the contour evolves to adopt a shape that is faithful to the elements of the training set. The proposed shape prior method leads to better performance than the one based on linear PCA.
Kernel subspace matched target detectors
In this paper, we compare several detection algorithms that are based on spectral matched (subspace) filters. Nonlinear (kernel) versions of these spectral matched (subspace) detectors are also discussed and their performance is compared with the linear versions. These kernel-based detectors exploit the nonlinear correlations between the spectral bands that are ignored by the conventional detectors. Several well-known matched detectors, such as matched subspace detector, orthogonal subspace detector, spectral matched filter and adaptive subspace detector (adaptive cosine estimator) are extended to their corresponding kernel versions by using the idea of kernel-based learning theory. In kernel-based detection algorithms the data is implicitly mapped into a high dimensional kernel feature space by a nonlinear mapping which is associated with a kernel function. The detection algorithm is then derived in the feature space which is kernelized in terms of the kernel functions in order to avoid explicit computation in the high dimensional feature space. Experimental results based on simulated toy-examples and real hyperspectral imagery show that the kernel versions of these detectors outperform the conventional linear detectors.
Statistical shape analysis using kernel PCA
Yogesh Rathi, Samuel Dambreville, Allen Tannenbaum
Mercer kernels are used for a wide range of image and signal processing tasks such as de-noising, clustering, and discriminant analysis. These algorithms construct their solutions in terms of expansions in a high-dimensional feature space F. However, many applications like kernel PCA (principal component analysis) can be used more effectively if a pre-image of the projection in the feature space is available. In this paper, we propose a novel method to reconstruct a unique approximate pre-image of a feature vector and apply it to statistical shape analysis. We provide experimental results to demonstrate the advantages of kernel PCA over linear PCA for shape learning, which include, but are not limited to, the ability to learn and distinguish multiple geometries of shapes and robustness to occlusions.
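A minimal numpy sketch of the kernel PCA machinery the paper builds on: eigendecompose the centered Gram matrix of an RBF kernel, then project a new sample onto the leading components. The pre-image reconstruction itself, which is the paper's contribution, is not shown; the names and the kernel width are assumptions.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=0.1):
    """Gaussian RBF kernel matrix between row-sample arrays X and Y."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def kpca_fit(X, n_comp=2, gamma=0.1):
    """Kernel PCA fit: center the kernel matrix, keep the top eigenvectors,
    normalized so projections have unit variance along each component."""
    n = len(X)
    K = rbf_kernel(X, X, gamma)
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    w, V = np.linalg.eigh(J @ K @ J)
    idx = np.argsort(w)[::-1][:n_comp]       # largest eigenvalues first
    alphas = V[:, idx] / np.sqrt(np.maximum(w[idx], 1e-12))
    return K, alphas, gamma

def kpca_project(x, X, K, alphas, gamma):
    """Project a new sample x onto the kernel principal components."""
    k = rbf_kernel(x[None, :], X, gamma).ravel()
    kc = k - k.mean() - K.mean(axis=1) + K.mean()  # center the test kernel
    return kc @ alphas
```

For shape analysis, each row of `X` would be a vectorized shape representation; distance to the span of the leading components then measures how well a candidate shape matches the training set.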
Fuzzy Clustering
Segmentation and enhancement of digital copies using a new fuzzy clustering method
In this paper, we introduce a new system to segment and label document images into text, halftoned images, and background using a modified fuzzy c-means (FCM) algorithm. Each pixel is assigned a feature vector, extracted from edge information and gray level distribution. The feature pattern is then assigned to a specific region using the modified fuzzy c-means approach. In the process of minimizing the new objective function, the neighborhood effect acts as a regularizer and biases the solution towards piecewise-homogeneous labelings. Such a regularization is useful in segmenting scans corrupted by scanner noise.
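For reference, the standard (unmodified) fuzzy c-means iteration that the proposed objective function extends alternates membership and centroid updates. A compact numpy sketch, with illustrative parameter names; the paper's neighborhood regularization term is not included here.

```python
import numpy as np

def fcm(X, c=2, m=2.0, iters=50, seed=0):
    """Standard fuzzy c-means on an n x f feature array X:
    alternate the membership update u_ik ~ d_ik^(-2/(m-1)) (row-normalized)
    with the fuzzy-weighted centroid update."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=c, replace=False)]  # init at data points
    for _ in range(iters):
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        U = d ** (-2.0 / (m - 1))
        U /= U.sum(axis=1, keepdims=True)   # memberships sum to 1 per sample
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
    return centers, U
```

In the document segmentation setting, each pixel's feature vector (edge information plus grey-level statistics) is a row of `X`, and the fuzzifier `m` controls how soft the text/halftone/background labeling is.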
Independent Component Analysis, Adaboost for Recognition
2D/3D facial feature extraction
Hatice Çinar Akakin, Albert Ali Salah, Lale Akarun, et al.
We propose and compare three different automatic landmarking methods for near-frontal faces. The face information is provided as 480x640 gray-level images in addition to the corresponding 3D scene depth information. All three methods follow a coarse-to-fine strategy and use the 3D information in an assisting role. The first method employs a combination of principal component analysis (PCA) and independent component analysis (ICA) features to analyze the Gabor feature set. The second method uses a subset of DCT coefficients for template-based matching. These two methods employ SVM classifiers with polynomial kernel functions. The third method uses a mixture of factor analyzers to learn Gabor filter outputs. We contrast the localization performance separately with 2D texture and 3D depth information. Although the 3D depth information per se does not perform as well as texture images in landmark localization, it still has a beneficial role in eliminating the background and the false alarms.
Neural Networks Applications for Manifold Learning, Recognition, Color Perception, and Compression
Manifold of color perception: color constancy using a nonlinear line of attraction
Ming-Jung Seow, Vijayan K. Asari
In this paper, we propose the concept of a manifold of color perception, based on the observation that the perceived colors in a set of similar color images define a manifold in a high-dimensional space. Such a manifold representation can be learned from a few images of similar color characteristics. The learned manifold can then be used as a basis for color correction of images whose color perception differs from the previously learned one. To learn the manifold of color perception, we propose a novel learning algorithm based on a recurrent neural network. Unlike conventional recurrent neural network models, in which memory is stored at attractive fixed points at discrete locations in the state space, the dynamics of the proposed learning algorithm represents memory as a line of attraction. The region of convergence around the line of attraction is defined by the statistical characteristics of the training data. We demonstrate experimentally how the proposed manifold can be used to color-balance common lighting variations in the environment.
Toward content-based object recognition with image primitive
Guisong Wang, Jason Kinser
Content-based object recognition is very useful in many applications, such as medical image processing and diagnosis, and target identification in satellite remote sensing. For content-based object recognition, the representation of image segments is critical. Although there are already several approaches to representing image shapes, many of them have limitations arising from their insensitivity to deviations in object appearance. In this paper, an approach is proposed that constructs an image primitive database and represents images with a basis set extracted from that database. Cortical modeling is used to extract the basis set by isolating the inherent shapes within each image of an image database, and shapes are then defined from this basis set. In our approach, image segments are clustered based on similarity in perimeter and size rather than centroid-based metrics, by employing the fractional power filter; the clusters are represented by descriptive vectors that act as signatures and form the basis for shape representation. This approach has advantages in efficiency and in its sensitivity to the idiosyncratic nature of the shape distribution. For validation, we randomly selected a large number of images from web sites. The experiments indicate that describing shapes with this basis set is robust to alterations of the shape such as small occlusions, limited skew, and limited range.
Translation invariance in a network of oscillatory units
A. Ravishankar Rao, Guillermo A. Cecchi, Charles C. Peck, et al.
One of the important features of the human visual system is that it is able to recognize objects in a scale- and translation-invariant manner. However, achieving this desirable behavior through biologically realistic networks is a challenge. The synchronization of neuronal firing patterns has been suggested as a possible solution to the binding problem (where a biological mechanism is sought to explain how features that represent an object can be scattered across a network, and yet be unified). This observation has led to neurons being modeled as oscillatory dynamical units. It is possible for a network of these dynamical units to exhibit synchronized oscillations under the right conditions. Such network models have been applied to solve signal deconvolution or blind source separation problems. However, the use of the same networks to achieve properties that the visual system exhibits, such as scale and translational invariance, has not been fully explored. Some approaches investigated in the literature (Wallis, 1996) involve the use of non-oscillatory elements arranged in a hierarchy of layers. The objects presented are allowed to move, and the network utilizes a trace learning rule, where a time-averaged output value is used to perform Hebbian learning with respect to the input value. This is a modification of the standard Hebbian learning rule, which typically uses instantaneous values of the input and output. In this paper we present a network of oscillatory amplitude-phase units connected in two layers, with feedforward, feedback, and lateral connections; the units can exhibit synchronized oscillations. We have previously shown that such a network can segment the components of each input object that most contribute to its classification. Learning is unsupervised and based on a Hebbian update, and the architecture is very simple.
We extend the ability of this network to address the problem of translational invariance. We show that, by adopting a specific treatment of the phase values of the output layer, the network exhibits a translation-invariant object representation. The training scheme is as follows. The network is presented with an input, which then moves. During the motion, the amplitude and phase of the upper-layer units are not reset, but continue from their values prior to the object's appearance at the new position. Only the input layer is changed instantaneously to reflect the moving object. The network categorizes the translated object with the same label as the stationary object, thus establishing an invariant categorization with respect to translation. This is a promising result, as it uses the same framework of oscillatory units that achieves synchrony, and introduces motion to achieve translational invariance.
Support Vector Machine and Neural Networks for Face Recognition, Detection, and Classification
Support vector machine as digital image watermark detector
Patrick H. H. Then, Y. C. Wang
We treat digital watermark detection as a classification problem in image processing: watermarked images form the positive class and unwatermarked images the negative class. A Support Vector Machine (SVM) is used as the classifier of watermarked and unwatermarked digital images. Two watermarking schemes, Cox's spread spectrum (SS) and Singular Value Decomposition (SVD), are used to embed the watermark into digital images. These algorithms were selected based on their different levels of robustness to Stirmark attacks. The payload of the watermark is kept at the same number of bits for both algorithms. The SVM is trained with both watermarked and unwatermarked images. Receiver Operating Characteristic (ROC) graphs are plotted to assess the statistical detection behavior of both the correlation detector and the SVM classifier. We found that a straightforward application of SVM leads to a generalization problem, and we suggest remedies that preprocess the training data to achieve substantially better performance from the SVM classifier. Both watermarked and unwatermarked images are attacked under Stirmark and are then tested with the correlation detectors and the SVM classifier. A comparison of the ROC curves of the correlation detectors and the SVM classifier is performed to assess the accuracy of the SVM classifier relative to the correlation detectors. We found that the SVM classifier has higher robustness to Stirmark attacks.
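The ROC evaluation used to compare the correlation detector and the SVM classifier can be sketched generically: sweep a decision threshold over detector scores for watermarked (positive) and unwatermarked (negative) images, then integrate the curve. A minimal numpy version, with assumed names; this is not tied to either detector's score formula.

```python
import numpy as np

def roc_curve(scores_pos, scores_neg):
    """ROC points (false-positive rate, true-positive rate) obtained by
    sweeping the decision threshold over all observed detector scores."""
    thresholds = np.sort(np.concatenate([scores_pos, scores_neg]))[::-1]
    tpr = np.array([(scores_pos >= t).mean() for t in thresholds])
    fpr = np.array([(scores_neg >= t).mean() for t in thresholds])
    return fpr, tpr  # fpr is nondecreasing as the threshold drops

def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoid rule
    (fpr assumed nondecreasing, as produced by roc_curve)."""
    dx = np.diff(fpr)
    return float((dx * 0.5 * (tpr[1:] + tpr[:-1])).sum())
```

A perfectly separating detector (all positive scores above all negative scores) yields an AUC of 1.0; overlapping score distributions pull it toward 0.5.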
Neural networks approach to high vertical resolution atmospheric temperature profile retrieval from spaceborne high spectral resolution infrared sounder measurements
Deming Jiang, Chaohua Dong, Weisong Lu
AIRS (Atmospheric Infrared Sounder), NASA's first high-spectral-resolution sounding instrument, provides both new and improved measurements of clouds, atmosphere, land, and oceans, with the higher accuracy and resolution required by future weather and climate models. It largely remedies the inability of current sounders (e.g. HIRS/3) to obtain high vertical resolution in retrieved atmospheric profiles. In this paper, temperature profiles with 1 km vertical resolution at 100 pressure layers, from the surface up to 0.005 hPa, were retrieved on different spectral bands and on different types of terrain in the mid-latitude area, using a three-layer feed-forward neural network with the back-propagation algorithm. Results show that temperature profiles with an accuracy of better than 1 K in 1 km thick tropospheric layers can be achieved using AIRS data and the neural network method. The Qinghai-Tibet Plateau has a measurable impact on the retrieval accuracy, which depends on the spectral bands used in performing the retrievals. A promising approach to eliminating this effect is to apply additional predictors that are not satellite-observed (e.g. surface altitude).
Probabilistic multi-resolution human classification
Jun Tu, H. Ran
Recently there has been interest in using infrared cameras for human detection because of their sharply decreasing prices. The training data used in our work for developing the probabilistic template consist of images known to contain humans in different poses and orientations but having the same height. Multi-resolution templates, based on contours and edges, are constructed; this is done so that the model does not learn the intensity variations among the background pixels or among the foreground pixels. Each template at every level is then translated so that the centroid of the non-zero pixels matches the geometrical center of the image. After this normalization step, for each pixel of the template, the probability of its belonging to a pedestrian is calculated based on how frequently it appears as 1 in the training data. We also use gait periodicity to verify pedestrians, in a Bayesian manner, over the whole blob. The videos exhibited considerable variation in scenes, sizes of people, amount of occlusion, and background clutter. Preliminary experiments show the robustness of the approach.
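The template construction described above, centroid-normalize each binary training mask and then take the per-pixel frequency of 1s, can be sketched in numpy. This is an illustrative reconstruction under assumed names, not the authors' code.

```python
import numpy as np

def center_on_centroid(mask):
    """Translate a binary mask so the centroid of its nonzero pixels
    coincides with the image center; pixels shifted out of frame are dropped."""
    ys, xs = np.nonzero(mask)
    h, w = mask.shape
    dy = int(round(h / 2 - ys.mean()))
    dx = int(round(w / 2 - xs.mean()))
    out = np.zeros_like(mask)
    ny, nx = ys + dy, xs + dx
    keep = (ny >= 0) & (ny < h) & (nx >= 0) & (nx < w)
    out[ny[keep], nx[keep]] = 1
    return out

def probabilistic_template(masks):
    """Pixel-wise probability of 'pedestrian': frequency with which each
    pixel is 1 across centroid-normalized binary training masks."""
    return np.stack(masks).astype(float).mean(axis=0)
```

At detection time, the template's per-pixel probabilities score a candidate blob; the verification stage then checks gait periodicity over the blob's temporal sequence.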
Applications of Neural Networks for text spotting, character and face recognition
Key-text spotting in documentary videos using Adaboost
M. Lalonde, L. Gagnon
This paper presents a method for spotting key-text in videos, based on a cascade of classifiers trained with Adaboost. The video is first reduced to a set of key-frames. Each key-frame is then analyzed for its text content. Text spotting is performed by scanning the image with a variable-size window (to account for scale) within which simple features (mean/variance of grayscale values and x/y derivatives) are extracted in various sub-areas. Training builds classifiers using the most discriminant spatial combinations of features for text detection. The text-spotting module outputs a decision map of the size of the input key-frame showing regions of interest that may contain text suitable for recognition by an OCR system. Performance is measured against a dataset of 147 key-frames extracted from 22 documentary films of the National Film Board (NFB) of Canada. A detection rate of 97% is obtained with relatively few false alarms.
Research on classifying performance of SVMs with basic kernel in HCCR
Limin Sun, Zhaoxin Gai
Handwritten Chinese character recognition (HCCR) remains a difficult task to put into practical use, and an efficient classifier is very important for increasing the offline HCCR recognition rate. SVMs offer a theoretically well-founded approach to the automated learning of pattern classifiers from labeled data sets. As is well known, the performance of an SVM largely depends on the kernel function. In this paper, we investigated the classification performance of SVMs with various common kernels in HCCR. We found that, except for the sigmoid kernel, SVMs with polynomial, linear, RBF, and multiquadric kernels are all efficient classifiers for HCCR; their behavior differs only slightly, and, on the whole, the SVM with the multiquadric kernel is the best.
Face recognition based on HMM in compressed domain
In this paper we present an approach to face recognition based on Hidden Markov Models (HMMs) in the compressed domain. Each individual is modeled as an HMM trained on several face images. Sets of DCT coefficients, obtained from the original images through a sliding window, are clustered by the K-means method and used as observation vectors characterizing the face images. These clustered features are used to train the HMMs and thus obtain the system parameters. The proposed method is tested on both the Yale and ORL face databases. Compared to other HMM-based methods reported so far on these two databases, the proposed method achieves a better recognition rate at lower computational cost.
Artificial neural networks and decision tree classifier performance on medium resolution ASTER data to detect gully networks in southern Italy
A. Ghaffari, G. Priestnall, M. L. Clarke
Gully erosion has the potential to cause significant land degradation, yet the scale of gully features means that changes are difficult to map. Here we describe the application of ASTER imagery, surface modelling, and land cover information to detect gully erosion networks with the maximum obtainable accuracy. A grey-level co-occurrence matrix (GLCM) texture analysis method was applied to the ASTER bands as one of the input layers. GLCM outputs were combined with geomorphological input layers such as flow accumulation, slope angle, and aspect, which were derived from an ASTER-based digital elevation model (DEM); the DEM, with 15-meter resolution, was prepared from L1A data. Artificial neural network (ANN) and decision tree (DT) approaches were used to classify the input layers for five sample areas, differentiating gullies from landscape areas with no gullies. We found that the DT method classified the image with the higher accuracy (85% overall) in comparison with the ANN.
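A compact sketch of the GLCM texture feature used as an input layer: quantize the image to a few grey levels, count co-occurring level pairs at a fixed pixel displacement, and derive a statistic such as contrast. The quantization scheme, displacement, and names are assumptions for illustration, not the study's exact settings.

```python
import numpy as np

def glcm(img, dx=1, dy=0, levels=8):
    """Grey-level co-occurrence matrix for one displacement (dx, dy),
    normalized to a joint probability table over quantized levels."""
    q = (img.astype(float) / (img.max() + 1e-12) * (levels - 1)).astype(int)
    h, w = q.shape
    # Paired slices: a[i] and b[i] are pixels separated by (dy, dx).
    a = q[max(0, -dy):h - max(0, dy), max(0, -dx):w - max(0, dx)]
    b = q[max(0, dy):h - max(0, -dy), max(0, dx):w - max(0, -dx)]
    P = np.zeros((levels, levels))
    np.add.at(P, (a.ravel(), b.ravel()), 1)  # unbuffered co-occurrence counts
    return P / P.sum()

def glcm_contrast(P):
    """GLCM contrast: expected squared grey-level difference of pixel pairs."""
    i, j = np.indices(P.shape)
    return float(((i - j) ** 2 * P).sum())
```

Smooth terrain concentrates mass on the GLCM diagonal (low contrast), while the fine-scale dissection typical of gullied areas spreads mass off-diagonal (high contrast), which is what makes such statistics useful inputs to the ANN and DT classifiers.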