Proceedings Volume 9019

Image Processing: Algorithms and Systems XII


View the digital version of this volume at the SPIE Digital Library.

Volume Details

Date Published: 18 March 2014
Contents: 10 Sessions, 26 Papers, 0 Presentations
Conference: IS&T/SPIE Electronic Imaging 2014
Volume Number: 9019

Table of Contents


All links to SPIE Proceedings will open in the SPIE Digital Library.
  • Front Matter: Volume 9019
  • Image Filtering and Enhancement
  • Interpolation, Motion Estimation, and Inpainting
  • Image Denoising I
  • Image Processing Systems
  • Image Analysis
  • Image Denoising II
  • Invited Presentation
  • Special Session in Memory of Til Aach
  • Interactive Paper Session
Front Matter: Volume 9019
Front Matter: Volume 9019
This PDF file contains the front matter associated with SPIE Proceedings Volume 9019, including the Title Page, Copyright information, Table of Contents, Invited Panel Discussion, and Conference Committee listing.
Image Filtering and Enhancement
On the passband of head-parallax displays
Atanas Boev, Robert Bregovic
We present a methodology for assessing the objective visual quality of multi-view and light-field displays. We consider the display as a signal processing channel and study its ability to deliver a signal while introducing negligible distortions. We start by creating a model of a display, which represents its output as a set of rays in a specific (x, y, o) coordinate space. We have created a simulation framework that uses the model to render the expected output of the display for a given observation position. The framework employs an image analysis block, which aims to predict the perceptual effect of the introduced distortions and to judge whether the original signal is still predominant in the output. Using the framework, we can test a large set of signals against the display model and find the ones that are represented with sufficiently low distortion. We use test signals containing gradually changing frequency components and use the test results to build the so-called 3D passband of the display. The 3D passband can be used as a quantitative measure of the display’s ability to faithfully represent image details, and its size is indicative of the spatial and angular resolution of the display. We created two display models to serve as example cases for our framework: one represents a typical multiview display, and the other a typical projection-based light-field display. We estimate the passband for each display model and present the results. The resulting passbands suggest that, for a given “ray budget”, the ray distribution typical for light-field displays results in a wider and more uniform passband than in the case of multiview displays.
A novel method of filtration by the discrete heap transforms
In this paper, we describe a method of filtering the frequency components of signals and images by using the discrete signal-induced heap transforms (DsiHT), which are composed of elementary rotations, or Givens transformations. The transforms are fast because of the simple form of the decomposition of their matrices, and they can be applied to signals of any length. Fast algorithms for calculating the direct and inverse heap transforms do not depend on the length of the processed signals. Due to the construction of the heap transform, if the input signal contains an additive component that is similar to the generator, this component is eliminated in the transform of the signal, while the remaining components are preserved; the energy of this component is concentrated in the first point only. In the particular case when such a component is a wave of a given frequency, this wave is eliminated by the heap transform. Different examples of filtering signals and images with the DsiHT are described and compared with the well-known Fourier-transform method.
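As an illustrative sketch (not the authors' exact DsiHT construction), the following Python snippet shows the underlying idea of a signal-induced transform built from Givens rotations: the rotation angles are chosen so that the generator's energy is packed into the first sample, so any input containing that generator as an additive component has it removed from all remaining samples.

```python
import numpy as np

def heap_rotations(generator):
    """Angles of Givens rotations that pack the generator's energy into sample 0."""
    g = generator.astype(float).copy()
    angles = []
    for k in range(1, len(g)):
        theta = np.arctan2(g[k], g[0])
        c, s = np.cos(theta), np.sin(theta)
        g[0], g[k] = c * g[0] + s * g[k], 0.0   # energy moves to position 0
        angles.append(theta)
    return angles

def apply_heap(x, angles):
    """Apply the same sequence of rotations to any signal of the same length."""
    y = x.astype(float).copy()
    for k, theta in enumerate(angles, start=1):
        c, s = np.cos(theta), np.sin(theta)
        y[0], y[k] = c * y[0] + s * y[k], -s * y[0] + c * y[k]
    return y

g = np.array([1.0, 2.0, 3.0])
x = 2.0 * g + np.array([0.0, 0.5, -0.5])      # generator plus a residual
print(apply_heap(x, heap_rotations(g)))       # samples 1..N-1 no longer contain g
```

By linearity, the generator component of the input ends up entirely in the first output sample, while the remaining samples carry only the (rotated) residual.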
Alpha-rooting method of color image enhancement by discrete quaternion Fourier transform
This paper presents a novel method for color image enhancement based on the discrete quaternion Fourier transform. We choose the quaternion Fourier transform because it is well-suited for color image processing applications: it processes all three color components (R, G, B) simultaneously, it captures the inherent correlation between the components, it does not generate color artifacts or blending, and it does not require an additional color restoration process. We also introduce a new CEME measure to evaluate the quality of the enhanced color images. Preliminary results show that α-rooting based on the quaternion Fourier transform outperforms other enhancement methods, such as the Fourier-transform-based α-rooting algorithm and the Multiscale Retinex. Moreover, the new method not only provides true color fidelity for poor quality images but also averages the color components to gray values for balancing colors. It can be used to enhance edge information and sharp features in images, as well as to enhance even low-contrast images. The proposed algorithms are simple to apply and design, which makes them very practical in image enhancement.
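For orientation, here is a minimal sketch of classical α-rooting on a single-channel image via the 2D FFT; the paper instead applies α-rooting to the discrete quaternion Fourier transform of all three colour channels jointly, which this simplified example does not reproduce.

```python
import numpy as np

def alpha_rooting(image, alpha=0.9):
    """Classical alpha-rooting: keep the phase, raise the spectrum magnitude to alpha."""
    F = np.fft.fft2(image.astype(float))
    mag = np.abs(F)
    F_enh = (mag ** alpha) * np.exp(1j * np.angle(F))   # 0 < alpha < 1 boosts details
    out = np.real(np.fft.ifft2(F_enh))
    out -= out.min()
    return 255.0 * out / (out.max() + 1e-12)            # rescale to display range
```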
Interpolation, Motion Estimation, and Inpainting
Edge preserving motion estimation with occlusions correction for assisted 2D to 3D conversion
Petr Pohl, Michael Sirotenko, Ekaterina Tolstaya, et al.
In this article we propose high-quality motion estimation based on a variational optical flow formulation with a non-local regularization term. To improve motion in occlusion areas, we introduce occlusion motion inpainting based on 3-frame motion clustering. The variational formulation of optical flow has proved very successful; however, a global optimization of the cost function can be time consuming. To achieve acceptable computation times, we adapted an algorithm that optimizes a convex cost function in a coarse-to-fine pyramid strategy and is suitable for implementation on modern GPU hardware. We also introduce two simplifications of the cost function that significantly decrease computation time with an acceptable loss of quality. For motion-clustering-based motion inpainting in occlusion areas, we introduce an effective method of occlusion-aware joint 3-frame motion clustering using the RANSAC algorithm. Occlusion areas are inpainted with the motion model taken from the cluster that shows consistency in the opposite direction. We tested our algorithm on the Middlebury optical flow benchmark, where we scored around 20th position while being one of the fastest methods near the top. We have also successfully used this algorithm in a semi-automatic 2D-to-3D conversion tool for spatio-temporal background inpainting, automatic adaptive key frame detection, and key point tracking.
Exemplar-based inpainting using local binary patterns
V. V. Voronin, V. I. Marchuk, N. V. Gapon, et al.
This paper focuses on a novel image reconstruction method based on a modified exemplar-based technique. The basic idea is to find an example (patch) in the image using local binary patterns and to replace the missing (‘lost’) data with it. We propose to use multiple criteria for the patch similarity search, since in practice existing exemplar-based methods often produce unsatisfactory results. The search criterion combines several terms, including the Euclidean metric for pixel brightness and the chi-squared histogram matching distance for local binary patterns. The combined use of textural and geometric characteristics together with color information allows a more informative description of the patches. The texture synthesis method proposed by Efros and Freeman is utilized for patch restoration; it optimizes the overlap region between patches using a minimum-error boundary cut. Several examples considered in this paper show the effectiveness of the proposed approach for large object removal as well as for the recovery of small regions on several test images.
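A hedged sketch of such a combined patch-similarity criterion is shown below; the LBP operator, the 50/50 weighting, and the normalisation are illustrative choices and not taken from the paper.

```python
import numpy as np

def lbp_histogram(patch, bins=256):
    """8-neighbour LBP codes of the interior pixels, as a normalised histogram."""
    c = patch[1:-1, 1:-1]
    shifts = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros_like(c, dtype=np.int32)
    for bit, (dy, dx) in enumerate(shifts):
        nb = patch[1 + dy:patch.shape[0] - 1 + dy, 1 + dx:patch.shape[1] - 1 + dx]
        code |= ((nb >= c).astype(np.int32) << bit)
    hist = np.bincount(code.ravel(), minlength=bins).astype(float)
    return hist / (hist.sum() + 1e-12)

def patch_distance(p, q, w_texture=0.5):
    """Combined criterion: Euclidean on brightness plus chi-squared on LBP histograms."""
    d_eucl = np.linalg.norm(p.astype(float) - q.astype(float)) / p.size
    hp, hq = lbp_histogram(p), lbp_histogram(q)
    d_chi2 = 0.5 * np.sum((hp - hq) ** 2 / (hp + hq + 1e-12))
    return (1 - w_texture) * d_eucl + w_texture * d_chi2
```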
Local feature descriptor based on 2D local polynomial approximation kernel indices
A. I. Sherstobitov, V. I. Marchuk, D. V. Timofeev, et al.
A texture descriptor based on a set of indices of degrees of local approximating polynomials is proposed in this paper. First, a method to construct 2D local polynomial approximation kernels (k-LPAp) for arbitrary polynomials of degree p is presented. An image is split into non-overlapping patches, which are reshaped into one-dimensional source vectors and convolved with the polynomial approximation kernels of various degrees. As a result, a set of approximations is obtained. For each element of the source vector, these approximations are ranked according to the difference between the original and approximated values. The set of indices of polynomial degrees forms a local feature. This procedure is repeated for each pixel. Finally, the proposed texture descriptor is obtained from the frequency histogram of all obtained local features. A nearest-neighbor classifier with the chi-squared distance metric is used to evaluate the performance of the introduced descriptor. The accuracy of texture classification is evaluated on the following datasets: Brodatz, KTH-TIPS, KTH-TIPS2-b, and Columbia-Utrecht (CUReT), with respect to different methods of texture analysis and classification. The results of this comparison show that the proposed method is competitive with recent statistical approaches such as local binary patterns (LBP), local ternary patterns, completed LBP, Weber’s local descriptor, and the VZ algorithms (VZ-MR8 and VZ-Joint). At the same time, on the KTH-TIPS2-b and KTH-TIPS datasets, the proposed method is slightly inferior to some of the state-of-the-art methods.
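The following simplified sketch conveys the flavour of degree-index features in 1D; it keeps only the single best-fitting degree per sample and fits ordinary least-squares polynomials instead of the paper's k-LPAp convolution kernels, so it is an assumption-laden stand-in rather than the actual descriptor.

```python
import numpy as np

def lpa_index_feature(vec, max_degree=3):
    """For each sample, pick the polynomial degree (0..max_degree) fitting it best."""
    x = np.arange(len(vec))
    approx = []
    for d in range(max_degree + 1):
        coeffs = np.polyfit(x, vec, d)
        approx.append(np.polyval(coeffs, x))
    errors = np.abs(np.stack(approx) - vec)        # shape: (degrees, samples)
    best_degree = np.argmin(errors, axis=0)        # index of best degree per sample
    hist = np.bincount(best_degree, minlength=max_degree + 1).astype(float)
    return hist / hist.sum()                       # normalised degree-index histogram
```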
Image Denoising I
Metric performance in similar blocks search and their use in collaborative 3D filtering of grayscale images
Similar block (patch) search plays an important role in image processing. However, many factors make this search problematic and lead to errors. Noise in images, arising from poor acquisition conditions or other sources, is one of the main factors. The performance of similar patch search might degrade dramatically if the noise level is high and/or if the noise is not additive, white, and Gaussian. In this paper, we consider the influence of similarity metrics (distances) on search performance. We demonstrate that robustness of the similarity metric is a crucial issue for the performance of similarity search. Two models of additive noise are used: AWGN and spatially correlated noise, with a wide set of noise standard deviations. To investigate metric performance, five test images with artificially inserted groups of identical blocks are used. Metric effectiveness is evaluated for nine different metrics (including several unconventional ones) in three domains (one spatial and two spectral). It is shown that the conventional Euclidean metric may not be the best choice, depending on the noise properties and the data processing domain. After establishing the best metrics, they are exploited within non-local image denoising, namely the BM3D filter. This filter is applied to intensity images of the TID2008 database. It is demonstrated that the use of more robust metrics instead of the classical (Euclidean) one in the BM3D filter improves similar block search and, as a result, provides better image denoising results for the case of spatially correlated noise.
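The comparison below is a small illustrative sample of candidate block distances (Euclidean, city-block, correlation based); the paper evaluates nine metrics, including spectral-domain ones, which are not reproduced here.

```python
import numpy as np

def block_distances(a, b):
    """A few candidate similarity metrics between two image blocks."""
    a, b = a.astype(float).ravel(), b.astype(float).ravel()
    d = a - b
    return {
        "euclidean": np.sqrt(np.sum(d ** 2)),          # classical L2 distance
        "cityblock": np.sum(np.abs(d)),                # L1, more robust to outliers
        "correlation": 1.0 - np.corrcoef(a, b)[0, 1],  # insensitive to gain/offset
    }
```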
Generalized non-local means filtering for image denoising
Sudipto Dolui, Iván C. Salgado Patarroyo, Oleg V. Michailovich
Non-local means (NLM) filtering has been shown to outperform alternative denoising methodologies under the model of additive white Gaussian noise contamination. Recently, several theoretical frameworks have been developed to extend this class of algorithms to more general types of noise statistics. However, many of these frameworks are specifically designed for a single noise contamination model, and are far from optimal across varying noise statistics. The NLM filtering techniques rely on the definition of a similarity measure, which quantifies the similarity of two neighbourhoods along with their respective centroids. The key to the unification of the NLM filter for different noise statistics lies in the definition of a universal similarity measure which is guaranteed to provide favourable performance irrespective of the statistics of the noise. Accordingly, the main contribution of this work is to provide a rigorous statistical framework to derive such a universal similarity measure, while highlighting some of its theoretical and practical favourable characteristics. Additionally, the closed form expressions of the proposed similarity measure are provided for a number of important noise scenarios and the practical utility of the proposed similarity measure is demonstrated through numerical experiments.
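To make the role of the similarity measure concrete, the following sketch implements per-pixel non-local means with a pluggable similarity function; the default Gaussian-noise kernel and the parameter values are illustrative, not the universal measure derived in the paper.

```python
import numpy as np

def nlm_pixel(image, y, x, patch=3, search=7, h=10.0, similarity=None):
    """Estimate one pixel by non-local means with a pluggable similarity measure."""
    if similarity is None:
        # default: squared-difference kernel, appropriate for Gaussian noise
        similarity = lambda p, q: np.exp(-np.mean((p - q) ** 2) / h ** 2)
    r, s = patch // 2, search // 2
    pad = np.pad(image.astype(float), r + s, mode="reflect")
    yc, xc = y + r + s, x + r + s
    ref = pad[yc - r:yc + r + 1, xc - r:xc + r + 1]
    num = den = 0.0
    for dy in range(-s, s + 1):
        for dx in range(-s, s + 1):
            cand = pad[yc + dy - r:yc + dy + r + 1, xc + dx - r:xc + dx + r + 1]
            w = similarity(ref, cand)          # weight from neighbourhood similarity
            num += w * pad[yc + dy, xc + dx]
            den += w
    return num / den
```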
Image Processing Systems
Calibration of a dual-PTZ-camera system for stereo vision based on parallel particle swarm optimization method
Yau-Zen Chang, Huai-Ming Wang, Shih-Tseng Lee M.D., et al.
This work investigates the calibration of a stereo vision system based on two PTZ (Pan-Tilt-Zoom) cameras. As the accuracy of the system depends not only on the intrinsic parameters but also on the geometric relationships between the rotation axes of the cameras, the major concern is the development of an effective and systematic way to obtain these relationships. We derive a complete geometric model of the dual-PTZ-camera system and propose a calibration procedure for the intrinsic and external parameters of the model. The calibration method is based on Zhang’s approach, using an augmented checkerboard composed of eight small checkerboards, and is formulated as an optimization problem solved by an improved particle swarm optimization (PSO) method. Two Sony EVI-D70 PTZ cameras were used for the experiments. The root-mean-square errors (RMSE) of corner distances in the horizontal and vertical directions are 0.192 mm and 0.115 mm, respectively. The RMSE of overlapped points between the small checkerboards is 1.3958 mm.
Probabilistic person identification in TV news programs using image web database
The automatic labeling of faces in TV broadcasting is still a challenging problem. The high variability in viewpoints, facial expressions, general appearance, and lighting conditions, as well as occlusions, rapid shot changes, and camera motions, produces significant variations in image appearance. The application of automatic tools for face recognition is not yet fully established, and human intervention is still needed. In this paper, we deal with automatic face recognition in TV broadcasting programs. The goal of the proposed method is to identify the presence of a specific person in a video by means of a set of images downloaded from the Web using a specific search key.
Spatial-temporal features of thermal images for Carpal Tunnel Syndrome detection
Disorders associated with repeated trauma account for about 60% of all occupational illnesses, Carpal Tunnel Syndrome (CTS) being the most frequently consulted today. Infrared Thermography (IT) has come to play an important role in the field of medicine. IT is non-invasive and detects diseases based on measuring temperature variations. IT represents a possible alternative to the prevalent methods for diagnosis of CTS (i.e., nerve conduction studies and electromyography). This work presents a set of spatial-temporal features extracted from thermal images acquired from healthy and ill patients. Support Vector Machine (SVM) classifiers test this feature space with Leave-One-Out (LOO) validation error. The results of the proposed approach show linear separability and lower validation errors when compared to features used in previous works that do not account for spatial variability of temperature.
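The evaluation protocol described above can be sketched as follows; the feature matrix and labels below are random placeholders standing in for the paper's thermal features and patient labels.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import LeaveOneOut, cross_val_score

# Placeholder data: X would hold the spatial-temporal thermal features
# (one row per patient) and y the healthy/CTS labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 8))
y = rng.integers(0, 2, size=30)

clf = SVC(kernel="linear")                     # linear separability is what the paper reports
scores = cross_val_score(clf, X, y, cv=LeaveOneOut())
print("LOO validation error:", 1.0 - scores.mean())
```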
A speed-optimized RGB-Z capture system with improved denoising capabilities
Aleksandra Chuchvara, Mihail Georgiev, Atanas Gotchev
We have developed an end-to-end system for 3D scene sensing that combines a conventional high-resolution RGB camera with a low-resolution Time-of-Flight (ToF) range sensor. The system comprises modules for range data denoising, data re-projection, and non-uniform to uniform up-sampling, and aims at composing high-resolution 3D video output for driving auto-stereoscopic 3D displays in real time. In our approach, the ToF sensor is set to work with a short integration time with the aim of increasing the capture speed and decreasing the amount of motion artifacts. However, the reduced integration time leads to noisy range images. We specifically address the noise reduction problem by performing a modification of non-local means filtering in the spatio-temporal domain. Time-consecutive range images are utilized not only for efficient denoising but also for accurate non-uniform to uniform up-sampling on the high-resolution RGB grid. The reflectance signal of the ToF sensor is used to provide confidence-type feedback to the denoising module, where a new adaptive averaging is proposed to effectively handle motion artifacts. As far as the non-uniform to uniform resampling of range data is concerned, we have developed two alternative solutions: one relying entirely on GPU power and another applicable to any general platform. The latter method employs an intermediate virtual range camera re-centering, after which the resampling process reduces to a 2D interpolation performed within the low-resolution grid. We demonstrate real-time performance of the system working in a low-power regime.
Image Analysis
Tri and tetrachromatic metamerism
Alfredo Restrepo
Two light beams that are seen as having the same colour, but possibly having different spectra, are said to be metameric. The colour of a light beam is based on the outputs of a set of photodetectors with different spectral responses, and metamerism results when such a set of photodetectors is unable to resolve two spectra. Metamerism can be characterized in terms of the L, M and S responses of the receptoral layer in the retina, or in terms of the CIE X, Y and Z curves, etc. Working with linear spaces, metamerism is formalized in terms of the kernel of a certain linear transformation; we derive a basis of localized-support functions for this kernel. We consider metamerism in general, and in the trichromatic and tetrachromatic cases in particular. Applications are in computer vision, computational photography and satellite imagery, for example. We make a case for hue metamerism, where the luminance and saturation of two colours of the same hue may differ.
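A small numerical illustration of the kernel view of metamerism: with a hypothetical 3 × N matrix of sensor responses, any element of its null space can be added to a spectrum without changing the sensor outputs. The Gaussian sensitivities below are invented for the example and are not the paper's data.

```python
import numpy as np
from scipy.linalg import null_space

# Hypothetical 3 x N sensor matrix: rows ~ L, M, S responses sampled at N wavelengths.
wavelengths = np.linspace(400, 700, 31)
S = np.stack([np.exp(-((wavelengths - mu) / 40.0) ** 2) for mu in (560, 530, 420)])

K = null_space(S)                     # basis of the metameric "black" space (kernel)
spectrum = np.ones_like(wavelengths)  # a flat test spectrum
metamer = spectrum + 0.3 * K[:, 0]    # different spectrum, same sensor responses
print(np.allclose(S @ spectrum, S @ metamer))   # True: the two spectra are metameric
```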
Refractory neural nets and vision
Thomas C. Fall
Biological understandings have served as the basis for new computational approaches. A prime example is artificial neural nets which are based on the biological understanding of the trainability of neural synapses. In this paper, we will investigate features of the biological vision system to see if they can also be exploited. These features are 1) the neuron’s refractory period - the period of time after the neuron fires before it can fire again and 2) the ocular microtremor which moves the retinal neural array relative to the image. The short term memory due to the refractory period allows the before and after movement views to be compared. This paper will discuss the investigation of the implications of these two features.
Statistical shape analysis for image understanding and object recognition
In order to analyze the effects of noise on certain recognition and reconstruction algorithms, including the sensitivity of the so-called object/image equations and object/image metrics, one needs to study probability and statistics on shape spaces. Work along these lines was pioneered by Kendall and has been the subject of many papers over the last twenty years. In this paper we extend some of those results to affine shape spaces and then use them to relate distributions on object shapes to corresponding distributions on image shapes.
2D-fractal based algorithms for nanoparticles characterization
Fractal geometry concerns the study of non-Euclidean geometrical figures generated by a recursive sequence of mathematical operations. The proposed 2D-fractal approach was applied to characterize the image structure and texture generated by fine and ultra-fine particles impacting on a flat surface. The work was developed with reference to particles usually produced by ultra-fine milling aimed at generating nano-particle populations. In order to generate different particle populations for the study, specific milling runs were performed, adopting different milling actions and different materials, both in terms of original size class distribution and chemical-physical attributes. The aim of the work was to develop a simple, reliable and low-cost analytical set of procedures able to establish correlations between the fractal characteristics of the detected particles and their milling-induced properties (i.e., size class distribution, shape, surface properties, etc.). Such logic should constitute the core of a control engine intended to provide full monitoring of the milling process as well as to establish correlations between operative parameters and the characteristics of the feed and the resulting products.
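As a generic example of a 2D-fractal measurement (not the paper's specific procedure), the sketch below estimates the box-counting dimension of a binary particle-impact image.

```python
import numpy as np

def box_counting_dimension(binary_image, box_sizes=(2, 4, 8, 16, 32)):
    """Estimate the box-counting fractal dimension of a binary image."""
    counts = []
    for s in box_sizes:
        h, w = binary_image.shape
        trimmed = binary_image[:h - h % s, :w - w % s]
        blocks = trimmed.reshape(h // s, s, w // s, s)
        occupied = blocks.any(axis=(1, 3)).sum()        # boxes containing any particle pixel
        counts.append(max(int(occupied), 1))            # guard against empty scales
    # N(s) ~ s^(-D), so the slope of log N versus log s gives -D
    slope, _ = np.polyfit(np.log(box_sizes), np.log(counts), 1)
    return -slope
```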
Image Denoising II
Non-stationary noise estimation using dictionary learning and Gaussian mixture models
James M. Hughes, Daniel N. Rockmore, Yang Wang
Stationarity of the noise distribution is a common assumption in image processing. This assumption greatly simplifies denoising estimators and other model parameters and consequently assuming stationarity is often a matter of convenience rather than an accurate model of noise characteristics. The problematic nature of this assumption is exacerbated in real-world contexts, where noise is often highly non-stationary and can possess time- and space-varying characteristics. Regardless of model complexity, estimating the parameters of noise distributions in digital images is a difficult task, and estimates are often based on heuristic assumptions. Recently, sparse Bayesian dictionary learning methods were shown to produce accurate estimates of the level of additive white Gaussian noise in images with minimal assumptions. We show that a similar model is capable of accurately modeling certain kinds of non-stationary noise processes, allowing for space-varying noise in images to be estimated, detected, and removed. We apply this modeling concept to several types of non-stationary noise and demonstrate the model’s effectiveness on real-world problems, including denoising and segmentation of images according to noise characteristics, which has applications in image forensics.
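For intuition only, here is a crude blockwise noise-level map based on the median absolute deviation of a high-pass residual; this is a simple baseline illustrating space-varying noise, not the dictionary-learning / Gaussian-mixture estimator developed in the paper.

```python
import numpy as np

def local_noise_map(image, block=32):
    """Blockwise robust noise-level estimate (MAD of a high-pass residual)."""
    img = image.astype(float)
    # residual after removing a local average: mostly noise in smooth regions
    residual = img - 0.25 * (np.roll(img, 1, 0) + np.roll(img, -1, 0)
                             + np.roll(img, 1, 1) + np.roll(img, -1, 1))
    h, w = img.shape
    sigma = np.zeros((h // block, w // block))
    for i in range(sigma.shape[0]):
        for j in range(sigma.shape[1]):
            r = residual[i * block:(i + 1) * block, j * block:(j + 1) * block]
            sigma[i, j] = 1.4826 * np.median(np.abs(r - np.median(r)))
    return sigma   # one sigma estimate per block; varies spatially for non-stationary noise
```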
Weighted denoising for phase unwrapping
Satoshi Tomioka, Shusuke Nishiyama
In order to measure the optical distance of an object that changes rapidly over time, the Fourier transform method is appropriate because it requires only a single interferogram. In measurements of such fast phenomena, the thermal noise of the camera that records the interferogram results in a significant error, and the signal becomes weak owing to the short exposure time of the camera. When the noise level is high, a process to denoise the wrapped phase should be added before phase unwrapping in order to obtain the optical distance distribution. The thermal noise has a uniform spatial distribution; however, the signal depends on the profile of the wave incident on the interferometer. This means that the signal-to-noise ratio has a spatial distribution. This paper proposes a denoising method that takes into account a weight for the data that depends on the signal intensity distribution. In order to determine the denoised phase, two cost functions are examined. One is a complex-valued cost function that ensures convergence of the iterative method to the stationary point; however, it is not proved that both the real part and the imaginary part are minimized at the stationary point. The other is a real-valued cost function that cannot ensure convergence but is minimized at the stationary point. A numerical simulation demonstrates the validity of the weighted denoising and the applicability of the cost functions.
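A simpler stand-in for weighted wrapped-phase denoising (not the paper's cost-function formulation): average the phasors exp(i·φ) over a local window with intensity-dependent weights and take the angle of the result.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def weighted_phase_denoise(wrapped_phase, weight, size=5):
    """Weighted local averaging of wrapped phase via its phasor representation."""
    weight = weight.astype(float)                    # e.g. local signal intensity
    phasor = weight * np.exp(1j * wrapped_phase)
    num = uniform_filter(phasor.real, size) + 1j * uniform_filter(phasor.imag, size)
    den = uniform_filter(weight, size) + 1e-12
    return np.angle(num / den)                       # denoised wrapped phase
```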
A sliding-window transform-domain technique for denoising of DSPI phase maps
Assen A. Shulev, Atanas Gotchev
We have developed a technique for denoising speckle pattern fringes which makes use of an overcomplete expansion in the transform domain combined with suitable thresholding of transform-domain coefficients, related to the speckle size. In this paper, we modify the technique to work on noisy phase maps obtained by Phase Shifting Digital Speckle Pattern Interferometry (PSDSPI). The modified version utilizes a complex-valued representation of the phase maps and consequently employs the discrete Fourier transform in sliding-window mode to obtain the sought overcomplete expansion. We discuss issues related to the window size and local threshold value selection. We compare this approach with two state-of-the-art denoising techniques on simulated speckle pattern phase maps. Furthermore, we demonstrate the performance of our technique for denoising real phase maps obtained through PSDSPI in an out-of-plane sensitive set-up.
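A minimal sketch of sliding-window DFT denoising on a complex-valued phase-map representation is given below; the fixed step size and the peak-relative hard threshold are assumptions, whereas the paper selects the window size and local threshold from the speckle size.

```python
import numpy as np

def sliding_dft_denoise(phase_complex, window=16, step=4, threshold=0.1):
    """Overcomplete sliding-window DFT denoising with hard thresholding."""
    h, w = phase_complex.shape
    acc = np.zeros_like(phase_complex)
    hits = np.zeros((h, w))
    for y in range(0, h - window + 1, step):
        for x in range(0, w - window + 1, step):
            block = phase_complex[y:y + window, x:x + window]
            F = np.fft.fft2(block)
            F[np.abs(F) < threshold * np.abs(F).max()] = 0   # hard thresholding
            acc[y:y + window, x:x + window] += np.fft.ifft2(F)
            hits[y:y + window, x:x + window] += 1
    # average the overlapping reconstructions (border pixels not covered stay zero here)
    return acc / np.maximum(hits, 1)
```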
Invited Presentation
Alternating direction optimization for image segmentation using hidden Markov measure field models
Image segmentation is fundamentally a discrete problem. It consists of finding a partition of the image domain such that the pixels in each element of the partition exhibit some kind of similarity. The solution is often obtained by minimizing an objective function containing terms measuring the consistency of the candidate partition with respect to the observed image, and regularization terms promoting solutions with desired properties. This formulation ends up being an integer optimization problem that, apart from a few exceptions, is NP-hard and thus impossible to solve exactly. This roadblock has stimulated active research aimed at computing “good” approximations to the solutions of those integer optimization problems. Relevant lines of attack have focused on the representation of the regions (i.e., the partition elements) in terms of functions, instead of subsets, and on convex relaxations which can be solved in polynomial time. In this paper, inspired by the “hidden Markov measure field” introduced by Marroquin et al. in 2003, we sidestep the discrete nature of image segmentation by formulating the problem in the Bayesian framework and introducing a hidden set of real-valued random fields determining the probability of a given partition. Armed with this model, the original discrete optimization is converted into a convex program. To infer the hidden fields, we introduce the Segmentation via the Constrained Split Augmented Lagrangian Shrinkage Algorithm (SegSALSA). The effectiveness of the proposed methodology is illustrated with simulated and real hyperspectral and medical images.
Special Session in Memory of Til Aach
Multispectral imaging and image processing
Julie Klein
The color accuracy of conventional RGB cameras is not sufficient for many color-critical applications. One of these applications, namely the measurement of color defects in yarns, is why Prof. Til Aach and the Institute of Image Processing and Computer Vision (RWTH Aachen University, Germany) started off with multispectral imaging. The first acquisition device was a camera using a monochrome sensor and seven bandpass color filters positioned sequentially in front of it. The camera allowed sampling the visible wavelength range more accurately and reconstructing the spectra for each acquired image position. An overview will be given of several optical and imaging aspects of the multispectral camera that have been investigated. For instance, optical aberrations caused by the filters and the camera lens deteriorate the quality of captured multispectral images. The different aberrations were analyzed thoroughly and compensated based on models of the optical elements and the imaging chain by utilizing image processing. With this compensation, geometrical distortions disappear and sharpness is enhanced, without reducing the color accuracy of the multispectral images. Strong foundations in multispectral imaging were laid and a fruitful cooperation was initiated with Prof. Bernhard Hill. Current research topics like stereo multispectral imaging and goniometric multispectral measurements that are further explored with his expertise will also be presented in this work.
On the performance of multirate filterbanks: Quantification of shift variance and cyclostationarity in the works of Til Aach
Robert Bregovic, Atanas Gotchev
The paper discusses the issues of shift variance and cyclostationarity in multirate filterbanks as investigated in a series of articles by Til Aach. In its first part, the paper overviews the most important properties of multirate filterbanks, such as perfect reconstruction, sampling rate conversion factors, number and type of subbands and subdivisions, orthogonality and biorthogonality, frequency selectivity, and preservation of polynomials. This part is intended to introduce the reader to the topic and to build a bridge to the properties of shift variance and cyclostationarity discussed next. Criteria for shift (in)variance and cyclostationarity as derived by Til Aach are presented and commented upon, and conclusions about their importance are drawn.
Interactive Paper Session
Fibonacci thresholding: signal representation and morphological filters
A new weighted thresholding concept is presented, which is used for the set-theoretical representation of signals, for producing new signals that retain a large number of key features of the original signals, and for the design of new morphological filters. Such a representation maps many operations of non-binary signal and image processing to a union of simple operations over binary signals and images. The weighted thresholding is invariant under morphological transformations, including the basic ones, erosion and dilation. The main idea of the weighted thresholding is the choice of a special thresholding level on which attention can be concentrated for further processing. Along with arithmetic thresholding, the so-called Fibonacci levels are chosen because of their many interesting properties; one of them is an effective decomposition of the median filter. Experimental results show that Fibonacci thresholding is very promising and can be used in many applications, including image enhancement, segmentation, and edge detection.
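For background, the sketch below shows the classical threshold-decomposition (stack-filter) view that the abstract alludes to: the median filter of an integer signal equals the sum of median-filtered binary threshold slices. The weighted and Fibonacci-level variants of the paper are not reproduced; this is only the unweighted baseline.

```python
import numpy as np
from scipy.ndimage import median_filter

def median_by_threshold_decomposition(signal, levels):
    """Median filtering via threshold decomposition: filter each binary slice, then sum."""
    out = np.zeros(len(signal))
    for t in levels:
        binary = (signal >= t).astype(float)     # binary slice at threshold level t
        out += median_filter(binary, size=3)     # the same filter acts on each slice
    return out

x = np.array([3, 1, 4, 1, 5, 9, 2, 6])
full_levels = range(1, int(x.max()) + 1)
print(median_by_threshold_decomposition(x, full_levels))   # equals the ordinary median filter
print(median_filter(x.astype(float), size=3))
```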
Parametric rational unsharp masking for image enhancement
Changzhe Yin, Yicong Zhou, Sos Agaian, et al.
Unsharp masking is an effective enhancement tool for improving the visual quality of fine details in images. However, it also amplifies noise and over-enhances steep edges. To address this problem, this paper proposes a parametric rational unsharp masking. It utilizes horizontal and vertical gain factors to enhance image details in the two directions independently. Experiments and comparisons are provided to demonstrate its excellent enhancement performance.
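A minimal linear sketch of direction-dependent unsharp masking is shown below; the paper's operator is rational and parametric, which additionally limits noise amplification and edge overshoot, and is not reproduced here.

```python
import numpy as np

def directional_unsharp(image, gain_h=1.5, gain_v=0.5):
    """Unsharp masking with separate horizontal and vertical gain factors."""
    img = image.astype(float)
    # 1D Laplacian-like detail signals along each direction
    detail_h = img - 0.5 * (np.roll(img, 1, axis=1) + np.roll(img, -1, axis=1))
    detail_v = img - 0.5 * (np.roll(img, 1, axis=0) + np.roll(img, -1, axis=0))
    return np.clip(img + gain_h * detail_h + gain_v * detail_v, 0, 255)
```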
Sparse presentation based classification with position-weighted block dictionary
Jun He, Tian Zuo, Bo Sun, et al.
This paper aims at applying sparse representation based classification (SRC) to general objects of a certain scale. The authors analyze the characteristics of general object recognition, propose a position-weighted block dictionary (PWBD) based on sparse representation, and design a framework of SRC with it (PWBD-SRC). The principle and implementation of PWBD-SRC are introduced, and experiments on car models are presented. From the experimental results, it can be seen that with the position-weighted block dictionary (PWBD) not only can the dictionary scale be effectively reduced, but the roles that image blocks play in representing a whole image can also be reflected to a certain extent. In recognition applications, an image containing only partial objects can be identified with PWBD-SRC. Besides, robustness to rotation and perspective changes can be achieved. Finally, some remaining problems are briefly discussed.
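For reference, a generic SRC sketch (without the position weighting or block structure that the paper adds) is given below; it codes a test vector over a dictionary of training atoms with orthogonal matching pursuit and assigns the class with the smallest reconstruction residual. The `n_nonzero` value is an illustrative choice.

```python
import numpy as np
from sklearn.linear_model import orthogonal_mp

def src_classify(D, labels, test, n_nonzero=10):
    """Sparse-representation classification by class-wise reconstruction residual.

    D: dictionary with one (L2-normalised) training atom per column.
    labels: array of class labels, one per atom.  test: the vector to classify.
    """
    coefs = orthogonal_mp(D, test, n_nonzero_coefs=n_nonzero)   # sparse code of the test vector
    residuals = {}
    for c in np.unique(labels):
        mask = (labels == c)
        residuals[c] = np.linalg.norm(test - D[:, mask] @ coefs[mask])
    return min(residuals, key=residuals.get)                    # class with smallest residual
```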