Proceedings Volume 9020

Computational Imaging XII

View the digital version of this volume at the SPIE Digital Library.

Volume Details

Date Published: 17 March 2014
Contents: 9 Sessions, 34 Papers, 0 Presentations
Conference: IS&T/SPIE Electronic Imaging 2014
Volume Number: 9020

Table of Contents

All links to SPIE Proceedings will open in the SPIE Digital Library.
  • Front Matter: Volume 9020
  • Computational Imaging for Consumer Electronics
  • Inverse Problems
  • Modeling and Analysis of Multidimensional Data
  • Tomographic Estimation
  • Inverse Problems in Materials and Security
  • Image Enhancement and Denoising
  • Light Field Cameras and Algorithms
  • Interactive Paper Session
Front Matter: Volume 9020
This PDF file contains the front matter associated with SPIE Proceedings Volume 9020 including the Title Page, Copyright information, Table of Contents, Introduction, and Conference Committee listing.
Computational Imaging for Consumer Electronics
Video colorization based on optical flow and edge-oriented color propagation
Mayu Otani, Hirohisa Hioki
We propose a novel video colorization method based on sparse optical flow and edge-oriented color propagation. Colorization is a process of adding color to monochrome images or videos. In our video colorization method, it is assumed that key frames are appropriately selected out of a grayscale video stream and are properly colorized in advance. Once key frames are colorized, our method colorizes all the remaining grayscale frames automatically. Key frames can also be colorized semi-automatically with our method. To colorize a grayscale frame between a pair of colorized key frames, sparse optical flow is computed first. The optical flow consists of reliable motion vectors around strong feature points. The colors around feature points of the key frames are then copied to the grayscale frame according to the estimated motion vectors. These colors are then propagated to the rest of the grayscale frame and blended appropriately during the propagation process. A pair of accuracy and priority measures is introduced to control how the color propagation proceeds. To successfully propagate colors, it is important not to spread colors wrongly across edges. For this purpose, the set of neighboring pixels is adaptively selected so as to exclude edge-like areas and thus avoid spreading colors across edges. To evaluate the effectiveness of our method, both image colorization and video colorization were performed. Experimental results show that our method colorizes images and videos better than previous methods in the presence of edges. We also show that the proposed method makes it easy to modify colors in colored video streams.
Image enhancement with blurred and noisy image pairs using a dual edge-preserving filtering technique
Yuushi Toyoda, Hiroyasu Yoshikawa, Masayoshi Shimizu
Taking a satisfactory image using a hand-held camera under dim lighting conditions is difficult because there is a tradeoff between exposure time and motion blur. If the image is taken with a short exposure time, the image is noisy, although it preserves sharp details. On the other hand, an image taken with a long exposure time suffers from motion blur but has good intensity. In this paper, we propose a two-image approach that adaptively combines a short-exposure image and a long-exposure image for the image stabilization function on digital cameras. Our method combines only the desirable properties of the two images to obtain a less-blurry and less-noisy image. Furthermore, the proposed approach uses a dual edge-preserving filtering technique for the edge areas of the long-exposure image, which are more affected by motion blur.
Computational efficiency improvements for image colorization
Chao Yu, Gaurav Sharma, Hussein Aly
We propose an efficient algorithm for colorization of greyscale images. As in prior work, colorization is posed as an optimization problem: a user specifies the color for a few scribbles drawn on the greyscale image and the color image is obtained by propagating color information from the scribbles to surrounding regions, while maximizing the local smoothness of colors. In this formulation, colorization is obtained by solving a large sparse linear system, which normally requires substantial computation and memory resources. Our algorithm improves the computational performance through three innovations over prior colorization implementations. First, the linear system is solved iteratively without explicitly constructing the sparse matrix, which significantly reduces the required memory. Second, we formulate each iteration in terms of integral images obtained by dynamic programming, reducing repetitive computation. Third, we use a coarse-to-fine framework, where a lower resolution subsampled image is first colorized and this low resolution color image is upsampled to initialize the colorization process for the fine level. The improvements we develop provide significant speedup and memory savings compared to the conventional approach of solving the linear system directly using off-the-shelf sparse solvers, and allow us to colorize images with typical sizes encountered in realistic applications on typical commodity computing platforms.
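As a rough illustration of the matrix-free iteration described above (not the authors' implementation, which additionally uses integral images and a coarse-to-fine pyramid), the sketch below propagates scribble chrominance with a Jacobi-style update whose weights depend on local luminance similarity; all function and parameter names are hypothetical.

```python
import numpy as np

def colorize_jacobi(gray, scribble_uv, scribble_mask, n_iter=500, sigma=0.05):
    """Matrix-free Jacobi-style colorization sketch.

    gray         : (H, W) luminance in [0, 1]
    scribble_uv  : (H, W, 2) user-provided chrominance, valid where mask is True
    scribble_mask: (H, W) bool, True at scribbled pixels
    """
    H, W = gray.shape
    uv = np.where(scribble_mask[..., None], scribble_uv, 0.0).astype(np.float64)

    # 4-neighbour shifts: up, down, left, right (edge wrap-around ignored for brevity)
    shifts = [(-1, 0), (1, 0), (0, -1), (0, 1)]

    for _ in range(n_iter):
        acc = np.zeros_like(uv)
        wsum = np.zeros((H, W, 1))
        for dy, dx in shifts:
            g_n = np.roll(gray, (dy, dx), axis=(0, 1))
            uv_n = np.roll(uv, (dy, dx), axis=(0, 1))
            # affinity: neighbours with similar luminance get larger weights
            w = np.exp(-((gray - g_n) ** 2) / (2 * sigma ** 2))[..., None]
            acc += w * uv_n
            wsum += w
        uv_new = acc / np.maximum(wsum, 1e-12)
        # scribbled pixels stay fixed; they act as boundary conditions
        uv = np.where(scribble_mask[..., None], scribble_uv, uv_new)
    return uv
```

Each sweep touches only the image arrays themselves, so no sparse matrix is ever stored, which is the memory saving the abstract refers to.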
Structured light 3D depth map enhancement and gesture recognition using image content adaptive filtering
Vikas Ramachandra, James Nash, Kalin Atanassov, et al.
A structured-light system for depth estimation is a type of 3D active sensor that consists of a structured-light projector that projects an illumination pattern on the scene (e.g., a mask with vertical stripes) and a camera which captures the illuminated scene. Based on the received patterns, depths of different regions in the scene can be inferred. In this paper, we use side information in the form of image structure to enhance the depth map. This side information is obtained from the received light pattern image reflected by the scene itself. The processing steps run in real time. This post-processing stage in the form of depth map enhancement can be used for better hand gesture recognition, as is illustrated in this paper.
Inverse Problems
Architectures and algorithms for x-ray diffraction imaging
Ke Chen, David A. Castañón
X-ray imaging is the predominant modality used in luggage inspection systems for explosives detection. Conventional or dual energy X-ray computed tomography imaging reconstructs the X-ray absorption characteristics of luggage contents at the different energies; however, material characterization based on absorption characteristics at these energies is often ambiguous. X-ray diffraction imaging (XDI) measures coherently scattered X-rays to construct diffraction profiles of materials that can provide additional molecular signature information to improve the identification of specific materials. In this paper, we present recent work on developing XDI algorithms for different architectures, which include limited angle tomography and the use of coded aperture masks. We study the potential benefits of fusion of dual energy CT information with X-ray diffraction imaging. We illustrate the performance of different approaches using Monte Carlo propagation simulations through 3-D media.
Joint metal artifact reduction and material discrimination in x-ray CT using a learning-based graph-cut method
Limor Martin, Ahmet Tuysuzoglu, Prakash Ishwar, et al.
X-ray Computed Tomography (CT) is an effective nondestructive technology used for security applications. In CT, three-dimensional images of the interior of an object are generated based on its X-ray attenuation. Multi-energy CT can be used to enhance material discrimination. Currently, reliable identification and segmentation of objects from CT data is challenging due to the large range of materials which may appear in baggage and the presence of metal and high clutter. Conventionally reconstructed CT images suffer from metal-induced streaks and artifacts which can lead to breaking of objects and inaccurate object labeling. We propose a novel learning-based framework for joint metal artifact reduction and direct object labeling from CT derived data. A material label image is directly estimated from measured effective attenuation images. We include data weighting to mitigate metal artifacts and incorporate an object boundary-field to reduce object splitting. The overall problem is posed as a graph optimization problem and solved using an efficient graph-cut algorithm. We test the method on real data and show that it can produce accurate material labels in the presence of metal and clutter.
Digital filter based on the Fisher linear discriminant to reduce dead-time paralysis in photon counting
Shane Z. Sullivan, Paul D. Schmitt, Emma L. DeWalt, et al.
Photon counting represents the Poisson limit in signal to noise, but can often be complicated in imaging applications by detector paralysis, arising from the finite rise/fall time of the detector upon photon absorption. We present here an approach for reducing dead-time by generating a deconvolution digital filter based on optimizing the Fisher linear discriminant. In brief, two classes are defined: one in which a photon event is initiated at the origin of the digital filter, and one in which the photon event is non-coincident with the filter origin. Linear discriminant analysis (LDA) is then performed to optimize the digital filter that best resolves the coincident and non-coincident training set data [1]. Once trained, implementation of the filter can be performed quickly, significantly reducing dead-time issues and measurement bias in photon counting applications. Experimental demonstration of the LDA-filter approach was performed in fluorescence microscopy measurements using a highly convolved impulse response with considerable ringing. Analysis of the counts supports the capabilities of the filter in recovering deconvolved impulse responses under the conditions considered in the study. Potential additional applications and possible limitations are also considered.
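A minimal sketch of the Fisher-discriminant filter construction described above, assuming the digitized detector traces have already been split into fixed-length training windows; the closed-form discriminant below stands in for the paper's training procedure, and all names are illustrative.

```python
import numpy as np

def train_lda_filter(coincident, noncoincident, ridge=1e-6):
    """Fisher linear discriminant as a digital filter (sketch).

    coincident    : (N1, L) training windows with a photon event at the filter origin
    noncoincident : (N0, L) training windows without a coincident event
    Returns a length-L filter whose inner product with a data window scores
    how "photon-at-origin-like" that window is.
    """
    mu1 = coincident.mean(axis=0)
    mu0 = noncoincident.mean(axis=0)
    # within-class scatter (pooled covariance), with a small ridge for stability
    s_w = np.cov(coincident, rowvar=False) + np.cov(noncoincident, rowvar=False)
    s_w += ridge * np.eye(s_w.shape[0])
    w = np.linalg.solve(s_w, mu1 - mu0)
    return w / np.linalg.norm(w)

def apply_filter(trace, w, threshold):
    """Slide the trained filter over a digitized detector trace.

    Returns a boolean detection mask; photon counts follow from counting
    rising edges of this mask.
    """
    score = np.convolve(trace, w[::-1], mode="valid")  # correlation with w
    return score > threshold
```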
Magnified neutron radiography with coded sources
A coded source imaging system has been developed to improve the resolution of neutron radiography through magnification and has been demonstrated at the High Flux Isotope Reactor (HFIR) CG-1D instrument. Without magnification, the current resolution at CG-1D is 80 μm using a charge-coupled device (CCD) equipped with a lens. As for all neutron imaging instruments, magnification is limited by a large source size. At CG-1D the source size is currently limited to 12 mm with a circular aperture. Coded source imaging converts this large aperture into a coded array of smaller apertures to achieve high resolution without the flux loss of a single pinhole aperture, but it requires a decoding step. The developed system has demonstrated the first magnified radiographic imaging at magnifications as high as 25x using coded apertures with holes as small as 10 μm. Such a development requires a team with a broad base of expertise, including imaging systems design, neutron physics, microelectronics manufacturing methods, reconstruction algorithms, and high-performance computing. The paper presents the system design, discusses implementation challenges, and presents imaging results.
Modeling and Analysis of Multidimensional Data
A super-resolution algorithm for enhancement of flash lidar data: flight test results
Alexander Bulyshev, Farzin Amzajerdian, Eric Roback, et al.
This paper describes the results of a 3D super-resolution algorithm applied to the range data obtained from a recent Flash Lidar helicopter flight test. The flight test was conducted by NASA's Autonomous Landing and Hazard Avoidance Technology (ALHAT) project over a simulated lunar terrain facility at NASA Kennedy Space Center. ALHAT is developing the technology for safe autonomous landing on the surface of celestial bodies: the Moon, Mars, and asteroids. One of the test objectives was to verify the ability of the 3D super-resolution technique to generate high-resolution digital elevation models (DEMs) and to determine time-resolved relative positions and orientations of the vehicle. The 3D super-resolution algorithm was developed earlier and tested in computational modeling, in laboratory experiments, and in a few dynamic experiments using a moving truck. Prior to the helicopter flight test campaign, a 100 m × 100 m hazard field was constructed containing most of the relevant extraterrestrial hazards: slopes, rocks, and craters of different sizes. Data were collected during the flight and then processed by the super-resolution code. A detailed DEM of the hazard field was constructed from independent measurements to be used for comparison. ALHAT navigation system data were used to verify the ability of the super-resolution method to provide accurate relative navigation information; namely, the 6-degree-of-freedom state vector of the instrument as a function of time was recovered from the super-resolution data. The results of the comparisons show that the super-resolution method can construct high-quality DEMs and allows for identifying hazards such as rocks and craters in accordance with ALHAT requirements.
Automatic image assessment from facial attributes
Raymond Ptucha, David Kloosterman, Brian Mittelstaedt, et al.
Personal consumer photography collections often contain photos captured by numerous devices, stored both locally and via online services. The task of gathering, organizing, and assembling still and video assets in preparation for sharing with others can be quite challenging. Current commercial photobook applications are largely manual, requiring significant user interaction. To assist the consumer in organizing these assets, we propose an automatic method to assign a fitness score to each asset, whereby the top-scoring assets are used for product creation. Our method uses cues extracted from analyzing pixel data, metadata embedded in the file, as well as ancillary tags or online comments. When a face occurs in an image, its features have a dominating influence on both the aesthetic and compositional properties of the displayed image. As such, this paper emphasizes the contributions faces have on the overall fitness score of an image. To understand consumer preference, we conducted a psychophysical study that spanned 27 judges, 5,598 faces, and 2,550 images. Preferences on a per-face and per-image basis were independently gathered to train our classifiers. We describe how to use machine learning techniques to merge differing facial attributes into a single classifier. Our novel methods of facial weighting, fusion of facial attributes, and dimensionality reduction produce state-of-the-art results suitable for commercial applications.
Closely spaced object resolution using a quantum annealing model
J. J. Tran, R. F. Lucas, K. J. Scully, et al.
One of the challenges of automated target recognition and tracking on a two-dimensional focal plane is the ability to resolve closely spaced objects (CSO). To date, one of the best CSO-resolution algorithms first subdivides a cluster of image pixels into equally spaced grid points; then it conjectures that K targets are located at the centers of those sub-pixels and, for each set of such locations, calculates the associated irradiance values that minimize the sum of squares of the residuals. The set of target locations that leads to the minimal residual becomes the initial starting point for a non-linear least-squares fit (e.g., Levenberg-Marquardt, Nelder-Mead, trust-region, expectation-maximization, etc.), which completes the estimation. The overall time complexity is exponential in K. Although numerous strides have been made over the years vis-à-vis heuristic optimization techniques, the CSO resolution problem remains largely intractable due to its combinatorial nature. We propose a novel approach to address this computational obstacle, employing a technique that maps the CSO resolution algorithm to a quantum annealing model which can then be programmed on an adiabatic quantum optimization device, e.g., the D-Wave architecture.
3D quantitative microwave imaging from sparsely measured data with Huber regularization
Funing Bai, Aleksandra Pižurica
Reconstructing complex permittivity profiles of dielectric objects from measurements of the microwave scattered field is a non-linear, ill-posed inverse problem. We analyze the performance of the Huber regularizer in this application, studying the influence of its parameters under different noise levels. Moreover, we evaluate the whole approach on real 3D electromagnetic measurements. Our focus is on reconstructions from relatively few measurements (sparse measurements) to speed up the reconstruction process.
Tomographic Estimation
Novel tensor transform-based method of image reconstruction from limited-angle projection data
The tensor representation is an effective way to reconstruct an image from a finite number of projections, especially when projections are limited to a small range of angles. The image is considered in the image plane and reconstruction is performed on the Cartesian lattice. This paper introduces a new approach for calculating the splitting-signals of the tensor transform of the discrete image f(xi, yj) from a finite number of ray-integrals of the real image f(x, y). The properties of the tensor transform allow for calculating a large part of the 2-D discrete Fourier transform on the Cartesian lattice and obtaining high-quality reconstructions, even when using a small range of projections, such as [0°, 30°) and down to [0°, 20°). The experimental results show that the proposed method reconstructs images more accurately than the known methods of convex projections and filtered backprojection.
Statistical x-ray computed tomography imaging from photon-starved measurements
Dose reduction in clinical X-ray computed tomography (CT) causes a low signal-to-noise ratio (SNR) in photon-sparse situations. Statistical iterative reconstruction algorithms have the advantage of retaining image quality while reducing input dosage, but they meet their limits of practicality when significant portions of the sinogram approach photon starvation. Corruption by electronic noise leads to measured photon counts taking on negative values, posing a problem for the log() operation in the preprocessing of the data. In this paper, we propose two categories of projection correction methods: an adaptive denoising filter and Bayesian inference. The denoising filter is easy to implement and preserves local statistics, but it introduces correlation between channels and may affect image resolution. Bayesian inference is a point-wise estimation based on measurements and prior information. Both approaches help improve diagnostic image quality at dramatically reduced dosage.
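To make the log() issue concrete, here is a minimal preprocessing sketch in which non-positive counts are replaced by a clamped local mean before the log transform; this stands in for, and is much simpler than, the adaptive filter and Bayesian estimators the paper proposes, and all names are illustrative.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def line_integrals_from_counts(counts, I0, size=3, eps=0.5):
    """Sketch of preprocessing photon-starved CT measurements.

    counts : measured photon counts per detector channel (may be <= 0 after
             electronic-noise subtraction)
    I0     : incident (air-scan) counts
    The log transform p = -log(counts / I0) is undefined for counts <= 0, so
    non-positive channels are replaced by a local mean (a crude stand-in for
    the adaptive filtering / Bayesian estimation described in the abstract).
    """
    counts = counts.astype(np.float64)
    local_mean = uniform_filter(counts, size=size)   # introduces channel correlation
    bad = counts <= 0
    repaired = np.where(bad, np.maximum(local_mean, eps), counts)
    return -np.log(repaired / I0)
```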
Model-based iterative tomographic reconstruction with adaptive sparsifying transforms
Model-based iterative reconstruction algorithms are capable of reconstructing high-quality images from low-dose CT measurements. The performance of these algorithms depends on the ability of a signal model to characterize signals of interest. Recent work has shown the promise of signal models that are learned directly from data. We propose a new method for low-dose tomographic reconstruction that combines adaptive sparsifying transform regularization with a statistically weighted constrained optimization problem. The new formulation removes the need to tune a regularization parameter. We propose an algorithm to solve this optimization problem based on the Alternating Direction Method of Multipliers (ADMM) and the FISTA proximal gradient algorithm. Numerical experiments on the FORBILD head phantom illustrate the utility of the new formulation and show that adaptive sparsifying transform regularization outperforms competing dictionary learning methods at speeds rivaling total-variation regularization.
Structured illumination for compressive x-ray diffraction tomography
Coherent x-ray scatter (also known as x-ray diffraction) has long been used to non-destructively investigate the molecular structure of materials for industrial, medical, security, and fundamental purposes. Unfortunately, molecular tomography based on coherent scatter typically requires long scan times and/or large incident fluxes, which has limited the practical applicability of such schemes. One can overcome the conventional challenges by employing compressive sensing theory to optimize the information obtained per incident photon. We accomplish this in two primary ways: we use a coded aperture to structure the incident illumination and realize massive measurement parallelization, and we use photon-counting, energy-sensitive detection to recover maximal information from each detected photon. We motivate and discuss here the general imaging principles, investigate different coding and sampling strategies, and provide results from theoretical studies for our structured illumination scheme. We find that this approach promises real-time molecular tomography of bulk objects without a loss in imaging performance.
Inverse Problems in Materials and Security
Effects of powder morphology and particle size on CT number estimates
Jeffrey S. Kallman, Sabrina dePiero, Stephen Azevedo, et al.
We performed experiments and data analysis to determine how powder morphology and particle size affect X-ray attenuation (CT number or CTN). These experiments were performed on a CT system with an isotropic resolution of (0.15 mm)³ and an endpoint energy of 160 kV. Powders with an effective atomic number (Ze) within ±0.2 of water were found to have CTN more directly related to electron density than to bulk physical density. Variations in mean particle size ranging between 2 μm and 44 μm were found to have no effect on specimen mean CTN.
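For reference, CT numbers are conventionally defined relative to the linear attenuation coefficient of water (Hounsfield-style units); the study's CTN scale may differ in offset or scaling.

```latex
\mathrm{CTN} \;=\; 1000 \times \frac{\mu_{\text{material}} - \mu_{\text{water}}}{\mu_{\text{water}}}
```

Because the linear attenuation coefficient at these energies is driven largely by electron density, powders matched to water in effective atomic number track electron density more closely than bulk density, consistent with the finding above.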
Coded aperture x-ray scatter tomography
Andrew D. Holmgren, Kenneth P. MacCabe, Martin P. Tornai, et al.
We present a system for X-ray tomography using a coded aperture. A fan beam illuminates a 2D cross-section of an object and our coded aperture system produces a tomographic image from each static snapshot; as such, we can reconstruct either a static object scanned in 3D or an x-ray video of a non-static object.
Model-based, one-sided, time-of-flight terahertz image reconstruction
Stephen M. Schmitt, Jeffrey A. Fessler, Greg D. Fichter, et al.
In the last decade, terahertz-mode imaging has received increased attention for non-destructive testing applications due to its ability to penetrate many materials while maintaining a small wavelength. This paper describes a model-based reconstruction algorithm that is able to image defects in the spray-on foam insulation (SOFI) used in aerospace applications that has been sprayed on a reflective metal hull. In this situation, X-ray based imaging is infeasible since only one side of the hull is accessible in flight. This paper models the object as a grid of materials, each section of which has a constant index of refraction. The delay between the transmission and reception of a THz pulse is related to the integral of the index of refraction along the pulse's path, and we adapt computed tomography (CT) methods to reconstruct an image of an object's index of refraction. We present the results of our reconstruction method using real data of the timing of THz pulses passing through a block of SOFI with holes of known location and radius. The resulting image of the block has a low level of noise, but contains artifacts due to the limited angular range of one-sided imaging and due to the narrow-beam approximation used in the forward model.
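The measurement model quoted in the abstract can be written as a line integral of the refractive index (the paper's constants and conventions may differ):

```latex
\tau \;\propto\; \int_{\mathcal{L}} n(\mathbf{x})\, d\ell ,
```

where τ is the measured pulse delay, n the index of refraction, and 𝓛 the (approximately straight) beam path, which in the one-sided geometry runs through the object to the reflective hull and back. Since this relation is linear in n, standard CT reconstruction machinery can be adapted to invert it.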
Image Enhancement and Denoising
Fast edge-preserving image denoising via group coordinate descent on the GPU
Madison G. McGaffin, Jeffrey A. Fessler
We present group coordinate descent algorithms for edge-preserving image denoising that are particularly well-suited to the graphics processing unit (GPU). The algorithms decouple the denoising optimization problem into a set of iterated, independent one-dimensional problems. We provide methods to handle both differentiable regularizers and the absolute value function using the majorize-minimize technique. Specifically, we use quadratic majorizers with Huber curvatures for differentiable potentials and a duality approach for the absolute value function. Preliminary experimental results indicate that the algorithms converge remarkably quickly in time.
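For readers unfamiliar with the majorize-minimize step mentioned above, the Huber potential with parameter δ and the curvature of its optimal quadratic majorizer take the standard form (the paper's notation may differ):

```latex
\psi_\delta(t) =
\begin{cases}
\tfrac{1}{2}\,t^2, & |t| \le \delta,\\[2pt]
\delta\,|t| - \tfrac{1}{2}\,\delta^2, & |t| > \delta,
\end{cases}
\qquad
\omega_\psi(t) = \frac{\dot\psi_\delta(t)}{t} = \min\!\left(1,\ \frac{\delta}{|t|}\right),
```

so each one-dimensional subproblem is majorized by a quadratic with curvature ω_ψ evaluated at the current iterate, which is what makes the per-pixel updates cheap and GPU-friendly.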
Light Field Cameras and Algorithms
Light field panorama by a plenoptic camera
Zhou Xue, Loic Baboulaz, Paolo Prandoni, et al.
The consumer-grade Lytro plenoptic camera has drawn a lot of interest from both the academic and industrial worlds. However, its low resolution in both the spatial and angular domains prevents it from being used for fine and detailed light field acquisition. This paper proposes to use a plenoptic camera as an image scanner and to perform light field stitching to increase the size of the acquired light field data. We consider a simplified plenoptic camera model comprising a pinhole camera moving behind a thin lens. Based on this model, we describe how to perform light field acquisition and stitching under two different scenarios: by camera translation, or by camera translation and rotation. In both cases, we assume the camera motion to be known. In the case of camera translation, we show how the acquired light fields should be resampled to increase the spatial range and ultimately obtain a wider field of view. In the case of camera translation and rotation, the camera motion is calculated such that the light fields can be directly stitched and extended in the angular domain. Simulation results verify our approach and demonstrate the potential of the motion model for further light field applications such as registration and super-resolution.
Efficient volumetric estimation from plenoptic data
Paul Anglin, Stanley J. Reeves, Brian S. Thurow
The commercial release of the Lytro camera, and greater availability of plenoptic imaging systems in general, have given the image processing community cost-effective tools for light-field imaging. While this data is most commonly used to generate planar images at arbitrary focal depths, reconstruction of volumetric fields is also possible. Similarly, deconvolution is a technique that is conventionally used in planar image reconstruction, or deblurring, algorithms. However, when leveraged with the ability of a light-field camera to quickly reproduce multiple focal planes within an imaged volume, deconvolution offers a computationally efficient method of volumetric reconstruction. Related research has shown that light-field imaging systems in conjunction with tomographic reconstruction techniques are also capable of estimating the imaged volume and have been successfully applied to particle image velocimetry (PIV). However, while tomographic volumetric estimation through algorithms such as multiplicative algebraic reconstruction techniques (MART) has proven to be highly accurate, it is computationally intensive. In this paper, the reconstruction problem is shown to be solvable by deconvolution. Deconvolution offers significant improvement in computational efficiency through the use of fast Fourier transforms (FFTs) when compared to other tomographic methods. This work describes a deconvolution algorithm designed to reconstruct a 3-D particle field from simulated plenoptic data. A 3-D extension of existing 2-D FFT-based refocusing techniques is presented to further improve efficiency when computing object focal stacks and system point spread functions (PSF). Reconstruction artifacts are identified; their underlying source and methods of mitigation are explored where possible, and reconstructions of simulated particle fields are provided.
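The computational argument above rests on the fact that, once a focal stack and a volumetric PSF are available, deconvolution reduces to element-wise operations in the frequency domain. Below is a minimal Wiener-style 3-D sketch; it is not the authors' algorithm, and the names and regularization constant are illustrative.

```python
import numpy as np

def wiener_deconvolve_3d(focal_stack, psf, nsr=1e-2):
    """Minimal FFT-based (Wiener) deconvolution sketch for a refocused volume.

    focal_stack : (Z, Y, X) refocused intensity volume (blurred object estimate)
    psf         : (Z, Y, X) volumetric point spread function, same shape as the
                  stack and centered in the array
    nsr         : scalar noise-to-signal ratio used as Wiener regularization
    """
    H = np.fft.fftn(np.fft.ifftshift(psf))           # PSF spectrum
    G = np.fft.fftn(focal_stack)                     # data spectrum
    W = np.conj(H) / (np.abs(H) ** 2 + nsr)          # Wiener filter
    return np.real(np.fft.ifftn(W * G))
```

The cost is a handful of FFTs, O(N log N) in the number of voxels, versus the iterative row-action updates required by MART-type tomographic solvers.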
Computationally efficient background subtraction in the light field domain
In this paper we present a novel approach for depth estimation and background subtraction in light field images. Our approach exploits the regularity and internal structure of the light field signal in order to extract an initial depth map and background model of the captured scene, and uses the extracted depth map as the input to a final segmentation algorithm which finely isolates the background in the image. Background subtraction is a natural application of light field information, since it is closely tied to depth information and segmentation. However, many of the approaches proposed so far are not optimized specifically for background subtraction and are computationally expensive. Here we propose an approach based on a modified version of the well-known Radon transform that does not involve massive matrix calculations; it is therefore computationally very efficient and appropriate for real-time use. We apply the modified Radon transform and the gradient operator to horizontal slices of the light field signal to infer the initial depth map. The initial depth estimates are further refined to a precise background model using a series of depth thresholding and segmentation steps in ambiguous areas. We test our method on various types of real and synthetic light field images. Scenes with different levels of clutter and with various foreground object depths have been considered in the experiments. The results of our experiments show a much lower computational complexity while retaining performance comparable to similar, more complex methods.
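A toy version of the slope-from-slices idea, assuming an epipolar-plane image (EPI) extracted from a horizontal light-field slice; scikit-image's radon is used here purely for illustration, and the angle-to-depth mapping (which depends on camera geometry) is omitted.

```python
import numpy as np
from skimage.transform import radon

def dominant_epi_angle(epi, angles=np.arange(1.0, 180.0, 1.0)):
    """Estimate the dominant line orientation in an epipolar-plane image.

    epi : (n_views, width) horizontal light-field slice.
    The angle whose Radon projection has maximal variance corresponds to the
    dominant EPI line slope, which is monotonically related to scene depth.
    """
    epi = epi - epi.mean()
    sinogram = radon(epi, theta=angles, circle=False)
    variances = sinogram.var(axis=0)     # sharp projections -> high variance
    return angles[np.argmax(variances)]
```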
Interactive Paper Session
Texture mapping 3D models of indoor environments with noisy camera poses
Peter Cheng, Michael Anderson, Stewart He, et al.
Automated 3D modeling of building interiors is used in applications such as virtual reality and environment mapping. Texturing these models allows for photo-realistic visualizations of the data collected by such modeling systems. While data acquisition times for mobile mapping systems are considerably shorter than for static ones, their recovered camera poses often suffer from inaccuracies, resulting in visible discontinuities when successive images are projected onto a surface for texturing. We present a method for texture mapping models of indoor environments that starts by selecting images whose camera poses are well-aligned in two dimensions. We then align images to geometry as well as to each other, producing visually consistent textures even in the presence of inaccurate surface geometry and noisy camera poses. Images are then composited into a final texture mosaic and projected onto surface geometry for visualization. The effectiveness of the proposed method is demonstrated on a number of different indoor environments.
Reconstruction of compressively sampled ray space by using DCT basis and statistically weighted L1 norm optimization
Qiang Yao, Keita Takahashi, Toshiaki Fujii
In recent years, ray space (or light field, in other literature) photography has gained great popularity in the areas of computer vision and image processing, and efficient acquisition of a ray space is of great significance in practical applications. In order to handle the huge amount of data in the acquisition process, in this paper we propose a method for compressively sampling and reconstructing a ray space. In our method, a weighting matrix that reflects the amplitude structure of the non-zero coefficients in the 2D-DCT domain is designed and generated using statistics from an available data set. The weighting matrix is integrated into an ℓ1 norm optimization to reconstruct the ray space, and we call this method statistically weighted ℓ1 norm optimization. Experimental results show that the proposed method achieves better reconstruction results at both low (0.1 of the original sampling rate) and high (0.5 of the original sampling rate) subsampling rates. In addition, the reconstruction time is reduced by 25% compared to plain ℓ1 norm optimization.
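A minimal sketch of one way to realize a statistically weighted ℓ1 reconstruction with a 2D-DCT synthesis model, using ISTA; the weighting, step size, and sampling model are illustrative assumptions rather than the paper's exact formulation.

```python
import numpy as np
from scipy.fft import dctn, idctn

def soft(x, t):
    """Element-wise soft thresholding."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def weighted_l1_reconstruct(y, mask, weights, lam=0.05, step=1.0, n_iter=200):
    """Weighted-l1 ISTA sketch for compressively sampled data in a 2D-DCT basis.

    y       : measured samples placed on the full grid (zeros where unsampled)
    mask    : boolean sampling mask (True where a sample was taken)
    weights : per-coefficient shrinkage weights derived from training statistics;
              coefficients expected to be large would typically receive smaller
              weights (the paper's construction differs in detail)
    """
    coeffs = np.zeros_like(y, dtype=np.float64)
    for _ in range(n_iter):
        x = idctn(coeffs, norm="ortho")
        residual = np.where(mask, y - x, 0.0)                  # data-term gradient
        coeffs = soft(coeffs + step * dctn(residual, norm="ortho"),
                      step * lam * weights)                    # weighted shrinkage
    return idctn(coeffs, norm="ortho")
```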
Image matching in Bayer raw domain to de-noise low-light still images, optimized for real-time implementation
Temporal accumulation of images is a well-known approach to improve the signal-to-noise ratio of still images taken in low light conditions. However, the complexity of known algorithms often leads to high hardware resource usage, increased memory bandwidth, and computational complexity, making their practical use impossible. In our research we attempt to solve this problem with an implementation of a practical spatial-temporal de-noising algorithm based on image accumulation. Image matching and spatial-temporal filtering were performed in the Bayer RAW data space, which allowed us to benefit from predictable sensor noise characteristics and thus to use a range of algorithmic optimizations. The proposed algorithm accurately compensates for global and local motion and efficiently removes different kinds of noise in noisy images taken in low light conditions. In our algorithm we were able to perform global and local motion compensation in the Bayer RAW data space while preserving resolution and effectively improving the signal-to-noise ratio of moving objects as well as of the non-stationary background. The proposed algorithm is suitable for implementation in commercial-grade FPGAs and is capable of processing 16 MP images at the capture rate (10 frames per second). The main challenge for matching between still images is the compromise between the quality of the motion prediction and the complexity of the algorithm and required memory bandwidth. Still images taken in a burst sequence must be aligned to compensate for background motion and foreground object movements in a scene. High-resolution still images coupled with significant time between successive frames can produce large displacements between images, which creates additional difficulty for image matching algorithms. In photo applications it is very important that noise is efficiently removed in both static and non-static backgrounds as well as in moving objects, while maintaining the resolution of the image. In our proposed algorithm we solved the issue of matching the current image with accumulated image data in the Bayer RAW data space in order to efficiently perform spatio-temporal noise reduction and reduce the computational requirements. In this paper we provide subjective experimental results to demonstrate the ability of the proposed method to match noisy still images in order to perform efficient de-noising and avoid motion artefacts in the resulting still images.
Real-time focal stack compositing for handheld mobile cameras
Mashhour Solh
Extending the depth of field using a single-lens camera on a mobile device can be achieved by capturing a set of images, each focused at a different depth (a focal stack), and then combining these samples of the focal stack to form a single all-in-focus image or an image refocused at a desired depth of field. Focal stack compositing in real time on a handheld mobile camera poses many challenges, including capture, processing power, hand shake, rolling-shutter artifacts, occlusion, and lens zoom effects. In this paper, we describe a real-time focal stack compositing system for handheld mobile devices, along with its alignment and compositing algorithms. We also show all-in-focus images captured and processed by a cell phone camera running on Android OS.
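For context, once the focal stack has been captured and aligned, a basic all-in-focus composite can be formed by picking, per pixel, the stack slice with the highest local sharpness; this naive sketch ignores the alignment, occlusion, and zoom issues the paper addresses, and its names are illustrative.

```python
import numpy as np
from scipy.ndimage import laplace, uniform_filter

def all_in_focus(stack):
    """Naive all-in-focus compositing from an aligned focal stack (sketch).

    stack : (N, H, W) grayscale focal stack, already aligned.
    Each output pixel comes from the slice with the largest local Laplacian
    energy, a simple per-pixel sharpness measure.
    """
    sharpness = np.stack([uniform_filter(laplace(img) ** 2, size=7)
                          for img in stack])
    best = np.argmax(sharpness, axis=0)          # (H, W) index of sharpest slice
    rows, cols = np.indices(best.shape)
    return stack[best, rows, cols]
```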
Image deblurring using the direction dependence of camera resolution
Yukio Hirai, Hiroyasu Yoshikawa, Masayoshi Shimizu
The blurring introduced by a camera lens tends to become worse in areas away from the optical axis of the image. In addition, the degradation of the blurred image in an off-axis area exhibits directional dependence. Conventional methods use the Wiener filter or the Richardson–Lucy algorithm to mitigate the problem. These methods use a pre-defined point spread function (PSF) in the restoration process, thereby preventing an increase in noise. However, the nonuniform, direction-dependent degradation is not improved even though edges are emphasized by these conventional methods. In this paper, we analyze the directional dependence of resolution based on modeling the optical system from a blurred image. We propose a novel image deblurring method that employs an inverse filter based on optimizing the direction-dependent coefficients of the regularization term in a maximum a posteriori (MAP) algorithm. We have improved the directional dependence of resolution by optimizing the weight coefficients for the directions in which the resolution is degraded.
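One common way to write such a directionally weighted MAP deblurring objective (the paper's exact prior and optimization may differ) is

```latex
\hat{x} \;=\; \arg\min_{x}\; \|y - Hx\|_2^2 \;+\; \sum_{d} \lambda_d\, \|D_d x\|_2^2 ,
```

where H is the locally estimated (off-axis, direction-dependent) blur operator, D_d is a finite-difference operator along direction d, and the weights λ_d are set per direction according to how strongly the resolution is degraded along it.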
Illumination modelling and optimization for indoor video surveillance
Krishna Reddy Konda, Nicola Conci
Illumination is one of the most important aspects of any surveillance system. The quality of the images or videos captured by the cameras heavily depends on the positioning and the intensity of the light sources in the environment. However, exhaustive visualization of the illumination for different placement configurations is next to impossible due to the sheer number of possible combinations. In this paper we propose a novel 3D model of a given environment in a synthetic domain, combined with a generic quality metric based on the entropy measured in a given image. The synthetic modelling of the environment allows us to evaluate the optimization problem a priori, before the physical deployment of the light sources. Entropy is a general measure of the amount of information in an image, so we propose to maximize the entropy over all possible light placement configurations. In order to model the environment in the virtual domain, we use the POV-Ray software, a tool based on ray tracing. Particle swarm optimization is then adopted to find the optimal solution. The total entropy of the system is measured as the sum of the entropies of the virtual snapshots from the camera system.
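The quality metric itself is simple to state; a minimal sketch of the per-snapshot entropy computation follows (the PSO loop and POV-Ray rendering are outside its scope, and the helper name is hypothetical).

```python
import numpy as np

def image_entropy(img, n_bins=256):
    """Shannon entropy (bits) of an 8-bit grayscale image histogram."""
    hist, _ = np.histogram(img, bins=n_bins, range=(0, 256), density=True)
    p = hist[hist > 0]
    return float(-np.sum(p * np.log2(p)))

# Objective for a candidate light placement: sum the entropy over all
# virtual camera snapshots rendered for that configuration.
# total = sum(image_entropy(snapshot) for snapshot in virtual_snapshots)
```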
Nonlinear and non-Gaussian Bayesian based handwriting beautification
Cao Shi, Jianguo Xiao, Canhui Xu, et al.
A framework is proposed in this paper to effectively and efficiently beautify handwriting by means of a novel nonlinear and non-Gaussian Bayesian algorithm. In the proposed framework, the format and size of the handwriting image are first normalized, and then a typeface from the computer system is applied to optimize the visual effect of the handwriting. Bayesian statistics is exploited to characterize the handwriting beautification process as a Bayesian dynamic model. The model parameters that translate, rotate, and scale the typeface are controlled by the state equation, and the matching optimization between the handwriting and the transformed typeface is handled by the measurement equation. Finally, the new typeface, transformed from the original one to achieve the best nonlinear and non-Gaussian fit, is the beautification result. Experimental results demonstrate that the proposed framework provides a creative handwriting beautification methodology that improves visual acceptance.
LCAV-31: a dataset for light field object recognition
Alireza Ghasemi, Nelly Afonso, Martin Vetterli
We present LCAV-31, a multi-view object recognition dataset designed specifically for benchmarking light field image analysis tasks. The principal factors distinguishing LCAV-31 from similar datasets are its design goals and the availability of novel visual information for more accurate recognition (i.e., light field information). The dataset is composed of 31 object categories captured from ordinary household objects. We captured the color and light field images using the recently popularized Lytro consumer camera. Different views of each object are provided, as well as various poses and illumination conditions. We explain all the details of the different capture parameters and the acquisition procedure so that one can easily study the effect of different factors on the performance of algorithms executed on LCAV-31. Moreover, we apply a set of basic object recognition algorithms to LCAV-31. The results of these experiments can be used as a baseline for further development of novel algorithms.
Scale-invariant representation of light field images for object recognition and tracking
Alireza Ghasemi, Martin Vetterli
We propose a scale-invariant feature descriptor for the representation of light-field images. The proposed descriptor can significantly improve tasks such as object recognition and tracking on images taken with recently popularized light field cameras. We test our proposed representation using various light field images of different types, both synthetic and real. Our experiments show very promising results in terms of retaining invariance under various scaling transformations.
Comparative analysis of the speed performance of texture analysis algorithms on a graphic processing unit (GPU)
J. Triana-Martinez, S. A. Orjuela-Vargas, W. Philips
This paper compares the speed performance of a set of classic algorithms for evaluating texture in images using CUDA programming. We include a summary of the general programming model of CUDA. We select a set of texture algorithms, based on statistical analysis, that allow the use of repetitive functions, such as the co-occurrence matrix, Haralick features, and local binary pattern techniques. The memory allocation time between the host and device memory is not taken into account. The results of this approach show a comparison of the texture algorithms in terms of speed when executed on CPU and GPU processors. The comparison shows that the algorithms can be accelerated by more than 40 times when implemented in the CUDA environment.
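As a CPU reference point for one of the compared descriptors, the following sketch computes a uniform local-binary-pattern histogram with scikit-image; a CUDA implementation parallelizes the same per-pixel neighbourhood comparison across threads. This sketch is illustrative and not taken from the paper.

```python
import numpy as np
from skimage.feature import local_binary_pattern

def lbp_histogram(gray, P=8, R=1):
    """Uniform local-binary-pattern histogram of a grayscale image.

    Each pixel is compared against P neighbours on a circle of radius R;
    the resulting codes are pooled into a normalized histogram that serves
    as the texture feature vector.
    """
    codes = local_binary_pattern(gray, P=P, R=R, method="uniform")
    hist, _ = np.histogram(codes, bins=P + 2, range=(0, P + 2), density=True)
    return hist
```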