Proceedings Volume 9023

Digital Photography X

Nitin Sampat, Radka Tezaur, Sebastiano Battiato, et al.
View the digital version of this volume at the SPIE Digital Library.

Volume Details

Date Published: 19 March 2014
Contents: 10 Sessions, 36 Papers, 0 Presentations
Conference: IS&T/SPIE Electronic Imaging 2014
Volume Number: 9023

Table of Contents


All links to SPIE Proceedings will open in the SPIE Digital Library.
  • Front Matter: Volume 9023
  • Computational Photography
  • Mobile Photography
  • Image Quality Evaluation Methods/Standards for Mobile and Digital Photography: Joint Session with Conferences 9016 and 9023
  • Blur
  • Image Processing Pipeline and Camera Characterization
  • Computer Vision and Applications
  • Color
  • HDR
  • Interactive Paper Session
Front Matter: Volume 9023
Front Matter: Volume 9023
This PDF file contains the front matter associated with SPIE Proceedings Volume 9023 including the Title Page, Copyright information, Table of Contents, Introduction, and Conference Committee listing.
Computational Photography
All-glass wafer-level lens technology for array cameras
We present a novel all-glass wafer-level lens manufacturing technology. Compared to existing wafer-level lens manufacturing technologies, we realize lenses entirely in glass, which has a number of distinct advantages, including the availability of different glass types with widely varying dispersion for efficient achromatic lens design. Another advantage of all-glass solutions is the ability to dice the lens stack to match the form factor of a rectangular sensor area without compromising the optical performance of the lens, thereby allowing the footprint of an array camera to be significantly reduced.
Real time algorithm invariant to natural lighting with LBP techniques through an adaptive thresholding implemented in GPU processors
S. A. Orjuela-Vargas, J. Triana-Martinez, J. P. Yañez, et al.
Video analysis in real time requires fast and efficient algorithms to extract relevant information from a considerable number of frames per second, commonly 25. Furthermore, robust algorithms for outdoor visual scenes must retrieve corresponding features throughout the day, where a key challenge is coping with lighting changes. Currently, Local Binary Pattern (LBP) techniques are widely used for extracting features due to their robustness to illumination changes and their low implementation requirements. We propose to compute an automatic threshold based on the distribution of the intensity residuals resulting from the pairwise comparisons performed by LBP techniques. The intensity residuals distribution can be modelled by a Generalized Gaussian Distribution (GGD). In this paper we compute the adaptive threshold using the parameters of the GGD. We present a CUDA implementation of our proposed algorithm, using the LBPSYM technique. Our approach is tested on videos of four different urban scenes with moving traffic, captured during day and night. The extracted features can be used in a further step to determine patterns, identify objects or detect background. However, further research must be conducted on blur correction, since night scenes are commonly blurred due to artificial lighting.
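As a minimal illustration of the idea (not the authors' CUDA implementation), the threshold can be derived from the parameters of a generalized Gaussian fitted to the pairwise intensity residuals; taking the fitted standard deviation as the threshold is our assumption:

```python
import numpy as np
from scipy.stats import gennorm

def adaptive_lbp_threshold(gray):
    """Estimate an LBP comparison threshold from the distribution of
    pairwise intensity residuals, modelled as a Generalized Gaussian."""
    # Residuals of one of the pairwise comparisons an LBP operator
    # performs (each pixel against its right-hand neighbour).
    residuals = (gray[:, 1:].astype(np.float64) - gray[:, :-1]).ravel()
    beta, mu, alpha = gennorm.fit(residuals)   # GGD shape, location, scale
    # One plausible threshold: the standard deviation implied by the fit.
    return alpha * np.sqrt(gennorm.var(beta))
```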
Embedded FIR filter design for real-time refocusing using a standard plenoptic video camera
Christopher Hahne, Amar Aggoun
A novel and low-cost embedded hardware architecture for real-time refocusing based on a standard plenoptic camera is presented in this study. The proposed layout design synthesizes refocusing slices directly from micro images, omitting the commonly used sub-aperture extraction process. To this end, intellectual property cores containing switch-controlled Finite Impulse Response (FIR) filters are developed and applied to the Field Programmable Gate Array (FPGA) XC6SLX45 from Xilinx. To make the hardware design work economically, the FIR filters are composed of stored products as well as upsampling and interpolation techniques in order to achieve an ideal trade-off between image resolution, delay time, power consumption and the demand for logic gates. The video output is transmitted via High-Definition Multimedia Interface (HDMI) with a resolution of 720p at a frame rate of 60 fps, conforming to the HD ready standard. Examples of the synthesized refocusing slices are presented.
Mobile Photography
Mobile multi-flash photography
Xinqing Guo, Jin Sun, Zhan Yu, et al.
Multi-flash (MF) photography offers a number of advantages over regular photography, including removing the effects of illumination, color and texture as well as highlighting occlusion contours. Implementing MF photography on mobile devices, however, is challenging due to their restricted form factors, limited synchronization capabilities, low computational power and limited interface connectivity. In this paper, we present a novel mobile MF technique that overcomes these limitations and achieves performance comparable to conventional MF. We first construct a mobile flash ring using four LED lights and design a special mobile flash-camera synchronization unit. The mobile device's own flash first triggers the flash ring via an auxiliary photocell. The mobile flashes are then triggered consecutively in sync with the mobile camera's frame rate, to guarantee that each image is captured with only one LED flash on. To process the acquired MF images, we further develop a class of fast mobile image processing techniques for image registration, depth edge extraction, and edge-preserving smoothing. We demonstrate our mobile MF on a number of mobile imaging applications, including occlusion detection, image thumbnailing, image abstraction and object category classification.
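A simplified sketch of multi-flash depth edge extraction (the full technique searches for negative ratio transitions along each flash's epipolar direction; here we only threshold the ratio images, and the threshold value is an assumption):

```python
import numpy as np

def depth_edge_candidates(flash_images, ratio_thresh=0.7):
    """flash_images: list of grayscale frames, each lit by one LED flash."""
    stack = np.stack([im.astype(np.float64) for im in flash_images])
    max_img = stack.max(axis=0) + 1e-6      # shadow-free composite
    ratios = stack / max_img                # per-flash ratio images
    # Shadows cast at occlusion boundaries show up as low ratio values.
    return ratios.min(axis=0) < ratio_thresh
```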
Comparison of approaches for mobile document image analysis using server supported smartphones
Suleyman Ozarslan, P. Erhan Eren
With the recent advances in mobile technologies, new capabilities are emerging, such as mobile document image analysis. However, mobile phones are still less powerful than servers, and they have some resource limitations. One approach to overcoming these limitations is performing the resource-intensive processes of the application on remote servers. In mobile document image analysis, the most resource-consuming process is Optical Character Recognition (OCR), which is used to extract text from images captured with mobile phones. In this study, our goal is to compare the in-phone and remote-server processing approaches for mobile document image analysis in order to explore their trade-offs. In the in-phone approach, all processes required for mobile document image analysis run on the mobile phone. In the remote-server approach, the core OCR process runs on the remote server and the other processes run on the mobile phone. Results of the experiments show that the remote-server approach is considerably faster than the in-phone approach in terms of OCR time, but adds extra delays such as network delay. Since compression and downscaling of images significantly reduce file sizes and hence these extra delays, the remote-server approach overall outperforms the in-phone approach on the selected speed and correct-recognition metrics whenever the gain in OCR time compensates for the extra delays. Under the most preferable settings in our experiments, the remote-server approach performs better than the in-phone approach in terms of both speed and acceptable correct-recognition metrics.
UV curing adhesives optimized for UV replication processes used in micro optical applications
Andreas Kraft, Markus Brehm, Kilian Kreul
A key success factor for today's mobile handsets and tablets is the increase in functionality achieved by integrating new sensors into the devices. Optical and imaging functions are considered core features and are realized using miniaturized or micro optics. Over the last decade, micro optics manufacturing has reached a high level of maturity and is expected to be one of the manufacturing technologies that will address future requirements for even smaller and thinner devices. In this paper we look at new processes such as UV replication of micro lenses and precise placement and alignment technologies for imaging and LED optics. Manufacturing can be done on wafer level as well as by a pick-and-place process. Results for printed micro lenses, as well as placement accuracy in active alignment processes, are shown.
Mobile microscopy on the move
W. M. Lee, A. Upadhya, Tri Phan
In this paper, we demonstrate a low-cost, lightweight imaging device that improves the imaging resolution of a smartphone camera by three orders of magnitude, from millimeters to sub-micrometers. We attached the lens to a commercial smartphone camera and imaged micrometer graticules, pathological tissue slides and skin, validating the imaging quality of the lenses.
Image Quality Evaluation Methods/Standards for Mobile and Digital Photography: Joint Session with Conferences 9016 and 9023
No training blind image quality assessment
State-of-the-art blind image quality assessment (IQA) methods generally extract perceptual features from training images and feed them into a support vector machine (SVM) to learn a regression model, which is then used to predict the quality scores of test images. However, these methods require complicated training and learning, and the evaluation results are sensitive to image content and learning strategies. In this paper, two novel blind IQA metrics that require no training or learning are proposed. The new methods extract perceptual features, i.e., the shape consistency of conditional histograms, from the joint histograms of neighboring divisive normalization transform coefficients of distorted images, and then compare the length attribute of the extracted features with that of the reference and degraded images in the LIVE database. In the first method, a cluster center is found in the feature attribute space of the natural reference images, and the distance between the feature attribute of the distorted image and the cluster center is adopted as the quality label. The second method uses the feature attributes and subjective scores of all the images in the LIVE database to construct a dictionary, and the final quality score is calculated by interpolating the subjective scores of nearby words in the dictionary. Unlike traditional SVM-based blind IQA methods, the proposed metrics have explicit expressions, which reflect the relationship between the perceptual features and image quality well. Experimental results on publicly available databases such as LIVE, CSIQ and TID2008 show the effectiveness of the proposed methods, with fairly acceptable performance.
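A schematic reading of the two metrics in code form (feature extraction is omitted; the Euclidean distance and inverse-distance interpolation choices are our assumptions):

```python
import numpy as np

def quality_by_cluster_distance(feat, natural_feats):
    """Metric 1: distance from the distorted image's feature attribute
    to the cluster centre of natural reference features."""
    return np.linalg.norm(feat - natural_feats.mean(axis=0))

def quality_by_dictionary(feat, dict_feats, dict_scores, k=5):
    """Metric 2: interpolate subjective scores of the k nearest words."""
    d = np.linalg.norm(dict_feats - feat, axis=1)
    idx = np.argsort(d)[:k]
    w = 1.0 / (d[idx] + 1e-9)               # inverse-distance weights
    return np.sum(w * dict_scores[idx]) / np.sum(w)
```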
Description of texture loss using the dead leaves target: current issues and a new intrinsic approach
Leonie Kirk, Philip Herzer, Uwe Artmann, et al.
The computing power in modern digital imaging devices allows complex denoising algorithms. The negative influence of denoising on the reproduction of low-contrast, fine details is also known as texture loss. Using the dead leaves structure is a common technique to describe texture loss, and it is currently discussed as a standard method in workgroups of ISO and CPIQ. We present our experience using this method. Based on real camera data from several devices, we can point out where the weak points in the SFRDeadLeaves method are and why results should be interpreted carefully. The SFRDeadLeaves approach follows the concept of a semi-reference method, so statistical characteristics of the target are compared to statistical characteristics in the image. In the case of SFRDeadLeaves, the compared characteristic is the power spectrum. The biggest disadvantage of using the power spectrum is that phase information is ignored, as only the complex modulus is used. We present a new approach, report our experience with it, and compare it to the SFRDeadLeaves method. The new method follows the concept of a full-reference method, which is an intrinsic comparison of image data to reference data.
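The semi-reference comparison boils down to comparing radially averaged power spectra of the captured dead leaves patch and the ideal target; a rough sketch (noise-spectrum subtraction and windowing, which a real measurement needs, are omitted):

```python
import numpy as np

def radial_power_spectrum(img, nbins=64):
    """Radially averaged power spectrum of an image patch."""
    ps = np.abs(np.fft.fftshift(np.fft.fft2(img - img.mean()))) ** 2
    y, x = np.indices(ps.shape)
    r = np.hypot(y - ps.shape[0] // 2, x - ps.shape[1] // 2)
    bins = np.linspace(0, r.max(), nbins + 1)
    which = np.clip(np.digitize(r.ravel(), bins) - 1, 0, nbins - 1)
    return np.array([ps.ravel()[which == i].mean() for i in range(nbins)])

def sfr_dead_leaves(captured, target):
    """Texture SFR as the square root of the spectral power ratio."""
    return np.sqrt(radial_power_spectrum(captured) /
                   radial_power_spectrum(target))
```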
Electronic trigger for capacitive touchscreen and extension of ISO 15781 standard time lag measurements to smartphones
François-Xavier Bucher, Frédéric Cao, Clément Viard, et al.
We present in this paper a novel capacitive device that stimulates the touchscreen interface of a smartphone (or of any imaging device equipped with a capacitive touchscreen) and synchronizes triggering with the DxO LED Universal Timer to measure shooting time lag and shutter lag according to ISO 15781:2013. The device and protocol extend the time lag measurement beyond the standard by including negative shutter lag, a phenomenon increasingly common in smartphones. The device is computer-controlled, and this feature, combined with measurement algorithms, makes it possible to automate large series of captures and thus provide more refined statistical analyses, for example when the shutter lag of “zero shutter lag” devices is limited by the frame time, as our measurements confirm.
Blur
Space-varying blur kernel estimation and image deblurring
In recent years, we have seen highly successful blind image deblurring algorithms that can even handle large motion blurs. Most of these algorithms assume that the entire image is blurred with a single blur kernel. This assumption does not hold if the scene depth is not negligible or when there are multiple objects moving differently in the scene. In this paper, we present a method for space-varying point spread function (PSF) estimation and image deblurring. Regarding the PSF estimation, we do not make any restrictions on the type of blur or how the blur varies spatially. That is, the blur might be, for instance, a large (non-parametric) motion blur in one part of an image and a small defocus blur in another part without any smooth transition. Once the space-varying PSF is estimated, we perform space-varying image deblurring, which produces good results even for regions where it is not clear what the correct PSF is at first. We provide experimental results with real data to demonstrate the effectiveness of our method.
Super-resolution restoration of motion blurred images
In this paper, we investigate super-resolution image restoration from multiple images that are possibly degraded with large motion blur. The blur kernel for each input image is estimated separately. This is unlike many existing super-resolution algorithms, which assume an identical blur kernel for all input images. We also do not place any restrictions on the motion fields among images; that is, we estimate a dense motion field without simplifications such as parametric motion. We present a two-step algorithm: in the first step, each input image is deblurred using its estimated blur kernel; in the second step, super-resolution restoration is applied to the deblurred images. Because the estimated blur kernels may not be accurate, we propose a weighted cost function for the super-resolution restoration step, where the weight associated with an input image reflects the reliability of the corresponding kernel estimate and deblurred image. We provide experimental results from real video data captured with a hand-held camera, and show that the proposed weighting scheme is robust to motion deblurring errors.
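The abstract does not spell out the cost; a plausible form of such a weighted multi-frame restoration objective, with decimation $D$, blur $H_k$ and warp $W_k$ operators (our notation, not the paper's), is

$$\hat{x} = \arg\min_x \sum_k w_k \left\| D H_k W_k x - y_k \right\|^2 + \lambda\, R(x),$$

where $y_k$ are the input frames, $w_k$ encodes the reliability of the $k$-th kernel estimate and deblurred image, and $R$ is a regularizer.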
To denoise or deblur: parameter optimization for imaging systems
Kaushik Mitra, Oliver Cossairt, Ashok Veeraraghavan
In recent years smartphone cameras have improved a lot but they still produce very noisy images in low light conditions. This is mainly because of their small sensor size. Image quality can be improved by increasing the aperture size and/or exposure time however this make them susceptible to defocus and/or motion blurs. In this paper, we analyze the trade-off between denoising and deblurring as a function of the illumination level. For this purpose we utilize a recently introduced framework for analysis of computational imaging systems that takes into account the effect of (1) optical multiplexing, (2) noise characteristics of the sensor, and (3) the reconstruction algorithm, which typically uses image priors. Following this framework, we model the image prior using Gaussian Mixture Model (GMM), which allows us to analytically compute the Minimum Mean Squared Error (MMSE). We analyze the specific problem of motion and defocus deblurring, showing how to find the optimal exposure time and aperture setting as a function of illumination level. This framework gives us the machinery to answer an open question in computational imaging: To deblur or denoise?.
Depth from defocus using the mean spectral ratio
Depth from defocus aims to estimate scene depth from two or more photos captured with differing camera parameters, such as lens aperture or focus, by characterizing the difference in image blur. In the absence of noise, the ratio of Fourier transforms of two corresponding image patches captured under differing focus conditions reduces to the ratio of the optical transfer functions, since the contribution from the scene cancels. For a focus or aperture bracket, the shape of this spectral ratio depends on object depth. Imaging noise complicates matters, introducing biases that vary with object texture, making extraction of a reliable depth value from the spectral ratio difficult. We propose taking the mean of the complex valued spectral ratio over an image tile as a depth measure. This has the advantage of cancelling much of the effect of noise and significantly reduces depth bias compared to characterizing only the modulus of the spectral ratio. This method is fast to calculate and we do not need to assume any shape for the optical transfer function, such as a Gaussian approximation. Experiments with real world photographic imaging geometries show our method produces depth maps with greater tolerance to varying object texture than several previous depth from defocus methods.
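A compact sketch of the depth measure (calibration from the measure to metric depth is bracket-specific and omitted; the epsilon regularization is our assumption):

```python
import numpy as np

def mean_spectral_ratio(tile_a, tile_b, eps=1e-8):
    """Mean of the complex spectral ratio of two corresponding tiles
    from a focus/aperture bracket; scene content cancels, leaving a
    statistic of the OTF ratio in which zero-mean noise terms largely
    average out."""
    fa = np.fft.fft2(tile_a - tile_a.mean())
    fb = np.fft.fft2(tile_b - tile_b.mean())
    return (fa / (fb + eps)).mean()   # complex; map to depth via calibration
```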
An extensive empirical evaluation of focus measures for digital photography
Hashim Mir, Peter Xu, Peter van Beek
Automatic focusing of a digital camera in live preview mode, where the camera's display screen is used as a viewfinder, is done through contrast detection. In focusing using contrast detection, a focus measure is used to map an image to a value that represents the degree of focus of the image. Many focus measures have been proposed and evaluated in the literature. However, previous studies on focus measures have either used a small number of benchmark images in their evaluation, been directed at microscopy rather than digital cameras, or been based on ad hoc evaluation criteria. In this paper, we perform an extensive empirical evaluation of focus measures for digital photography and advocate using three standard statistical measures of performance, namely precision, recall, and mean absolute error, as evaluation criteria. Our experimental results indicate that (i) some popular focus measures perform poorly when applied to autofocusing in digital photography, and (ii) simple focus measures based on taking the first derivative of an image perform exceedingly well in digital photography.
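For reference, a first-derivative (squared-gradient) focus measure of the kind the study finds effective takes only a few lines; a contrast-detect autofocus loop then picks the lens position that maximizes it:

```python
import numpy as np

def squared_gradient_focus(gray):
    """First-derivative focus measure: sum of squared finite differences.
    Larger values indicate a sharper (better focused) image."""
    g = gray.astype(np.float64)
    return float((np.diff(g, axis=1) ** 2).sum() +
                 (np.diff(g, axis=0) ** 2).sum())
```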
Out-of-focus point spread functions
There are many ways in which the performance of a lens can be characterized, most of which measure properties of the in-focus image. The current work instead centers on measuring properties of an out-of-focus (OOF) image. The image created by imaging a point of light is commonly known as the point spread function (PSF). We have found that measuring the OOF PSF yields a great deal of otherwise unavailable information about the lens. The current work presents observations and sample images from measurements made on a collection of over 125 lenses. A variety of the attributes that can be obtained from the study of OOF PSFs, and some of their applications, are discussed.
Image Processing Pipeline and Camera Characterization
Automating the design of image processing pipelines for novel color filter arrays: local, linear, learned (L3) method
Qiyuan Tian, Steven Lansel, Joyce E. Farrell, et al.
The high density of pixels in modern color sensors provides an opportunity to experiment with new color filter array (CFA) designs. A significant bottleneck in evaluating new designs is the need to create demosaicking, denoising and color transform algorithms tuned for the CFA. To address this issue, we developed a method (local, linear, learned, or L3) for automatically creating an image processing pipeline. In this paper we describe the L3 algorithm and illustrate how we created a pipeline for a CFA organized as a 2×2 RGB/W block containing a clear (W) pixel. Under low light conditions, the L3 pipeline developed for the RGB/W CFA produces images that are superior to those from a matched Bayer RGB sensor. We also use L3 to learn pipelines for other RGB/W CFAs with different spatial layouts. The L3 algorithm shortens the development time for producing a high quality image pipeline for novel CFA designs.
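Schematically, an L3 pipeline is a per-patch classification followed by a per-class linear map; the classifier and learned transforms below stand in for quantities that would be fit offline from simulated sensor data:

```python
import numpy as np

def l3_render(patches, classify, transforms):
    """Local: each patch is classified from its own statistics (e.g.
    mean level, saturation). Linear: the class's matrix maps the CFA
    patch to output values. Learned: transforms[c] would be fit
    offline, e.g. by regularized least squares."""
    return np.array([transforms[classify(p)] @ p for p in patches])
```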
Minimized-Laplacian residual interpolation for color image demosaicking
A color difference interpolation technique is widely used for color image demosaicking. In this paper, we propose minimized-Laplacian residual interpolation (MLRI) as an alternative to color difference interpolation, where the residuals are the differences between observed and tentatively estimated pixel values. In the MLRI, we estimate the tentative pixel values by minimizing the Laplacian energies of the residuals. This residual image transformation allows us to interpolate more easily than the standard color difference transformation. We incorporate the proposed MLRI into the gradient-based threshold-free (GBTF) algorithm, which is one of the current state-of-the-art demosaicking algorithms. Experimental results demonstrate that our proposed demosaicking algorithm outperforms the state-of-the-art algorithms on the 30 images of the IMAX and Kodak datasets.
Image sensor noise profiling by voting based curve fitting
S. Battiato, G. Puglisi, R. Rizzo, et al.
The output quality of an image filter that reduces noise without damaging the underlying signal strongly depends on the accuracy of the noise model in characterizing the noise introduced by the acquisition device. In this paper we provide a solution for characterizing the signal-dependent noise injected at shot time by the image sensor. Different fitting models describing the behavior of noise samples are analyzed, with the aim of finding a model that offers the most accurate coverage of the sensor noise under any of its operating conditions. The noise fitting equation minimizing the residual error is then identified. Moreover, a novel algorithm able to obtain the noise profile of a generic image sensor without the need for a controlled environment is proposed. Starting from a set of heterogeneous CFA images, the parameters of the noise model are estimated using a voting-based estimator.
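As context, a common signal-dependent model for shot-time sensor noise is an affine variance law; a plain least-squares fit is shown below, whereas the paper's contribution is a voting-based estimator that works on heterogeneous CFA images rather than controlled captures:

```python
import numpy as np

def fit_affine_noise_model(means, variances):
    """Fit var(I) = a*I + b (Poisson-Gaussian style) to per-patch
    (mean, variance) samples gathered from flat image regions."""
    a, b = np.polyfit(np.asarray(means), np.asarray(variances), deg=1)
    return a, b
```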
Analysis of a 64x64 matrix of direct color sensors based on spectrally tunable pixels
A. Caspani, G. Langfelder, A. Longoni, et al.
In past years we conceived and developed the concept of spectrally tunable direct color sensors, based on the principle of the Transverse Field Detector. In this work we analyze the performance of a 64×64 (×3 colors) matrix of such sensors built in a standard 150 nm CMOS technology for demonstration purposes. The matrix is mounted on an electronic board that provides the biasing; the board is inserted into a suitably arranged film back slot (magazine) of a Hasselblad 500C camera. The camera is aligned in front of a transparency Macbeth ColorChecker uniformly illuminated by an integrating sphere. Electrical, noise and colorimetric performance are the object of the ongoing analysis. In particular, the color reconstruction results will be compared to those obtainable with large-area devices based on the same concept.
Computer Vision and Applications
Light transport matrix recovery for nearly planar objects
Niranjan Thanikachalam, Loic Baboulaz, Paolo Prandoni, et al.
The light transport matrix has become a powerful tool for scene relighting, owing to its versatility in representing various light transport phenomena. We argue that scenes with an almost planar surface geometry, even with significant amounts of surface roughness, have a banded structure in the light transport matrix. In this paper, we propose a method that exploits this structure of the light transport matrix to provide significant savings in both acquisition time and computation time, while retaining high accuracy. We validate the proposed algorithm by recovering the light transport of real objects that exhibit multiple scattering and of rendered scenes exhibiting inter-reflections.
The color of water: using underwater photography to estimate water quality
John Breneman IV, Henryk Blasinski, Joyce Farrell
We describe a model for underwater illumination that is based on how light is absorbed and scattered by water, phytoplankton and other organic and inorganic matter in the water. To test the model, we built a color rig using a commercial point-and-shoot camera in an underwater housing and a calibrated color target. We used the measured spectral reflectance of the calibration color target and the measured spectral sensitivity of the camera to estimate the spectral power of the illuminant at the surface of the water. We then used this information, along with spectral basis functions describing light absorbance by water, phytoplankton, non-algal particles (NAP) and colored dissolved organic matter (CDOM), to estimate the spectral power of the illuminant and the amount of scattered light at each depth. Our results lead to insights about color correction, as well as the limitations of consumer digital cameras for monitoring water quality.
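The underlying illumination model is of a Beer-Lambert type; a sketch of the depth-dependent downwelling spectrum under the abstract's four absorbers (the argument names and the omission of scattering are our simplifications):

```python
import numpy as np

def downwelling_spectrum(E_surface, depth_m, a_water, a_phyto, a_nap,
                         a_cdom, c_phyto, c_nap, c_cdom):
    """Spectral irradiance at depth: the surface spectrum attenuated by
    water plus constituent absorptions (all per-wavelength arrays).
    The concentrations c_* are what such a method would estimate."""
    a_total = a_water + c_phyto * a_phyto + c_nap * a_nap + c_cdom * a_cdom
    return E_surface * np.exp(-a_total * depth_m)
```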
Surveillance system of power transmission line via object recognition and 3D vision computation
Surveillance systems have been widely applied to power transmission systems as a security precaution against the major risk factor of construction activity in the vicinity of power transmission lines. However, the automatic object detection currently used in surveillance systems suffers from a high error rate and has at least two limitations: first, the type of the object cannot be recognized; second, the degree of danger the object poses cannot be identified. In this paper, we propose a video surveillance method for the security of power transmission lines based on object recognition and 3D spatial location detection, so that moving objects are recognized and their position and size are determined in order to assess the danger they pose. Experimental results show that the system developed with our proposed method is feasible and practical.
Color
Metamer density estimation using an identical ellipsoidal Gaussian mixture prior
Yusuke Murayama, Pengchang Zhang, Ari Ide-Ektessabi
We propose an improved method for camera metamer density estimation. A camera metamer is a set of surface spectral reflectances that induce an identical RGB response in a color imaging device such as a digital camera or scanner. For high fidelity color correction it is desirable to calculate the set of metamers and then choose the optimal value in a standard color space. Previous methods adopted overly simple models to represent the constraints on spectral reflectance: the set of metamers was over-estimated, which degraded the accuracy of color correction. We model the constraints on spectral reflectance as an identical ellipsoidal Gaussian mixture distribution, and test and compare the proposed model and two conventional models in a numerical experiment. We found that the proposed model accurately represents the underlying curved patterns within the given dataset and avoids generating inappropriate camera metamers. The accuracy of color correction was also evaluated assuming two commercial cameras and two standard illuminants, showing that higher color correction accuracy is achieved with the proposed model.
Absolute colorimetric characterization of a DSLR camera
Giuseppe Claudio Guarnera, Simone Bianco, Raimondo Schettini
A simple but effective technique for absolute colorimetric camera characterization is proposed. It offers a large dynamic range while requiring just a single off-the-shelf target and a commonly available controllable light source for the characterization. The characterization task is broken down into two modules, devoted respectively to absolute luminance estimation and to colorimetric characterization matrix estimation. The characterized camera can be effectively used as a tele-colorimeter, giving an absolute estimate of the XYZ data in cd/m2. The user is only required to vary the f-number of the camera lens or the exposure time t, to better exploit the sensor dynamic range. The estimated absolute tristimulus values closely match the values measured by a professional spectro-radiometer.
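Our schematic reading of the two modules in code (the calibration constant, the matrix and the exposure normalization are illustrative, not the paper's exact formulation):

```python
import numpy as np

def absolute_xyz(rgb_linear, M, t, f_number, k):
    """Colorimetric module: a 3x3 matrix M maps linear camera RGB to
    relative XYZ. Absolute module: normalize by the relative exposure
    t/N^2 and scale by a one-time calibration constant k to cd/m^2."""
    xyz_rel = M @ rgb_linear
    return k * xyz_rel * (f_number ** 2) / t
```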
Simultaneous capturing of RGB and additional band images using hybrid color filter array
Extra band information in addition to RGB, such as near-infrared (NIR) and ultraviolet, is valuable for many applications. In this paper, we propose a novel color filter array (CFA), which we call a “hybrid CFA,” and a demosaicking algorithm for the simultaneous capture of RGB and additional band images. Our proposed hybrid CFA and demosaicking algorithm do not rely on any specific correlation between the RGB and the additional band; therefore, the additional band can be chosen arbitrarily by users. Experimental results demonstrate that our proposed demosaicking algorithm with the proposed hybrid CFA can provide the additional band image while keeping the RGB image at almost the same quality as an image acquired using the standard Bayer CFA.
HDR
Recovering badly exposed objects from digital photos using internet images
Florian M. Savoy, Vassilios Vonikakis, Stefan Winkler, et al.
In this paper we consider the problem of clipped-pixel recovery over an entire badly exposed image region, using two correctly exposed images of the scene that may be captured under different conditions. The first reference image is used to recover texture; feature points are extracted along the boundaries of both the source and reference regions, while a warping function deforms the reference region to fit inside the source. The second reference is used to recover color by replacing the mean and variance of the texture reference image with those of the color reference. A user study conducted with both modified and original images demonstrates the benefits of our method. The results show that a majority of the enhanced images look natural and are preferred to the originals.
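The color-recovery step described in the abstract is a per-channel mean/variance transfer; a minimal sketch for 8-bit images:

```python
import numpy as np

def transfer_mean_variance(texture_ref, color_ref):
    """Give texture_ref the per-channel mean and variance of color_ref,
    as used to recover colour from the second reference image."""
    out = np.empty_like(texture_ref, dtype=np.float64)
    for c in range(texture_ref.shape[2]):
        t = texture_ref[..., c].astype(np.float64)
        r = color_ref[..., c].astype(np.float64)
        out[..., c] = (t - t.mean()) / (t.std() + 1e-6) * r.std() + r.mean()
    return np.clip(out, 0, 255).astype(np.uint8)
```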
Creating cinematic wide gamut HDR-video for the evaluation of tone mapping operators and HDR-displays
Jan Froehlich, Stefan Grandinetti, Bernd Eberhardt, et al.
High quality video sequences are required for the evaluation of tone mapping operators and high dynamic range (HDR) displays. We provide scenic and documentary scenes with a dynamic range of up to 18 stops. The scenes are staged using professional film lighting, make-up and set design to enable the evaluation of image and material appearance. To address challenges for HDR-displays and temporal tone mapping operators, the sequences include highlights entering and leaving the image, brightness changing over time, high contrast skin tones, specular highlights and bright, saturated colors. HDR-capture is carried out using two cameras mounted on a mirror-rig. To achieve a cinematic depth of field, digital motion picture cameras with Super-35mm size sensors are used. We provide HDR-video sequences to serve as a common ground for the evaluation of temporal tone mapping operators and HDR-displays. They are available to the scientific community for further research.
Cost-effective multi-camera array for high quality video with very high dynamic range
Joachim Keinert, Marcus Wetzel, Michael Schöberl, et al.
Temporal bracketing can create images with a higher dynamic range than the underlying sensor. Unfortunately, moving objects cause disturbing artifacts. Moreover, the combination with high frame rates is almost unachievable, since a single video frame requires multiple sensor readouts. The combination of multiple synchronized side-by-side cameras equipped with different attenuation filters promises a remedy, since all exposures can be performed at the same time with the same duration using the playout video frame rate. However, a disparity correction is needed to compensate for the spatial displacement of the cameras. Unfortunately, the requirements for a high quality disparity correction contradict the goal of increasing dynamic range. When using two cameras, disparity correction needs objects to be properly exposed in both cameras. In contrast, a dynamic range increase needs the cameras to capture different luminance ranges. As this contradiction has not been addressed in the literature so far, this paper proposes a novel solution based on a three-camera setup. It enables accurate determination of the disparities and an increase of the dynamic range by nearly a factor of two, while still limiting costs. Compared to a two-camera solution, the mean opinion score (MOS) is improved by 13.47 units on average for the Middlebury images.
The effect of split pixel HDR image sensor technology on MTF measurements
Split-pixel HDR sensor technology is particularly advantageous in automotive applications, because the images are captured simultaneously rather than sequentially, thereby reducing motion blur. However, split-pixel technology introduces artifacts into MTF measurement. To achieve an HDR image, raw images are captured from both large and small sub-pixels and combined to make the HDR output. In some cases, a large sub-pixel is used for long-exposure captures and a small sub-pixel for short exposures, to extend the dynamic range. The relative size of the photosensitive area of the pixel (fill factor) plays a very significant role in the measured MTF. Given an identical scene, the MTF will be significantly different depending on whether the large or small sub-pixels are used; i.e., a smaller fill factor (e.g. in the short-exposure sub-pixel) will result in higher MTF scores but significantly greater aliasing. Simulations of split-pixel sensors revealed that, when raw images from both sub-pixels are combined, there is a significant difference in rising-edge (i.e. black-to-white transition) and falling-edge (white-to-black) reproduction. Experimental results showed a difference of ~50% in measured MTF50 between the falling and rising edges of a slanted-edge test chart.
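The fill-factor dependence follows from the pixel-aperture MTF; a toy model for a square photosite (treating the photosensitive area as square is our simplification):

```python
import numpy as np

def pixel_aperture_mtf(f_cycles_per_pixel, fill_factor):
    """Aperture MTF |sinc(f*w)|, with w the photosensitive width as a
    fraction of the pixel pitch. A smaller fill factor gives a wider
    MTF (higher measured scores) but more aliasing beyond Nyquist."""
    w = np.sqrt(fill_factor)          # linear width from area fraction
    return np.abs(np.sinc(f_cycles_per_pixel * w))  # np.sinc = sin(pi x)/(pi x)
```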
Interactive Paper Session
A method of mobile display (OLED/LCD) sharpness assessment through the perceptual brightness and edge characteristic of display and image
Min Woo Lee, Jee Young Yeom, Jong Ho Kim, et al.
Image quality evaluation is a complex task because of the compound impacts of various aspects such as display traits, image contents, and the human visual system. Recently, a variety of portable digital devices have been released, and their image quality has become one of the significant factors for typical consumers; however, image quality evaluation for mobile displays is not yet well understood. This paper proposes a method to evaluate the sharpness of mobile displays based upon both display and image characteristics, such as color gamut, spatial resolution, and color appearance attributes.
Spatial adaptive upsampling filter for HDR image based on multiple luminance range
Qian Chen, Guan-ming Su, Yin Peng
In this paper, we propose an adaptive upsampling filter to spatially upscale an HDR image based on the luminance range of the HDR picture in each color channel. The method first searches for the optimal luminance range values to partition an HDR image into three parts: dark, mid-tone and highlight. We then derive the optimal set of filter coefficients, both vertically and horizontally, for each part. When an HDR pixel is within the dark area, we apply one set of filter coefficients for vertical upsampling; if it falls in the mid-tone area, we apply a second set; otherwise the pixel is in the highlight area and a third set is applied. Horizontal upsampling is carried out likewise based on luminance. The idea of partitioning an HDR image into different luminance areas rests on the fact that most HDR images are created from multiple exposures, and different exposures usually show slight variations in captured signal statistics, such as noise level and subtle misalignment. Grouping regions into three luminance partitions therefore helps to reduce the variation between signals, and deriving an optimal filter for each group of signals with less variation is more efficient than deriving one for the entire HDR image. Experimental results show that the proposed adaptive upsampling filter based on luminance ranges outperforms the single optimal upsampling filter by around 0.57 dB for the R channel, 0.44 dB for the G channel and 0.31 dB for the B channel.
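The per-pixel filter selection reduces to a three-way luminance test; a sketch with placeholder thresholds and coefficient sets (the paper optimizes both):

```python
def select_upsampling_filter(luma, dark_max, mid_max, coeffs):
    """Return the filter coefficient set for this pixel's partition.
    `coeffs` maps 'dark' / 'mid' / 'highlight' to coefficient arrays."""
    if luma < dark_max:
        return coeffs['dark']
    if luma < mid_max:
        return coeffs['mid']
    return coeffs['highlight']
```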
A classification-and-reconstruction approach for a single image super-resolution by a sparse representation
YingYing Fan, Masayuki Tanaka, Masatoshi Okutomi
Sparse representation is known as a very powerful tool for solving image reconstruction problems such as denoising and single-image super-resolution. In a sparse representation, it is assumed that an image patch can be approximated by a linear combination of a few bases selected from a given dictionary. A single over-complete dictionary is usually learned from training patches. Most dictionary learning methods are concerned with building a general over-complete dictionary on the assumption that its bases can represent everything. However, with a more appropriate dictionary, the sparse representation of a patch can obtain better results. In this paper, we propose a classification-and-reconstruction approach with multiple dictionaries. Before learning dictionaries for reconstruction, some representative bases are used to classify all training patches from the database, and multiple dictionaries for reconstruction are then learned from the classified patches respectively. In the reconstruction phase, each patch of the input image is classified and the appropriate dictionary is selected for it. We demonstrate that the proposed classification-and-reconstruction approach outperforms the existing sparse representation with a single dictionary.
LoG acts as a good feature in the task of image quality assessment
In previous work, the LoG (Laplacian of Gaussian) signal, the output of the earliest stage of the human visual neural system, was suggested to be useful in image quality assessment (IQA) model design. That work considered that the LoG signal carries crucial structural information for IQA in the positions of its zero-crossings, and proposed a Non-shift Edge (NSE) based IQA model. In this study, we focus on another property of the LoG signal: LoG whitens the power spectrum of natural images. Our interest here is: when exposed to unnatural images, specifically distorted images, how does the HVS whiten this type of signal? In this paper, we first investigate the whitening filter for natural and distorted images respectively, and then show that the LoG is also, to some extent, a whitening filter for distorted images. Based on this fact, we deploy the LoG signal in IQA model design by applying two very simple distance metrics, the MSE (mean squared error) and the correlation. The proposed models are analyzed by evaluating their performance on three subjective databases. The experimental results validate the usability of the LoG signal in IQA model design and show that the proposed models are competitive with state-of-the-art IQA models.
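The simpler of the two proposed distances, MSE between LoG responses, fits in a few lines (the sigma value is a placeholder):

```python
import numpy as np
from scipy.ndimage import gaussian_laplace

def log_mse(reference, distorted, sigma=1.5):
    """MSE between LoG (whitened) responses of the reference and
    distorted images, one of the two simple metrics the paper applies."""
    lr = gaussian_laplace(reference.astype(np.float64), sigma)
    ld = gaussian_laplace(distorted.astype(np.float64), sigma)
    return float(np.mean((lr - ld) ** 2))
```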
White constancy method for mobile displays
Ji Young Yum, Hyun Hee Park, Seul Ki Jang, et al.
Nowadays, consumers' demands on the image quality of mobile devices are increasing as smartphones become widely used. For example, colors may be perceived differently when content is displayed under different illuminants: white displayed under an incandescent lamp is perceived as bluish, while the same content under LED light is perceived as yellowish. When the perceived white changes with the illuminant, image quality is degraded. The objective of the proposed white constancy method is to maintain consistent output colors regardless of the illuminant. Human visual experiments were performed to analyze viewers' perceptual constancy: participants were asked to choose the displayed white under a variety of illuminants. The relationship between the illuminants and the colors selected as white is modeled by a mapping function based on the results of the human visual experiments. White constancy values for image control are determined from the predesigned functions. Experimental results indicate that the proposed method yields better image quality by keeping the display white consistent.