Proceedings Volume 7242

Image Quality and System Performance VI


View the digital version of this volume at SPIE Digital Library.

Volume Details

Date Published: 19 January 2009
Contents: 14 Sessions, 48 Papers, 0 Presentations
Conference: IS&T/SPIE Electronic Imaging 2009
Volume Number: 7242

Table of Contents


All links to SPIE Proceedings will open in the SPIE Digital Library.
  • Front Matter: Volume 7242
  • Image Quality Standards for Print
  • Image Quality Standards for Capture and Display
  • Subjective Image Quality Evaluation Methodology I
  • Subjective Image Quality Evaluation Methodology II
  • Image Quality Attributes Characterization and Measurement I
  • Image Quality Attributes Characterization and Measurement II
  • Objective Metrics of Perceptual Image Quality I
  • Objective Metrics of Perceptual Image Quality II
  • System Performance: Advanced Display Technologies
  • System Performance: Capture and Display
  • System Performance: Mobile Phones and CMOS Cameras
  • System Performance: Video
  • Interactive Paper Session
Front Matter: Volume 7242
Front Matter: Volume 7242
This PDF file contains the front matter associated with SPIE Proceedings Volume 7242, including the Title Page, Copyright information, Table of Contents, and the Conference Committee listing.
Image Quality Standards for Print
Characteristic measurements for the qualification of reflection scanners in the evaluation of image quality attributes
Eric K. Zeise
The claimed specifications of reflection scanners make their utilization for analytic measurement very tempting. This paper summarizes an effort by Working Group 4 of ISO/IEC JTC-1 SC28 to develop evaluation methods that can be used to characterize the performance of reflection scanners with the goal of developing a sufficient characterization set to serve as evaluation methods in conformance testing for future image quality standards. Promising evaluation methods for tone-scale, spatial and temporal uniformity, spatial distortion, SNR, dynamic range and flare characteristics will be described.
INCITS W1.1 standards for perceptual evaluation of text and line quality
INCITS W1.1 is a project chartered to develop an appearance-based image quality standard. This paper summarizes the work to date of the W1.1 Text and Line Quality ad hoc team, and describes the progress made in developing a Text Quality test pattern and an analysis procedure based on experience with previous perceptual rating experiments.
W1.1 macro uniformity
D. René Rasmussen, Frans Gaykema, Yee S. Ng, et al.
The INCITS W1.1 macro-uniformity team works towards the development of a standard for evaluation of perceptual image quality of color printers. The team specifically addresses the types of defects that fall in the category of macro-uniformity, such as streaks, bands and mottle. This paper provides a brief summary of the status of this work, and describes recent results regarding the precision of the macro-uniformity quality ruler for assessment of typical printer defects.
Measurement of contributing attributes of perceived printer resolution
Eric K. Zeise, Sang Ho Kim, Brian E. Cooper, et al.
Several measurable image quality attributes contribute to the perceived resolution of a printing system. These contributing attributes include addressability, sharpness, raggedness, spot size, and detail rendition capability. This paper summarizes the development of evaluation methods that will become the basis of ISO 29112, a standard for the objective measurement of monochrome printer resolution.
Image Quality Standards for Capture and Display
Softcopy quality ruler method: implementation and validation
Elaine W. Jin, Brian W. Keelan, Junqing Chen, et al.
A softcopy quality ruler method was implemented for the International Imaging Industry Association (I3A) Camera Phone Image Quality (CPIQ) Initiative. This work extends ISO 20462 Part 3 by creating reference digital images of known subjective image quality, complementing the hardcopy Standard Reference Stimuli (SRS). The softcopy ruler method was developed using images from a Canon EOS 1Ds Mark II D-SLR digital still camera (DSC) and a Kodak P880 point-and-shoot DSC. Images were viewed on an Apple 30-inch Cinema Display at a viewing distance of 34 inches. Ruler images were made for 16 scenes. Thirty ruler images were generated for each scene, representing ISO 20462 Standard Quality Scale (SQS) values of approximately 2 to 31 at an increment of one just noticeable difference (JND), by adjusting the system modulation transfer function (MTF). A Matlab GUI was developed to display the ruler and test images side by side, with a user-adjustable ruler level controlled by a slider. A validation study was performed at Kodak, Vista Point Technology, and Aptina Imaging, in which all three companies set up a similar viewing lab to run the softcopy ruler method. The results show that the three sets of data are in reasonable agreement with each other, with the differences within the range expected from observer variability. Compared to previous implementations of the quality ruler, the slider-based user interface allows approximately 2x faster assessments with 21.6% better precision.
Correlating objective and subjective evaluation of texture appearance with applications to camera phone imaging
Jonathan B. Phillips, Stephen M. Coppola, Elaine W. Jin, et al.
Texture appearance is an important component of photographic image quality as well as object recognition. Noise cleaning algorithms are used to decrease sensor noise of digital images, but can hinder texture elements in the process. The Camera Phone Image Quality (CPIQ) initiative of the International Imaging Industry Association (I3A) is developing metrics to quantify texture appearance. Objective and subjective experimental results of the texture metric development are presented in this paper. Eight levels of noise cleaning were applied to ten photographic scenes that included texture elements such as faces, landscapes, architecture, and foliage. Four companies (Aptina Imaging, LLC, Hewlett-Packard, Eastman Kodak Company, and Vista Point Technologies) have performed psychophysical evaluations of overall image quality using one of two methods of evaluation. Both methods presented paired comparisons of images on thin film transistor liquid crystal displays (TFT-LCD), but the display pixel pitch and viewing distance differed. CPIQ has also been developing objective texture metrics and targets that were used to analyze the same eight levels of noise cleaning. The correlation of the subjective and objective test results indicates that texture perception can be modeled with an objective metric. The two methods of psychophysical evaluation exhibited high correlation despite the differences in methodology.
Imaging performance taxonomy
A significant challenge in the adoption of today's digital imaging standards is a clear connection to today's vernacular digital imaging vocabulary. Commonly used terms like resolution, dynamic range, delta E, white balance, exposure, or depth of focus are mistakenly considered measurements in their own right and are frequently depicted as a disconnected shopping list of individual metrics with little common foundation. In fact, many of these are simple summary measures derived from more fundamental imaging science/engineering metrics adopted in existing standard protocols. Four important underlying imaging performance metrics are: Spatial Frequency Response (SFR), Opto-Electronic Conversion Function (OECF), Noise Power Spectrum (NPS), and Spatial Distortion. We propose an imaging performance taxonomy. With a primary focus on image capture performance, our objective is to indicate connections between related imaging characteristics and to provide context for the array of commonly used terms. Starting with the concepts of Signal and Noise, the above imaging performance metrics are related to several simple measures that are compatible with testing for design verification, manufacturing quality assurance, and technology selection evaluation.
Extended use of ISO 15739 incremental signal-to-noise ratio as reliability criterion for multiple-slope wide dynamic range image capture
In the emerging field of automotive vision, video capture is the critical front-end of driver assistance and active safety systems. Previous photospace measurements have shown that light levels in natural traffic scenes may contain an extremely wide intra-scene intensity range. This requires the camera to have a wide dynamic range (WDR) for it to adapt quickly to changing lighting conditions and to reliably capture all scene detail. Multiple-slope CMOS technology offers a cost-effective way of adaptively extending dynamic range by partially resetting (recharging) the CMOS pixel once or more often within each frame time. This avoids saturation and leads to a response curve with piecewise linear slopes of progressively increasing compression. It was observed that the image quality from multiple-slope image capture is strongly dependent on the control (height and time) of each reset barrier. As compression and thus dynamic range increase there is a trade-off against contrast and detail loss. Incremental signal-to-noise ratio (iSNR) is proposed in ISO 15739 for determining dynamic range. Measurements and computer simulations revealed that the observed trade-off between WDR extension and the loss of local detail could be explained by a drop in iSNR at each reset point. If a reset barrier is not optimally placed then iSNR may drop below the detection limit so that an 'iSNR hole' appears in the dynamic range. Thus ISO 15739 iSNR has gained extended utility: it not only measures the dynamic range limits but also defines dynamic range as the intensity range where detail detection is reliable. It has become a critical criterion when designing adaptive barrier control algorithms that maximize dynamic range while maintaining the minimum necessary level of detection reliability.
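To make the multiple-slope idea concrete, the piecewise-linear response curve can be sketched as follows; the barrier positions and segment slopes below are illustrative placeholders, not values from the paper:

```python
import numpy as np

def multislope_response(exposure, barriers, slopes):
    """Piecewise-linear multiple-slope response: each reset barrier starts a
    new segment with a shallower (more compressed) slope. Illustrative sketch
    only; real sensors are characterized per ISO 15739."""
    out = np.zeros_like(exposure, dtype=float)
    prev = 0.0
    for b, s in zip([*barriers, np.inf], slopes):
        # Portion of the exposure falling within this segment, scaled by its slope
        out += (np.clip(exposure, prev, b) - prev) * s
        prev = b
    return out

# Example: slope halves after each of two hypothetical reset barriers
x = np.linspace(0, 100, 5)
y = multislope_response(x, barriers=[20, 60], slopes=[1.0, 0.5, 0.25])
```

Each additional barrier extends the representable intensity range at the cost of contrast (and hence detail detectability) in the compressed segments, which is exactly the trade-off the iSNR criterion quantifies.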
Subjective Image Quality Evaluation Methodology I
Web-based psychometric evaluation of image quality
Iris Sprow, Zofia Baranczuk, Tobias Stamm, et al.
The measurement of image quality requires judgement by the human visual system. This paper describes a psycho-visual test technique that uses the internet as a test platform to identify image quality in a more time-effective manner. The visual response data are compared with the results from the same test in a lab-based environment to estimate the usefulness of the internet as a platform for scaling studies.
Development of a balanced test image for visual print quality evaluation
Hanne Salmi, Raisa Halonen, Tuomas Leisti, et al.
A test image for color still image processes was developed. The image is based on general requirements on the content and specific requirements arising from the quality attributes of interest. The quality attributes addressed in the study include sharpness, noise, contrast, colorfulness and gloss. These were chosen based on visual relevance in studies of the influence of paper in digital printing. Further requirements, such as those arising from the use cases of the image, are discussed based on eye tracking data and self-report of the usefulness of different objects for quality evaluation. To be sufficiently sensitive to quality variations of the imaging systems to be measured, the reference test image needs to represent quality maxima in terms of the relevant quality parameters. As for different viewing times, no object should be exceedingly salient. The paper presents the procedure of developing the test image and discusses its merits and shortcomings from the standpoint of future development.
Perceptual image attribute scales derived from overall image quality assessments
Kyung Hoon Oh, Sophie Triantaphillidou, Ralph E. Jacobson
Psychophysical scaling is commonly based on the assumption that the overall quality of images rests on the assessment of individual attributes which the observer is able to recognise and separate, i.e. sharpness, contrast, etc. However, the assessment of individual attributes is a subject of debate, since they are unlikely to be independent of each other. This paper presents an experiment that was carried out to derive individual perceptual attribute interval scales from overall image quality assessments, and therefore to examine the weight of each individual attribute in the overall perceived quality. A psychophysical experiment was performed by fourteen observers. Thirty-two original images were manipulated by adjusting three physical parameters that altered image blur, noise and contrast. The data were then arranged by permutation, where ratings for each individual attribute were averaged to examine the variation of ratings in other attributes. The results confirmed that one JND of added noise and one JND of added blurring reduced image quality more than did one JND in contrast change. Furthermore, they indicated that the range of distortion introduced by blurring covered the entire image quality scale, but the ranges of added noise and contrast adjustments were too small for investigating the consequences over the full range of image quality. There were several interesting tradeoffs between noise, blur and changes in contrast. Further work on the effect of (test) scene content was carried out to objectively reveal which types of scenes were significantly affected by changes in each attribute.
Subjective Image Quality Evaluation Methodology II
Subjective experience of image quality: attributes, definitions, and decision making of subjective image quality
Subjective quality rating does not reflect the properties of the image directly; rather, it is the outcome of a quality decision-making process, which includes quantification of the subjective quality experience. Such rich subjective content is often ignored. We conducted two experiments (with 28 and 20 observers) in order to study the effect of paper grade on the image quality experience of ink-jet prints. Image quality experience was studied using a grouping task and a quality rating task. Both tasks included an interview, but in the latter task we examined the relations of different subjective attributes in this experience. We found that the observers use an attribute hierarchy, in which the high-level attributes are more experiential, general and abstract, while low-level attributes are more detailed and concrete. This may reflect the hierarchy of the human visual system. We also noticed that while the observers show variable subjective criteria for image quality, the reliability of average subjective estimates is high: when two different observer groups estimated the same images in the two experiments, correlations between the mean ratings were between .986 and .994, depending on the image content.
Toward an automatic subjective image quality assessment system
M. Chambah, S. Ouni, M. Herbin, et al.
In the field of image quality assessment, the terms "automatic" and "subjective" are usually incompatible. When it comes to image quality assessment, there are essentially two kinds of evaluation techniques: subjective evaluation and objective evaluation. Only objective evaluation techniques can be automated, while subjective evaluation techniques are performed through a series of visual assessments by expert or non-expert observers. In this paper, we present a first attempt at an automatic subjective quality assessment system. The system computes perception-correlated color metrics from a learning set of images. During the learning stage, a subjective assessment by users is required so that the system matches the subjective opinions with computed metrics on a variety of images. Once the learning process is over, the system operates in an automatic mode using only the learned knowledge and the reference-free metrics computed from the images to assess. Results and future prospects of this work are presented.
Methods for measuring display defects as correlated to human perception
H. Kostal, G. Pedeville, R. Rykowski
Human vision and perception are the ultimate determinants of display quality, but human judgment is variable, making it difficult to define and apply quantitatively in research or production environments. Traditional methods for automated defect detection, however, do not relate directly to human perception, which is especially an issue in identifying just noticeable differences. Accurately correlating human perceptions of defects with the information that can be gathered using imaging colorimeters offers an opportunity for objective and repeatable detection and quantification of such defects. By applying algorithms for just-noticeable-difference (JND) image analysis, a means of automated, repeatable display analysis directly correlated with human perception can be realized. The implementation of this technique and typical results are presented. Initial application of the JND analysis provides quantitative information that allows a quantitative grading of display image quality for FPDs and projection displays, supplementing other defect detection techniques.
Image Quality Attributes Characterization and Measurement I
A strobe-based inspection system for drops-in-flight
Imaging and measurement of drops-in-flight often relies on the measurement system's ability to drive the print head directly in order to synchronize the strobe for repeatable image capture. In addition, many systems do not have the necessary combination of strobe control and image analysis for full drop-in-flight evaluation. This paper includes a discussion of an integrated machine-vision based system for visualization and measurement of drops-in-flight that can be used with any frequency-based jetting system. The strobe is linked to the firing frequency of the print head, so while it is synchronized, it is independent of the specific print head being inspected. The imaging system resolves droplets down to 2 picoliters in volume at the highest zoom level, and an open-architecture software package allows for image collection and archiving as well as powerful and flexible image analysis. This paper gives an overview of the system and shows some of its capabilities through several examples of drop-in-flight analysis.
Image on paper registration measurement and analysis: determining subsystem contributions from a system level measurement
Rakesh Kulkarni, Abu Islam, Dan Costanza
An important print quality attribute of digital printing equipment is the absolute position of the printed image relative to the page. Historically, the most precise method of measuring image-on-paper (IOP) registration has been to scan a printed sheet on a flatbed scanner. These measurements have been limited to sheets smaller than the full capacity of the printer. In addition, the precision of the measurement has been limited by the accuracy of the scanner itself, and the measurement of only a few (~4) points on the page has limited the information that can be gathered. The new method proposed in this paper measures IOP registration throughout the sheet in a more precise manner. In a similar fashion, the relative position of the image on both the simplex and duplex sides of the print can be determined. In addition, the new method helps link the source of registration errors to individual sub-systems. Generating the individual error sources from a printed sheet enables understanding of the percentage contribution of each sub-system, prioritizes efforts to obtain better IOP performance, reveals initial IOP setup errors of a printing engine, allows comparison of different technologies affecting IOP registration in sub-systems, and can potentially act as a diagnostic tool for individual sub-systems.
Effect of image path bit depth on image quality
Digital Tone Reproduction Curves (TRCs) are applied to digital images for a variety of purposes, including compensation for temporal engine drift, engine-to-engine color balancing, user preference, spatial nonuniformity, and gray balance. The introduction of one or more compensating TRCs can give rise to different types of image quality defects: tonal errors occur when the printed value differs from the intended value; contours occur when the output step size is larger than the intended step size; pauses occur when two adjacent gray levels map to the same output level. Multiple-stage TRCs are implemented when compensation operations are performed independently, such as independent adjustment for temporal variation and user preference. Multiple TRCs are often implemented as independent operations to avoid complexity within an image path. The effect of each TRC cascades as an image passes through the image path. While the original image possesses given and assumed desirable quantization properties, the image passed through cascaded TRCs can possess tonal errors and gray-level step sizes associated with a much lower bit-depth system. In the present study, we quantify the errors (tonal errors and changes in gray-level step size) incurred by image paths with cascaded TRCs. We evaluate image paths at various bit depths. We consider real-life scenarios in which the local gray-level slope of cascaded compensating TRCs can increase by as much as 200% and decrease by as much as 66%.
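The cascading effect described above is easy to demonstrate: composing two independently designed 8-bit TRCs merges adjacent gray levels, producing the "pauses" mentioned in the abstract. The gamma curves below are hypothetical stand-ins for drift-compensation and user-preference TRCs, not curves from the study:

```python
import numpy as np

def cascade(luts, levels=256):
    """Compose a chain of 8-bit tone reproduction curves (TRCs),
    each stored as an integer lookup table of `levels` entries."""
    x = np.arange(levels)
    for lut in luts:
        x = lut[x]  # each stage re-quantizes the previous stage's output
    return x

g = np.arange(256) / 255.0
trc_drift = np.round(255 * g ** 1.2).astype(int)  # hypothetical drift compensation
trc_pref  = np.round(255 * g ** 0.8).astype(int)  # hypothetical user preference
out = cascade([trc_drift, trc_pref])

# Fewer than 256 surviving output levels means some adjacent input
# gray levels merged, i.e. a "pause" in the tone scale.
distinct = len(np.unique(out))
```

Even though each stage is individually monotonic and full-range, the composed path loses gray levels, which is why the paper evaluates cascaded TRCs at various bit depths.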
Image Quality Attributes Characterization and Measurement II
Determination of optimal coring values from psychophysical experiments
The use of color electrophotographic (EP) laser printing systems is growing because of their declining cost. Thus, the print quality of color EP laser printers is more important than ever before. Since text and lines are indispensable to print quality, many studies have proposed methods for measuring these print quality attributes. Toner scatter caused by toner overdevelopment in color EP laser printers can significantly impact print quality. A conventional approach to reducing toner overdevelopment is to restrict the color gamut of the printer. However, this can result in undesired color shifts and the introduction of halftone texture in light regions. Coring, defined as a process whereby the colorant level is reduced in the interior of text or characters, is a remedy for these shortcomings. The desired amount of reduction for coring depends on line width and overall nominal colorant level. In previous work, these amounts were chosen on the basis of data on the perception of edge blur that was published over 25 years ago.
Detection of worms in error diffusion halftoning
Digital halftoning is used to reproduce a continuous-tone image with a printer. Error diffusion, one such halftoning algorithm, suffers from certain artifacts, one of which is commonly denoted as worms. We propose a simple measure for the detection of worm artifacts. The proposed measure is evaluated by a psychophysical experiment, in which 4 images were reproduced using 5 different error diffusion algorithms. The results indicate a high correlation between the predicted worms and perceived worms.
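For context, a minimal sketch of classic Floyd-Steinberg error diffusion (the family of algorithms in which worm artifacts arise) is shown below; this is the textbook weight set, not one of the five algorithms compared in the paper:

```python
import numpy as np

def floyd_steinberg(img):
    """Floyd-Steinberg error diffusion of a grayscale image in [0, 1].
    Worms appear as correlated chains of dots, typically in highlight or
    shadow tones; this sketch only produces the binary halftone."""
    h = img.astype(float).copy()
    rows, cols = h.shape
    out = np.zeros_like(h)
    for y in range(rows):
        for x in range(cols):
            out[y, x] = 1.0 if h[y, x] >= 0.5 else 0.0
            err = h[y, x] - out[y, x]
            # Distribute quantization error to unprocessed neighbors (7,3,5,1)/16
            if x + 1 < cols:
                h[y, x + 1] += err * 7 / 16
            if y + 1 < rows:
                if x > 0:
                    h[y + 1, x - 1] += err * 3 / 16
                h[y + 1, x] += err * 5 / 16
                if x + 1 < cols:
                    h[y + 1, x + 1] += err * 1 / 16
    return out

# A flat 10% tone, where worm chains are most visible in practice
dots = floyd_steinberg(np.full((32, 32), 0.1))
```

A worm-detection measure of the kind proposed here would then analyze the spatial correlation of dot positions in `dots` rather than just their density.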
Characterization of '2D noise' print defect
Ki-Youn Lee, Yousun Bang, Heui-Keun Choh
Graininess and mottle, described by the ISO 13660 standard, are two image quality attributes widely used to evaluate area uniformity in digital prints. From an engineering standpoint, it is convenient to classify and analyze high-frequency noise and low-frequency noise separately. However, it has repeatedly been reported in the literature that the ISO methods do not correlate well with perception. Since area quality is evaluated by observing all characteristics across a wide range of spatial frequencies on a printed page, it is almost impossible to perceive graininess and mottle separately. In this paper, we characterize the '2D noise' print defect, which appears as two-dimensional aperiodic fluctuations in digital prints, on the basis of psychophysical experiments. For each channel of cyan, magenta, and black, our approach uses two steps of hybrid filtering to remove invisible image components in the printed area. '2D noise' is computed as the weighted sum of graininess and mottle, whose two weighting factors are determined by a subjective evaluation experiment. Psychophysical validation experiments show a strong correlation between the proposed metric and the perceived scales: the correlation coefficients r² are 0.90, 0.86, and 0.78 for cyan, magenta, and black, respectively.
Measurement of printer MTFs
Albrecht J. Lindner, Nicolas Bonnier, Christophe Leynadier, et al.
In this paper we compare three existing methods to measure the Modulation Transfer Function (MTF) of a printing system. Although all three methods use very distinct approaches, the MTF values computed for two of these methods strongly agree, lending credibility to these methods. Additionally, we propose an improvement to one of these two methods, initially proposed by Jang & Allebach. We demonstrate that our proposed modification improves the measurement precision and simplicity of implementation. Finally we discuss the pros and cons of the methods depending on the intended usage of the MTF.
Objective Metrics of Perceptual Image Quality I
Image quality assessment by preprocessing and full reference model combination
S. Bianco, G. Ciocca, F. Marini, et al.
This paper focuses on full-reference image quality assessment and presents different computational strategies aimed at improving the robustness and accuracy of some well-known and widely used state-of-the-art models, namely the Structural Similarity approach (SSIM) by Wang and Bovik and the S-CIELAB spatial-color model by Zhang and Wandell. We investigate the hypothesis that combining error images with a visual attention model could allow a better fit of the psycho-visual data of the LIVE Image Quality Assessment Database Release 2. We show that the proposed quality assessment metric better correlates with the experimental data.
Image quality assessment with manifold and machine learning
A crucial step in image compression is the evaluation of its performance, and more precisely the available way to measure the final quality of the compressed image. In this paper, a machine learning expert system that provides a final class number is designed. The quality measure is based on a learned classification process intended to respect that of human observers. Instead of computing a final score, our method classifies the quality using the quality scale recommended by the ITU. This quality scale contains 5 ranks ordered from 1 (the worst quality) to 5 (the best quality). This is done by constructing a vector containing many visual attributes; the final feature vector contains more than 40 attributes. Unfortunately, no study of the interactions between the visual attributes used has been performed. A feature selection algorithm could be interesting, but the selection is highly related to the classifier used subsequently. Therefore, we prefer to perform dimensionality reduction instead of feature selection. Manifold learning methods are used to provide a low-dimensional new representation from the initial high-dimensional feature space. The classification process is performed on this new low-dimensional representation of the images. The results obtained are compared to those obtained without applying the dimension reduction process in order to judge the efficiency of the method.
Three-component weighted structural similarity index
The assessment of image quality is very important for numerous image processing applications, where the goal of image quality assessment (IQA) algorithms is to automatically assess the quality of images in a manner that is consistent with human visual judgment. Two prominent examples, the Structural Similarity Image Metric (SSIM) and Multi-scale Structural Similarity (MS-SSIM), operate under the assumption that human visual perception is highly adapted for extracting structural information from a scene. Results in large human studies have shown that these quality indices perform very well relative to other methods. However, SSIM and other IQA algorithms are less effective when used to rate among blurred and noisy images. We address this defect by considering a three-component image model, leading to the development of modified versions of SSIM and MS-SSIM, which we call three-component SSIM (3-SSIM) and three-component MS-SSIM (3-MS-SSIM). A three-component image model was proposed by Ran and Farvardin [13], wherein an image is decomposed into edges, textures and smooth regions. Different image regions have different importance for visual perception; we therefore apply different weights to the SSIM scores according to the region where they are calculated. Four steps are executed: (1) Calculate the SSIM (or MS-SSIM) map. (2) Segment the original (reference) image into three categories of regions (edges, textures and smooth regions). Edge regions are found where a gradient magnitude estimate is large, while smooth regions are determined where the gradient magnitude estimate is small. Textured regions are taken to fall between these two thresholds. (3) Apply non-uniform weights to the SSIM (or MS-SSIM) values over the three regions. The weight was fixed at 0.5 for edge regions, 0.25 for textured regions, and 0.25 for smooth regions.
(4) Pool the weighted SSIM (or MS-SSIM) values, typically by taking their weighted average, thus defining a single quality index for the image (3-SSIM or 3-MS-SSIM). Our experimental results show that 3-SSIM (or 3-MS-SSIM) provides results consistent with human subjectivity when judging the quality of blurred and noisy images, and also delivers better performance than SSIM (and MS-SSIM) on five types of distorted images from the LIVE Image Quality Assessment Database.
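The region-weighted pooling in steps (2)-(4) can be sketched roughly as follows; the gradient thresholds are invented for illustration, while the 0.5/0.25/0.25 weights follow the abstract:

```python
import numpy as np

def region_weighted_pool(score_map, ref, t_low=2.0, t_high=8.0):
    """Pool a per-pixel quality map with region-dependent weights in the
    spirit of 3-SSIM: edges 0.5, textures 0.25, smooth 0.25. The thresholds
    t_low/t_high are illustrative, not the authors' values."""
    gy, gx = np.gradient(ref.astype(float))
    gmag = np.hypot(gx, gy)
    # Edge where gradient is large; smooth where small; texture in between.
    # Per the abstract, texture and smooth happen to share the same weight.
    w = np.where(gmag >= t_high, 0.5,
                 np.where(gmag <= t_low, 0.25, 0.25))
    return float((w * score_map).sum() / w.sum())

ref = np.tile(np.linspace(0, 16, 8), (8, 1))  # simple ramp as a stand-in reference
score = np.ones((8, 8))                       # a uniform (perfect) quality map
pooled = region_weighted_pool(score, ref)     # uniform map pools to 1.0
```

With a real SSIM map in place of `score`, degradations on edges would pull the pooled index down twice as hard as equal degradations in textured or smooth regions.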
An image similarity metric based on quadtree homogeneity analysis
Eric P. Lam, Thai N. Luong, Mark P. Miller, et al.
Comparing two similar images is often needed to evaluate the effectiveness of an image processing algorithm, but there is no single widely used objective measure. In many papers, the mean squared error (MSE) or peak signal-to-noise ratio (PSNR) is used. These measures rely entirely on pixel intensities. Though they are well understood and easy to implement, they do not correlate well with perceived image quality. This paper presents an image quality metric that analyzes image structure rather than relying entirely on pixel intensities. It extracts image structure with the use of a recursive quadtree decomposition. A similarity comparison function based on contrast, luminance, and structure is presented.
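A minimal sketch of the recursive quadtree decomposition step (assuming a simple variance-based homogeneity test, which the abstract does not specify) might look like:

```python
import numpy as np

def quadtree(img, thresh=0.01, min_size=2):
    """Recursively split an image into quadrants until each block's intensity
    variance falls below `thresh` (homogeneous) or the block reaches
    `min_size`. Returns (y, x, h, w) leaf blocks. Only the structural
    decomposition is sketched; the paper's similarity function over blocks
    is not reproduced here."""
    leaves = []

    def split(y, x, h, w):
        block = img[y:y + h, x:x + w]
        if block.var() <= thresh or min(h, w) <= min_size:
            leaves.append((y, x, h, w))
            return
        h2, w2 = h // 2, w // 2
        split(y,      x,      h2,     w2)      # top-left
        split(y,      x + w2, h2,     w - w2)  # top-right
        split(y + h2, x,      h - h2, w2)      # bottom-left
        split(y + h2, x + w2, h - h2, w - w2)  # bottom-right

    split(0, 0, *img.shape)
    return leaves

flat = quadtree(np.zeros((8, 8)))  # homogeneous image: a single leaf block
```

A structure-aware similarity metric can then compare the leaf decompositions of two images: matching quadtree structure suggests matching image structure, independently of small pixelwise intensity differences.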
Objective Metrics of Perceptual Image Quality II
Most apparent distortion: a dual strategy for full-reference image quality assessment
The mainstream approach to image quality assessment has centered around accurately modeling the single most relevant strategy employed by the human visual system (HVS) when judging image quality (e.g., detecting visible differences; extracting image structure/information). In this paper, we suggest that a single strategy may not be sufficient; rather, we advocate that the HVS uses multiple strategies to determine image quality. For images containing near-threshold distortions, the image is most apparent, and thus the HVS attempts to look past the image and look for the distortions (a detection-based strategy). For images containing clearly visible distortions, the distortions are most apparent, and thus the HVS attempts to look past the distortion and look for the image's subject matter (an appearance-based strategy). Here, we present a quality assessment method (MAD: Most Apparent Distortion) which attempts to explicitly model these two separate strategies. Local luminance and contrast masking are used to estimate detection-based perceived distortion in high-quality images, whereas changes in the local statistics of spatial-frequency components are used to estimate appearance-based perceived distortion in low-quality images. We show that a combination of these two measures can perform well in predicting subjective ratings of image quality.
Low level features for image appeal measurement
Image appeal may be defined as the interest that a photograph generates when viewed by human observers, incorporating subjective factors on top of the traditional objective quality measures. User studies were conducted in order to identify the right features to use in an image appeal measure; these studies also revealed that a photograph may be appealing even if only a region/area of the photograph is actually appealing. Due to the importance of faces regarding image appeal, a detailed study of a set of face features is also presented, including face size, color and smile detection. Extensive experimentation helped identify a good set of low level features, which are described in depth. These features were optimized using extensive ground truth generated from sets of consumer photos covering all possible appeal levels, by observers with a range of expertise in photography.
SCID: full reference spatial color image quality metric
S. Ouni, M. Chambah, M. Herbin, et al.
The most widely used full-reference image quality assessments are error-based methods, performed with pixel-wise difference metrics such as Delta E (ΔE), MSE, and PSNR. These metrics compute differences pixel by pixel and therefore define only a local fidelity of the color, whereas the human visual system is more sensitive to global quality. Because they omit the properties of the HVS, they do not correlate well with perceived image quality and cannot be reliable predictors of perceived visual quality. In this paper, we present a novel full-reference color metric based on characteristics of the human visual system that incorporates the notion of adjacency. This metric, called SCID for Spatial Color Image Difference, is more perceptually correlated than other color differences such as Delta E. The suggested full-reference metric is generic and independent of image distortion type, and can be used in different applications such as compression and restoration.
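A pixel-wise ΔE baseline, plus a crude "adjacency-aware" variant that pools each pixel with its neighbourhood before differencing, can be sketched as follows (a simplified illustration of the adjacency idea, not the SCID metric itself):

```python
import numpy as np

def delta_e76(lab1, lab2):
    """Per-pixel CIE76 color difference between two Lab images of shape (H, W, 3)."""
    d = np.asarray(lab1, float) - np.asarray(lab2, float)
    return np.sqrt(np.sum(d ** 2, axis=-1))

def box_blur(img, k=3):
    """Crude k x k box filter (edge-padded) standing in for spatial pooling."""
    pad = k // 2
    p = np.pad(img, ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    out = np.zeros(img.shape, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += p[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def mean_delta_e(lab1, lab2):
    """Plain pixel-wise pooling: the local-fidelity baseline."""
    return float(np.mean(delta_e76(lab1, lab2)))

def spatial_delta_e(lab1, lab2, k=3):
    """Neighbourhood-pooled variant: each pixel is averaged with its adjacent
    pixels before differencing, a crude stand-in for the adjacency notion."""
    a = box_blur(np.asarray(lab1, float), k)
    b = box_blur(np.asarray(lab2, float), k)
    return float(np.mean(delta_e76(a, b)))
```

The pooled variant responds less to isolated single-pixel errors and more to spatially coherent color shifts, which is the direction of the argument made in the abstract.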
An evaluation of interactive image matting techniques supported by eye tracking
Christoph Rhemann, Margrit Gelautz, Bernhard Fölsner
Recently, the quantitative evaluation of interactive single image matting techniques has become possible by the introduction of high-quality ground truth datasets. However, quantitative comparisons conducted in previous work are based on error metrics (e.g. sum of absolute differences) that are not necessarily correlated to the visual quality of the image as perceived by the user. This motivates research to better understand the perception of errors inherent to matting algorithms, in order to provide the ground for a future design of error metrics that better reflect the subjective impression of the human observer. In this work we gain novel insights into the perception of errors due to imperfect matting results. To investigate these errors, we compare two recent state-of-the-art matting algorithms in a user study. We use an eye-tracker to reveal details of the decision making of the users. The data acquired in the user study show a considerable correlation between expert knowledge in photography and the ability of the user to detect errors in the image. This is also reflected in the eye-tracking data which reveals different types of scanning paths dependent on the experience of the user.
System Performance: Advanced Display Technologies
Perception of detail in 3D images
Many current 3D displays suffer from lower spatial resolution than their 2D counterparts. One reason is that the multiple views needed to generate 3D are often spatially multiplexed. In addition, imperfect separation of the left- and right-eye views leads to blurring or ghosting, and therefore to a decrease in perceived sharpness. However, people watching stereoscopic videos have reported that the 3D scene contained more detail than the 2D scene with identical spatial resolution. This is an interesting notion that has never been tested in a systematic and quantitative way. To investigate this effect, we had people compare the amount of detail ("detailedness") in pairs of 2D and 3D images. A blur filter was applied to one of the two images, and the blur level was varied using an adaptive staircase procedure. In this way, the blur threshold at which the 2D and 3D images contained perceptually the same amount of detail could be found. Our results show that the 3D image needed to be blurred more than the 2D image. This confirms the earlier qualitative findings that 3D images contain perceptually more detail than 2D images with the same spatial resolution.
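An adaptive staircase of the kind mentioned can be sketched as a 2-down/1-up rule, which converges near the ~71% point of the psychometric function (a generic illustration; the abstract does not specify the exact staircase rule used):

```python
def staircase(respond, start=5.0, step=0.5, reversals_needed=6):
    """Simple 2-down/1-up adaptive staircase. `respond(level)` returns True
    when the observer responds correctly at the given stimulus level; the
    threshold estimate is the mean of the levels at the reversal points."""
    level = start
    correct_streak = 0
    direction = -1          # -1: descending, +1: ascending
    reversals = []
    while len(reversals) < reversals_needed:
        if respond(level):
            correct_streak += 1
            if correct_streak == 2:       # two correct in a row -> step down
                correct_streak = 0
                if direction == +1:
                    reversals.append(level)
                direction = -1
                level = max(level - step, 0.0)
        else:                             # one error -> step up
            correct_streak = 0
            if direction == -1:
                reversals.append(level)
            direction = +1
            level += step
    return sum(reversals) / len(reversals)
```

With a deterministic observer whose threshold lies between two staircase levels, the estimate settles at the midpoint of the bracketing levels.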
Perception of time variable quality of scene objects
Leif A. Ronningen, Erlend Heiberg
The paper focuses on testing the user perception of time-variable quality of scene objects. The scenes are generated and presented by the DMP (Distributed Multimedia Plays) packet-based system for futuristic continental multimedia collaboration. DMP guarantees a maximum user-to-user delay and a minimum scene quality. Time-variable scene quality control is obtained by adapting the composition and object resolution to the traffic load in the network, by controlled dropping of sub-objects in network nodes, and by admission control. The perceived quality of scenes was tested using an existing DMP performance (simulation) model that generates two-dimensional random distributions for the frequency of overloads in the network nodes versus packet drop rate and duration. Video clips with time-varying scene quality were synthesized and shown to test persons. Four sub-objects with a spatial resolution of 1400 x 1050 pixels and a temporal resolution of 60 Hz were used. Sub-objects were dropped in the network, and missing sub-objects were regenerated by standard linear interpolation techniques. Test persons perceived a moderate average quality reduction of 13 on a 0-100 quality scale when 75% of the sub-objects were dropped and interpolated. To improve the quality, edge detection and correction were added; the test persons then perceived a small average quality reduction of 8 on the same scale when 75% of the sub-objects were dropped.
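The abstract does not detail the interpolation scheme; a minimal sketch, assuming a dropped sub-object is regenerated linearly from its two surviving temporal neighbours, looks like this:

```python
import numpy as np

def interpolate_dropped(prev_frame, next_frame, t=0.5):
    """Regenerate a dropped sub-object by linear interpolation between its
    surviving neighbours; t is the relative temporal position (0..1)."""
    prev_frame = np.asarray(prev_frame, dtype=np.float64)
    next_frame = np.asarray(next_frame, dtype=np.float64)
    return (1.0 - t) * prev_frame + t * next_frame
```

Any sharper regeneration (such as the edge detection and correction mentioned) would be layered on top of this baseline.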
System Performance: Capture and Display
Scanner image quality profiling
Chengwu Cui
When using a document scanner, scan image quality is often unknown to the end user of the scanned image. Document scanners may employ different imaging technologies that can result in different image characteristics. Variability of scanner parts and the manufacturing process may also create variability of the scanned image quality from machine to machine. Image quality of the same scanner may also change as it ages and becomes contaminated. If the scanned image is used for human viewing, the resulting image quality variability may not be mission critical other than being a visual annoyance because the human visual system has superb adaptation and segmentation capability. However, if the scanned image is used for machine recognition or for printing, the image quality variability may become important and even mission critical. Here we propose a framework to profile the scanner image quality and tag the scanned image with the IQ profile. We review the potential quantified aspects of scan image quality and propose a method of characterization with examples.
Weighting of field heights for sharpness and noisiness
Brian W. Keelan, Elaine W. Jin
Weighting of field heights is important when a single numerical value must be calculated to characterize an attribute's overall impact on perceived image quality. In this paper we report an observer study to derive the weighting of field heights for sharpness and noisiness. One hundred forty images were selected to represent a typical consumer photo space distribution. Fifty-three points were sampled per image, representing field heights of 0, 14, 32, 42, 51, 58, 71, 76, 86 and 100%. Six observers participated in this study. The field weights derived in this report include both the effect of area versus field height (a purely objective, geometric factor) and the effect of the spatial distribution of image content that draws attention to or masks each of these image structure attributes. The results show that, relative to the geometrical area weights, sharpness weights were skewed to lower field heights, because sharpness-critical subject matter was often positioned relatively near the center of an image. Conversely, because noise can be masked by signal, noisiness-critical content (such as blue skies, skin tones, walls, etc.) tended to occur farther from the center of an image, causing the weights to be skewed to higher field heights.
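The purely geometric component of the weighting (relative image area per field-height band) can be computed directly under a simple radial model, where the area inside a given normalized field height grows with its square:

```python
import numpy as np

def annulus_area_weights(field_heights):
    """Relative area falling in each annular band between successive
    normalized field heights (0 = centre, 1 = edge), for a radial disc model."""
    h = np.asarray(sorted(field_heights), dtype=float)
    outer = h ** 2                                # cumulative area ~ radius^2
    inner = np.concatenate(([0.0], outer[:-1]))
    band = outer - inner
    return band / band.sum()
```

The study's point is that the perceptual weights deviate from these geometric weights: downward in field height for sharpness, upward for noisiness.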
Identification of image attributes that are most affected with changes in displayed image size
This paper describes an investigation of changes in image appearance when images are viewed at different sizes on a high-end LCD device. Two digital image capturing devices of different overall image quality were used to record identical natural scenes with a variety of pictorial contents. From each capturing device, a total of sixty-four captured scenes were selected, including architecture, nature, portraits, still and moving objects and artworks, under various illumination conditions and recorded noise levels. The test set included some images where camera shake was purposely introduced. An achromatic version of the image set containing only lightness information was obtained by processing the captured images in CIELAB space. Rank order experiments were carried out to determine which image attributes were most affected when the displayed image size was altered. These evaluations were carried out for both chromatic and achromatic versions of the stimuli. For the achromatic stimuli, attributes such as contrast, brightness, sharpness and noisiness were rank-ordered by the observers in terms of the degree of change. The same attributes, as well as hue and colourfulness, were investigated for the chromatic versions of the stimuli. Results showed that sharpness and contrast were the two attributes most affected by changes in displayed image size. The ranking of the remaining attributes varied with image content and illumination conditions. Further experiments were carried out to link original scene content to the attributes that changed most with changes in image size.
Simulation of film media in motion picture production using a digital still camera
The introduction of digital intermediate workflow in movie production has made visualization of the final image on the film set increasingly important. Images that have been color corrected on the set can also serve as a basis for color grading in the laboratory. In this paper we suggest and evaluate an approach that has been used to simulate the appearance of different film stocks. The GretagMacbeth Digital ColorChecker was captured using both a Canon EOS 20D camera as well as an analog camera. The film was scanned using an Arri film scanner. The images of the color chart were then used to perform a colorimetric characterization of these devices using models based on polynomial regression. By using the reverse model of the digital camera and the forward model of the analog film chain, the output of the film scanner was simulated. We also constructed a direct transformation using regression on the RGB values of the two devices. A different color chart was then used as a test set to evaluate the accuracy of the transformations, where the indirect model was found to provide the required performance for our purpose without compromising the flexibility of having an independent profile for each device.
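A polynomial-regression characterization of the kind described can be sketched with a second-order basis fitted by least squares (an illustrative sketch; the paper's basis terms and fitting details may differ):

```python
import numpy as np

def _poly_basis(rgb):
    """Second-order polynomial basis expansion of RGB values (N, 3) -> (N, 10)."""
    rgb = np.atleast_2d(np.asarray(rgb, float))
    r, g, b = rgb[:, 0], rgb[:, 1], rgb[:, 2]
    ones = np.ones_like(r)
    return np.column_stack([ones, r, g, b, r*g, r*b, g*b, r*r, g*g, b*b])

def fit_poly_transform(src_rgb, dst_rgb):
    """Least-squares fit of a polynomial transform mapping one device's
    RGB to another's, as in device characterization by regression."""
    coeffs, *_ = np.linalg.lstsq(_poly_basis(src_rgb),
                                 np.asarray(dst_rgb, float), rcond=None)
    return coeffs

def apply_poly_transform(coeffs, rgb):
    """Apply a fitted transform to new RGB samples."""
    return _poly_basis(rgb) @ coeffs
```

Chaining the digital camera's reverse model with the film chain's forward model, as the paper does, amounts to composing two such fitted transforms.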
System Performance: Mobile Phones and CMOS Cameras
Method for measuring the objective quality of the TV-out function of mobile handsets
Digital cameras, printers and displays have their own established methods to measure their performance. Different devices have their own special features and also different metrics and measuring methods. The real meaning of measuring data is often not learnt until hands-on experience is available. The goal of this study was to describe a preliminary method and metrics for measuring the objective image quality of the TV-out function of mobile handsets. The TV-out application was image browsing. Image quality is often measured in terms of color reproduction, noise and sharpness and these attributes were also applied in this study. The color reproduction attribute was studied with color depth, hue reproduction and color accuracy metrics. The noise attribute was studied with the SNR (signal to noise ratio) and chroma noise metrics. The sharpness attribute was studied with the SFR (spatial frequency response) and contrast modulation metrics. The measuring data was gathered by using a method which digitized the analog signal of the TV-out device with a frame grabber card. Based on the results, the quantization accuracy, chroma error and spatial reproduction of the signal were the three fundamental factors which most strongly affected the performance of the TV-out device. The quantization accuracy of the device affects the number of tones that can be reproduced in the image. The quantization accuracy also strongly affects the correctness of hue reproduction. According to the results, the color depth metric was a good indicator of quantization accuracy. The composite signal of TV-out devices transmits both chroma and luminance information in a single signal. A change in the luminance value can change the constant chroma value. Based on the results, the chroma noise metric was a good indicator for measuring this phenomenon. There were differences between the spatial reproductions of the devices studied. The contrast modulation was a clear metric for measuring these differences. The signal sharpening of some TV-out devices hindered the interpretation of SFR data.
Applying image quality in cell phone cameras: lens distortion
This paper describes the framework used in one of the pilot studies run under the I3A CPIQ initiative to quantify overall image quality in cell-phone cameras. The framework is based on a multivariate formalism that tries to predict overall image quality from individual image quality attributes, and was validated in a CPIQ pilot program. The pilot study focuses on image quality distortions introduced in the optical path of a cell-phone camera, which may or may not be corrected in the image processing path. The assumption is that the captured image is JPEG compressed and the cell-phone camera is set to 'auto' mode. As the framework requires the individual attributes to be relatively perceptually orthogonal, the attributes used in the pilot study are lens geometric distortion (LGD) and lateral chromatic aberration (LCA). The goal of this paper is to present the framework of this pilot project, from the definition of the individual attributes up to their quantification in JNDs of quality, a requirement of the multivariate formalism; therefore both objective and subjective evaluations were used. A major distinction from the 'DSC imaging world' on the objective side is that the LCA/LGD distortions found in cell-phone cameras rarely exhibit radial behavior, so a radial mapping/model cannot be used in this case.
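A multivariate formalism of this kind combines per-attribute quality losses, each expressed in JNDs, into a single overall loss. One common form is a Minkowski sum; the fixed exponent below is an illustrative assumption, not the CPIQ-calibrated value:

```python
def combined_quality_loss(losses, p=2.0):
    """Minkowski combination of per-attribute quality losses (in JNDs).
    A fixed exponent p is assumed here for illustration; practical
    formalisms may vary the exponent with the loss magnitudes."""
    return sum(l ** p for l in losses) ** (1.0 / p)
```

With p = 2 the combination behaves like a Euclidean norm: the largest single-attribute loss dominates, but smaller losses still contribute.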
Low light performance of digital cameras
Photospace data previously measured on large image sets have shown that a high percentage of camera phone pictures are taken under low-light conditions. Corresponding image quality measurements linked the lowest quality to these conditions, and subjective analysis of image quality failure modes identified image blur as the most important contributor to image quality degradation. Camera phones without flash must manage a trade-off when adjusting shutter time to low-light conditions: the shutter time has to be long enough to avoid extreme underexposure, but short enough that hand-held picture taking is still possible without excessive motion blur. There is still a lack of quantitative data on motion blur: camera phones often do not record basic operating parameters such as shutter speed in their image metadata, and when recorded, the data are often inaccurate. We introduce a device and process for tracking camera motion and measuring its point spread function (PSF). Vision-based metrics are introduced to assess the impact of camera motion on image quality so that the low-light performance of different cameras can be compared. Statistical distributions of user variability will be discussed.
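A motion PSF of the kind measured here can be formed by accumulating the sampled camera trajectory into a normalized kernel (a minimal sketch, assuming displacement samples are already expressed in pixel units relative to the kernel centre):

```python
import numpy as np

def psf_from_trajectory(xs, ys, size=15):
    """Accumulate a sampled camera-motion trajectory (pixel displacements)
    into a normalized point spread function kernel of shape (size, size)."""
    psf = np.zeros((size, size), dtype=float)
    c = size // 2
    for x, y in zip(xs, ys):
        ix, iy = int(round(x)) + c, int(round(y)) + c
        if 0 <= ix < size and 0 <= iy < size:   # clip samples outside kernel
            psf[iy, ix] += 1.0
    s = psf.sum()
    return psf / s if s > 0 else psf
```

A stationary camera yields a delta kernel; hand shake during a long exposure spreads the energy along the trajectory, which is what degrades sharpness.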
Color-blotch noise characterization for CMOS cameras
Reza Safaee-Rad, M. Aleksic
Color noise in the form of clusters of color non-uniformity is a major negative quality factor in color images. This type of noise is significantly more pronounced in CMOS cameras with increasingly smaller pixel sizes (e.g., 1.75 μm and 1.4 μm). This paper identifies and quantifies temporal noise as the main factor behind this type of noise. It is also shown how differences in R/G/B responses, as well as possible R/G/B response non-linearity, can exacerbate color-blotch noise. Furthermore, it is shown how run-time averaging can effectively remove this noise (to a large extent) from a color image, if capture conditions permit.
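The run-time averaging remedy relies on temporal noise falling roughly as the square root of the number of averaged frames, since the noise is independent frame to frame. A minimal sketch:

```python
import numpy as np

def average_frames(frames):
    """Average a stack of frames (N, H, W); independent temporal noise
    drops by about sqrt(N) while the static scene content is preserved."""
    return np.mean(np.asarray(frames, dtype=np.float64), axis=0)
```

Averaging 16 frames of unit-variance temporal noise should leave a residual standard deviation near 0.25, i.e. a 4x reduction.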
Photo-response non-uniformity error tolerance testing methodology for CMOS imager systems
Brent McCleary, Antonio Ortega
An image sensor system-level pixel-to-pixel photo-response non-uniformity (PRNU) error tolerance method is presented in this paper. A scheme is developed to determine sensor PRNU acceptability and the corresponding sensor application categorization. Many low-cost imaging systems utilize CMOS imagers with integrated on-chip digital logic for performing image processing and compression. Due to pixel geometry and substrate material variations, the light sensitivity of pixels is non-uniform (PRNU). Excessive variation in pixel sensitivity is a significant cause of screening rejection for these image sensors. The proposed testing methods use the concept of acceptable degradation applied to the camera-system-processed and decoded images of these sensors. The analysis techniques developed in this paper estimate the impact of the sensor's PRNU on image quality, providing the ability to classify sensors for different applications based on their PRNU distortion and error rates. Human perceptual criteria are used in determining acceptable sensor PRNU limits. These PRNU thresholds are a function of the camera system's image processing (including compression) and sensor noise sources. We use a Monte Carlo simulation solution and a probability-model-based simulation solution, along with the sensor models, to determine PRNU error rates and significance for a range of sensor operating conditions (e.g., conversion gain settings, integration times). We develop correlations between industry-standard PRNU measurements and final processed and decoded image quality thresholds. The results presented in this paper show that the proposed PRNU testing method can reduce the rejection rate of CMOS sensors. Comparisons are presented between sensor PRNU failure rates using industry-standard testing methods and our proposed methods.
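A Monte Carlo estimate of PRNU screening failure rates, in the spirit of the simulation described, can be sketched as follows (the Gaussian pixel-gain model and the std/mean PRNU statistic are simplifying assumptions, not the paper's sensor model):

```python
import numpy as np

def prnu_failure_rate(sigma, limit_pct, n_sensors=2000, n_pixels=1024, seed=0):
    """Monte Carlo estimate of the fraction of simulated sensors whose
    measured PRNU (std/mean of flat-field response, in %) exceeds an
    acceptability limit. Pixel gains are drawn as N(1, sigma)."""
    rng = np.random.default_rng(seed)
    gains = rng.normal(1.0, sigma, size=(n_sensors, n_pixels))
    prnu_pct = 100.0 * gains.std(axis=1) / gains.mean(axis=1)
    return float(np.mean(prnu_pct > limit_pct))
```

Sweeping `limit_pct` against perceptually derived quality thresholds is what lets a loose-but-acceptable limit recover sensors that a fixed industry-standard limit would reject.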
System Performance: Video
Improved video image by pixel-based learning for super-resolution
In recent years, the resolution of display devices has increased dramatically. The resolution of video cameras (except very expensive ones), however, is much lower than that of displays, since it is difficult to achieve high spatial resolution at a given frame rate (e.g. 30 frames per second) due to limited bandwidth. The resolution of an image can be increased by interpolation, such as bi-cubic interpolation, but this method is known to blur image edges. To create plausible high-frequency details in the blurred image, super-resolution techniques have been studied for a long time. In this paper, we propose a new algorithm for video super-resolution that considers a multi-sensor camera system. The multi-sensor camera can capture two types of video sequences: (a) a high-resolution, low-frame-rate luminance sequence, and (b) low-resolution, high-frame-rate color sequences. The training pairs for super-resolution are obtained from these two sequences. The relationships between the high- and low-resolution frames are trained using a pixel-based feature named "texton" and stored in a database together with their spatial distribution. The low-resolution sequences are then represented with textons, and each texton is substituted by searching the trained database to create high-resolution features in the output sequences. The experimental results showed that the proposed method reproduces well both the detailed regions and the sharp edges of the scene. It was also shown that the PSNR of the image obtained by the proposed method is improved compared to the image obtained by bi-cubic interpolation.
Subjective video quality comparison of HDTV monitors
G. Seo, C. Lim, S. Lee, et al.
HDTV broadcasting services have become widely available. In the upcoming IPTV services, HDTV is important and quality monitoring becomes an issue. Consequently, there have been great efforts to develop video quality measurement methods for HDTV. At the same time, most HDTV programs are watched on digital TV monitors, including LCD and PDP monitors. In general, LCD and PDP TV monitors have different color characteristics and response times, and most commercial TV monitors include post-processing to improve video quality. In this paper, we compare the subjective video quality of some commercial HDTV monitors to investigate the impact of monitor type on perceptual video quality, using the ACR method as the subjective testing method. Experimental results show that the correlation coefficients among the HDTV monitors are reasonably high. However, for some video sequences and impairments, differences in subjective scores were observed.
Constructing a metric for blur perception with blur discrimination experiments
Chien-Chung Chen, Kuei-Po Chen, Chia-Huei Tseng, et al.
In this study, we measured the blur discrimination threshold at different blur levels. We found that the discrimination threshold first decreased and then increased again as the reference edge blur increased. This dipper shape of the blur discrimination threshold versus reference width (TvW) function can be explained by a divisive inhibition model. The first stage of the model contains a linear operator whose excitation is the inner product of the image and the sensitivity profile of the operator. The response of the blur discrimination mechanism is a power function of the excitation of the linear operator, divided by the sum of the divisive inhibition and an additive factor. Changing the mean luminance of the edge has little effect on blur discrimination except at very low luminance. At low luminance, blur discrimination thresholds were higher at small reference blurs than those measured at medium to high luminance; this difference diminished at large reference blurs. Such a luminance effect can be explained by a change in the additive factor of the model. Reducing the contrast of the edge shifted the whole TvW function up vertically, an effect that can be explained by a decrease of the gain factor in the linear operator. With these results, we constructed a metric for blur perception from the divisive inhibition model proposed and tested in this study.
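A divisive inhibition model of this general shape can be sketched numerically: a power-law response divided by inhibition plus an additive factor, with the discrimination threshold found by searching for the increment that raises the response by a fixed criterion. The parameter values below are illustrative assumptions, not the fitted values from the study:

```python
def response(e, p=2.4, q=2.0, z=1.0):
    """Divisive-inhibition response: excitation to a power p, divided by
    divisive inhibition (e**q) plus an additive factor z (illustrative)."""
    return e ** p / (e ** q + z)

def discrimination_threshold(e_ref, criterion=0.05, p=2.4, q=2.0, z=1.0):
    """Smallest increment de for which the response change from the reference
    level reaches a fixed criterion, found by a simple geometric search."""
    base = response(e_ref, p, q, z)
    de = 1e-4
    while response(e_ref + de, p, q, z) - base < criterion:
        de *= 1.01
    return de
```

Because the response function accelerates near threshold and compresses at high levels, the predicted threshold is lowest at intermediate reference levels: the dipper-shaped TvW function.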
Objective perceptual picture quality measurement method for high-definition video based on full reference framework
Osamu Sugimoto, Sei Naito, Shigeyuki Sakazawa, et al.
The authors study a method for objective measurement of perceived picture quality of high-definition video based on the full reference framework. The proposed method applies seven spatio-temporal image features to estimate the perceived quality of pictures degraded by compression coding. Computer simulation shows that the proposed method can estimate perceived picture quality with a correlation coefficient above 0.91.
Motion blur perception considering anisotropic contrast sensitivity of human visual system
In this paper, we measured the anisotropic spatio-velocity contrast sensitivity function (CSF) of the human visual system and applied the measured CSF to evaluate the motion blur of an LCD. Gabor stimuli with different contrasts, spatial frequencies, scroll speeds and angles were displayed on the LCD, and observers were asked whether they could perceive each stimulus. The threshold for a stimulus is defined as the contrast at which 50% of observers perceive it. Based on this assessment, we obtained the contrast sensitivity as the inverse of the threshold. Using the measured spatio-velocity CSFs, we evaluated the anisotropic motion blur characteristics of the LCD.
Interactive Paper Session
A geometry calibration and visual seamlessness method based on multi-projector tiled display wall
Yahui Liu, Qingxuan Jia, Hanxu Sun, et al.
Multi-projector virtual environments based on PC clusters offer low cost, high resolution and a wide viewing angle, and have become a research hotspot in Virtual Reality applications. Geometric distortion calibration and seamless splicing are key problems in multi-projector display. This paper investigates geometry calibration and edge blending. It proposes an automatic camera-based calibration preprocessing algorithm that projects images to the expected regions using the relation between a plane surface and a curved surface together with a texture mapping method. In addition, the intensity imbalance in overlap regions may be adjusted with an edge blending function. Implementation indicates that the approach can accomplish geometry calibration and edge blending on an annular screen.
A facial expression image database and norm for Asian population: a preliminary report
Chien-Chung Chen, Shu-ling Cho, Katarzyna Horszowska, et al.
We collected 6604 images of 30 models in eight types of facial expression: happiness, anger, sadness, disgust, fear, surprise, contempt and neutral. Among them, 406 most representative images from 12 models were rated by more than 200 human raters for perceived emotion category and intensity. Such large number of emotion categories, models and raters is sufficient for most serious expression recognition research both in psychology and in computer science. All the models and raters are of Asian background. Hence, this database can also be used when the culture background is a concern. In addition, 43 landmarks each of the 291 rated frontal view images were identified and recorded. This information should facilitate feature based research of facial expression. Overall, the diversity in images and richness in information should make our database and norm useful for a wide range of research.