Proceedings Volume 9396

Image Quality and System Performance XII


Purchase the printed version of this volume at proceedings.com or access the digital version at SPIE Digital Library.

Volume Details

Date Published: 8 January 2015
Contents: 11 Sessions, 36 Papers, 0 Presentations
Conference: SPIE/IS&T Electronic Imaging 2015
Volume Number: 9396

Table of Contents


All links to SPIE Proceedings will open in the SPIE Digital Library.
  • Front Matter: Volume 9396
  • Image Quality and Imaging Processing
  • Digital Photography and Image Quality I, Joint Session with Conferences 9396 and 9404
  • Digital Photography and Image Quality II, Joint Session with Conferences 9396 and 9404
  • Print Quality I
  • Print Quality II
  • Imaging Performance
  • Subjective Quality Assessment
  • Subjective and Objective Quality Assessment
  • Objective Quality Assessment
  • Display Quality
Front Matter: Volume 9396
This PDF file contains the front matter associated with SPIE Proceedings Volume 9396, including the Title Page, Copyright information, Table of Contents, Introduction (if any), and Conference Committee listing.
Image Quality and Imaging Processing
Advanced mechanisms for delivering high-quality digital content
Mikołaj Leszczuk, Lucjan Janowski
In this paper, a practical solution for selecting optimal coding parameters under different bandwidth requirements is presented. The resulting specification is based on the analysis of a large database of more than 10,000 sequences compressed with different parameters. The obtained parameters can be used both for adaptive streaming and for storage optimization.
Towards assessment of the image quality in the high-content screening
High-Content Screening (HCS) is a powerful technology for biological research, which relies heavily on the capabilities for processing and analysis of cell biology images. The quality of the quantification results, obtained by analysis of hundreds and thousands of images, is crucial for analysis of the biological phenomena under study. Traditionally, quality control in HCS refers to the preparation of the biological assay, the setup of the instrumentation, and the analysis of the obtained quantification results, thus skipping the important step of assessing image quality. So far, only a few papers have addressed this issue, and no standard methodology yet exists that would allow pointing out images that potentially produce outliers when processed. In this research the importance of image quality control for HCS is emphasized, with the following possible advantages: (a) validation of the visual quality of the screening; (b) detection of potentially problematic images; (c) more accurate setting of the processing parameters. For the detection of outlier images the Power Log-Log Slope (PLLS) is applied, as it is known to be sensitive to focusing errors, and validated using open data sets. The results show that PLLS correlates with the cell counting error and, when taken into account, allows reducing the variance of measurements. Possible extensions and problems of the approach are discussed.
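The Power Log-Log Slope referred to above is the slope of the image power spectrum plotted on log-log axes. A minimal sketch of such an estimate (not the authors' implementation; the radial-averaging details are chosen for illustration) could look as follows:

```python
import numpy as np

def plls(image):
    """Estimate the power log-log slope (PLLS) of a 2-D grayscale image."""
    img = np.asarray(image, dtype=float)
    img -= img.mean()                                  # remove DC before the FFT
    power = np.abs(np.fft.fftshift(np.fft.fft2(img))) ** 2

    # Integer radial frequency of every spectrum bin, measured from the centre.
    h, w = power.shape
    y, x = np.indices((h, w))
    r = np.hypot(y - h // 2, x - w // 2).astype(int)

    # Radially averaged power for radii 1 .. r_max-1 (the DC bin is skipped).
    r_max = min(h, w) // 2
    radial = np.array([power[r == k].mean() for k in range(1, r_max)])
    freqs = np.arange(1, r_max)

    # Slope of log(power) versus log(frequency); defocus steepens the slope.
    slope, _ = np.polyfit(np.log(freqs), np.log(radial), 1)
    return float(slope)
```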
Information theoretic methods for image processing algorithm optimization
Sergey F. Prokushkin, Erez Galil
Modern image processing pipelines (e.g., those used in digital cameras) are full of advanced, highly adaptive filters that often have a large number of tunable parameters (sometimes > 100). This makes the calibration procedure for these filters very complex, and optimal results are barely achievable with manual calibration; thus an automated approach is a must. We will discuss an information theory based metric for evaluation of algorithm adaptive characteristics (“adaptivity criterion”), using noise reduction algorithms as an example. The method allows finding an “orthogonal decomposition” of the filter parameter space into the “filter adaptivity” and “filter strength” directions. This metric can be used as a cost function in automatic filter optimization. Since it is a measure of physical “information restoration” rather than of perceived image quality, it helps to reduce the set of filter parameters to a smaller subset that is easier for a human operator to tune to achieve better subjective image quality. With appropriate adjustments, the criterion can be used for assessment of the whole imaging system (sensor plus post-processing).
Forward and backward tone mapping of high dynamic range images based on sub band architecture
This paper presents a novel High Dynamic Range (HDR) tone mapping (TM) system based on a sub-band architecture. Standard wavelet filters of the Daubechies, Symlets, Coiflets and Biorthogonal families were used to estimate the proposed system's performance in terms of Low Dynamic Range (LDR) image quality and reconstructed HDR image fidelity. During the TM stage, the HDR image is first decomposed into sub-bands using a symmetrical analysis-synthesis filter bank. The transform coefficients are then rescaled using a predefined gain map. The inverse Tone Mapping (iTM) stage is straightforward. Indeed, the LDR image passes through the same sub-band architecture, but instead of reducing the dynamic range, the LDR content is boosted to an HDR representation. Moreover, our TM scheme includes an optimization module to select the gain map components that minimize the reconstruction error, consequently resulting in high-fidelity HDR content. Comparisons with recent state-of-the-art methods have shown that our method provides better results in terms of visual quality and HDR reconstruction fidelity, using objective and subjective evaluations.
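As an illustration of the general sub-band idea only (not the paper's filter bank or optimized gain map), the following sketch, assuming the PyWavelets package and a single scalar gain per level, attenuates detail sub-bands of the log luminance and reconstructs an LDR image:

```python
import numpy as np
import pywt

def subband_tone_map(hdr_luminance, wavelet="db4", levels=4, gain=0.6):
    """Attenuate wavelet detail sub-bands of the log luminance (toy gain, not a gain map)."""
    log_lum = np.log1p(np.asarray(hdr_luminance, dtype=float))
    coeffs = pywt.wavedec2(log_lum, wavelet, level=levels)
    mapped = [coeffs[0]]                                  # keep the approximation band
    for detail in coeffs[1:]:
        mapped.append(tuple(gain * d for d in detail))    # rescale detail sub-bands
    ldr_log = pywt.waverec2(mapped, wavelet)
    ldr = np.expm1(ldr_log)
    return (ldr - ldr.min()) / (ldr.max() - ldr.min())    # normalise to [0, 1]
```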
Digital Photography and Image Quality I, Joint Session with Conferences 9396 and 9404
MTF evaluation of white pixel sensors
Albrecht Lindner, Kalin Atanassov, Jiafu Luo, et al.
We present a methodology to compare image sensors with traditional Bayer RGB layouts to sensors with alternative layouts containing white pixels. We focused on the sensors' resolving powers, which we measured in the form of a modulation transfer function for variations in both luma and chroma channels. We present the design of the test chart, the acquisition of images, the image analysis, and an interpretation of results. We demonstrate the approach using the example of two sensors that differ only in their color filter arrays. We confirmed that the sensor with white pixels and the corresponding demosaicing result in a higher resolving power in the luma channel, but a lower resolving power in the chroma channels, when compared to the traditional Bayer sensor.
Intrinsic camera resolution measurement
Peter D. Burns, Judit Martinez Bauza
Objective evaluation of digital image quality usually includes analysis of spatial detail in captured images. Although previously developed methods and standards have found success in the evaluation of system performance, the systems in question usually include spatial image processing (e.g. sharpening or noise reduction), and the results are influenced by these operations. Our interest, however, is in the intrinsic resolution of the system. By this we mean the performance primarily defined by the lens and imager, and not influenced by subsequent image processing steps that are invertible. Examples of such operations are brightness and contrast adjustments, and simple sharpening and blurring (setting aside image clipping and quantization). While these operations clearly modify image perception, they do not in general change the fundamental spatial image information that is captured. We present a method to measure an intrinsic spatial frequency response (SFR) computed from test images to which spatial operations may have been applied. The measure is intended to 'see through' operations for which image detail is retrievable, but to measure the loss of image resolution otherwise. We adopt a two-stage image capture model. The first stage includes a locally stable point-spread function (lens), the integration and sampling by the detector (imager), and the introduction of detector noise. The second stage comprises the spatial image processing. We describe the validation of the method, which was done using both simulation and actual camera evaluations.
Digital Photography and Image Quality II, Joint Session with Conferences 9396 and 9404
Mobile phone camera benchmarking in low light environment
High noise values and a poor signal to noise ratio are traditionally associated with low light imaging. Still, there are several other camera quality features which may suffer in a low light environment. For example, what happens to color accuracy and resolution, or how does the camera speed behave in low light? Furthermore, how do low light environments affect camera benchmarking, and which metrics are the critical ones? The work contains standard based image quality measurements, including noise, color, and resolution measurements, in three different light environments: 1000, 100, and 30 lux. Moreover, camera speed measurements are done. Detailed measurement results of each quality and speed category are revealed and compared. Also, a suitable benchmark algorithm is evaluated and the corresponding score is calculated to find an appropriate metric which characterizes the camera performance in different environments. The result of this work introduces detailed image quality and camera speed measurements of mobile phone camera systems in three different light environments. The paper concludes how different light environments influence the metrics and which metrics should be measured in a low light environment. Finally, a benchmarking score is calculated using the measurement data of each environment and the mobile phone cameras are compared correspondingly.
Luminance and gamma optimization for mobile display in low ambient conditions
Seonmee Lee, Taeyong Park, Junwoo Jang, et al.
This study presents an effective method to provide users with visual comfort and reduce power consumption through the optimization of display luminance and gamma curve when using a mobile display in low ambient conditions. A visual assessment was carried out to obtain the appropriate display luminance for visual comfort, and the gamma curve was optimized to obtain a power saving effect, without reducing image quality, at a luminance lower than the appropriate one. The suggested method was verified through a perceptual experiment and power consumption measurements.
Print Quality I
A new method to evaluate the perceptual resolution
A new method for evaluating the perceptual resolution of an actual image has been developed. The characteristic of this new evaluation method is that the square sum of the cross correlation coefficients becomes the index of the evaluation. Because the RIT Contrast-Resolution Test Target is a very systematic analysis pattern that combines spatial resolution and contrast, it was used for this study. The cross correlation coefficient is calculated between the reference image and the printed image read by a scanner. To verify the precision of the new evaluation method, a subjective evaluation was performed with the actual image. When the correlation between the subjective evaluation values and the new evaluation method was confirmed, it was much higher than that of another evaluation method using a similar analysis pattern. Furthermore, sharpness evaluation of the actual image was enabled by applying the evaluation value of this new method for perceptual resolution, whose correlation with the subjective results was high.
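A minimal sketch of the index described above, assuming the reference and scanned target patches have already been registered and extracted (the patch handling here is illustrative, not the paper's procedure):

```python
import numpy as np

def ncc(a, b):
    """Normalised cross-correlation coefficient between two equally sized patches."""
    a = np.asarray(a, float) - np.mean(a)
    b = np.asarray(b, float) - np.mean(b)
    return float((a * b).sum() / np.sqrt((a ** 2).sum() * (b ** 2).sum()))

def perceptual_resolution_index(reference_patches, scanned_patches):
    """Square sum of the cross-correlation coefficients over all target patches."""
    return sum(ncc(r, s) ** 2 for r, s in zip(reference_patches, scanned_patches))
```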
MFP scanner motion characterization using self-printed target
Minwoong Kim, Peter Bauer, Jerry K. Wagner, et al.
Multifunctional printers (MFP) are products that combine the functions of a printer, scanner, and copier. Our goal is to help customers to be able to easily diagnose scanner or print quality issues with their products by developing an automated diagnostic system embedded in the product. We specifically focus on the characterization of scanner motions, which may be defective due to irregular movements of the scan-head. The novel design of our test page and two-stage diagnostic algorithm are described in this paper. The most challenging issue is to evaluate the scanner performance properly when both printer and scanner units contribute to the motion errors. In the first stage called the uncorrected-print-error-stage, aperiodic and periodic motion behaviors are characterized in both the spatial and frequency domains. Since it is not clear how much of the error is contributed by each unit, the scanned input is statistically analyzed in the second stage called the corrected-print-error-stage. Finally, the described diagnostic algorithms output the estimated scan error and print error separately as RMS values of the displacement of the scan and print lines, respectively, from their nominal positions in the scanner or printer motion direction. We validate our test page design and approaches by ground truth obtained from a high-precision, chrome-on-glass reticle manufactured using semiconductor chip fabrication technologies.
Autonomous detection of ISO fade point with color laser printers
Ni Yan, Eric Maggard, Roberta Fothergill, et al.
Image quality assessment is a very important field in image processing. Human observation is slow and subjective, and it also requires a strict environment setup for the psychological test [1]. Thus, developing algorithms that match the desired human experimental results is always in demand. Many studies have focused on detecting the fading phenomenon after the materials are printed, that is, on monitoring the persistence of the color ink [2-4]. However, fading is also a common artifact produced by printing systems when the cartridges run low. We want to develop an automatic system to monitor cartridge life and report fading defects when they appear. In this paper, we first describe a psychological experiment that studies human perception of printed fading pages. Then we propose an algorithm based on Color Space Projection and K-means clustering to predict the visibility of fading defects. Finally, we integrate the psychological experiment results with our algorithm to give a machine learning tool that monitors cartridge life.
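A minimal sketch, assuming scikit-learn, of the clustering step only; the color-space projection and the visibility model trained by the authors are replaced here by a simple luminance-based placeholder:

```python
import numpy as np
from sklearn.cluster import KMeans

def fade_indicator(rgb_page, n_clusters=2):
    """Cluster page pixels and return the ink/paper separation (placeholder fade cue)."""
    pixels = np.asarray(rgb_page, float).reshape(-1, 3) / 255.0
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(pixels)
    # Mean luminance of each cluster centre; fading pulls the darkest (ink)
    # cluster towards the brightest (paper) cluster, shrinking the gap.
    lum = km.cluster_centers_.mean(axis=1)
    return float(lum.max() - lum.min())
```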
Print Quality II
Autonomous detection of text fade point with color laser printers
Yanling Ju, Eric Maggard, Renee Jessome, et al.
Fading is one of the issues of most critical concern for print quality degradation with color laser electrophotographic printers. Fading occurs when the cartridge is depleted. ISO/IEC 19798:2007(E) specifies a process for determining the cartridge page yield for a given color electrophotographic printer model. It is based on repeatedly printing a suite of test pages, followed by visual examination of the sequence of printed diagnostic pages. But this method is a very costly process, since it involves visual examination of a large number of pages. In addition, the final decision is based on the visual examination of a specially designed diagnostic page, which is different from typical office document pages, since it consists of color bars and contains no text. In this paper, we propose a new method to autonomously detect text fading in prints from home or office color printers using a typical office document page instead of a specially designed diagnostic page. In our method, we scan and analyze the printed pages to predict where expert observers would judge fading to have occurred in the print sequence. Our approach is based on a machine-learning framework in which features derived from image analysis are mapped to a fade point prediction.
Photoconductor surface modeling for defect compensation based on printed images
Manufacturing imperfections of photoconductor (PC) drums in electrophotographic (EP) printers cause low-frequency artifacts that could produce objectionable non-uniformities in the final printouts. In this paper, we propose a technique to detect and quantify PC artifacts. Furthermore, we spatially model the PC drum surface for dynamic compensation of drum artifacts. After scanning printed pages of flat field areas, we apply a wavelet-based filtering technique to the scanned images to isolate the PC-related artifacts from other printing artifacts, based on the frequency, range, and direction of the PC defects. Prior knowledge of the PC circumference determines the printed area at each revolution of the drum for separate analysis. Applied to the filtered images, the expectation maximization (EM) algorithm models the PC defects as a mixture of Gaussians. We use the estimated parameters of the Gaussians to measure the severity of the defect. In addition, a 2-D polynomial fitting approach characterizes the spatial artifacts of the drum, by analyzing multiple revolutions of printed output. The experimental results show a high correlation of the modeled artifacts from different revolutions of a drum. This allows for generating a defect-compensating profile of the defective drum.
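A minimal sketch, assuming scikit-learn, of the EM step: a Gaussian mixture is fitted to the filtered defect signal, and a placeholder severity score (not the paper's exact measure) is derived from the estimated parameters:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def defect_severity(filtered_profile, n_components=3):
    """Fit a Gaussian mixture to the PC-isolated signal and derive a severity score."""
    x = np.asarray(filtered_profile, float).reshape(-1, 1)
    gmm = GaussianMixture(n_components=n_components, random_state=0).fit(x)
    means = gmm.means_.ravel()
    stds = np.sqrt(gmm.covariances_.ravel())
    # Weighted spread of the components around zero; a defect-free (flat) profile
    # keeps both the means and the standard deviations small.
    return float(np.sum(gmm.weights_ * (np.abs(means) + stds)))
```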
Controlling misses and false alarms in a machine learning framework for predicting uniformity of printed pages
In our previous work [1], we presented a block-based technique to analyze printed page uniformity both visually and metrically. The features learned from the models were then employed in a Support Vector Machine (SVM) framework to classify the pages into one of the two categories of acceptable and unacceptable quality. In this paper, we introduce a set of tools for machine learning in the assessment of printed page uniformity. This work is primarily targeted to the printing industry, specifically the ubiquitous laser, electrophotographic printer. We use features that are well-correlated with the rankings of expert observers to develop a novel machine learning framework that allows one to achieve the minimum "false alarm" rate, subject to a chosen "miss" rate. Surprisingly, most of the research that has been conducted on machine learning does not consider this framework. During the process of developing a new product, test engineers will print hundreds of test pages, which can be scanned and then analyzed by an autonomous algorithm. Among these pages, most may be of acceptable quality. The objective is to find the ones that are not. These will provide critically important information to systems designers, regarding issues that need to be addressed in improving the printer design. A "miss" is defined to be a page that is not of acceptable quality to an expert observer, but that the prediction algorithm declares to be a "pass". Misses are a serious problem, since they represent problems that will not be seen by the systems designers. On the other hand, "false alarms" correspond to pages that an expert observer would declare to be of acceptable quality, but which are flagged by the prediction algorithm as "fails". In a typical printer testing and development scenario, such pages would be examined by an expert, and found to be of acceptable quality after all. "False alarm" pages result in extra pages to be examined by expert observers, which increases labor cost. But "false alarms" are not nearly as catastrophic as "misses", which represent potentially serious problems that are never seen by the systems developers. This scenario motivates us to develop a machine learning framework that will achieve the minimum "false alarm" rate subject to a specified "miss" rate. In order to construct such a set of receiver operating characteristic (ROC) curves [2], we examine various tools for the prediction, ranging from an exhaustive search over the space of nonlinear discriminants to a Cost-Sensitive SVM framework [3]. We then compare the curves obtained from those methods. Our work shows promise for applying a standard framework to obtain a full ROC curve when it comes to tackling other machine learning problems in industry.
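A minimal sketch, assuming scikit-learn, of the operating-point selection described above: sweep the decision threshold of a cost-weighted SVM on a validation set and pick the threshold with the lowest false-alarm rate among those meeting the allowed miss rate (the class weights and kernel are illustrative placeholders):

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics import roc_curve

def min_false_alarm_operating_point(X_train, y_train, X_val, y_val, max_miss_rate=0.05):
    """Pick the decision threshold with the lowest false-alarm rate at an allowed miss rate."""
    # y = 1 marks an unacceptable ("fail") page; the class weights penalise misses more.
    clf = SVC(kernel="rbf", class_weight={1: 10, 0: 1}).fit(X_train, y_train)
    scores = clf.decision_function(X_val)
    fpr, tpr, thresholds = roc_curve(y_val, scores)   # fpr is the false-alarm rate
    miss = 1.0 - tpr                                  # miss rate = 1 - sensitivity
    ok = miss <= max_miss_rate
    if not np.any(ok):
        raise ValueError("no threshold satisfies the requested miss rate")
    best = np.argmin(np.where(ok, fpr, np.inf))
    return thresholds[best], fpr[best], miss[best]
```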
Estimation of repetitive interval of periodic bands in laser electrophotographic printer output
In the printing industry, electrophotography (EP) is a commonly used technology in laser printers and copiers. In the EP printing process, there are many rotating components involved in the six major steps: charging, exposure, development, transfer, fusing, and cleaning. If there is any irregularity in one of the rotating components, repetitive defects, such as isolated bands or spots, will occur in the output of the printer or copier. To troubleshoot these types of repetitive defect issues, the repeating interval of these isolated bands or spots is an important clue for locating the irregular rotating component. In our previous work, we effectively identified the presence of isolated large-pitch bands in the output from EP printers. In this paper, we describe an algorithm to estimate the repetitive interval of periodic bands when the data is corrupted by the presence of aperiodic bands, missing periodic bands, and noise. We also illustrate the effectiveness and robustness of our method with example results.
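One simple way to estimate such an interval in the presence of missing and spurious bands is a histogram of pairwise band spacings; the sketch below is an illustrative stand-in for the paper's estimator, and the bin width and search range are arbitrary assumptions:

```python
import numpy as np

def estimate_interval(band_positions, bin_width=0.5, max_interval=200.0):
    """Dominant spacing (in position units) among all pairwise band distances."""
    p = np.sort(np.asarray(band_positions, float))
    diffs = (p[None, :] - p[:, None])[np.triu_indices(len(p), k=1)]
    diffs = diffs[diffs <= max_interval]
    hist, edges = np.histogram(diffs, bins=np.arange(0.0, max_interval, bin_width))
    peak = int(np.argmax(hist))
    return 0.5 * (edges[peak] + edges[peak + 1])   # centre of the dominant spacing bin
```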
Imaging Performance
Image quality optimization via application of contextual contrast sensitivity and discrimination functions
Edward Fry, Sophie Triantaphillidou, John Jarvis, et al.
What is the best luminance contrast weighting-function for image quality optimization? Traditionally measured contrast sensitivity functions (CSFs) have often been used as weighting-functions in image quality and difference metrics. Such weightings have been shown to result in increased sharpness and perceived quality of test images. We suggest that contextual CSFs (cCSFs) and contextual discrimination functions (cVPFs) should provide a basis for further improvement, since these are directly measured from pictorial scenes, modeling threshold and suprathreshold sensitivities within the context of complex masking information. Image quality assessment is understood to require detection and discrimination of masked signals, making contextual sensitivity and discrimination functions directly relevant. In this investigation, test images are weighted with a traditional CSF, a cCSF, a cVPF and a constant function. Controlled mutations of these functions are also applied as weighting-functions, seeking the optimal spatial frequency band weighting for quality optimization. Image quality, sharpness and naturalness are then assessed in two-alternative forced-choice psychophysical tests. We show that maximal quality for our test images results from cCSFs and cVPFs mutated to boost contrast in the higher visible frequencies.
A study of slanted-edge MTF stability and repeatability
Jackson K. M. Roland
The slanted-edge method of measuring the spatial frequency response (SFR) as an approximation of the modulation transfer function (MTF) has become a well known and widely used image quality testing method over the last 10 years. This method has been adopted by multiple international standards, including those of ISO and IEEE. Nearly every commercially available image quality testing software package includes the slanted-edge method, and there are numerous open-source algorithms available. This method is one of the most important image quality algorithms in use today. This paper explores test conditions and the impacts they have on the stability and precision of the slanted-edge method, as well as details of the algorithm itself. Real world and simulated data are used to validate the characteristics of the algorithm. Details of the target, such as edge angle and contrast ratio, are tested to determine the impact on measurement under various conditions. The original algorithm defines a near-vertical edge so that the errors introduced are minor, but the theory behind the algorithm requires a perfectly vertical edge. A correction factor is introduced as a way to compensate for this problem. Contrast ratio is shown to have no impact on results in the absence of noise.
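For orientation, a heavily simplified sketch of the slanted-edge pipeline is given below; it assumes an already cropped, essentially vertical edge region and omits the edge-angle estimation and 4x oversampled binning of the standardized ISO 12233 algorithm:

```python
import numpy as np

def simple_sfr(edge_roi):
    """Edge-spread -> line-spread -> |FFT|, for an already vertical, cropped edge."""
    esf = np.asarray(edge_roi, float).mean(axis=0)    # average rows into an ESF
    lsf = np.diff(esf)                                # differentiate into an LSF
    lsf = lsf * np.hanning(len(lsf))                  # window to reduce spectral leakage
    sfr = np.abs(np.fft.rfft(lsf))
    sfr /= sfr[0]                                     # normalise to 1 at DC
    freqs = np.fft.rfftfreq(len(lsf), d=1.0)          # cycles per pixel
    return freqs, sfr
```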
Comparative performance between human and automated face recognition systems, using CCTV imagery, different compression levels, and scene parameters
A. Tsifouti, S. Triantaphillidou, M.-C. Larabi, et al.
In this investigation we identify relationships between human and automated face recognition systems with respect to compression. Further, we identify the most influential scene parameters on the performance of each recognition system. The work includes testing of the systems with compressed Closed-Circuit Television (CCTV) footage, consisting of quantified scene (footage) parameters. Parameters describe the content of scenes concerning camera to subject distance, facial angle, scene brightness, and spatio-temporal busyness. These parameters have been previously shown to affect the human visibility of useful facial information, but not much work has been carried out to assess the influence they have on automated recognition systems. In this investigation, the methodology previously employed in the human investigation is adopted, to assess performance of three different automated systems: Principal Component Analysis, Linear Discriminant Analysis, and Kernel Fisher Analysis. Results show that the automated systems are more tolerant to compression than humans. In automated systems, mixed brightness scenes were the most affected and low brightness scenes were the least affected by compression. In contrast for humans, low brightness scenes were the most affected and medium brightness scenes the least affected. Findings have the potential to broaden the methods used for testing imaging systems for security applications.
A study of image exposure for the stereoscopic visualization of sparkling materials
Victor Medina, Alexis Paljic, Dominique Lafon-Pham
This work is performed as part of the perceptual validation stage in the stereoscopic visualization of computer-generated (CG) images of materials (typically car paints) containing sparkling metallic flakes. The perceived material aspect is closely linked to the flake density, depth, and sparkling; in turn, our perception of an image of said materials is strongly dependent on the image exposure, that is, the amount of light entering the sensor during the imaging process. Indeed, a high exposure may oversaturate the image, reducing discrimination amongst high-luminance flakes and affecting the perceived depth; on the other hand, a low exposure may reduce image contrast, merging low-luminance flakes with the background and reducing the perceived flake density and sparkling. In order to choose the right exposure for each CG image, we have performed a user study where we presented observers with a series of stereoscopic photographs of plates, taken at different exposures with a radiometrically color-calibrated camera [5], and asked them to assess each photograph's similarity to a physical reference. We expect these results to help us find a correlation between optical settings and visual perception regarding the aforementioned parameters, which we could then use in the rendering process to obtain the desired material aspect.
Subjective Quality Assessment
QuickEval: a web application for psychometric scaling experiments
Khai Van Ngo, Jehans Jr. Storvik, Christopher André Dokkeberg, et al.
QuickEval is a web application for carrying out psychometric scaling experiments. It offers the possibility of running controlled experiments in a laboratory, or large scale experiments over the web for people all over the world. It is a unique, one of a kind web application, and software of this kind is needed in the image quality field. It is also, to the best of our knowledge, the first software that supports the three most common scaling methods (paired comparison, rank order, and category judgement), and in particular the first to support rank order. Hopefully, a side effect of this newly created software is that it will lower the threshold for performing psychometric experiments, improve the quality of the experiments being carried out, make it easier to reproduce experiments, and increase research on image quality in both academia and industry. The web application is available at www.colourlab.no/quickeval.
A database for spectral image quality
Steven Le Moan, Sony T. George, Marius Pedersen, et al.
We introduce a new image database dedicated to multi-/hyperspectral image quality assessment. A total of nine scenes representing pseudo-flat surfaces of different materials (textile, wood, skin, etc.) were captured by means of a 160-band hyperspectral system with a spectral range between 410 and 1000 nm. Five spectral distortions were designed, applied to the spectral images and subsequently compared in a psychometric experiment, in order to provide a basis for applications such as the evaluation of spectral image difference measures. The database can be downloaded freely from http://www.colourlab.no/cid.
Alternative performance metrics and target values for the CID2013 database
An established way of validating and testing new image quality assessment (IQA) algorithms has been to compare how well they correlate with subjective data on various image databases. One of the most common measures is to calculate the linear correlation coefficient (LCC) and Spearman's rank order correlation coefficient (SROCC) against the subjective mean opinion score (MOS). Recently, databases with multiply distorted images have emerged [1,2]. However, with multidimensional stimuli there is more disagreement between observers, as the task is more preferential than that of distortion detection. This reduces the statistical differences between image pairs. If the subjects cannot distinguish a difference between some of the image pairs, should we demand any better performance from IQA algorithms? This paper proposes alternative performance measures for the evaluation of IQAs for the CID2013 database. One proposed alternative performance measure is the root-mean-square error (RMSE) value for the subjective data as a function of the number of observers. The other alternative performance measure is the number of statistically significant differences between image pairs. This study shows that after 12 subjects the RMSE value saturates around the level of three, meaning that a target RMSE value for an IQA algorithm for the CID2013 database should be three. In addition, this study shows that the state-of-the-art IQA algorithms found the better image of an image pair with a probability of 0.85 when the image pairs with statistically significant differences were taken into account.
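For reference, a minimal sketch of the conventional performance measures mentioned above (LCC, SROCC, and an RMSE after a simple linear mapping of predictions onto the MOS scale), assuming SciPy and NumPy:

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

def evaluate_iqa(predicted_scores, mos):
    """LCC, SROCC and RMSE of algorithm predictions against mean opinion scores."""
    pred = np.asarray(predicted_scores, float)
    mos = np.asarray(mos, float)
    lcc, _ = pearsonr(pred, mos)
    srocc, _ = spearmanr(pred, mos)
    # Map predictions onto the MOS scale with a linear fit before computing RMSE.
    a, b = np.polyfit(pred, mos, 1)
    rmse = float(np.sqrt(np.mean((a * pred + b - mos) ** 2)))
    return float(lcc), float(srocc), rmse
```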
Extending subjective experiments for image quality assessment with baseline adjustments
In a typical working cycle of image quality assessment, it is common to have a number of human observers give perceptual ratings on multiple levels of distortion of selected test images. If additional distortions need to be introduced into the experiment, the entire subjective experiment must be performed over again in order to incorporate the additional distortions. However, this usually consumes considerably more time and resources. Baseline adjustment is one method to extend an experiment with additional distortions without having to do a full experiment, reducing both the time and resources needed. In this paper, we conduct a study to verify and evaluate the baseline adjustment method for extending an existing subjective experimental session with another. Our experimental results suggest that the baseline adjustment method can be effective. We find that the optimal distortion levels to include in the baseline are those whose stimulus combinations produce the minimum standard deviations in the mean adjusted Z-scores over all human observers in the existing rating session. We also demonstrate that it is possible to reduce the number of baseline stimuli, so that the cost of extending subjective experiments can be optimized. In contrast to conventional research, which mainly focuses on case studies of hypothetical data sets, we perform this research on real perceptual ratings collected from an existing subjective experiment.
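A minimal sketch of the baseline-adjustment idea as described above: per-session Z-scores are aligned through the mean score of the shared baseline stimuli. The data layout (per-observer rating vectors indexed by stimulus ID lists) is an assumption made for illustration, not the paper's notation:

```python
import numpy as np

def zscores(ratings):
    """Standardise one observer's ratings from a single session."""
    r = np.asarray(ratings, float)
    return (r - r.mean()) / r.std(ddof=1)

def adjust_new_session(old_ratings, old_ids, new_ratings, new_ids, baseline_ids):
    """Shift the new session's Z-scores so the shared baseline stimuli align."""
    z_old = zscores(old_ratings)
    z_new = zscores(new_ratings)
    old_base = np.mean([z_old[old_ids.index(s)] for s in baseline_ids])
    new_base = np.mean([z_new[new_ids.index(s)] for s in baseline_ids])
    return z_new + (old_base - new_base)
```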
Subjective quality of video sequences rendered on LCD with local backlight dimming at different lighting conditions
Claire Mantel, Jari Korhonen, Jesper Melgaard Pedersen, et al.
This paper focuses on the influence of ambient light on the perceived quality of videos displayed on a Liquid Crystal Display (LCD) with local backlight dimming. A subjective test assessing the quality of videos with two backlight dimming methods and three lighting conditions, i.e. no light, a low light level (5 lux) and a higher light level (60 lux), was organized to collect subjective data. Results show that participants prefer the method exploiting local dimming possibilities to the conventional full backlight, but that this preference varies depending on the ambient light level. The clear preference for one method in the low light conditions decreases at the high ambient light level, confirming that the ambient light significantly attenuates the perception of the leakage defect (light leaking through dark pixels). Results are also highly dependent on the content of the sequence, which can modulate the effect of the ambient light from having an important influence on the quality grades to no influence at all.
Subjective and Objective Quality Assessment
RGB-NIR image fusion: metric and psychophysical experiments
Alex E. Hayes, Graham D. Finlayson, Roberto Montagna
In this paper, we compare four methods of fusing visible RGB and near-infrared (NIR) images to produce a color output image, using a psychophysical experiment and image fusion quality metrics. The results of the psychophysical experiment show that two methods are significantly preferred to the original RGB image, and therefore RGB-NIR image fusion may be useful for photographic enhancement in those cases. The Spectral Edge method is the most preferred method, followed by the dehazing method of Schaul et al. We then investigate image fusion metrics which give results correlated with the psychophysical experiment results. We extend several existing metrics from 2-to-1 to M-to-N channel image fusion, as well as introducing new metrics based on output image colorfulness and contrast, and test them on our experimental data. While none of the individual metrics gives a ranking of the algorithms which exactly matches that of the psychophysical experiment, through a combination of two metrics we accurately rank the two leading fusion methods.
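As an example of a colorfulness-based cue of the kind mentioned above, the sketch below computes the Hasler-Süsstrunk colorfulness statistic on a fused RGB output; treating this particular formulation as the paper's metric is an assumption:

```python
import numpy as np

def colorfulness(rgb):
    """Hasler-Suesstrunk colourfulness statistic of an RGB image."""
    r, g, b = (np.asarray(rgb, float)[..., i] for i in range(3))
    rg = r - g
    yb = 0.5 * (r + g) - b
    return float(np.sqrt(rg.std() ** 2 + yb.std() ** 2)
                 + 0.3 * np.sqrt(rg.mean() ** 2 + yb.mean() ** 2))
```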
Non-reference quality assessment of infrared images reconstructed by compressive sensing
Infrared (IR) images are representations of the world and have natural features like images in the visible spectrum. As such, natural features from infrared images support image quality assessment (IQA) [1]. In this work, we compare the quality of a set of indoor and outdoor IR images reconstructed from measurement functions formed by linear combination of their pixels. The reconstruction methods are: linear discrete cosine transform (DCT) acquisition, DCT augmented with total variation minimization, and a compressive sensing (CS) scheme. Peak Signal to Noise Ratio (PSNR), three full-reference (FR) IQA measures, and four no-reference (NR) IQA measures compute the quality of each reconstruction: multi-scale structural similarity (MSSIM), visual information fidelity (VIF), information fidelity criterion (IFC), sharpness identification based on local phase coherence (LPC-SI), blind/referenceless image spatial quality evaluator (BRISQUE), naturalness image quality evaluator (NIQE) and gradient singular value decomposition (GSVD), respectively. Each measure is compared to human scores that were obtained by a differential mean opinion score (DMOS) test. We observe that GSVD has the highest correlation coefficients of all NR measures, but all FR measures perform better. We use MSSIM to compare the reconstruction methods, and we find that the CS scheme produces a good-quality IR image using only 30000 random sub-samples and 1000 DCT coefficients (2%). In contrast, linear DCT provides higher correlation coefficients than the CS scheme by using all the pixels of the image and 31000 DCT coefficients (47%).
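The simplest of the full-reference baselines above is PSNR; a minimal sketch for images scaled to an 8-bit range is:

```python
import numpy as np

def psnr(reference, reconstruction, peak=255.0):
    """Peak signal-to-noise ratio in dB between a reference and a reconstruction."""
    err = np.asarray(reference, float) - np.asarray(reconstruction, float)
    return float(10.0 * np.log10(peak ** 2 / np.mean(err ** 2)))
```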
Study of the effects of video content on quality of experience
In this article the effects of video content on Quality of Experience (QoE) are presented. Delivery of video content with a high level of QoE over bandwidth-limited and error-prone networks is of crucial importance for service providers. Therefore, it is of fundamental importance to analyse the impact of network impairments and of video content on perceived quality during QoE metric design. The major contributions of the article are the study of i) the impact of network impairments together with video content, ii) the impact of the video content itself, and iii) the impact of video content related parameters (spatial-temporal perceptual information, video content type, and frame size) on QoE. The results show that when the impact of impairments on perceived quality is low, the quality is significantly influenced by the video content, and the video content itself also has a significant impact on QoE. Finally, the results strengthen the need for new parameter characterization for better QoE metric design.
The effects of scene content, compression, and frame rate on the performance of analytics systems
A. Tsifouti, S. Triantaphillidou, M. C. Larabi, et al.
In this investigation we study the effects of compression and frame rate reduction on the performance of four video analytics (VA) systems utilizing a low complexity scenario, such as the Sterile Zone (SZ). Additionally, we identify the most influential scene parameters affecting the performance of these systems. The SZ scenario is a scene consisting of a fence, not to be trespassed, and an area with grass. The VA system needs to alarm when there is an intruder (attack) entering the scene. The work includes testing of the systems with uncompressed and compressed (using H.264/MPEG-4 AVC at 25 and 5 frames per second) footage, consisting of quantified scene parameters. The scene parameters include descriptions of scene contrast, camera to subject distance, and attack portrayal. Additional footage, including only distractions (no attacks) is also investigated. Results have shown that every system has performed differently for each compression/frame rate level, whilst overall, compression has not adversely affected the performance of the systems. Frame rate reduction has decreased performance and scene parameters have influenced the behavior of the systems differently. Most false alarms were triggered with a distraction clip, including abrupt shadows through the fence. Findings could contribute to the improvement of VA systems.
How perception of ultra-high definition is modified by viewing distance and screen size
Amélie Lachat, Jean-Charles Gicquel, Jérôme Fournier
Ultra High Definition (UHD) is a new technology whose main idea is to improve the user's perception of details and sensation of immersion in comparison with High Definition (HD) systems. However, it is important to understand the influence of the new UHD technical parameters on user perception. Hence, to investigate the influence of viewing distance, screen size, and scene content on perceived video quality and the feelings of users, a series of subjective experiments with four different contents (3 documentaries and 1 sport content) shot with a UHD camera was performed. These contents were displayed using three different image resolutions (SD, HD, UHD) and two UHD displays (55-inch and 84-inch). Each subject assessed the content at three different viewing distances (1.5, 3, and 4.5 times the screen height, corresponding to the optimal viewing distances of UHD and HD, and close to the SD optimal distance, respectively). In total, 72 test conditions were evaluated. For each scene, observers reported their opinion on the perceived video quality using a 5-grade subjective scale. Results have shown that viewing distance has a significant influence on perceived quality. Moreover, the highest MOS was obtained at the optimal viewing distance for UHD, with a small difference between HD and UHD. At 3H and 4.5H, there is no difference from a statistical point of view. Screen size influences the perception of quality, but not in the same way for the three image resolutions and three viewing distances.
Objective Quality Assessment
A no-reference video quality assessment metric based on ROI
Lixiu Jia, Xuefei Zhong, Yan Tu, et al.
A no-reference video quality assessment metric based on the region of interest (ROI) is proposed in this paper. In the metric, objective video quality is evaluated by integrating the quality of two compression artifacts, i.e. blurring distortion and blocking distortion. A Gaussian kernel function was used to extract human density maps for the H.264-coded videos from the subjective eye tracking data. An objective bottom-up ROI extraction model was built based on the magnitude discrepancy of the discrete wavelet transform between two consecutive frames, a center-weighted color opponent model, a luminance contrast model, and a frequency saliency model based on spectral residual. Then only the objective saliency maps were used to compute the objective blurring and blocking quality. The results indicate that the objective ROI extraction metric has a higher area under the curve (AUC) value. Compared with conventional video quality assessment metrics, which measure all the video frames, the metric proposed in this paper not only decreases the computational complexity but also improves the correlation between the subjective mean opinion score (MOS) and the objective scores.
Comparison of no-reference image quality assessment machine learning-based algorithms on compressed images
No-reference image quality metrics are of fundamental interest as they can be embedded in practical applications. The main goal of this paper is to perform a comparative study of seven well known no-reference learning-based image quality algorithms. To test the performance of these algorithms, three public databases are used. As a first step, the trial algorithms are compared when no new learning is performed. The second step investigates how the training set influences the results. The Spearman Rank Ordered Correlation Coefficient (SROCC) is utilized to measure and compare the performance. In addition, a hypothesis test is conducted to evaluate the statistical significance of the performance of each tested algorithm.
Objective evaluation of slanted edge charts
Camera objective characterization methodologies are widely used in the digital camera industry. Most objective characterization systems rely on a chart with specific patterns; a software algorithm measures a degradation or difference between the captured image and the chart itself. The Spatial Frequency Response (SFR) method, which is part of the ISO 12233 standard [1], is now very commonly used in the imaging industry; it is a very convenient way to measure a camera's Modulation Transfer Function (MTF). The SFR algorithm can measure frequencies beyond the Nyquist frequency thanks to super-resolution, so it provides useful information on aliasing and can provide modulation for frequencies between half-Nyquist and Nyquist on all color channels of a color sensor with a Bayer pattern. The measurement process relies on a chart that is simple to manufacture: a straight transition from a bright reflectance to a dark one (black and white for instance), while a sine chart requires handling precise shades of gray, which can also create all sorts of issues with printers that rely on half-toning. However, no technology can create a perfect edge, so it is important to assess the quality of the chart and understand how it affects the accuracy of the measurement. In this article, I describe a protocol to characterize the MTF of a slanted-edge chart, using a high-resolution flatbed scanner. The main idea is to use the RAW output of the scanner as a high-resolution micro-densitometer; since the signal is linear, it is suitable for measuring the chart MTF using the SFR algorithm. The scanner needs to be calibrated in sharpness: the scanner MTF is measured with a calibrated sine chart and inverted to compensate for the modulation loss from the scanner. Then the true chart MTF is computed. This article compares measured MTFs from commercial charts and charts printed on printers, and also compares how the contrast of the edge (using different shades of gray) can affect the chart MTF, then concludes on the distance range and camera resolution for which the chart can reliably be used to measure the camera MTF.
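The compensation step described above amounts to dividing the MTF measured through the scanner by the scanner's own MTF on a common frequency grid; a minimal sketch follows (the flooring of the scanner MTF is an assumption, added to avoid amplifying noise near the scanner's cutoff):

```python
import numpy as np

def compensate_chart_mtf(freqs, measured_mtf, scanner_freqs, scanner_mtf, floor=1e-3):
    """Divide the scanned-chart MTF by the scanner MTF (resampled to the same grid)."""
    scanner_on_grid = np.interp(freqs, scanner_freqs, scanner_mtf)
    # Flooring the scanner MTF avoids blowing up noise near the scanner's cutoff.
    return np.asarray(measured_mtf, float) / np.maximum(scanner_on_grid, floor)
```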
Evaluating the multi-scale iCID metric
Steven Le Moan, Jens Preiss, Philipp Urban
In this study, we investigate the extent to which an image-difference metric based on structural similarity can correlate with human judgment. We introduce a modified version of the recently published iCID metric and present new results over two large image quality databases. It is particularly noteworthy that the proposed metric yields a correlation of 0.861 with mean opinion scores on the 2013 version of the renowned Tampere Image Database, without dedicated parameter optimization.
Display Quality
Image quality evaluation of LCDs based on novel RGBW sub-pixel structure
Sungjin Kim, Dongwoo Kang, Jinsang Lee, et al.
Many display manufacturers have studied the RGBW pixel structure, which adds a white sub-pixel to an RGB LCD, and have recently revealed UHD TVs based on a novel RGBW LCD. The RGBW LCD has 50% higher white luminance and 25% lower primary color luminance compared to the RGB LCD. In this paper, the image quality of RGBW and RGB LCDs is addressed. Before evaluating them, TV broadcast video and IEC 62087 video were analyzed to select test video clips. For this analysis, a TV reference video was first collected from TV broadcast content in Korea. As a result of the TV reference video analysis, the RGBW LCD was expected to improve image quality more, because most colors are distributed around the white point and the population ratio of achromatic colors is higher. RGB, RGBW, and RGBW LCDs using a wide color gamut (WCG) backlight unit (BLU) were prepared, and a series of visual assessments was conducted. As a result, the RGBW LCD obtained higher scores than the RGB LCD on four attributes (‘Brightness’, ‘Naturalness’, ‘Contrast’ and overall image quality), while the ‘Colorfulness’ score was not higher than that of the RGB LCD for the test still images. The overall image quality of the RGBW LCD for the TV reference video clips was also assessed higher than that of the RGB LCD. Additionally, the RGBW LCD using the WCG BLU showed better performance than the RGBW LCD, especially in ‘Colorfulness’.
Is there a preference for linearity when viewing natural images?
David Kane, Marcelo Bertalmío
The system gamma of the imaging pipeline, defined as the product of the encoding and decoding gammas, is typically greater than one and is stronger for images viewed with a dark background (e.g. cinema) than for those viewed in lighter conditions (e.g. office displays) [1-3]. However, for high dynamic range (HDR) images reproduced on a low dynamic range (LDR) monitor, subjects often prefer a system gamma of less than one [4], presumably reflecting the greater need for histogram equalization in HDR images. In this study we ask subjects to rate the perceived quality of images presented on an LDR monitor using various levels of system gamma. We reveal that the optimal system gamma is below one for images with an HDR and approaches or exceeds one for images with an LDR. Additionally, the highest quality scores occur for images where a system gamma of one is optimal, suggesting a preference for linearity (where possible). We find that subjective image quality scores can be predicted by computing the degree of histogram equalization of the lightness distribution. Accordingly, an optimal, image-dependent system gamma can be computed that maximizes perceived image quality.
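One simple way to quantify a 'degree of histogram equalization' of a lightness channel is to compare its cumulative distribution with the uniform one; the sketch below does exactly that and assumes CIELAB lightness in the range 0 to 100 (the authors' exact measure may differ):

```python
import numpy as np

def equalization_degree(lightness, bins=256):
    """How close the lightness histogram (L* in 0..100) is to a perfectly flat one."""
    hist, _ = np.histogram(np.asarray(lightness, float).ravel(),
                           bins=bins, range=(0.0, 100.0), density=True)
    cdf = np.cumsum(hist)
    cdf /= cdf[-1]
    uniform_cdf = np.linspace(1.0 / bins, 1.0, bins)
    # 1 for a perfectly equalised histogram; smaller values mean less equalisation.
    return float(1.0 - np.mean(np.abs(cdf - uniform_cdf)))
```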