Proceedings Volume 9400

Real-Time Image and Video Processing 2015

cover
Proceedings Volume 9400

Real-Time Image and Video Processing 2015

Purchase the printed version of this volume at proceedings.com or access the digital version at SPIE Digital Library.

Volume Details

Date Published: 16 March 2015
Contents: 5 Sessions, 26 Papers, 0 Presentations
Conference: SPIE/IS&T Electronic Imaging 2015
Volume Number: 9400

Table of Contents

icon_mobile_dropdown

Table of Contents

All links to SPIE Proceedings will open in the SPIE Digital Library. external link icon
View Session icon_mobile_dropdown
  • Front Matter: Volume 9400
  • Real-Time Hardware
  • Real-Time Algorithms I
  • Real-Time Algorithms II
  • Interactive Paper Session
Front Matter: Volume 9400
icon_mobile_dropdown
Front Matter: Volume 9400
This PDF file contains the front matter associated with SPIE Proceedings Volume 9400 including the Title Page, Copyright information, Table of Contents, Introduction, and Conference Committee listing.
Real-Time Hardware
icon_mobile_dropdown
Customized Nios II multi-cycle instructions to accelerate block-matching techniques
Diego González, Guillermo Botella, Carlos García, et al.
This study focuses on accelerating the optimization of motion estimation algorithms, which are widely used in video coding standards, by using both the paradigm based on Altera Custom Instructions as well as the efficient combination of SDRAM and On-Chip memory of Nios II processor. Firstly, a complete code profiling is carried out before the optimization in order to detect time leaking affecting the motion compensation algorithms. Then, a multi-cycle Custom Instruction which will be added to the specific embedded design is implemented. The approach deployed is based on optimizing SOC performance by using an efficient combination of On-Chip memory and SDRAM with regards to the reset vector, exception vector, stack, heap, read/write data (.rwdata), read only data (.rodata), and program text (.text) in the design. Furthermore, this approach aims to enhance the said algorithms by incorporating Custom Instructions in the Nios II ISA. Finally, the efficient combination of both methods is then developed to build the final embedded system. The present contribution thus facilitates motion coding for low-cost Soft-Core microprocessors, particularly the RISC architecture of Nios II implemented in FPGA. It enables us to construct an SOC which processes 50×50 @ 180 fps.
Hardware design to accelerate PNG encoder for binary mask compression on FPGA
PNG (Portable Network Graphics) is a lossless compression method for real-world pictures. Since its specification, it continues to attract the interest of the image processing community. Indeed, PNG is an extensible file format for portable and well-compressed storage of raster images. In addition, it supports all of Black and White (binary mask), grayscale, indexed-color, and truecolor images. Within the framework of the Demat+ project which intend to propose a complete solution for storage and retrieval of scanned documents, we address in this paper a hardware design to accelerate the PNG encoder for binary mask compression on FPGA. For this, an optimized architecture is proposed as part of an hybrid software and hardware co-operating system. For its evaluation, the new designed PNG IP has been implemented on the ALTERA Arria II GX EP2AGX125EF35" FPGA. The experimental results show a good match between the achieved compression ratio, the computational cost and the used hardware resources.
Real-time algorithm enabling high dynamic range imaging and high frame rate exploitation for custom CMOS image sensor system implemented by FPGA with co-processor
Blake C. Jacquot, Nathan Johnson-Williams
We present results from a prototype CMOS camera system implementing a multiple sampled pixel level algorithm (“Last Sample Before Saturation”) in real-time to create High-Dynamic Range (HDR) images that approach the dynamic range of CCDs. The system is built around a commercial 1280 × 1024 CMOS image sensor with 10-bits per pixel and up to 500 Hz full frame rate with higher frame rates available through windowing. We provide details of system architecture and present images collected with the system.
Fast semivariogram computation using FPGA architectures
Yamuna Lagadapati, Mukul Shirvaikar, Xuanliang Dong
The semivariogram is a statistical measure of the spatial distribution of data and is based on Markov Random Fields (MRFs). Semivariogram analysis is a computationally intensive algorithm that has typically seen applications in the geosciences and remote sensing areas. Recently, applications in the area of medical imaging have been investigated, resulting in the need for efficient real time implementation of the algorithm. The semivariogram is a plot of semivariances for different lag distances between pixels. A semi-variance, γ(h), is defined as the half of the expected squared differences of pixel values between any two data locations with a lag distance of h. Due to the need to examine each pair of pixels in the image or sub-image being processed, the base algorithm complexity for an image window with n pixels is 𝑂(𝑛2). Field Programmable Gate Arrays (FPGAs) are an attractive solution for such demanding applications due to their parallel processing capability. FPGAs also tend to operate at relatively modest clock rates measured in a few hundreds of megahertz, but they can perform tens of thousands of calculations per clock cycle while operating in the low range of power. This paper presents a technique for the fast computation of the semivariogram using two custom FPGA architectures. The design consists of several modules dedicated to the constituent computational tasks. A modular architecture approach is chosen to allow for replication of processing units. This allows for high throughput due to concurrent processing of pixel pairs. The current implementation is focused on isotropic semivariogram computations only. Anisotropic semivariogram implementation is anticipated to be an extension of the current architecture, ostensibly based on refinements to the current modules. The algorithm is benchmarked using VHDL on a Xilinx XUPV5-LX110T development Kit, which utilizes the Virtex5 FPGA. Medical image data from MRI scans are utilized for the experiments. Computational speedup is measured with respect to Matlab implementation on a personal computer with an Intel i7 multi-core processor. Preliminary simulation results indicate that a significant advantage in speed can be attained by the architectures, making the algorithm viable for implementation in medical devices
2D to 3D conversion implemented in different hardware
Eduardo Ramos-Diaz, Victor Gonzalez-Huitron, Volodymyr I. Ponomaryov, et al.
Conversion of available 2D data for release in 3D content is a hot topic for providers and for success of the 3D applications, in general. It naturally completely relies on virtual view synthesis of a second view given by original 2D video. Disparity map (DM) estimation is a central task in 3D generation but still follows a very difficult problem for rendering novel images precisely. There exist different approaches in DM reconstruction, among them manually and semiautomatic methods that can produce high quality DMs but they demonstrate hard time consuming and are computationally expensive. In this paper, several hardware implementations of designed frameworks for an automatic 3D color video generation based on 2D real video sequence are proposed. The novel framework includes simultaneous processing of stereo pairs using the following blocks: CIE L*a*b* color space conversions, stereo matching via pyramidal scheme, color segmentation by k-means on an a*b* color plane, and adaptive post-filtering, DM estimation using stereo matching between left and right images (or neighboring frames in a video), adaptive post-filtering, and finally, the anaglyph 3D scene generation. Novel technique has been implemented on DSP TMS320DM648, Matlab’s Simulink module over a PC with Windows 7, and using graphic card (NVIDIA Quadro K2000) demonstrating that the proposed approach can be applied in real-time processing mode. The time values needed, mean Similarity Structural Index Measure (SSIM) and Bad Matching Pixels (B) values for different hardware implementations (GPU, Single CPU, and DSP) are exposed in this paper.
A real-time GPU implementation of the SIFT algorithm for large-scale video analysis tasks
Hannes Fassold, Jakub Rosner
The SIFT algorithm is one of the most popular feature extraction methods and therefore widely used in all sort of video analysis tasks like instance search and duplicate/ near-duplicate detection. We present an efficient GPU implementation of the SIFT descriptor extraction algorithm using CUDA. The major steps of the algorithm are presented and for each step we describe how to efficiently parallelize it massively, how to take advantage of the unique capabilities of the GPU like shared memory / texture memory and how to avoid or minimize common GPU performance pitfalls. We compare the GPU implementation with the reference CPU implementation in terms of runtime and quality and achieve a speedup factor of approximately 3 - 5 for SD and 5 - 6 for Full HD video with respect to a multi-threaded CPU implementation, allowing us to run the SIFT descriptor extraction algorithm in real-time on SD video. Furthermore, quality tests show that the GPU implementation gives the same quality as the reference CPU implementation from the HessSIFT library. We further describe the benefits of GPU-accelerated SIFT descriptor calculation for video analysis applications such as near-duplicate video detection.
Real-Time Algorithms I
icon_mobile_dropdown
Real-time deblurring of handshake blurred images on smartphones
Reza Pourreza-Shahri, Chih-Hsiang Chang, Nasser Kehtarnavaz
This paper discusses an Android app for the purpose of removing blur that is introduced as a result of handshakes when taking images via a smartphone. This algorithm utilizes two images to achieve deblurring in a computationally efficient manner without suffering from artifacts associated with deconvolution deblurring algorithms. The first image is the normal or auto-exposure image and the second image is a short-exposure image that is automatically captured immediately before or after the auto-exposure image is taken. A low rank approximation image is obtained by applying singular value decomposition to the auto-exposure image which may appear blurred due to handshakes. This approximation image does not suffer from blurring while incorporating the image brightness and contrast information. The eigenvalues extracted from the low rank approximation image are then combined with those from the shortexposure image. It is shown that this deblurring app is computationally more efficient than the adaptive tonal correction algorithm which was previously developed for the same purpose.
Real-time object tracking for moving target auto-focus in digital camera
Haike Guan, Norikatsu Niinami, Tong Liu
Focusing at a moving object accurately is difficult and important to take photo of the target successfully in a digital camera. Because the object often moves randomly and changes its shape frequently, position and distance of the target should be estimated at real-time so as to focus at the objet precisely. We propose a new method of real-time object tracking to do auto-focus for moving target in digital camera. Video stream in the camera is used for the moving target tracking. Particle filter is used to deal with problem of the target object’s random movement and shape change. Color and edge features are used as measurement of the object’s states. Parallel processing algorithm is developed to realize real-time particle filter object tracking easily in hardware environment of the digital camera. Movement prediction algorithm is also proposed to remove focus error caused by difference between tracking result and target object’s real position when the photo is taken. Simulation and experiment results in digital camera demonstrate effectiveness of the proposed method. We embedded real-time object tracking algorithm in the digital camera. Position and distance of the moving target is obtained accurately by object tracking from the video stream. SIMD processor is applied to enforce parallel real-time processing. Processing time less than 60ms for each frame is obtained in the digital camera with its CPU of only 162MHz.
Embedded wavelet-based face recognition under variable position
Pascal Cotret, Stéphane Chevobbe, Mehdi Darouich
For several years, face recognition has been a hot topic in the image processing field: this technique is applied in several domains such as CCTV, electronic devices delocking and so on. In this context, this work studies the efficiency of a wavelet-based face recognition method in terms of subject position robustness and performance on various systems. The use of wavelet transform has a limited impact on the position robustness of PCA-based face recognition. This work shows, for a well-known database (Yale face database B*), that subject position in a 3D space can vary up to 10% of the original ROI size without decreasing recognition rates. Face recognition is performed on approximation coefficients of the image wavelet transform: results are still satisfying after 3 levels of decomposition. Furthermore, face database size can be divided by a factor 64 (22K with K = 3). In the context of ultra-embedded vision systems, memory footprint is one of the key points to be addressed; that is the reason why compression techniques such as wavelet transform are interesting. Furthermore, it leads to a low-complexity face detection stage compliant with limited computation resources available on such systems. The approach described in this work is tested on three platforms from a standard x86-based computer towards nanocomputers such as RaspberryPi and SECO boards. For K = 3 and a database with 40 faces, the execution mean time for one frame is 0.64 ms on a x86-based computer, 9 ms on a SECO board and 26 ms on a RaspberryPi (B model).
Subjective evaluation of H.265/HEVC based dynamic adaptive video streaming over HTTP (HEVC-DASH)
The Dynamic Adaptive Streaming over HTTP (DASH) standard is becoming increasingly popular for real-time adaptive HTTP streaming of internet video in response to unstable network conditions. Integration of DASH streaming techniques with the new H.265/HEVC video coding standard is a promising area of research. The performance of HEVC-DASH systems has been previously evaluated by a few researchers using objective metrics, however subjective evaluation would provide a better measure of the user’s Quality of Experience (QoE) and overall performance of the system. This paper presents a subjective evaluation of an HEVC-DASH system implemented in a hardware testbed. Previous studies in this area have focused on using the current H.264/AVC (Advanced Video Coding) or H.264/SVC (Scalable Video Coding) codecs and moreover, there has been no established standard test procedure for the subjective evaluation of DASH adaptive streaming. In this paper, we define a test plan for HEVC-DASH with a carefully justified data set employing longer video sequences that would be sufficient to demonstrate the bitrate switching operations in response to various network condition patterns. We evaluate the end user’s real-time QoE online by investigating the perceived impact of delay, different packet loss rates, fluctuating bandwidth, and the perceived quality of using different DASH video stream segment sizes on a video streaming session using different video sequences. The Mean Opinion Score (MOS) results give an insight into the performance of the system and expectation of the users. The results from this study show the impact of different network impairments and different video segments on users’ QoE and further analysis and study may help in optimizing system performance.
Real-Time Algorithms II
icon_mobile_dropdown
FIR filters for hardware-based real-time multi-band image blending
Vladan Popovic, Yusuf Leblebici
Creating panoramic images has become a popular feature in modern smart phones, tablets, and digital cameras. A user can create a 360 degree field-of-view photograph from only several images. Quality of the resulting image is related to the number of source images, their brightness, and the used algorithm for their stitching and blending. One of the algorithms that provides excellent results in terms of background color uniformity and reduction of ghosting artifacts is the multi-band blending. The algorithm relies on decomposition of image into multiple frequency bands using dyadic filter bank. Hence, the results are also highly dependant on the used filter bank. In this paper we analyze performance of the FIR filters used for multi-band blending. We present a set of five filters that showed the best results in both literature and our experiments. The set includes Gaussian filter, biorthogonal wavelets, and custom-designed maximally flat and equiripple FIR filters. The presented results of filter comparison are based on several no-reference metrics for image quality. We conclude that 5/3 biorthogonal wavelet produces the best result in average, especially when its short length is considered. Furthermore, we propose a real-time FPGA implementation of the blending algorithm, using 2D non-separable systolic filtering scheme. Its pipeline architecture does not require hardware multipliers and it is able to achieve very high operating frequencies. The implemented system is able to process 91 fps for 1080p (1920×1080) image resolution.
Iris unwrapping using the Bresenham circle algorithm for real-time iris recognition
An efficient parallel architecture design for the iris unwrapping process in a real-time iris recognition system using the Bresenham Circle Algorithm is presented in this paper. Based on the characteristics of the model parameters this algorithm was chosen over the widely used polar conversion technique as the iris unwrapping model. The architecture design is parallelized to increase the throughput of the system and is suitable for processing an inputted image size of 320 × 240 pixels in real-time using Field Programmable Gate Array (FPGA) technology. Quartus software is used to implement, verify, and analyze the design’s performance using the VHSIC Hardware Description Language. The system’s predicted processing time is faster than the modern iris unwrapping technique used today∗.
Interactive Paper Session
icon_mobile_dropdown
Efficient fast thumbnail extraction algorithm for HEVC
Wonjin Lee, Gwanggil Jeon, Jechang Jeong
In this paper, we proposed a fast thumbnail extraction algorithm using partial decoding for HEVC. The proposed algorithm reconstructs only 4x4 boundary and TU boundary needed for thumbnail image with partial decoding. Experimental results show that proposed method significantly reduces the computational complexity and extraction time for thumbnail. In addition, the visual quality of thumbnail image of proposed method had no big difference compared to conventional method.
Parallel hybrid algorithm for solution in electrical impedance equation
Volodymyr Ponomaryov, Marco Robles-Gonzalez, Ariana Bucio-Ramirez, et al.
This work is dedicated to the analysis of the forward and the inverse problem to obtain a better approximation to the Electrical Impedance Tomography equation. In this case, we employ for the forward problem the numerical method based on the Taylor series in formal power and for the inverse problem the Finite Element Method. For the analysis of the forward problem, we proposed a novel algorithm, which employs a regularization technique for the stability, additionally the parallel computing is used to obtain the solution faster; this modification permits to obtain an efficient solution of the forward problem. Then, the found solution is used in the inverse problem for the approximation employing the Finite Element Method. The algorithms employed in this work are developed in structural programming paradigm in C++, including parallel processing; the time run analysis is performed only in the forward problem because the Finite Element Method due to their high recursive does not accept parallelism. Some examples are performed for this analysis, in which several conductivity functions are employed for two different cases: for the analytical cases: the exponential and sinusoidal functions are used, and for the geometrical cases the circle at center and five disk structure are revised as conductivity functions. The Lebesgue measure is used as metric for error estimation in the forward problem, meanwhile, in the inverse problem PSNR, SSIM, MSE criteria are applied, to determine the convergence of both methods.
Fast-coding robust motion estimation model in a GPU
Carlos García, Guillermo Botella, Francisco de Sande, et al.
Nowadays vision systems are used with countless purposes. Moreover, the motion estimation is a discipline that allow to extract relevant information as pattern segmentation, 3D structure or tracking objects. However, the real-time requirements in most applications has limited its consolidation, considering the adoption of high performance systems to meet response times. With the emergence of so-called highly parallel devices known as accelerators this gap has narrowed. Two extreme endpoints in the spectrum of most common accelerators are Field Programmable Gate Array (FPGA) and Graphics Processing Systems (GPU), which usually offer higher performance rates than general propose processors. Moreover, the use of GPUs as accelerators involves the efficient exploitation of any parallelism in the target application. This task is not easy because performance rates are affected by many aspects that programmers should overcome. In this paper, we evaluate OpenACC standard, a programming model with directives which favors porting any code to a GPU in the context of motion estimation application. The results confirm that this programming paradigm is suitable for this image processing applications achieving a very satisfactory acceleration in convolution based problems as in the well-known Lucas & Kanade method.
Real-time single-exposure ROI-driven HDR adaptation based on focal-plane reconfiguration
J. Fernández-Berni, R. Carmona-Galán, R. del Río, et al.
This paper describes a prototype smart imager capable of adjusting the photo-integration time of multiple regions of interest concurrently, automatically and asynchronously with a single exposure period. The operation is supported by two intertwined photo-diodes at pixel level and two digital registers at the periphery of the pixel matrix. These registers divide the focal-plane into independent regions within which automatic concurrent adjustment of the integration time takes place. At pixel level, one of the photo-diodes senses the pixel value itself whereas the other, in collaboration with its counterparts in a particular ROI, senses the mean illumination of that ROI. Additional circuitry interconnecting both photo-diodes enables the asynchronous adjustment of the integration time for each ROI according to this sensed illumination. The sensor can be reconfigured on-the-fly according to the requirements of a vision algorithm.
Task-oriented quality assessment and adaptation in real-time mission critical video streaming applications
In recent years video traffic has become the dominant application on the Internet with global year-on-year increases in video-oriented consumer services. Driven by improved bandwidth in both mobile and fixed networks, steadily reducing hardware costs and the development of new technologies, many existing and new classes of commercial and industrial video applications are now being upgraded or emerging. Some of the use cases for these applications include areas such as public and private security monitoring for loss prevention or intruder detection, industrial process monitoring and critical infrastructure monitoring. The use of video is becoming commonplace in defence, security, commercial, industrial, educational and health contexts. Towards optimal performances, the design or optimisation in each of these applications should be context aware and task oriented with the characteristics of the video stream (frame rate, spatial resolution, bandwidth etc.) chosen to match the use case requirements. For example, in the security domain, a task-oriented consideration may be that higher resolution video would be required to identify an intruder than to simply detect his presence. Whilst in the same case, contextual factors such as the requirement to transmit over a resource-limited wireless link, may impose constraints on the selection of optimum task-oriented parameters. This paper presents a novel, conceptually simple and easily implemented method of assessing video quality relative to its suitability for a particular task and dynamically adapting videos streams during transmission to ensure that the task can be successfully completed. Firstly we defined two principle classes of tasks: recognition tasks and event detection tasks. These task classes are further subdivided into a set of task-related profiles, each of which is associated with a set of taskoriented attributes (minimum spatial resolution, minimum frame rate etc.). For example, in the detection class, profiles for intruder detection will require different temporal characteristics (frame rate) from those used for detection of high motion objects such as vehicles or aircrafts. We also define a set of contextual attributes that are associated with each instance of a running application that include resource constraints imposed by the transmission system employed and the hardware platforms used as source and destination of the video stream. Empirical results are presented and analysed to demonstrate the advantages of the proposed schemes.
A simulator tool set for evaluating HEVC/SHVC streaming
Tawfik Al Hadhrami, James Nightingale, Qi Wang, et al.
Video streaming and other multimedia applications account for an ever increasing proportion of all network traffic. The recent adoption of High Efficiency Video Coding (HEVC) as the H.265 standard provides many opportunities for new and improved services multimedia services and applications in the consumer domain. Since the delivery of version one of H.265, the Joint Collaborative Team on Video Coding have been working towards standardisation of a scalable extension (SHVC) to the H.265 standard and a series of range extensions and new profiles. As these enhancements are added to the standard the range of potential applications and research opportunities will expend. For example the use of video is also growing rapidly in other sectors such as safety, security, defence and health with real-time high quality video transmission playing an important role in areas like critical infrastructure monitoring and disaster management. Each of which may benefit from the application of enhanced HEVC/H.265 and SHVC capabilities. The majority of existing research into HEVC/H.265 transmission has focussed on the consumer domain addressing issues such as broadcast transmission and delivery to mobile devices with the lack of freely available tools widely cited as an obstacle to conducting this type of research. In this paper we present a toolset which facilitates the transmission and evaluation of HEVC/H.265 and SHVC encoded video on the popular open source NCTUns simulator. Our toolset provides researchers with a modular, easy to use platform for evaluating video transmission and adaptation proposals on large scale wired, wireless and hybrid architectures. The toolset consists of pre-processing, transmission, SHVC adaptation and post-processing tools to gather and analyse statistics. It has been implemented using HM15 and SHM5, the latest versions of the HEVC and SHVC reference software implementations to ensure that currently adopted proposals for scalable and range extensions to the standard can be investigated. We demonstrate the effectiveness and usability of our toolset by evaluating SHVC streaming and adaptation to meet terminal constraints and network conditions in a range of wired, wireless, and large scale wireless mesh network scenarios, each of which is designed to simulate a realistic environment. Our results are compared to those for H264/SVC, the scalable extension to the existing H.264/AVC advanced video coding standard.
Dynamic resource allocation engine for cloud-based real-time video transcoding in mobile cloud computing environments
Bada Adedayo, Qi Wang, Jose M. Alcaraz Calero, et al.
The recent explosion in video-related Internet traffic has been driven by the widespread use of smart mobile devices, particularly smartphones with advanced cameras that are able to record high-quality videos. Although many of these devices offer the facility to record videos at different spatial and temporal resolutions, primarily with local storage considerations in mind, most users only ever use the highest quality settings. The vast majority of these devices are optimised for compressing the acquired video using a single built-in codec and have neither the computational resources nor battery reserves to transcode the video to alternative formats. This paper proposes a new low-complexity dynamic resource allocation engine for cloud-based video transcoding services that are both scalable and capable of being delivered in real-time. Firstly, through extensive experimentation, we establish resource requirement benchmarks for a wide range of transcoding tasks. The set of tasks investigated covers the most widely used input formats (encoder type, resolution, amount of motion and frame rate) associated with mobile devices and the most popular output formats derived from a comprehensive set of use cases, e.g. a mobile news reporter directly transmitting videos to the TV audience of various video format requirements, with minimal usage of resources both at the reporter’s end and at the cloud infrastructure end for transcoding services.
Impact of different cloud deployments on real-time video applications for mobile video cloud users
Kashif A. Khan, Qi Wang, Chunbo Luo, et al.
The latest trend to access mobile cloud services through wireless network connectivity has amplified globally among both entrepreneurs and home end users. Although existing public cloud service vendors such as Google, Microsoft Azure etc. are providing on-demand cloud services with affordable cost for mobile users, there are still a number of challenges to achieve high-quality mobile cloud based video applications, especially due to the bandwidth-constrained and errorprone mobile network connectivity, which is the communication bottleneck for end-to-end video delivery. In addition, existing accessible clouds networking architectures are different in term of their implementation, services, resources, storage, pricing, support and so on, and these differences have varied impact on the performance of cloud-based real-time video applications. Nevertheless, these challenges and impacts have not been thoroughly investigated in the literature. In our previous work, we have implemented a mobile cloud network model that integrates localized and decentralized cloudlets (mini-clouds) and wireless mesh networks. In this paper, we deploy a real-time framework consisting of various existing Internet cloud networking architectures (Google Cloud, Microsoft Azure and Eucalyptus Cloud) and a cloudlet based on Ubuntu Enterprise Cloud over wireless mesh networking technology for mobile cloud end users. It is noted that the increasing trend to access real-time video streaming over HTTP/HTTPS is gaining popularity among both research and industrial communities to leverage the existing web services and HTTP infrastructure in the Internet. To study the performance under different deployments using different public and private cloud service providers, we employ real-time video streaming over the HTTP/HTTPS standard, and conduct experimental evaluation and in-depth comparative analysis of the impact of different deployments on the quality of service for mobile video cloud users. Empirical results are presented and discussed to quantify and explain the different impacts resulted from various cloud deployments, video application and wireless/mobile network setting, and user mobility. Additionally, this paper analyses the advantages, disadvantages, limitations and optimization techniques in various cloud networking deployments, in particular the cloudlet approach compared with the Internet cloud approach, with recommendations of optimized deployments highlighted. Finally, federated clouds and inter-cloud collaboration challenges and opportunities are discussed in the context of supporting real-time video applications for mobile users.
Improving wavelet denoising based on an in-depth analysis of the camera color processing
Tamara Seybold, Mathias Plichta, Walter Stechele
While Denoising is an extensively studied task in signal processing research, most denoising methods are designed and evaluated using readily processed image data, e.g. the well-known Kodak data set. The noise model is usually additive white Gaussian noise (AWGN). This kind of test data does not correspond to nowadays real-world image data taken with a digital camera. Using such unrealistic data to test, optimize and compare denoising algorithms may lead to incorrect parameter tuning or suboptimal choices in research on real-time camera denoising algorithms. In this paper we derive a precise analysis of the noise characteristics for the different steps in the color processing. Based on real camera noise measurements and simulation of the processing steps, we obtain a good approximation for the noise characteristics. We further show how this approximation can be used in standard wavelet denoising methods. We improve the wavelet hard thresholding and bivariate thresholding based on our noise analysis results. Both the visual quality and objective quality metrics show the advantage of the proposed method. As the method is implemented using look-up-tables that are calculated before the denoising step, our method can be implemented with very low computational complexity and can process HD video sequences real-time in an FPGA.
Impulsive noise suppression in color images based on the geodesic digital paths
In the paper a novel filtering design based on the concept of exploration of the pixel neighborhood by digital paths is presented. The paths start from the boundary of a filtering window and reach its center. The cost of transitions between adjacent pixels is defined in the hybrid spatial-color space. Then, an optimal path of minimum total cost, leading from pixels of the window's boundary to its center is determined. The cost of an optimal path serves as a degree of similarity of the central pixel to the samples from the local processing window. If a pixel is an outlier, then all the paths starting from the window's boundary will have high costs and the minimum one will also be high. The filter output is calculated as a weighted mean of the central pixel and an estimate constructed using the information on the minimum cost assigned to each image pixel. So, first the costs of optimal paths are used to build a smoothed image and in the second step the minimum cost of the central pixel is utilized for construction of the weights of a soft-switching scheme. The experiments performed on a set of standard color images, revealed that the efficiency of the proposed algorithm is superior to the state-of-the-art filtering techniques in terms of the objective restoration quality measures, especially for high noise contamination ratios. The proposed filter, due to its low computational complexity, can be applied for real time image denoising and also for the enhancement of video streams.
Optimal camera exposure for video surveillance systems by predictive control of shutter speed, aperture, and gain
Juan Torres, José Manuel Menéndez
This paper establishes a real-time auto-exposure method to guarantee that surveillance cameras in uncontrolled light conditions take advantage of their whole dynamic range while provide neither under nor overexposed images. State-of-the-art auto-exposure methods base their control on the brightness of the image measured in a limited region where the foreground objects are mostly located. Unlike these methods, the proposed algorithm establishes a set of indicators based on the image histogram that defines its shape and position. Furthermore, the location of the objects to be inspected is likely unknown in surveillance applications. Thus, the whole image is monitored in this approach. To control the camera settings, we defined a parameters function (Ef ) that linearly depends on the shutter speed and the electronic gain; and is inversely proportional to the square of the lens aperture diameter. When the current acquired image is not overexposed, our algorithm computes the value of Ef that would move the histogram to the maximum value that does not overexpose the capture. When the current acquired image is overexposed, it computes the value of Ef that would move the histogram to a value that does not underexpose the capture and remains close to the overexposed region. If the image is under and overexposed, the whole dynamic range of the camera is therefore used, and a default value of the Ef that does not overexpose the capture is selected. This decision follows the idea that to get underexposed images is better than to get overexposed ones, because the noise produced in the lower regions of the histogram can be removed in a post-processing step while the saturated pixels of the higher regions cannot be recovered. The proposed algorithm was tested in a video surveillance camera placed at an outdoor parking lot surrounded by buildings and trees which produce moving shadows in the ground. During the daytime of seven days, the algorithm was running alternatively together with a representative auto-exposure algorithm in the recent literature. Besides the sunrises and the nightfalls, multiple weather conditions occurred which produced light changes in the scene: sunny hours that produced sharpen shadows and highlights; cloud coverages that softened the shadows; and cloudy and rainy hours that dimmed the scene. Several indicators were used to measure the performance of the algorithms. They provided the objective quality as regards: the time that the algorithms recover from an under or over exposure, the brightness stability, and the change related to the optimal exposure. The results demonstrated that our algorithm reacts faster to all the light changes than the selected state-of-the-art algorithm. It is also capable of acquiring well exposed images and maintaining the brightness stable during more time. Summing up the results, we concluded that the proposed algorithm provides a fast and stable auto-exposure method that maintains an optimal exposure for video surveillance applications. Future work will involve the evaluation of this algorithm in robotics.
Real-time object recognition in multidimensional images based on joined extended structural tensor and higher-order tensor decomposition methods
In this paper a system for real-time recognition of objects in multidimensional video signals is proposed. Object recognition is done by pattern projection into the tensor subspaces obtained from the factorization of the signal tensors representing the input signal. However, instead of taking only the intensity signal the novelty of this paper is first to build the Extended Structural Tensor representation from the intensity signal that conveys information on signal intensities, as well as on higher-order statistics of the input signals. This way the higher-order input pattern tensors are built from the training samples. Then, the tensor subspaces are built based on the Higher-Order Singular Value Decomposition of the prototype pattern tensors. Finally, recognition relies on measurements of the distance of a test pattern projected into the tensor subspaces obtained from the training tensors. Due to high-dimensionality of the input data, tensor based methods require high memory and computational resources. However, recent achievements in the technology of the multi-core microprocessors and graphic cards allows real-time operation of the multidimensional methods as is shown and analyzed in this paper based on real examples of object detection in digital images.
Near real-time operation of public image database for ground vehicle navigation
E. Ali, S. P. Kozaitis
An effective color night vision system for ground vehicle navigation should operate in near real-time to be practical. We described a system that uses a public database as a source of color information to colorize night vision imagery. Such an approach presents several problems due to differences between acquired and reference imagery. Our system performed registration, colorizing, and reference updating in near real-time in an effort to help drivers of ground vehicles during night to see a colored view of a scene.