Proceedings Volume 7244

Real-Time Image and Video Processing 2009


View the digital version of this volume at SPIE Digital Library.

Volume Details

Date Published: 4 February 2009
Contents: 6 Sessions, 25 Papers, 0 Presentations
Conference: IS&T/SPIE Electronic Imaging 2009
Volume Number: 7244

Table of Contents

All links to SPIE Proceedings will open in the SPIE Digital Library.

Sessions:
  • Front Matter: Volume 7244
  • Real-Time Hardware
  • Real-Time Camera Systems
  • Real-Time Video Processing
  • Real-Time Algorithms
  • Interactive Paper Session
Front Matter: Volume 7244
Front Matter: Volume 7244
This PDF file contains the front matter associated with SPIE Proceedings Volume 7244, including the Title Page, Copyright information, Table of Contents, and the Conference Committee listing.
Real-Time Hardware
Iris matching with configurable hardware
Iris recognition systems have recently become an attractive identification method because of their extremely high accuracy. Most modern iris recognition systems are currently deployed on traditional sequential digital systems, such as a computer. However, modern advancements in configurable hardware, most notably Field-Programmable Gate Arrays (FPGAs), have provided an exciting opportunity to exploit the parallel nature of modern image processing algorithms. In this study, iris matching, a repeatedly executed portion of a modern iris recognition algorithm, is parallelized on an FPGA system. We demonstrate a 19× speedup of the parallelized algorithm on the FPGA system when compared to a state-of-the-art CPU-based version.
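In most modern iris recognition pipelines (e.g., Daugman-style systems) the matching kernel is a masked Hamming-distance comparison of binary iris codes, and every bit comparison is independent, which is what makes the kernel so amenable to FPGA parallelization. The abstract does not spell out the matching algorithm, so the following minimal software sketch of a masked Hamming-distance match is an illustrative assumption:

```python
import numpy as np

def iris_hamming_distance(code_a, code_b, mask_a, mask_b):
    """Masked Hamming distance between two binary iris codes.

    Every bit comparison is independent, which is what makes this
    kernel attractive for parallelization on an FPGA.
    """
    valid = mask_a & mask_b                # bits usable in both codes
    disagree = (code_a ^ code_b) & valid   # XOR marks disagreeing bits
    n_valid = np.count_nonzero(valid)
    if n_valid == 0:
        return 1.0                         # no usable bits: worst score
    return np.count_nonzero(disagree) / n_valid

# Example: two random 2048-bit codes (a common iris-code size).
rng = np.random.default_rng(0)
a = rng.integers(0, 2, 2048, dtype=np.uint8)
b = rng.integers(0, 2, 2048, dtype=np.uint8)
m = np.ones(2048, dtype=np.uint8)
print(iris_hamming_distance(a, b, m, m))   # ~0.5 for unrelated codes
```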
Image segmentation based upon topological operators: real-time implementation case study
Thinning and crest restoration are of considerable interest in many image processing applications. The preferred algorithms for these operations are those that act directly on grayscale images while preserving topology, but their high computational cost remains the major obstacle to their adoption. In this paper we present an efficient implementation, on a RISC processor, of two powerful thinning and crest-restoration algorithms developed by our team. The proposed implementation improves execution time. A segmentation chain applied to medical imaging serves as a concrete example to illustrate the improvements obtained through optimization techniques at both the algorithmic and architectural levels. In particular, using the SSE instruction set of x86_32 processors (Pentium IV, 3.06 GHz) yields the performance needed for real-time processing: a rate of 33 images (512×512) per second is achieved.
Real-time embedded atmospheric compensation for long-range imaging using the average bispectrum speckle method
While imaging over long distances is critical to a number of security and defense applications, such as homeland security and launch tracking, current optical systems are limited in resolving power. This is largely a result of the turbulent atmosphere in the path between the region under observation and the imaging system, which can severely degrade captured imagery. A variety of post-processing techniques are capable of recovering this obscured image information; however, the computational complexity of such approaches has prohibited real-time deployment and hampers the usability of these technologies in many scenarios. To overcome this limitation, we have designed and manufactured an embedded image processing system, based on commodity hardware, that can compensate for these atmospheric disturbances in real time. Our system couples a reformulation of the average bispectrum speckle method with a high-end FPGA processing board, and employs modular I/O capable of interfacing with most common digital and analog video transport methods (composite, component, VGA, DVI, SDI, HD-SDI, etc.). By leveraging the custom, reconfigurable nature of the FPGA, we have achieved performance twenty times faster than a modern desktop PC, in a form factor that is compact, low-power, and field-deployable.
Grayscale image segmentation for real-time traffic sign recognition: the hardware point of view
Tam P. Cao, Guang Deng, Darrell Elton
In this paper, we study several grayscale-based image segmentation methods for real-time road sign recognition applications on an FPGA hardware platform. The performance of different image segmentation algorithms under different lighting conditions is initially compared using PC simulation. Based on these results and analysis, suitable algorithms are implemented and tested on a real-time FPGA speed sign detection system. Experimental results show that the system using segmented images consumes significantly fewer hardware resources on an FPGA while maintaining comparable system performance. The system is capable of processing 60 live video frames per second.
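The abstract does not list the segmentation methods compared, so the sketch below is hypothetical: it contrasts two typical grayscale candidates such a study would weigh, a global threshold (one comparator per pixel, the cheapest in hardware) against a locally adaptive threshold (more robust to uneven lighting, but requiring block statistics logic on an FPGA):

```python
import numpy as np

def global_threshold(img, t=128):
    """Cheapest option in hardware: one comparator per pixel."""
    return (img >= t).astype(np.uint8)

def adaptive_threshold(img, block=16, offset=10):
    """Mean-based local threshold: more robust to uneven lighting,
    but needs block averaging / line-buffer logic on an FPGA."""
    h, w = img.shape
    out = np.zeros_like(img, dtype=np.uint8)
    for y in range(0, h, block):
        for x in range(0, w, block):
            tile = img[y:y+block, x:x+block]
            out[y:y+block, x:x+block] = (tile >= tile.mean() - offset)
    return out
```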
A comparison between DSP and FPGA platforms for real-time imaging applications
Mukul Shirvaikar, Tariq Bushnaq
Real-time applications impose serious demands on hardware size, time deadlines, power dissipation, and cost of the solution. A typical system may also require modification of parameters during operation. Digital Signal Processors (DSPs) are a special class of microprocessors designed specifically to address real-time implementation issues. As the complexity of real-time systems increases, the need for more efficient hardware platforms grows. In recent years, Field Programmable Gate Arrays (FPGAs) have gained a lot of traction in the real-time community as a replacement for traditional DSP solutions. FPGAs are indeed revolutionizing image and signal processing thanks to advanced capabilities such as reconfigurability. The Discrete Wavelet Transform is a classic real-time imaging algorithm that has drawn increasing attention from engineers in recent years. In this paper, we compare an FPGA implementation of the 2-D lifting-based wavelet transform using optimized, hand-written VHDL code with a DSP implementation of the same algorithm written in C. The goal of this paper is to compare the development effort and the performance of a traditional DSP processor against an FPGA-based implementation of a real-time imaging application. The results of the experiment prove the superiority of FPGAs over traditional DSP processors in terms of execution time, power dissipation, and hardware utilization; nevertheless, this advantage comes at the cost of a higher development effort. The hardware platforms used are an Altera DE2 board with a 50 MHz Cyclone II FPGA chip and a TI TMS320C6416 DSP Starter Kit (DSK).
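For reference, lifting factors each 1-D wavelet step into a predict pass and an update pass that run in place, which is why it suits both compact VHDL pipelines and C code on a DSP. The paper does not name the wavelet, so this sketch assumes the common LeGall 5/3 integer filter (as in JPEG 2000 reversible coding) and even image dimensions:

```python
import numpy as np

def dwt53_1d(x):
    """One level of the LeGall 5/3 integer lifting transform (even-length input).

    Two passes (predict, then update) with symmetric border extension.
    Returns (approximation, detail) coefficients.
    """
    x = np.asarray(x, dtype=np.int64)
    even, odd = x[0::2].copy(), x[1::2].copy()
    # Predict: detail = odd - floor((left even + right even) / 2)
    right = np.append(even[1:], even[-1])          # mirror at the end
    d = odd - ((even + right) >> 1)
    # Update: approx = even + floor((left detail + detail + 2) / 4)
    left = np.concatenate(([d[0]], d[:-1]))        # mirror at the start
    s = even + ((left + d + 2) >> 2)
    return s, d

def dwt53_2d(img):
    """Separable 2-D transform: rows first, then columns of each subband."""
    rows = [dwt53_1d(r) for r in img]
    lo = np.array([s for s, _ in rows])
    hi = np.array([d for _, d in rows])
    ll, lh = map(np.array, zip(*[dwt53_1d(c) for c in lo.T]))
    hl, hh = map(np.array, zip(*[dwt53_1d(c) for c in hi.T]))
    return ll.T, lh.T, hl.T, hh.T
```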
Hardware architecture to accelerate the belief propagation algorithm for a Wyner-Ziv decoder
Wyner-Ziv based video codecs reverse the processing complexity between encoders and decoders such that the complexity of the encoder can be significantly reduced at the expense of highly complex decoders, which require hardware accelerators to achieve real-time performance. In this paper we describe a flexible hardware architecture for processing the Belief Propagation algorithm in a real-time Wyner-Ziv video decoder for several hundred very large Low Density Parity Check (LDPC) codes. The proposed architecture features a hierarchical memory structure that provides a caching capability to overcome the high memory bandwidths needed to supply data to the processors. By taking advantage of the deterministic nature of LDPC codes to increase cache utilization, we are able to substantially reduce the size of the expensive, high-speed memory needed to support the processing of large codes compared to designs implementing a single-layer memory structure.
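As a rough illustration of the computation the accelerator targets, below is a minimal software sketch of a flooding-schedule min-sum variant of belief propagation (an assumption; the paper's schedule, quantization, and code sizes are not given in the abstract). The per-edge message arrays it carries around are exactly the traffic the proposed cache hierarchy has to absorb:

```python
import numpy as np

def minsum_decode(H, llr, iters=20):
    """Flooding-schedule min-sum belief propagation for a binary LDPC code.

    H   : (m, n) parity-check matrix with 0/1 entries
    llr : channel log-likelihood ratios (> 0 means the bit is likely 0)
    """
    m, n = H.shape
    edges = H.astype(bool)
    rows = np.arange(m)
    v2c = np.where(edges, llr, 0.0)            # variable-to-check messages
    hard = np.zeros(n, dtype=np.uint8)
    for _ in range(iters):
        # Check-node update: per-row sign product plus the two smallest
        # magnitudes determine every outgoing (extrinsic) message.
        mag = np.where(edges, np.abs(v2c), np.inf)
        sgn = np.where(v2c < 0, -1.0, 1.0)
        row_sgn = np.prod(np.where(edges, sgn, 1.0), axis=1, keepdims=True)
        first = mag.argmin(axis=1)             # position of the row minimum
        min1 = mag[rows, first][:, None]
        mag[rows, first] = np.inf
        min2 = mag.min(axis=1, keepdims=True)
        extr = np.tile(min1, (1, n))           # extrinsic minimum per edge
        extr[rows, first] = min2[:, 0]
        c2v = np.where(edges, row_sgn * sgn * extr, 0.0)
        # Variable-node update and tentative hard decision.
        total = llr + c2v.sum(axis=0)
        v2c = np.where(edges, total - c2v, 0.0)
        hard = (total < 0).astype(np.uint8)
        if not np.any((H @ hard) % 2):         # all parity checks satisfied
            break
    return hard
```

Per iteration every edge of H is read and written in both directions; for several hundred very large codes, that message traffic is the memory bandwidth the paper's hierarchical memory is designed to supply.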
Real-Time Camera Systems
Real-time implementation of single-shot passive auto focus on DM350 digital camera processor
With the introduction of high-megapixel image sensors and long focal length lenses in today's consumer-level digital still cameras, single-shot passive auto-focus (AF) performance, in terms of speed and accuracy, remains a critical issue among camera manufacturers. To address the AF performance issue, this paper covers the real-time implementation of a previously developed modified rule-based single-shot AF search method on the Texas Instruments TMS320DM350 processor. It is shown that a balance between AF speed and accuracy is needed to meet the real-time constraints of the digital camera system. Performance results indicate that this solution outperforms the standard global search method in terms of both AF speed and accuracy.
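The modified rule-based search itself is not reproduced in the abstract, but any passive AF scheme combines a sharpness metric over a focus window with a search policy over lens positions. A hedged sketch assuming a gradient-energy metric and a simple coarse-then-fine search, with capture_at as a hypothetical camera hook:

```python
import numpy as np

def sharpness(img):
    """Gradient-energy focus value over the AF window; peaks at best focus."""
    gy, gx = np.gradient(img.astype(np.float64))
    return float(np.sum(gx * gx + gy * gy))

def autofocus(capture_at, n_positions, coarse_step=8):
    """Coarse scan, then a fine scan around the best coarse position.

    capture_at(p) is a hypothetical camera hook returning the AF-window
    image with the lens at position p (0 .. n_positions - 1).
    """
    coarse = range(0, n_positions, coarse_step)
    best = max(coarse, key=lambda p: sharpness(capture_at(p)))
    lo = max(0, best - coarse_step)
    hi = min(n_positions - 1, best + coarse_step)
    return max(range(lo, hi + 1), key=lambda p: sharpness(capture_at(p)))
```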
Real-time development system for image processing engines
Sergio Goma, Radu Gheorghe, Milivoje Aleksic
Certain feedback-loop-based algorithms contained in an image processing engine, such as auto white balance, auto exposure, or auto focus, are best designed and evaluated within a real-time framework because the dynamics involved must be closely studied. Furthermore, the development process requires the flexibility associated with any software module implementation, such as the ability to dump debugging information or place breakpoints in the code. In addition, the final deployment platform is usually not available during the design process, while tuning of the above-mentioned algorithms must account for the particularities of each individual target sensor. In this paper we explore a real-time hardware-software solution that addresses all of these requirements and runs on a non-real-time operating system (Windows). Moreover, we exemplify and quantify the hard deadlines required by such a feedback control loop algorithm and illustrate how they are supported in our implementation.
Bayer bilateral denoising on TriMedia3270
H. Phelippeau, M. Akil, B. Dias Rodrigues, et al.
Digital cameras are now commonly included in many digital devices such as mobile phones. They are present everywhere and have become the principal image capturing tool. Owing to the inherent properties of light and semiconductors, sensor noise [10] continues to be an important factor in image quality [12], especially in low-light conditions. Removing the noise mathematically thus appears unavoidable if acceptable image quality is to be obtained. However, embedded devices are limited in processing capability and power consumption and thus cannot make use of the full range of complex mathematical noise removal solutions. The bilateral filter [6] is an interesting compromise between implementation complexity and noise removal performance. In particular, the Bayer [5] bilateral filter proposed in [11] is well adapted to single-sensor devices. In this paper, we simulate and optimize execution of the Bayer bilateral filter on a common media processor, the TM3270 [4] from the NXP Semiconductors TriMedia family, using the TriMedia Compilation System (TCS). We applied common optimization techniques (such as LUTs, loop unrolling, and convenient data type representations) as well as custom TriMedia operations. We finally propose a new Bayer bilateral filter formulation dedicated to the TM3270 architecture that yields an execution-time improvement of 99.6% compared to the naïve version. This improvement results in real-time video processing at VGA resolution at a 350 MHz clock rate.
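The defining adaptation of a Bayer-domain bilateral filter is that each pixel is averaged only with neighbors of the same CFA color, i.e., at even row/column offsets. A minimal sketch of that idea (the paper's TM3270-specific formulation is not reproduced here):

```python
import numpy as np

def bayer_bilateral(raw, radius=2, sigma_s=1.5, sigma_r=20.0):
    """Bilateral filter applied directly on a Bayer mosaic.

    Each pixel is averaged only with neighbors at stride-2 offsets,
    i.e. of the same CFA color, so channels are never mixed.
    """
    raw = raw.astype(np.float64)
    h, w = raw.shape
    pad = radius * 2
    padded = np.pad(raw, pad, mode='reflect')
    acc = np.zeros_like(raw)
    wsum = np.zeros_like(raw)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            oy, ox = 2 * dy, 2 * dx            # stride 2: same CFA color
            shifted = padded[pad + oy : pad + oy + h, pad + ox : pad + ox + w]
            ws = np.exp(-(dy * dy + dx * dx) / (2 * sigma_s ** 2))
            wr = np.exp(-((shifted - raw) ** 2) / (2 * sigma_r ** 2))
            acc += ws * wr * shifted
            wsum += ws * wr
    return acc / wsum
```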
Real-time global motion estimation for video stabilization
Digital video stabilization is a cost-effective way to reduce the effect of camera shake in handheld video cameras. We propose several enhancements to video stabilization based on integral projection matching [1], a simple and efficient global motion estimation technique for translational motion. One-dimensional intensity projections along the horizontal and vertical axes provide a signature of the image. Global motion estimation aims at finding the largest similarity between shifted intensity projections of consecutive frames; the obtained shifts provide information about the global inter-frame motion. Based on the estimated global motion, an output frame of reduced size is determined using motion smoothing. We propose several enhancements of prior works to improve stabilization performance and to reduce computational complexity and memory requirements. The main enhancement is a partitioning of the projection intensities to better cope with in-scene motion. A logarithmic search is deployed to seek a minimum matching error for selected partitions in two subsequent frames. Furthermore, we propose a novel motion smoothing approach that we call center-attracted motion damping. We evaluate the performance of the enhancements under various imaging conditions using real video sequences as well as synthetic video sequences with ground-truth motion. The stabilization accuracy is sufficient under most imaging conditions, so that the effect of camera shake is eliminated or significantly reduced in the stabilized video.
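The core of integral projection matching is small enough to sketch: collapse each frame to row and column sums, then find the shift that best aligns consecutive signatures. The version below uses a full search for clarity, whereas the paper deploys a logarithmic search over selected partitions:

```python
import numpy as np

def projections(frame):
    """1-D intensity signatures: row and column sums of a grayscale frame."""
    return frame.sum(axis=1), frame.sum(axis=0)

def best_shift(p_prev, p_cur, max_shift=16):
    """Shift of p_cur against p_prev minimizing the mean absolute difference."""
    best, best_err = 0, np.inf
    n = len(p_prev)
    for s in range(-max_shift, max_shift + 1):
        a = p_prev[max(0, s) : n + min(0, s)]   # overlapping portions only
        b = p_cur[max(0, -s) : n + min(0, -s)]
        err = np.mean(np.abs(a - b))
        if err < best_err:
            best, best_err = s, err
    return best

def global_motion(prev, cur, max_shift=16):
    """Translational inter-frame motion (dy, dx) from projection matching."""
    rp, cp = projections(prev)
    rc, cc = projections(cur)
    return best_shift(rp, rc, max_shift), best_shift(cp, cc, max_shift)
```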
Real-Time Video Processing
Selective application of sub-pixel motion estimation and Hadamard transform in H.264/AVC
Abdelrahman Abdelazim, Mingyuan Yang, Christos Grecos, et al.
In this paper, we propose an algorithm for the selective application of sub-pixel motion estimation and the Hadamard transform in the H.264/AVC video coding standard. The algorithm exploits the spatial interpolation effect of the reference slices on the best matches of different block sizes in order to increase the computational efficiency of the overall motion estimation process. Experimental results show that the proposed algorithm reduces the CPU cycles of the Fast Full Search motion estimation scheme by up to 8.2% while maintaining similar rate-distortion performance compared to the H.264/AVC standard.
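For context, the Hadamard transform enters H.264 motion estimation through the SATD cost, which approximates the post-transform cost of a residual better than plain SAD at the price of the extra transform, hence the benefit of applying it selectively. A sketch of the standard 4×4 SATD computation (the final scaling is one common normalization, not necessarily the one used in the paper):

```python
import numpy as np

# 4x4 Hadamard-type matrix used for SATD in H.264 encoders.
H4 = np.array([[1,  1,  1,  1],
               [1,  1, -1, -1],
               [1, -1, -1,  1],
               [1, -1,  1, -1]])

def satd4x4(orig, pred):
    """Sum of absolute transformed differences for one 4x4 block."""
    diff = orig.astype(np.int64) - pred.astype(np.int64)
    t = H4 @ diff @ H4.T                 # 2-D Hadamard of the residual
    return int(np.abs(t).sum()) // 2     # normalization as in the JM encoder
```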
Real-time detection and tracking of multiple objects with partial decoding in H.264/AVC bitstream domain
In this paper, we show that probabilistic spatiotemporal macroblock filtering (PSMF) and partial decoding can be applied to effectively detect and track multiple objects in real time in H.264/AVC bitstreams with a stationary background. Our contribution is that our method not only achieves fast processing times but also handles multiple moving objects that are articulated, changing in size, or internally of monotonous color, even when they contain a chaotic set of non-homogeneous motion vectors. In addition, our partial decoding process for H.264/AVC bitstreams improves the accuracy of object trajectories and overcomes long occlusions by using extracted color information.
Adaptive interpolation filter method for improving coding efficiency in H.264/AVC
Kun Su Yoon, Hyun Woo Cho
In this paper, we propose an improved adaptive interpolation filter method for improving coding efficiency in H.264/AVC. Although conventional cost functions have shown good rate-distortion performance, there is still room for improvement. To improve coding efficiency, we introduce a new cost function that considers both the bit rate and the distortion of coding the macroblock. The best filter is adaptively selected to minimize the proposed cost function. Experimental results show that the adaptive interpolation filter with the proposed cost function significantly improves coding efficiency compared to filters using the conventional cost function, leading to average bit-rate reductions of about 5.62% (one reference frame) and 5.14% (five reference frames) compared to H.264/AVC.
Video calibration for spatial-temporal registration with gain and offset adjustments
For most video quality measurement algorithms, a processed video sequence and the corresponding source video sequence need to be aligned in the spatial and temporal directions. Furthermore, when source video sequences are encoded and transmitted, gain and offset can be introduced. The estimation process, which estimates spatial shifts, temporal shift, gain, and offset, is known as video calibration. In this paper, we propose a video calibration method for full-reference and reduced-reference video quality measurement algorithms. The proposed method extracts a number of features from the source video sequences and performs calibration using these features. Experimental results show that the proposed method provides good performance, and the method has been included in an international standard.
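Once spatial and temporal alignment are done, the gain g and offset o in processed ≈ g · source + o can be estimated by a least-squares fit over matched feature values. A minimal sketch under that linear model (the feature set standardized by the actual method is not reproduced here):

```python
import numpy as np

def estimate_gain_offset(src_feat, proc_feat):
    """Least-squares fit of proc = gain * src + offset.

    src_feat / proc_feat are matched feature values (e.g. mean pixel
    levels of aligned frames or blocks) from source and processed video.
    """
    A = np.column_stack([src_feat, np.ones_like(src_feat)])
    (gain, offset), *_ = np.linalg.lstsq(A, proc_feat, rcond=None)
    return gain, offset

# Usage: recover a synthetic gain/offset pair.
src = np.linspace(16, 235, 50)
proc = 0.9 * src + 12.0
print(estimate_gain_offset(src, proc))   # ~ (0.9, 12.0)
```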
Real-time visual tracking system modelling in MPSoC using platform based design
Zai Jian Jia, Tomás Bautista, Antonio Núñez, et al.
In this paper, we present the modelling of a real-time tracking system on a Multi-Processor System-on-Chip (MPSoC). Our final goal is to build a more complex computer vision system (CVS) by integrating several applications in a modular way, each performing a different kind of data processing but sharing a common platform; in this way, a single architecture serves a set of applications rather than just one. In our current work, a visual tracking system with real-time behaviour (25 frames/sec) is used as the reference application and as a guideline for the development of our future CVS applications. Our algorithm, written in C++, is based on a correlation technique and a dynamic threshold-update approach. After an initial computational complexity analysis, a task graph was generated from this tracking algorithm. Concurrently with this functional correctness analysis, a generic model of the multi-processor platform was developed. Finally, the performance of the tracking system mapped onto the proposed architecture and the shared resource usage were analyzed to determine the real capacity of the architecture and to find possible bottlenecks, in order to propose new solutions that allow more applications to be mapped onto the platform template in the future.
Real-Time Algorithms
Real-time vehicle detection and tracking based on perspective and non-perspective space cooperation
In recent years, advanced driver assistance systems (ADAS) have received increasing interest as a way to confront car accidents. In particular, video-based vehicle detection methods are emerging as an efficient way to address accident prevention. Many video-based approaches to vehicle detection have been proposed in the literature, involving sophisticated and costly computer vision techniques, and most of them require ad hoc hardware implementations to attain real-time operation. Alternatively, other approaches perform a domain change (via transforms such as the FFT, inverse perspective mapping (IPM), or the Hough transform) that simplifies otherwise complex feature detection. In this work, a cooperative strategy between two domains, the original perspective space and the transformed non-perspective space computed through IPM, is proposed in order to alleviate the processing load in each domain by maximizing the information exchange between the two. A system is designed upon this framework that computes the location and dimensions of the vehicles in a video sequence. Additionally, the system is made scalable to the complexity imposed by the scenario. As a result, real-time vehicle detection and tracking is accomplished on a general-purpose platform. The system has been tested on sequences comprising a wide variety of scenarios, showing robust and accurate performance.
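For orientation, inverse perspective mapping resamples the road plane through a homography so that distances become uniform and vehicle feature detection simplifies. A minimal sketch, assuming the 3×3 homography H (bird's-eye coordinates to image coordinates) is already known from camera calibration:

```python
import numpy as np

def ipm(image, H, out_shape):
    """Warp a grayscale image into a bird's-eye (non-perspective) view.

    H maps bird's-eye coordinates (x, y, 1) to image coordinates; for
    every output pixel we look up the source pixel (nearest neighbor).
    """
    h_out, w_out = out_shape
    ys, xs = np.mgrid[0:h_out, 0:w_out]
    pts = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])
    src = H @ pts
    u = (src[0] / src[2]).round().astype(int)
    v = (src[1] / src[2]).round().astype(int)
    ok = (u >= 0) & (u < image.shape[1]) & (v >= 0) & (v < image.shape[0])
    out = np.zeros(out_shape, dtype=image.dtype)
    out.ravel()[ok] = image[v[ok], u[ok]]    # row-major orders match
    return out
```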
Real-time vision-based traffic flow measurements and incident detection
Barak Fishbain, Ianir Ideses, David Mahalel, et al.
Visual surveillance for traffic systems requires short processing times, low processing cost, and high reliability. Under those requirements, image processing technologies offer a variety of systems and methods for Intelligent Transportation Systems (ITS) as a platform for traffic Automatic Incident Detection (AID). Two classes of AID methods have mainly been studied: one based on inductive loops, radars, infrared sonar, and microwave detectors, and the other based on video images. The first class suffers from the drawbacks that such sensors are expensive to install and maintain and are unable to detect slow or stationary vehicles. Video sensors, on the other hand, offer a relatively low installation cost with little traffic disruption during maintenance. Furthermore, they provide wide-area monitoring, allowing analysis of traffic flows and turning movements, speed measurement, multiple-point vehicle counts, vehicle classification, and highway state assessment based on precise scene motion analysis. This paper suggests the utilization of traffic models for real-time vision-based traffic analysis and automatic incident detection. First, the traffic flow variables are introduced. Then, it is described how those variables can be measured from traffic video streams in real time. With the traffic variables measured, a robust automatic incident detection scheme is suggested. The results presented here show a great potential for integration of traffic flow models into video-based intelligent transportation systems. Real-time performance is achieved by utilizing multi-core technology with standard parallelization algorithms and libraries (OpenMP, IPP).
Real-time depth map manipulation for 3D visualization
One of the key aspects of 3D visualization is the computation of depth maps. Depth maps enable synthesis of 3D video from 2D video and the use of multi-view displays. Depth maps can be acquired in several ways. One method is to measure the real 3D properties of the scene objects. Other methods rely on using two cameras and computing the correspondence for each pixel. Once a depth map is acquired for every frame, it can be used to construct an artificial stereo pair. There are many known methods for computing the optical flow between adjacent video frames, but they require extensive computational power and are not well suited to high-quality real-time 3D rendering. One efficient method for computing depth maps is the extraction of motion vector information from standard video encoders. In this paper we present methods to improve the quality of 3D visualization acquired from compression codecs using spatial/temporal and logical operations and manipulations. We show how an efficient real-time implementation of spatio-temporal local order statistics, such as the median, and of local adaptive filtering in the 3D-DCT domain can substantially improve the quality of depth maps, and consequently of the 3D video, while retaining real-time rendering. Real-time performance is achieved by utilizing multi-core technology with standard parallelization algorithms and libraries (OpenMP, IPP).
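As a concrete example of the spatio-temporal order statistics mentioned above, a median over a small (t, y, x) neighborhood suppresses isolated motion-vector outliers in the depth maps without blurring depth edges. A minimal sketch (the 3D-DCT-domain adaptive filtering is omitted):

```python
import numpy as np
from scipy.ndimage import median_filter

def smooth_depth(depth_frames):
    """Spatio-temporal median over a 3x3x3 (t, y, x) neighborhood.

    depth_frames: array of shape (T, H, W) of per-frame depth maps,
    e.g. derived from codec motion vectors. The median rejects
    isolated outliers while preserving depth discontinuities.
    """
    return median_filter(depth_frames, size=(3, 3, 3), mode='nearest')
```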
Accelerating sub-pixel marker segmentation using GPU
Sub-pixel accurate marker segmentation is an important task for many computer vision systems. The 3D positions of markers are used in control loops to determine the position of machine tools or robot end-effectors. Accurate segmentation of the marker position in the image plane is crucial for accurate reconstruction. Many sub-pixel segmentation algorithms are computationally intensive, especially as the number of markers increases. Modern graphics hardware, with its massively parallel architecture, provides a powerful tool for many image segmentation tasks. In particular, the time-consuming sub-pixel refinement steps in marker segmentation can benefit from this recent progress. This article presents an implementation of a sub-pixel marker segmentation framework that uses the GPU to accelerate processing. The image segmentation chain consists of two stages: a pre-processing stage that segments the initial position of the marker with pixel accuracy, and a second stage that refines the initial marker position to sub-pixel accuracy. Both stages are implemented as shader programs on the GPU. The flexible architecture allows different pre-processing and sub-pixel refinement algorithms to be combined. Experimental results show that a significant speed-up can be achieved compared to CPU implementations, especially as the number of markers increases.
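A common choice for the second stage, assumed here for illustration, is an intensity-weighted centroid computed in a window around the pixel-accurate detection; it is exactly the kind of independent per-marker computation that maps well to a shader. A plain-software sketch:

```python
import numpy as np

def refine_marker(img, y0, x0, radius=5):
    """Sub-pixel marker position via an intensity-weighted centroid.

    (y0, x0) is the pixel-accurate detection from the first stage; the
    window is assumed to lie fully inside the image.
    """
    win = img[y0 - radius : y0 + radius + 1,
              x0 - radius : x0 + radius + 1].astype(np.float64)
    win = win - win.min()                  # suppress the background pedestal
    ys, xs = np.mgrid[-radius : radius + 1, -radius : radius + 1]
    total = win.sum()
    if total == 0:
        return float(y0), float(x0)        # flat window: keep initial guess
    return y0 + (ys * win).sum() / total, x0 + (xs * win).sum() / total
```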
Fuzzy set and directional image processing techniques for impulsive noise reduction employing DSP
Numerous image denoising algorithms for noise of different natures have been reported in the literature. One of the principal noise types is impulsive noise, which accompanies any transmission process. This paper presents a novel approach unifying two of the most powerful techniques of recent years: directional processing and fuzzy-set techniques. The method permits the detection of noisy pixels and of local structure (edges and fine details) in a static image or an image sequence. The proposed algorithm suppresses noise while preserving fine details, edges, and color chromaticity properties in multichannel images. We present applications of the proposed algorithm in color imaging and in multichannel remote sensing across several bands. Finally, hardware requirements are evaluated, permitting real-time implementation on a Texas Instruments DSP using the RF5 Reference Framework. The multichannel algorithms were implemented on the DSP as a multitask process, which improves the performance of several tasks while reducing processing time and the computational load on the dedicated hardware. Numerous experimental results on color images/sequences and satellite remote sensing data show the superiority of the proposed approach both in objective criteria (PSNR, MAE, NCD) and in subjective visual evaluation. The required processing times and visual characteristics reported in the paper demonstrate acceptable performance of the approach.
Interactive Paper Session
Correction of artifacts in correlated double-sampled CCD video resulting from insufficient bandwidth
Correlated Double Sampling (CDS) is a popular technique for extracting pixel signal data from raw CCD detector output waveforms. However, some common electronic design approaches to implementing CDS can produce undesired artifacts in the digitized pixel signal data if the bandwidth of the CCD waveform entering the CDS circuit is too low, whether as the result of an intentional design choice or of a failure in one or more electronic components. An example of an undesirable artifact is an overshoot (undershoot) pixel response when transitioning from a black-to-light (light-to-black) image scene in the serial readout direction. In this paper an analytical model is developed that accurately describes the temporal behavior of the CDS circuit under all CCD video bandwidth conditions. This model is then used to create a signal processing kernel that effectively removes the undesired artifacts associated with operating CDS electronics on CCD waveforms exhibiting a low-bandwidth response. The correction approach is demonstrated on digitized data from a CCD-based instrument with known bandwidth issues exhibiting undershoot/overshoot artifacts, and the results show that the undesirable artifacts can be completely removed.
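As a simplified illustration of the correction idea (the paper derives the exact CDS temporal model; the first-order low-pass below is only an assumption), if the band-limited video behaves like y[n] = (1 - a) x[n] + a y[n - 1] along each serial readout line, the artifact can be removed by applying the inverse response:

```python
import numpy as np

def correct_line(y, a):
    """Invert an assumed first-order low-pass along one readout line.

    Model (an illustrative assumption, not the paper's derived kernel):
    y[n] = (1 - a) * x[n] + a * y[n - 1], with 0 <= a < 1.
    """
    y = np.asarray(y, dtype=np.float64)
    x = np.empty_like(y)
    x[0] = y[0]                            # no previous sample to undo
    x[1:] = (y[1:] - a * y[:-1]) / (1.0 - a)
    return x
```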
Unsupervised exposure correction for video
X. Petrova, S. Sedunov, A. Ignatov
The paper describes an "off-the-shelf" algorithmic solution for unsupervised exposure correction of video. An important feature of the algorithm is accurate processing not only of natural video sequences but also of edited, rendered, or combined content, including content with letterboxes or pillarboxes captured from TV broadcasts. The algorithm changes the degree of exposure correction smoothly within continuous video scenes and promptly at cuts. The solution includes scene change detection, letterbox detection, pillarbox detection, exposure correction adaptation, exposure correction, and color correction. Exposure correction adaptation is based on histogram analysis and soft-logic inference. The decision rules are based on the relative number of entries in the low tones, mid tones, and highlights, the maximum entries in the low tones and mid tones, the number of non-empty histogram entries, and the width of the middle range of the histogram. All decision rules have a physical meaning, which allows parameters to be tuned easily for display devices of different classes. Exposure correction consists of computing a local average using edge-preserving filtering, applying local tone mapping, and post-processing. At the final stage, color correction aiming to reduce color distortions is applied.
Determination of vehicle speed in traffic video
In this paper, we present a semi-real-time vehicle tracking algorithm to determine the speed of vehicles in traffic from traffic-camera video. The results of this work can be used for traffic control, security, and safety by both government agencies and commercial organizations. The method described in this paper involves object feature identification, detection, and tracking across multiple video frames. The distance between vertical broken lane markers is used to estimate absolute distances within each frame and to convert pixel location coordinates to world coordinates. Speed calculations are made based on the calibrated pixel distances. Optical flow images are computed and used for blob analysis to extract features representing moving objects. Challenges remain in distinguishing among vehicles in uniform traffic flow when the objects are too close together, are in low contrast with one another, or travel at the same or nearly the same speed. In the absence of ground truth for the actual speed of the tracked vehicles, accuracy cannot be determined; however, the vehicle speeds in steady traffic flow have been computed to within 5% of the speed limit on the highways analyzed in the video clips.
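The calibration step reduces to fixing a metres-per-pixel scale from the known lane-marker spacing, after which speed follows from blob displacement and frame rate. A minimal sketch; the marker spacing and the numbers in the example are placeholders, since the real spacing is site-specific:

```python
def vehicle_speed_kmh(track_px, fps, marker_gap_px, marker_gap_m):
    """Average speed of one tracked vehicle.

    track_px      : per-frame pixel positions of the blob along the lane
    marker_gap_px : measured pixel distance between lane markers
    marker_gap_m  : real-world marker spacing in metres (site-specific)
    """
    metres_per_px = marker_gap_m / marker_gap_px
    dist_m = abs(track_px[-1] - track_px[0]) * metres_per_px
    time_s = (len(track_px) - 1) / fps
    return 3.6 * dist_m / time_s          # m/s -> km/h

# Example: 4 px/frame at 30 fps, 12 m marker spacing spanning 90 px.
track = [100 + 4 * i for i in range(31)]
print(round(vehicle_speed_kmh(track, 30.0, 90.0, 12.0)))   # ~58 km/h
```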
Variable disparity-motion estimation based fast three-view video coding
In this paper, variable disparity-motion estimation (VDME) based three-view video coding is proposed. In the encoder, key-frame coding (KFC) based motion estimation and variable disparity estimation (VDE) are used for effective, fast three-view video encoding. These algorithms enhance the performance of the 3-D video encoding/decoding system in terms of disparity estimation accuracy and computational overhead. Experiments on the 'Pot Plant' and 'IVO' stereo sequences show that the proposed algorithm achieves PSNRs of 37.66 and 40.55 dB with processing times of 0.139 and 0.124 sec/frame, respectively.