Proceedings Volume 8656

Real-Time Image and Video Processing 2013

View the digital version of this volume at the SPIE Digital Library.

Volume Details

Date Published: 7 March 2013
Contents: 6 Sessions, 27 Papers, 0 Presentations
Conference: IS&T/SPIE Electronic Imaging 2013
Volume Number: 8656

Table of Contents

All links to SPIE Proceedings will open in the SPIE Digital Library.
  • Front Matter: Volume 8656
  • Real-Time Algorithms
  • Real-Time Hardware
  • Real-Time Systems
  • Real-Time Video Coding
  • Interactive Paper Session
Front Matter: Volume 8656
This PDF file contains the front matter associated with SPIE Proceedings Volume 8656, including the Title Page, Copyright information, Table of Contents, and Conference Committee listing.
Real-Time Algorithms
Real-time robust target tracking in videos via graph-cuts
Barak Fishbain, Dorit S. Hochbaum, Yan T. Yang
Video tracking is a fundamental problem in computer vision with many applications. The goal of video tracking is to isolate a target object from its background across a sequence of frames. Tracking is inherently a three-dimensional problem in that it incorporates the time dimension. As such, the computational efficiency of video segmentation is a major challenge. In this paper we present a generic and robust graph-theory-based tracking scheme for videos. Unlike previous graph-based tracking methods, the suggested approach treats motion as a pixel's property (like color or position) rather than as a consistency constraint (i.e., the location of the object in the current frame is constrained to appear around its location in the previous frame shifted by the estimated motion) and solves the tracking problem optimally (i.e., neither heuristics nor approximations are applied). The suggested scheme is robust enough to allow for incorporating the computationally cheaper MPEG-4 motion estimation schemes. Although block matching techniques generate noisy and coarse motion fields, their use allows faster computation times, as a broad variety of off-the-shelf software and hardware components that specialize in performing this task is available. The evaluation of the method on standard and non-standard benchmark videos shows that the suggested tracking algorithm supports fast and accurate video tracking, thus making it amenable to real-time applications.
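As an illustration of the idea of motion as a pixel property (a minimal sketch, not the authors' graph-cut implementation; the feature layout and sigma are assumptions), the pairwise edge weight of a segmentation graph can simply include motion alongside color and position:

```python
import math

def pixel_feature(color, position, motion):
    """Stack color, position and motion into one feature vector,
    treating motion as just another per-pixel property."""
    return list(color) + list(position) + list(motion)

def similarity_weight(f_p, f_q, sigma=10.0):
    """Gaussian similarity between two pixel feature vectors; used as
    the capacity of the graph edge (p, q) in a graph-cut formulation."""
    d2 = sum((a - b) ** 2 for a, b in zip(f_p, f_q))
    return math.exp(-d2 / (2 * sigma ** 2))

# Two pixels identical in color and position but with opposite motion
# get a lower edge weight, so the cut is more likely to separate them.
p = pixel_feature((120, 80, 60), (10, 10), (2.0, 0.0))
q = pixel_feature((120, 80, 60), (10, 10), (-2.0, 0.0))
```

Because motion enters only as a feature, no extra consistency constraint on the object's location is needed in the graph construction.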
Tracking yarns in high resolution fabric images: a real-time approach for online fabric flaw detection
Dorian Schneider
An algorithmic framework for real-time localization of single yarns within industrial fabric images is presented. The information about precise yarn locations forms the foundation for a fabric flaw detection system based on individual yarn measurements. Matching a camera frame rate of 15 fps, we define the term "real-time" as the capability of tracking all yarns within a 5-megapixel image in less than 35 ms, leaving a time slot of 31 ms for further image processing and defect detection algorithms. The processing pipeline comprises adaptive histogram equalization, Wiener deconvolution, normalized template matching and a novel feature point sorting scheme. To meet real-time requirements, extensive use of the NVIDIA CUDA framework is made. Implementation details are given and source code for selected algorithms is provided. Evaluation results show that wefts and warps can be tracked reliably, independently of the fabric material or binding. Video and image footage is provided on the project website to supplement the paper content.
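The normalized template matching stage can be sketched in plain Python for the 1D case (the paper's implementation is 2D and CUDA-accelerated; function and variable names here are illustrative):

```python
def ncc_match(image, template):
    """Slide a 1D template over a 1D signal and return
    (best_offset, best_score) under normalized cross-correlation.
    Scores lie in [-1, 1]; an exact copy of the template scores 1."""
    n = len(template)
    tm = sum(template) / n
    t0 = [t - tm for t in template]           # zero-mean template
    t_norm = sum(v * v for v in t0) ** 0.5
    best = (0, -2.0)
    for off in range(len(image) - n + 1):
        w = image[off:off + n]
        wm = sum(w) / n
        w0 = [v - wm for v in w]              # zero-mean window
        w_norm = sum(v * v for v in w0) ** 0.5
        if w_norm == 0 or t_norm == 0:
            continue                          # flat window: undefined score
        score = sum(a * b for a, b in zip(w0, t0)) / (w_norm * t_norm)
        if score > best[1]:
            best = (off, score)
    return best
```

Mean removal and normalization make the score invariant to brightness and contrast changes, which is why this step tolerates the uneven illumination of fabric images.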
Real-time bicycle detection at signalized intersections using thermal imaging technology
Robin Collaert
More and more governments and authorities around the world are promoting the use of bicycles in cities, as this is healthy for the bicyclist and improves the quality of life in general. The safety and efficiency of bicyclists have become a major focus. To achieve this, there is a need for a smarter approach towards the control of signalized intersections. Various traditional detection technologies, such as video, microwave radar and electromagnetic loops, can be used to detect vehicles at signalized intersections, but none of these can consistently separate bikes from other traffic, day and night and in various weather conditions.

As bikes should get a higher priority and also require longer green time to safely cross the signalized intersection, traffic managers are looking for alternative detection systems that can make the distinction between bicycles and other vehicles near the stop bar. In this paper, the drawbacks of a video-based approach are presented, alongside the benefits of a thermal-video-based approach for vehicle presence detection with separation of bicycles. The specific technical challenges of developing a system that combines thermal image capturing, image processing and output triggering to the traffic light controller in near real-time and in a single housing are also highlighted.
How fast can one arbitrarily and precisely scale images?
Image scaling is a frequent operation in video processing for optical metrology. In this paper, the results of a comparative study of the computational complexity of different algorithms for scaling digital images with arbitrary scaling factors are presented and discussed. The following algorithms were compared: different types of spatial-domain processing algorithms (linear, cubic, and cubic spline interpolation) and a new DCT-based algorithm, which implements perfect (interpolation-error-free) scaling through discrete sinc-interpolation and is virtually free of the boundary effects characteristic of DFT-based scaling algorithms. The comparison results enable evaluation of the feasibility of real-time implementation of the algorithms for arbitrary image scaling.
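A minimal sketch of the DCT-based discrete sinc-interpolation idea, assuming an orthonormal DCT-II pair (a real implementation would use fast transforms rather than these O(N²) loops):

```python
import math

def dct2(x):
    """Orthonormal DCT-II of a real sequence."""
    N = len(x)
    X = []
    for k in range(N):
        s = sum(x[n] * math.cos(math.pi * (n + 0.5) * k / N) for n in range(N))
        c = math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
        X.append(c * s)
    return X

def idct2(X):
    """Inverse of the orthonormal DCT-II (i.e. a DCT-III)."""
    N = len(X)
    out = []
    for n in range(N):
        s = X[0] / math.sqrt(N)
        s += sum(math.sqrt(2.0 / N) * X[k] * math.cos(math.pi * (n + 0.5) * k / N)
                 for k in range(1, N))
        out.append(s)
    return out

def dct_rescale(x, M):
    """Rescale x to length M by zero-padding (or truncating) its DCT
    spectrum -- discrete sinc-interpolation, free of the wrap-around
    boundary effects of DFT-based scaling."""
    N = len(x)
    X = dct2(x)
    X = (X + [0.0] * (M - N)) if M >= N else X[:M]
    return [v * math.sqrt(M / N) for v in idct2(X)]
```

Because M is arbitrary, the same routine handles any rational or irrational effective scaling factor M/N.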
Digital ruler: real-time object tracking and dimension measurement using stereo cameras
James Nash, Kalin Atanassov, Sergio Goma, et al.
Stereo metrology involves obtaining spatial estimates of an object's length or perimeter using the disparity between boundary points. True 3D scene information is required to extract length measurements of an object's projection onto the 2D image plane. In stereo vision the disparity measurement is highly sensitive to object distance, baseline distance, calibration errors, and relative movement of the left and right demarcation points between successive frames. Therefore, a tracking filter is necessary to reduce position error and improve the accuracy of the length measurement to a useful level. A Cartesian-coordinate extended Kalman filter (EKF) is designed based on the canonical equations of stereo vision. This filter represents a simple reference design that has not seen much exposure in the literature. A second filter formulated in a modified sensor-disparity (DS) coordinate system is also presented and shown to exhibit lower errors during a simulated experiment.
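The predict/update cycle such a tracking filter performs can be illustrated with a scalar linear Kalman filter on a noisy, nearly static disparity (a simplification of the paper's EKF; the noise parameters are assumptions):

```python
def kalman_1d(measurements, q=1e-4, r=0.25, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a (nearly) static disparity:
    q is the random-walk process noise, r the measurement noise,
    (x0, p0) the initial state and covariance. Returns the sequence
    of estimates and the final covariance."""
    x, p = x0, p0
    estimates = []
    for z in measurements:
        p = p + q                    # predict: covariance grows slightly
        k = p / (p + r)              # Kalman gain
        x = x + k * (z - x)          # update state toward measurement
        p = (1 - k) * p              # update covariance
        estimates.append(x)
    return estimates, p
```

As measurements accumulate, the gain shrinks and the estimate settles near the true disparity with a covariance far below the prior, which is exactly the stabilization needed before triangulating a length.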
Real-Time Hardware
FPGA design of a real-time edge enhancing smoothing filter
Nimit Pandya, Chang Choo
Traditional noise removal filters have an undesirable side effect of blurring edges, which is unacceptable for some image processing applications. To overcome this problem, our ongoing project evaluates an edge enhancing smoothing filter and implements it on FPGAs to reduce noise while sharpening edges. One such edge enhancing smoothing filter consists of a combination of the bilateral filter (used for edge preserving smoothing) and the shock filter (used for edge enhancement) to achieve the desired result. This paper describes an implementation of the bilateral filter on Altera FPGAs. The shock filter part is then briefly described. Area and speed performance results for different Altera FPGA families are comparatively shown.
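A 1D sketch of the bilateral filter's edge-preserving principle (the FPGA design is 2D and fixed-point; the parameters here are illustrative):

```python
import math

def bilateral_1d(x, sigma_s=2.0, sigma_r=0.2, radius=3):
    """1D bilateral filter: each weight is a spatial Gaussian times a
    range (intensity) Gaussian, so averaging effectively stops at large
    intensity jumps -- smoothing noise while preserving edges."""
    out = []
    for i in range(len(x)):
        num = den = 0.0
        for j in range(max(0, i - radius), min(len(x), i + radius + 1)):
            w = (math.exp(-((i - j) ** 2) / (2 * sigma_s ** 2)) *
                 math.exp(-((x[i] - x[j]) ** 2) / (2 * sigma_r ** 2)))
            num += w * x[j]
            den += w
        out.append(num / den)
    return out
```

On a step edge the range kernel suppresses cross-edge weights, so the step survives filtering almost untouched, unlike with a plain Gaussian blur.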
Large object extraction for binary images on the GPU
Gregory Huchet
Object filtering by size is a basic task in computer vision. A common way to extract large objects in a binary image is to run the connected-component labeling (CCL) algorithm and to compute the area of each component. Selecting the components with large areas is then straightforward. Several CCL algorithms for the GPU have already been implemented but few of them compute the component area. This extra step can be critical for real-time applications such as real-time video segmentation. The aim of this paper is to present a new approach for the extraction of visually large objects in a binary image that works in real-time. It is implemented using CUDA (Compute Unified Device Architecture), a parallel computing architecture developed by NVIDIA.
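The CPU baseline the paper accelerates — connected-component labeling followed by an area filter — can be sketched with union-find (illustrative, not the GPU algorithm):

```python
def large_components(grid, min_area):
    """Label 4-connected components of a binary grid (lists of 0/1) and
    return a mask keeping only components with area >= min_area."""
    h, w = len(grid), len(grid[0])
    parent = {}

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]   # path halving
            a = parent[a]
        return a

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    for y in range(h):
        for x in range(w):
            if grid[y][x]:
                parent[(y, x)] = (y, x)
                if y > 0 and grid[y - 1][x]:
                    union((y - 1, x), (y, x))
                if x > 0 and grid[y][x - 1]:
                    union((y, x - 1), (y, x))
    area = {}
    for p in parent:
        r = find(p)
        area[r] = area.get(r, 0) + 1
    return [[1 if grid[y][x] and area[find((y, x))] >= min_area else 0
             for x in range(w)] for y in range(h)]
```

The GPU version parallelizes the labeling step; the area accumulation is the extra step the paper highlights as missing from most GPU CCL implementations.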
Real-time structured light intraoral 3D measurement pipeline
Radu Gheorghe, Andrei Tchouprakov, Roman Sokolov
Computer aided design and manufacturing (CAD/CAM) is increasingly becoming a standard feature and service provided to patients in dentist offices and denture manufacturing laboratories. Although the quality of the tools and data has slowly improved in recent years, due to various surface measurement challenges, practical, accurate, in-vivo, real-time 3D high-quality data acquisition and processing still need improvement. Advances in GPU computational power have allowed for achieving near real-time 3D intraoral in-vivo scanning of a patient's teeth. We explore in this paper, from a real-time perspective, a hardware-software-GPU solution that addresses all of these requirements. Moreover, we exemplify and quantify the hard and soft deadlines required by such a system and illustrate how they are supported in our implementation.
Three-dimensional fuzzy filter in color video sequence denoising implemented on DSP
Volodymyr I. Ponomaryov, Hector Montenegro, Ricardo Peralta-Fabi
In this paper, we present a fuzzy 3D filter for color video sequences to suppress impulsive noise. The designed algorithm differs from other state-of-the-art algorithms in that it employs the three RGB bands of the video sequence data, analyzes the fuzzy gradient values obtained in eight directions, and finally processes two temporally neighboring frames together. The simulation results have confirmed the better performance of the novel 3D filter both in terms of objective metrics (PSNR, MAE, NCD, SSIM) and in subjective perception via human vision in the color sequences. An efficiency analysis of the designed filter and other promising filters has been performed on the DSP TMS320DM642 by Texas Instruments through MATLAB's Simulink module, showing that the 3D filter can be used in real-time processing applications.
Real-Time Systems
Design of a pseudo-log image transform IP in an HLS-based memory management framework
Shahzad Ahmad Butt, Stéphane Mancini, Frédéric Rousseau, et al.
The pseudo-log image transform is essentially a logarithmic transformation that simulates the distribution of the eye's photoreceptors and finds application in many important areas of real-time image and video processing, such as motion detection and estimation in robots and foveated space-variant cameras. It belongs to a family of non-linear image processing kernels in which references made to memory are non-linear functions of loop indices. Non-linear kernels need some form of memory management in order to achieve the required throughput, to minimize on-chip memory and to maximize possible data re-use. In this paper we present the design of a pseudo-log image processing hardware accelerator IP, integrated with different interpolation filtering techniques, using a memory management framework. The framework can automatically generate a memory hierarchy around the IP and a data transfer controller that facilitates data exchange with main memory. The memory hierarchy reduces on-chip memory requirements, optimizes throughput and increases data re-use. The design of the IP is fully performed at the algorithmic level in C/C++. The algorithmic description is profiled within the framework to create a customized memory hierarchy, also described at the synthesizable algorithmic level. Finally, high-level synthesis is used to perform hardware design space exploration and performance estimation. Experiments show that the generated memory hierarchy is able to feed the IP with a very high bandwidth even in the presence of long external memory latencies.
Real-time color/shape-based traffic signs acquisition and recognition system
A real-time system is proposed to acquire traffic signs from an automotive fish-eye CMOS camera and provide their automatic recognition on the vehicle network. Differently from the state of the art, in this work color detection is addressed by exploiting the HSI color space, which is robust to lighting changes. Hence the first stage of the processing system implements fish-eye correction and RGB-to-HSI transformation. After color-based detection, a noise deletion step is implemented and then, for the classification, a template-based correlation method is adopted to identify potential traffic signs of different shapes from the acquired images. Starting from a segmented image, a matching with templates of the searched signs is carried out using a distance transform. These templates are organized hierarchically to reduce the number of operations and hence ease real-time processing for several types of traffic signs. Finally, for the recognition of the specific traffic sign, a technique based on extraction of sign characteristics and thresholding is adopted. Implemented on a DSP platform, the system recognizes traffic signs in less than 150 ms at a distance of about 15 meters from 640x480-pixel acquired images. Tests carried out with hundreds of images show a detection and recognition rate of about 93%.
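The RGB-to-HSI stage can be sketched as follows (standard HSI formulas; the paper's fixed-point DSP arithmetic will differ):

```python
import math

def rgb_to_hsi(r, g, b):
    """Convert 8-bit RGB to HSI: H in degrees [0, 360), S and I in [0, 1].
    Intensity is decoupled from hue and saturation, which is why HSI
    color thresholds are more robust to lighting changes than RGB ones."""
    r, g, b = r / 255.0, g / 255.0, b / 255.0
    i = (r + g + b) / 3.0
    s = 0.0 if i == 0 else 1.0 - min(r, g, b) / i
    num = 0.5 * ((r - g) + (r - b))
    den = math.sqrt((r - g) ** 2 + (r - b) * (g - b))
    if den == 0:
        h = 0.0                      # achromatic pixel: hue undefined
    else:
        h = math.degrees(math.acos(max(-1.0, min(1.0, num / den))))
        if b > g:
            h = 360.0 - h
    return h, s, i
```

A bright and a dim red pixel map to the same (H, S), differing only in I, so a single hue/saturation threshold detects red sign borders across lighting conditions.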
DSPACE hardware architecture for on-board real-time image/video processing in European space missions
Sergio Saponara, Massimiliano Donati, Luca Fanucci, et al.
On-board data processing is a vital task for any satellite and spacecraft due to the need to process the sensing data before sending them to Earth, in order to exploit effectively the bandwidth to the ground station. In recent years the amount of sensing data collected by scientific and commercial space missions has increased significantly, while the available downlink bandwidth has remained comparatively stable. The increasing demand for on-board real-time processing capabilities represents one of the critical issues in forthcoming European missions. Ever faster signal and image processing algorithms are required to accomplish planetary observation, surveillance, Synthetic Aperture Radar imaging and telecommunications. The only available space-qualified Digital Signal Processor (DSP) free of International Traffic in Arms Regulations (ITAR) restrictions offers inadequate performance, so the need for a next-generation European DSP is well known to the space community. The DSPACE space-qualified DSP architecture fills the gap between the computational requirements and the available devices. It leverages a pipelined and massively parallel core based on the Very Long Instruction Word (VLIW) paradigm, with 64 registers and 8 operational units, along with cache memories, memory controllers and SpaceWire interfaces. Both the synthesizable VHDL and the software development tools are generated from the LISA high-level model. A Xilinx XC7K325T FPGA was chosen to realize a CompactPCI demonstrator board. Finally, first synthesis results on CMOS standard-cell technology (ASIC, 180 nm) show an area of around 380 kgates and a peak performance of 1000 MIPS and 750 MFLOPS at 125 MHz.
Real-Time Video Coding
Priority-based methods for reducing the impact of packet loss on HEVC encoded video streams
The rapid growth in the use of video streaming over IP networks has outstripped the rate at which new network infrastructure has been deployed. These bandwidth-hungry applications now comprise a significant part of all Internet traffic and present major challenges for network service providers. The situation is more acute in mobile networks, where the available bandwidth is often limited. Work towards the standardisation of High Efficiency Video Coding (HEVC), the next-generation video coding scheme, is currently on track for completion in 2013. HEVC offers the prospect of a 50% improvement in compression over the current H.264 Advanced Video Coding standard (H.264/AVC) for the same quality. However, there has been very little published research on HEVC streaming or the challenges of delivering HEVC streams in resource-constrained network environments. In this paper we consider the problem of adapting an HEVC encoded video stream to meet the bandwidth limitations of a mobile network environment. Video sequences were encoded using the Test Model under Consideration (TMuC HM6) for HEVC. Network abstraction layer (NAL) units were packetized, on a one NAL unit per RTP packet basis, and transmitted over a realistic hybrid wired/wireless testbed configured with dynamically changing network path conditions and multiple independent network paths from the streamer to the client. Two different schemes for the prioritisation of RTP packets, based on the NAL units they contain, have been implemented and empirically compared using a range of video sequences, encoder configurations, bandwidths and network topologies. In the first prioritisation method the importance of an RTP packet was determined by the type of picture and the temporal switching point information carried in the NAL unit header.
Packets containing parameter set NAL units and video coding layer (VCL) NAL units of the instantaneous decoder refresh (IDR) and clean random access (CRA) pictures were given the highest priority, followed by NAL units containing pictures used as reference pictures from which others can be predicted. The second method assigned a priority to each NAL unit based on the rate-distortion cost of the VCL coding units contained in the NAL unit. The sum of the rate-distortion costs of each coding unit contained in a NAL unit was used as the priority weighting. The preliminary results of extensive experiments have shown that both schemes offered an improvement in PSNR, when comparing original and decoded received streams, over uncontrolled packet loss. The first method consistently delivered a significant average improvement of 0.97 dB over the uncontrolled scenario, while the second method provided a measurable, but less consistent, improvement across the range of testing conditions and encoder configurations.
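The two prioritisation rules can be sketched as follows (the NAL-unit dictionary layout is invented for illustration; real NAL parsing is considerably more involved):

```python
def priority_by_type(nal):
    """Method 1 (illustrative): rank an RTP packet by the NAL unit it
    carries. Lower number = higher priority = dropped last."""
    if nal["type"] in ("PS", "IDR", "CRA"):   # parameter sets, random access
        return 0
    if nal.get("is_reference", False):        # pictures others predict from
        return 1
    return 2                                  # non-reference pictures

def priority_by_rd_cost(nal):
    """Method 2 (illustrative): sum of the rate-distortion costs of the
    coding units in the NAL unit; a larger sum means more important."""
    return sum(cu["rd_cost"] for cu in nal["coding_units"])

def drop_order(nals):
    """Under congestion, drop packets in ascending importance."""
    return sorted(nals, key=priority_by_rd_cost)
```

Method 1 needs only the NAL unit header, while method 2 needs per-coding-unit cost metadata from the encoder, which explains their different deployment costs.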
Low complexity DCT engine for image and video compression
In this paper, we define a low-complexity 2D-DCT architecture able to transform spatial pixels to spectral pixels while taking into account the constraints of the considered compression standard. Indeed, this work is our first attempt to obtain one reconfigurable multistandard DCT. Thanks to our new matrix decomposition, we could define one common 2D-DCT architecture. The constant multipliers can be configured to handle the case of RealDCT and/or IntDCT (multiplication by 2). Our optimized algorithm not only provides a reduction in computational complexity, but also leads to a scalable pipelined design in systolic arrays. Indeed, the 8×8 StdDCT can be computed by using the 4×4 StdDCT, which in turn can be obtained by calculating the 2×2 StdDCT. Besides, the proposed structure can be extended to deal with higher values of N (i.e., 16×16 and 32×32). The performance of the proposed architecture is better when compared with conventional designs. In particular, for N = 4, it is found that the proposed design has nearly one third the area-time complexity of existing DCT structures. This gain is expected to be higher for larger 2D-DCT sizes.
A CABAC codec of H.264/AVC with secure arithmetic coding
Nihel Neji, Maher Jridi, Ayman Alfalou, et al.
This paper presents an optimized H.264/AVC coding system for HDTV displays based on a typical flow with high coding efficiency and statistics adaptivity features. For high-quality streaming, the codec uses a binary arithmetic encoding/decoding algorithm with high complexity and a JVCE (joint video compression and encryption) scheme. In fact, particular attention is given to simultaneous compression and encryption applications to gain security without compromising the speed of transactions [1].

The proposed design allows us to encrypt the information using a pseudo-random number generator (PRNG). Thus we achieved the two operations (compression and encryption) simultaneously and in a dependent manner which is a novelty in this kind of architecture.

Moreover, we investigated the hardware implementation of the CABAC (context-based adaptive binary arithmetic coding) codec. The proposed architecture is based on an optimized binarizer/de-binarizer to handle videos with significant pixel rates at low cost and with high performance for the most frequent SEs. This was verified using HD video frames. The synthesis results obtained using an FPGA (Xilinx ISE) show that our design is suitable for coding main-profile video streams.
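The coupling of a compressed bit stream with a PRNG-driven keystream can be illustrated with a toy XOR cipher (`random.Random` stands in for the paper's PRNG and is not cryptographically secure; this shows only the shape of the scheme, not the CABAC-integrated design):

```python
import random

def prng_xor(data, seed):
    """XOR a byte stream with a keystream drawn from a seeded PRNG.
    Applying the same function twice with the same seed restores the
    plaintext, since (b ^ k) ^ k == b."""
    rng = random.Random(seed)
    return bytes(b ^ rng.randrange(256) for b in data)
```

Because the keystream is generated on the fly, encryption adds no size overhead to the compressed stream and can run in the same pass as entropy coding.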
A modified prediction scheme of the H.264 multiview video coding to improve the decoder performance
Ayman M. Hamadan, Hussein A. Aly, Mohamed M. Fouad, et al.
In this paper, we present a modified inter-view prediction scheme for multiview video coding (MVC). With more inter-view prediction, the number of reference frames required to decode a single view increases. Consequently, the data size for decoding a single view increases, thus impacting decoder performance. In this paper, we propose an MVC scheme that requires less inter-view prediction than the MVC standard scheme. The proposed scheme is implemented and tested on real multiview video sequences. Improvements are shown using the proposed scheme in terms of the average data size required either to decode a single view or to access any frame (i.e., random access), with comparable rate-distortion. It is compared to the MVC standard scheme and other improved techniques from the literature.
Interactive Paper Session
Achieving real-time capsule endoscopy (CE) video visualization through panoramic imaging
Steven Yi, Jean Xie, Peter Mui, et al.
In this paper, we present a novel real-time capsule endoscopy (CE) video visualization concept based on panoramic imaging. Typical CE videos run about 8 hours and are manually reviewed by physicians to locate diseases such as bleedings and polyps. To date, there is no commercially available tool capable of providing stabilized and processed CE video that is easy to analyze in real time. The burden on physicians' disease finding efforts is thus significant. In fact, since the CE camera sensor has a limited forward-looking view and a low image frame rate (typically 2 frames per second), and captures very close-range imaging of the GI tract surface, it is no surprise that traditional visualization methods based on tracking and registration often fail to work. This paper presents a novel concept for real-time CE video stabilization and display. Instead of directly working on traditional forward-looking FOV (field of view) images, we work on panoramic images to bypass many problems facing traditional imaging modalities. Methods for panoramic image generation based on optical lens principles, leading to real-time data visualization, are presented. In addition, non-rigid panoramic image registration methods are discussed.
Analysis and characterization of embedded vision systems for taxonomy formulation
The current trend in embedded vision systems is to propose bespoke solutions for specific problems, as each application has different requirements and constraints. There is no widely used model or benchmark that aims to facilitate generic solutions in embedded vision systems. Providing such a model is a challenging task due to the large number of use cases, environmental factors, and available technologies. However, common characteristics can be identified to propose an abstract model. Indeed, the majority of vision applications focus on the detection, analysis and recognition of objects. These tasks can be reduced to vision functions which can be used to characterize vision systems. In this paper, we present the results of a thorough analysis of a large number of different types of vision systems. This analysis led us to the development of a system taxonomy, in which a number of vision functions as well as their combinations characterize embedded vision systems. To illustrate the use of this taxonomy, we have tested it against a real vision system that detects magnetic particles in a flowing liquid to predict and avoid critical machinery failure. The proposed taxonomy is evaluated using a quantitative parameter which shows that it covers 95 percent of the investigated vision systems, and that its flow is ordered for 60 percent of the systems. This taxonomy will serve as a tool for the classification and comparison of systems and will enable researchers to propose generic and efficient solutions for the same class of systems.
Design and implementation of a real-time image registration in an infrared search and track system
Fu-yuan Xu, Guo-hua Gu, Tie-kun Zhao, et al.
In order to realize fast search and tracking of dim ground targets under moving infrared detector conditions, an infrared image registration method is proposed that combines phase correlation registration with feature point registration, enabling infrared image motion compensation. After using phase correlation for rough registration of the moving infrared images, high-precision registration is achieved using the a priori information of the feature-point image registration, combining the strengths of phase correlation registration and feature point registration. Experimental results validate that the algorithm provides the infrared image displacement information and improves the detection rate and accuracy for small targets, without reducing the registration accuracy of the real-time IRST system.
Binary video codec for data reduction in wireless visual sensor networks
Khursheed Khursheed, Naeem Ahmad, Muhammad Imran, et al.
Wireless Visual Sensor Networks (WVSNs) are formed by deploying many Visual Sensor Nodes (VSNs) in the field. Typical applications of WVSNs include environmental monitoring, health care, industrial process monitoring, and stadium/airport monitoring for security reasons, among many others. The energy budget in outdoor applications of WVSNs is limited to batteries, and frequent replacement of batteries is usually not desirable. So the processing as well as the communication energy consumption of the VSN needs to be optimized in such a way that the network remains functional for a long duration. The images captured by a VSN contain a huge amount of data and require efficient computational resources for processing the images and wide communication bandwidth for the transmission of the results. Image processing algorithms must be designed and developed in such a way that they are computationally less complex and provide a high compression rate. For some applications of WVSNs, the captured images can be segmented into bi-level images, and hence bi-level image coding methods will efficiently reduce the information amount in these segmented images. But the compression rate of bi-level image coding methods is limited by the underlying compression algorithm. Hence there is a need to design other intelligent and efficient algorithms which are computationally less complex and provide a better compression rate than bi-level image coding methods. Change coding is one such algorithm: it is computationally less complex (requiring only exclusive-OR operations) and provides better compression efficiency than image coding, but it is effective only for applications with slight changes between adjacent frames of the video. The detection and coding of Regions of Interest (ROIs) in the change frame efficiently reduces the information amount in the change frame.
However, if the number of objects in the change frames is higher than a certain level, then the compression efficiency of both change coding and ROI coding becomes worse than that of image coding. This paper explores the compression efficiency of the Binary Video Codec (BVC) for data reduction in WVSNs. We propose to implement all three compression techniques, i.e. image coding, change coding and ROI coding, at the VSN and then select the smallest bit stream among the results of the three compression techniques. In this way the compression performance of the BVC will never become worse than that of image coding. We conclude that the compression efficiency of BVC is always better than that of change coding and is always better than or equal to that of ROI coding and image coding.
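The BVC mode selection — code the frame, code the change frame, keep the smaller result — can be sketched with a toy run-length code (ROI coding is omitted here for brevity; the code and function names are illustrative):

```python
def rle_bits(bits):
    """Toy run-length code for a binary sequence: the list of run
    lengths, alternating starting from the first symbol."""
    runs, count = [], 1
    for a, b in zip(bits, bits[1:]):
        if a == b:
            count += 1
        else:
            runs.append(count)
            count = 1
    runs.append(count)
    return runs

def bvc_encode(prev_frame, frame):
    """Pick the cheaper of image coding (code the frame directly) and
    change coding (XOR with the previous frame, then code), mirroring
    how BVC keeps the smallest bit stream among its coding modes."""
    image_code = rle_bits(frame)
    change = [a ^ b for a, b in zip(prev_frame, frame)]   # only XORs
    change_code = rle_bits(change)
    if len(change_code) <= len(image_code):
        return ("change", change_code)
    return ("image", image_code)
```

With few changed pixels the XOR frame is almost all zeros and compresses into very few runs, while a busy change frame falls back to image coding, so the selected stream is never larger than plain image coding.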
Determinant of homography-matrix-based multiple-object recognition
Nagachetan Bangalore, Madhu Kiran, Anil Suryaprakash
Finding a given object in an image or a sequence of frames is one of the fundamental computer vision challenges. Humans can recognize a multitude of objects with little effort despite scale, lighting and perspective changes. A robust computer vision based object recognition system is achievable only if a considerable tolerance to change in scale, rotation and light is achieved. Partial occlusion tolerance is also of paramount importance in order to achieve robust object recognition in real-time applications. In this paper, we propose an effective method for recognizing a given object from a class of trained objects in the presence of partial occlusions and considerable variance in scale, rotation and lighting conditions. The proposed method can also identify the absence of a given object from the class of trained objects. Unlike conventional methods for object recognition based on key feature matches between the training image and a test image, the proposed algorithm utilizes a statistical measure from the homography-transform-based resultant matrix to determine an object match. The magnitude of the determinant of the homography matrix obtained by the homography transform between the test image and the set of training images is used as the criterion to recognize the object contained in the test image. The magnitude of the determinant is found to be very near zero (i.e., less than 0.005) for out-of-class objects and to range between 0.05 and 1 for in-class objects. Hence, an out-of-class object can be identified by applying a low-threshold criterion to the magnitude of the determinant obtained. The proposed method has been extensively tested on a large database containing about 100 similar and difficult objects, giving positive results for both out-of-class and in-class object recognition scenarios. The overall system performance has been documented to be about 95% accurate for a varied range of testing scenarios.
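The decision rule described above reduces to a threshold on |det(H)|; a minimal sketch follows (estimating the homography itself, e.g. via RANSAC on feature matches, is outside this sketch, and the threshold value follows the figures quoted in the abstract):

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def classify_by_homography(h_matrix, threshold=0.005):
    """|det(H)| below a small threshold indicates the homography is
    near-degenerate, i.e. the test object is out-of-class; otherwise
    the object is declared in-class."""
    return "in-class" if abs(det3(h_matrix)) >= threshold else "out-of-class"
```

A degenerate homography (rank-deficient, as produced by inconsistent feature matches) has a determinant near zero, which is the statistical cue the method exploits.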
Investigating the structure preserving encryption of high efficiency video coding (HEVC)
This paper presents a novel method for the real-time protection of the emerging High Efficiency Video Coding (HEVC) standard. Structure-preserving selective encryption is performed in the CABAC entropy coding module of HEVC, which is significantly different from the CABAC entropy coding of H.264/AVC. In the CABAC of HEVC, exponential Golomb coding is replaced by truncated Rice (TR) coding up to a specific value for the binarization of transform coefficients. Selective encryption is performed using the AES cipher in cipher feedback mode on a plaintext of binstrings in a context-aware manner. The encrypted bitstream has exactly the same bit-rate and is format compliant. Experimental evaluation and security analysis of the proposed algorithm is performed on several benchmark video sequences containing different combinations of motion, texture and objects.
A computationally efficient approach to 3D point cloud reconstruction
C.-H. Chang, N. Kehtarnavaz, K. Raghuram, et al.
This paper addresses the computational efficiency of the 3D point cloud reconstruction pipeline for uncalibrated image sequences. In existing pipelines, bundle adjustment is carried out globally, which is quite time-consuming since the computational complexity keeps growing as the number of image frames increases. Furthermore, a searching and sorting algorithm must be used to store feature points and 3D locations. To reduce the computational complexity of the 3D point cloud reconstruction pipeline, a local refinement approach is introduced in this paper. The results obtained indicate that the introduced local refinement improves computational efficiency as compared to global bundle adjustment.
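The contrast between global and local refinement can be illustrated with a toy example: instead of re-solving over all frames, only a fixed-size window of recent frames is used, so the per-point cost stays constant as the sequence grows. This is a hypothetical orthographic-camera sketch under simplified assumptions, not the paper's actual pipeline.

```python
import numpy as np

def refine_point_local(observations, window=3):
    """Refine one 3D point using only the most recent `window` frames
    (a toy analogue of local vs. global bundle adjustment).

    observations: list of (P, uv) pairs, where P is a 2x3 orthographic
    projection matrix and uv is the observed 2D point in that frame.
    """
    recent = observations[-window:]           # local: fixed-size window
    A = np.vstack([P for P, _ in recent])     # stack the 2x3 projections
    b = np.concatenate([uv for _, uv in recent])
    X, *_ = np.linalg.lstsq(A, b, rcond=None)
    return X

# Toy data: one known 3D point seen by several random orthographic cameras.
rng = np.random.default_rng(0)
X_true = np.array([1.0, -2.0, 0.5])
obs = []
for _ in range(10):
    P = rng.standard_normal((2, 3))
    obs.append((P, P @ X_true))
X_hat = refine_point_local(obs, window=3)
```

With a global solve, the least-squares system grows with every new frame; with the windowed solve above, each refinement touches a bounded amount of data regardless of sequence length, which is the efficiency argument the abstract makes.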
TDC-based readout electronics for real-time acquisition of high resolution PET bio-images
N. Marino, S. Saponara, G. Ambrosi, et al.
Positron emission tomography (PET) is a clinical and research tool for in vivo metabolic imaging. The demand for better image quality entails continuous research to improve PET instrumentation. In clinical applications, PET image quality benefits from the time-of-flight (TOF) feature. Indeed, by measuring the photons' arrival times at the detectors with a resolution better than 100 ps, the annihilation point can be estimated with centimeter resolution. This leads to better noise level, contrast, and clarity of detail in the images, using either analytical or iterative reconstruction algorithms. This work discusses a silicon photomultiplier (SiPM)-based, magnetic-field-compatible TOF-PET module with depth-of-interaction (DOI) correction. The detector features a 3D architecture with two tiles of SiPMs coupled to a single LYSO scintillator on both of its faces. The real-time front-end electronics is based on a current-mode ASIC in which a low-input-impedance, fast current buffer allows the required time resolution to be achieved. A pipelined time-to-digital converter (TDC) measures and digitizes the arrival time and the energy of the events with timestamp resolutions of 100 ps and 400 ps, respectively. An FPGA clusters the data and evaluates the DOI, with a simulated z resolution of the PET image of 1.4 mm FWHM.
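The centimeter claim follows directly from the TOF relation Δx = c·Δt/2: a 100 ps arrival-time difference maps to roughly 15 mm along the line of response. A small sketch, in which `annihilation_offset_mm` is a hypothetical helper and the 100 ps bin is the TDC timestamp resolution from the abstract:

```python
C_MM_PER_PS = 0.299792458  # speed of light, in mm per picosecond

def annihilation_offset_mm(t1_ps: float, t2_ps: float, bin_ps: float = 100.0) -> float:
    """Estimate the annihilation point's offset (in mm) from the midpoint
    of the line of response, given the two photon arrival times. The times
    are first quantized to the TDC timestamp bin (100 ps in the paper)."""
    q1 = round(t1_ps / bin_ps) * bin_ps
    q2 = round(t2_ps / bin_ps) * bin_ps
    return C_MM_PER_PS * (q2 - q1) / 2.0

# One 100 ps timestamp step corresponds to c * 100 ps / 2 ~ 15 mm, which is
# why sub-100 ps timing gives centimeter-level localization.
offset = annihilation_offset_mm(0.0, 100.0)
```

This localization constrains the reconstruction along each line of response, which is what improves the noise level regardless of whether an analytical or iterative algorithm is used.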
A visibility improvement technique for fog images suitable for real-time application
Yoshitaka Toyoda, Daisuke Suzuki, Koichi Yamashita, et al.
Cameras used in outdoor scenes require high visibility performance under various environmental conditions. We present a visibility improvement technique that can improve the visibility of images captured in bad weather such as fog and haze, and that is also suitable for real-time processing in surveillance and vehicle cameras. Our algorithm enhances contrast pixel by pixel according to the brightness and sharpness of neighboring pixels. To reduce computational costs, we precompute the adaptive functions that determine the contrast gain from the brightness and sharpness of neighboring pixels. We optimize these functions on sets of fog images and examine how well they predict the fog-degraded areas, using both qualitative and quantitative assessment. We demonstrate that our method can prevent excessive correction in areas without fog, suppressing noise amplification in sky or shadow regions, while applying powerful correction to the fog-degraded areas. In comparison with other real-time-oriented methods, our method can reproduce clear-day visibility while preserving gradation in shadows and highlights as well as the naturalness of the original image. With its low computational cost, the algorithm can be compactly implemented in hardware and is thus applicable to a wide range of video equipment for visibility improvement in surveillance cameras, vehicle cameras, and displays.
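The per-pixel gain idea can be sketched as below. Everything here is illustrative: the window statistics, the shape of `fog_gain`, and all constants are assumptions standing in for the paper's optimized adaptive functions. Only the principle, that bright, low-sharpness neighborhoods (likely fog) receive strong gain while dark or textured neighborhoods receive gain near 1, comes from the abstract.

```python
import numpy as np

def local_stats(img, k=7):
    """Neighborhood brightness (box mean) and sharpness (mean absolute
    gradient) over a k x k window -- a stand-in for whatever window
    statistics the paper actually uses."""
    pad = k // 2
    p = np.pad(img, pad, mode="edge")
    brightness = np.lib.stride_tricks.sliding_window_view(p, (k, k)).mean(axis=(-1, -2))
    gy, gx = np.gradient(img)
    pg = np.pad(np.abs(gx) + np.abs(gy), pad, mode="edge")
    sharpness = np.lib.stride_tricks.sliding_window_view(pg, (k, k)).mean(axis=(-1, -2))
    return brightness, sharpness

def fog_gain(brightness, sharpness):
    """Hypothetical precomputable gain function: bright + flat -> strong
    gain (up to 2.5); dark or textured -> gain near 1 (no over-correction)."""
    fogginess = brightness * (1.0 - np.clip(sharpness * 10.0, 0.0, 1.0))
    return 1.0 + 1.5 * fogginess

def enhance(img, k=7):
    """Per-pixel contrast stretch about the local mean, scaled by the gain."""
    b, s = local_stats(img, k)
    return np.clip(b + fog_gain(b, s) * (img - b), 0.0, 1.0)

# A flat bright patch (fog-like) gets a high gain everywhere.
foggy = np.full((16, 16), 0.8)
b, s = local_stats(foggy)
gain = fog_gain(b, s)
```

Since the gain depends only on two local statistics, it can be read from a precomputed lookup table per pixel, which is what makes this style of correction hardware-friendly.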
Fast non-blind deconvolution based on 2D point spread function database for real-time ultrasound imaging
Jooyoung Kang, Sung-Chan Park, Kyuhong Kim, et al.
In ultrasound medical imaging, the blurring introduced by the scanner is characterized by the point spread function (PSF), which describes the response of the imaging system to a point source distribution. De-blurring can therefore be achieved by deconvolving the image with an estimate of the PSF. However, an accurate PSF estimate is hard to obtain because the properties of the human tissue through which the ultrasound signal propagates are unknown. In addition, estimating the PSF and deconvolving the image with it is too computationally complex for real-time implementation in an ultrasound imaging system. Conventional ultrasound image restoration methods therefore rely on a simple 1D PSF estimate [8], which restores the axial direction only and brings no improvement in the lateral direction; 2D PSF estimation and restoration, owing to their high complexity, are not widely used. In this paper, we propose a new method that selects a 2D PSF (estimated for the average speed of sound and depth) while performing fast non-blind 2D deconvolution in the ultrasound imaging system. Our algorithm works on beam-formed, uncompressed radio-frequency data, using a database of 2D PSFs pre-measured and estimated from the actual probe in use. The database classifies the PSFs by depth (about 5 different depths) and speed of sound (about 1450 or 1540 m/s). Using a minimum-variance criterion and a simple Wiener filter, we present a novel way to select the optimal 2D PSF from this database. For the deconvolution step with the chosen PSF, we focus on keeping the complexity low.
We therefore use a Wiener filter together with a fast deconvolution technique based on hyper-Laplacian priors [11], [12], which is several orders of magnitude faster than existing techniques using such priors. To prevent discontinuities between the separately restored depth regions, we apply piecewise linear interpolation over their overlapping regions. We have tested our algorithm on a Verasonics system and a commercial ultrasound scanner (Philips C4-2), using phantoms with known speed of sound and in vivo scans with unknown speeds. Using the real PSF of the transducer in use, our algorithm produces a better restoration of the ultrasound image than deconvolution with a simulated PSF, and its low complexity suits real-time ultrasound imaging. The method is robust and easy to implement, making it a realistic candidate for real-time deployment.
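The main ingredients, a PSF database keyed by depth and speed of sound, FFT-based Wiener deconvolution, and piecewise-linear blending of overlapping depth regions, can be sketched as below. The Gaussian PSFs, the database keys, and all function names are hypothetical stand-ins: the paper's PSFs are measured from the actual probe, and its fast hyper-Laplacian-prior step is not reproduced here.

```python
import numpy as np

def gaussian_psf(shape, sigma):
    """Full-frame, centered, unit-sum Gaussian PSF (a toy stand-in for the
    pre-measured probe PSFs in the paper's database)."""
    h, w = shape
    y, x = np.mgrid[:h, :w]
    g = np.exp(-(((y - h // 2) ** 2 + (x - w // 2) ** 2) / (2.0 * sigma ** 2)))
    return g / g.sum()

def wiener_deconvolve(img, psf, nsr=1e-3):
    """FFT-based Wiener filter: X = conj(H) * Y / (|H|^2 + NSR),
    where `nsr` is an assumed noise-to-signal ratio."""
    H = np.fft.fft2(np.fft.ifftshift(psf))
    Y = np.fft.fft2(img)
    return np.real(np.fft.ifft2(np.conj(H) * Y / (np.abs(H) ** 2 + nsr)))

def blend_depth_regions(upper, lower, overlap):
    """Piecewise-linear interpolation over the overlapping rows of two
    independently restored depth regions, to avoid visible seams."""
    w = np.linspace(1.0, 0.0, overlap)[:, None]
    mix = w * upper[-overlap:] + (1.0 - w) * lower[:overlap]
    return np.vstack([upper[:-overlap], mix, lower[overlap:]])

# Hypothetical database keyed by (depth index, speed of sound in m/s),
# mirroring the ~5 depths x 2 sound speeds described in the abstract.
shape = (32, 32)
psf_db = {(d, c): gaussian_psf(shape, 1.0 + 0.5 * d)
          for d in range(5) for c in (1450, 1540)}

rng = np.random.default_rng(1)
sharp = rng.random(shape)
psf = psf_db[(2, 1540)]                       # PSF looked up for this region
H = np.fft.fft2(np.fft.ifftshift(psf))
blurred = np.real(np.fft.ifft2(np.fft.fft2(sharp) * H))   # simulated blur
restored = wiener_deconvolve(blurred, psf)
```

Because each depth region is deconvolved with its own PSF, the linear blend over the overlap is what keeps the reassembled image free of discontinuities at region boundaries.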