Proceedings Volume 8650

Three-Dimensional Image Processing (3DIP) and Applications 2013

Atilla M. Baskurt, Robert Sitnik
View the digital version of this volume at the SPIE Digital Library.

Volume Details

Date Published: 29 March 2013
Contents: 9 Sessions, 32 Papers, 0 Presentations
Conference: IS&T/SPIE Electronic Imaging 2013
Volume Number: 8650

Table of Contents


All links to SPIE Proceedings will open in the SPIE Digital Library.
  • Front Matter: Volume 8650
  • 3D Data Preprocessing
  • Time-Of-Flight Data, Depth Maps Analysis
  • 3D Data Processing for Specific Applications
  • Scene Analysis: From 2D Views to 3D Reconstruction and Interpretation
  • 3D Imaging Systems
  • 3D Analysis, Feature Extraction, Segmentation, and Pattern Recognition
  • Quality Assessment
  • Interactive Paper Session
Front Matter: Volume 8650
Front Matter: Volume 8650
This PDF file contains the front matter associated with SPIE Proceedings Volume 8650, including the Title Page, Copyright information, Table of Contents, and Conference Committee listing.
3D Data Preprocessing
Depth image post-processing method by diffusion
Yun Li, Mårten Sjöström, Ulf Jennehag, et al.
Multi-view three-dimensional television relies on view synthesis to reduce the number of views being transmitted. Arbitrary views can be synthesized by utilizing corresponding depth images with textures. The depth images obtained from stereo pairs or range cameras may contain erroneous values, which cause artifacts in a rendered view. Post-processing may then be applied to enhance the depth image, with the aim of achieving better quality in the synthesized views. We propose a Partial Differential Equation (PDE)-based interpolation method that reconstructs the smooth areas of depth images while preserving significant edges. We modeled the depth image by adjusting thresholds for edge detection and a uniform sparse sampling factor, followed by second-order PDE interpolation. The objective results show that a depth image processed by the proposed method can achieve better quality of synthesized views than the original depth image. Visual inspection confirmed the results.
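As a rough illustration of this family of methods (not the authors' exact pipeline), the sketch below keeps edge pixels and a uniform sparse sampling of the depth image as anchors and fills every other pixel by Jacobi relaxation of the second-order Laplace equation; the threshold, sampling factor, and iteration count are illustrative assumptions.

```python
# Hedged sketch: Laplace (second-order PDE) reconstruction of a depth image
# from preserved edge pixels plus uniform sparse samples.
import numpy as np
from scipy import ndimage

def pde_reconstruct(depth, edge_thresh=10.0, sample_step=8, iters=500):
    d = depth.astype(float)
    grad = np.hypot(ndimage.sobel(d, axis=0), ndimage.sobel(d, axis=1))
    edges = grad > edge_thresh                    # significant edges to preserve
    anchors = np.zeros_like(edges)
    anchors[::sample_step, ::sample_step] = True  # uniform sparse sampling
    known = edges | anchors
    u = np.where(known, d, d.mean())
    for _ in range(iters):                        # Jacobi relaxation of Laplace eq.
        avg = 0.25 * (np.roll(u, 1, 0) + np.roll(u, -1, 0)
                      + np.roll(u, 1, 1) + np.roll(u, -1, 1))
        u = np.where(known, u, avg)               # update only the unknown pixels
    return u  # note: np.roll wraps at borders, acceptable for a sketch
```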
Beta-function B-spline smoothing on triangulations
Lubomir T. Dechevsky, Peter Zanaty
In this work we investigate a novel family of Ck-smooth rational basis functions on triangulations for fitting, smoothing, and denoising geometric data. The introduced basis function is closely related to a recently introduced general method utilizing generalized expo-rational B-splines, which provides Ck-smooth convex resolutions of unity on very general disjoint partitions and overlapping covers of multidimensional domains with complex geometry. One of the major advantages of this new triangular construction is its locality with respect to the star-1 neighborhood of the vertex at which the basis function provides Hermite interpolation. This locality can in turn be utilized in adaptive methods where, for instance, a local refinement of the underlying triangular mesh affects only the refined domain, whereas in other methods one needs to investigate what changes occur outside of the refined domain. Both the triangular and the general smooth constructions have the potential to become a new versatile tool for Computer Aided Geometric Design (CAGD), Finite and Boundary Element Analysis (FEA/BEA), and Isogeometric Analysis (IGA).
Time-Of-Flight Data, Depth Maps Analysis
Evaluation of efficient high quality depth upsampling methods for 3DTV
High quality 3D content generation requires high quality depth maps. In practice, depth maps generated by stereo matching, depth sensing cameras, or decoders have a low resolution and suffer from unreliable estimates and noise. Therefore, depth post-processing is necessary. In this paper we benchmark state-of-the-art filter-based depth upsampling methods on depth accuracy and interpolation quality by conducting a parameter space search to find the optimum set of parameters for various upscale factors and noise levels. Additionally, we analyze each method's computational complexity with big O notation, and we measure the runtime of the GPU implementation that we built for each method.
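For readers unfamiliar with the benchmarked family, the sketch below shows one representative filter-based upsampler, joint bilateral upsampling, in which a high-resolution grayscale guide image steers the interpolation of a low-resolution depth map; all parameter values are illustrative assumptions, and the O(HWr²) cost of this naive loop is the kind of complexity the paper analyzes.

```python
# Hedged sketch of joint bilateral depth upsampling with a grayscale guide.
import numpy as np

def joint_bilateral_upsample(depth_lr, guide_hr, scale, radius=4,
                             sigma_s=2.0, sigma_r=10.0):
    H, W = guide_hr.shape
    out = np.zeros((H, W))
    for y in range(H):
        for x in range(W):
            acc = wsum = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yn = min(max(y + dy, 0), H - 1)          # clamp at borders
                    xn = min(max(x + dx, 0), W - 1)
                    ws = np.exp(-(dx*dx + dy*dy) / (2 * sigma_s**2))  # spatial weight
                    dg = float(guide_hr[y, x]) - float(guide_hr[yn, xn])
                    wr = np.exp(-dg*dg / (2 * sigma_r**2))            # guide range weight
                    acc += ws * wr * depth_lr[yn // scale, xn // scale]
                    wsum += ws * wr
            out[y, x] = acc / wsum
    return out
```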
Multiview ToF sensor fusion technique for high-quality depth map
Deukhyeon Kim, Jinwook Choi, Kwanghoon Sohn
The Time-of-Flight (ToF) sensor has been widely used in computer vision fields since it can provide depth information in real time. However, the depth map obtained from a ToF sensor is degraded by errors and has a lower resolution than general cameras. In this paper, we propose a novel framework to fuse and upsample multi-view depth maps obtained from multiple ToF sensors. The proposed method is robust to camera calibration error and can be effectively applied to the Multi-view Video plus Depth (MVD) system. To this end, we perform depth balancing and confidence-map-based multi-view depth fusion. The depth balancing adjusts the distribution of depth values between multiple ToF sensors, providing coherent depth for the corresponding points between depth maps. The confidence-map-based multi-view depth fusion technique corrects depth acquisition errors and aligns multiple depth maps well with the corresponding color image by using only reliable depth values. Experimental results show that the proposed method using multiple ToF sensors is superior to the conventional method based on the 2D-plus-depth system consisting of one color camera and one depth sensor.
Time-of-flight depth image enhancement using variable integration time
Sun Kwon Kim, Ouk Choi, Byongmin Kang, et al.
Time-of-Flight (ToF) cameras are used for a variety of applications because they deliver depth information at a high frame rate. These cameras, however, suffer from challenging problems such as noise and motion artifacts. To increase the signal-to-noise ratio (SNR), the camera should calculate a distance based on a large amount of infrared light, which needs to be integrated over a long time. On the other hand, the integration time should be short enough to suppress motion artifacts. We propose a ToF depth imaging method that combines the advantages of short and long integration times, exploiting an image fusion scheme proposed for color imaging. To calibrate depth differences due to the change of integration times, a depth transfer function is estimated by analyzing the joint histogram of depths in the two images of different integration times. The depth images are then transformed into wavelet domains and fused into a depth image with suppressed noise and low motion artifacts. To evaluate the proposed method, we captured a moving bar of a metronome with different integration times. The experiment shows that the proposed method can effectively remove motion artifacts while preserving an SNR comparable to that of depth images acquired with the long integration time.
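A minimal sketch of the joint-histogram step, assuming the two depth images are co-registered: each short-integration depth bin is mapped to its most likely long-integration depth, and the resulting curve is applied as a lookup table (the bin count and depth range are illustrative assumptions).

```python
# Hedged sketch: estimate a depth transfer function from the joint
# histogram of short- and long-integration depth images.
import numpy as np

def depth_transfer(depth_short, depth_long, bins=256, dmax=10.0):
    h, _, ye = np.histogram2d(depth_short.ravel(), depth_long.ravel(),
                              bins=bins, range=[[0, dmax], [0, dmax]])
    centers = 0.5 * (ye[:-1] + ye[1:])
    transfer = centers[np.argmax(h, axis=1)]   # most likely long depth per short bin
    idx = np.clip((depth_short / dmax * bins).astype(int), 0, bins - 1)
    return transfer[idx]                       # short-exposure depths, re-calibrated
```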
Pseudo-random modulation for multiple 3D time-of-flight camera operation
Dong-Ki Min, Ilia Ovsiannikov, Yohwan Noh, et al.
3D time-of-flight depth cameras utilize modulated light sources to detect the distance to objects as phase information. A serious limitation may exist in cases when multiple depth time-of-flight cameras are imaging the same scene simultaneously. The interference caused by the multiple modulated light sources can severely distort captured depth images. To prevent this problem and enable concurrent 3D multi-camera imaging, we propose modulating the camera light source and demodulating the received signal using sequences of pulses, where the phase of each sequence is varied in a pseudo-random fashion. The proposed algorithm is mathematically derived and proved by experiment.
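The core idea can be illustrated with a toy simulation (the block count, sequence shapes, and phase values below are illustrative assumptions, not the paper's derivation): a camera's pseudo-randomly re-phased pulse train stays correlated with its own demodulation reference, while a foreign camera's light decorrelates on average.

```python
# Hedged toy model of pseudo-random phase hopping between two ToF cameras.
import numpy as np

rng = np.random.default_rng(0)
n_blocks, samples = 256, 64
t = np.arange(samples) / samples

def pulse_blocks(phase_hops):
    # square-wave pulse train with one pseudo-random phase per block
    return np.array([np.sign(np.sin(2 * np.pi * (t + p))) for p in phase_hops])

hops_a = rng.random(n_blocks)              # camera A's phase sequence
hops_b = rng.random(n_blocks)              # interfering camera B's sequence
ref_a = pulse_blocks(hops_a)               # A's demodulation reference
sig_a = pulse_blocks(hops_a + 0.1)         # A's own return, delayed by scene depth
sig_b = pulse_blocks(hops_b + 0.3)         # B's light landing on A's sensor

print(np.mean(ref_a * sig_a))              # depth-bearing correlation survives (~0.6)
print(np.mean(ref_a * sig_b))              # interference averages toward zero
```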
Lossy contour-coding in segmentation-based intra-depth map coding
Jan Hanca, Adrian Munteanu, Peter Schelkens
Efficient depth-map coding is of paramount importance in next generation 3D video applications such as 3DTV and free viewpoint video. In this paper we propose a novel intra depth map coding system employing optimized segmentation procedures suitable for depth maps, followed by lossy or lossless contour coding techniques. In lossy mode, our method performs Hausdorff-distance constrained coding, by which the distance between the actual and decoded contours is upper-bounded by a user-defined bound. The trade-off between contour location accuracy and coding performance is analyzed. Experimental results show that, on average, lossy coding outperforms lossless contour coding and should be considered in all segmentation-based depth map coding systems. The comparison against JPEG-2000 shows that the proposed system is a viable alternative for light intra coding of depth maps.
3D Data Processing for Specific Applications
Estimation of spreading fire geometrical characteristics using near infrared stereovision
L. Rossi, T. Toulouse, M. Akhloufi, et al.
In fire research and forest firefighting, there is a need for robust metrological systems able to estimate the geometrical characteristics of outdoor spreading fires. Recent years have seen increased interest within wildfire research in developing non-destructive techniques based on computer vision. This paper presents a new approach for the estimation of fire geometrical characteristics using near infrared stereovision. Spreading-fire characteristics such as position, rate of spread, height, and surface are estimated from the computed 3D fire points. The proposed system can track fire spread over a ground area of 5 m × 10 m.
Keywords: near infrared, stereovision, spreading fire, geometrical characteristics
Uniform grid upsampling of 3D lidar point cloud data
Airborne laser scanning light detection and ranging (LiDAR) systems are used for remote sensing topology and bathymetry. The most common data collection technique used in LiDAR systems employs linear mode scanning. The resulting scanning data form a non-uniformly sampled 3D point cloud. To interpret and further process the 3D point cloud data, these raw data are usually converted to digital elevation models (DEMs). In order to obtain DEMs in a uniform and upsampled raster format, the elevation information from the available non-uniform 3D point cloud data is mapped onto uniform grid points. After the mapping is done, the grid points with missing elevation information are filled by using interpolation techniques. In this paper, a partial differential equation (PDE)-based approach is proposed to perform the interpolation and to upsample the 3D point cloud onto a uniform grid. Due to the desirable effects of using higher-order PDEs, smoothness is maintained over homogeneous regions, while sharp edge information in the scene is well preserved. The proposed algorithm reduces the draping effects near the edges of distinctive objects in the scene. Such annoying draping effects are commonly associated with existing point cloud rendering algorithms. Simulation results are presented in this paper to illustrate the advantages of the proposed algorithm.
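A minimal sketch of the gridding stage, with a standard linear interpolant standing in for the paper's higher-order PDE fill (the cell size and the one-point-per-cell policy are illustrative assumptions):

```python
# Hedged sketch: rasterize a non-uniform LiDAR point cloud to a uniform DEM
# and fill the empty cells by interpolation.
import numpy as np
from scipy.interpolate import griddata

def points_to_dem(points, cell=1.0):           # points: (N, 3) array of x, y, z
    x, y, z = points.T
    xi = ((x - x.min()) / cell).astype(int)
    yi = ((y - y.min()) / cell).astype(int)
    dem = np.full((yi.max() + 1, xi.max() + 1), np.nan)
    dem[yi, xi] = z                            # last point per cell wins
    missing = np.isnan(dem)
    ky, kx = np.nonzero(~missing)
    my, mx = np.nonzero(missing)
    dem[missing] = griddata(np.c_[ky, kx], dem[ky, kx], np.c_[my, mx],
                            method='linear')   # cells outside the hull stay NaN
    return dem
```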
Improvements on a MMI-based method for automatic texture mapping of 3D dense models
P. Ferrara, F. Uccheddu, A. Pelagotti
Maximization of Mutual Information routines have proved suitable for registration of multimodal images. Here a method is proposed to select, from a set of candidates, the image with the closest resemblance to a given external one. The algorithm is intended to serve within a wider-scope procedure for the automatic texturing of 3D models, where the initial 2D-3D registration problem is shifted to a 2D-2D registration challenge. In order to improve its performance, a number of variations in the way the Mutual Information is computed are introduced, and a method to judge its reliability is proposed.
Scene Analysis: From 2D Views to 3D Reconstruction and Interpretation
Analysis of weighting of normals for spherical harmonic cross-correlation
Spherical harmonic cross-correlation is a robust registration technique that uses the normals of two overlapping meshes to bring them into coarse rotational alignment. The amount of overlap between the two meshes is the primary determinant of whether spherical harmonic cross-correlation achieves correct registration. By weighting each normal or cluster of normals, its contribution to the registration is influenced, allowing beneficial normals to be emphasized and others deemphasized. In this paper we evaluate how different weighting schemes impact registration efficiency and accuracy. It is found that two of the proposed weighting schemes are capable of correctly registering 22% of the mesh pairs, while the baseline, which weighted all normals equally, registered 14% of the mesh pairs. Using Fibonacci binning to equally weight surfaces provided the best all-round advantage, especially if efficiency is considered, as binning allows spherical harmonics to be pre-computed. By increasing the threshold applied to the weighting schemes, meshes with minimal overlap can be registered, in one case with only 2% overlap. The performed analysis shows that weighting normals, when applied in a conducive manner, can achieve considerable improvements in registration accuracy.
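A sketch of the Fibonacci-binning idea under simple assumptions (unit-length normals; the bin count is arbitrary): normals falling into the same bin on the sphere share one vote, so large flat surfaces no longer dominate the cross-correlation.

```python
# Hedged sketch: weight mesh normals by Fibonacci binning on the sphere.
import numpy as np

def fibonacci_sphere(n):
    i = np.arange(n)
    phi = np.pi * (3.0 - np.sqrt(5.0)) * i        # golden-angle increments
    z = 1.0 - 2.0 * (i + 0.5) / n
    r = np.sqrt(1.0 - z * z)
    return np.c_[r * np.cos(phi), r * np.sin(phi), z]

def bin_weights(normals, n_bins=512):             # normals: (N, 3), unit length
    bins = fibonacci_sphere(n_bins)
    idx = np.argmax(normals @ bins.T, axis=1)     # nearest bin by dot product
    counts = np.bincount(idx, minlength=n_bins)
    return 1.0 / counts[idx]                      # each occupied bin contributes equally
```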
Edge-aided virtual view rendering for multiview video plus depth
Depth-Image-Based Rendering (DIBR) of virtual views is a fundamental method in three-dimensional (3D) video applications to produce different perspectives from texture and depth information, in particular from the multi-view-plus-depth (MVD) format. Artifacts are still present in virtual views as a consequence of imperfect rendering using existing DIBR methods. In this paper, we propose an alternative DIBR method for MVD. In the proposed method we introduce edge pixels and interpolate pixel values in the virtual view using the actual projected coordinates from two adjacent views, by which cracks and disocclusions are automatically filled. In particular, we propose a method to merge pixel information from two adjacent views in the virtual view before the interpolation: we apply a weighted averaging of projected pixels within the range of one pixel in the virtual view. We compared virtual view images rendered by the proposed method to the corresponding view images rendered by state-of-the-art methods. Objective metrics demonstrated an advantage of the proposed method for most investigated media contents. Subjective test results showed preference for different methods depending on media content, and the test could not demonstrate a significant difference between the proposed method and state-of-the-art methods.
3D Imaging Systems
Novel calibration procedure for SVBRDF estimation of 3D objects using directional illumination and structured light projection
Jakub F. Krzesłowski, Robert Sitnik, Grzegorz Mączkowski
Estimation of the geometry and reflectance of 3D objects requires that surface geometry be registered together with photometric data. We present a method which combines geometrical camera calibration and photometric calibration into a single procedure utilizing only one calibration target. Using structured light projection and directional illumination, the surface of a 3D object can be registered with an integrated measuring device. To estimate the spatial distribution of reflectance parameters, a Spatially Varying Bidirectional Reflectance Distribution Function (SVBRDF) model is used. We also show a 3D image processing method to estimate SVBRDF parameters using an arbitrarily defined array of illuminators, and algorithms to reconstruct the surface using specialized visualization software. This approach allows for effective measurement of the geometry and visual properties of 3D objects represented by a dense point cloud model. It can become a valuable tool for documentation of digital heritage and in industrial computer vision applications.
Wide range time-of-flight camera: design, analysis, and simulation
Time-of-flight cameras measure the distances to scene points by emitting and detecting a modulated infrared light signal. The modulation frequency of the signal determines a certain maximum range within which the measured distance is unambiguous. If the actual distance to a scene point is longer than the maximum range, the measured distance suffers from phase wrapping, which makes the measured distance shorter than the actual one by an unknown multiple of the maximum range. This paper proposes a time-of-flight camera that is capable of restoring the actual distance by simultaneously emitting light signals of different modulation frequencies and detecting them separately in different regions of the sensor. We analyze the noise characteristics of the camera and acquire simulated depth maps using a commercially available time-of-flight camera, reflecting the increased amount of noise due to the use of dual-frequency signals. We finally propose a phase unwrapping method for restoring the actual distances from such a dual-frequency depth map. Through experiments, we demonstrate that the proposed method is capable of extending the maximum range at least twofold, with high success rates.
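A minimal per-pixel sketch of dual-frequency unwrapping (the frequencies, search depth, and noiseless averaging below are illustrative assumptions, not the camera's actual design): search the wrap counts of the two measurements for the pair that agrees on a single actual distance.

```python
# Hedged sketch: brute-force phase unwrapping for two modulation frequencies.
import numpy as np

C = 3e8  # speed of light, m/s

def unwrap(d1, d2, f1=16e6, f2=20e6, max_wraps=4):
    r1, r2 = C / (2 * f1), C / (2 * f2)          # unambiguous ranges: 9.375 m, 7.5 m
    best, best_err = d1, np.inf
    for k1 in range(max_wraps):
        for k2 in range(max_wraps):
            c1, c2 = d1 + k1 * r1, d2 + k2 * r2  # candidate true distances
            if abs(c1 - c2) < best_err:
                best, best_err = 0.5 * (c1 + c2), abs(c1 - c2)
    return best

print(unwrap(3.625, 5.5))  # -> 13.0 m, beyond both single-frequency ranges
```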
Efficient intensity-based camera pose estimation in presence of depth
Maha El Choubassi, Oscar Nestares, Yi Wu, et al.
The widespread success of the Kinect enables users to acquire both image and depth information with satisfying accuracy at relatively low cost. We leverage the Kinect output to efficiently and accurately estimate the camera pose in the presence of rotation, translation, or both. The applications of our algorithm are vast, ranging from camera tracking to 3D point cloud registration and video stabilization. The state-of-the-art approach uses point correspondences for estimating the pose. More explicitly, it extracts point features from images (e.g., SURF or SIFT), builds their descriptors, and matches features from different images to obtain point correspondences. However, while feature-based approaches are widely used, they perform poorly in scenes lacking texture, due to scarcity of features, or in scenes with repetitive structure, due to false correspondences. Our algorithm is intensity-based and requires neither point feature extraction nor descriptor generation and matching. Without depth, the intensity-based approach alone cannot handle camera translation. With the Kinect capturing both image and depth frames, we extend the intensity-based algorithm to estimate the camera pose in the case of both 3D rotation and translation. The results are quite promising.
Depth correction in ToF camera
Byong Min Kang, Keechang Lee, James D. K. Kim, et al.
The depth captured by a Time-of-Flight (ToF) camera is sometimes distorted when the object is close to the camera, which degrades quality in several applications. In this paper, we take pixel saturation into account and propose how to correct the depth, regardless of object color or distance, using multiple integration times and phase reconstruction methods. The ToF camera captures object depth by using the ratio of electron charges that is not saturated for each integration time, and by using the phase summation equality. To verify our approach, we used our prototype ToF camera with 480×270 pixel resolution, a 16 MHz modulation frequency, an f/1.6 lens, and a 40 ms total integration time; the two integration times are set to 16 ms and 24 ms, respectively. The target object used in the test is a color chart consisting of 24 standard colors, placed at 25 cm intervals from 0.75 m to 2.00 m. We verified that the depth of each patch in the color chart has an identical value.
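As a back-of-envelope check, the 16 MHz modulation gives an unambiguous range of c/(2f) = 3×10⁸ / (2 × 16×10⁶) ≈ 9.4 m, comfortably beyond the 0.75-2.00 m test range, so no phase wrapping is involved here. The sketch below adds a hypothetical per-pixel selection rule in the spirit of the method; the 12-bit saturation limit and the simple fallback are assumptions, not the paper's reconstruction.

```python
# Hedged sketch: prefer the longer integration time unless its charge readings
# saturate, since depth comes from charge ratios that saturation destroys.
import numpy as np

def select_depth(depth_16ms, depth_24ms, charge_24ms, sat=4095):
    ok_24 = charge_24ms < sat                       # 24 ms pixels still linear
    return np.where(ok_24, depth_24ms, depth_16ms)  # fall back to 16 ms when saturated
```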
3D Analysis, Feature Extraction, Segmentation, and Pattern Recognition
Karate moves recognition from skeletal motion
Simone Bianco, Francesco Tisato
This work aims at automatically recognizing sequences of complex karate movements and giving a measure of the quality of the movements performed. Since this problem intrinsically needs a 3D model, we propose a solution that takes as input sequences of skeletal motions, which can come from either motion capture hardware or consumer-level, off-the-shelf depth sensing systems. The proposed system consists of four different modules: skeleton representation, pose classification, temporal alignment, and scoring. The proposed system is tested on a set of different punch, kick, and defense karate moves, starting from the simplest case, i.e., fixed static stances (heiko dachi), up to sequences in which the starting stance differs from the ending one. The dataset was recorded using a single Microsoft Kinect and includes recordings of both male and female athletes with skill levels ranging from novices to masters.
Analyzing the relevance of shape descriptors in automated recognition of facial gestures in 3D images
Julian S. Rodriguez A., Flavio Prieto
This paper presents and explains the results of analyzing two shape descriptors (DESIRE and the Spherical Spin Image) for facial gesture recognition in 3D images. DESIRE is a descriptor composed of depth images, silhouettes, and rays extended from a polygonal mesh, whereas the Spherical Spin Image (SSI), associated with a point of a polygonal mesh, is a 2D histogram built from neighboring points by using position information that captures features of the local shape. The database used contains images of facial expressions, which were recognized on average at 88.16% using a neural network and at 91.11% with a Bayesian classifier in the case of the first descriptor; in contrast, the second descriptor achieves average recognition rates of only 32% and 23.6% with the same classifiers, respectively.
3D segmentation of the true and false lumens on CT aortic dissection images
Nawel Fetnaci, Paweł Łubniewski, Bruno Miguel, et al.
Our work concerns aortic dissections, which are a medical emergency and can quickly lead to death. In this paper, we want to retrieve from CT images the false and true lumens, which are characteristic features of aortic dissection. Our aim is to provide a 3D view of the lumens that is difficult to obtain otherwise: volume rendering and other visualization tools directly give only the outer contour of the aorta, and other segmentation methods mainly segment either the outer contour of the aorta alone or the aorta together with connected arteries and organs. In our work, we need to segment the two lumens separately; this segmentation will allow us to distinguish them automatically, facilitate the landing of an aortic prosthesis, enable virtual 3D navigation, and perform quantitative analysis. We chose to segment these data using a deformable model based on the fast marching method. In the classical fast marching approach, a speed function based only on the image gradient controls the front propagation of a deforming curve. In our CT images, due to their low resolution, the fast marching front propagates from one lumen to the other; the gradient data alone are therefore insufficient for accurate segmentation. In this paper, we adapt the fast marching method, in particular by modifying the speed function, and succeed in segmenting the two lumens separately.
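For orientation, the sketch below shows the classical gradient-driven speed function that the paper modifies, using the scikit-fmm library; the exponential gradient-to-speed mapping and its parameters are illustrative assumptions, and the authors' modified speed function goes beyond the gradient alone.

```python
# Hedged sketch: classical fast marching with a gradient-based speed map,
# so the front grows quickly inside a homogeneous lumen and stalls at walls.
import numpy as np
import skfmm                      # scikit-fmm
from scipy import ndimage

def lumen_travel_time(ct_slice, seed, alpha=0.05):
    grad = ndimage.gaussian_gradient_magnitude(ct_slice.astype(float), sigma=1.0)
    speed = np.exp(-alpha * grad)             # slow down on strong edges
    phi = np.ones_like(speed)
    phi[seed] = -1.0                          # front starts at a seed pixel, e.g. (120, 80)
    return skfmm.travel_time(phi, speed)      # threshold this map to get one lumen
```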
Quality Assessment
Adaptive quality assurance of the product development process of additive manufacturing with modern 3D data evaluation methods
Julia Kroll, Sabine Botta, Jannis Breuninger, et al.
In this paper, the possibilities of modern 3D data evaluation for metrology and quality assurance are presented for the specific application of the plastic laser sintering process, an Additive Manufacturing process. We use the advantages of computed tomography (CT) and of 3D focus variation at all stages of a production process to increase the quality of the resulting products. With CT and 3D focus variation, modern quality assurance and metrology have state-of-the-art instruments that allow non-destructive, complete, and accurate measurement of parts. These metrological methods can therefore be used at many stages of the product development process for non-destructive quality control. In this work, studies and evaluations of 3D data and the conclusions for relevant quality criteria are presented. Additionally, new developments and implementations for adapting the evaluation results for quality prediction, comparison, and correction are described, to show how adequate process control can be achieved with the help of modern 3D metrology techniques. The focus is on the optimization of laser sintering components with regard to their quality requirements, so that functionality during production can be guaranteed and quantified.
Quality assessment of adaptive 3D video streaming
Samira Tavakoli, Jesús Gutiérrez, Narciso García
The streaming of 3D video content is now a reality that expands the user experience. However, because of the variable bandwidth of the networks used to deliver multimedia content, a smooth and high-quality playback experience cannot always be guaranteed. Using segments in multiple video qualities, HTTP adaptive streaming (HAS) of video content is a relevant advancement with respect to classic progressive download streaming. It largely resolves these issues by offering significant advantages in terms of both user-perceived Quality of Experience (QoE) and resource utilization for content and network service providers. In this paper we discuss the impact on the end-user of possible HAS client behaviors while adapting to network capacity. We did this through an experiment testing the end-user response to quality variations during the adaptation procedure. The evaluation was carried out through a subjective test of the end-user response to various possible client behaviors for increasing, decreasing, and oscillating quality in 3D video. In addition, some typical HAS impairments during adaptation were simulated and their effects on end-user perception assessed. The experimental conclusions give good insight into the user's response to different adaptation scenarios and to the visual impairments causing discomfort, which can be used to develop adaptive streaming algorithms that improve the end-user experience.
Introducing the cut-out star target to evaluate the resolution performance of 3D structured-light systems
Tom Osborne, Vikas Ramachandra, Kalin Atanassov, et al.
Structured light depth map systems are a type of 3D system in which a structured light pattern is projected into the object space and an adjacent receiving camera captures the image of the scene. Using the distance between the camera and the projector together with the structured pattern, the depth of objects in the scene can be estimated. It is important to be able to compare how one system performs against another. Accuracy, resolution, and speed are three aspects of a structured light system that are often used for performance evaluation. Ideally, accuracy and resolution measurements would answer questions such as how close two cubes can be together and still be resolved as two objects, or how close a person must be to the structured light system for it to determine how many fingers the person is holding up. Our experiments show, however, that a system's ability to resolve the shape of an object depends on a number of factors, such as the shape of the object, its orientation, and how close it is to other adjacent objects. This makes the task of comparing the resolution of two systems difficult. Our goal is to choose a target or a set of targets from which we make measurements that enable us to quantify, on average, the comparative resolution performance of one system against another, without having to make multiple measurements on scenes with a large set of object shapes, orientations, and proximities. In this paper we go over a number of targets we evaluated and focus on the "Cut-out Star Target" that we selected as the best choice. Using this target, we show our evaluation results for two systems. The metrics we used for the evaluation were developed during this work. These metrics do not directly answer the question of how close two objects can be to each other and still be resolved, but they indicate which system will perform better over a large set of objects, orientations, and proximities to other objects.
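For intuition, a star target of this kind can be synthesized as alternating angular wedges whose spoke width shrinks toward the center; the innermost radius at which a reconstructed depth map still separates the spokes then yields a resolution figure. The spoke count and image size below are illustrative assumptions, not the paper's target specification.

```python
# Hedged sketch: generate a binary star pattern of alternating wedges.
import numpy as np

def star_target(size=512, spokes=16):
    y, x = np.mgrid[:size, :size] - size / 2
    theta = np.arctan2(y, x)                        # angle of each pixel
    wedge = np.floor(theta / (2 * np.pi / spokes))  # wedge index
    return (wedge % 2).astype(np.uint8)             # alternate solid / cut-out wedges
```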
Interactive Paper Session
Discovering unexpected information using a building energy visualization tool
B. Lange, N. Rodriguez, W. Puech, et al.
Building energy consumption is an important problem in the construction field: old buildings are energy-inefficient and need to be retrofitted, and the energy footprint of buildings needs to be reduced. New buildings are designed according to the energy-efficiency paradigm. To improve energy efficiency, Building Management Systems (BMS) are used: BMS are IT (Information Technology) systems composed of a rules engine and a database connected to sensors. Unfortunately, BMS are only monitoring systems: they cannot predict events or efficiently mine building information. The RIDER project emerged from this observation. The project is conducted by several French companies and universities and is led by IBM in Montpellier, France. Its main goal is to create a smart and scalable BMS, a new kind of BMS able to dig into data and predict events. The IT system is based on a component paradigm, and the core can be extended with external components, several of which are being developed during the project: data mining, building model generation, and visualization. All of these components will provide new features to improve the rules used by the core. In this paper, we focus on the visualization component. The visualization uses a volume rendering method based on sensor data interpolation, together with a correlation method to create new views. We present the visualization method used and the rules this component can provide.
An efficient anaglyph stereo video compression pipeline
Adhatus Solichah Ahmadiyah, Guan-Ming Su, Kai-Lung Hua, et al.
This paper presents two novel end-to-end stereo video compression pipelines consisting of single-sensor digital camera pairs, legacy consumer-grade video decoders, and anaglyph displays. As 3D videos contain a large amount of data, efficient compression methods for distributing streams over the current communication infrastructure are highly desirable, as are low-complexity algorithms for reconstructing the 3D scenes on existing hardware. We propose two methods to transmit a single encoded stream containing only the data required to create anaglyph video from single-sensor camera pairs. Our first proposed method packs and encodes only the demosaicked color channels used in the anaglyph display, in YCbCr 4:4:4 format, whereas the second proposed method repacks the color filter array stereo image pairs into the legacy YCbCr 4:2:0 mono video format and leaves the demosaicking operations to the decoder side. The experimental results demonstrate the superior performance of our proposed methods over the traditional one, achieving up to 4.66 dB improvement in terms of Composite Peak Signal-to-Noise Ratio (CPSNR).
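The red-cyan assembly that both pipelines target can be summarized in a few lines: only the left view's red plane and the right view's green and blue planes are ever displayed, which is why only those three channels need to reach the decoder (the sketch assumes 8-bit RGB arrays of equal size).

```python
# Hedged sketch: assemble a red-cyan anaglyph frame from a stereo pair.
import numpy as np

def anaglyph(left_rgb, right_rgb):
    out = np.empty_like(left_rgb)
    out[..., 0] = left_rgb[..., 0]     # red channel from the left eye's view
    out[..., 1:] = right_rgb[..., 1:]  # green and blue from the right eye's view
    return out
```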
Passive stereoscopic panomorph system
In the last decade, wide-angle stereoscopic systems using fisheye lenses have been proposed, but the compromise made to obtain a large field of view is low resolution and high distortion, resulting in imprecise depth estimation of objects in a 3D scene. High and non-uniform distortion, especially in the azimuthal direction, is often considered a weakness of panoramic lenses because it is sometimes difficult to compensate for by image processing. The aim of this paper is to present an alternative to existing stereoscopic panoramic systems by taking advantage of the non-uniform distortion and anamorphosis of Panomorph lenses. There are many challenges related to this project, such as the calibration of the system and the creation of a 3D depth estimation algorithm suited to the resolution of the different areas in the images. This paper presents different configurations of two Panomorph lenses within a stereoscopic device and a study of specific parameters to highlight their impact on the quality of 3D reconstruction of an object in a scene. Finally, an overview of future work is presented.
3D/2D image registration by image transformation descriptors (ITDs) for thoracic aorta imaging
Paweł J. Łubniewski, Laurent Sarry, Bruno Miguel, et al.
In this article, we present a novel image registration technique. Unlike most state-of-the-art methods, our approach computes the relationship between images directly. The proposed registration framework, built in a modular way, can be adjusted to particular problems. Tests on a sample image database of the thoracic aorta showed that our method is fast and robust and can be used successfully in many cases. We have enhanced our previous work to provide a rapid 3D/2D registration method. It directly computes image transformation descriptors (ITDs) to align the projection images. The 3D transformation is estimated by a technique that proposes a 3D pose update by interpreting the 2D transform of the projections in the 3D domain. The presented ITD-based 3D/2D registration technique can be used as an initialization step for classic registration algorithms. Its unique properties can be advantageous for many image alignment problems, and the possibility of using different descriptors, adapted to particular cases, makes our approach very flexible. Its fast computation time further motivates using our technique as an initialization step before executing well-known standard algorithms, which may be more precise but are slow and sensitive to parameter initialization.
Stereo matching with partial information
Y. Cem Sübakan, Ömer Can Gürol, Çağatay Dikici
In this paper, we address the stereo matching problem in stereo videos with partially informed Markov Random Fields (MRFs), using the motion information between subsequent frames as side information. We use the motion vectors within one of the videos to regularize the disparity estimate with this motion field. The proposed scheme enables us to obtain good disparity estimates using faster and simpler disparity-finding algorithms in each step.
Self-calibration of depth sensing systems based on structured-light 3D
Vikas Ramachandra, James Nash, Kalin Atanassov, et al.
A structured-light system for depth estimation is a type of 3D active sensor that consists of a structured-light projector, which projects a light pattern onto the scene (e.g., a mask with vertical stripes), and a camera that captures the illuminated scene. Based on the received patterns, the depths of different regions in the scene can be inferred. For this setup to work optimally, the camera and projector must be aligned such that the projection image plane and the image capture plane are parallel, i.e., free of any relative rotations (yaw, pitch, and roll). In reality, due to mechanical placement inaccuracy, the projector-camera pair will not be aligned. In this paper we present a calibration process which measures the misalignment. We also estimate a scale factor to account for differences in the focal lengths of the projector and the camera. The three angles of rotation can be found by introducing a plane in the field of view of the camera and illuminating it with the projected light patterns. An image of this plane is captured and processed to obtain the relative pitch, yaw, and roll angles, as well as the scale, through an iterative process. This algorithm leverages the effects of the misalignment/rotation angles on the depth map of the plane image.
3D hand localization by low-cost webcams
Cheng-Yuan Ko, Chung-Te Li, Chen-Han Chung, et al.
In recent years, depth sensors such as the Kinect have provided new opportunities for Human-Computer Interaction (HCI). However, for depth sensors to become ubiquitous in consumer electronics, their cost must be considered. In this paper, we propose an algorithm for 3D hand localization using two commodity low-cost webcams. Because of the noise produced by the low-cost webcams, the depth quality is poor; nevertheless, our algorithm can still perform 3D hand localization from such a depth map. The proposed algorithm provides 3D hand localization information for applications such as interactive 3DTV.
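For context, a minimal two-webcam disparity sketch using OpenCV's block matcher is shown below; the paper's contribution lies in localizing the hand robustly despite the noisy depth such matching produces on low-cost cameras. The parameters are illustrative assumptions, and the input frames must be rectified 8-bit grayscale images.

```python
# Hedged sketch: coarse disparity from two webcams with OpenCV block matching.
import cv2

stereo = cv2.StereoBM_create(numDisparities=64, blockSize=15)

def disparity_map(left_gray, right_gray):
    # returns fixed-point disparities scaled by 16, as OpenCV documents
    return stereo.compute(left_gray, right_gray)
```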
3D shape extraction of internal and external surfaces of glass objects
A. Bajard, O. Aubreton, F. Truchetet
Three-dimensional (3D) digitization of manufactured objects has been investigated for several years, and consequently many techniques have been proposed. Even though some techniques have been successfully commercialized, most of them assume a diffuse or near-diffuse reflectance of the object's surface, and difficulties remain for the acquisition of "optically non-cooperative" surfaces, such as transparent or specular ones. To address such surfaces, we propose a non-conventional technique called "Scanning from Heating" (SfH). In contrast to classical active triangulation techniques that acquire the reflection of visible light, we measure the thermal emission of the heated surface. The aim of this paper is to demonstrate, using the experimental setup designed for specular (transparent or not) objects, how this method allows reconstruction of both the internal and external surfaces of glass objects from a single measurement.
Smarter compositing with the Kinect
A. Karantza, R.L. Canosa
An image processing pipeline is presented that applies principles from the computer graphics technique of deferred shading to composite rendered objects into a live scene viewed by a Kinect. Issues involving the presentation of the Kinect's output are addressed, and algorithms for improving the believability and aesthetic matching of the rendered scene against the real scene are proposed. An implementation using GLSL shaders to run this pipeline at interactive frame rates is given. Results of experiments with this program are provided, showing promise that the approaches evaluated here can be applied to improve other implementations.