Proceedings Volume 8290

Three-Dimensional Image Processing (3DIP) and Applications II

Atilla M. Baskurt, Robert Sitnik
View the digital version of this volume at SPIE Digital Library.

Volume Details

Date Published: 1 March 2012
Contents: 10 Sessions, 50 Papers, 0 Presentations
Conference: IS&T/SPIE Electronic Imaging 2012
Volume Number: 8290

Table of Contents

All links to SPIE Proceedings will open in the SPIE Digital Library.
  • Front Matter: Volume 8290
  • Stereo and Multiview Imaging I
  • Stereo and Multiview Imaging II
  • Time-Of-Flight Data, Depth Maps Analysis
  • 3D Shape Modeling, Retrieval
  • 3D Analysis, Feature Extraction, Segmentation
  • 3D Metrology
  • 3D Imaging Systems
  • 3D Compression and Watermarking
  • Interactive Paper Session
Front Matter: Volume 8290
Front Matter: Volume 8290
This PDF file contains the front matter associated with SPIE Proceedings Volume 8290, including the Title Page, Copyright information, Table of Contents, and the Conference Committee listing.
Stereo and Multiview Imaging I
Edge-aware stereo matching with O(1) complexity
Cevahir Çiğla, A. Aydin Alatan
In this paper, a novel local stereo matching algorithm is introduced, providing precise disparity maps with low computational complexity. Following the common steps of local matching methods, namely cost calculation, aggregation, minimization and occlusion handling, the time-consuming, intensity-dependent aggregation procedure is improved in terms of both speed and precision. For this purpose, a novel approach, denoted permeability filtering (PF), is introduced, employing a computationally efficient two-pass integration over weighted and connected support regions. The proposed approach exploits a new paradigm, separable successive weighted summation (SWS), along the horizontal and vertical directions, enabling constant operational complexity for adaptive filtering while providing connected 2D support regions. Once the cost values for each disparity candidate have been aggregated independently, minimization is achieved by a winner-take-all approach. The same procedure is also utilized to diffuse information through overlapped pixels during occlusion handling, after unreliable disparity assignments have been detected. According to experimental results on the Middlebury stereo benchmark, the proposed method outperforms state-of-the-art local methods in terms of precision and computational efficiency by unifying constant-time filtering and weighted aggregation.
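As an illustration of the separable aggregation described in this abstract, the following minimal Python sketch performs one horizontal SWS pass; the exponential weight form, the function names and the parameter values are our own illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def permeability(row, sigma=8.0):
    """Inter-pixel permeability weights; assumed form mu = exp(-|dI| / sigma)."""
    return np.exp(-np.abs(np.diff(row.astype(float))) / sigma)

def sws_horizontal(cost, mu):
    """Two-pass successive weighted summation along one scanline.

    cost: matching costs for one row and one disparity hypothesis.
    mu:   permeability between neighboring pixels, length len(cost) - 1.
    Runs in a constant number of operations per pixel, independent of support size.
    """
    n = len(cost)
    left = np.empty(n)
    right = np.empty(n)
    left[0] = cost[0]
    for x in range(1, n):                      # left-to-right accumulation
        left[x] = cost[x] + mu[x - 1] * left[x - 1]
    right[n - 1] = cost[n - 1]
    for x in range(n - 2, -1, -1):             # right-to-left accumulation
        right[x] = cost[x] + mu[x] * right[x + 1]
    return left + right - cost                 # count cost[x] only once
```

A second, vertical pass over the horizontally aggregated costs would yield the connected 2D support regions; the disparity with the best aggregated cost is then selected per pixel by winner-take-all.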
Establishing eye contact for home video communication using stereo analysis and free viewpoint synthesis
Christian Weigel, Niklas Treutner
Eye contact has been proven to be an important visual cue in video communication applications. We present a method to re-establish eye contact in a home video communication scenario where it is lost due to the misalignment between the camera and the communication window the participant looks at. Our method covers the complete algorithm chain from acquisition to rendering and uses a pixel-based 3D analysis and rendering approach to create a virtual view from a camera placed at the position of the communication window. A large-scale subjective study identified the crucial problems of such an approach, and based on its significant observations we address the most important ones in this paper. We propose a method that produces spatially consistent depth maps using cross-check based filling. We address aliasing artifacts during point rendering and present a method to enhance the virtual view by image inpainting based on robust contour warping.
Depth adaptive hierarchical hole filling for DIBR-based 3D videos
Mashhour Solh, Ghassan AlRegib
In this paper we introduce a depth adaptive approach for disocclusion removal in depth image-based rendering (DIBR). This approach extends the hierarchical hole-filling (HHF) presented in an earlier work. Like HHF, the depth adaptive approach yields synthesized 3D videos that are free of geometric distortions. Furthermore, the edges and texture around the disoccluded areas can be sharpened and enhanced by adding a depth adaptive preprocessing step before applying hierarchical hole-filling. The subjective and objective results show a significant improvement in quality for the views synthesized using the depth adaptive approach.
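HHF itself is described in the authors' earlier work; as a rough illustration of the pyramid-style fill-from-lower-resolution idea, here is a hedged Python sketch (our own simplification with illustrative names, assuming even image dimensions through the recursion):

```python
import numpy as np
import cv2

def hole_fill_pyramid(img, holes):
    """Pyramid-style hole filling (a sketch of the HHF idea, not the authors' code).

    img:   float32 image rendered by DIBR; holes: boolean disocclusion mask.
    """
    if not holes.any() or min(img.shape) < 4:
        return img
    # Reduce with a masked average so hole pixels do not pollute lower levels.
    num = cv2.pyrDown(np.where(holes, 0.0, img))
    den = cv2.pyrDown((~holes).astype(np.float32))
    small = np.where(den > 0, num / np.maximum(den, 1e-6), 0.0)
    small = hole_fill_pyramid(small, den <= 0)
    # Expand the filled lower level and paste it into the holes only.
    up = cv2.pyrUp(small, dstsize=(img.shape[1], img.shape[0]))
    out = img.copy()
    out[holes] = up[holes]
    return out
```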
Space carving MVD sequences for modeling natural 3D scenes
Youssef Alj, Guillaume Boisson, Philippe Bordes, et al.
This paper presents a 3D modeling system designed for Multi-view Video plus Depth (MVD) sequences. The aim is to remove the redundancy in both the texture and the depth information present in the MVD data. To this end, a volumetric framework is employed to merge the input depth maps, and a variant of the Space Carving algorithm is proposed: voxels are iteratively carved by ray-casting from each view until the 3D model is geometrically consistent with every input depth map. A surface mesh is then extracted from this volumetric representation using the Marching Cubes algorithm. Subsequently, to address the issue of texture modeling, a new algorithm for multi-texturing the resulting surface is presented. This algorithm selects, from the set of input images, the best texture candidate to map a given mesh triangle; the best texture is chosen according to a photoconsistency metric. Tests and results are provided using still images from common MVD test sequences.
Stereo and Multiview Imaging II
A locally content-dependent filter for inter-perspective anti-aliasing
Mårten Sjöström, Sylvain Tourancheau, Xusheng Wang, et al.
Presentations on multiview and lightfield displays have become increasingly popular. The restricted number of views implies a non-smooth transition between views if objects with sharp edges are far from the display plane. The phenomenon is explained by inter-perspective aliasing. This is undesirable in applications where a correct perception of the scene is required, such as in science and medicine. Anti-aliasing filters have been proposed in the literature, defined according to the minimum and maximum depth present in the scene. We suggest a method that subdivides the ray-space and adjusts the anti-aliasing filter to the scene contents locally. We further propose new filter kernels, based on the ray-space frequency domain, that assure no aliasing while keeping as much information as possible unaltered. The proposed method outperforms the filters of earlier works; among the compared kernels, a proposed kernel yields sharper details in the output while also preserving the most information.
Photometric and geometric rectification for stereoscopic images
Seung-Ryong Han, Jongsul Min, Taesung Park, et al.
Stereoscopic images are captured by two cameras at different positions. In general, the two images often exhibit geometric distortions, including vertical misalignment, rotation and keystone, as well as photometric distortions such as luminance or color differences. Even with a carefully designed parallel stereo camera configuration, the captured image pairs may have distortions that cause an uncomfortable 3D experience for users. In this paper, we develop an algorithm that corrects the captured image pairs to give users a comfortable stereoscopic experience. The algorithm provides a practical method for compensating the photometric and geometric distortions.
Time-Of-Flight Data, Depth Maps Analysis
Depth map upscaling through edge-weighted optimization
Sebastian Schwarz, Mårten Sjöström, Roger Olsson
Accurate depth maps are a prerequisite in three-dimensional television, e.g. for high quality view synthesis, but this information is not always easily obtained. Depth information gained by correspondence matching from two or more views suffers from disocclusions and low-texture regions, leading to erroneous depth maps. These errors can be avoided by using depth from dedicated range sensors, e.g. time-of-flight sensors. Because these sensors have only restricted resolution, the resulting depth data need to be adjusted to the resolution of the corresponding texture frame. Standard upscaling methods provide results of only limited quality. This paper proposes a solution for upscaling low resolution depth data to match high resolution texture data. We introduce the Edge Weighted Optimization Concept (EWOC) for fusing low resolution depth maps with corresponding high resolution video frames by solving an overdetermined linear equation system. Similar to other approaches, we take information from the high resolution texture, but additionally validate this information against the low resolution depth to accentuate correlated data. Objective tests show an improvement in depth map quality in comparison to other upscaling approaches. This improvement is subjectively confirmed in the resulting view synthesis.
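To make the overdetermined-system formulation concrete, here is a toy Python sketch that upscales a low-resolution depth map by sparse least squares with texture-edge-weighted smoothness equations; the weight form, sigma, lam and all names are our illustrative assumptions, not EWOC's actual formulation.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import lsqr

def edge_weighted_upscale(d_lr, tex, s, lam=1.0, sigma=10.0):
    """Toy edge-weighted depth upscaling via an overdetermined sparse system.

    tex: high-res grayscale texture (h, w); d_lr: low-res depth (h // s, w // s);
    assumes h and w are divisible by the scale factor s.
    """
    h, w = tex.shape
    idx = lambda y, x: y * w + x
    rows, cols, vals, rhs = [], [], [], []
    r = 0
    # Data equations: the solution must match the low-res samples.
    for y in range(0, h, s):
        for x in range(0, w, s):
            rows.append(r); cols.append(idx(y, x)); vals.append(1.0)
            rhs.append(float(d_lr[y // s, x // s])); r += 1
    # Smoothness equations, down-weighted across texture edges.
    for y in range(h):
        for x in range(w):
            for dy, dx in ((0, 1), (1, 0)):
                yy, xx = y + dy, x + dx
                if yy >= h or xx >= w:
                    continue
                wgt = lam * np.exp(-abs(float(tex[y, x]) - float(tex[yy, xx])) / sigma)
                rows += [r, r]; cols += [idx(y, x), idx(yy, xx)]
                vals += [wgt, -wgt]; rhs.append(0.0); r += 1
    A = sparse.csr_matrix((vals, (rows, cols)), shape=(r, h * w))
    return lsqr(A, np.asarray(rhs))[0].reshape(h, w)
```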
Adaptive switching filter for noise removal in highly corrupted depth maps from Time-of-Flight image sensors
Seunghee Lee, Kwanghyuk Bae, Kyu-min Kyung, et al.
In this work, we present an adaptive switching filter for noise reduction and sharpness preservation in depth maps provided by Time-of-Flight (ToF) image sensors. Median and bilateral filters are commonly used in cost-sensitive applications where low computational complexity is needed. However, the median filter blurs fine details and edges in the depth map, while the bilateral filter copes poorly with the impulse noise present in the image. Since the variance of the depth is inversely proportional to the amplitude, we suggest an adaptive filter that switches between a median filter and a bilateral filter based on the amplitude level. If a region of interest has low amplitude, indicating a low confidence level of the measured depth data, a median filter is applied to the depth at that position, while regions with a high amplitude level are processed with a bilateral filter using a Gaussian kernel with adaptive weights. Results show that the suggested algorithm performs surface smoothing and detail preservation as well as the median filter and the bilateral filter, respectively. With the suggested algorithm, a significant gain in the visual quality of the depth maps is obtained while a low computational cost is maintained.
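A minimal Python sketch of the switching rule described above, using OpenCV; the threshold and filter parameters are illustrative placeholders, not the authors' values.

```python
import numpy as np
import cv2

def adaptive_switch_filter(depth, amplitude, amp_thresh):
    """Per-pixel switch between a median and a bilateral filter.

    Low amplitude -> low-confidence depth -> median (robust to impulse noise);
    high amplitude -> bilateral (edge-preserving, Gaussian-weighted smoothing).
    """
    d = depth.astype(np.float32)
    med = cv2.medianBlur(d, 3)
    bil = cv2.bilateralFilter(d, d=5, sigmaColor=25, sigmaSpace=5)
    return np.where(amplitude < amp_thresh, med, bil)
```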
Parametric model-based noise reduction for ToF depth sensors
Yong Sun Kim, Byongmin Kang, Hwasup Lim, et al.
This paper presents a novel Time-of-Flight (ToF) depth denoising algorithm based on parametric noise modeling. A ToF depth image includes space-varying noise that is related to the IR intensity value at each pixel. By assuming the noise is additive white Gaussian, ToF depth noise can be modeled by a power function of the IR intensity. Meanwhile, the nonlocal means filter is a popular edge-preserving denoising method for removing additive Gaussian noise. To remove space-varying depth noise, we propose an adaptive nonlocal means filter. According to the estimated noise, the search window and weighting coefficient are adaptively determined at each pixel, so that pixels with large noise variance are strongly filtered and pixels with small noise variance are weakly filtered. Experimental results demonstrate that the proposed algorithm provides good denoising performance while preserving details and edges, compared to typical nonlocal means filtering.
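The following Python fragment sketches the two ingredients named in the abstract: a power-law noise model in the IR intensity, and per-pixel nonlocal-means parameters that grow with the estimated noise. Constants and names are our illustrative assumptions, not the paper's calibrated values.

```python
import numpy as np

def depth_noise_std(ir, a, b):
    """Space-varying noise level: std modeled as a power function of IR intensity.

    a and b are calibration constants fitted offline (placeholders here).
    """
    return a * np.power(np.maximum(ir.astype(float), 1e-6), b)

def adaptive_nlm_params(sigma, base_window=7, base_h=1.0):
    """Per-pixel nonlocal-means parameters: noisier pixels get a larger
    search window and a stronger weighting coefficient."""
    search = base_window + 2 * np.clip(np.round(sigma), 0, 8).astype(int)
    h = base_h * sigma                      # filtering strength per pixel
    return search, h
```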
Silhouette extraction using color and depth information
Recently, applications involving the capture of scenes with an object of interest among surroundings have gained high popularity. Such applications include video surveillance, human motion capture, human-computer interaction, etc. For proper analysis of the object of interest, a necessary step is to separate it from the surroundings, i.e. to perform background subtraction (or silhouette extraction). This is a challenging task because of several problems: slight changes in the background, shadows cast by the object of interest, and similarly colored objects. In this work we propose a new method for extracting the silhouette of an object of interest, based on the joint use of depth (range) and color data. Depth data are independent of color image data and hence not affected by the limitations associated with color-based segmentation, such as shadows and similarly colored objects. At the initial moment, an image of the background (not containing the object of interest) is available; it is updated in every frame, taking the extracted silhouette into account and using a running average. The silhouette extraction method is based on k-means clustering of depth data and color difference data, and on per-pixel silhouette mask computation using the clusters' centroids. The proposed solution is very fast and allows real-time processing of video. The developed algorithm has been successfully applied in a human recognition application and provided good results for modeling the human figure.
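The background-update step described above (a running average, frozen where the silhouette was detected) can be sketched in a few lines of Python; the blending factor and the names are our illustrative assumptions.

```python
import numpy as np

def update_background(bg, frame, silhouette, alpha=0.05):
    """Running-average background update, frozen under the extracted silhouette.

    bg, frame: float images (grayscale or color); silhouette: boolean (h, w) mask.
    """
    mask = silhouette[..., None] if frame.ndim == 3 else silhouette
    blended = (1.0 - alpha) * bg + alpha * frame
    return np.where(mask, bg, blended)
```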
Discrete and continuous optimizations for depth image super-resolution
Ouk Choi, Hwasup Lim, Byongmin Kang, et al.
Recently a Time-of-Flight 2D/3D image sensor has been developed, which is able to capture a perfectly aligned pair of a color and a depth image. To increase the sensitivity to infrared light, the sensor electrically combines multiple adjacent pixels into a depth pixel at the expense of depth image resolution. To restore the resolution we propose a depth image super-resolution method that uses a high-resolution color image aligned with an input depth image. In the first part of our method, the input depth image is interpolated into the scale of the color image, and our discrete optimization converts the interpolated depth image into a high-resolution disparity image, whose discontinuities precisely coincide with object boundaries. Subsequently, a discontinuity-preserving filter is applied to the interpolated depth image, where the discontinuities are cloned from the high-resolution disparity image. Meanwhile, our unique way of enforcing the depth reconstruction constraint gives a high-resolution depth image that is perfectly consistent with its original input depth image. We show the effectiveness of the proposed method both quantitatively and qualitatively, comparing the proposed method with two existing methods. The experimental results demonstrate that the proposed method gives sharp high-resolution depth images with less error than the two methods for scale factors of 2, 4, and 8.
Superpixel-based depth image super-resolution
Due to the development of depth sensors, such as time-of-flight (ToF) cameras, it has become easier to acquire depth information directly from a scene. Although such devices enable us to obtain depth maps at video frame rates, the depth maps often have only low resolution. A typical ToF camera retrieves depth maps of resolution 320 x 200, which is much lower than the resolution of high-definition color images. In this work, we propose a depth image super-resolution algorithm that operates robustly even when there is a large resolution gap between a depth image and a reference color image. To prevent edge smoothing artifacts, which are the main drawback of conventional techniques, we adopt a superpixel-based approach and develop an edge enhancing scheme. Simulation results demonstrate that the proposed algorithm aligns the edges of a depth map to coincide accurately with those of a high resolution color image.
Efficient spatio-temporal hole filling strategy for Kinect depth maps
Massimo Camplani, Luis Salgado
In this paper we present an efficient hole filling strategy that improves the quality of the depth maps obtained with the Microsoft Kinect device. The proposed approach is based on a joint-bilateral filtering framework that includes spatial and temporal information. The missing depth values are obtained by iteratively applying a joint-bilateral filter to their neighboring pixels. The filter weights are selected considering three different factors: visual data, depth information and a temporal-consistency map. Video and depth data are combined to improve depth map quality in the presence of edges and homogeneous regions. Finally, the temporal-consistency map is generated in order to track the reliability of the depth measurements near the hole regions. The obtained depth values are included iteratively in the filtering process of successive frames, and the accuracy of the depth values in the hole regions increases as new samples are acquired and filtered.
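A simplified Python sketch of the iterative joint-bilateral fill, using spatial and color terms only (the paper's temporal-consistency weight is omitted, and all parameters and names are illustrative):

```python
import numpy as np

def fill_holes_jbf(depth, gray, holes, radius=3, ss=2.0, sc=10.0, n_iter=5):
    """Iteratively fill missing depth values from valid neighbors.

    depth: depth map with invalid pixels; gray: registered grayscale video frame;
    holes: boolean mask of missing depth.
    """
    d = depth.astype(float).copy()
    valid = ~holes
    h, w = d.shape
    for _ in range(n_iter):
        filled = []
        for y, x in zip(*np.where(~valid)):
            y0, y1 = max(0, y - radius), min(h, y + radius + 1)
            x0, x1 = max(0, x - radius), min(w, x + radius + 1)
            v = valid[y0:y1, x0:x1]
            if not v.any():
                continue                      # no support yet; retry next iteration
            yy, xx = np.mgrid[y0:y1, x0:x1]
            wgt = (np.exp(-((yy - y) ** 2 + (xx - x) ** 2) / (2 * ss ** 2)) *
                   np.exp(-(gray[y0:y1, x0:x1].astype(float) - float(gray[y, x])) ** 2
                          / (2 * sc ** 2)) * v)
            d[y, x] = (wgt * d[y0:y1, x0:x1]).sum() / wgt.sum()
            filled.append((y, x))
        for y, x in filled:                   # filled pixels support later iterations
            valid[y, x] = True
    return d
```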
3D Shape Modeling, Retrieval
Experimental results of bispectral invariants discriminative power
Karol Kubicki, Ramakrishna Kakarala
Invariants are one of the main tools in shape matching and pattern recognition. For three-dimensional data, rotation invariants come in two main kinds: moments and spherical harmonic magnitudes. Both are well examined and both suffer from certain limitations. In search of better performance, a new kind of spherical-harmonic invariant has been proposed recently: the bispectral invariants. They are well established from a theoretical point of view and possess numerous beneficial properties and advantages over other invariants, including the ability to distinguish rotation from reflection and sensitivity to phase. However, insufficient research has been conducted to check their behavior in practice. In this paper, results are presented pertaining to the discriminative power of the bispectral invariants. Objects from the Princeton Shape Benchmark database are used for evaluation. It is shown that the bispectral invariants outperform power spectral invariants, but perform worse than other descriptors proposed in the literature such as SHELLS and SHD. The difference in performance is attributable to the implicit filtering used to compute the invariants.
Evaluation of algorithms for point cloud surface reconstruction through the analysis of shape parameters
In computer graphics and visualization, reconstruction of a 3D surface from a point cloud is an important research area. As the surface contains information that can be measured, i.e. expressed in features, surface reconstruction is potentially important for bio-imaging; opportunities in this application area are the motivation for this study. In the past decade, a number of algorithms for surface reconstruction have been proposed. Generally speaking, these methods can be separated into two categories: explicit representation and implicit approximation. Most of the aforementioned methods are firmly based in theory; however, no analytical comparison between these methods has been presented so far, evaluation having typically been limited to visual inspection. Through evaluation we search for a method that can precisely preserve the surface characteristics and that is robust in the presence of noise. The outcome will be used to improve reliability in the surface reconstruction of biological models. We therefore take an analytical approach, selecting features as surface descriptors and measuring these features under varying conditions. We selected surface distance, surface area and surface curvature as three major features for comparing the quality of the surfaces created by the different algorithms. Our starting point has been ground-truth values obtained from analytical shapes such as the sphere and the ellipsoid. In this paper we evaluate four classical surface reconstruction methods from the two categories mentioned above, i.e. the Power Crust, the Robust Cocone, the Fourier-based method and the Poisson reconstruction method. The results obtained from our experiments indicate that the Poisson reconstruction method performs best in the presence of noise.
3D mesh Reeb graph computation using commute-time and diffusion distances
3D-model analysis plays an important role in numerous applications. In this paper, we present an approach for Reeb graph extraction using a novel mapping function. Our mapping function computes a real value for each vertex, providing interesting insights for describing the topological structure of the 3D model. We compute a discrete contour for each vertex according to our mapping function. Topology changes can be detected by analyzing the discrete contours, allowing the Reeb graph to be constructed. Our mapping function has some important properties: it is invariant to rigid and non-rigid transformations, it is insensitive to noise, it is robust to small topology changes, and it does not depend on parameters. Thanks to these properties, the extracted graph reflects the significant parts of a 3D model. We take the properties of the mapping function as evaluation criteria and compare them to those used in the state of the art. Finally, we present the extracted Reeb graphs of various models in different poses.
Geometric modeling of pelvic organs with thickness
T. Bay, Z.-W. Chen, R. Raffin, et al.
Physiological changes in the spatial configuration of the internal organs in the abdomen can induce different disorders that require surgery. Given the complexity of the surgical procedure, mechanical simulations are necessary, but the in vivo setting complicates the study of pelvic organs. In order to determine a realistic behavior of these organs, an accurate geometric model associated with a physical model is therefore required. Our approach couples a geometric module with a physical one. The geometric modeling seeks to build a continuous geometric model: from a dataset of 3D points provided by a segmentation step, surfaces are created through a B-spline fitting process. An energy function is built to measure the bidirectional distance between surface and data; this energy is minimized with an alternating iterative Hoschek-like method. A thickness is added with an offset formulation, and the geometric model is finally exported as a hexahedral mesh. Afterward, the physical modeling seeks to determine the properties of the soft tissues in order to simulate the organ displacements. The physical parameters attached to the data are determined with a feedback loop between finite-element deformations and ground-truth acquisition (dynamic MRI).
Refined facial disparity maps for automatic creation of 3D avatars
Rafael Pagés, Francisco Morán, Luis Salgado, et al.
We propose a new method to automatically refine a facial disparity map obtained with standard cameras and under conventional illumination conditions by using a smart combination of traditional computer vision and 3D graphics techniques. Our system inputs two stereo images acquired with standard (calibrated) cameras and uses dense disparity estimation strategies to obtain a coarse initial disparity map, and SIFT to detect and match several feature points in the subject's face. We then use these points as anchors to modify the disparity in the facial area by building a Delaunay triangulation of their convex hull and interpolating their disparity values inside each triangle. We thus obtain a refined disparity map providing a much more accurate representation of the subject's facial features. This refined facial disparity map may be easily transformed, through the camera calibration parameters, into a depth map to be used, also automatically, to improve the facial mesh of a 3D avatar to match the subject's real human features.
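The anchor-based refinement step (a Delaunay triangulation of matched SIFT points plus piecewise-linear interpolation of their disparities) can be sketched with SciPy as below; the names are illustrative, and anchor coordinates are assumed to be (x, y) pixel positions.

```python
import numpy as np
from scipy.spatial import Delaunay
from scipy.interpolate import LinearNDInterpolator

def refine_facial_disparity(coarse, anchor_xy, anchor_disp):
    """Piecewise-linear disparity over the anchors' convex hull (illustrative).

    coarse: initial dense disparity map (h, w);
    anchor_xy: (n, 2) anchor (x, y) pixel positions; anchor_disp: (n,) disparities.
    """
    tri = Delaunay(anchor_xy)                      # triangulate the convex hull
    interp = LinearNDInterpolator(tri, anchor_disp)
    h, w = coarse.shape
    ys, xs = np.mgrid[0:h, 0:w]
    refined = interp(np.column_stack([xs.ravel(), ys.ravel()])).reshape(h, w)
    return np.where(np.isnan(refined), coarse, refined)  # outside hull: keep coarse
```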
Fast human pose estimation using 3D Zernike descriptors
Daniel Berjón, Francisco Morán
Markerless video-based human pose estimation algorithms face a high-dimensional problem that is frequently broken down into several lower-dimensional ones by estimating the pose of each limb separately. However, in order to do so they need to reliably locate the torso, for which they typically rely on time coherence and tracking algorithms. Their losing track usually results in catastrophic failure of the process, requiring human intervention and thus precluding their usage in real-time applications. We propose a very fast rough pose estimation scheme based on global shape descriptors built on 3D Zernike moments. Using an articulated model that we configure in many poses, a large database of descriptor/pose pairs can be computed off-line. Thus, the only steps that must be done on-line are the extraction of the descriptors for each input volume and a search against the database to get the most likely poses. While the result of such process is not a fine pose estimation, it can be useful to help more sophisticated algorithms to regain track or make more educated guesses when creating new particles in particle-filter-based tracking schemes. We have achieved a performance of about ten fps on a single computer using a database of about one million entries.
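The online lookup described above (descriptor extraction followed by a database search for the most likely poses) reduces to a nearest-neighbor query; a brute-force Python sketch with illustrative names follows. A production system over a million entries would use an approximate-NN index instead.

```python
import numpy as np

def most_likely_poses(descriptor, db_descriptors, db_poses, k=5):
    """Return the k database poses whose 3D Zernike descriptors are closest.

    db_descriptors: (n, d) precomputed descriptors; db_poses: (n, ...) poses.
    """
    dist2 = ((db_descriptors - descriptor) ** 2).sum(axis=1)
    order = np.argsort(dist2)[:k]                 # k smallest distances
    return db_poses[order], np.sqrt(dist2[order])
```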
Analysis of binning of normals for spherical harmonic cross-correlation
Spherical harmonic cross-correlation is a robust registration technique that uses the normals of two overlapping point clouds to bring them into coarse rotational alignment. This registration technique, however, has a high computational cost, as spherical harmonics need to be calculated for every normal. By binning the normals, the computational efficiency is improved, as the spherical harmonics can be pre-computed and cached at each bin location. In this paper we evaluate the efficiency and accuracy of the equiangle grid, icosahedron subdivision and the Fibonacci spiral, an approach we propose. It is found that the equiangle grid has the best efficiency as it can perform direct binning, followed by the Fibonacci spiral and then the icosahedron, all of which decrease the computational cost compared to no binning. The Fibonacci spiral produces the highest accuracy of the three approaches while maintaining a low number of bins. The numbers of bins allowed by the equiangle grid and the icosahedron are much more restrictive than for the Fibonacci spiral. The performed analysis shows that the Fibonacci spiral can perform as well as the original cross-correlation algorithm without binning, while also providing a significant improvement in computational efficiency.
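For reference, one common construction of near-uniform Fibonacci-spiral bin centers on the unit sphere, plus brute-force normal binning, is sketched below in Python; the exact spiral variant used by the authors may differ.

```python
import numpy as np

def fibonacci_sphere(n):
    """n near-uniform bin centers on the unit sphere via the Fibonacci spiral."""
    golden = (1.0 + 5.0 ** 0.5) / 2.0
    i = np.arange(n)
    z = 1.0 - (2.0 * i + 1.0) / n        # uniform spacing in z
    phi = 2.0 * np.pi * i / golden       # golden-angle steps in longitude
    r = np.sqrt(1.0 - z * z)
    return np.stack([r * np.cos(phi), r * np.sin(phi), z], axis=1)

def bin_normals(normals, centers):
    """Assign unit normals to their nearest bin centers (brute force, for clarity)."""
    idx = np.argmax(normals @ centers.T, axis=1)   # max dot product = min angle
    return idx, np.bincount(idx, minlength=len(centers))
```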
Topology reconstruction for B-Rep modeling from 3D mesh in reverse engineering applications
Roseline Bénière, Gérard Subsol, Gilles Gesquière, et al.
Nowadays, most manufactured objects are designed using CAD (Computer-Aided Design) software. Nevertheless, for visualization, data exchange or manufacturing applications, the geometric model has to be discretized into a 3D mesh composed of a finite number of vertices and edges. In some cases, the initial model may be lost or unavailable. In other cases, the 3D discrete representation may be modified, for example after a numerical simulation, and no longer corresponds to the initial model. A reverse engineering method is then required to reconstruct a 3D continuous representation from the discrete one. In previous work, we presented a new approach for 3D geometric primitive extraction. In this paper, to complete our automatic and comprehensive reverse engineering process, we propose a method to construct the topology of the retrieved object. To reconstruct a B-Rep model, a new formalism is introduced to define the adjacency relations, and a new process is used to construct the boundaries of the object. The whole process is tested on 3D industrial meshes and provides a solution for recovering B-Rep models.
An evaluation of local shape descriptors for 3D shape retrieval
Sarah Tang, Afzal Godil
As the usage of 3D models increases, so does the importance of developing accurate 3D shape retrieval algorithms. A common approach is to calculate a shape descriptor for each object, which can then be compared to determine two objects' similarity. However, these descriptors are often evaluated independently and on different datasets, making them difficult to compare. Using the SHREC 2011 Shape Retrieval Contest of Non-rigid 3D Watertight Meshes dataset, we systematically evaluate a collection of local shape descriptors. We apply each descriptor to the bag-of-words paradigm and assess the effects of varying the dictionary's size and the number of sample points. In addition, several salient point detection methods are used to choose sample points; these methods are compared to each other and to random selection. Finally, information from two local descriptors is combined in two ways and changes in performance are investigated. This paper presents results of these experiments.
3D Analysis, Feature Extraction, Segmentation
Spatial modeling of bone microarchitecture
Hui Li, Kang Li, Taehyong Kim, et al.
We develop and evaluate a novel 3D computational bone framework that enables quantitative assessment of bone micro-architecture, bone mineral density and fracture risk. Our model for bone mineral is developed, and its parameters are estimated from imaging data obtained with dual-energy x-ray absorptiometry and x-ray imaging methods. Using these parameters, we propose a proper 3D microstructure bone model. The research starts by developing a spatio-temporal 3D microstructure bone model using Voronoi tessellation. Then, we simulate and analyze the architecture of the normal human bone network and the osteoporotic bone network by pruning edges in an appropriate ratio. Finally, we design several measurements to analyze Bone Mineral Density (BMD) and bone strength based on our model. The validation results clearly demonstrate that our 3D microstructure bone model robustly reflects the properties of real-world bone.
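To illustrate the tessellation-plus-pruning idea only (the paper's spatio-temporal model is considerably richer), here is a toy SciPy sketch that builds a 3D Voronoi edge network and randomly prunes a fraction of its edges:

```python
import numpy as np
from scipy.spatial import Voronoi

def toy_bone_network(n_seeds=200, prune_ratio=0.3, seed=0):
    """Toy trabecular network: 3D Voronoi edges, randomly pruned.

    A higher prune_ratio mimics a more osteoporotic (sparser) network.
    """
    rng = np.random.default_rng(seed)
    vor = Voronoi(rng.random((n_seeds, 3)))
    edges = set()
    for facet in vor.ridge_vertices:          # each ridge is a polygon in 3D
        if -1 in facet:
            continue                          # skip unbounded facets
        for i in range(len(facet)):
            a, b = facet[i], facet[(i + 1) % len(facet)]
            edges.add((min(a, b), max(a, b)))
    edges = sorted(edges)
    keep = rng.random(len(edges)) >= prune_ratio
    return vor.vertices, [e for e, k in zip(edges, keep) if k]
```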
A new affine invariant method for image matching
Jean-Louis Palomares, Philippe Montesinos, Daniel Diep
This paper describes a new approach to color or grey-scale image matching by points of interest. Like many point matching methods, this one is based on two main steps: computation of points and descriptors, followed by a matching process. The points of interest are extracted with the color Harris point detector and are then described using rotating anisotropic half-Gaussian derivative convolution kernels. The descriptors obtained by this filtering stage provide point signatures robust enough to be recovered in another image, even under important color and viewpoint transformations. The matching process uses a cross-comparison of point signatures and a voting method to achieve a robust matching. This paper presents the newly defined descriptor and the matching process that handles the data it produces.
2D-3D feature association via projective transform invariants for model-based 3D pose estimation
O. Serdar Gedik, A. Aydin Alatan
The three-dimensional (3D) tracking of rigid objects is required in many applications, such as 3D television (3DTV) and augmented reality. Accurate and robust pose estimates enable improved structure reconstructions for 3DTV and reduce jitter in augmented reality scenarios. On the other hand, reliable 2D-3D feature association is one of the most crucial requirements for obtaining high quality 3D pose estimates. In this paper, a 2D-3D registration method based on projective transform invariants is proposed. Because projective transform invariants are highly dependent on the 2D and 3D coordinates, the proposed method relies on pose consistency in order to increase the robustness of the 2D-3D association. The reliability of the approach is shown by comparisons with RANSAC, perspective factorization and SoftPOSIT based methods on real and artificial data.
Reprocessing anaglyph images
In related work, we have shown that conventional digital cameras can easily be modified to directly capture anaglyphs. Anaglyph images have commonly been used to encode stereo image pairs for viewing, but anaglyphs also can be treated as an efficient encoding of two-view image data for reprocessing. Each of the two views encoded within an anaglyph has only partial color information, but our preliminary results demonstrate that the "lost" information can be approximately recovered with any of a variety of reasonably efficient algorithms. This not only allows credible full-color stereo pairs to be computationally extracted, but also enables more sophisticated computational photography transformations such as the creation of depth maps and various types of point-spread-function (PSF) substitutions.
3D Metrology
X-ray stereo imaging for micro 3D motions within non-transparent objects
Wasil H. M. Salih, Jan A. N. Buytaert, Joris J. J. Dirckx
We propose a new technique to measure the 3D motion of marker points along a straight path within an object using x-ray stereo projections. From recordings of two x-ray projections with a 90° separation angle, the 3D coordinates of marker points can be determined. By synchronizing the x-ray exposure time to the motion event, a moving marker leaves a trace in the image whose gray scale is linearly proportional to the marker velocity. From the gray scale along the motion path, the 3D motion (velocity) is obtained. The path of motion was reconstructed and compared with the applied waveform; the results showed that the accuracy is on the order of 5%. The difference in displacement amplitude between the new method and laser vibrometry was less than 5 μm. We demonstrated the method on the motion of the malleus ossicle in the gerbil middle ear as a function of the pressure applied on the eardrum. The new method has the advantage over existing methods such as laser vibrometry that the structures under study do not need to be visually exposed. Due to the short measurement time and the high resolution, the method can be useful in the field of biomechanics for a variety of applications.
A stereoscopic imaging system for laser back scatter based trajectory measurement in ballistics: part 2
Uwe Chalupka, Hendrik Rothe
We report progress on the laser- and stereo-camera-based trajectory measurement system that we proposed and described in recent publications. The system design was extended from one to two more powerful, DSP-controllable laser systems. Experimental results of the extended system using different projectile/weapon combinations are shown and discussed. Automatic processing of the acquired images using common 3DIP techniques was realized, and the processing steps used to extract trajectory segments from images, as representative for the current application, are presented, along with the algorithms used for backward calculation of the projectile trajectory. Verification of the produced results is done against simulated trajectories, both in terms of detection robustness and detection accuracy. The fields of use for the current system lie within the ballistic domain; the first purpose is trajectory measurement of small- and medium-caliber projectiles on a shooting range. Extension to large-caliber projectiles, as well as application to sniper detection, is imaginable but would require further work. Besides classical radar, acoustic and optical projectile detection methods, the current system represents a further projectile location method in the new class of electro-optical methods that have evolved in recent decades and that use 3D image acquisition and processing techniques.
Single frame coaxial 3D measurement using depth from defocus of projection system
Toru Kurihara, Shigeru Ando
We propose 3D profilometry based on depth from defocus of a projection system that shares the same axis as the imaging system. In this system, a stripe pattern generated by a DLP LightCommander is projected onto the object and moved across the object's surface, generating a temporal variation of the light intensity. The projected stripe pattern is defocused depending on its distance from the focal plane. By moving the stripe pattern, the defocused spatial frequency component is captured through temporal frequency analysis. We use a correlation image sensor (CIS) to capture the temporal frequency component in a single frame. The CIS outputs the Fourier coefficients of the incident light at each pixel for every frame and therefore enables single-frame 3D measurement. Evaluation experiments show that the projection defocus depends on the distance from the focal plane and can be used for 3D measurement.
Multidirectional four-dimensional shape measurement system
Currently, many different scanning techniques are used for 3D imaging of the human body. Most existing systems are based either on static registration of internal structures using MRI or CT techniques, or on 3D scanning of the outer surface of the human body by laser triangulation or structured light methods. On the other hand, there is a mature 4D method based on tracking over time the positions of retro-reflective markers attached to the human body. This solution has two main drawbacks: the markers are attached to the skin (no real skeleton movement is registered), and it gives (x, y, z, t) coordinates only at those points (not for the whole surface). In this paper we present a novel multidirectional structured light measurement system that is capable of measuring the 3D shape of the human body surface at frequencies reaching 60 Hz. The developed system consists of two spectrally separated and hardware-synchronized 4D measurement heads. The principle of the measurement is based on single-frame analysis: the projected frame is composed of a sine-modulated intensity pattern and a special stripe allowing absolute phase measurement. Several different geometrical set-ups are proposed, depending on the type of movements to be registered.
3D Imaging Systems
Estimation of surface normal vectors based on 3D scanning from heating approach
Olivier Aubreton, Gonen Eren, Youssef Bokhabrine, et al.
Scanning From Heating is a 3D scanning approach initially developed for the 3D acquisition of transparent or specular surfaces. A laser source is used to create a local heating point, and an infrared camera is used to observe the IR radiation emitted by the scene. The 2D coordinates of the heated point are computed in the 2D image of the camera. Knowing the parameters of the system (obtained by a prior calibration), the 3D coordinates of the point are computed by triangulation. In this article we present an extension of this technique: we analyse the shape of the hot spot observed by the IR camera and, from this analysis, determine the local orientation of the surface at each measured point.
General fusion approaches for the age determination of latent fingerprint traces: results for 2D and 3D binary pixel feature fusion
Ronny Merkel, Stefan Gruhn, Jana Dittmann, et al.
Determining the age of latent fingerprint traces found at crime scenes has been an unresolved research issue for decades. Solving this issue could provide criminal investigators with the specific time a fingerprint trace was left on a surface, and would therefore enable them to link potential suspects to the time a crime took place, as well as to reconstruct the sequence of events or eliminate irrelevant fingerprints to ensure privacy constraints. Transferring imaging techniques from different application areas, such as 3D image acquisition, surface measurement and chemical analysis, to the domain of lifting latent biometric fingerprint traces is an upcoming trend in forensics. Such non-destructive sensor devices might help to solve the challenge of determining the age of a latent fingerprint trace, since they provide the opportunity to create time series and process them using pattern recognition techniques and statistical methods on digitized 2D, 3D and chemical data, rather than classical, contact-based capturing techniques, which alter the fingerprint trace and therefore make continuous scans impossible. In prior work, we suggested using a feature called binary pixel, which is a novel approach in the field of fingerprint age determination. The feature uses a Chromatic White Light (CWL) image sensor to continuously scan a fingerprint trace over time and retrieves a characteristic logarithmic aging tendency for the 2D-intensity as well as the 3D-topographic images from the sensor. In this paper, we propose to combine these two characteristic aging features with other 2D and 3D features from the domains of surface measurement, microscopy, photography and spectroscopy, to achieve an increase in the accuracy and reliability of a potential future age determination scheme. Discussing the feasibility of such a variety of sensor devices and possible aging features, we propose a general fusion approach, which might combine promising features into a joint age determination scheme in the future. We furthermore demonstrate the feasibility of the introduced approach by exemplarily fusing the binary pixel features based on the 2D-intensity and 3D-topographic images of the mentioned CWL sensor. We conclude that a formula-based age determination approach requires very precise image data, which cannot be achieved at the moment, whereas a machine-learning-based classification approach seems feasible if an adequate number of features can be provided.
A single-imager, single-lens video camera prototype for 3D imaging
A new method for capturing 3D video from a single imager and lens is introduced. The benefit of this method is that it does not have the calibration and alignment issues associated with binocular 3D video cameras, and it does not require special ranging transmitters and sensors. Because it is a single lens/imager system, it is also less expensive than either binocular or ranging cameras. Our system outputs a 2D image and an associated depth image using the combination of a microfluidic lens and a Depth from Defocus (DfD) algorithm. The lens is capable of changing focus fast enough to obtain two images within the normal video frame rate. The Depth from Defocus algorithm uses the in-focus and out-of-focus images to infer depth. We performed our experiments on synthetic images and on a real-aperture CMOS imager with a microfluidic lens. On synthetic images, we found an improvement in mean squared error compared to the literature on a limited test set. On camera images, our research showed that DfD combined with edge detection and segmentation provided subjective improvements in the resulting depth images.
3D multimodal data fusion system
Piotr Garbat
The problem of 3D shape/depth map acquisition in real time has received increasing attention in recent years. The most popular structured light systems, based on digital light projection, allow rapid acquisition of data about real 3D objects. This research describes a novel approach to a 3D shape measurement system supported by polarization image analysis. Enhancement of the fringe image quality is realized using a detector unit with a special liquid crystal filter.
Fully automatic 3D digitization of unknown objects using progressive data bounding box
Souhaiel Khalfaoui, Antoine Aigueperse, Ralph Seulin, et al.
The goal of this work is to develop a complete and automatic scanning system with minimum prior information. We aim to establish a methodology for the automation of the 3D digitization process. The paper presents a method based on the evolution of the Bounding Box of the object during the acquisition. The registration of the data is improved through the modeling of the positioning system. The obtained models are analyzed and inspected in order to evaluate the robustness of our method. Tests with real objects have been performed and results of digitization are provided.
3D Compression and Watermarking
3D video compression with the H.264 codec
Nikolaus Karpinsky, Song Zhang
Advances in 3D scanning have enabled the real-time capture of high-resolution 3D video. With these advances comes the challenge of streaming and storing this 3D video in a manner that can be quickly and effectively used. To do this, different approaches have been taken, a popular one being image-based encoding, which projects from 3D into 2D, uses 2D compression techniques, and then decodes from 2D back to 3D. One such technique is Holovideo, which has been shown to yield great compression ratios. However, the technique was originally designed for the RGB color space and until recently could not be used with codecs that use the YUV color space, such as the H.264 codec. This paper addresses this issue, generalizing Holovideo to the YUV color space and allowing it to leverage the H.264 codec. Compression ratios of over 352:1 have been achieved compared to the OBJ file format, with mean squared error as low as 0.204%, making it a viable solution for 3D video compression.
3D multiresolution synchronization scheme based on feature point selection
Multimedia protection is one of the main research challenges in computer science. We can encrypt the media in order to make the content unreadable without a secret decryption key, protect the file with Digital Rights Management (DRM), or embed a hidden message in the file (watermarking and steganography). We are interested in data hiding applications for 3D meshes. This domain poses various problems, among them the synchronization of the message within the host support. Synchronization is the operation that allows a mesh to be scanned along a unique path, selecting the same areas (vertices, triangles, quadrangles, for example) before and after the embedding, even if the mesh has been noised. In this paper, we propose a new synchronization approach based on feature point selection in a low resolution of the 3D object. The low resolution is built by decimation, and the feature point selection is based on discrete curvature computation. We evaluate the robustness of the synchronization in the low resolution and in the high resolution.
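Discrete curvature on a triangle mesh can be estimated in several ways; one standard choice, the Gaussian-curvature angle deficit, is sketched below in Python. The paper does not specify its exact formulation, so this is illustrative only.

```python
import numpy as np

def angle_deficit(vertices, faces):
    """Discrete Gaussian curvature per vertex via the angle-deficit formula.

    vertices: (n, 3) float coordinates; faces: (m, 3) vertex indices.
    """
    deficit = np.full(len(vertices), 2.0 * np.pi)
    for f in faces:
        p = vertices[list(f)]
        for i in range(3):
            u = p[(i + 1) % 3] - p[i]
            v = p[(i + 2) % 3] - p[i]
            c = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
            deficit[f[i]] -= np.arccos(np.clip(c, -1.0, 1.0))
    return deficit  # large |deficit| marks candidate feature points
```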
A content-adaptive scheme for reduced-complexity, multi-view video coding
Aykut Avci, Jan De Cock, Jelle De Smet, et al.
Disparity estimation is a highly complex and time-consuming process in multi-view video encoders. Since multiple views taken from a 2D camera array need to be coded at every time instance, the complexity of the encoder plays an important role besides its rate-distortion performance. In previous papers we introduced a new frame type, called the D frame, that exploits the strong geometrical correspondence between views, thereby reducing the complexity of the encoder. By employing D frames instead of some of the P frames in the prediction structure, a significant complexity gain can be achieved if the threshold value, a keystone element that adjusts the complexity at the cost of quality and/or bit-rate, is selected wisely. In this work, a new adaptive method to calculate the threshold value automatically from information available during the encoding process is presented. In this method, the threshold values are generated for each block of each D frame to increase accuracy. The algorithm is applied to several image sets, and a 20.6% complexity gain is achieved using the automatically generated threshold values without compromising quality or bit-rate.
Interactive Paper Session
Novel time- and depth-stamped imaging for 3D-PIV (particle image velocimetry) using correlation image sensor
Kenji Komiya, Toru Kurihara, Shigeru Ando
We propose a novel and extremely efficient scheme for 3D Particle Image Velocimetry (3D-PIV), simultaneous time-stamped and depth-stamped imaging, using a correlation image sensor (CIS) and structured illumination. In conventional PIV measurements, the 3D positions of numerous tiny particles inserted in a fluid field must be detected using multiple high-speed cameras. The resulting huge data volume increases the computational cost and reduces the reliability of the velocity field estimation. These problems can be solved if single-frame 4D (3D position and time) trajectory imaging can be realized. The CIS we developed is a device that outputs the temporal correlation between the incident light intensity and two sets of three-phase (3P) reference signals common to all pixels. When particles are imaged in a frame using a 3P reference signal, the CIS records the time information as a phase distribution along their trajectories. The CIS can also capture depth information by exploiting the structured illumination and another 3P reference signal. Combining these methods provides the time- and depth-stamped imaging. We describe the principle, theoretical foundations, and analysis algorithms. Several experimental results evaluating accuracy and resolution are shown.
3D imaging for ballistics analysis using chromatic white light sensor
Andrey Makrushin, Mario Hildebrandt, Jana Dittmann, et al.
The novel application of sensing technology based on chromatic white light (CWL) gives new insight into the ballistic analysis of cartridge cases. The CWL sensor uses a beam of white light to acquire highly detailed topography and luminance data simultaneously. The proposed 3D imaging system combines the advantages of 3D and 2D image processing algorithms in order to automate the extraction of firearm-specific toolmarks impressed on fired specimens. The most important characteristics of a fired cartridge case are the type of the breech face marking as well as the size, shape and location of the extractor, ejector and firing pin marks. The feature extraction algorithm normalizes the casing surface and consistently searches for the appropriate distortions on the rim and on the primer. The location of the firing pin mark in relation to the lateral scratches on the rim provides unique rotation-invariant characteristics of the firearm mechanisms. Additional characteristics are the volume and shape of the firing pin mark. The experimental evaluation relies on a data set of 15 cartridge cases fired from three 9mm firearms of different manufacturers. The results show the very high potential of 3D imaging systems for casing-based computer-aided firearm identification, which will prospectively support human expertise.
Computer-aided 3D-shape construction of hearts from CT images for rapid prototyping
Masayuki Fukuzawa, Yutaro Kato, Nobuyuki Nakamori, et al.
By developing a computer-aided modeling system, the 3D shapes of infants' hearts have been constructed interactively from quality-limited CT images for the rapid prototyping of biomodels. The 3D model was obtained by the following interactive steps: (1) rough region cropping, (2) outline extraction in each slice with a locally optimized threshold, (3) verification and correction of outline overlap, (4) 3D surface generation of the inside wall, (5) connection of inside walls, (6) 3D surface generation of the outside wall, (7) synthesis of a self-consistent 3D surface. The manufactured biomodels revealed characteristic 3D shapes of the heart, such as the left atrium and ventricle, aortic arch and right auricle. Their realistic representation of cavities and vessels is suitable for surgery planning and simulation; this is a clear advantage over the so-called "blood-pool" model, which is massive and often found in 3D visualization of CT images as a volume-rendering perspective. The developed system contributed both to quality improvement and to modeling-time reduction, which suggests a practical approach to establishing a routine process for manufacturing heart biomodels. Further study of the system performance is still in progress.
Semiautomatic generation of semantic building models from image series
Stefan Wirtz, Peter Decker, Dietrich Paulus
We present an approach to generate a 3D model of a building, including semantic annotations, from image series. In recent years, semantics-based modeling, reconstruction of buildings and building recognition have become more and more important. Semantic building models carry more information than just the geometry, making them more suitable for recognition or simulation tasks. The time-consuming generation of such models and annotations makes automation desirable. We therefore present a semiautomatic approach towards semantic model generation, implemented as a plugin for the photostitching tool Hugin. Our approach reduces the interaction with the system to a minimum. The resulting model contains semantic, geometric and appearance information and is represented in the City Geography Markup Language (CityGML).
Complex virtual urban environment modeling from CityGML data and OGC web services: application to the SIMFOR project
Jean-Christophe Chambelland, Gilles Gesquière
Thanks to advances in computer graphics and network speed, it is possible to navigate 3D virtual worlds in real time. This technology, found for example in computer games, has been adapted for training systems. In this context, a collaborative serious game for urban crisis management called SIMFOR has been created in France. This project has been designed for intensive realistic training and must consequently allow the players to create new urban operational theatres. To this end, importing, structuring, processing and exchanging 3D urban data remains an important underlying problem. This communication focuses on the design of the 3D Environment Editor (EE) and the related data processes needed to prepare the data flow to be exploitable by the runtime environment of SIMFOR. We use solutions proposed by the Open Geospatial Consortium (OGC) to aggregate and share data. A presentation of the proposed architecture is given, along with the overall design of the EE and some strategies for efficiently analyzing, displaying and exporting large amounts of urban CityGML information. An example illustrating the potential of the EE and the reliability of the proposed data processing is provided.
Liquid crystal materials and structures for image processing and 3D shape acquisition
K. Garbat, P. Garbat, L. Jaroszewicz
Image processing supported by liquid crystal devices has been used in numerous imaging applications, including polarization imaging, digital holography and programmable imaging. Liquid crystals have been extensively studied and are massively used in display and optical processing technology. We present here the main relevant parameters of liquid crystals for image processing and 3D shape acquisition, and we compare the main liquid crystal options that can be used, with their respective advantages. We compare the performance of several types of liquid crystal materials: nematic mixtures with high and medium optical and dielectric anisotropies and relatively low rotational viscosities, which may operate in TN mode in mono- and dual-frequency addressing systems.
Piece-wise linear function estimation for platelet-based depth maps coding using edge detection
Dorsaf Sebai, Faten Chaieb, Khaled Mammou, et al.
Much research on efficient depth map coding has been carried out, with particular attention given to sharp edge preservation. The platelet-based coding method is an edge-aware coding scheme that uses a segmentation procedure based on recursive quadtree decomposition; the depth map is then modeled using piecewise-linear platelet and wedgelet functions. However, the estimation of these functions is a computationally expensive task, making platelet-based techniques ill-suited to online applications. In this paper, we propose to exploit edge detection in order to reduce the encoding delay of the platelet/wedgelet estimation process. The proposed approach shows a significant gain in terms of encoding delay, while providing competitive R-D performance w.r.t. the original platelet-based codec. The subjective evaluation shows significantly less degradation along sharp edges.
A study on the impact of compression and packet losses on rendered 3D views
In 3D video delivery, the rendered 3D video quality at the receiver-side can be affected by rendering artifacts as well as by concealment errors which occur in the process of recovering missing 3D video packets. Therefore it is vital to have an understanding of the artifacts prior to transmitting data. This work proposes a model to quantify rendering and concealment errors at the sender-side and to use the information generated through the model to effectively deliver 3D video content.
New technique for capturing images containing invisible depth information on objects using brightness modulated light
Sae Isaka, Kazutake Uehira
We present a new technique for capturing images in which depth information on an object is invisibly and simultaneously embedded in its 2-D image when the image is taken with a camera. An object is illuminated by light that contains invisible information whose characteristics change depending on depth; therefore, the images of objects captured with a camera also invisibly contain such information. This invisible depth information can be extracted from the captured image of the object by appropriate image processing. Images taken with this technique can be treated as conventional 2-D images because the image format is that of conventional 2-D images. 3-D images can also be constructed by extracting the depth information embedded in the image. We carried out experiments, including a subjective test, and confirmed that the projected pattern could be embedded invisibly in the captured image and that its frequency component, from which the depth information on the object can be obtained, could be read out from the captured image. Moreover, we demonstrate that the depth map of a captured image of a practical scene can be obtained using this frequency component, although the technique can currently only be applied to scenes with simple configurations such as foregrounds and backgrounds.
Interactive 3D segmentation by tubular envelope model for the aorta treatment
Pawel J. Lubniewski, Bruno Miguel, Vincent Sauvage, et al.
We propose a novel interactive 3D segmentation approach and geometric model definition called the tubular envelope model, conceived to express the shape of tubular objects. The main challenges we have addressed are the speed and interactivity of the construction: a computer program designed for this task gives the user full control of the shape and precision, with no significant computational errors. Six CT (computed tomography) aortic dissection images have been used for the construction of tubular envelopes. On this basis, we propose a generic parametric model of the aorta for interactive construction, which allows rapid visualization and navigation inside the artery (rough virtual angioscopy). The low complexity of the model and the ease of interactive design make the tubular envelope well suited to aorta segmentation in comparison with other segmentation methods. The model accuracy is adjustable by the user according to his requirements, and the construction time has been approved by clinicians. More generally, the tubular envelope could be used in other applications, e.g. to define a region of interest for more precise segmentation or feature extraction inside it, or to develop a parametric model with deformation capabilities.
A parallel stereo reconstruction algorithm with applications in entomology (APSRA)
Rajesh Bhasin, Won Jun Jang, John C. Hart
We propose a fast parallel algorithm for the reconstruction of 3-Dimensional point clouds of insects from binocular stereo image pairs using a hierarchical approach for disparity estimation. Entomologists study various features of insects to classify them, build their distribution maps, and discover genetic links between specimens among various other essential tasks. This information is important to the pesticide and the pharmaceutical industries among others. When considering the large collections of insects entomologists analyze, it becomes difficult to physically handle the entire collection and share the data with researchers across the world. With the method presented in our work, Entomologists can create an image database for their collections and use the 3D models for studying the shape and structure of the insects thus making it easier to maintain and share. Initial feedback shows that the reconstructed 3D models preserve the shape and size of the specimen. We further optimize our results to incorporate multiview stereo which produces better overall structure of the insects. Our main contribution is applying stereoscopic vision techniques to entomology to solve the problems faced by entomologists.