Proceedings Volume 9013

Three-Dimensional Image Processing, Measurement (3DIPM), and Applications 2014

Atilla M. Baskurt, Robert Sitnik
View the digital version of this volume at the SPIE Digital Library.

Volume Details

Date Published: 11 March 2014
Contents: 5 Sessions, 16 Papers, 0 Presentations
Conference: IS&T/SPIE Electronic Imaging 2014
Volume Number: 9013

Table of Contents

  • Front Matter: Volume 9013
  • 3D Processing, Indexing, and Modeling
  • 3D Pattern Recognition and Real Time Processing
  • 3D Imaging Systems
  • Interactive Paper Session
Front Matter: Volume 9013
This PDF file contains the front matter associated with SPIE Proceedings Volume 9013, including the Title Page, Copyright information, Table of Contents, and Conference Committee listing.
3D Processing, Indexing, and Modeling
Temporal consistent depth map upscaling for 3DTV
Sebastian Schwarz, Mårten Sjöström, Roger Olsson
The ongoing success of three-dimensional (3D) cinema fuels increasing efforts to spread the commercial success of 3D to new markets. The possibility of a convincing 3D experience at home, such as three-dimensional television (3DTV), has generated a great deal of interest within the research and standardization community. A central issue for 3DTV is the creation and representation of 3D content. Acquiring scene depth information is a fundamental task in computer vision, yet complex and error-prone. Dedicated range sensors, such as the Time-of-Flight (ToF) camera, can simplify the scene depth capture process and overcome shortcomings of traditional solutions, such as active or passive stereo analysis. Admittedly, currently available ToF sensors deliver only a limited spatial resolution. However, sophisticated depth upscaling approaches use texture information to match depth and video resolution. At Electronic Imaging 2012 we proposed an upscaling routine based on error energy minimization, weighted with edge information from an accompanying video source. In this article we develop our algorithm further. By adding temporal consistency constraints to the upscaling process, we reduce disturbing depth jumps and flickering artifacts in the final 3DTV content. Temporal consistency in depth maps enhances the 3D experience, leading to a wider acceptance of 3D media content. More content in better quality can boost the commercial success of 3DTV.
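The temporal-consistency idea can be illustrated with a toy 1D energy minimization (the function names, weights, and edge-weighting below are illustrative, not the authors' actual formulation): the refined depth trades off fidelity to the upscaled observation, edge-weighted smoothness, and agreement with the previous frame's depth map.

```python
def refine_depth(d_obs, d_prev, edge_w, lam_s=1.0, lam_t=0.5, iters=200, step=0.1):
    """Gradient descent on a toy energy
       E(d) = sum_i (d_i - d_obs_i)^2            # data fidelity
            + lam_s * w_i * (d_i - d_{i-1})^2    # edge-weighted smoothness
            + lam_t * (d_i - d_prev_i)^2         # temporal consistency
    Low edge weight w_i relaxes smoothing across depth discontinuities."""
    d = list(d_obs)
    for _ in range(iters):
        g = [0.0] * len(d)
        for i in range(len(d)):
            g[i] += 2.0 * (d[i] - d_obs[i]) + 2.0 * lam_t * (d[i] - d_prev[i])
            if i > 0:
                diff = 2.0 * lam_s * edge_w[i] * (d[i] - d[i - 1])
                g[i] += diff
                g[i - 1] -= diff
        d = [di - step * gi for di, gi in zip(d, g)]
    return d
```

With a flat previous frame, the temporal term pulls a noisy observation back toward the stable value, which is exactly the flicker suppression described above.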
Feature enhancing aerial lidar point cloud refinement
Zhenzhen Gao, Ulrich Neumann
Raw aerial LiDAR point clouds often suffer from noise and under-sampling, which can be alleviated by feature-preserving refinement. However, existing approaches are limited to preserving only normal-discontinuous features (ridges, ravines and crest lines), while position-discontinuous features (boundaries) are also universal in urban scenes. We present a new refinement approach that accommodates the unique properties of aerial LiDAR building points. By extending recent developments in geometry refinement to explicitly regularize boundary points, both normal- and position-discontinuous features are preserved and enhanced. The refinement includes two steps: i) the smoothing step applies two-stage feature-preserving bilateral filtering, which first filters normals and then updates positions under the guidance of the filtered normals; in a separate, similar process, boundary points are smoothed along the tangent directions of the underlying lines; and ii) the up-sampling step interpolates new points to fill gaps/holes for both interior surfaces and boundary lines, through a local gap detector and a feature-aware bilateral projector. Features can be further enhanced by limiting the up-sampling near discontinuities. The refinement operates directly on points of diverse density, shape and complexity. It is memory-efficient, easy to implement, and easily extensible.
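The two-stage filtering rests on the standard bilateral filter, whose range term keeps discontinuities sharp while the spatial term averages out noise. A minimal 1D sketch (parameters and signal are illustrative, not from the paper):

```python
import math

def bilateral_1d(signal, sigma_s=2.0, sigma_r=0.2, radius=3):
    """Bilateral filter on a 1D profile: each sample is replaced by a
    weighted mean of neighbours, where the weight combines spatial
    closeness (sigma_s) and value similarity (sigma_r)."""
    out = []
    for i, v in enumerate(signal):
        num = den = 0.0
        for j in range(max(0, i - radius), min(len(signal), i + radius + 1)):
            w = (math.exp(-((i - j) ** 2) / (2.0 * sigma_s ** 2)) *
                 math.exp(-((signal[j] - v) ** 2) / (2.0 * sigma_r ** 2)))
            num += w * signal[j]
            den += w
        out.append(num / den)
    return out
```

Because cross-edge neighbours get a near-zero range weight, a step (here standing in for a building boundary) survives the smoothing while noise on either side is suppressed.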
VOLUMNECT: measuring volumes with Kinect
Beatriz Quintino Ferreira, Miguel Griné, Duarte Gameiro, et al.
This article presents a solution to volume measurement for object packing using 3D cameras (such as the Microsoft Kinect™). We target application scenarios, such as warehouses or distribution and logistics companies, where it is important to promptly compute package volumes, yet high accuracy is not pivotal. Our application automatically detects cuboid objects using the depth camera data, computes their volumes, and sorts them to allow space optimization. The proposed methodology applies simple computer vision and image processing methods to a point cloud, such as connected components, morphological operations and the Harris corner detector, producing encouraging results, namely an accuracy in volume measurement of 8 mm. Aspects that can be further improved are identified; nevertheless, the current solution is already promising and turns out to be cost-effective for the envisaged scenarios.
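The connected-components part of such a pipeline can be sketched as follows: label the largest 4-connected region in a top-down binary mask derived from the depth data, then take its bounding box and height as the cuboid dimensions. The grid, cell size, and height map below are illustrative, not the paper's actual data.

```python
from collections import deque

def cuboid_volume(mask, heights, cell=0.01):
    """Flood-fill the largest 4-connected region of a binary top-view
    mask, then return its bounding-box volume (one cuboid per region).
    `cell` is the ground footprint of one pixel in metres; `heights`
    gives the top-face height above the ground plane per pixel."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    best = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and not seen[r][c]:
                comp, q = [], deque([(r, c)])
                seen[r][c] = True
                while q:
                    y, x = q.popleft()
                    comp.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < rows and 0 <= nx < cols and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            q.append((ny, nx))
                if len(comp) > len(best):
                    best = comp
    ys = [p[0] for p in best]
    xs = [p[1] for p in best]
    h = max(heights[y][x] for y, x in best)
    return (max(ys) - min(ys) + 1) * cell * (max(xs) - min(xs) + 1) * cell * h
```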
3D mesh indexing based on structural analysis
M. Hachani, A. Ouled Zaid, W. Puech
This paper presents a novel pattern recognition method based on Reeb graph representation. The main idea of this approach is to reinforce the topological consistency conditions of the graph-based description. The approach comprises an off-line step and an on-line step. In the off-line step, the 3D shape is represented by a Reeb graph associated with geometrical signatures based on parametrization approaches. The similarity estimation is performed in the on-line step. It consists of computing a global similarity measure that quantifies the degree of similarity between any pair of 3D models in the given dataset. The experimental results obtained on the SHREC 2012 database show the system's effectiveness in 3D shape recognition.
3D Pattern Recognition and Real Time Processing
Real-time 3D human pose recognition from reconstructed volume via voxel classifiers
ByungIn Yoo, Changkyu Choi, Jae-Joon Han, et al.
This paper presents a human pose recognition method which simultaneously reconstructs a human volume based on an ensemble of voxel classifiers from a single depth image in real time. Human pose recognition is a difficult task, since a single depth camera can capture only the visible surfaces of a human body. In order to recognize the invisible (self-occluded) surfaces of a human body, the proposed algorithm employs voxel classifiers trained with multi-layered synthetic voxels. Specifically, ray-casting onto a volumetric human model generates synthetic voxels, where each voxel consists of a 3D position and an ID corresponding to a body part. The synthesized volumetric data, which contain both visible and invisible body voxels, are utilized to train the voxel classifiers. As a result, the voxel classifiers not only identify the visible voxels but also reconstruct the 3D positions and the IDs of the invisible voxels. The experimental results show improved performance in estimating human poses due to the capability of inferring the invisible human body voxels. It is expected that the proposed algorithm can be applied to many fields such as telepresence, gaming, virtual fitting, the wellness business, and real 3D content control on real 3D displays.
Model-based 3D human shape estimation from silhouettes for virtual fitting
Shunta Saito, Makiko Kouchi, Masaaki Mochimaru, et al.
We propose a model-based 3D human shape reconstruction system that works from two silhouettes. First, we synthesize a deformable body model from a 3D human shape database consisting of a hundred whole-body mesh models. The mesh models are homologous, so each has the same topology and the same number of vertices as all the others. We perform principal component analysis (PCA) on the database and synthesize an Active Shape Model (ASM). The ASM allows changing the body type of the model with a few parameters. Pose changes of our model are achieved by reconstructing the skeleton structure from joints implanted in the model. By applying pose changes after body-type deformation, our model can represent various body types in any pose. We apply the model to the problem of 3D human shape reconstruction from front and side silhouettes. Our approach simply compares the contours of the model's silhouettes with the input silhouettes'; we then use only the torso-part contour of the model to reconstruct the whole shape. We optimize the model parameters by minimizing the difference between corresponding silhouettes using a stochastic, derivative-free non-linear optimization method, CMA-ES.
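The fitting loop described above can be sketched in miniature: a statistical shape model (mean plus PCA modes) generates candidate contours, and a stochastic derivative-free search adjusts the shape coefficient until the model contour matches the target silhouette. Everything below is illustrative: a toy one-mode "width profile" model, and plain seeded random search standing in for CMA-ES.

```python
import random

# toy linear shape model: mean "width profile" plus one PCA mode
# (values are made up for illustration, not from the authors' database)
MEAN = [0.40, 0.35, 0.30, 0.32, 0.38]
MODE = [0.05, 0.04, 0.02, 0.03, 0.06]

def synth(b):
    """Shape instance for coefficient b: mean + b * mode."""
    return [m + b * v for m, v in zip(MEAN, MODE)]

def silhouette_error(b, target):
    """Sum of squared contour differences against the target silhouette."""
    return sum((s - t) ** 2 for s, t in zip(synth(b), target))

def fit(target, iters=2000, sigma=0.3, seed=1):
    """Stochastic derivative-free search (a simple stand-in for CMA-ES):
    propose Gaussian perturbations of the best coefficient so far and
    keep whichever lowers the silhouette error."""
    rng = random.Random(seed)
    best_b = 0.0
    best_e = silhouette_error(best_b, target)
    for _ in range(iters):
        cand = best_b + rng.gauss(0.0, sigma)
        e = silhouette_error(cand, target)
        if e < best_e:
            best_b, best_e = cand, e
    return best_b
```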
3D face recognition via conformal representation
In this paper, we propose a 3D face recognition approach based on the conformal representation of facial surfaces. First, facial surfaces are mapped onto the 2D unit disk by Riemann mapping. Their conformal representation (i.e., the pair of mean curvature (MC) and conformal factor (CF)) is then computed and encoded into Mean Curvature Images (MCIs) and Conformal Factor Images (CFIs). Considering that different regions of the face deform unequally due to expression variation, the MCIs and CFIs are divided into five parts. LDA is applied to each part to obtain a feature vector. Finally, the five parts are fused at the distance level for recognition. Extensive experiments carried out on the BU-3DFE database demonstrate the effectiveness of the proposed approach.
3D Imaging Systems
Real-time 3D shape measurement system with full temporal resolution and spatial resolution
Kai Zhong, Zhongwei Li, Xiaohui Zhou, et al.
Numerous fast 3D shape measurement systems based on pattern projection methods have been developed in recent years, but measuring an arbitrary dynamic 3D shape with full resolution, both temporal and spatial, remains a challenging problem. This paper presents a real-time 3D measurement system with full spatial and temporal resolution. In this system, a three-step phase-shifting algorithm is employed for full-spatial-resolution measurement, and a multi-view phase-shifting correspondence is used to search for corresponding points independently, without additional images. Thus any three adjacent phase-shifting images in the continuous capture stream can be used to reconstruct an arbitrary 3D shape, and the 3D acquisition speed can be as fast as the camera capture speed, achieving full temporal resolution. Moreover, to develop an accurate measurement system, a calibration method is also presented; it obtains more accurate internal and external parameters than traditional methods in the presence of calibration-target inaccuracy. A hybrid computing architecture based accelerated calculation method is introduced to achieve real-time 3D measurement; computation is more than 280 times faster than before. The experiments indicate that our system can perform 3D measurement at 1024 × 768 full spatial resolution and 220 fps full temporal resolution, and that it can also realize real-time 3D measurement at an average speed of 50 fps.
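The three-step phase recovery that underlies such systems has a closed form: for phase shifts of −2π/3, 0, +2π/3, the wrapped phase follows from three intensity samples per pixel. The sketch below is the textbook formula, not the authors' full multi-view pipeline.

```python
import math

def three_step_phase(i1, i2, i3):
    """Wrapped phase from three samples of I_k = A + B*cos(phi + d_k)
    with shifts d_k = -2*pi/3, 0, +2*pi/3. Using the identities
    I1 - I3 = sqrt(3)*B*sin(phi) and 2*I2 - I1 - I3 = 3*B*cos(phi),
    phi = atan2(sqrt(3)*(I1 - I3), 2*I2 - I1 - I3)."""
    return math.atan2(math.sqrt(3.0) * (i1 - i3), 2.0 * i2 - i1 - i3)
```

Because any three consecutive frames of a continuously shifted pattern supply (i1, i2, i3), the reconstruction rate can equal the camera frame rate, which is the "full temporal resolution" property claimed above.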
New concept of technology chain for 3D/4D content generation and display
The main goal of this paper is to introduce a concept for the next generation of 3D and 4D imaging technology, which is capable of multi-source capture, multi-representation processing and multi-data-content mixing, and provides holographic content and display with ultra-high resolution. The technology allows the generation and display of color holographic data content that is composed of real-life and synthetic scenes. The diversity, complexity and extremely high resolution of this 3D/4D imaging system require the development of tools that compose the holographic content captured from a multi-source system (dynamic objects of human-sized dimensions, far-distant backgrounds, and synthetic scene elements). In the area of capture technology, the system includes a dynamic, high-resolution, color holographic capture system and multimodal 3D and 4D (3D+time) structured light systems. Progress in innovative data and content processing technology is to be made as several methods and tools are introduced for processing 3D/4D holographic representations (amplitude-phase) and for composing different data types (surface representations, 2D images) into a single extremely high-resolution holographic video. In addition, the concept includes procedures for the generation of events that support interaction with both 4D surface and amplitude-phase data, and introduces a 3D/4D data representation exchange format (surface to/from holographic).
Low-cost structured-light based 3D capture system design
Jing Dong, Kurt R. Bengtson, Barrett F. Robinson, et al.
Most of the 3D capture products currently in the market are high-end and pricey. They are not targeted for consumers, but rather for research, medical, or industrial usage. Very few aim to provide a solution for home and small business applications. Our goal is to fill in this gap by only using low-cost components to build a 3D capture system that can satisfy the needs of this market segment. In this paper, we present a low-cost 3D capture system based on the structured-light method. The system is built around the HP TopShot LaserJet Pro M275. For our capture device, we use the 8.0 Mpixel camera that is part of the M275. We augment this hardware with two 3M MPro 150 VGA (640 × 480) pocket projectors. We also describe an analytical approach to predicting the achievable resolution of the reconstructed 3D object based on differentials and small signal theory, and an experimental procedure for validating that the system under test meets the specifications for reconstructed object resolution that are predicted by our analytical model. By comparing our experimental measurements from the camera-projector system with the simulation results based on the model for this system, we conclude that our prototype system has been correctly configured and calibrated. We also conclude that with the analytical models, we have an effective means for specifying system parameters to achieve a given target resolution for the reconstructed object.
Interactive Paper Session
Human machine interface by using stereo-based depth extraction
Chao-Kang Liao, Chi-Hao Wu, Hsueh-Yi Lin, et al.
Tabu search for human pose recognition
W. Dyce, N. Rodriguez, B. Lange, et al.
The use of computer vision techniques to build hands-free input devices has long been a topic of interest to researchers in the field of natural interaction. In recent years Microsoft’s Kinect has brought these technologies to the layman, but the most commonly used libraries for Kinect human pose recognition are closed-source. There is not yet an accepted, effective open-source alternative upon which highly specific applications can be based. We propose a novel technique for extracting the appendage configurations of users from the Kinect camera’s depth feed, based on stochastic local search techniques rather than per-pixel classification.
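The core of such a stochastic local search is simple to sketch: greedily move to the best non-tabu neighbour while a short memory forbids recently visited states, which lets the search escape local minima that trap plain hill climbing. The generic sketch below (minimizing a toy cost over integers) is illustrative, not the authors' pose-recognition objective.

```python
from collections import deque

def tabu_search(cost, neighbors, start, iters=100, tabu_len=5):
    """Tabu search: at each step move to the cheapest neighbour that is
    not on the tabu list, even if it is worse than the current state;
    track the best state seen overall."""
    best = cur = start
    tabu = deque([start], maxlen=tabu_len)  # short-term memory
    for _ in range(iters):
        cands = [n for n in neighbors(cur) if n not in tabu]
        if not cands:
            break
        cur = min(cands, key=cost)
        tabu.append(cur)
        if cost(cur) < cost(best):
            best = cur
    return best
```

For example, `tabu_search(lambda x: (x - 7) ** 2, lambda x: [x - 1, x + 1], 0)` walks from 0 to the minimum at 7 and then wanders without losing it, since `best` only ever improves.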
A multiple wavelength unwrapping algorithm for digital fringe profilometry based on spatial shift estimation
Pu Cao, Jiangtao Xi, Yanguang Yu, et al.
In this paper, a new approach is presented for solving the problem of spatial shift wrapping associated with Spatial Shift Estimation (SSE)-based Fringe Pattern Profilometry (FPP). The problem arises as a result of fringe reuse (that is, the fringes' periodic light intensity variation), so the spatial shift can only be identified without ambiguity within the range of a fringe width. It is demonstrated that the problem is similar to the phase unwrapping problem associated with phase-detection-based FPP, and the proposed method is inspired by existing ideas of using multiple images with different wavelengths, proposed for phase unwrapping. The effectiveness of the proposed method is verified by experimental results on an object with a complex surface shape.
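With two co-prime fringe widths the ambiguity can be resolved exactly, in the same spirit as two-wavelength phase unwrapping: the pair of wrapped shifts determines the absolute shift uniquely within the product of the widths. A sketch with illustrative integer fringe widths (the paper's actual wavelengths and estimator are not reproduced here):

```python
def unwrap_shift(w1, w2, lam1, lam2):
    """Recover the absolute spatial shift s from its values wrapped to
    two co-prime integer fringe widths lam1 and lam2. Unique for
    0 <= s < lam1 * lam2 (Chinese-remainder-style search)."""
    for k in range(lam2):
        s = w1 + k * lam1          # candidates consistent with w1
        if s % lam2 == w2:         # pick the one consistent with w2
            return s
    raise ValueError("inconsistent wrapped shifts")
```

With fringe widths 8 and 7 the unambiguous range grows from 8 pixels to 56, which is the benefit the multiple-wavelength scheme provides.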
Experimental demonstration of parallel phase-shifting digital holography under weak light condition
Lin Miao, Tatsuki Tahara, Peng Xia, et al.
One of the advantages of parallel phase-shifting digital holography (PPSDH) compared with other digital holography techniques is the fast recording of three-dimensional (3D) objects. During the fast recording of a multiplexed hologram that contains at least three amounts of phase retardation, the optical energy of the hologram becomes small. Therefore, it is important to assess the minimum optical energy that can reconstruct the object with moderate reconstruction error. In this paper, we investigate experimentally the optical energy required to reconstruct the object under weak light conditions in PPSDH. We compare the numerical and experimental results. The experiment is in good agreement with the numerical results when the sensitivity of the image sensor is taken into account.
A novel global color correction method for 3D content
Pierre Yver, Sébastien Kramm, Abdelaziz Bensrhair, et al.
Today, 3D content is part of many application fields, consumer and industrial alike. Its production requires perfectly matched images. This paper focuses on color calibration, so that both images have the same chromatic content on the same image elements. Previous methods have shown their limits, so we propose a novel technique based on the disparity map produced by modern matching algorithms. Using this map, we produce for each color plane a 2D data set from corresponding pixels. From this set, we compute the best fit of a normalised polynomial model. Additional techniques are used to remove outliers and improve the fitting. The polynomial model is then used to correct one image so that its chromatic content fits the other image. Experimental results are provided that show the benefits of this technique on sample image pairs.
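The fit-with-outlier-rejection step can be sketched per color plane with a degree-1 model (the paper fits a normalised polynomial; the degree, threshold, and data below are illustrative): fit all correspondences, discard those with large residuals, then refit on the inliers.

```python
def fit_linear(pts):
    """Closed-form least-squares fit of y = a*x + b."""
    n = len(pts)
    sx = sum(x for x, _ in pts)
    sy = sum(y for _, y in pts)
    sxx = sum(x * x for x, _ in pts)
    sxy = sum(x * y for x, y in pts)
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - a * sx) / n
    return a, b

def color_transfer(pts, thresh=0.05):
    """Fit the per-plane mapping from corresponding pixel intensities,
    drop correspondences whose residual exceeds `thresh` (outliers from
    mismatches or occlusions), and refit on the inliers."""
    a, b = fit_linear(pts)
    inliers = [(x, y) for x, y in pts if abs(a * x + b - y) <= thresh]
    return fit_linear(inliers)
```

Applied to one image's intensities, the fitted model maps them onto the chromatic range of the other image, pixel correspondence by pixel correspondence.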