Proceedings Volume 6056

Three-Dimensional Image Capture and Applications VII


View the digital version of this volume at the SPIE Digital Library.

Volume Details

Date Published: 26 January 2006
Contents: 6 Sessions, 34 Papers, 0 Presentations
Conference: Electronic Imaging 2006
Volume Number: 6056

Table of Contents


All links to SPIE Proceedings will open in the SPIE Digital Library.
  • 3D Scanning Hardware
  • 3D Object Capture from Static Scans and Video I
  • 3D Object Capture from Static Scans and Video II
  • 3D Scans of the Human I
  • 3D Scans of the Human II
  • Poster Session
3D Scanning Hardware
A novel design of grating projecting system for 3D reconstruction of wafer bumps
Yuan Shu, Ronald Chung, Zheng Tan, et al.
A challenge in the semiconductor industry is the 3D inspection of solder bumps grown on wafers for direct die-to-die bonding. In an earlier work we proposed a mechanism for reconstructing a wafer bump surface in 3D, based upon projecting a binary grating onto the surface at an inclined angle. For 3D reconstruction with high speed and accuracy, the requirements for the projection lens system are the following: (1) a tilted angle between the projection plane and the optical axis; (2) high bandwidth, so that the high-spatial-frequency harmonics contained in the binary grating pass through the lens and are projected onto the inspected surface properly; (3) a high Modulation Transfer Function (MTF); (4) a large Field of View (FOV); and (5) a large Depth of Field (DOF) that corresponds to the depth range, or height, of the inspected surface. These requirements pose great challenges in the design of the projection lens system. In this paper, we describe a design consisting of a grating and several spherical lens elements that addresses the requirements. To reduce lens aberrations, the grating is laid out at a tilt angle chosen specifically so that the grating plane, the lens plane, and the image plane intersect in the same line. Such a system can project a high-spatial-frequency binary grating onto the inspected surface properly. Simulation results, including performance analysis and tolerance analysis, are shown to demonstrate the feasibility of the design.
Measurement of discontinuities on 3D objects using digital moiré
In this paper, a two-dimensional, binary fringe pattern is designed as structured light for 3D measurement. A feature, i.e., a white cross, is placed in the center of the fringe grating. The cross serves as the axes of a reference frame. Elsewhere, white square grids alternate with black stripes vertically or horizontally. The relative position of a given point on the fringe with respect to the center can thus be identified. When the fringe pattern is projected onto the surface of the object, its image is distorted. Therefore, image processing and pattern recognition algorithms are designed to calculate which row and column a particular point belongs to in the original fringe frame. The pair of emitting and receiving angles for each point in the fringe and CCD frames, respectively, is acquired, and the coordinates of each 3D point can be calculated. Compared with traditional digital moiré methods, this method achieves an absolute measurement of 3D surfaces because the information contained in the pattern is globally structured; therefore, discontinuity measurement can be handled more easily. The resolution of the proposed method is higher than that of current pattern-coding methods under the same line-width limitation, owing to the principle of the pattern design.
High-speed and high-sensitive demodulation pixel for 3D imaging
Bernhard Büttgen, Thierry Oggier, Michael Lehmann, et al.
Optical time-of-flight (TOF) distance measurements can be performed using so-called smart lock-in pixels. By sampling the optical signal 2, 4 or n times in each pixel synchronously with the modulation frequency, the phase between the emitted and reflected signal is extracted and the object's distance is determined. The high integration level of such lock-in pixels enables the real-time acquisition of the three-dimensional environment without using any moving mechanical components. A novel design of the 2-tap lock-in pixel in a 0.6 μm semiconductor technology is presented. The pixel was implemented on a sensor with QCIF resolution. The optimized pixel design allows for high-speed operation of the device, resulting in nearly optimum demodulation performance and precise distance measurements that are almost exclusively limited by photon shot noise. In-pixel background-light suppression allows the sensor to be operated in an outdoor environment with sunlight incidence. The highly complex pixel functionality of the sensor was successfully demonstrated in the new SwissRanger SR3000 3D-TOF camera design. Distance resolutions in the millimeter range have been achieved while the camera operates at frame rates of more than 20 Hz.
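The phase extraction described above has a standard closed form for four-tap sampling: with samples A0..A3 taken at 0°, 90°, 180° and 270° of the modulation period, phase, amplitude and offset follow directly, and phase maps to distance through the modulation wavelength. A minimal sketch (illustrative, not the SR3000 implementation; the 20 MHz modulation frequency is an assumed example):

```python
import math

def demodulate_4tap(a0, a1, a2, a3, f_mod=20e6, c=3e8):
    """Recover phase, amplitude, offset and distance from four samples
    of the received modulation taken 90 degrees apart (a 2-tap pixel
    acquires these over two exposures). Assumes sinusoidal modulation,
    i.e. a_i = offset + amplitude * cos(phase + i*pi/2)."""
    phase = math.atan2(a3 - a1, a0 - a2) % (2 * math.pi)
    amplitude = 0.5 * math.sqrt((a3 - a1) ** 2 + (a0 - a2) ** 2)
    offset = (a0 + a1 + a2 + a3) / 4.0
    # round-trip: full 2*pi of phase corresponds to c / (2 * f_mod) metres
    distance = c * phase / (4 * math.pi * f_mod)
    return phase, amplitude, offset, distance
```

The arctangent cancels both the unknown offset (background light) and the unknown amplitude, which is why the precision is limited mainly by photon shot noise.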
A QVGA-size CMOS time-of-flight range image sensor with background light charge draining structure
Takeo Ushinaga, Izhal Abdul Halin, Tomonari Sawada, et al.
3-D imaging systems can be used in a variety of applications, such as automobiles, medicine, robot vision systems, and security. Recently many kinds of range-finding methods have been proposed for 3-D imaging systems. This paper presents a new type of CMOS range image sensor based on the Time-of-Flight (TOF) principle with a spatial resolution of 336 × 252 (QVGA) and a pixel size of 15 × 15 μm². The pixel structure of the sensor consists of single-layer polysilicon gates on thick field oxide and includes a function for reducing background-light-induced charge. The chip was fabricated in a 0.35 μm standard CMOS process with two polysilicon and three metal layers. The presented sensor achieves a minimum range resolution of 2.8 cm at a frame rate of 30 fps; the resolution improves to 4.2 mm when 10 frames are averaged, which corresponds to 3 fps.
3D Object Capture from Static Scans and Video I
Overview of 3D surface digitization technologies in Europe
This paper presents an overview of the different 3D surface digitization technologies commercially available in the European market. The solutions for 3D surface measurement offered by major European companies can be divided into different groups depending on various characteristics, such as technology (e.g. laser scanning, white light projection), system construction (e.g. fixed, on CMM/robot/arm) or measurement type (e.g. surface scanning, profile scanning). Crossing between the categories is possible; however, the majority of commercial products can be divided into the following groups: (a) laser profilers mounted on a CMM, (b) portable coded-light projection systems, (c) desktop solutions with a laser profiler or coded-light projection system and a multi-axis platform, (d) laser point measurement systems in which both sensor and object move, (e) hand-operated laser profilers, hand-held laser profilers or point measurement systems, (f) dedicated systems. This paper presents the different 3D surface digitization technologies and describes their advantages and disadvantages. Various examples of their use are shown for different application fields. Special attention is given to applications regarding 3D surface measurement of the human body.
Virtual confocal microscopy
There is a need for persistent-surveillance assets to capture high-resolution, three-dimensional data for use in assisted target recognizing systems. Passive electro-optic imaging systems are presently limited by their ability to provide only 2-D measurements. We describe a methodology and system that uses existing technology to obtain 3-D information from disparate 2-D observations. This data can then be used to locate and classify objects under obscurations and noise. We propose a novel methodology for 3-D object reconstruction through use of established confocal microscopy techniques. A moving airborne sensing platform captures a sequence of geo-referenced, electro-optic images. Confocal processing of this data can synthesize a large virtual lens with an extremely sharp (small) depth of focus, thus yielding a highly discriminating 3-D data collection capability based on 2-D imagery. This allows existing assets to be used to obtain high-quality 3-D data (due to the fine z-resolution). This paper presents a stochastic algorithm for reconstruction of a 3-D target from a sequence of affine projections. We iteratively gather 2-D images over a known path, detect target edges, and aggregate the edges in 3-D space. In the final step, an expectation is computed resulting in an estimate of the target structure.
A robust algorithm for estimation of depth map for 3D shape recovery
Three-dimensional shape recovery from one or multiple observations is a challenging problem of computer vision. In this paper, we present a new focus measure for the calculation of a depth map, which can in turn be used in techniques and algorithms leading to the recovery of the three-dimensional structure of an object, as required in many high-level vision applications. The presented focus measure has shown robustness in the presence of noise as compared with earlier focus measures. It is based on an optical transfer function implemented using the Discrete Cosine Transform, and its results are compared with those of earlier measures, including the Sum of Modified Laplacian (SML) and Tenenbaum focus measures. With this new focus measure, the results without any noise are similar in nature to those of the earlier measures; however, a drastic improvement over the others is observed in the presence of noise. The proposed focus measure was applied to a test image, to a sequence of 97 simulated cone images, and to a sequence of 97 real cone images. Gaussian noise was added to the images; such noise arises from factors such as electronic circuit noise and sensor noise due to poor illumination and/or high temperature.
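The abstract does not give the exact formulation, but the family of DCT-based focus measures it belongs to can be sketched: sum the high-frequency (AC) DCT energy over blocks, so that sharper (better-focused) regions score higher. A hedged illustration with an assumed 8x8 block size, not the authors' precise measure:

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * m + 1) * k / (2 * n))
    c[0, :] /= np.sqrt(2.0)
    return c

def dct_focus(img, block=8):
    """Focus score = total AC (non-DC) DCT energy over all full blocks.
    Defocus acts as a low-pass filter, so this score drops with blur."""
    h, w = (s - s % block for s in img.shape)
    c = dct_matrix(block)
    total = 0.0
    for i in range(0, h, block):
        for j in range(0, w, block):
            d = c @ img[i:i + block, j:j + block] @ c.T
            total += (d ** 2).sum() - d[0, 0] ** 2  # exclude the DC term
    return total
```

In a shape-from-focus pipeline, this score is evaluated per pixel neighbourhood across the image stack, and the depth map is read off as the focus position maximizing the score.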
3D Object Capture from Static Scans and Video II
Formation of stereoscopic image pairs from a sequence of frames
Transverse relative motion between an object and an observer, as well as specific rotations of that object provide a means of forming stereoscopic pairs from a sequence of 2D images of the object. An algorithm that transforms consecutive frames in this sequence in a manner that results in a series of optimally viewable stereoscopic pairs has been developed. This operation is done without any human intervention, and without any a priori knowledge or control of the subject's motion. Orientations of the stereoscopic axis found in the sequence will generally not be aligned with that of the viewer, and may change throughout the sequence. In order to perceive the stereo effect to an optimal degree, the frames must be rotated to bring this axis into alignment with that of the viewer. Current techniques for this purpose assume precise knowledge of the subject's orientation and angular velocity, or require subjective human intervention. The algorithm is based upon vector algebra of a displacement field derived from correlations made between a "reference" frame and its corresponding "comparison" frame. Possible applications of this technique include ground-based imaging of orbiting satellites and space probe imagery, and could result in new mapping techniques and new methods of creating 3D imagery.
3D model generation using unconstrained motion of a hand-held video camera
C. Baker, C. Debrunner, M. Whitehorn
We have developed a shape and structure capture system which constructs accurate, realistic 3D models from video imagery taken with a single freely moving hand-held camera. Using an inexpensive off-the-shelf acquisition system such as a hand-held video camera, we demonstrate the feasibility of fast and accurate generation of these 3D models at very low cost. In our approach the operator moves the camera freely, within some very simple constraints. Our process identifies and tracks high-interest image features and computes the relative pose of the camera based on those tracks. Using a RANSAC-like approach we solve for the camera pose and 3D structure based on a homography or essential matrix. Once we have the pose for many frames in the sequence we perform correlation-based stereo to obtain dense point clouds. After these point clouds are computed we integrate them into an octree. By replacing the points in a particular cell with statistics representing the point distribution we can store the computed model efficiently. While efficient, the integration technique also enables filtering based on occupancy counts, which eliminates many stereo outliers and results in an aesthetically pleasing, viewable 3D model. In this paper we describe our approach in detail and show reconstructed results for a synthetic room, an empty room, a lightly furnished room, and an experimental vehicle.
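The occupancy-count filtering idea can be illustrated with a flat voxel grid standing in for the octree leaves (a deliberate simplification of the authors' structure; the 5 cm cell size and count threshold are made-up parameters):

```python
from collections import defaultdict

def filter_point_cloud(points, cell=0.05, min_count=3):
    """Bin 3D points into a uniform voxel grid and drop sparsely
    occupied cells; isolated stereo outliers rarely accumulate enough
    points in one cell to survive. Each surviving cell is summarized
    by its centroid, one of the per-cell statistics mentioned above."""
    cells = defaultdict(list)
    for x, y, z in points:
        key = (int(x // cell), int(y // cell), int(z // cell))
        cells[key].append((x, y, z))
    kept = []
    for pts in cells.values():
        if len(pts) >= min_count:
            n = len(pts)
            kept.append(tuple(sum(coord) / n for coord in zip(*pts)))
    return kept
```

An octree buys the same behaviour with adaptive resolution: dense surface regions subdivide further, while empty space stays coarse.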
3D from arbitrary 2D video
In this paper, we present methods to synthesize 3D video from arbitrary 2D video. The 2D video is analyzed by computing frame-by-frame motion maps. For this computation, several methods were tested, including optical flow, segmentation, and correlation-based target location. Using the computed motion maps, the video undergoes analysis and the frames are segmented to provide object-wise depth ordering. The frames are then used to synthesize stereo pairs. This is performed by resampling frames on a grid that is governed by a corresponding depth map. In order to improve the quality of the synthetic video, as well as to enable 2D viewing where 3D visualization is not possible, several image enhancement techniques are used. In our test case, anaglyph projection was selected as the 3D visualization method, as it is best suited to standard displays. The drawback of this method is ghosting artifacts. In our implementation we minimize these unwanted artifacts by modifying the computed depth maps using non-linear transformations. Defocusing of one anaglyph color component was also used to counter such artifacts. Our results show that the suggested methods enable synthesis of high-quality 3D videos.
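As a concrete instance of the last two steps: a red/cyan anaglyph takes the red channel from the left view and green/blue from the right, and a non-linear disparity remap squashes large disparities to reduce ghosting. The power-law remap below is one possible choice of transformation; the paper does not specify which non-linearity it uses.

```python
import numpy as np

def make_anaglyph(left, right):
    """Red/cyan anaglyph from a stereo pair: red channel from the left
    view, green and blue from the right. Inputs are HxWx3 RGB arrays
    of identical shape and dtype."""
    out = right.copy()
    out[..., 0] = left[..., 0]
    return out

def compress_disparity(disp, gamma=0.5):
    """Non-linear disparity compression (illustrative power law):
    large disparities are squashed, trading depth range for reduced
    ghosting. `disp` is assumed normalized to [0, 1]."""
    return disp ** gamma
```

Viewed through red/cyan glasses, each eye receives mostly its intended channel; ghosting is the residual crosstalk that the disparity compression and color-component defocusing mitigate.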
Nonintrusive viewpoint tracking for 3D for perception in smart video conference
Xavier Desurmont, Isabel Martinez-Ponte, Jerome Meessen, et al.
Globalisation of interaction in the industrial world and the ecological cost of transport make video-conferencing an interesting solution for collaborative work. However, the lack of immersive perception makes video-conferencing unappealing. The TIFANIS tele-immersion system was conceived to let users interact as if they were physically together. In this paper, we focus on an important feature of the immersive system: the automatic tracking of the user's point of view in order to correctly render in his display the scene from the other site. Viewpoint information has to be computed in a very short time, and the detection system should be non-intrusive; otherwise it would become cumbersome for the user, i.e. he would lose the feeling of "being there". The viewpoint detection system consists of several modules. First, an analysis module identifies and follows regions of interest (ROI) where faces are detected. We will show the cooperative approach between spatial detection and temporal tracking. Secondly, an eye detector finds the position of the eyes within faces. Then, the 3D positions of the eyes are deduced using stereoscopic images from a binocular camera. Finally, the 3D scene is rendered in real-time according to the new point of view.
Internal shape-deformation invariant 3D surface matching using 2D principal component analysis
Mehmet Celenk, Inad Aljarrah
This paper describes a method that overcomes the problem of internal deformations in three-dimensional (3D) range image identification. Internal deformations can be caused by several factors, including stereo camera-pair misalignment, surface irregularities, active vision methods' incompatibilities, image imperfections, and changes in illumination sources. Most 3D surface matching systems suffer from these changes, and their performance is significantly degraded unless the deformations' effect is compensated. Here, we propose an internal compensation method based on two-dimensional (2D) principal component analysis (PCA). The depth map of a 3D range image is first thresholded using Otsu's optimal threshold selection criterion to discard the background information. The detected volumetric shape is normalized in the spatial plane and aligned with a reference coordinate system for rotation-, translation- and scaling-invariant classification. The preprocessed range image is then divided into 16x16 sub-blocks, each of which is smoothed to minimize local variations. The 2DPCA is applied to the resultant range data, and the corresponding principal vectors are used as the characteristic features of the object to determine its identity in a database of pre-recorded shapes. The system's performance is tested against several 3D facial images possessing arbitrary deformations. Experiments have resulted in 92% recognition accuracy for the GavaDB 3D-face database entries and their Gaussian- or Poisson-type noisy versions, using the minimum-Euclidean-distance classification strategy in an optimally constructed eigenface feature space.
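The 2DPCA step differs from classical PCA in that it operates on 2D arrays directly, forming a small n x n image covariance matrix instead of vectorising each image. A generic sketch of that step (not tied to the 16x16 sub-block pipeline above):

```python
import numpy as np

def two_d_pca(images, d=2):
    """2DPCA: build the image covariance matrix G = mean((A - M)^T (A - M))
    from m x n arrays without flattening them, and return the top-d
    projection axes (n x d) plus the mean image."""
    A = np.stack(images).astype(float)        # shape: M x m x n
    mean = A.mean(axis=0)
    G = np.zeros((A.shape[2], A.shape[2]))
    for a in A:
        diff = a - mean
        G += diff.T @ diff
    G /= len(A)
    w, v = np.linalg.eigh(G)                  # eigenvalues ascending
    X = v[:, ::-1][:, :d]                     # top-d eigenvectors
    return X, mean

def project(img, X):
    """Feature matrix for one image: each row of img projected onto X."""
    return img @ X                            # shape: m x d
```

Because G is only n x n (rather than mn x mn), the eigen-decomposition is far cheaper, which is the computational advantage the abstract alludes to.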
Digital Hammurabi: design and development of a 3D scanner for cuneiform tablets
Daniel V. Hahn, Donald D. Duncan, Kevin C. Baldwin, et al.
Cuneiform is an ancient form of writing in which reed styluses were used to impress shapes upon moist clay tablets. Upon drying, the tablets preserved the written script with remarkable accuracy and durability. There are currently hundreds of thousands of cuneiform tablets spread throughout the world in both museums and private collections. The global dispersal of these artifacts presents several problems for scholars who wish to study them. It may be difficult or impossible to obtain access to a given collection. In addition, photographic records of the tablets often prove inadequate for proper examination, since photographs do not allow the lighting conditions and view direction to be altered. As a solution to these problems, we describe a 3D scanner capable of acquiring the shape, color, and reflectance of a tablet as a complete 3D object. This data set can then be stored in an online library and manipulated by suitable rendering software that allows a user to specify any view direction and lighting condition. The scanner utilizes a camera and telecentric lens to acquire images of the tablet under varying controlled illumination conditions. Image data are processed using photometric stereo and structured light techniques to determine the tablet shape; color information is reconstructed from primary-color monochrome image data. The scanned surface is sampled at 26.8 μm lateral spacing and the height information is calculated on a much finer scale. Scans of adjacent tablet sides are registered together to form a 3D surface model.
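Per pixel, Lambertian photometric stereo of the kind mentioned solves a small least-squares system I = L(ρn) for the albedo-scaled normal, given k ≥ 3 images under known light directions. A minimal single-pixel sketch (illustrative; the scanner's actual processing combines this with structured light and is more involved):

```python
import numpy as np

def photometric_stereo(intensities, lights):
    """Lambertian photometric stereo for one pixel: given k >= 3
    intensity measurements under known unit light directions, solve
    I = L @ (rho * n) in the least-squares sense and return the
    albedo rho and unit surface normal n."""
    L = np.asarray(lights, dtype=float)       # k x 3 light directions
    I = np.asarray(intensities, dtype=float)  # k intensities
    g, *_ = np.linalg.lstsq(L, I, rcond=None) # g = rho * n
    rho = np.linalg.norm(g)
    return rho, g / rho
```

Integrating the recovered normal field then yields the height map; the lateral sampling fixes the grid spacing while the normals carry the much finer height detail.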
Three-dimensional surface reconstruction for evaluation of the abrasion effects on textile fabrics
A. O. Mendes, P. T. Fiadeiro, R. A. L. Miguel
Abrasion is responsible for many surface changes that occur on garments; for this reason, the evaluation of its effects is very important for the textile industry. In particular, pilling formation is a phenomenon that results from the abrasion process and affects fabrics significantly, altering their surface severely. The present work presents a method based on optical triangulation that enables topographic reconstruction of textile fabric samples and consequently makes possible the evaluation and quantification of the pilling formation that results from these topographic changes. Specific algorithms, written in the MatLab programming language, were developed and implemented to control the image data acquisition, storage and processing procedures. Finally, with the available processed data it was possible to reconstruct the surface of fabric samples in three dimensions, and a coefficient expressing the pilling formation on the analyzed fabrics was obtained. Several tests and experiments have been carried out, and the results obtained show that this method is robust and precise.
3D environment capture from monocular video and inertial data
This paper presents experimental methods and results for 3D environment reconstruction from monocular video augmented with inertial data. One application targets sparsely furnished room interiors, using high quality handheld video with a normal field of view, and linear accelerations and angular velocities from an attached inertial measurement unit. A second application targets natural terrain with manmade structures, using heavily compressed aerial video with a narrow field of view, and position and orientation data from the aircraft navigation system. In both applications, the translational and rotational offsets between the camera and inertial reference frames are initially unknown, and only a small fraction of the scene is visible in any one video frame. We start by estimating sparse structure and motion from 2D feature tracks using a Kalman filter and/or repeated, partial bundle adjustments requiring bounded time per video frame. The first application additionally incorporates a weak assumption of bounding perpendicular planes to minimize a tendency of the motion estimation to drift, while the second application requires tight integration of the navigational data to alleviate the poor conditioning caused by the narrow field of view. This is followed by dense structure recovery via graph-cut-based multi-view stereo, meshing, and optional mesh simplification. Finally, input images are texture-mapped onto the 3D surface for rendering. We show sample results from multiple, novel viewpoints.
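The sparse estimation stage rests on a Kalman filter. A toy 1D constant-velocity filter over a single feature coordinate shows the predict/update cycle at its core (the authors' state vector also carries 3D structure and full camera pose, and their process/measurement noise values are unknown; q and r here are arbitrary assumptions):

```python
def kalman_cv(track, q=1e-3, r=1e-2):
    """1D constant-velocity Kalman filter over a measured position
    track (one measurement per frame, dt = 1). State is (position,
    velocity); returns the filtered position estimates."""
    x, v = track[0], 0.0
    P = [[1.0, 0.0], [0.0, 1.0]]             # state covariance
    out = []
    for z in track:
        # predict: x' = x + v, v' = v; P' = F P F^T + q*I
        x, v = x + v, v
        P = [[P[0][0] + P[0][1] + P[1][0] + P[1][1] + q, P[0][1] + P[1][1]],
             [P[1][0] + P[1][1], P[1][1] + q]]
        # update with position measurement z (H = [1, 0])
        S = P[0][0] + r
        k0, k1 = P[0][0] / S, P[1][0] / S    # Kalman gain
        y = z - x                            # innovation
        x, v = x + k0 * y, v + k1 * y
        P = [[(1 - k0) * P[0][0], (1 - k0) * P[0][1]],
             [P[1][0] - k1 * P[0][0], P[1][1] - k1 * P[0][1]]]
        out.append(x)
    return out
```

The same predict/update structure, with a much larger state, lets the paper fuse 2D feature tracks with inertial or navigation measurements in bounded time per frame.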
The effects of different shape-based metrics on the identification of military targets from 3D ladar data
Gregory J. Meyer, James R. Weber
The choice of shape metrics is important to effectively identify three-dimensional targets. The performance (expressed as a probability of correct classification) of four metrics using point clouds of military targets rendered using Irma, a government tool that simulates the output of an active ladar system, is compared across multiple ranges, sampling densities, target types, and noise levels. After understanding the range of operating conditions a classifier would be expected to see in the field, a process for determining the upper-bound of a classifier and the significance of this result is assessed. Finally, the effect of sampling density and variance in the position estimates on classification performance is shown. Classification performance significantly decreases when sampling density exceeds 10 degrees and the voxelized histogram metric outperforms the other three metrics used in this paper because of its performance in high-noise environments. Most importantly, this paper highlights a step-by-step method to test and evaluate shape metrics using accurate target models.
3D Scans of the Human I
Digital 3D facial reconstruction of George Washington
Anshuman Razdan, Jeff Schwartz, Mathew Tocheri, et al.
PRISM is a focal point of interdisciplinary research in geometric modeling, computer graphics and visualization at Arizona State University. Many projects in the last ten years have involved laser scanning, geometric modeling and feature extraction from data such as archaeological vessels, bones, and human faces. This paper gives a brief overview of a recently completed project on the 3D reconstruction of George Washington (GW). The project brought together forensic anthropologists, digital artists and computer scientists in the 3D digital reconstruction of GW at ages 57, 45 and 19, including detailed heads and bodies. Although many other scanning projects, such as the Digital Michelangelo project, have successfully captured fine details via laser scanning, our project took it a step further, i.e. to predict what the individual (in the sculpture) might have looked like in both later and earlier years, specifically through a process to account for reverse aging. Our base data were GW's face mask at the Morgan Library and Houdon's bust of GW at Mount Vernon, both made when GW was 53. Additionally, we scanned the statue at the Capitol in Richmond, VA, various dentures, and other items. Other measurements came from clothing and even portraits of GW. The digital GWs were then milled in high-density foam for a studio to complete the work. These will be unveiled at the opening of the new education center at Mount Vernon in fall 2006.
The study of craniofacial growth patterns using 3D laser scanning and geometric morphometrics
Throughout childhood, braincase and face grow at different rates and therefore exhibit variable proportions and positions relative to each other. Our understanding of the direction and magnitude of these growth patterns is crucial for many ergonomic applications and can be improved by advanced 3D morphometrics. The purpose of this study is to investigate this known growth allometry using 3D imaging techniques. The geometry of the head and face of 840 children, aged 2 to 19, was captured with a laser surface scanner and analyzed statistically. From each scan, 18 landmarks were extracted and registered using Generalized Procrustes Analysis (GPA). GPA eliminates unwanted variation due to position, orientation and scale by applying a least-squares superimposition algorithm to individual landmark configurations. This approach provides the necessary normalization for the study of differences in size, shape, and their interaction (allometry). The results show that throughout adolescence, boys and girls follow a different growth trajectory, leading to marked differences not only in size but also in shape, most notably in relative proportions of the braincase. These differences can be observed during early childhood, but become most noticeable after the age of 13 years, when craniofacial growth in girls slows down significantly, whereas growth in boys continues for at least 3 more years.
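The pairwise superimposition that GPA iterates over a sample can be written in a few lines: centre each landmark configuration, scale to unit centroid size, then find the least-squares rotation via an SVD. A sketch under that standard formulation (two configurations only; full GPA repeats this against a running mean shape):

```python
import numpy as np

def procrustes_align(A, B):
    """Superimpose landmark set B onto A, removing translation, scale
    and rotation. A and B are k x m arrays of corresponding landmarks.
    Returns the normalized A and the aligned, normalized B."""
    A0 = A - A.mean(axis=0)                  # remove position
    B0 = B - B.mean(axis=0)
    A0 = A0 / np.linalg.norm(A0)             # remove scale
    B0 = B0 / np.linalg.norm(B0)             # (unit centroid size)
    U, s, Vt = np.linalg.svd(B0.T @ A0)      # optimal rotation of B0 onto A0
    R = U @ Vt
    if np.linalg.det(R) < 0:                 # forbid reflections
        U[:, -1] *= -1
        R = U @ Vt
    return A0, B0 @ R
```

After superimposition, the residual coordinates encode pure shape, so size, shape, and their interaction (allometry) can be analyzed separately.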
A three-dimensional analysis of the geometry and curvature of the proximal tibial articular surface of hominoids
This study uses new three-dimensional imaging techniques to compare the articular curvature of the proximal tibial articular surface of hominoids. It has been hypothesized that the curvature of the anteroposterior contour of the lateral condyle in particular can be used to differentiate humans and apes and reflect locomotor function. This study draws from a large comparative sample of extant hominoids to obtain quantitative curvature data. Three-dimensional models of the proximal tibiae of 26 human, 15 chimpanzee, 15 gorilla, 17 orangutan, 16 gibbon and four Australopithecus fossil casts (AL 129-1b, AL 288-1aq, AL 333x-26, KNM-KP 29285A) were acquired with a Cyberware Model 15 laser digitizer. Curvature analysis was accomplished using a software program developed at Arizona State University's Partnership for Research In Stereo Modeling (PRISM) lab, which enables the user to extract curvature profiles and compute the difference between analogous curves from different specimens. Results indicate that the curvature of chimpanzee, gorilla and orangutan tibiae is significantly different from the curvature of human tibiae, thus supporting the hypothesized dichotomy between humans and great apes. The non-significant difference between gibbons and all other taxa indicates that gibbons have an intermediate pattern of articular curvature. All four Australopithecus tibiae were aligned with the great apes.
3D head model classification using optimized EGI
With the general availability of 3D digitizers and scanners, 3D graphical models are used widely in a variety of applications. This has led to the development of search engines for 3D models. In particular, 3D head model classification and retrieval have received more and more attention in view of their many potential applications in criminal identification, computer animation, the movie industry and the medical industry. This paper addresses the 3D head model classification problem using 2D subspace analysis methods such as 2D principal component analysis (2DPCA [3]) and 2D Fisher discriminant analysis (2DLDA [5]). It takes advantage of the fact that the histogram is a 2D image, from which the most useful information can be extracted to obtain a good result accordingly. As a result, there are two main advantages: first, we can perform less computation to obtain the same classification rate; second, we can reduce the dimensionality more than PCA does, obtaining higher efficiency.
3D face structure extraction using shape matching morphing model
Feng Xue, Xiaoqing Ding
In general, point correspondence and automatic face structure extraction are challenging problems. This is due to the fact that automatic extraction and matching of a set of significant feature points across different image views of the face, which are needed to recover the individual's 3-D face model, is a very hard machine task. In this paper, in order to bypass this problem, our method recovers both the pose and the 3-D face coordinates using a shape-matching morphing model and iterative minimization of a metric based on structure matching. A radial basis function (RBF) in 3-D is used to morph a generic face into the specific face structure, and shape context (SC) is used to describe point shape. Based on RBF and SC, a shape distance is used to measure the similarity of two shapes. Experimental results are shown for images of real faces, and promising results are obtained.
3D Scans of the Human II
Posture and re-positioning considerations for a complete torso topographic imaging system for assessing scoliosis
Peter O. Ajemba, Nelson G. Durdle, Doug L. Hill, et al.
The influence of posture and re-positioning (sway and breathing) on the accuracy of a torso imaging system for assessing scoliosis was evaluated. The system consisted of a rotating positioning platform and one or two laser digitizers. It required four partial scans taken at 90° intervals over 10 seconds to generate two complete torso scans. Its accuracy was previously determined to be 1.1±0.9 mm. Ten evenly spaced cross-sections obtained from forty scans of five volunteers in four postures (free-standing, holding side supports, holding front supports, and with their hands on their shoulders) were used to assess the variability due to posture. Twenty cross-sections from twenty scans of two volunteers holding side supports were used to assess the variability due to positioning. The variability due to posture was less than 4 mm at each cross-section for all volunteers. Variability due to sway ranged from 0-3.5 mm, while that due to breathing ranged from 0-3 mm for both volunteers. Holding side supports was the best posture, and taking the four shots within 10 seconds was optimal. As major torso features that are indicative of scoliosis are larger than 4 mm in size, the system can be used to obtain complete torso images for assessing and managing scoliosis.
Reverse engineering and rapid prototyping techniques to innovate prosthesis socket design
Giorgio Colombo, Massimiliano Bertetti, Daniele Bonacini, et al.
The paper presents an innovative approach, based entirely on digital data, to optimize the design of lower-limb socket prostheses. The approach builds on a detailed geometric model of the stump and replaces the plaster cast obtained through the traditional manual methodology with a physical model, realized with Rapid Prototyping technologies, which is then used for the socket lamination. The paper discusses a methodology to reconstruct a 3D geometric model of the stump able to describe, with high accuracy and detail, the complete structure subdivided into bones, soft tissues, muscular masses and dermis. Several technologies are used for stump acquisition: a non-contact laser technique for the external geometry, and CT and MRI imaging for the internal structure, the former dedicated to the geometric model of the bones, the latter to the soft tissues and muscles. We discuss problems related to the 3D geometric reconstruction: patient and stump positioning for the different acquisitions, definition of markers on the stump to identify landmarks, and alignment strategies for the different digital models, in order to define a protocol procedure with the accuracy required for socket realization. Some case studies illustrate the methodology and the results obtained.
4D data processing for dynamic human body analysis
It is expected that the next generation of full 3D optical scanning systems will be able to measure volumetric objects in motion. Standard data representations such as point clouds or sets of triangle meshes, used today for static 3D objects, will no longer be an efficient solution in this field, and systems of this kind will have to use other data processing and representation methods. In this paper we propose our own solution: instead of a standard point cloud representation, an arbitrary full 3D mesh is scaled and wrapped around a merged point cloud obtained from the measurements. This solution was specifically prepared for a prototype full-field 4D scanning system based on dynamic laser triangulation, in which four scanners capture the surface of a moving object from four different directions simultaneously. The scanners are calibrated in time and space, so we finally obtain a full 3D object surface that changes over time. We present details of the scanning system, the 4D surface representation, the general 4D data-processing pipeline, and the developed algorithms, and we show exemplary results of our work in this field.
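The template-wrapping idea above can be illustrated with a minimal sketch (our own illustration, not the authors' implementation): each vertex of a fixed-connectivity template mesh is pulled a fraction of the way toward its nearest neighbor in the merged point cloud, and the step is iterated until the template conforms to the scan.

```python
import numpy as np

def wrap_step(vertices, cloud, step=0.5):
    # One shrink-wrap iteration: pull every template vertex a fraction
    # of the way toward its nearest point in the merged scan cloud.
    # Mesh connectivity is untouched, which keeps vertex correspondence
    # consistent across frames of the 4D sequence.
    sq_dists = ((vertices[:, None] - cloud[None]) ** 2).sum(-1)
    idx = np.argmin(sq_dists, axis=1)
    return vertices + step * (cloud[idx] - vertices)
```

A real pipeline would interleave such attraction steps with smoothing to keep the mesh well-shaped; this sketch only shows the nearest-point attraction.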
Measuring human movement for biomechanical applications using markerless motion capture
Lars Mündermann, Stefano Corazza, Ajit M. Chaudhari, et al.
Modern biomechanical and clinical applications require the accurate capture of normal and pathological human movement without the artifacts associated with standard marker-based motion capture techniques, such as soft tissue artifacts and the artificial stimulus of taped-on or strapped-on markers. In this study, the need for new markerless human motion capture methods is discussed in view of biomechanical applications. Three different approaches for estimating human movement from multiple image sequences were explored. The first two approaches tracked a 3D articulated model in 3D representations constructed from the image sequences, while the third approach tracked a 3D articulated model in multiple 2D image planes. The three methods are systematically evaluated and results for real data are presented. The choice of appropriate technical equipment and algorithms for accurate markerless motion capture is critical. The implementation of this new methodology offers the promise of simple, time-efficient, and potentially more meaningful assessments of human movement in research and clinical practice.
Poster Session
Development of measurement system of three-dimensional shape and surface reflectance
Takeo Miyasaka, Kazuo Araki
We describe a three-dimensional measurement system that acquires not only the three-dimensional shapes of target objects but also their surface reflectance parameters. The system consists of one or more digital cameras, a digital projector, and a computer that controls them. For 3D geometric reconstruction we use the well-known gray-code structured-light method: gray-code light patterns are projected from the projector, and the illuminated scenes are captured by the cameras. For surface reflectance measurement we add further patterns, namely all-white and gray light patterns. To recover the complete shape of the target object, the object is measured repeatedly from various viewpoints, or from a fixed viewpoint while being moved by hand or on a turntable. After the measurement, the relative positions of the acquired range data sets are computed with the ICP algorithm. For each small region of the object surface we calculate reflectance parameters from the surface normals, the viewpoint (camera position), and the light position (projector position). Once enough samples of these three information sources have been obtained for each small surface region, we estimate the reflectance parameters at each surface point. We demonstrate this combined geometric and reflectance measurement method with experiments on a few objects.
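The gray-code decoding at the heart of such structured-light systems is compact enough to sketch. The snippet below (an illustrative reconstruction, not the authors' code) recovers the stripe index at one pixel from its sequence of binarized pattern observations, most significant bit first:

```python
def gray_to_binary(bits):
    # Convert a Gray-code bit sequence (MSB first) to the equivalent
    # plain binary bit sequence: each output bit is the XOR-prefix so far.
    value = bits[0]
    out = [value]
    for b in bits[1:]:
        value ^= b
        out.append(value)
    return out

def decode_stripe_index(bits):
    # Stripe index encoded at one pixel by n projected Gray-code patterns:
    # a bright observation contributes 1, a dark one 0, MSB first.
    index = 0
    for b in gray_to_binary(bits):
        index = (index << 1) | b
    return index
```

Gray codes are used because adjacent stripes differ in exactly one bit, so a binarization error at a stripe boundary shifts the index by at most one stripe.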
Use of laser 3D surface digitizer in data collection and 3D modeling of anatomical structures
Kelly Tse, Hans Van Der Wall, Dzung H. Vu M.D.
A laser digitizer (Konica-Minolta Vivid 910) is used to obtain three-dimensional surface scans of anatomical structures with a maximum resolution of 0.1 mm. Placing the specimen on a turntable allows multiple all-around scans, because the scanner only captures data from the portion facing its lens. A computer model is generated using 3D modeling software such as Geomagic. The 3D model can be manipulated on screen for repeated analysis of anatomical features, a useful capability when the specimens are rare or inaccessible (museum collections, fossils, imprints in rock formations). Since accurate measurements can be performed on the computer model, rather than only on the actual specimens at, for example, an archeological excavation site, a variety of quantitative data can later be obtained from the model in the laboratory as new ideas come to mind. Our group had previously used a mechanical contact digitizer (Microscribe) for this purpose, but with the surface digitizer we have been obtaining data sets more accurately and more quickly.
Volume intersection with imprecise camera parameters
Sayaka Sakamoto, Kenji Shoji, Hiroki Iwase, et al.
Volume intersection is one of the simplest techniques for reconstructing 3D shapes from 2D silhouettes. 3D shapes can be reconstructed from multiple view images by back-projecting them from the corresponding viewpoints and intersecting the resulting solid cones. The camera position and orientation (extrinsic camera parameters) of each viewpoint with respect to the object are needed to accomplish reconstruction. However, even a small error in the camera parameters makes the reconstructed 3D shape smaller than the one obtained with the exact parameters. The problem dealt with in this paper is determining good approximations of the camera parameters from multiple silhouette images and imprecise initial parameters. We attempt to optimize the camera parameters by reconstructing a 3D shape via volume intersection, reprojecting it to the image planes, and choosing the parameters whose projected silhouettes incur minimal loss of area when compared to the original silhouette images. For relatively large displacements of the camera parameters we propose a method that repeats the optimization using dilated silhouettes, which are gradually shrunk back to the originals. Experimental results demonstrate the effectiveness of this approach.
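The silhouette loss and the dilation used in the coarse-to-fine schedule can be sketched as follows; the pixel-set representation and helper names are our own illustration, not the authors' implementation:

```python
def dilate(silhouette, r):
    # Binary dilation of a pixel set with a (2r+1) x (2r+1) square element.
    # Dilated silhouettes tolerate larger camera-parameter errors; the
    # schedule starts with a large r and shrinks it back toward zero.
    out = set()
    for (x, y) in silhouette:
        for dx in range(-r, r + 1):
            for dy in range(-r, r + 1):
                out.add((x + dx, y + dy))
    return out

def area_loss(original, reprojected):
    # Pixels of the original silhouette left uncovered by the reprojected
    # visual hull; the camera parameters are chosen to minimize this count.
    return len(original - reprojected)
```

With the exact parameters the reprojected hull covers every silhouette, so the loss is zero; parameter errors shrink the hull and leave uncovered pixels.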
Development of ultra thin three-dimensional image capturing system
Kenji Yamada, Hiroko Mitsui, Toshiro Asano, et al.
We have developed an ultra-thin three-dimensional image capture system. The system uses a micro-lens array to form multiple images, which are captured on a photo-detector array; digital processing of the captured images is used to extract the surface profile. Preliminary experiments were performed on an evaluation system to verify the principles of the system. The proposed system employs compound-eye imaging together with post-processing. Experimental results verify the principle of the proposed method and show the potential of the proposed system architecture.
Run-based volume intersection for shape recovery of objects from their silhouettes
Kenji Shoji, Sayaka Sakamoto, Hiroki Iwase, et al.
Volume intersection (VI) is a successful technique for reconstructing 3D shapes from 2D images (silhouettes) of multiple views. It consists of intersecting the cones formed by back-projecting each silhouette; the 3D shapes reconstructed by VI are called the visual hull (VH). In this paper we propose a fast method for obtaining the VH. The method reduces the computational cost by using a run-based representation of 3D objects, the SPXY table, which we proposed previously. It forms the cones by back-projecting the 2D silhouettes into 3D space through the lens centers and intersects them while keeping the run representation. To intersect the cones of multiple views in the run representation, the runs representing the cones must share the same direction; to align them we use our previously proposed method of swapping two axes of a run-represented object at a time cost of O(n), where n is the number of runs. Experiments using VRML objects such as human bodies show that the proposed method can reconstruct a 3D object in less than 0.17 s at a resolution of 220 × 220 × 220 voxels from silhouettes of 8 viewpoints on a single CPU.
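The core run-wise intersection (shown here in a simplified one-row form, our own sketch rather than the SPXY-table implementation) merges two sorted lists of half-open runs along a common axis in time linear in the number of runs:

```python
def intersect_runs(a, b):
    # Intersect two sorted, non-overlapping lists of half-open runs
    # [start, end) along one axis. Applied row by row, this realizes
    # run-based volume intersection without ever expanding to voxels.
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo < hi:
            out.append((lo, hi))
        # Advance whichever run ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out
```

Because each cone is stored as runs, intersecting all viewpoints is a sequence of such merges once the run directions have been aligned by the axis-swapping step mentioned above.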
A prototype system for 3D measurement using flexible calibration method
We developed a 3D measurement system, consisting of a camera and a projector, that can be calibrated easily and quickly. In this system, the camera and the projector are calibrated in advance with Zhang's calibration method. The measurement procedure is as follows. The calibrated camera and projector are placed in front of a calibration plane. The relative pose between the camera and the projector is then computed by projecting a number of light patterns from the projector onto the calibration plane and capturing those images with the camera. The system then performs 3D measurement with gray-code pattern projection. Since the system can be calibrated easily, it does not need to be fixed precisely, and its configuration, namely the baseline and the measurement range, can be changed freely depending on the target and the purpose. Despite this easy calibration, the system obtains range data with a high accuracy of about 0.1% error.
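Once the relative pose is known, each measured point comes from intersecting a camera ray with the corresponding projector ray identified by the gray code. A minimal triangulation sketch (our illustration; the paper's own formulation may differ) takes the midpoint of the common perpendicular between the two rays:

```python
import numpy as np

def triangulate(o1, d1, o2, d2):
    # Closest point between two rays o_i + t_i * d_i: solve the 2x2
    # normal equations of min |o1 + t1*d1 - o2 - t2*d2|^2 by Cramer's
    # rule, then average the two closest points (the ray midpoint).
    d1 = d1 / np.linalg.norm(d1)
    d2 = d2 / np.linalg.norm(d2)
    b = o2 - o1
    a11, a12, a22 = d1 @ d1, -(d1 @ d2), d2 @ d2
    b1, b2 = d1 @ b, -(d2 @ b)
    det = a11 * a22 - a12 * a12      # zero only for parallel rays
    t1 = (b1 * a22 - a12 * b2) / det
    t2 = (a11 * b2 - a12 * b1) / det
    return (o1 + t1 * d1 + o2 + t2 * d2) / 2
```

With noisy calibration the two rays do not quite intersect, so the midpoint of the common perpendicular is the usual least-squares estimate.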
Synthesizing wide-angle and arbitrary view-point images from a circular camera array
We propose an Image-Based Rendering (IBR) technique that uses a circular camera array. By recording the scene from cameras surrounding it, we can synthesize more dynamic arbitrary-viewpoint images and wide-angle, panorama-like images. The method is based on Ray-Space, an image-based rendering representation similar to the Light Field. Ray-Space describes each ray by the position (x, y) and direction (θ, φ) at which it passes through a reference plane. When the cameras are arranged in a circle, the trajectory of a point, which forms a straight line in the Epipolar Plane Image (EPI) of a linear camera arrangement, becomes a sinusoidal curve. Although this description is very clear, determining which pixel of which camera to use during rendering becomes complicated. We therefore re-describe the space by camera position (s, t) and pixel position (u, v), as in the Light Field, expressing the camera position in a polar-coordinate system (r, θ) to bring it close to the Ray-Space description. The trajectory of a point then becomes a complicated periodic function with period 2π, but rendering becomes easy to handle. From this space, just as with a linear arrangement, arbitrary-viewpoint images can be synthesized purely from the geometric relationship between the cameras. Moreover, taking advantage of the fact that rays of all directions are recorded at overlapping positions around the circle, we propose a technique for generating wide-angle, panorama-like images. The above holds when the cameras are densely arranged and the plenoptic sampling condition is satisfied; we now turn to the discrete case, where the sampling condition is not met.

When cameras are arranged in a straight line and images are synthesized, a focus-like effect appears in spite of the pinhole camera model. This effect, peculiar to Light Field rendering when the sampling is insufficient, is called the synthetic aperture. We have previously synthesized all-in-focus images with a processing step called an adaptive filter, which builds a fully viewpoint-dependent disparity map centered on the viewpoint to be synthesized. The same phenomenon occurs with a circular arrangement, so we extend the adaptive filter to circular camera arrangements and synthesize all-in-focus images there as well. Although the epipolar lines are not parallel in the circular arrangement, we show that the extension can be derived from geometric information alone. With this method, we succeeded in wide-angle and arbitrary-viewpoint image synthesis from both densely sampled and discretely sampled spaces.
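As a toy 2D illustration of the polar re-parametrization (our own sketch under simplifying assumptions, not the authors' formulation), a camera on a circle of radius r at polar angle θ emits a ray at absolute direction φ; its ray-space coordinate is the crossing point with a reference line through the circle's center:

```python
import math

def ray_space_coords(r, theta, phi):
    # 2D sketch: camera center on a circle of radius r at polar angle
    # theta; a ray leaves it at absolute direction angle phi and crosses
    # the reference line y = 0. Returns the ray-space parameters
    # (crossing position x, direction phi).
    cx, cy = r * math.cos(theta), r * math.sin(theta)
    # Intersection with y = 0 along direction (cos phi, sin phi);
    # phi must not be 0 or pi (ray parallel to the reference line).
    t = -cy / math.sin(phi)
    return cx + t * math.cos(phi), phi
```

Tabulating these crossings for a fixed scene point as θ sweeps the circle traces the periodic (period 2π) trajectory the abstract describes.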
Procedure and algorithm of 3D reconstruction of large-scale ancient architecture
Song Xia, Yixuan Zhu, Xin Li
3D reconstruction plays an essential role in the documentation and protection of ancient architecture. In our work, 3D reconstruction and photogrammetry are mainly used to preserve the data and restore the 3D model of large-scale ancient architecture. This paper investigates the whole procedure together with an algorithm for spatial polyhedra. First, many conspicuous feature points are laid out around the huge granite in order to construct a local, temporary 3D control field of sufficiently high precision, and feature points on the granite are obtained by means of photogrammetry. We use the Direct Linear Transform (DLT) to calculate the coordinates of the feature points, obtaining an accuracy evaluation of all feature points at the same time. A new generation algorithm for spatial convex polyhedra is presented and implemented efficiently, yielding a 3D model of the granite. To reduce duplicate storage of the model's points and edges, model connection and optimization are performed to complete the modeling process. Realistic materials are attached to the 3D model in 3ds Max; finally, rendering and animation of the model are completed, giving the reconstructed model of the granite. Using this approach we successfully realized the 3D reconstruction of large-scale ancient architecture.
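The DLT step reduces to a linear least-squares problem. The sketch below is an illustrative reconstruction of the standard 11-parameter DLT (helper names are ours, not the paper's implementation): each world/image correspondence contributes two linear equations in the parameters L1..L11, with L12 normalized to 1:

```python
import numpy as np

def dlt_calibrate(points_3d, points_2d):
    # Solve the 11 DLT parameters from >= 6 non-coplanar world/image
    # correspondences by linear least squares (L12 fixed to 1).
    rows, rhs = [], []
    for (X, Y, Z), (u, v) in zip(points_3d, points_2d):
        rows.append([X, Y, Z, 1, 0, 0, 0, 0, -u * X, -u * Y, -u * Z])
        rhs.append(u)
        rows.append([0, 0, 0, 0, X, Y, Z, 1, -v * X, -v * Y, -v * Z])
        rhs.append(v)
    L, *_ = np.linalg.lstsq(np.array(rows, float), np.array(rhs, float),
                            rcond=None)
    return L

def dlt_project(L, point):
    # Reproject a world point with the recovered DLT parameters.
    X, Y, Z = point
    w = L[8] * X + L[9] * Y + L[10] * Z + 1.0
    u = (L[0] * X + L[1] * Y + L[2] * Z + L[3]) / w
    v = (L[4] * X + L[5] * Y + L[6] * Z + L[7]) / w
    return u, v
```

The residuals of the least-squares solution give the simultaneous accuracy evaluation of the feature points that the abstract mentions.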
Real-time 3D image-guided patient positioning in radiation therapy
Dezhi Liu, Gongjie Yin, Shidong Li
Patient positioning in modern radiotherapy is becoming far more important because a small positioning error may result in missing the target and irradiating normal tissues when treating small lesions. The clinical outcome of radiotherapy can potentially be improved by increasing the precision of tumor localization and dose delivery during treatment. In this paper, accurate and precise patient positioning is achieved through the alignment of real-time three-dimensional (3D) surface images with a reference surface image. The real-time 3D surface is captured with a state-of-the-art 3D stereovision system and then matched to the pre-defined reference surface generated from treatment planning data. Positioning parameters are calculated by automatically aligning the real-time surface to the reference surface via a modified Iterative Closest Point (ICP) algorithm. Results from phantom experiments and clinical applications demonstrated an excellent efficiency of <2 minutes and the desired accuracy and precision of <1 mm in isocenter shifts and <1 degree in rotations.
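The surface alignment can be illustrated with a basic point-to-point ICP sketch. The paper uses a modified ICP; the code below is only the textbook variant with our own helper names: each iteration matches source points to their nearest reference points and re-estimates the rigid transform by SVD (the Kabsch method).

```python
import numpy as np

def best_rigid_transform(src, dst):
    # Least-squares rotation R and translation t mapping src onto dst
    # for known correspondences (Kabsch method via SVD).
    cs, cd = src.mean(0), dst.mean(0)
    H = (src - cs).T @ (dst - cd)
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection: force det(R) = +1.
    D = np.diag([1.0, 1.0, np.linalg.det(Vt.T @ U.T)])
    R = Vt.T @ D @ U.T
    return R, cd - R @ cs

def icp(src, dst, iters=20):
    # Point-to-point ICP: alternate nearest-neighbor matching and
    # rigid-transform estimation, applying the transform each round.
    cur = src.copy()
    for _ in range(iters):
        sq_dists = ((cur[:, None] - dst[None]) ** 2).sum(-1)
        idx = np.argmin(sq_dists, axis=1)
        R, t = best_rigid_transform(cur, dst[idx])
        cur = cur @ R.T + t
    return cur
```

The rotation and translation recovered by the final iteration correspond directly to the couch shifts and rotations reported to the therapist.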