A single-camera system captures high-resolution 3D images in one shot
Although we live in a 3D world, most of the data we record and display is still stored in 1D or 2D formats. As a result, information from the third dimension, which often aids human judgment, is lost. In many cases this depth information is critical for measuring an object's surface shape and profile, and for perceiving its distance. To address these limitations, researchers have for many years investigated methods of acquiring 3D information from objects and scenes.1 Existing 3D measurement techniques include laser scanning, grating projection, stereo vision, and time-of-flight methods.
We introduce a prism/mirror optical setup that combines the two views of a stereo pair into one image using a single digital camera. This 3D imaging technology combines structured-light projection and stereo vision in a unique way: it splits a single camera view into a stereo pair, with the option of projecting a color-coded structured-light pattern onto the object using a synchronized flash source, as shown in Figure 1.2,3 The system achieves 3D imaging in a single flash lasting less than 1/100 of a second, and is therefore robust to object motion and changing environments. Its depth resolution is accurate to 0.1mm. We have also developed advanced algorithms for generating 3D models quickly and accurately.
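The article does not specify the coding scheme of the projected grid. As a rough illustration only, the following sketch generates a repeating color-coded stripe pattern of the general kind used in structured-light projection; the color sequence, stripe width, and resolution are all assumptions, not the system's actual pattern:

```python
import numpy as np

# Six distinct stripe colors, repeated; a local neighborhood of stripes is
# then identifiable by its color sequence. (Illustrative only: the actual
# coding scheme of the system is not described in the article.)
COLORS = np.array([[255, 0, 0], [0, 255, 0], [0, 0, 255],
                   [255, 255, 0], [255, 0, 255], [0, 255, 255]],
                  dtype=np.uint8)

def stripe_pattern(width, height, stripe_px):
    """Build an RGB image of vertical color-coded stripes."""
    pattern = np.zeros((height, width, 3), dtype=np.uint8)
    for x in range(width):
        pattern[:, x] = COLORS[(x // stripe_px) % len(COLORS)]
    return pattern

img = stripe_pattern(640, 480, stripe_px=8)  # VGA-sized test pattern
```

A projector or flash unit would display such a pattern so that stripe colors, observed in both stereo views, disambiguate correspondences on smooth, textureless surfaces.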
Before it is used to capture data, the 3D camera must first be calibrated. The calibration procedure determines the extrinsic parameters, namely the rotation matrix R and the translation vector t, together with the camera's intrinsic calibration matrix K. These parameters are further refined by a bundle adjustment algorithm,4 which jointly optimizes the camera poses and the 3D data points during the capture process.
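The role of these parameters can be sketched with the standard pinhole projection model, which maps a 3D point X to pixel coordinates via x = K(RX + t). The numeric values of K, R, and t below are purely illustrative, not the calibrated values of the actual system:

```python
import numpy as np

# Illustrative intrinsic matrix K: focal length in pixels, principal point.
K = np.array([[1000.0,    0.0, 320.0],
              [   0.0, 1000.0, 240.0],
              [   0.0,    0.0,   1.0]])

# Illustrative extrinsic parameters: identity rotation, zero translation.
R = np.eye(3)
t = np.zeros(3)

def project(X, K, R, t):
    """Project a 3D world point X to pixel coordinates: x = K (R X + t)."""
    cam = R @ X + t          # world -> camera coordinates
    uvw = K @ cam            # camera -> homogeneous image coordinates
    return uvw[:2] / uvw[2]  # perspective divide

# A point 2m in front of the camera projects near the principal point.
print(project(np.array([0.1, 0.05, 2.0]), K, R, t))  # → [370. 265.]
```

Bundle adjustment refines K, R, t, and the 3D points together by minimizing the reprojection error, i.e., the distance between projections like the one above and the actually observed image points.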
Once the camera's parameters have been determined, the images can be rectified. After rectification, corresponding points lie along the same horizontal lines in the two images of the stereo pair, as shown in Figure 2. The stereo matching search is thereby reduced to one dimension, saving computation time.
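To illustrate the one-dimensional search that rectification enables, here is a minimal block-matching sketch along a single scanline using a sum-of-absolute-differences cost. This is a generic stereo technique shown only for illustration; the system's actual matching algorithm, which also exploits the color-coded pattern, is not detailed here:

```python
import numpy as np

def match_row(left_row, right_row, x, half_win, max_disp):
    """Find the disparity for pixel x on one rectified scanline.

    Because the images are rectified, the corresponding point lies on the
    same row, so we only search horizontally over candidate disparities.
    """
    patch = left_row[x - half_win : x + half_win + 1]
    best_d, best_cost = 0, np.inf
    for d in range(0, min(max_disp, x - half_win) + 1):
        cand = right_row[x - d - half_win : x - d + half_win + 1]
        cost = np.abs(patch - cand).sum()  # sum of absolute differences
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d

# Synthetic scanlines: the right row is the left row shifted by 4 pixels,
# so the recovered disparity at x = 30 should be 4.
rng = np.random.default_rng(0)
left = rng.random(64)
right = np.empty_like(left)
right[:-4] = left[4:]
right[-4:] = 0.0
print(match_row(left, right, x=30, half_win=3, max_disp=10))  # → 4
```

The disparity found this way, together with the calibrated focal length and stereo baseline, yields the depth of each matched point by triangulation.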
We have used the 3D imaging system to capture both live and static objects. A digital camera with a resolution of 5Mpixels was used, and single shots were taken for each image view. In addition to the hardware setup, software was implemented to calibrate the camera parameters, download the images from the digital camera via a USB link, reconstruct the 3D models, stitch the models together, and display them.
Figure 3 shows the 3D reconstruction of the face of a model. A color-coded grid pattern was flashed onto the surface as the camera captured the stereo image pair: see Figure 3(a). The software then reconstructed the 3D model with detailed facial features, as shown in Figure 3(b). Finally, smoothing the surface, as shown in Figure 3(c), gives a better visual appearance but sacrifices some fine-grained detail.
The 3D camera has a spatial resolution of 1.546mm/pixel, and a sub-pixel matching algorithm used in the 3D calculation improves the matching precision to between 1/3 and 1/10 of a pixel. The final resolution is thus improved to 0.515mm, and the mean error is reduced to 0.17mm.
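One common way to obtain sub-pixel matching precision is parabolic interpolation of the matching cost around the best integer disparity; the article does not specify which algorithm the system uses, so the sketch below is illustrative only. The final line reproduces the article's resolution arithmetic for the 1/3-pixel case:

```python
def subpixel_peak(c_prev, c_best, c_next):
    """Refine an integer disparity by fitting a parabola through the costs
    at d-1, d, d+1 and returning the fractional offset of its vertex."""
    denom = c_prev - 2.0 * c_best + c_next
    return 0.5 * (c_prev - c_next) / denom if denom != 0 else 0.0

# Example: costs 5, 2, 4 around the best match shift the peak by +0.1 pixel.
print(subpixel_peak(5.0, 2.0, 4.0))  # → 0.1

# Resolution arithmetic from the text: 1.546 mm/pixel at 1/3-pixel precision.
print(round(1.546 / 3, 3))  # → 0.515
```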
There are numerous potential applications for the system. Three-dimensional vision can be used to improve the robustness of object recognition, and our imaging technique enables real-time acquisition of 3D features and profiles that a regular 2D camera cannot achieve. A 3D model of a target gives complete information, including physical size, surface profile, illumination and shading, viewing angles, and distance information, that cannot be obtained using conventional imaging modalities.
The system can be adapted to capture a 3D face model instead of a 2D mug shot. Figure 4 shows an example of an unwrapped texture map of a 3D head model. It can be used to register a medical patient's body with ultrasound, magnetic resonance, or computed tomography imaging, or in plastic and reconstructive surgery for 3D modeling, simulation, visualization, and quantitative measurement of body parts. The technology also has potential for noncontact facial mask generation for burn patients, and can be used in industrial applications such as machine vision, real-time 3D imaging and inspection, rapid prototyping, and modeling of machine parts.
The authors wish to thank Frank Wu and John J. Zhang for valuable input and software implementation.
Thomas Lu is a senior engineer at the Jet Propulsion Laboratory. He led the research and development of high-speed 3D imaging systems, hyperspectral image processing, and parallel optical processors. He has coauthored two book chapters and more than 50 professional papers, and holds seven US patents and several international ones. In February 1998 he was an organizer and co-chairman of the SPIE International Conference on Multidimensional Spectroscopy: Acquisition, Interpretation, and Automation in San Jose.
Tien-Hsin Chao is a principal scientist and group leader at the Jet Propulsion Laboratory. His research interests include optical and digital pattern recognition, neural network signal/image processing and applications, and development of hyperspectral Fourier transform imaging spectrometers and holographic data storage systems. A SPIE Fellow, he has published more than 80 technical papers and has organized and chaired the annual Optical Pattern Recognition Conference for SPIE since 1990.