Denoising 3D images from time-of-flight cameras using extended anisotropic diffusion
Computer vision relies heavily on images and movies collected from camera recordings. Because these are essentially measurements taken from the real world, they are intrinsically noisy. Often, one of the first processing steps for advanced applications is to reduce the level of noise in the images. Standard image denoising techniques usually operate on single (streams of) images and assume a uniform noise level across them. If multiple streams of mutually dependent images (i.e., they share edges and so forth) are available and local noise levels can be estimated, such information can be exploited to improve denoising quality. We extend three standard denoising methods to exploit the additional information available in time-of-flight (TOF) camera data. One of these methods—extended anisotropic diffusion—can be implemented to operate in real time. It also improves TOF camera image quality, especially in high-noise regions, while still preserving edges and details in low-noise regions.
Time-of-flight range imaging records intensity and distance (range) information for each pixel by measuring the reflectance level and travel time of light issued by a light source close to the camera and reflected by objects in the recorded scene. Measurements of intensity and range made using this acquisition scheme are affected by certain kinds of noise.1 In particular, very high levels of noise occur in range-image regions with low levels of reflected light2 (e.g., dark image patches) because less data is available for estimating range. By the same token, regions of high reflection exhibit noise levels orders of magnitude lower. This inhomogeneity makes several standard denoising approaches unsuitable.
We have described three alternatives that extend standard denoising approaches to make use of the additional information available in TOF imaging by virtue of its two linked (intensity and range) images.1,3 The intensity image gives information about local noise levels in the range image, which can be used to adapt the denoising parameters locally. The three alternatives are these: we adapt the threshold in wavelet thresholding to local noise levels; we segment the image into regions of similar noise characteristics and choose appropriate denoising parameters per region; or we modify the diffusion (i.e., smoothing) process of anisotropic (more precisely, Perona-Malik) diffusion filtering to take different noise levels into account, an approach we call extended anisotropic diffusion (EAD).
Table 1. Denoising error (mean ± standard deviation) of the compared methods for scenes with very low and very strong noise, together with the scenes' signal-to-noise ratios (SNR).

| Scene characteristics | Averaging | EAD | Clustering | Wavelet | SNR (dB) |
| --- | --- | --- | --- | --- | --- |
| Very low noise | 0.15±0.020 | 0.03±0.005 | 0.09±0.039 | 0.07±0.011 | 11.69±2.336 |
| Very strong noise | 1.17±0.137 | 0.28±0.041 | 0.26±0.164 | 0.46±0.098 | −4.66±3.219 |
Each of these methods improves on the denoising technique traditionally used for TOF image data, temporal averaging, given the same number of input images for each method. The most robust and effective variant is EAD: it allows real-time processing of images (at least with graphics processing unit, or GPU, computing support) and qualitatively achieves the best and fairly robust results (see Table 1 and Figure 1).
Standard isotropic Gaussian smoothing can be formulated as a differential equation.4 Here, ϕ is the range image, the smoothing strength is determined by the length of time t of diffusion propagation, and ∇ is the gradient operator, which computes the vector of partial derivatives ∂. Thus,

∂ϕ/∂t = ∇ · (∇ϕ),
similar to the common formulation of anisotropic diffusion,

∂ϕ/∂t = ∇ · (D ∇ϕ), with, for example, D = exp(−|∇ϕ|²/c²),

which introduces the diffusivity term D (here shown with one example choice) to inhibit diffusion between neighboring image pixels if they are located on an edge in the image (c sets the strength of diffusion inhibition).5 This formulation takes into account the prior knowledge that strong edges are less likely to be caused by noise than are isolated peaks or weaker edges.
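As an illustration, Perona-Malik diffusion with the exponential edge-stopping function above can be sketched in a few lines of NumPy. This is a minimal sketch: the time step dt, iteration count, and edge threshold c are illustrative choices, and image borders are handled by wrap-around for brevity rather than by the replicated borders a production implementation would use.

```python
import numpy as np

def perona_malik(phi, n_iter=20, dt=0.2, c=0.1):
    """Perona-Malik anisotropic diffusion (sketch).

    phi: 2D image as a float array; c: edge threshold of the
    diffusivity D = exp(-(|grad phi| / c)^2); dt: time step
    (keep <= 0.25 for stability on a 4-neighbour grid).
    """
    phi = phi.astype(float).copy()
    for _ in range(n_iter):
        # differences to the four neighbours (borders wrap around)
        dN = np.roll(phi, -1, axis=0) - phi
        dS = np.roll(phi, 1, axis=0) - phi
        dE = np.roll(phi, -1, axis=1) - phi
        dW = np.roll(phi, 1, axis=1) - phi
        # edge-stopping diffusivity per direction: large differences
        # (likely edges) get weights near zero and are preserved
        gN = np.exp(-(dN / c) ** 2)
        gS = np.exp(-(dS / c) ** 2)
        gE = np.exp(-(dE / c) ** 2)
        gW = np.exp(-(dW / c) ** 2)
        # explicit diffusion step
        phi += dt * (gN * dN + gS * dS + gE * dE + gW * dW)
    return phi
```

Running this on a noisy step image smooths the flat regions (where neighbour differences are small and D is close to one) while leaving the step itself almost untouched (where D is close to zero).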
We extend anisotropic diffusion along these lines by incorporating further prior knowledge taken from the intensity image: knowledge about the local noise level, derived from the reciprocal relationship between intensity and range-noise level, and knowledge about edges in the intensity image, which make edges in the range image more probable, since color and material changes often mark boundaries between distinct objects at different ranges.
As a result, in EAD we extend D to be

D = σ²ϕ / (1 + |∇ϕ|²/c²ϕ + |∇I|²/c²I + |∇σ²ϕ|²/c²σ).
The local (i.e., location-dependent) noise variance term in the numerator scales diffusion according to the local noise level, while the terms in the denominator inhibit diffusion at edges in the range image ϕ, at edges in the intensity image I, and at edges in the noise-level image σ²ϕ derived from I or from subsequent range images. To increase robustness, these images, and specifically σ²ϕ, can be smoothed before this term is applied.
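Based on the description above, the location-dependent diffusivity can be sketched as follows. This is an illustration under assumptions: the constants c_phi, c_int, and c_sig, and the simple reciprocal noise model sigma² = k/I, are hypothetical stand-ins for calibrated values, not the paper's actual parameters.

```python
import numpy as np

def grad_mag2(a):
    """Squared gradient magnitude |grad a|^2 via central differences."""
    gy, gx = np.gradient(a)
    return gy ** 2 + gx ** 2

def ead_diffusivity(phi, intensity, c_phi=0.1, c_int=0.1, c_sig=0.1, k=0.01):
    """Location-dependent diffusivity for extended anisotropic diffusion.

    Assumes (hypothetically) that range-noise variance is reciprocal to
    intensity, sigma2 = k / I.  The numerator scales diffusion with the
    local noise level; the denominator terms inhibit diffusion at edges
    in the range image phi, the intensity image I, and the noise-level
    image sigma2.
    """
    sigma2 = k / np.maximum(intensity, 1e-6)  # avoid division by zero
    return sigma2 / (1.0
                     + grad_mag2(phi) / c_phi ** 2
                     + grad_mag2(intensity) / c_int ** 2
                     + grad_mag2(sigma2) / c_sig ** 2)
```

In a diffusion step, this D simply replaces the scalar edge-stopping function of plain Perona-Malik diffusion; smoothing sigma2 beforehand (e.g., with a small Gaussian) increases robustness, as noted above.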
In summary, we have taken a number of standard image denoising approaches, extended them to take into account the special noise structure in TOF camera images, and evaluated the modified versions on synthetic and real image data. A comparison with common temporal denoising shows that, given the same amount of image data, all these methods are superior to averaging or to their non-extended denoising counterparts. The EAD method in particular proved to be very useful in terms of computation, as well as in denoising. We are continuing our research on TOF camera data with respect to scene analysis as well as further improving denoising for streamed image data.
The author is grateful for the support of the Austrian Funding Program COMET.
Holger Schöner received an MS from the University of Colorado at Boulder and from the Technical University of Berlin, where, in 2005, he also received his PhD in computer science. At SCCH he works on applied research problems in data mining, machine learning, and image processing.