Proceedings Volume 9273

Optoelectronic Imaging and Multimedia Technology III

Qionghai Dai, Tsutomu Shimura
View the digital version of this volume at the SPIE Digital Library.

Volume Details

Date Published: 10 December 2014
Contents: 10 Sessions, 100 Papers, 0 Presentations
Conference: SPIE/COS Photonics Asia 2014
Volume Number: 9273

Table of Contents


All links to SPIE Proceedings will open in the SPIE Digital Library.
  • Front Matter: Volume 9273
  • Optical Information Processing
  • Multispectral and Hyperspectral Imaging
  • High-Speed and High-Resolution Imaging
  • 3D Image/Video Systems
  • Machine Vision Methods, Architectures, and Applications
  • Computational Imaging
  • Image/Video Analysis, Processing, and Retrieval
  • Time-of-Flight Imaging
  • Poster Session
Front Matter: Volume 9273
Front Matter: Volume 9273
This PDF file contains the front matter associated with SPIE Proceedings Volume 9273, including the Title Page, Copyright information, Table of Contents, and Conference Committee listing.
Optical Information Processing
Illuminant spectrum estimation using a digital color camera and a color chart
Junsheng Shi, Hongfei Yu, Xiaoqiao Huang, et al.
Illuminant estimation is the main step in color constancy processing and an important prerequisite for digital color image reproduction and many computer vision applications. In this paper, a method for estimating the illuminant spectrum is investigated using a digital color camera and a color chart whose spectral reflectance is known. The method is based on measuring the CIEXYZ values of the chart with the camera. The first step is to obtain the camera's color correction matrix and gamma values by photographing the chart under a standard illuminant. The second step is to photograph the chart under the illuminant to be estimated; the camera's inherent RGB values are converted to standard sRGB values and further to the CIEXYZ values of the chart. Based on the measured CIEXYZ values and the known spectral reflectance of the chart, the spectral power distribution (SPD) of the illuminant is estimated using Wiener estimation and smoothing estimation. To evaluate the performance of the method quantitatively, the goodness-of-fit coefficient (GFC) was used to measure the spectral match, and the CIELAB color difference metric was used to evaluate the color match between color patches under the estimated and actual SPDs. A simulated experiment was carried out to estimate the CIE standard illuminants D50 and C using the X-Rite ColorChecker 24-color chart, and an actual experiment was carried out to estimate daylight and illuminant A using two consumer-grade cameras and the chart; the results verify the feasibility of the investigated method.
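As a rough illustration of the Wiener estimation step described in the abstract above (not the authors' code; the reflectances and colour matching functions below are synthetic stand-ins for the real chart data), the SPD can be recovered from stacked tristimulus measurements with a smoothness prior:

```python
import numpy as np

rng = np.random.default_rng(0)
n_wl = 31                                  # 400-700 nm at 10 nm steps
wl = np.linspace(400, 700, n_wl)

# Hypothetical stand-ins: smooth random reflectances for 24 patches, and
# Gaussian-bump "colour matching functions" (real CMFs and chart data are
# assumed known in the paper).
refl = np.clip(np.cumsum(rng.normal(0, 0.02, (24, n_wl)), axis=1) + 0.5,
               0.05, 0.95)
cmf = np.stack([np.exp(-0.5 * ((wl - mu) / 40.0) ** 2)
                for mu in (600.0, 550.0, 450.0)])

spd_true = 1.0 + 0.5 * np.sin(wl / 60.0)   # illuminant SPD to recover

# Forward model: tristimulus of patch i is cmf @ (refl_i * spd)
M = np.vstack([cmf * r for r in refl])      # shape (3*24, n_wl)
v = M @ spd_true                            # stacked XYZ measurements

# Smoothness prior: first-order Markov correlation matrix over wavelength
rho = 0.999
C = rho ** np.abs(np.subtract.outer(np.arange(n_wl), np.arange(n_wl)))

# Wiener estimate with a small regularizer for numerical stability
W = C @ M.T @ np.linalg.inv(M @ C @ M.T + 1e-6 * np.eye(M.shape[0]))
spd_est = W @ v

# Goodness-of-fit coefficient (GFC): cosine similarity of the spectra
gfc = spd_est @ spd_true / (np.linalg.norm(spd_est) * np.linalg.norm(spd_true))
print(round(gfc, 4))
```

With noiseless synthetic measurements the GFC is close to 1; the paper's smoothing-estimation variant and the camera characterization steps are omitted here.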
High dynamic range coding imaging system
We present a high dynamic range (HDR) imaging system design based on the coded aperture technique. This scheme yields HDR images with extended depth of field. We adopt a sparse coding algorithm to design the coded patterns, then use the sensor unit to acquire coded images under different exposure settings. Guided by the multiple exposure parameters, a series of low dynamic range (LDR) coded images are reconstructed, and existing algorithms fuse the LDR images into a displayed HDR image. We build an optical simulation model and obtain simulation images to verify the proposed system.
Compressive photography based on lens array with coded mask
Yuanchao Du, Xingzheng Wang, Haoqian Wang, et al.
A plenoptic camera records the 4D light field by storing both spatial and angular information, but it introduces a trade-off between spatial resolution and angular resolution. We propose a new camera design in which the light field is modulated in the Fourier domain. A high-resolution 4D light field can be reconstructed from the coded image by sparse reconstruction. A simulation is carried out to evaluate the performance of the design; the reconstructed light field outperforms that of a conventional plenoptic camera.
A Model of PSF Estimation for Coded Mask Infrared Imaging
The point spread function (PSF) of an imaging system with a coded mask is generally acquired by practical measurement with a calibration light source. Because the thermal radiation of the coded mask is more severe than in visible-light imaging systems, burying the modulation effects of the mask pattern, it is difficult to estimate and evaluate the performance of a mask pattern from measured results. To tackle this problem, a model for infrared imaging systems with coded masks is presented in this paper. The model is composed of two functional components: coded mask imaging with ideal focused lenses, and imperfect imaging with practical lenses. Ignoring thermal radiation, the system's PSF can then be represented by a convolution of the diffraction pattern of the mask with the PSF of the practical lenses. To evaluate the performance of different mask patterns, a set of criteria is designed according to different imaging and recovery methods. Furthermore, imaging results with inclined plane waves are analyzed to obtain the variation of the PSF across the field of view, and the influence of the mask cell size is analyzed to control the diffraction pattern. Numerical results show that mask patterns for direct imaging systems should have more random structure, while more periodic structure is needed in systems with image reconstruction. By adjusting the combination of random and periodic arrangement, a desired diffraction pattern can be achieved.
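The decomposition in the abstract above can be summarized in one line; writing h_mask for the diffraction pattern of the mask under ideal focusing and h_lens for the PSF of the practical lenses (notation assumed here, not the authors'):

```latex
h_{\mathrm{sys}}(x,y) \;=\; h_{\mathrm{mask}}(x,y) \ast h_{\mathrm{lens}}(x,y)
```

where the asterisk denotes 2D convolution and thermal radiation of the mask is neglected.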
Multispectral and Hyperspectral Imaging
Real-time automatic small infrared target detection using local spectral filtering in the frequency domain
Accurate and fast detection of small infrared targets is very important for infrared precision guidance, early warning, video surveillance, etc. Based on the human visual attention mechanism, an automatic detection algorithm for small infrared targets is presented. Instead of searching for infrared targets, we model the regular patches that do not attract much attention from our visual system, inspired by the property that regular patches in the spatial domain correspond to spikes in the amplitude spectrum. Unlike recent approaches using global spectral filtering, we define the concept of local maxima suppression using local spectral filtering to smooth the spikes in the amplitude spectrum, thereby making the infrared targets pop out. In the proposed method, we first compute the amplitude spectrum of an input infrared image. Second, we find the local maxima of the amplitude spectrum using a cubic facet model. Third, we suppress the local maxima by convolving the local spectrum with a low-pass Gaussian kernel of an appropriate scale. Finally, the detection result in the spatial domain is obtained by reconstructing the 2D signal from the original phase and the log amplitude spectrum with its local maxima suppressed. Experiments on real-life IR images show that the proposed method has satisfying detection effectiveness and robustness; it also has high detection efficiency and can further be used for real-time detection and tracking.
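A simplified sketch of the frequency-domain pipeline in the abstract above (hypothetical code: it replaces the paper's facet-model local-maxima suppression with plain Gaussian smoothing of the log-amplitude spectrum, and uses a synthetic frame):

```python
import numpy as np

def gaussian_kernel(size=9, sigma=2.0):
    ax = np.arange(size) - size // 2
    g = np.exp(-0.5 * (ax / sigma) ** 2)
    k = np.outer(g, g)
    return k / k.sum()

def detect_small_targets(img):
    """Frequency-domain saliency: smooth the log-amplitude spectrum and
    reconstruct with the original phase."""
    F = np.fft.fft2(img)
    log_amp = np.log1p(np.abs(F))
    phase = np.angle(F)
    # Smooth the log-amplitude spectrum (stand-in for the paper's
    # facet-model local-maxima suppression step)
    k = gaussian_kernel()
    pad = k.shape[0] // 2
    padded = np.pad(log_amp, pad, mode="wrap")
    smooth = np.zeros_like(log_amp)
    for i in range(log_amp.shape[0]):
        for j in range(log_amp.shape[1]):
            smooth[i, j] = np.sum(padded[i:i + k.shape[0],
                                         j:j + k.shape[1]] * k)
    residual = log_amp - smooth
    # Reconstruct with original phase; squared magnitude is the saliency map
    sal = np.abs(np.fft.ifft2(np.exp(residual + 1j * phase))) ** 2
    return sal

# Synthetic IR frame: flat background plus one small bright target
img = np.full((64, 64), 0.3)
img[30:32, 40:42] += 1.0
sal = detect_small_targets(img)
peak = np.unravel_index(np.argmax(sal), sal.shape)
print(peak)
```

On this toy frame the saliency peak lands on the implanted blob; the real method additionally restricts the smoothing to detected spectral maxima.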
Texture-adaptive hyperspectral video acquisition system with a spatial light modulator
We present a new hybrid camera system based on a spatial light modulator (SLM) to capture texture-adaptive high-resolution hyperspectral video. The system records a hyperspectral video with low spatial resolution using a gray camera and a high-spatial-resolution video using an RGB camera. The hyperspectral video is subsampled by the SLM, and the subsampled points can be adaptively selected according to the texture characteristics of the scene by combining digital image analysis with computational processing. In this paper, we propose an adaptive sampling method utilizing texture segmentation and the wavelet transform (WT), and we demonstrate the effectiveness of the sampling pattern produced on the SLM by the proposed method.
Color correction of underwater images using spectral data
Due to the absorption and scattering of water, images acquired in underwater environments have different colors from those taken in air, which can cause problems for image processing and object recognition. Addressing the problem of color correction, this paper presents a color restoration method based on the water absorption spectrum. Considering the nonlinear attenuation of light at different wavelengths and depths, the changes of the tristimulus values are calculated. Experiments are carried out in coastal seawater, and the calculated changes of the tristimulus values are used to compensate for the color loss. The results demonstrate the feasibility of our method.
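A minimal sketch of the compensation idea, assuming simple Beer-Lambert attenuation with hypothetical per-channel coefficients (the paper instead computes tristimulus changes from the measured absorption spectrum):

```python
import numpy as np

# Hypothetical attenuation coefficients (1/m) for R, G, B bands in coastal
# water; real values would come from the measured absorption spectrum.
beta = np.array([0.45, 0.12, 0.08])

def restore_color(img, depth_m):
    """Invert exponential attenuation I = I0 * exp(-beta * d) channel-wise."""
    return np.clip(img * np.exp(beta * depth_m), 0.0, 1.0)

# A mid-grey patch photographed through 3 m of water loses mostly red
attenuated = 0.5 * np.exp(-beta * 3.0)
restored = restore_color(attenuated, 3.0)
print(np.allclose(restored, 0.5))  # True
```

The red channel needs the largest gain, matching the familiar blue-green cast of underwater photographs.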
Non-negative structural sparse representation for high-resolution hyperspectral imaging
Guiyu Meng, Guangyu Li, Weisheng Dong, et al.
High-resolution hyperspectral images have important applications in many areas, such as anomaly detection, target recognition, and image classification. Due to sensor limitations, it is challenging to obtain hyperspectral images with high spatial resolution. Recently, methods that reconstruct a high-spatial-resolution hyperspectral image from a pair consisting of a low-resolution hyperspectral image and a high-resolution RGB image of the same scene have shown promising results. In these methods, the sparse non-negative matrix factorization (SNNMF) technique was proposed to exploit the spectral correlations among the RGB and spectral images. However, only spectral correlations were exploited, ignoring the abundant spatial structural correlations of hyperspectral images. In this paper, we propose a novel algorithm combining structural sparse representation and non-negative matrix factorization to exploit the spectral-spatial structure correlations and nonlocal similarity of hyperspectral images. Compared with SNNMF, our method makes use of both the spectral and spatial redundancies of hyperspectral images, leading to better reconstruction performance. The proposed optimization problem is efficiently solved with the alternating direction method of multipliers (ADMM). Experiments on a public database show that our approach outperforms other state-of-the-art methods both visually and in quantitative assessment.
High-Speed and High-Resolution Imaging
Implementing two compressed sensing reconstruct algorithms on GPU
Compressed sensing (CS) is a branch of information theory that grew out of developments in mathematics in the 21st century. CS provides a state-of-the-art technique for reconstructing a sparse signal from a very limited number of measurements. CS reconstruction algorithms often require dense computation, so well-known algorithms such as Basis Pursuit (BP) and Matching Pursuit (MP) are impractical to run on ordinary PCs. In this paper, we use the GPU (Graphics Processing Unit) and its large-scale computational ability to address this problem. Based on the recently released NVIDIA CUDA 6.0 Toolkit and the cuBLAS library, we study GPU implementations of Orthogonal Matching Pursuit (OMP) and the Two-Step Iterative Shrinkage/Thresholding algorithm (TwIST). The results show that, compared with the CPU, implementing these algorithms on the GPU yields an obvious speed-up without losing any accuracy.
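For reference, a plain CPU sketch of the OMP algorithm mentioned above in NumPy (the paper's contribution is the CUDA/cuBLAS port, which this does not show; the dictionary and signal below are synthetic):

```python
import numpy as np

def omp(A, y, k):
    """Orthogonal Matching Pursuit: greedily select k atoms of A to explain y."""
    residual = y.copy()
    support = []
    x = np.zeros(A.shape[1])
    for _ in range(k):
        # Atom most correlated with the current residual
        j = int(np.argmax(np.abs(A.T @ residual)))
        if j not in support:
            support.append(j)
        # Least-squares fit on the chosen support
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x[support] = coef
    return x

rng = np.random.default_rng(1)
A = rng.normal(size=(64, 256))
A /= np.linalg.norm(A, axis=0)          # unit-norm atoms
x_true = np.zeros(256)
x_true[[10, 99, 200]] = [1.5, -2.0, 1.0]
y = A @ x_true                          # 64 measurements of a 3-sparse signal

x_hat = omp(A, y, k=3)
print(np.allclose(x_hat, x_true, atol=1e-6))
```

The inner products and least-squares solves are exactly the dense linear-algebra kernels that cuBLAS accelerates on the GPU.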
High-resolution light field camera based on a hybrid imaging system
Feng Dai, Jing Lu, Yike Ma, et al.
Compared to traditional digital cameras, light field (LF) cameras measure not only the intensity of rays but also their direction. As LF cameras trade a good deal of spatial resolution for the extra angular information, they provide lower spatial resolution than traditional digital cameras. In this paper, we present a hybrid imaging system consisting of an LF camera and a high-resolution traditional digital camera, achieving both high spatial resolution and high angular resolution. We build a prototype using a Lytro camera and a DSLR camera to generate an LF image with 10-megapixel spatial resolution and obtain high-resolution digitally refocused images, multi-view images, and all-focused images.
3D Image/Video Systems
A semi-automatic 2D-to-3D video conversion with adaptive key-frame selection
To compensate for the shortage of 3D content, 2D-to-3D video conversion has recently attracted attention from both industrial and academic communities. Semi-automatic 2D-to-3D conversion, which estimates the depth of non-key-frames from key-frames, is desirable owing to its balance of labor cost and 3D effect. The location of the key-frames affects the quality of depth propagation. This paper proposes a semi-automatic 2D-to-3D scheme with adaptive key-frame selection that keeps temporal continuity more reliable and reduces the depth propagation errors caused by occlusion. Potential key-frames are localized in terms of clustered color variation and motion intensity. The key-frame interval is also taken into account to keep the accumulated propagation errors under control and guarantee minimal user interaction. Once the key-frame depth maps are aligned with user interaction, the non-key-frame depth maps are automatically propagated by shifted bilateral filtering. Considering that object depth may change due to object motion or camera zoom, a bi-directional depth propagation scheme is adopted in which a non-key-frame is interpolated from the two adjacent key-frames. Experimental results show that the proposed scheme outperforms existing 2D-to-3D schemes with a fixed key-frame interval.
Joint bit allocation for 3D video coding based on virtual view distortion
In multi-view plus depth (MVD) 3D video coding, texture maps and depth maps are coded jointly. The depth maps provide the scene geometry and are used to render the virtual view at the terminal through a depth-image-based rendering (DIBR) technique. Distortion of the coded texture and depth maps induces distortion in the synthesized virtual view. Besides the coding efficiency of the texture and depth maps, the bit allocation between them also has a great effect on virtual view quality. In this paper, the virtual view distortion is separated into texture-map-induced and depth-map-induced distortion, and models of each are derived. Based on the depth-map-induced virtual view distortion model, the rate-distortion optimization (RDO) of depth map coding is modified and the depth map coding efficiency is increased. We also propose a rate-distortion (R-D) model to solve the joint bit allocation problem. Experimental results demonstrate the high accuracy of the proposed virtual view distortion model. The R-D performance of the proposed algorithm is close to that of the full search algorithm, which gives the best R-D performance, while its coding complexity is lower. Compared with a fixed texture-to-depth bit ratio (5:1), an average gain of 0.3 dB is achieved. The proposed algorithm also has high rate control accuracy, with an average error of less than 1%.
A depth video processing algorithm for high encoding and rendering performance
Mingsong Guo, Fen Chen, Chengkai Sheng, et al.
In a free-viewpoint video system, the color and corresponding depth videos are used to synthesize virtual views by the depth-image-based rendering (DIBR) technique. Hence, high-quality depth videos are a prerequisite for high-quality virtual views. However, depth variation, caused by scene variation and limited depth-capturing technology, may increase the encoding bitrate of depth videos and decrease the quality of virtual views. To tackle these problems, a depth preprocessing method that smooths the texture and abrupt changes of depth videos is proposed in this paper to increase the accuracy of the depth videos. First, a bilateral filter is adopted to smooth the whole depth video while protecting depth edges. Second, abrupt variations are detected with a threshold calculated from the camera parameters of each video sequence. Holes occur in the virtual views where the depth values of the left view change sharply from low to high in the horizontal direction, or where those of the right view change sharply from high to low. Therefore, for the left view, depth-value differences on the left side are gradually reduced wherever they exceed the threshold, and the right side of the right view is processed likewise. Experimental results show that the proposed method reduces the encoding bitrate by 25% on average, while the quality of the synthesized virtual views improves by 0.39 dB on average compared with using the original depth videos. A subjective quality improvement is also achieved.
A novel virtual viewpoint merging method based on machine learning
Di Zheng, Zongju Peng, Hui Wang, et al.
In a multi-view video system, multi-view video plus depth is the main data format for 3D scene representation. Continuous virtual views can be generated using the depth-image-based rendering (DIBR) technique, whose process includes geometric mapping, hole filling, and merging. Fixed weights, inversely proportional to the distance between the virtual and real cameras, are conventionally used to merge the virtual views; however, these weights may not be optimal in terms of virtual view quality. In this paper, a novel virtual view merging algorithm is proposed in which machine learning is used to establish an optimal weight model that takes color, depth, color gradient, and sequence parameters into consideration. First, we render the same virtual view from the left and right views and select training samples using a threshold. Then, the features of the samples are extracted and the optimal merging weights are calculated as training labels. Finally, a support vector classifier (SVC) is trained and used to guide virtual view rendering. Experimental results show that the proposed method improves the quality of virtual views for most sequences, and it is especially effective when the distance between the virtual and real cameras is large. Compared to the original view synthesis method, the proposed method obtains a gain of more than 0.1 dB for some sequences.
A foreground object features-based stereoscopic image visual comfort assessment model
Xin Jin, G. Jiang, H. Ying, et al.
Since stereoscopic images provide observers with a realistic but sometimes uncomfortable viewing experience, it is necessary to investigate the determinants of visual discomfort. Considering that the foreground object draws most of the attention when humans observe stereoscopic images, this paper proposes a new foreground-object-based visual comfort assessment (VCA) metric. First, a suitable segmentation method is applied to the disparity map, and the foreground object is identified as the one with the largest average disparity. Second, three visual features, namely the average disparity, average width, and spatial complexity of the foreground object, are computed from the perspective of visual attention. Because an object's width and complexity do not influence perceived visual comfort as consistently as disparity does, we divide the images into four categories on the basis of disparity and width, and apply four different models to predict visual comfort more precisely. Experimental results show that the proposed VCA metric outperforms existing metrics and achieves high consistency between objective and subjective visual comfort scores: the Pearson linear correlation coefficient (PLCC) and Spearman rank-order correlation coefficient (SROCC) exceed 0.84 and 0.82, respectively.
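The two correlation measures reported above can be computed as follows (illustrative numbers only, not the paper's data):

```python
import numpy as np

def plcc(x, y):
    """Pearson linear correlation coefficient."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    return float(np.corrcoef(x, y)[0, 1])

def srocc(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    rx = np.argsort(np.argsort(x))   # ranks of x (no tie handling)
    ry = np.argsort(np.argsort(y))
    return plcc(rx, ry)

# Objective comfort scores vs. subjective MOS (made-up numbers)
obj = [0.9, 0.7, 0.8, 0.3, 0.5, 0.2]
mos = [4.5, 3.9, 4.1, 2.0, 3.0, 1.8]
print(round(plcc(obj, mos), 3), round(srocc(obj, mos), 3))
```

SROCC rewards any monotonic relationship, while PLCC rewards a linear one, which is why quality-assessment papers usually report both.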
Machine Vision Methods, Architectures, and Applications
Unsupervised abnormal crowd activity detection using interaction power model
Abnormal event detection in crowded scenes is one of the most challenging tasks in video surveillance for public security. Unlike previous learning-based work, we propose an unsupervised interaction power model with an adaptive threshold strategy to detect abnormal group activity by analyzing the steady state of individuals' behavior in a crowded scene. First, the optical flow field of potential pedestrians is calculated only within the extracted foreground to reduce computational cost. Second, each pedestrian is divided into patches of the same size, and the interaction power of the pedestrians is represented by motion particles that describe the motion status at the center pixels of the patches; the motion status of each patch is computed from the optical flow of the pixels within it. For each motion particle, its interaction power, defined as the steady state of its current behavior, is computed over all neighboring motion particles. Finally, the steady state of the dense crowd is represented as a collection of the motion particles' interaction power, and an adaptive threshold strategy detects abnormal events by examining the frame power field, a fixed-size random sampling of the motion particles' interaction power. Experimental results on the standard UMN dataset and online videos show that our method detects crowd anomalies with higher accuracy than other recently published competitive methods.
A crude to fine method to detect the salient region
Xiaodong Hu, Hong Zhang, Hao Chen, et al.
Salient region detection aims at establishing the most important and informative regions of an image. In this work, we propose a novel method that tackles this task as a process from superpixel-level locating to pixel-level refining. First, we over-segment the image into superpixels and compute an affinity matrix that estimates the similarity between each pair of superpixels according to both color contrast and spatial distribution. The matrix is then used to aggregate superpixels into several clusters via affinity propagation. To measure the saliency of each cluster, three parameters are taken into account: color contrast, cluster compactness, and proximity to the focus. We designate the one to three most salient clusters as the crude salient region. In the refining step, we regard each selected superpixel as an influential center, so the saliency value of a pixel is jointly determined by all the selected superpixels. In practice, several Gaussian curves are constructed from the selected superpixels, and the pixel-wise saliency value is decided by the color distinction and spatial distance between the pixel and the curves' centers. We evaluate our algorithm on a publicly available dataset with human annotations, and the experimental results show that our approach has competitive performance.
Tablet-based two-dimensional measurement for estimating the embryo area of brown rice
Yuttana Intaravanne, Sarun Sumriddetchkajorn, Kosom Chaitavon, et al.
The embryo, or germ, of a rice seed grows into the shoot and root of a seedling. In the early stage, the germinating embryo receives food directly from the endosperm. The health of the seedling can be physically predicted by measuring the areas of the embryo and endosperm. In this work, we show for the first time how the embryo and endosperm areas of a brown rice seed can be spatially measured. Our key design is based on a tablet equipped with our lens module for capturing the rice seed image under white-light illumination. Our Windows-based program analyzes and separates the image of the whole brown rice seed into embryo and endosperm parts within 2 seconds per seed. The tablet-based system measures just 30×30×6 cm³, weighs 1 kilogram, and can easily be carried for use in the field.
A vein display system based on three-dimensional reconstruction
Venipuncture is the most common of all invasive medical procedures. A vein display system can make vein access easier by capturing vein information and projecting a visible vein image onto the skin, correctly aligned with the subject's veins. Existing systems achieve correct alignment through a coaxial structure, which inevitably causes complex optical and mechanical design and large physical dimensions. In this paper, we design a stereovision-based vein display system, which consists of a pair of cameras, a DLP projector, and a near-infrared light source. We recover the three-dimensional venous structure from the image pair acquired by the two near-infrared cameras. The vein image from the viewpoint of the projector is then generated from the three-dimensional venous structure and projected exactly onto the skin by the DLP projector. Since the stereo cameras obtain the depth information of the vessels, the system can ensure the alignment of the projected veins with the real veins without a coaxial structure. The experimental results show that this is a feasible solution for a portable and low-cost vein display device.
Detection for manipulation history of seam insertion and contrast enhancement
Jianwei Li, Yao Zhao, Rongrong Ni
With the development of digital image manipulation techniques, digital image forensics is becoming more and more necessary. However, determining the processing history of multiple operations remains a challenging problem. In this paper, we improve the traditional seam insertion algorithm and propose a corresponding detection method. We then propose an algorithm for detecting the processing history of seam insertion and contrast enhancement, which has wide application to practical image forgery. Based on comprehensive analysis, we discover the inherent relationship between seam insertion and contrast enhancement: different orders of processing leave different traces on images. Using the proposed algorithm, both contrast enhancement followed by seam insertion and seam insertion followed by contrast enhancement can be detected correctly. Extensive experiments confirm the accuracy of the method.
Computational Imaging
Light-field-based phase imaging
Jingdan Liu, Tingfa Xu, Weirui Yue, et al.
Phase contains important information about the diffraction or scattering properties of an object, so phase imaging is vital to many applications, including biomedicine and metrology, to name a few. However, due to the limited bandwidth of image sensors, it is not possible to detect the phase of an optical field directly. Many methods, including the Transport of Intensity Equation (TIE), have been demonstrated for quantitative, non-interferometric phase imaging. The TIE offers an experimentally simple technique for computing phase quantitatively from two or more defocused images, usually obtained by shifting the camera along the optical axis in small intervals. Note that light field imaging can produce an image stack focused at different depths by digitally refocusing the captured light field of a scene. In this paper, we propose to combine light field microscopy and the TIE method for phase imaging, taking advantage of the digital refocusing capability of light field microscopy. We demonstrate the proposed technique with simulation results. Compared with the traditional camera-shifting technique, light field imaging allows capturing the defocused images without any mechanical instability and therefore offers an advantage in practical applications.
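For reference, the standard form of the Transport of Intensity Equation underpinning the abstract above relates the axial intensity derivative to the transverse phase gradient (I is intensity, φ phase, k = 2π/λ the wavenumber, ∇⊥ the transverse gradient):

```latex
-k\,\frac{\partial I(x,y;z)}{\partial z}
  \;=\; \nabla_{\!\perp}\cdot\bigl(I(x,y;z)\,\nabla_{\!\perp}\varphi(x,y;z)\bigr)
```

The axial derivative on the left is what the defocused image pair (here supplied by digital refocusing) approximates by finite differences.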
Light field creating and imaging with different order intensity derivatives
Yu Wang, Huan Jiang
Microscopic image restoration and reconstruction is a challenging topic in image processing and computer vision, with wide applications in life science, biology, and medicine. A microscopic light field creation and three-dimensional (3D) reconstruction method is proposed for transparent or partially transparent microscopic samples, based on the Taylor expansion theorem and polynomial fitting. First, the image stack of the specimen is divided into several groups, overlapping or non-overlapping, along the optical axis, and the first image of every group is regarded as the reference image. Then, different-order intensity derivatives are calculated using all the images of each group and polynomial fitting, under the assumption that the structure of the specimen varies smoothly and nearly linearly over a small range along the optical axis. Subsequently, new images at any position a distance Δz from the reference image along the optical axis can be generated by means of the Taylor expansion theorem and the calculated intensity derivatives. Finally, the microscopic specimen can be reconstructed in 3D using deconvolution and all the images, both observed and generated. The experimental results show the effectiveness and feasibility of our method.
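The generation step described above is a truncated Taylor expansion of the intensity along the optical axis, with the derivatives estimated by polynomial fitting over each group (N here denotes the highest order fitted; notation assumed):

```latex
I(x,y;z_0+\Delta z) \;\approx\; \sum_{n=0}^{N} \frac{(\Delta z)^n}{n!}\,
  \left.\frac{\partial^n I(x,y;z)}{\partial z^n}\right|_{z=z_0}
```

where z₀ is the axial position of the reference image of the group.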
Image/Video Analysis, Processing, and Retrieval
A compression method of EIs in integral imaging
Xuesong Li, Shigang Wang, Yuanzhi Lu, et al.
Integral imaging is a technique capable of reproducing continuous parallax, full color, a continuous range of viewpoints, and real perspectives of a scene. Since the amount of information contained in an elemental image array (EIA) is far greater than in an ordinary image, storage and transmission are difficult. This paper proposes a method to compress the elemental images (EIs) for the case in which the depth differences among most objects in the scene are small and the distance from the camera to the objects is short. Since the resolution of each elemental image (EI) is small, the matching displacements of all pixels in one EI are nearly the same. For instance, suppose an integral image is composed of 12×1 EIs, each with a resolution of 20×20, and the matching displacement between adjacent EIs is 5 pixels. Picking out one EI from every 4 EIs at the same interval yields 3 EIs, which are spliced together to form one image. Processing the remaining EIs in the same way yields a total of 4 spliced images, which are then compressed with a video compression method.
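A toy sketch of the splicing step for the 12×1 example in the abstract above (synthetic pixel data; the subsequent video compression is omitted):

```python
import numpy as np

# Hypothetical integral image: a 1 x 12 array of 20 x 20 elemental images
# (EIs), stored side by side as shape (20, 12 * 20).
rng = np.random.default_rng(0)
integral = rng.integers(0, 256, size=(20, 12 * 20), dtype=np.uint8)

# Split into the 12 individual EIs
eis = [integral[:, i * 20:(i + 1) * 20] for i in range(12)]

# Pick one EI from every 4 (each of the 4 offsets gives 3 EIs) and splice
# each subset horizontally, producing 4 spliced images as in the abstract.
spliced = [np.hstack(eis[offset::4]) for offset in range(4)]
print(len(spliced), spliced[0].shape)  # 4 (20, 60)
```

The four spliced images form a short sequence with high inter-frame similarity (because of the near-constant matching displacement), which is what makes video compression effective on them.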
Full-reference quality assessment of stereoscopic images by learning sparse monocular and binocular features
Kemeng Li, Feng Shao, Gangyi Jiang, et al.
Perceptual stereoscopic image quality assessment (SIQA) aims to use computational models to measure image quality consistently with human visual perception. In this research, we simulate monocular and binocular visual perception and propose a monocular-binocular feature fidelity (MBFF) induced index for SIQA. More specifically, in the training stage, we learn monocular and binocular dictionaries from the training database so that the latent response properties can be represented as a set of basis vectors. In the quality estimation stage, we compute monocular feature fidelity (MFF) and binocular feature fidelity (BFF) indexes based on the estimated sparse coefficient vectors, and compute a global energy response similarity (GERS) index by considering energy changes. The final quality score is obtained by combining them. Experimental results on four public 3D image quality assessment databases demonstrate that, compared with the most closely related existing methods, the proposed algorithm achieves high consistency with subjective assessment.
Optical image encryption based on a modified radial shearing interferometer
We present an optical image encryption method based on a modified radial shearing interferometer. In our encryption process, a plaintext image is first encoded into a phase-only mask (POM) and then modulated by a random phase mask (RPM); the result is taken as the input of the radial shearing interferometer and divided into two coherent beams, one of which is further modulated by a random amplitude mask (RAM). Finally, the two coherent beams interfere with each other, yielding an interferogram, i.e., the ciphertext. The ciphertext can be used to retrieve the plaintext image with the help of a recursive algorithm and all the correct keys. The encryption procedure can be carried out digitally or optically, while the decryption process can be accomplished analytically. Numerical simulation is provided to demonstrate the validity of this method.
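The forward (encryption) path can be sketched numerically on a 1-D toy field. This is a hedged illustration only: the radial shear is idealized as a cyclic one-sample shift, and the mask names and discretization are assumptions, not the authors' optical model.

```python
import cmath
import math
import random

def encrypt(plain, rpm_phase, ram_amp, shear=1):
    """POM encoding -> RPM modulation -> split into two beams (one
    RAM-modulated, one idealized as a cyclically sheared copy) ->
    record the interference intensity as ciphertext."""
    # Plaintext -> phase-only mask (POM), then random phase mask (RPM).
    field = [cmath.exp(1j * math.pi * p) * cmath.exp(1j * phi)
             for p, phi in zip(plain, rpm_phase)]
    n = len(field)
    cipher = []
    for k in range(n):
        u1 = field[k]                              # first beam
        u2 = ram_amp[k] * field[(k + shear) % n]   # sheared, RAM-modulated beam
        cipher.append(abs(u1 + u2) ** 2)           # interferogram intensity
    return cipher

random.seed(0)
plain = [0.0, 0.25, 0.5, 0.75, 1.0]                # normalized plaintext samples
rpm = [random.uniform(0, 2 * math.pi) for _ in plain]
ram = [random.uniform(0, 1) for _ in plain]
ciphertext = encrypt(plain, rpm, ram)
```

Since each field sample has unit modulus and the RAM amplitude is at most 1, every recorded intensity lies between 0 and 4; recovering `plain` from `ciphertext` requires the recursive decryption with the RPM/RAM keys described in the abstract.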
Time-of-Flight Imaging
icon_mobile_dropdown
Computational imaging of light in flight
Many computer vision tasks are hindered by image formation itself, a process that is governed by the so-called plenoptic integral. By averaging light falling into the lens over space, angle, wavelength and time, a great deal of information is irreversibly lost. The emerging idea of transient imaging operates on a time resolution fast enough to resolve non-stationary light distributions in real-world scenes. It enables the discrimination of light contributions by the optical path length from light source to receiver, a dimension unavailable in mainstream imaging to date. Until recently, such measurements used to require high-end optical equipment and could only be acquired under extremely restricted lab conditions. To address this challenge, we introduced a family of computational imaging techniques operating on standard time-of-flight image sensors, for the first time allowing the user to “film” light in flight in an affordable, practical and portable way. Just as impulse responses have proven a valuable tool in almost every branch of science and engineering, we expect light-in-flight analysis to impact a wide variety of applications in computer vision and beyond.
Partial scene reconstruction using Time-of-Flight imaging
Yuchen Zhang, Hongkai Xiong
This paper is devoted to generating the coordinates of partial 3D points in scene reconstruction via time-of-flight (ToF) images. Assuming the camera does not move, only the coordinates of the points in the images are accessible. The effective exposure time is two trillionths of a second, and the synthetic visualization corresponds to filming light at half a trillion frames per second. In global light transport, direct components signify that the light is emitted from a light point and reflected from a scene point only once. Since the camera and the light source can be regarded as the two foci of an ellipsoid with a constant focal distance at a given time, we use two constraints: (1) the measured path length is the sum of the distances the light travels between the two foci and the scene point; and (2) the focal point of the camera, the scene point and the corresponding image point are collinear. It is worth mentioning that calibration is necessary to obtain the coordinates of the light point. The calibration can be done in two steps: (1) choose a scene that contains some pairs of points at the same depth whose positions are known; and (2) substitute these positions into the two constraints and solve for the coordinates of the light point. After calculating the coordinates of the scene points, MeshLab is used to build the partial scene model. The proposed approach can estimate the exact distance between two scene points.
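Combining the two constraints admits a closed form: writing the camera at c, the light source at s, and the unit viewing ray v through the image point, the scene point is p = c + t·v with |p − s| + t equal to the measured path length, which is linear in t after squaring. A sketch under these assumptions (the variable names are illustrative, not the paper's notation):

```python
import math

def scene_point(camera, light, ray_dir, path_len):
    """Solve |camera + t*ray_dir - light| = path_len - t for t, then
    return the scene point camera + t*ray_dir. ray_dir must be a unit
    vector; camera and light are the two ellipsoid foci."""
    w = [c - s for c, s in zip(camera, light)]
    w2 = sum(x * x for x in w)
    wv = sum(x * v for x, v in zip(w, ray_dir))
    # Squaring both sides cancels the t^2 terms, leaving a linear equation.
    t = (path_len ** 2 - w2) / (2 * (path_len + wv))
    return [c + t * v for c, v in zip(camera, ray_dir)]

# Check against a known point (0, 0, 3): camera at the origin, light at
# (1, 0, 0), so the true light->point->camera path is sqrt(10) + 3.
p = scene_point([0, 0, 0], [1, 0, 0], [0, 0, 1], 3 + math.sqrt(10))
# p ≈ [0, 0, 3]
```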
Three-dimensional patterning in transparent materials with spatiotemporally-focused femtosecond laser pulses
Fei He, Zhaohui Wang, Bin Zeng, et al.
Depending on the specific configuration, three-dimensional (3D) patterning encompasses both 3D bioimaging and laser micromachining. Recent advances in bioimaging have witnessed strong interest in the exploration of novel microscopy methods capable of dynamic imaging of living organisms with high resolution and a large field of view (FOV). In most applications, however, bioimaging is limited by the tradeoff among speed, resolution, and FOV in common techniques, e.g., confocal laser scanning microscopy and two-photon microscopy. A recently proposed temporal focusing (TF) technique, based on spatiotemporal shaping of femtosecond laser pulses, enables depth-resolved bioimaging under wide-field illumination. This lecture first provides a glimpse into the state-of-the-art progress of temporal focusing for bioimaging applications. We then reveal an unusual point spread function (PSF) of the temporal focusing system, both experimentally and theoretically. It can be expected that this newly emerged technique will enable new advances not only in 3D nonlinear bioimaging but also in femtosecond laser micromachining.
Depth map super-resolution and enhancement for time-of-flight cameras
Yangguang Li, Lei Zhang, Yongbing Zhang
This paper presents a novel method for solving the super-resolution (SR) and enhancement problem for depth maps captured by time-of-flight (ToF) cameras. Using the registered color images combined with the edge information of the original depth image as one prior, and employing a joint sparse representation model to obtain common representation coefficients as another, we obtain the solution: high-resolution (HR) depth maps with low noise and accurate values. The results show that our approach has many advantages over previous state-of-the-art methods.
Poster Session
icon_mobile_dropdown
Image super-resolution via adaptive filtering and regularization
Jingbo Ren, Hao Wu, Weisheng Dong, et al.
Image super-resolution (SR) is widely used in both civil and military fields, especially for low-resolution remote sensing images limited by the sensor. Single-image SR refers to the task of restoring a high-resolution (HR) image from a low-resolution image together with some prior knowledge serving as a regularization term. Classic methods regularize the image by total variation (TV) and/or wavelet or other transforms, which introduce artifacts. To overcome these shortcomings, a new framework for single-image SR is proposed that applies an adaptive filter before regularization. The key idea of our model is that the adaptive filter first removes the spatial correlation among pixels, and then only the high-frequency (HF) part, which is sparser in the TV and transform domains, is used as the regularization term. Concretely, by transforming the original model, the SR problem can be solved via two alternating iteration sub-problems. Before each iteration, the adaptive filter is updated to estimate the initial HF. A high-quality HF part and an HR image are obtained by solving the first and second sub-problems, respectively. In the experiments, a set of remote sensing images captured by Landsat satellites is used to demonstrate the effectiveness of the proposed framework. Experimental results show the outstanding performance of the proposed method in quantitative evaluation and visual fidelity compared with state-of-the-art methods.
Light field reconstruction robust to signal dependent noise
Capturing four-dimensional light field data sequentially using a coded aperture camera is an effective approach but suffers from a low signal-to-noise ratio. Although multiplexing can help raise the acquisition quality, noise is still a major issue, especially for fast acquisition. To address this problem, this paper proposes a noise-robust light field reconstruction method. First, a signal-dependent noise model is studied and incorporated into the light field reconstruction framework. Then, we derive an optimization algorithm for the final reconstruction. We build a prototype by hacking an off-the-shelf camera for data capture and prove the concept. The effectiveness of this method is validated with experiments on real captured data.
An improved hybrid opto-digital joint transform correlator reducing the influence of defocus on image motion measurement
Hui Zhao, Hongwei Yi, Jingxuan Wei, et al.
The joint transform correlator (JTC) is a highly efficient way to measure image motion, and a hybrid opto-digital JTC (HODJTC) has been proposed by us in [CHIN. OPT. LETT., Vol. 8, No. 8]. Unlike the traditional JTC, only one optical Fourier transform is needed, and the optically generated joint power spectrum (JPS) is used to compute the image motion digitally. Although high measurement precision can be obtained with the HODJTC, defocus degrades the final result. In this paper, the influence of defocus is analyzed and an improved HODJTC with reduced sensitivity to defocus is proposed. By introducing randomly generated defocus, a series of cross-correlation peak images is obtained, and a subsequent spatial averaging procedure is applied to these images to generate the final cross-correlation peak image, which is used to compute the defocus-invariant motion value.
Accurate point spread function (PSF) estimation for coded aperture cameras
Jingyu Yang, Bin Jiang, Jinlong Ma, et al.
Accurate point spread function (PSF) estimation for coded aperture cameras is key to deblurring defocused images. There are mainly two kinds of approaches to PSF estimation: blind-deconvolution-based methods, and measurement-based methods with point light sources. Neither kind provides accurate and convenient PSFs, owing to the limits of blind deconvolution or the imperfection of point light sources. Inaccurate PSF estimation introduces pseudo-ripple and ringing artifacts that degrade image deconvolution. In addition, PSF estimation is inconvenient in many situations. This paper proposes a novel PSF estimation method for coded aperture cameras. It is observed and verified that the spatially varying point spread functions are well modeled by the convolution of the aperture pattern with Gaussian blurring of appropriate scale and bandwidth. We use the coded aperture camera to capture a point light source to get a rough estimate of the PSF. Then, PSF estimation is formulated as the optimization of the scale and bandwidth of the Gaussian blurring kernel so that the blurred coded pattern fits the observed PSF. We also investigate PSF estimation at arbitrary distances from a few observed PSF kernels, which allows us to fully characterize the response of coded imaging systems with limited measurements. Experimental results show that our method accurately estimates PSF kernels, which significantly improves deblurring performance.
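The fitting step — blur the coded aperture pattern with a Gaussian of unknown bandwidth and match it against the observed PSF — can be sketched in 1-D. A grid search stands in for the paper's optimization, and all names, the circular convolution, and the toy aperture are illustrative assumptions.

```python
import math

def gauss_kernel(sigma, radius=5):
    """Normalized 1-D Gaussian kernel."""
    k = [math.exp(-0.5 * (x / sigma) ** 2) for x in range(-radius, radius + 1)]
    s = sum(k)
    return [v / s for v in k]

def convolve(signal, kernel):
    """Circular 1-D convolution (keeps the sketch self-contained)."""
    r = len(kernel) // 2
    n = len(signal)
    return [sum(signal[(i + j - r) % n] * kernel[j] for j in range(len(kernel)))
            for i in range(n)]

def fit_bandwidth(aperture, observed_psf, sigmas):
    """Grid-search the Gaussian bandwidth whose blur of the aperture
    pattern best matches the observed PSF (least squares)."""
    def err(sig):
        model = convolve(aperture, gauss_kernel(sig))
        return sum((m - o) ** 2 for m, o in zip(model, observed_psf))
    return min(sigmas, key=err)

# Toy 1-D coded aperture and a synthetic "observed" PSF blurred at sigma=1.5.
aperture = [0] * 10 + [1, 0, 1, 1, 0, 1] + [0] * 10
observed = convolve(aperture, gauss_kernel(1.5))
best = fit_bandwidth(aperture, observed, [0.5, 1.0, 1.5, 2.0])
# best == 1.5
```

In the paper's 2-D setting the same idea applies per depth, with both scale and bandwidth as the fitted parameters.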
Non-intrusive gesture recognition system combining with face detection based on Hidden Markov Model
Jing Jin, Yuanqing Wang, Liujing Xu, et al.
A non-intrusive gesture recognition human-machine interaction system is proposed in this paper. In order to solve the hand positioning problem, a difficulty in current algorithms, face detection is used as a pre-processing step to narrow the search area and find the user's hand quickly and accurately. A Hidden Markov Model (HMM) is used for gesture recognition: a certain number of basic gesture units are trained as HMM models. At the same time, an improved 8-direction feature vector is proposed and used to quantify characteristics in order to improve detection accuracy. The proposed system can be applied in interactive devices, such as household interactive televisions, without special training for users.
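The 8-direction quantization underlying such feature vectors can be illustrated as follows: each frame-to-frame hand displacement is binned into one of eight directions, giving a discrete observation sequence for the HMM. The binning convention and names here are assumptions, not the paper's exact feature.

```python
import math

def direction_bin(dx, dy, bins=8):
    """Quantize a motion vector into one of 8 directions
    (0 = +x, counting counter-clockwise), chain-code style."""
    angle = math.atan2(dy, dx) % (2 * math.pi)
    return int((angle + math.pi / bins) / (2 * math.pi / bins)) % bins

def gesture_feature(track):
    """Turn a hand trajectory (list of (x, y)) into a direction
    sequence usable as a discrete HMM observation sequence."""
    return [direction_bin(x1 - x0, y1 - y0)
            for (x0, y0), (x1, y1) in zip(track, track[1:])]

# A square stroke: right, up, left, down.
seq = gesture_feature([(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)])
# seq == [0, 2, 4, 6]
```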
Oil tank detection based on salient region and geometric features
Yuan Yao, Zhiguo Jiang, Haopeng Zhang
Automatic target detection in remote sensing images remains a challenging problem. In this paper, we present a new oil tank detection method based on salient region and geometric features. Salient region detection and Otsu threshold are used for image segmentation to get candidate regions effectively, and four geometric features are employed for reducing the false alarms. Experimental results show that our method can provide a promising way to detect oil tanks accurately, and it is also robust in complicated conditions such as occlusion, shadow or deformation.
Experimental calibration of x-ray camera performance: spatial resolution, flat field response, and radiation sensitivity
Hongwei Xie, Jinchuan Chen, Linbo Li, et al.
Major parameters of an X-ray camera include spatial resolution, flat-field response and dynamic range. These parameters were calibrated on a pulsed X-ray source with about 0.3 MeV energy. A fluorophotometric method was used to measure the spatial resolutions of the transmitted and reflected light; results indicated the two were essentially the same. The spatial resolution of the camera was measured with the edge method. At 10% intensity, the modulation transfer function (MTF) corresponded to a resolution of about 5 lp/mm, while the size of the point spread function (PSF) was about 0.8 mm. Owing to the system design with both a short working distance and a large field of view, the flat-field non-uniformity was about 15%. In addition, because of the relatively large gain of the scintillator and MCP image intensifier and the limited X-ray detection efficiency of the scintillator, the image intensity of the flat-field response showed a large standard deviation of about 1375. Due to crosstalk throughout the system, the maximum signal-to-noise ratio (SNR) of the X-ray camera was about 10:1. These results provide important technical specifications both for applications of the X-ray camera and for the processing of related image data.
An example image super-resolution algorithm based on modified k-means with hybrid particle swarm optimization
Kunpeng Feng, Tong Zhou, Jiwen Cui, et al.
This paper presents a novel example-based super-resolution (SR) algorithm with improved k-means clustering. In this algorithm, genetic k-means (GKM) with hybrid particle swarm optimization (HPSO) is employed to improve the reconstruction of high-resolution (HR) images, and a pre-processing step of classification in the frequency domain is used to accelerate the procedure. Self-redundancy across different scales of a natural image is also utilized to build an attached training set that expands the example-based information. Meanwhile, a reconstruction algorithm based on hybrid supervised locally linear embedding (HSLLE) is proposed, which uses the training sets, high-resolution images and self-redundancy across different scales of a natural image. Experimental results show that patches are classified rapidly in the training-set processing session and that the reconstruction runtime is at most half that of traditional algorithms in the super-resolution session. Moreover, the clustering and attached training set lead to a better recovery of the low-resolution (LR) image.
Multimodal visual dictionary learning via heterogeneous latent semantic sparse coding
Chenxiao Li, Guiguang Ding, Jile Zhou, et al.
Visual dictionary learning, as a crucial task in image representation, has gained increasing attention. In particular, sparse coding is widely used due to its intrinsic advantages. In this paper, we propose a novel heterogeneous latent semantic sparse coding model. The central idea is to bridge heterogeneous modalities by capturing their common sparse latent semantic structure, so that the learned visual dictionary is able to describe both the visual and textual properties of the training data. Experiments on both image categorization and retrieval tasks demonstrate that our model shows superior performance over several recent methods such as k-means and sparse coding.
Magnifying arbitrarily selected areas of fractal Chinese characters
Wei Zhang, Ning Xu, Zhengbing Zhang
Iterated function systems (IFS) have been used to generate fractal graphics and fractal Chinese characters. A fractal Chinese character magnification method is proposed in this paper to zoom in on arbitrarily selected areas within a fractal Chinese character. For any selected area, a geometric transform is applied to make the selected area occupy the full display area. The mapping coefficients of the IFS for the Chinese character are modified so that the fractal pattern of the Chinese character in the selected area exactly fills the full display area. The experimental results demonstrate that details are shown clearly at magnification factors of more than 10000.
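The coefficient modification amounts to conjugating each IFS map with the window-to-display transform: if T maps the selected square onto the display, the maps T∘wᵢ∘T⁻¹ have attractor T(A), so rendering them shows the selected area at full size. A sketch with the Sierpinski triangle standing in for a fractal Chinese character (the specific IFS, names, and chaos-game rendering are illustrative assumptions):

```python
import random

def chaos_game(maps, n=20000, seed=1):
    """Render an IFS attractor by the chaos game: repeatedly apply a
    randomly chosen affine map (a, b, c, d, e, f)."""
    random.seed(seed)
    x, y = 0.0, 0.0
    pts = []
    for i in range(n):
        a, b, c, d, e, f = random.choice(maps)
        x, y = a * x + b * y + e, c * x + d * y + f
        if i > 20:                     # discard the transient
            pts.append((x, y))
    return pts

def zoom_maps(maps, x0, y0, size):
    """Conjugate each map with T(p) = (p - (x0, y0)) / size, so the
    square [x0, x0+size] x [y0, y0+size] fills the unit display.
    Only the translation part of each map changes."""
    out = []
    for a, b, c, d, e, f in maps:
        e2 = (a * x0 + b * y0 + e - x0) / size
        f2 = (c * x0 + d * y0 + f - y0) / size
        out.append((a, b, c, d, e2, f2))
    return out

sierpinski = [(0.5, 0, 0, 0.5, 0, 0),
              (0.5, 0, 0, 0.5, 0.5, 0),
              (0.5, 0, 0, 0.5, 0.25, 0.5)]
zoomed = zoom_maps(sierpinski, 0.0, 0.0, 0.25)   # magnify lower-left corner 4x
```

Rendering `zoomed` with the same chaos game and clipping to the unit square shows the magnified detail; because the attractor is self-similar, arbitrarily large magnification factors remain sharp.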
Adaptive block size selection for inter-layer backward view synthesis prediction
Li Chen, Miska M. Hannuksela, Houqiang Li
Traditional forward view synthesis prediction enables the efficient use of depth to provide synthesized frames for texture reference in non-base layers. However, the high complexity resulting from edge detection, hole filling, upsampling and downsampling in the forward warping technique compromises this benefit. Hence, backward view synthesis prediction (BVSP) has been proposed to remove these drawbacks while maintaining the performance. However, the fixed depth block size used in backward view synthesis prediction limits both the performance gain and the number of motion compensation operations, which is a key concern in complexity analysis. In this paper, a block-based BVSP scheme for inter-layer prediction with only high-level syntax changes is implemented, and an adaptive depth block size selection method is proposed. The experimental results show that an average bitrate reduction of 3.5% is achieved; after enabling adaptive depth block size selection, this performance gain is largely maintained while the number of motion compensation operations is reduced to a designated level.
Phase recovery based on quadratic programming
Quan Bing Zhang, Xiao Juan Ge, Ya Dong Cheng, et al.
Most of the information in an optical wavefront is encoded in the phase, which contains more details of the object. Conventional optical measuring apparatus can record the intensity of light relatively easily but cannot measure the phase of light directly. It is therefore important to recover the phase from intensity measurements of the object. In recent years, methods based on quadratic programming, such as PhaseLift and PhaseCut, have been able to recover the phase of a general signal exactly for overdetermined systems. To retrieve the phase of a sparse signal, the Compressive Phase Retrieval (CPR) algorithm combines the l1-minimization of Compressive Sensing (CS) with the low-rank matrix completion problem in PhaseLift, but the result is unsatisfactory. This paper focuses on recovering the phase of sparse signals and proposes a new method called Compressive Phase Cut Retrieval (CPCR), which combines the CPR algorithm with the PhaseCut algorithm. To ensure the sparsity of the recovered signal, we first use the CPR method to solve a semidefinite programming problem. We then apply a linear transformation to the recovered signal and set the phase of the result as the initial value of the PhaseCut problem. We use TFOCS (a library of MATLAB files) to implement the proposed CPCR algorithm in order to improve on the results of the CPR algorithm. Experimental results show that the proposed method improves the accuracy of the CPR algorithm and overcomes the shortcoming of the PhaseCut method, namely that it cannot recover sparse signals effectively.
Laser 3D imaging technology based on digital micromirror device and the performance analysis
Xiaochun Han, Zheng-fang Deng, Ya-lan Xue, et al.
Current research on scannerless three-dimensional imaging LiDAR mainly focuses on phase-based scannerless imaging LiDAR, multiple-slit streak tube imaging LiDAR and flash LiDAR. These three kinds of LiDAR have disadvantages such as short detection range, the complicated structure of the vacuum unit, and the lack of corresponding grayscale images. In this paper we develop a novel 3D imaging LiDAR that works in pushbroom mode. It converts the time of flight (TOF) into space with a digital micromirror device (DMD). When the pulse arrives at the DMD, the micromirrors are switching from one state to another. Because the TOFs of pulses reflected from different targets differ, a streak appears on the focal plane array (FPA) of the sensor whose relative position encodes the range; this relative position can be used to reconstruct the range profile of the target. Compared with other three-dimensional imaging methods, this new method has the advantages of a high imaging rate, large field of view, simple structure and small size. First, this article introduces the theory of digital micromirror laser 3D imaging LiDAR, and then it analyzes the technical indicators of the core components. Finally, it presents the process of computing the detection range, theoretically demonstrating the feasibility of this technology.
Kernel based discriminant image filter learning: application in face recognition
The extraction of discriminative and robust features is a crucial issue in pattern recognition and classification. In this paper, we propose a kernel-based discriminant image filter learning method (KDIFL) for local feature enhancement and demonstrate its superiority in the application of face recognition. Instead of designing the image filter in a handcrafted or analytical way, we propose to learn the image filter so that, after filtering, the within-class difference is attenuated and the between-class difference is amplified, thus facilitating the subsequent recognition. During filter learning, the kernel trick is employed to cope with the nonlinear feature space caused by expression, pose, illumination, and so on. We show that the proposed filter generalizes well and can be concatenated with classic feature descriptors (e.g. LBP) to further increase the discriminability of the extracted features. Our extensive experiments on the Yale, ORL and AR face databases validate the effectiveness and robustness of the proposed method.
An effective representation for action recognition with human skeleton joints
Xingyang Cai, Wengang Zhou, Houqiang Li
In this paper, we propose a novel method to recognize human actions using 3D human skeleton joint points. First, we represent a skeleton pose by a feature vector with three descriptors: limb orientation, joint motion orientation and body part relation. Then, we mine discriminative local basic motions based on the sequences of feature vectors. These local basic motions contain the discriminative motions of key joints and can well represent human actions. Experiments conducted on MSR Action3D Dataset and MSR Daily Activity3D Dataset demonstrate the effectiveness of the proposed algorithm and a superior performance over the state-of-the-art techniques.
A combined ASEF and pictorial structure method for facial landmark detection
Yan Wang, Sui Wei, Lei Qu
Facial landmark localization is a crucial step in many facial image analysis applications. In this paper, we propose a combined ASEF (average of synthetic exact filters) and pictorial structure method for facial landmark detection. First, the local maxima of the ASEF response image for each landmark are extracted as candidates. Then, the ASEF responses of the candidates for each landmark and their relative positions are evaluated by the pictorial structure model. Finally, the combination of candidates with the highest score is selected as the final detection result. We show that by introducing the position constraint into ASEF, the detection accuracy can be greatly improved. The experimental results on the BioID dataset verify the efficiency and accuracy of the proposed method.
Image segmentation using an improved differential algorithm
Hao Gao, Yujiao Shi, Dongmei Wu
Among all existing segmentation techniques, thresholding is one of the most popular due to its simplicity, robustness, and accuracy (e.g. the maximum entropy method, Otsu's method, and K-means clustering). However, the computation time of these algorithms grows exponentially with the number of thresholds owing to their exhaustive search strategy. As a population-based optimization algorithm, differential evolution (DE) uses a population of potential solutions and a decision-making process. It has shown considerable success in solving complex optimization problems within a reasonable time limit, so applying it to segmentation is a natural choice given its fast computation. In this paper, we first propose a new differential evolution algorithm with a balance strategy, which seeks a balance between the exploration of new regions and the exploitation of already sampled regions. Then, we apply the new DE to the traditional Otsu method to shorten the computation time. Experimental results of the new algorithm on a variety of images show that, compared with EA-based thresholding methods, the proposed DE algorithm obtains more effective and efficient results, and it shortens the computation time of the traditional Otsu method.
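The combination of DE with Otsu's criterion can be sketched for a single threshold: DE searches the threshold that maximizes the between-class variance of a grey-level histogram. This is a minimal DE/rand/1 sketch without the paper's balance strategy; all parameters and names are illustrative assumptions.

```python
import random

def otsu_objective(hist, t):
    """Between-class variance for integer threshold t (to maximize)."""
    total = sum(hist)
    w0 = sum(hist[:t])
    w1 = total - w0
    if w0 == 0 or w1 == 0:
        return 0.0
    m0 = sum(i * h for i, h in enumerate(hist[:t])) / w0
    m1 = sum(i * h for i, h in enumerate(hist[t:], start=t)) / w1
    return w0 * w1 * (m0 - m1) ** 2

def de_otsu(hist, pop=10, gens=40, F=0.6, CR=0.9, seed=3):
    """Basic DE/rand/1 search over a continuous threshold variable."""
    random.seed(seed)
    lo, hi = 1, len(hist) - 1
    xs = [random.uniform(lo, hi) for _ in range(pop)]
    fit = [otsu_objective(hist, int(x)) for x in xs]
    for _ in range(gens):
        for i in range(pop):
            a, b, c = random.sample([j for j in range(pop) if j != i], 3)
            trial = xs[a] + F * (xs[b] - xs[c])      # DE/rand/1 mutation
            if random.random() > CR:
                trial = xs[i]                        # crossover (1-D case)
            trial = min(max(trial, lo), hi)
            f = otsu_objective(hist, int(trial))
            if f >= fit[i]:                          # greedy selection
                xs[i], fit[i] = trial, f
    best = max(range(pop), key=lambda i: fit[i])
    return int(xs[best])

# Bimodal toy histogram: two classes centred at levels 3 and 12.
hist = [1, 4, 9, 12, 9, 4, 1, 0, 0, 1, 4, 9, 12, 9, 4, 1]
t_de = de_otsu(hist)
t_ex = max(range(1, 16), key=lambda t: otsu_objective(hist, t))
```

For a single threshold the exhaustive search is trivially cheap; the DE formulation pays off in the multi-threshold case the abstract targets, where the exhaustive search grows exponentially.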
Modeling of polarimetric BRDF characteristics of painted surfaces
Ying Zhang, Zeying Wang, Huijie Zhao
In this paper a pBRDF (polarimetric bidirectional reflectance distribution function) model of painted surfaces coupled with atmospheric polarization characteristics is built, and a method for simulating the polarimetric radiation reaching the imaging system is presented. First, the composition of the radiation reaching the sensor is analyzed. Then, the pBRDF model of painted surfaces is developed according to the microfacet theory presented by G. Priest, and the downwelling skylight polarization is modeled based on the vector radiative transfer model RT3. Furthermore, the polarization state of the light reflected from the surfaces is obtained by integrating the directional polarimetric information over the whole hemisphere, adding the modeled polarimetric factors of the incident diffuse skylight. Finally, the polarimetric radiance reaching the sensor is summed up under the assumption that the target-sensor path is negligible, since it is relatively short in the current imaging geometry. The modeled results depend on the solar-sensor geometry, the atmospheric conditions and the features of the painted surfaces. This result can be used to simulate imaging under different weather conditions; further work is needed to validate the model experimentally.
A three-dimensional shape measurement system based on fiber-optic image bundles
Cheng Zhen, Huijie Zhao, Xiaoyue Liang, et al.
A three-dimensional shape measurement system based on fiber-optic image bundles is proposed to measure the three-dimensional shape of objects in confined spaces, exploiting the flexibility of fiber-optic image bundles. First, based on the principle of phase shifting and the advantages of fiber-optic image bundles, the mathematical model of the measurement system was established, and the hardware and software platforms of the system were set up. Then, the problems of calibration and of the poor image quality introduced by fiber-optic image bundles were analyzed, and a viable solution was proposed. Finally, experiments on objects in confined spaces were performed with the three-dimensional shape measurement system. As the transmission medium of the system, the fiber-optic image bundles enabled flexible image acquisition and projection. The three-dimensional shape of the object was reconstructed after processing the image data. Experimental results indicated that the system is miniature and flexible enough to measure the three-dimensional shape of objects in confined spaces, expanding the application range of the structured-light three-dimensional shape measurement technique.
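The phase-shifting principle the system builds on can be illustrated with the standard four-step formula: four fringe images captured with phase shifts of 0, π/2, π and 3π/2 yield the wrapped phase at each pixel in closed form. This is a textbook sketch, not the authors' exact processing chain.

```python
import math

def four_step_phase(i1, i2, i3, i4):
    """Four-step phase shifting: with I_k = A + B*cos(phi + k*pi/2),
    the wrapped phase is atan2(I4 - I2, I1 - I3)."""
    return math.atan2(i4 - i2, i1 - i3)

# Simulate one pixel with background A, modulation B, true phase 0.7 rad.
A, B, phi = 10.0, 5.0, 0.7
frames = [A + B * math.cos(phi + k * math.pi / 2) for k in range(4)]
recovered = four_step_phase(*frames)
# recovered ≈ 0.7
```

In the bundle-based system, the same arithmetic is applied per fiber-relayed pixel after the calibration and image-quality corrections described above, followed by phase unwrapping and triangulation.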
Image fusion driven by the analysis of sparse coefficients
Xiujuan Yu, Hanwen Zhao, Xiaoyan Luo, et al.
This paper proposes an efficient fusion method for multiple remote sensing images based on sparse representation, in which we mainly address the fusion rules for the sparse coefficients. In the proposed method, the first step is to obtain the sparse coefficients of the different source images based on three dictionaries. Considering the sparsity, the source coefficients can be divided into large-, middle-, and small-correlation classes. Based on the analysis and comparison of permutations, the final coefficients are fused under different fusion rules according to the correlation. Finally, the fused image is reconstructed by combining the fused coefficients and the trained dictionaries.
Synthesis multi-projector content for multi-projector three dimension display using a layered representation
Chen Qin, Bin Ren, Longfei Guo, et al.
Multi-projector three-dimensional display is a promising multi-view, glasses-free three-dimensional (3D) display technology that can produce full-color, high-definition 3D images on its screen. One key problem of multi-projector 3D display is how to acquire the source images for the projector array while avoiding the pseudoscopic problem. This paper first analyzes the display characteristics of multi-projector 3D display and then proposes a projector content synthesis method using a tetrahedral transform. A 3D video format based on a stereo image pair and an associated disparity map is presented; it is well suited to any type of multi-projector 3D display and has the advantage of saving storage. Experimental results show that our method solves the pseudoscopic problem.
Coupled data association and L1 minimization for multiple object tracking under occlusion
Xue Wang, Qing Wang
We propose a novel multiple object tracking algorithm in a particle filter framework, where the input is a set of candidate regions obtained from robust principal component analysis (RPCA) in each frame, and the goal is to recover the trajectories of objects over time. Our method adapts to the changing appearance of objects, due to occlusion, illumination changes and large pose variations, by incorporating an l1-minimization-based appearance model into the maximum a posteriori (MAP) inference. Though L1 trackers have shown impressive tracking accuracy, they are computationally demanding for multiple object tracking. Conventional data association methods using a simple nonparametric appearance model, such as a histogram-based descriptor, may suffer from drastic changes in object appearance. The robust tracking performance of our approach has been validated with a comprehensive evaluation involving several challenging sequences and state-of-the-art multiple object trackers.
Hierarchical feature selection for erythema severity estimation
Li Wang, Chenbo Shi, Chang Shu
At present, the PASI scoring system is used for evaluating erythema severity, which can help doctors diagnose psoriasis [1-3]. The system relies on the subjective judgment of doctors, so accuracy and stability cannot be guaranteed [4]. This paper proposes a stable and precise algorithm for erythema severity estimation. Our contributions are twofold. On one hand, in order to extract the multi-scale redness of erythema, we design hierarchical features. Unlike traditional methods, we not only utilize color statistical features but also divide the detection window into small windows and extract hierarchical features from them. Further, a feature re-ranking step is introduced, which guarantees that the extracted features are mutually uncorrelated. On the other hand, an adaptive boosting classifier is applied for further feature selection. During training, the classifier seeks out the most valuable features for evaluating erythema severity, owing to its strong learning ability. Experimental results demonstrate the high precision and robustness of our algorithm. The accuracy is 80.1% on a dataset comprising 116 patients' images with various kinds of erythema. Our system has been applied to erythema medical efficacy evaluation at Union Hospital, China.
Automatic segmentation of psoriasis lesions
Yang Ning, Chenbo Shi, Li Wang, et al.
The automatic segmentation of psoriatic lesions has been widely researched in recent years. It is an important step in computer-aided methods for calculating the PASI of lesions. Current algorithms can only handle single erythema or only deal with scaling segmentation, whereas in practice scaling and erythema are often mixed together. In order to segment whole lesion areas, this paper proposes an algorithm based on random forests with color and texture features. The algorithm has three steps. First, polarized light is applied during imaging, exploiting the skin's Tyndall effect, to eliminate reflections, and the Lab color space is used to fit human perception. Second, a sliding window and its sub-windows are used to extract textural and color features. In this step, an image roughness feature is defined so that scaling can be easily separated from normal skin. Finally, random forests are used to ensure the generalization ability of the algorithm. The algorithm gives reliable segmentation results even when images have different lighting conditions and skin types. On the dataset provided by Union Hospital, more than 90% of images can be segmented accurately.
3D reconstruction of large target by range gated laser imaging
Sining Li, Xu Yan, Wei Wang, et al.
We have developed a complete range-gated laser imaging system with a maximum acquisition distance of ~3 km. The system uses an electro-optically Q-switched 532 nm Nd:YAG laser as the transmitter and a double micro-channel plate as the gated sensor; all components are controlled by a trigger control unit with sub-nanosecond accuracy. An imaging scheme is designed for imaging a large building ~500 m away, and a sequence of images is obtained in the experiment as the basic data for 3D reconstruction. To improve the range resolution, we study the temporal intensity distribution of the received signal and use a centroid algorithm for data processing. We compare the 3D image with the theoretical model, and the results are consistent.
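The centroid step described above can be sketched in a few lines: for one pixel, the intensity-weighted centroid of the gate delays gives a sub-gate estimate of the round-trip time, hence a refined range. This is a minimal per-pixel sketch; the delay values and intensities below are made-up examples:

```python
import numpy as np

C = 3e8  # speed of light, m/s

def centroid_range(gate_delays_ns, intensities):
    """Range from a per-pixel sequence of gated intensities: the
    intensity-weighted centroid of the gate delays gives a sub-gate
    round-trip time t, and range = c * t / 2."""
    t = np.asarray(gate_delays_ns, dtype=float) * 1e-9
    I = np.asarray(intensities, dtype=float)
    t_c = (t * I).sum() / I.sum()   # centroid of the temporal profile
    return C * t_c / 2.0            # one-way distance in metres
```

For a symmetric return centred on a ~3.33 µs round trip, this recovers a range near 500 m, matching the building distance quoted in the abstract.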
A no-reference contourlet-decomposition-based image quality assessment method for super-resolution reconstruction
Wei Zhang, Zhongcheng Fan
A no-reference image quality assessment method for super-resolution reconstruction is proposed. The basic idea is to first perform a contourlet multiscale decomposition of the low-resolution image and the reconstructed super-resolution image. According to the correlations among the contourlet coefficients, the reconstructed image is divided into sharp edges, image texture, and flat regions. Then a ringing intensity index is calculated for the sharp edges, a blur extent index for the image texture, and a directional entropy index for the high-frequency components. Finally, these indices are integrated into one overall index that evaluates the reconstructed image quality. Several experimental results using simulated images demonstrate that the new index is efficient and stable for evaluating the quality of reconstructed super-resolution images, and that it accords well with human subjective vision.
Real-time remote three-dimensional superresolution range-gated imaging based on inter-frame correlation
Xinwei Wang, Yinan Cao, Wei Cui, et al.
High-resolution real-time three-dimensional imaging is important in 3D video surveillance, robot vision, and automatic navigation. In this paper, three-dimensional superresolution range-gated imaging based on inter-frame correlation is proposed to realize high-resolution real-time 3D imaging. In this method, a CCD/CMOS sensor with a gated image intensifier is used as the image sensor, and depth information collapsed into 2D images is reconstructed by spatial-temporal inter-frame correlation with a resolution of about 1000×1000 full-frame pixels per frame. Furthermore, under inter-frame correlation a 3D point cloud frame is generated at video rates matching the CCD/CMOS used. Finally, proof-of-concept simulation experiments are demonstrated.
Improved sequential search algorithms for classification in hyperspectral remote sensing images
Two new sequential search algorithms for feature selection in hyperspectral remote sensing images are proposed. Since many wavebands in hyperspectral images are redundant and irrelevant, the use of feature selection to improve classification results is highly needed. First, we present a new generalized steepest ascent (GSA) feature selection technique that improves upon the prior steepest ascent algorithm by selecting a better starting search point and performing a more thorough search. It is guaranteed to provide solutions that equal or exceed those of the classical sequential forward floating selection algorithm. However, when the number of available wavebands is large, the computational load required for the GSA algorithm becomes excessive. We thus propose a modification of the improved floating forward selection algorithm which is more computationally efficient. Experimental results for two hyperspectral data sets show that our proposed algorithms yield better classification results than other suboptimal search algorithms.
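As background to the comparison above, the classical sequential forward floating selection (SFFS) baseline that the GSA algorithm is guaranteed to match or exceed can be sketched as follows; the toy additive criterion in the usage note is a placeholder for a real classification-accuracy criterion:

```python
def sffs(features, criterion, k):
    """Sequential forward floating selection: greedily add the feature
    that maximizes the criterion, then conditionally drop any selected
    feature whose removal improves the criterion (classic SFFS, not the
    paper's GSA method)."""
    selected = []
    while len(selected) < k:
        # forward step: add the single best remaining feature
        best = max((f for f in features if f not in selected),
                   key=lambda f: criterion(selected + [f]))
        selected.append(best)
        # backward "floating" step: drop features while it helps
        improved = True
        while improved and len(selected) > 2:
            improved = False
            for f in list(selected):
                rest = [g for g in selected if g != f]
                if criterion(rest) > criterion(selected):
                    selected = rest
                    improved = True
                    break
    return selected
```

With a monotone criterion the backward step never fires and SFFS reduces to plain sequential forward selection; the floating step matters when feature interactions make an early choice suboptimal, which is also where the GSA's more thorough search pays off.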
Image feature point detection method based on the pixels of high-resolution sensors
Xingchun Liu, Zhe Wang, Zhipeng Hu, et al.
By analyzing the characteristics of high-resolution images obtained from a high-resolution sensor of fixed size, a new fast feature point detection method is put forward. First, candidate points are detected by sampling at a fixed step, which simplifies the subsequent extraction of extreme feature points. Then each candidate point is described by features of its neighborhood, and extreme feature points are obtained through a preset threshold. Finally, correct feature points are obtained by filtering. The effectiveness of the extraction method was validated by image matching: the matching results show that the features extracted by this method preserve precision while reducing computation.
Asymmetric multiview image coding based on feature matching
Wenjun Tao, Huihui Bai, Meiqin Liu, et al.
In this paper, we propose a feature-matching-based asymmetric three-dimensional (3D) image coding method with hierarchical reconstruction quality. At the encoder, standard intra coding is applied to the main view to obtain high reconstruction quality, while for its neighboring views extracted feature descriptors are utilized to calculate the transformation matrices between views. The parameters of a transformation matrix can be transmitted at a very low bit rate and yield a preliminary reconstruction; the residues can then be exploited to improve its quality. Experimental results show that the proposed scheme can reach a very high compression ratio.
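The view-to-view transformation matrix estimated from matched feature descriptors could, for instance, be a least-squares affine fit to the matched keypoint coordinates. The sketch below shows that fit; whether the paper uses an affine or projective model is not stated in the abstract, so the 2×3 affine form is an assumption:

```python
import numpy as np

def estimate_affine(src, dst):
    """Least-squares 2x3 affine transform mapping src -> dst point sets
    (Nx2 arrays); a stand-in for the view-to-view transformation matrix
    estimated from matched features."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    A = np.hstack([src, np.ones((len(src), 1))])     # N x 3 homogeneous points
    # solve A @ M.T ~ dst for the 3x2 parameter matrix in least squares
    M, *_ = np.linalg.lstsq(A, dst, rcond=None)
    return M.T                                       # 2 x 3 affine matrix
```

Only the six affine parameters need to be transmitted, which is why the side information costs so few bits compared with coding the view itself.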
High quality underwater imaging platform with laser range gated technique combining with image denoising and restoration
Huachuan Huang, Rongbo Wang, Keding Yan, et al.
Underwater laser imaging is of great significance in underwater search, marine science, and related fields. However, traditional underwater laser images often suffer from noise, blur, and low resolution. In order to obtain clear underwater images of high resolution and quality, we have designed a range-gated underwater imaging system and implemented an image restoration approach. In this paper, after introducing the imaging system and the restoration algorithm, we describe experiments in which the system was deployed underwater in a lake to capture underwater targets. With the proposed restoration approach, high-quality images can be retrieved, demonstrating that the method is able to identify targets ~10 meters away underwater.
A fast high-dynamic range algorithm based on HSI color space
Jiancheng Zhang, Xiaohua Liu, Liquan Dong, et al.
This paper presents a fast high dynamic range algorithm based on the HSI color space. First, to preserve the hue and saturation of the original image and conform to human visual perception, the input image is converted to the HSI color space, which contains an intensity dimension. Second, to raise the speed of the algorithm, an integral image is used to compute the average intensity around every pixel at a given scale as the local intensity component, from which the detail intensity component is derived. Third, to adjust the overall image intensity, an S-shaped curve is derived from the original image information and applied to the local intensity component. Fourth, to enhance detail, the detail intensity component is adjusted according to a curve designed in advance. The weighted sum of the adjusted local and detail intensity components gives the final intensity; converting this intensity together with the other two dimensions to the output color space yields the final processed image.
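The integral-image step is the part that makes the algorithm fast: each window average costs O(1) regardless of the window scale. A minimal sketch of that step (the S-curve and detail-enhancement curves are omitted; border handling by clipping is an assumption):

```python
import numpy as np

def local_mean(intensity, radius):
    """Per-pixel local mean via an integral image (summed-area table),
    so each window average is four lookups regardless of scale."""
    h, w = intensity.shape
    # integral image with a zero row/column so border windows work out
    ii = np.zeros((h + 1, w + 1))
    ii[1:, 1:] = intensity.cumsum(0).cumsum(1)
    r = radius
    y0 = np.clip(np.arange(h) - r, 0, h)
    y1 = np.clip(np.arange(h) + r + 1, 0, h)
    x0 = np.clip(np.arange(w) - r, 0, w)
    x1 = np.clip(np.arange(w) + r + 1, 0, w)
    Y0, X0 = np.meshgrid(y0, x0, indexing="ij")
    Y1, X1 = np.meshgrid(y1, x1, indexing="ij")
    area = (Y1 - Y0) * (X1 - X0)
    # sum over each window from four corner lookups
    s = ii[Y1, X1] - ii[Y0, X1] - ii[Y1, X0] + ii[Y0, X0]
    return s / area
```

The detail component in the text is then simply `intensity - local_mean(intensity, radius)`, and each component is tone-mapped by its own curve before recombination.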
Amplitude and phase of single nanoparticle calculated using finite-difference time-domain method
Xin Hong, Huan Liu
In this work, we present a model to calculate the electric field amplitude and phase distribution of a single nanoparticle using the finite-difference time-domain (FDTD) method. We model the light-nanoparticle interaction by using linearly polarized light to illuminate the single nanoparticle through immersion oil and a glass substrate. The illumination is set as a cone of plane waves limited by the aperture of the objective. The scattered field collected on a single detector is amplified by heterodyne interference with a reference beam. The amplitude and phase distributions of particles with diameters ranging from 50 nm to 2 µm are calculated.
Combining heterogeneous features for 3D hand-held object recognition
Xiong Lv, Shuang Wang, Xiangyang Li, et al.
Object recognition has wide applications in the areas of human-machine interaction and multimedia retrieval. However, due to visual polysemy and concept polymorphism, it is still a great challenge to obtain reliable recognition results from 2D images. Recently, with the emergence and easy availability of RGB-D equipment such as the Kinect, this challenge can be relieved because the depth channel brings more information. A special and important case of object recognition is hand-held object recognition, as the hand is a straightforward and natural medium for both human-human and human-machine interaction. In this paper, we study the problem of 3D object recognition by combining heterogeneous features with different modalities and extraction techniques. Hand-crafted features preserve low-level information such as shape and color, but are weaker at representing high-level semantic information than automatically learned features, especially deep features. Deep features have shown great advantages on large-scale recognition datasets, but are not always as robust to rotation or scale variance as hand-crafted features. We therefore propose a method to combine hand-crafted point cloud features and deep learned features from the RGB and depth channels. First, hand-held object segmentation is implemented by using depth cues and human skeleton information. Second, we combine the extracted heterogeneous 3D features at different stages using linear concatenation and multiple kernel learning (MKL). A trained model is then used to recognize 3D hand-held objects. Experimental results validate the effectiveness and generalization ability of the proposed method.
An auto-gain control algorithm for EMCCD based on dynamic gray-level
Yuehong Qian, Wenwen Zhang, Jingjing Liu, et al.
Exploiting the adjustable gain multiplier of the electron-multiplying CCD (EMCCD), an image gain adjustment method based on the dynamic gray-level range is proposed. Compared to a fixed-value adjustment algorithm, the automatic gain algorithm presented here is more adaptive; even in low-light conditions it can achieve better gain values. Experimental results show that the automatic gain algorithm, which combines mean values with the dynamic range of histograms, meets the requirements. Whether during the day or at night, the image brightness quickly converges to the optimum range of the gray histogram distribution, with the dynamic range covering more than 90% of the gray levels. Judging from the images obtained, the brightness is moderate and details are clear.
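One step of such a mean-driven auto-gain loop might look like the sketch below. The target gray level and gain limits are invented placeholder values, and the real algorithm also consults the histogram's dynamic range, which is omitted here:

```python
import numpy as np

def update_em_gain(frame, gain, target_mean=128.0,
                   gain_min=1.0, gain_max=1000.0):
    """One iteration of a mean-based auto-gain loop (hypothetical
    parameters): rescale the current EM gain so the frame mean moves
    toward the target gray level, clamped to the usable gain range."""
    mean = float(np.mean(frame))
    if mean <= 0:
        return gain_max              # no signal at all: use maximum gain
    new_gain = gain * target_mean / mean
    return float(np.clip(new_gain, gain_min, gain_max))
```

Because the EMCCD response is roughly linear in the multiplication gain, the multiplicative update converges in a few frames, which matches the fast convergence the abstract reports.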
A flexible design for coded aperture snapshot spectral imager
Motivated by the success of compressive sensing (CS), the coded aperture snapshot spectral imager (CASSI) computationally obtains 3D spectral images from 2D compressive measurements. In CASSI, each pixel of the detector captures spectral information from only one voxel in each band with binary weights (i.e., 0 or 1), which limits the variety of superposition relationships among the 3D voxels of the underlying scene. Moreover, pixel-to-pixel correspondence between the detector and the coded aperture cannot be readily achieved in the presence of the dispersive prism, due to the small pixel sizes of these elements (often in micrometers). In this paper, we propose a flexible design that improves the performance of CASSI with its currently employed optical elements. Specifically, the proposed design introduces a flexible alignment relationship among the coded aperture, the dispersive prism, and the detector. Each detector measurement becomes the summation of several voxels in each band with random decimal weights, and different measurements correspond to overlapping voxels, which provides a richer superposition of the scene information. This flexible design helps the sensing mechanism better satisfy the requirements of CS theory. Furthermore, the proposed design greatly reduces the alignment complexity and the burden of system construction. Preliminary results achieve improved image quality, including higher PSNR and better perceptual effect, compared to the traditional design.
High accuracy hole filling for Kinect depth maps
Jianxin Wang, Ping An, Yifan Zuo, et al.
Hole filling of depth maps is a core technology of Kinect-based visual systems. In this paper, we propose a hole filling algorithm for Kinect depth maps based on separate repair of the foreground and background. The proposed algorithm has two stages. First, a fast pre-processing of the Kinect depth map holes is performed: background holes are filled with a deepest-depth image, constructed by combining the spatio-temporal information of the pixels in the Kinect depth map with the corresponding color information in the Kinect color image. The second stage enhances the pre-processed depth maps. We propose a depth enhancement algorithm based on joint geometry and color information. Since the geometry information is more robust than the color, we correct the depth by an affine transform prior to utilizing the color cues. We then determine the filter parameters adaptively based on local features of the color image, which solves the texture copy problem and protects fine structures. Since L1-norm optimization is more robust to data outliers than L2-norm optimization, we force the filtered value to be the solution of an L1-norm optimization. Experimental results show that the proposed algorithm protects the intact foreground depth, improves the accuracy of depth at object edges, and eliminates the flashing phenomenon of depth at object edges. In addition, the proposed algorithm can effectively fill the large depth map holes generated by optical reflection.
Orientation selectivity-based structure for texture classification
Jinjian Wu, Weisi Lin, Guangming Shi, et al.
Local structure, e.g., the local binary pattern (LBP), is widely used in texture classification. However, LBP is too sensitive to disturbance. In this paper, we introduce a novel structure for texture classification. Research in cognitive neuroscience indicates that the primary visual cortex presents remarkable orientation selectivity for visual information extraction. Inspired by this, we investigate the orientation similarities among neighboring pixels and propose an orientation-selectivity-based pattern for local structure description. Experimental results on texture classification demonstrate that the proposed structure descriptor is quite robust to disturbance.
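For reference, the LBP baseline that the proposed descriptor improves upon thresholds each pixel's eight neighbours against the centre, which is why a small intensity disturbance near the threshold can flip bits. A minimal sketch of that baseline (fixed clockwise bit order assumed):

```python
import numpy as np

def lbp_codes(img):
    """Classic 8-neighbour local binary pattern: each interior pixel
    gets an 8-bit code marking which neighbours are >= the centre."""
    c = img[1:-1, 1:-1]
    # neighbours in a fixed clockwise order, one bit each
    shifts = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros_like(c, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(shifts):
        n = img[1 + dy:img.shape[0] - 1 + dy,
                1 + dx:img.shape[1] - 1 + dx]
        code |= ((n >= c).astype(np.uint8) << bit)
    return code
```

The histogram of these codes over an image is the usual texture descriptor; the hard `>=` comparison is exactly the noise-sensitive step that an orientation-similarity pattern replaces with a more stable quantity.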
Realization of a single image haze removal system based on DaVinci DM6467T processor
Video monitoring systems (VMS) have been extensively applied in target recognition, traffic management, remote sensing, auto navigation, and national defence. However, a VMS depends strongly on the weather: in foggy weather the quality of the images it receives is distinctly degraded and its effective range is decreased, so it performs poorly in bad weather. The enhancement of fog-degraded images therefore has high theoretical and practical value. A design scheme for a fog-degraded image enhancement system based on the TI DaVinci processor is presented in this paper. The main function of the system is to capture images from digital cameras and perform image enhancement processing to obtain clear images. The processor used in this system is the dual-core TI DaVinci DM6467T (ARM @ 500 MHz + DSP @ 1 GHz). A MontaVista Linux operating system runs on the ARM subsystem, which handles I/O and application processing; the DSP handles signal processing, and the results are made available to the ARM subsystem in shared memory. The system benefits from the DaVinci processor in that, with lower power cost and smaller volume, it provides image processing capability equivalent to that of an x86 computer. The results show that the system can process images at 25 frames per second at D1 resolution.
Embedding Perspective Cue in Holographic Projection Display by Virtual Variable-Focal-Length Lenses
Zhaohui Li, Jianqi Zhang, Xiaorui Wang, et al.
To make a perspective cue emerge in reconstructed images, a new approach is proposed that incorporates virtual variable-focal-length lenses into the computer-generated Fourier hologram (CGFH). The approach is based on a combination of the monocular vision principle and digital hologram display, and thus inherits properties of both display models simultaneously; it can therefore overcome the unsatisfactory visual depth perception of reconstructed three-dimensional (3D) images in holographic projection display (HPD). First, an analysis of the characteristics of conventional CGFH reconstruction shows that a finite depth of focus and a non-adjustable lateral magnification are the reasons depth information is lost on a fixed image plane. Second, the principle of controlling lateral magnification in wave-front reconstruction with virtual lenses is demonstrated, and a relation model is derived involving the depth of the object, the parameters of the virtual lenses, and the lateral magnification. Next, the focal lengths of the virtual lenses are determined by considering the perspective distortion of human vision. After employing the virtual lenses in the CGFH, the reconstructed image on the focal plane delivers the same depth cues as a monocular stereoscopic image. Finally, the depth-of-focus enhancement produced by a virtual lens and its effect on reconstruction quality are described. Numerical simulation and electro-optical reconstruction experiments prove that the proposed algorithm improves the depth perception of reconstructed 3D images in HPD. The proposed method opens the possibility of uniting multiple display models to enhance 3D display performance and the viewing experience.
Simultaneous cartoon-plus-texture image deconvolution by using variational image decomposition
Huasong Chen, Keding Yan, Jun Zhang, et al.
Real images usually have two layers: cartoons (the piecewise-smooth part of the image) and textures (the oscillating-pattern part of the image). In this paper, we solve challenging image deconvolution problems by using a variational image decomposition method that regularizes the cartoon with total variation and the texture in G space. Unlike existing schemes in the literature, which can only recover the smooth structure of the image, our deconvolution method restores both the smooth part and the detailed oscillating part of the image. Numerical simulation examples demonstrate the applicability and usefulness of our proposed algorithms for image deconvolution.
Applications of just-noticeable depth difference model in joint multiview video plus depth coding
Chao Liu, Ping An, Yifan Zuo, et al.
A new multiview just-noticeable-depth-difference (MJNDD) model is presented and applied to compress joint multiview video plus depth. Many video coding algorithms remove spatial, temporal, and statistical redundancies, but they are not capable of removing perceptual redundancies. Since the final receptor of video is the human eye, we can remove perceptual redundancy to gain higher compression efficiency according to the properties of the human visual system (HVS). The traditional just-noticeable-distortion (JND) model in the pixel domain contains luminance contrast and spatial-temporal masking effects, which describe perceptual redundancy quantitatively. Since the HVS is very sensitive to depth information, the proposed MJNDD model combines the traditional JND model with a just-noticeable-depth-difference (JNDD) model. The texture video is divided into background and foreground areas using depth information, and different JND thresholds are assigned to the two parts. The MJNDD model is then used to encode the texture video in JMVC. When encoding the depth video, the JNDD model is applied to remove block artifacts and protect edges. We then use VSRS 3.5 (View Synthesis Reference Software) to generate the intermediate views. Experimental results show that our model can tolerate more noise, and that the compression efficiency is improved by 25.29 percent on average, and by 54.06 percent at most, compared to JMVC, while maintaining subjective quality. Hence it attains a high compression ratio and a low bit rate.
Characteristic extraction and matching algorithms of ballistic missile in near-space by hyperspectral image analysis
Li Lu, Wen Sheng, Shihua Liu, et al.
Ballistic missile hyperspectral data from an imaging spectrometer on a near-space platform are generated by a numerical method. Characteristics of the ballistic missile hyperspectral data are extracted and matched using two different algorithms, called transverse counting and quantization coding. The simulation results show that both algorithms extract the characteristics of the ballistic missile adequately and accurately. The transverse counting algorithm has lower complexity and can be implemented more easily than the quantization coding algorithm; it also shows good immunity to disturbance signals and speeds up the matching and recognition of subsequent targets.
Ultrasonic televiewer image encoding based on block prediction
Zhengbing Zhang, Wei Zhang
In this paper, an ultrasonic televiewer image encoding method based on block prediction is proposed. The original image is divided into blocks of 8×8 pixels. The current block to be encoded is predicted from previously encoded and decoded blocks; the prediction mode that minimizes the difference between the original and predicted blocks is chosen from 9 modes. The prediction difference block is transformed with the discrete cosine transform (DCT), and the DCT coefficients are quantized and encoded with a lossless algorithm. The selected prediction modes are also encoded. Experimental results show that the performance of the proposed method is much better than JPEG.
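The predict-transform-quantize core of such a scheme can be sketched as follows. The uniform quantization step `q=16` is an illustrative assumption, and the 9-mode prediction search and lossless entropy coding are omitted:

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis matrix (rows are frequencies)."""
    k = np.arange(n)
    M = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    M[0] *= 1 / np.sqrt(2)
    return M * np.sqrt(2 / n)

def encode_block(block, predicted, q=16):
    """Residual coding of one 8x8 block: subtract the prediction,
    apply a separable 2D DCT, quantize with a uniform step."""
    D = dct_matrix()
    coeff = D @ (block - predicted) @ D.T
    return np.round(coeff / q).astype(int)

def decode_block(qcoeff, predicted, q=16):
    """Dequantize, inverse DCT, and add the prediction back."""
    D = dct_matrix()
    return D.T @ (qcoeff * q) @ D + predicted
```

Because the DCT matrix is orthonormal, the inverse transform is simply its transpose, and the only loss in the round trip comes from the quantization step.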
Seismic data compression based on wavelet transform
Zhengbing Zhang, Wei Zhang, Zhixian Gui
New technologies adopted in seismic exploration, such as multi-dimensional, multi-component, and high-precision methods, have caused seismic exploration data to grow explosively. The large volume of seismic data causes serious problems in transmission, storage, and processing. In this paper a seismic data compression method based on the wavelet transform is proposed. The original data is decomposed into 12 detail sub-bands and 1 low-resolution sub-band with a two-dimensional discrete wavelet transform, and the wavelet coefficients are encoded with the embedded zero-tree wavelet coding algorithm. Experimental results show that the proposed method achieves efficient compression.
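One level of a 2D discrete wavelet transform splits the data into one approximation and three detail sub-bands; applying it recursively to the approximation three times yields the 12 detail sub-bands plus 1 low-resolution sub-band described above. A minimal sketch using the Haar wavelet (the abstract does not name the wavelet family, so Haar is an assumption):

```python
import numpy as np

def haar2d(x):
    """One level of a 2D Haar wavelet transform on an array with even
    dimensions: returns the low-resolution sub-band LL and the detail
    sub-bands LH, HL, HH, each a quarter of the input size."""
    a = (x[0::2, :] + x[1::2, :]) / 2.0   # row averages
    d = (x[0::2, :] - x[1::2, :]) / 2.0   # row differences
    LL = (a[:, 0::2] + a[:, 1::2]) / 2.0
    LH = (a[:, 0::2] - a[:, 1::2]) / 2.0
    HL = (d[:, 0::2] + d[:, 1::2]) / 2.0
    HH = (d[:, 0::2] - d[:, 1::2]) / 2.0
    return LL, LH, HL, HH
```

On smooth seismic traces the detail sub-bands are near zero, which is what the embedded zero-tree coder exploits to compress efficiently.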
A design method of the distortionless catadioptric panoramic imaging based on freeform surface
Yisi Wu, An Li, Chi Chen, et al.
A design method for a distortionless catadioptric panoramic imaging system is proposed in this paper. The panoramic system consists mainly of two parts: a reflecting surface system with relay lenses, and a CCD camera. A mapping relationship between the real image plane and the projection surface is established so that low-distortion imaging is readily acquired, and freeform surfaces are applied to the reflecting surfaces to correct distortion. By iteratively optimizing the freeform surfaces, the image quality is gradually improved. The simulation results show that, compared with the traditional system, the new freeform-surface system has a simple design, attains higher performance, and offers small scene distortion, making the image more suitable and convenient for observation.
Compression of multispectral image using HEVC
Feiyu Gao, Xiangyang Ji, Chenggang Yan, et al.
Predictive lossy compression has been found to be an interesting alternative to conventional transform coding techniques in multispectral image compression. Recently, the High Efficiency Video Coding (HEVC) standard has shown significant improvement over state-of-the-art transform-based still-image coding standards. In this paper we study the properties of multispectral images and propose a predictive lossy compression scheme based on HEVC. Empirical analysis shows that our proposed method is superior to existing state-of-the-art predictive lossy compression schemes.
Design of HD binocular stereo display system based on ARM11
Bin Zhuo, Junsheng Shi, Yonghang Tai, et al.
Based on the characteristics of a 0.5'' micro AM-OLED and the binocular parallax principle of human vision, an HD stereo display system was designed on an ARM11 hardware platform with embedded Linux as the operating system, using the S3C6410 as the MCU. Side-by-side or top-and-bottom 3D video sources, input from HDMI or an SD card, are converted through the video coding algorithm to frame-timing-mode and field-timing-mode video formats. At the same time, the output 3D synchronization signal controls the left and right AM-OLEDs to receive the corresponding parallactic images. After magnification by the optical system, the HD stereo video on the dual AM-OLEDs presents a virtual image at a distance equivalent to 2.5 meters from the eyes, with a 46-foot diagonal, giving a natural, lifelike scene in front of the user. Combining the synchronization signal with the frame-timing and field-timing modes, the HD binocular stereo system displays a preferable result for users.
A new metric to assess temporal coherence for video retargeting
Ke Li, Bo Yan, Binhang Yuan
In video retargeting, assessing how well temporal coherence is maintained has become a prominent challenge. In this paper, we present a new objective measurement to assess temporal coherence after video retargeting. It is a general metric for assessing jitter artifacts in both discrete and continuous video retargeting methods, and its accuracy is verified by psycho-visual tests. As a result, our proposed assessment method has substantial practical significance.
An effective guess for Gerchberg-Saxton-type algorithms
Kaiyun Wei, Xin Jin, Yifu Hu, et al.
Gerchberg–Saxton-type (GS-type) algorithms have been widely applied in photonics to reconstruct object structures. However, with random guesses as the initial inputs, the reconstruction quality of GS-type algorithms is unpredictable, and a large number of iterations is often needed to reach convergence. In this paper, a singular value decomposition (SVD) based method is proposed to generate an effective initial phase guess for GS-type algorithms using a low-rank approximation. Experimental results demonstrate that, for the same reconstruction error, the proposed SVD-based guesses reduce the number of iterations by more than 50% on average compared with random guesses, and outperform random guesses in both steady-state error and iteration count: relative to the average performance of random guesses, the proposed approach reduces the steady-state error of recovered images by 70.7% on average and the number of iterations by 56.1% on average.
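For context, the textbook two-plane GS iteration that these initial guesses feed into looks as follows (this is the standard algorithm, not the paper's SVD construction; `phase0` is exactly the initial guess whose quality the SVD method improves):

```python
import numpy as np

def gerchberg_saxton(source_amp, target_amp, phase0, n_iter=50):
    """Plain GS iteration between two Fourier-related planes: in each
    plane, keep the current phase estimate but enforce the known
    amplitude, alternating via FFT / inverse FFT."""
    field = source_amp * np.exp(1j * phase0)
    for _ in range(n_iter):
        far = np.fft.fft2(field)
        far = target_amp * np.exp(1j * np.angle(far))      # enforce target amplitude
        field = np.fft.ifft2(far)
        field = source_amp * np.exp(1j * np.angle(field))  # enforce source amplitude
    return np.angle(field)
```

Since every iteration costs two FFTs, any initialization that cuts the iteration count by half, as the SVD-based guess does, roughly halves the total runtime.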
Improved gray world color correction method based on weighted gain coefficients
Bin Pan, Zhiguo Jiang, Haopeng Zhang, et al.
The grey world algorithm is a simple but widely used global white balance method for color cast images. However, this algorithm assumes only that the mean values of the R, G, and B components tend to be equal, which may lead to false alarms in normal images with large areas of a single background color, for example images with an ocean background. Another defect is that the grey world algorithm may cause luminance variations in channels that have no cast. We note that although their mean values differ, the standard deviations of the three channels are supposed to converge in color cast images, which is not the case for those false alarms. Based on this discrepancy, through a mathematical manipulation of both the mean values and standard deviations of the three channels, a novel color correction model is proposed by weighting the gain coefficients in the grey world model. All three weighted gain coefficients tend to 1 on images containing large single-color regions, so as to avoid false alarms. For color cast images, the channel exhibiting the cast is given a weighted gain coefficient much less than 1 to correct it, while the other two channels receive weighted gain coefficients approximately equal to 1, ensuring that the model has little negative effect on channels with no color cast. Experiments show that our model gives better color correction performance.
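The standard grey world correction, with a hook for the per-channel weighting described above, can be sketched as follows. How the paper derives its weights from the means and standard deviations is not given in the abstract, so the `weights` argument here is a placeholder for those coefficients:

```python
import numpy as np

def grey_world_weighted(img, weights=(1.0, 1.0, 1.0)):
    """Grey world white balance with weighted gains. Standard grey
    world uses gain_c = mean(all channels) / mean(c); each gain is
    then blended toward 1 by its weight (w = 1 gives full correction,
    w = 0 leaves the channel untouched, mimicking the paper's
    behaviour on single-colour backgrounds)."""
    img = img.astype(float)
    means = img.reshape(-1, 3).mean(axis=0)
    grey = means.mean()
    gains = grey / means
    gains = 1.0 + np.asarray(weights) * (gains - 1.0)
    return img * gains
```

With all weights equal to 1 this is the classical algorithm (every channel mean is pulled to the global grey level); with weights near 0 the image passes through unchanged, which is the desired behaviour on large single-colour regions.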
Compressive Spectral Video Acquisition with Double-channel Complementary Coded Aperture
Danhua Liu, Huan Li, Guo Li, et al.
Spectral video is crucial for monitoring dynamic scenes, reconnaissance of moving targets, observation and tracking of living cells, and similar tasks. Traditional spectral imaging methods need multiple exposures to capture a full-frame spectral image, which leads to low temporal resolution and makes them unsuitable for spectral video. The coded aperture snapshot spectral imaging (CASSI) method, which has emerged in recent years, is suitable for spectral video acquisition due to its high-speed snapshots and small number of measurements. Based on CASSI, this paper proposes a compressive spectral video acquisition method with a double-channel complementary coded aperture. The method achieves spectral video with high temporal resolution by directly sampling the 3D spectral scene with a 2D array sensor in only one snapshot. Furthermore, by combining the double-channel complementary coded aperture in the compressive measurement with sparsity regularization in the optimization-based recovery, we obtain higher PSNR and better visual effects than single-channel CASSI. Simulation results demonstrate the efficacy of the proposed method.
Aircraft detection based on probability model of structural elements
Aircraft detection is important in the field of remote sensing. In past decades, researchers used various approaches that detect aircraft with classifiers trained on whole aircraft. However, with the advent of high-resolution images, the internal structure of an aircraft should now also be taken into consideration. To address this issue, a novel aircraft detection method for satellite images based on a probabilistic topic model is presented. We model aircraft as connected structural elements rather than global features. The proposed method contains two major steps: 1) a cascade AdaBoost classifier first identifies the structural elements of an aircraft; 2) these structural elements are then assembled into aircraft, with the relationships between elements estimated by a hierarchical topic model. The model places strict spatial constraints on the structural elements, which helps distinguish between similar features. The experimental results demonstrate the effectiveness of the approach.
Efficient stereo matching algorithm with edge-detecting
Stereo vision is a hot research topic in the fields of computer vision and 3D video display, and computing the disparity map is one of its most crucial steps. A novel constant-computational-complexity algorithm based on separable successive weight summation (SWS) is presented. The proposed algorithm eliminates iteration and makes the cost independent of the support-area size, which saves computation and memory. A gradient-based similarity measure is also applied to improve the original algorithm. Image segmentation and edge detection are used to accelerate the stereo matching and improve its accuracy: the edge image is extracted to reduce the search scope of the matching algorithm. A dense disparity map is obtained through local optimization. Experimental results show that the algorithm is efficient, reduces matching noise, and improves matching precision at depth discontinuities and in low-texture regions.
Multi-channel super-resolution with Fourier ptychographic microscopy
Weixin Jiang, Yongbing Zhang, Qionghai Dai
Fourier ptychographic microscopy (FPM) is a recently developed imaging method that stitches together a sequence of low-resolution images in Fourier space in an iterative manner. However, the high-resolution color image super-resolved by this method often suffers from dispersion when compared with the high-resolution image observed under a high-magnification lens. In this paper, we propose a new method for super-resolving multi-channel images. Instead of simply applying the FPM algorithm to the R, G, and B channels separately, the method considers the relationship among the channels and uses it to correct the result. Experimental results demonstrate that the dispersion is eliminated compared with the super-resolved multi-channel image obtained from the original FPM algorithm. In addition, the robustness of Fourier ptychographic imaging is improved and the running time of color-image super-resolution is reduced.
Efficient mode decision algorithm for scalable High Efficiency Video Coding
Nandi Shi, Ran Ma, Panpan Li, et al.
A scalable extension design is proposed for High Efficiency Video Coding (HEVC) that provides temporal, spatial, and quality scalability. The technique achieves high coding efficiency and error resilience but increases the computational complexity. To reduce the complexity of quality-scalable video coding, this paper proposes a fast mode-selection method based on the mode distribution of coding units (CUs). Experiments show that the proposed algorithm achieves up to a 63.70% decrease in encoding time with a negligible loss of video quality.
Explore spatial-temporal relations: transient super-resolution with PMD sensors
Chaosheng Han, Xing Lin, Jingyu Lin, et al.
Transient imaging provides a direct view of how light travels in a scene, which leads to exciting applications such as looking around corners. Low-budget transient imagers, adapted from Time-of-Flight (ToF) cameras, reduce the barrier of entry for research on this new imaging modality. However, the image quality is far from satisfactory due to the limited resolution of PMD sensors. In this paper, we improve the resolution of transient images by modulating the illumination. We capture the scene under three linearly independent lighting conditions and derive a theoretical model relating the time profile of each pixel to the corresponding 3D details. Our key idea is that the light flight time in each pixel patch is proportional to the cross product of the illuminating direction and the surface normal. We first capture and reconstruct transient images by Fourier analysis at multiple illumination locations, then fuse the acquired low-spatial-resolution images to calculate the surface normal. Afterwards, we use an optimization procedure to split the pixels and enhance the image quality. We show that we can not only reveal the fine structure of the object but may also uncover the reflectance properties of different materials. We hope the idea of exploiting spatial-temporal relations will give new insights into the research and applications of transient imaging.
An auto-calibration approach for multi-projector 3D display
Shao Tang, Lei Zhang, Yongbing Zhang
For typical multi-projector 3D display systems, precise calibration of the projectors is extremely important: projected images and videos must coincide exactly in the same region of the screen to deliver a high-quality 3D experience. Conventional calibration is achieved by manually adjusting the pose of the projectors with the built-in keystone-correction function, which is imprecise and time-consuming. In this paper, we propose an auto-calibration approach that uses feature detection and matching via an uncalibrated camera to improve both calibration efficiency and precision. The whole procedure can be finished in minutes, and the calibration error hardly increases with the number of projectors. Moreover, to the best of our knowledge, ours is the first fast auto-calibration approach employed in multi-projector 3D display systems.
A large-scale multi-projector glass-free 3D display system
Shao Tang, Lei Zhang, Yongbing Zhang
We present a large-scale, glasses-free 3D display system. The developed prototype consists of a 100-inch display screen and eight synchronized projectors, providing an eight-view glasses-free 3D experience. The synchronization and calibration between projectors are well addressed in this paper. Our system is also designed to be free from the vertical-stripe noise that is a major drawback of many other projector-based 3D systems. Experimental results show that both binocular disparity and motion parallax are well supported. In summary, we provide a feasible solution for large-scale glasses-free 3D cinemas via multiple projectors.
Motion-blurred image restoration based on joint transform correlator
To restore motion-blurred images caused by vibration and attitude variation in remote imaging, an approach based on a joint transform correlator (JTC) is presented. An auxiliary high-speed CCD captures image sequences while the prime CCD is in its exposure period. These image sequences are optically processed by the JTC system, so the image motion vector can be detected effectively and the point spread function modeled accurately in real time, which greatly reduces the complexity of the image restoration algorithm. Finally, a simple restoration algorithm is proposed to restore the blurred image. We have also constructed an image restoration system based on the joint transform correlator. The experimental results show that the proposed method greatly improves image quality.
Image reconstruction in Fizeau interferometry
Yuan-Yuan Ding, Xin-Yang Chen, Chao-Yan Wang
Fizeau interferometry is one of the most important techniques for measuring astronomical objects with high angular resolution. This paper is part of a series dedicated to research on Fizeau interferometry carried out by the research team of Shanghai Astronomical Observatory. It is mainly concerned with the simulation of image restoration for a Y-type telescope and a segmented-mirror telescope. We show that high-resolution images can be obtained using the RL and OS-EM methods.
Image reconstruction in speckle interferometry
Speckle interferometry has been widely used in observational astronomy, especially for binary stars. This paper is part of a series dedicated to the speckle imaging of binary stars carried out by the research team of Shanghai Astronomical Observatory. The observation experiments were carried out on a 1.56-m telescope using a speckle camera, and high-resolution images were reconstructed successfully using speckle interferometry and iterative shift-and-add. To speed up the computation, we also developed reconstruction software based on GPU technology and the CUDA programming model; compared with a CPU-based C++ program, the speedup reaches about 7 times.
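The basic shift-and-add step can be sketched as below: recentre each short-exposure speckle frame on its brightest pixel, then average. This is a minimal sketch; real pipelines add windowing, background subtraction, and sub-pixel registration, and the paper's iterative variant refines the reference iteratively.

```python
import numpy as np

def shift_and_add(frames):
    """Basic shift-and-add: recentre each speckle frame on its brightest
    pixel, then average all frames.  A minimal sketch of the classical
    technique; iterative refinement and sub-pixel alignment are omitted."""
    frames = np.asarray(frames, dtype=float)
    n, h, w = frames.shape
    out = np.zeros((h, w))
    for f in frames:
        y, x = np.unravel_index(np.argmax(f), f.shape)
        # circularly shift the brightest speckle to the frame centre
        out += np.roll(np.roll(f, h // 2 - y, axis=0), w // 2 - x, axis=1)
    return out / n
```

Because the brightest speckle of each frame is moved to the same position before averaging, the diffraction-limited core survives the average instead of being smeared by atmospheric tip-tilt.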
Remote stereoscopic video play platform for naked eyes based on the Android system
As quality of life has improved significantly, traditional 2D video technology can no longer satisfy the desire for better video quality, which has driven the rapid development of 3D video technology. At the same time, people want to watch 3D video on portable devices. To achieve this, we built a remote stereoscopic video play platform consisting of a server and clients. The server transmits video in different formats, and the client receives the remote video for decoding and pixel restructuring. We use and extend Live555 as the video transmission server; Live555 is a cross-platform open-source project that provides streaming-media solutions such as the RTSP protocol and supports transmission of multiple video formats. At the receiving end, we use our laboratory's own Android player, which has all the basic functions of an ordinary player and can play normal 2D video, as the basic structure for redevelopment, and RTSP is implemented in this structure for communication. To achieve stereoscopic display, pixel rearrangement is performed in the player's decoding part, which is native code called through the JNI interface so that video frames can be extracted more efficiently. The video formats we process are left-right, top-bottom, and nine-grid. The design and development employ a number of key technologies from Android application development, including wireless transmission, pixel restructuring, and JNI calls. After updates and optimizations, the video player can play remote 3D video well, anytime and anywhere, and meets users' requirements.
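The pixel-rearrangement step for the left-right format can be sketched as below (the actual player performs this in native code via JNI). The nearest-neighbour stretch is a simplifying assumption; a real player would interpolate, and the final on-screen interleaving pattern depends on the particular autostereoscopic display.

```python
import numpy as np

def split_side_by_side(frame):
    """Split a left-right packed frame into two full-width views.

    Each half is stretched back to full width by pixel repetition
    (nearest neighbour) -- a simplifying assumption; real players would
    interpolate, and the display-specific interleaving that follows is
    not modelled here."""
    h, w = frame.shape[:2]
    left = frame[:, : w // 2]
    right = frame[:, w // 2 :]
    idx = np.arange(w) // 2          # 0,0,1,1,2,2,... column repetition
    return left[:, idx], right[:, idx]
```

The top-bottom format is handled the same way along the row axis, and the nine-grid format generalizes this to a 3×3 tiling of views.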
Study of phase retrieval algorithm from partially coherent light
The goal of phase retrieval is to recover phase information from intensity distributions, an important topic in optics and image processing. The algorithm based on the transport of intensity equation (TIE) needs only to measure the spatial intensity in the central plane and an adjacent plane of the light field, and reconstructs the phase object by solving a second-order differential equation. The algorithm is derived for coherent light fields, whereas a partially coherent light field is more complex to describe: the field at any point in space experiences statistical fluctuations over time. Therefore, traditional TIE algorithms cannot be applied to calculate the phase of a partially coherent light field. In this paper, a phase retrieval algorithm is proposed for partially coherent light fields. First, the description and propagation equation of the partially coherent light field are established. Then, the phase is retrieved by solving the TIE with a Fourier transform. Experimental results with simulated uniform and non-uniform illumination demonstrate the effectiveness of the proposed method for phase retrieval in partially coherent light fields.
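For the coherent, uniform-intensity case the TIE reduces to a Poisson equation, I0 ∇²φ = −k ∂I/∂z, which the Fourier-transform step solves directly. The sketch below covers only this baseline case; the partially coherent generalization proposed in the paper is not reproduced here.

```python
import numpy as np

def tie_phase_uniform(dIdz, I0, k, dx):
    """FFT-based solution of the transport-of-intensity equation for a
    uniformly illuminated coherent field:  I0 * laplacian(phi) = -k * dI/dz.

    `dIdz` is the axial intensity derivative (e.g. a finite difference of
    two measured planes), `I0` the uniform intensity, `k` the wavenumber,
    `dx` the pixel pitch.  Minimal baseline sketch only."""
    h, w = dIdz.shape
    fy = np.fft.fftfreq(h, d=dx) * 2 * np.pi
    fx = np.fft.fftfreq(w, d=dx) * 2 * np.pi
    k2 = fy[:, None] ** 2 + fx[None, :] ** 2
    k2[0, 0] = 1.0                       # avoid division by zero at DC
    rhs = -k * dIdz / I0                 # equals laplacian(phi)
    phi_hat = np.fft.fft2(rhs) / (-k2)   # inverse Laplacian in Fourier space
    phi_hat[0, 0] = 0.0                  # fix the free constant: zero-mean phase
    return np.real(np.fft.ifft2(phi_hat))
```

Feeding in the analytic ∂I/∂z produced by a known sinusoidal phase recovers that phase exactly, which is a convenient unit test for the solver.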
Application of time-resolved glucose concentration photoacoustic signals based on an improved wavelet denoising
Zhong Ren, Guodong Liu, Zhen Huang
Real-time monitoring of blood glucose concentration (BGC) is an important procedure in controlling diabetes mellitus and preventing complications in diabetic patients. Noninvasive measurement of BGC has become a research hotspot because it avoids physical and psychological harm. Photoacoustic spectroscopy is a well-established, hybrid, and alternative technique for determining BGC. According to the theory of the photoacoustic technique, the blood is irradiated by a pulsed laser with nanosecond repetition time and microjoule pulse energy; photoacoustic signals containing the BGC information are generated by the thermoelastic mechanism, and the BGC level can then be interpreted from the photoacoustic signal via data analysis. In practice, however, the time-resolved photoacoustic signals of BGC are polluted by various noises, e.g., background sound interference and the multi-component nature of blood, and the quality of the photoacoustic signal directly impacts the precision of BGC measurement. Therefore, an improved wavelet denoising method is proposed to eliminate the noise in BGC photoacoustic signals. To overcome the shortcomings of traditional wavelet threshold denoising, an improved dual-threshold wavelet function is proposed in this paper. Simulation results illustrate that the denoising performance of the improved wavelet method is better than that of the traditional soft and hard threshold functions. To verify the feasibility of the improved function, actual photoacoustic BGC signals were tested; the results demonstrate that the signal-to-noise ratio (SNR) of the improved function increases by about 40-80%, and its root-mean-square error (RMSE) decreases by about 38.7-52.8%.
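The traditional soft and hard thresholding that the paper improves on can be sketched with a one-level Haar transform, as below. This is a baseline sketch only; the paper's dual-threshold function itself is not reproduced, and practical denoisers use multi-level decompositions with smoother wavelets.

```python
import numpy as np

def haar_denoise(signal, thresh, mode="soft"):
    """One-level Haar wavelet denoising with classical soft or hard
    thresholding of the detail coefficients.  A minimal baseline sketch;
    the paper's improved dual-threshold function is not reproduced."""
    x = np.asarray(signal, dtype=float)
    n = len(x) - len(x) % 2                  # drop a trailing odd sample
    a = (x[0:n:2] + x[1:n:2]) / np.sqrt(2)   # approximation coefficients
    d = (x[0:n:2] - x[1:n:2]) / np.sqrt(2)   # detail coefficients
    if mode == "soft":
        d = np.sign(d) * np.maximum(np.abs(d) - thresh, 0.0)
    else:                                    # hard threshold
        d = np.where(np.abs(d) > thresh, d, 0.0)
    out = np.empty(n)                        # inverse Haar transform
    out[0::2] = (a + d) / np.sqrt(2)
    out[1::2] = (a - d) / np.sqrt(2)
    return out
```

With a zero threshold the transform round-trips the signal exactly, which is a quick correctness check before comparing soft, hard, or dual-threshold variants on real photoacoustic traces.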
High-emulation mask recognition with high-resolution hyperspectral video capture system
Jiao Feng, Xiaojing Fang, Shoufeng Li, et al.
We present a method for distinguishing a human face from a high-emulation mask, which is increasingly used by criminals for activities such as stealing card numbers and passwords at ATMs. Traditional facial recognition techniques have difficulty detecting such camouflaged criminals. In this paper, we use a high-resolution hyperspectral video capture system to detect high-emulation masks. An RGB camera is used for traditional facial recognition, while a prism and a grayscale camera capture spectral information of the observed face. Experiments show that a mask made of silica gel has a different spectral reflectance from human skin. Because the multispectral image offers additional spectral information about physical characteristics, a high-emulation mask can be easily recognized.
Image restoration based on wavelets and curvelet
Yang Yang, Bo Chen
The performance of high-resolution imaging with large optical instruments is severely limited by atmospheric turbulence. Adaptive optics (AO) offers real-time compensation for turbulence. However, the correction is often only partial, and image restoration is required to reach or approach the diffraction limit. Wavelet-based techniques have been applied to restoring atmospheric-turbulence-degraded images. However, wavelets do not restore long edges with high fidelity, while curvelets are challenged by small features. Loosely speaking, each transform has its own area of expertise, and this complementarity may be of great potential, so we expect that combining different transforms can improve the quality of the result. In this paper, we present a novel deconvolution algorithm based on both the wavelet transform and the curvelet transform (NDbWC), extending previous results obtained for wavelet-based image restoration. Using these two transforms in the same algorithm allows us to optimally detect, at the same time, isotropic features, well represented by the wavelet transform, and edges, better represented by the curvelet transform. The NDbWC algorithm outperforms the classical wavelet-regularization method in the deconvolution of turbulence-degraded images with low SNR.
Accurate 3D reconstruction using multi-phase ToF camera
The depth quality of a time-of-flight (ToF) camera is influenced by many systematic and non-systematic errors. In this paper we present a simple method to correct and reduce these errors and propose a multi-phase approach to improve depth acquisition accuracy. Compared with traditional calibration methods, we take the position of the light source into account and calibrate the light source together with the camera to reduce depth distortion. To ameliorate sensor errors caused in the manufacturing process, a look-up table (LUT) is used to correct pixel-related errors. In addition, we capture images with multiple phases and apply an FFT to obtain the true depth. With the proposed approach, we are able to reconstruct an accurate 3D model with an RMSE of the measured depth below 1.2 mm.
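The multi-phase FFT step can be sketched as below for a single pixel. The sample convention (correlation values cos(2πk/N − φ) for reference phase offsets 2πk/N) is an assumption, and the sensor-specific LUT and light-source calibration from the paper are omitted.

```python
import numpy as np

C = 299_792_458.0  # speed of light, m/s

def tof_depth(samples, f_mod):
    """Recover depth from N equally spaced phase samples of one ToF pixel.

    Assumes the samples are correlation values cos(2*pi*k/N - phi) for
    reference offsets 2*pi*k/N (a convention assumption).  The first FFT
    bin then carries the phase shift phi, and
    depth = c * phi / (4 * pi * f_mod).  Sketch only: the paper's LUT
    correction and light-source calibration are not modelled."""
    spectrum = np.fft.fft(np.asarray(samples, dtype=float))
    phase = (-np.angle(spectrum[1])) % (2 * np.pi)   # phase of bin 1
    return C * phase / (4 * np.pi * f_mod)
```

Using more than the classical four samples averages down harmonic and noise contributions in the correlation waveform, which is the appeal of the multi-phase acquisition.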
Excitation of topological insulator plasmons by two-dimensional periodic structure
Xiao Feng Gu, Wei Bing Lu, Xiao Bing Li, et al.
Plasmons induced by topological insulator (TI) Bi2Se3 micro-ribbon arrays were recently observed experimentally (Nature Nanotechnology 2013, 8, 556-560). In this letter, the surface plasmons excited by TI Bi2Se3 micro-disk arrays are investigated by full-wave numerical simulation. The results show that thin Bi2Se3 micro-disk arrays can support dipolar plasmon resonances in the terahertz (THz) regime, and the absorption can be tuned by the structure parameters. In addition to the plasmon mode, two phonon-mode responses are also observed, which confirms the experimental results for micro-ribbon arrays. Our work further proves that TIs can be a good candidate for a plasmonic platform.
Improvement on the polynomial modeling of digital camera colorimetric characterization
Xiaoqiao Huang, Hongfei Yu, Junsheng Shi, et al.
The digital camera has become a requisite of daily life and is essential in imaging applications, so obtaining accurate colors from a digital camera is important. The colorimetric characterization of a digital camera is the basis of image reproduction and the color management process. One traditional method for deriving a colorimetric mapping between camera RGB signals and the CIEXYZ tristimulus values is polynomial modeling with a 3×11 polynomial transfer matrix. In this paper, an improved polynomial modeling is presented, in which normalized luminance replaces the camera's inherent RGB values in the traditional polynomial modeling. The improved modeling can be described as a two-stage model. In the first stage, the relationship between camera RGB values and normalized luminance, measured on the six gray patches of the X-rite ColorChecker 24-color chart, is described as a "gamma" curve, and camera RGB values are converted into normalized luminance using this gamma. In the second stage, the traditional polynomial modeling is applied as a colorimetric mapping between normalized luminance and CIEXYZ. Moreover, the method can be used under a daylight lighting environment, so users who cannot measure the CIEXYZ of the color target chart with professional instruments can still accomplish the colorimetric characterization of a digital camera. The experimental results show that (1) the proposed method performs better than traditional polynomial modeling, and (2) it is a feasible approach to colorimetric characterization under a daylight environment without professional instruments, with results that satisfy the requirements of simple applications.
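The traditional 3×11 baseline that the paper improves on can be sketched as a least-squares fit, as below. The particular 11-term expansion (R, G, B, their pairwise and triple products, squares, and a constant) is a common choice and is assumed here, since the abstract does not list the paper's exact terms.

```python
import numpy as np

def poly11_terms(rgb):
    """Expand an (N, 3) array of RGB values into 11 polynomial terms:
    R, G, B, RG, RB, GB, R^2, G^2, B^2, RGB, 1.  This particular term
    set is an assumption; 3x11 models in the literature vary slightly."""
    r, g, b = rgb[:, 0], rgb[:, 1], rgb[:, 2]
    return np.stack([r, g, b, r * g, r * b, g * b,
                     r * r, g * g, b * b, r * g * b,
                     np.ones_like(r)], axis=1)

def fit_characterization(rgb, xyz):
    """Least-squares fit of the 11x3 matrix M with XYZ ~ terms @ M,
    from training patches (e.g. a 24-patch chart).  Baseline sketch of
    the traditional method; the paper's improvement feeds
    gamma-linearized normalized luminance in place of raw RGB."""
    M, *_ = np.linalg.lstsq(poly11_terms(rgb), xyz, rcond=None)
    return M

def apply_characterization(M, rgb):
    """Map camera RGB to estimated CIEXYZ with a fitted matrix."""
    return poly11_terms(rgb) @ M
```

The paper's two-stage variant would insert a per-channel gamma linearization before `poly11_terms`, fitted on the six gray patches, and then run the same least-squares machinery on the normalized luminance values.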