Proceedings Volume 10988

Automatic Target Recognition XXIX


Purchase the printed version of this volume at proceedings.com or access the digital version at SPIE Digital Library.

Volume Details

Date Published: 26 July 2019
Contents: 9 Sessions, 31 Papers, 28 Presentations
Conference: SPIE Defense + Commercial Sensing 2019
Volume Number: 10988

Table of Contents

All links to SPIE Proceedings will open in the SPIE Digital Library.
  • Front Matter: Volume 10988
  • Advanced Algorithms in ATR I
  • Advances in Machine Learning for ATR I
  • Advanced Algorithms in ATR II
  • Advanced Algorithms in ATR III
  • Advances in Machine Learning for ATR II
  • Advanced Algorithms in ATR IV
  • Advanced Algorithms in Remote Sensing I
  • Advanced Algorithms in Remote Sensing II
Front Matter: Volume 10988
Front Matter: Volume 10988
This PDF file contains the front matter associated with SPIE Proceedings Volume 10988, including the Title Page, Copyright Information, Table of Contents, Author and Conference Committee lists.
Advanced Algorithms in ATR I
HySARNet: a hybrid machine learning approach to synthetic aperture radar automatic target recognition
Automatic Target Recognition (ATR) in Synthetic Aperture Radar (SAR) for wide-area search is a difficult problem for both classic techniques and state-of-the-art approaches. Deep Learning (DL) techniques have been shown to be effective at detection and classification; however, they require significant amounts of training data. Sliding window detectors with Convolutional Neural Network (CNN) backbones for classification typically suffer from localization error, poor compute efficiency, and the need to be tuned to the size of the target. Our approach to the wide-area search problem is an architecture that combines classic ATR techniques with a ResNet-18 backbone. The detector is dual-stage and consists of an optimized Constant False Alarm Rate (CFAR) screener and a Bayesian Neural Network (BNN) detector, which provides a significant speed advantage over standard sliding window approaches. It also reduces false alarms while maintaining a high detection rate, allowing the classifier to run on fewer detections and improving processing speed. This paper tests the BNN and CNN components of HySARNet through experiments that determine their robustness to variations in graze angle, resolution, and additive noise. Synthetic targets are also used for training the CNN; synthetic data has the potential to enable training on hard-to-find targets for which little or no data exists. SAR simulation software and 3D CAD models are used to generate the synthetic targets. This paper uses the Moving and Stationary Target Acquisition and Recognition (MSTAR) dataset, the widely used standard data set for SAR ATR publications.
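As a hedged illustration of the first screening stage, a minimal cell-averaging CFAR detector over a SAR intensity image might look like the following sketch (the window sizes, false-alarm rate, and exponential-clutter threshold rule are assumptions, not details from the paper):

    import numpy as np
    from scipy.ndimage import uniform_filter

    def ca_cfar(image, guard=4, train=8, pfa=1e-4):
        """Cell-averaging CFAR screener: flag pixels whose intensity exceeds
        a threshold estimated from the surrounding ring of training cells."""
        image = np.asarray(image, dtype=float)
        outer = 2 * (guard + train) + 1              # full window width
        inner = 2 * guard + 1                        # guard window width
        sum_outer = uniform_filter(image, outer) * outer ** 2
        sum_inner = uniform_filter(image, inner) * inner ** 2
        n_train = outer ** 2 - inner ** 2            # number of training cells
        noise = (sum_outer - sum_inner) / n_train    # local clutter estimate
        alpha = n_train * (pfa ** (-1.0 / n_train) - 1.0)  # CA-CFAR factor
        return image > alpha * noise

The boolean mask this returns would then be the set of candidate pixels handed to the second-stage detector and classifier.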
Shape-based ATR for wide-area processing of satellite imagery
Stephen P. DelMarco, Victor Tom, Helen Webb, et al.
While Machine Learning (ML) Automatic Target Recognition (ATR) represents the state of the art in target recognition, model-based ATR plays a valuable role. Model-based ATR complements machine learning ATR approaches by filling a near-term niche. While explainable Artificial Intelligence (AI) is not yet fully realized, model-based ATR serves to validate machine learning recognition decisions, and thus instills confidence in ML target calls. Alternatively, model-based ATR can act as a stand-alone ATR component, particularly in scenarios in which a small number of targets are of interest, e.g., “target-of-the-day” engagements. Model-based ATR approaches need no training data, and thus provide an alternative to machine learning approaches in the absence of sufficient quantities of real, or sufficiently high-fidelity synthetic, training data. In this paper, we present an approach to model-based ATR, called Shape-Based ATR (SB-ATR), which captures salient target shape information for recognizing targets in wide-area satellite imagery. SB-ATR finds the right blend of coarse 3-D target shape abstraction and target realism to provide robustness against target variations and environmental operating conditions, while simultaneously providing high-performance target recognition. The approach uses newer, robust forms of image correlation for matching a predicted target shape against the image. Shape prediction searches over target pose, and uses satellite metadata and solar geometry to generate realistic target shape and shadow predictions. The correlation matchers provide tolerance to illumination variations, moderate occlusions, image distortions and noise, and geometric differences between models and real targets. We present technical details of the shape-based approach, and provide numerical target recognition results on real-world satellite imagery demonstrating performance.
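The paper's robust correlation matchers are more sophisticated than plain correlation; purely as an orientation aid, the basic normalized cross-correlation between a predicted shape template and an image, the building block such matchers generalize, can be sketched as follows (all names here are illustrative):

    import numpy as np

    def normalized_xcorr(image, template):
        """Slide a template over an image and return the normalized
        cross-correlation surface (values in [-1, 1])."""
        th, tw = template.shape
        t = template - template.mean()
        t_norm = np.sqrt((t ** 2).sum())
        out = np.zeros((image.shape[0] - th + 1, image.shape[1] - tw + 1))
        for i in range(out.shape[0]):
            for j in range(out.shape[1]):
                w = image[i:i + th, j:j + tw]
                w = w - w.mean()
                denom = np.sqrt((w ** 2).sum()) * t_norm
                out[i, j] = (w * t).sum() / denom if denom > 0 else 0.0
        return out

The peak of the returned surface gives the best match location; the mean subtraction and normalization are what give tolerance to illumination offset and gain.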
Discrimination of forests and man-made targets in SAR images based on spectrum analysis
Bin Zou, Weike Li, Yu Xin, et al.
In SAR images, forests and man-made targets share similar scattering power and scattering mechanisms, and they present similar roughness or complexity at the local scale of the image. However, image and scattering analysis show that the forest canopy exhibits varying degrees of small-scale pixel-value change. In this paper, spectrum analysis is employed to construct a novel measure, named the Modified Spectrum Power, that extracts these differences and discriminates forests from man-made targets. A Fourier transform is applied to acquire the frequency matrix, which represents the small-scale spectrum of the pixels, and a weight matrix modifies the amplitudes of the different frequency components, enhancing the high-frequency components and weakening the low-frequency ones. The sum of all elements in the modified frequency matrix is defined as the Modified Spectrum Power. Based on this measure, forests can be discriminated from other targets with high accuracy, and man-made targets can then be discriminated in turn. Experiments validate the ability of the Modified Spectrum Power to discriminate between forests and man-made targets.
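From the description above, a minimal sketch of computing such a statistic on a small patch might look as follows (the radial-ramp weight matrix is an assumption; the abstract does not specify the actual weights):

    import numpy as np

    def modified_spectrum_power(patch):
        """Weighted sum of FFT magnitudes over a small image patch,
        emphasizing high frequencies and suppressing low ones."""
        f = np.abs(np.fft.fftshift(np.fft.fft2(patch)))
        h, w = patch.shape
        yy, xx = np.mgrid[:h, :w]
        cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
        radius = np.hypot(yy - cy, xx - cx)
        weight = radius / radius.max()   # 0 at DC, 1 at the highest frequency
        return (weight * f).sum()

Under this statistic, forest-canopy patches with strong small-scale pixel variation should score higher than smooth man-made surfaces.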
Advances in Machine Learning for ATR I
Explainable automatic target recognition (XATR)
Sundip R. Desai, Nhat X. Nguyen, Moses W. Chan
An explainable automatic target recognition (XATR) algorithm with part-based representation of 2D and 3D objects is presented. The algorithm employs a two-phase approach. In the first phase, a collection of Convolutional Neural Networks (CNNs) recognizes major parts of these objects, also known as the vocabulary. A Markov Logic Network (MLN) and a structure learning mechanism are then used to learn the geometric and spatial relationships between the parts in the vocabulary that best describe the objects. The resultant network offers three unique features: 1) the inference results are explainable, with qualitative information involving the vocabulary that makes up the object; 2) the part-based approach achieves robust recognition performance in cases of partially occluded objects or images of objects hidden under canopy; and 3) different object representations can be created by varying the vocabulary and permuting learned relationships.
A comparison of target detection algorithms using DSIAC ATR algorithm development data set
In this paper, we present preliminary results of infrared target detection with the well-known Faster R-CNN network, using a publicly available MWIR data set released by NVESD. We characterize the difficulty level of the images in terms of pixels on target (POT) and local contrast. We then evaluate the performance of the network under challenging conditions and as the number of training images is varied.
Fundamentals of target classification using deep learning
In this paper we examine the application of deep learning to automated target recognition (ATR) using a shallow convolutional neural network (CNN) and infrared images from a public-domain data set provided by the US Army Night Vision Laboratories. This study is motivated by the need for a high detection rate and a low false alarm rate when searching for targets in sensor imagery. The goal of this study was to determine a range of optimal thresholds at which to classify an image as a target using a CNN, and an upper bound on the number of training images required for optimal performance. We used a Difference of Gaussian (DoG) kernel to localize targets by detecting the brightest patches in an image and used these patches as testing data for our network. Our CNN was successful in distinguishing between targets and clutter, and results produced by our approach compared favorably with ground truth.
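A hedged sketch of the DoG localization step described above (the kernel widths, patch size, and candidate count are assumptions):

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def dog_patches(image, sigma1=2.0, sigma2=4.0, patch=32, n=10):
        """Localize candidate targets as the brightest Difference-of-Gaussian
        responses and return fixed-size patches around them."""
        image = np.asarray(image, dtype=float)
        dog = gaussian_filter(image, sigma1) - gaussian_filter(image, sigma2)
        order = np.argsort(dog.ravel())[::-1]    # brightest responses first
        half, centers = patch // 2, []
        for idx in order:
            y, x = np.unravel_index(idx, dog.shape)
            if not (half <= y < image.shape[0] - half and
                    half <= x < image.shape[1] - half):
                continue
            if any(abs(y - cy) < patch and abs(x - cx) < patch
                   for cy, cx in centers):
                continue                          # crude non-maximum suppression
            centers.append((y, x))
            if len(centers) == n:
                break
        return [image[y - half:y + half, x - half:x + half] for y, x in centers]

The returned patches would then be classified as target or clutter by the CNN.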
Simple linear regression model based data clustering
KMeans is one of the most popular algorithms in data mining (ranked number 2) and has been widely used in many fields. KMeans uses the Euclidean distance to compare two data points. However, the Euclidean distance is sensitive to linear transforms introduced in the data collection process. Because of these linear transforms, the distance between two data points of the same class (the intra-class distance) may be larger than the distance between points of different classes (the inter-class distance), which can degrade KMeans clustering performance. In this paper, we propose a simple linear regression approach to data clustering. Instead of using the Euclidean distance to measure dissimilarity, we recommend using the goodness of fit (or normalized cross-correlation) to measure similarity when comparing two data points. Using this comparison technique, we introduce a linear regression approach to data clustering and demonstrate that the proposed method achieves higher performance at lower computational cost than KMeans.
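A hedged sketch of the proposed comparison rule inside a KMeans-style loop (the initialization and iteration count are assumptions, and the paper's exact regression formulation may differ):

    import numpy as np

    def ncc(a, b):
        """Goodness-of-fit similarity: normalized cross-correlation, which
        is invariant to the linear transform b -> s*b + t that distorts
        Euclidean distance."""
        a, b = a - a.mean(), b - b.mean()
        d = np.linalg.norm(a) * np.linalg.norm(b)
        return (a @ b) / d if d > 0 else 0.0

    def ncc_kmeans(X, k, iters=50, seed=0):
        """KMeans-style clustering with NCC similarity replacing Euclidean
        distance in the assignment step."""
        X = np.asarray(X, dtype=float)
        rng = np.random.default_rng(seed)
        centers = X[rng.choice(len(X), k, replace=False)]
        for _ in range(iters):
            labels = np.array([np.argmax([ncc(x, c) for c in centers])
                               for x in X])
            for j in range(k):
                if np.any(labels == j):
                    centers[j] = X[labels == j].mean(axis=0)
        return labels, centers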
Advanced Algorithms in ATR II
Fast and robust detection of oil palm trees using high-resolution remote sensing images
Oil palm tree detection is of great significance for improving irrigation, estimating palm oil yield, and predicting expansion trends. Existing tree detection methods include traditional image processing, machine learning methods, and sliding-window-based deep learning methods. In this paper, we propose a deep learning based end-to-end method for large-scale oil palm detection. First, we built an oil palm sample dataset from 0.1 m resolution Unmanned Aerial Vehicle (UAV) images. Second, we implemented five state-of-the-art object detection algorithms (Faster-RCNN, VGG-SSD, YOLO-v3, RetinaNet, and Mobilenet-SSD) and evaluated their performance in detecting the crown size and location of oil palms. Moreover, we designed an overlapping partition method to improve the detection results on UAV images of over 40,000 × 40,000 pixels. Experimental results demonstrate that, in terms of detection accuracy, VGG-SSD achieves the best accuracy of 90.91% on the validation dataset, followed by YOLO-v3, RetinaNet, Mobilenet-SSD, and Faster-RCNN. We also compared the detection times of the five algorithms: Mobilenet-SSD is the fastest (12.81 ms per 500 × 500 pixel image), with speedup ratios of 17.5×, 10.2×, 4.51×, and 17.33× over Faster-RCNN, VGG-SSD, YOLO-v3, and RetinaNet, respectively. The results show that our proposed oil palm detection method is of great practical value to precision agriculture in the oil palm industry.
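The overlapping partition idea can be sketched as follows (the tile size and overlap below are assumptions; the abstract does not state the exact values): tile the very large image with overlap so that trees cut by one tile boundary appear whole in a neighboring tile, then merge detections.

    def tiles(width, height, tile=500, overlap=100):
        """Yield (x0, y0, x1, y1) windows covering a very large image,
        with 'overlap' pixels shared between adjacent tiles."""
        step = tile - overlap
        for y0 in range(0, max(height - overlap, 1), step):
            for x0 in range(0, max(width - overlap, 1), step):
                yield x0, y0, min(x0 + tile, width), min(y0 + tile, height)

    # Per-tile detections are shifted back to global coordinates and
    # duplicates in the overlap zones removed by non-maximum suppression.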
Semantic segmentation based large-scale oil palm plantation detection using high-resolution satellite images
Detecting oil palm plantations from high-resolution satellite images can provide the necessary information for palm oil production estimation and oil palm plantation layout planning. In this paper, we propose a novel semantic segmentation based approach for large-scale oil palm plantation detection using QuickBird images and Google Earth images (at 0.6 m spatial resolution) in Malaysia. We manually labeled a dataset for pixel-wise semantic segmentation into four categories: oil palm plantation, other vegetation, impervious surface/cloud, and others (e.g., water and uncertain pixels). We present an end-to-end deep convolutional neural network (DCNN) for semantic segmentation, followed by fully connected conditional random fields (CRF), and apply an ensemble learning method to improve boundary localization. The overall accuracy and mean IoU of our proposed approach in the test regions are 95.27% and 88.46%, substantially better than the results of three common semantic segmentation methods and a patch-based CNN method.
Advanced Algorithms in ATR III
Design of adversarial targets: fooling deep ATR systems
Deep Convolutional Neural Networks (DCNNs) have proven to be an exceptional tool for object recognition in various computer vision applications. However, recent findings have shown that such state-of-the-art models can be easily deceived by inserting slight, imperceptible perturbations at key pixels of the input image. In this paper, we focus on deceiving Automatic Target Recognition (ATR) classifiers. These classifiers are built to recognize specified targets in a scene and simultaneously identify their class types. In our work, we explore the vulnerabilities of DCNN-based target classifiers. We demonstrate significant progress in developing infrared adversarial targets by adding small perturbations to the input image such that the perturbation cannot be easily detected. The algorithm is built to support both targeted and non-targeted adversarial attacks. Our findings reveal promising results that reflect serious implications of adversarial attacks.
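The abstract does not spell out the attack algorithm; the classic fast gradient sign method (FGSM) conveys the flavor of an imperceptible, targeted-or-untargeted perturbation, and is shown here only as a generic sketch, not the authors' method:

    import torch
    import torch.nn.functional as F

    def fgsm(model, x, label, eps=0.01, targeted=False):
        """One-step gradient-sign perturbation bounded by eps. For a
        targeted attack, 'label' is the class to push toward; otherwise
        it is the true class whose loss we increase."""
        x = x.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(x), label)
        loss.backward()
        step = -eps if targeted else eps   # descend toward target, else ascend
        return (x + step * x.grad.sign()).clamp(0, 1).detach()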
Comparing classifiers that exploit random subspaces
Jamie Gantert, David Gray, Don Hulsey, et al.
Many current classification models, such as Random Kitchen Sinks and Extreme Learning Machines (ELM), minimize the need for expert-defined features by transforming the measurement space into a set of "features" via random functions or projections. Alternatively, Random Forests exploit random subspaces by limiting tree partitions (i.e., nodes of the tree) to be selected from randomly generated subsets of features. For a synthetic aperture radar classification task, and given two orthonormal measurement representations (spatial and multi-scale Haar wavelet), this work compares and contrasts ELM and Random Forest classifier performance as a function of (a) input measurement representation, (b) classifier complexity, and (c) measurement domain mismatch. For the ELM classifier, we also compare two random projection encodings.
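For readers unfamiliar with the ELM family compared here, its core is small enough to sketch (the hidden-layer size and tanh activation are assumptions):

    import numpy as np

    def elm_fit(X, Y, n_hidden=500, seed=0):
        """Extreme Learning Machine: a random, fixed hidden projection;
        only the output weights are solved, by least squares."""
        rng = np.random.default_rng(seed)
        W = rng.standard_normal((X.shape[1], n_hidden))
        b = rng.standard_normal(n_hidden)
        H = np.tanh(X @ W + b)                        # random feature expansion
        beta, *_ = np.linalg.lstsq(H, Y, rcond=None)  # trained output layer
        return W, b, beta

    def elm_predict(X, W, b, beta):
        return np.tanh(X @ W + b) @ beta

For classification, Y is typically one-hot encoded and the predicted class is the argmax of the output.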
Radar target recognition using wavelet-based features extracted from compressively sensed signatures
This paper addresses the loss (if any) in radar target recognition performance when features are extracted directly in the compressive domain rather than in the classical (Nyquist-rate) domain. Specifically, it examines the impact on recognition performance of extracting wavelet features from compressively sampled signatures. Two comparison schemes are considered: 1) signal reconstruction after compressive sampling followed by wavelet decomposition, and 2) wavelet decomposition applied directly to compressively sampled signatures using the compressive-domain equivalent of the discrete wavelet transform. The comparisons use real radar signatures collected in a compact range and include various additive noise and azimuth ambiguity scenarios.
Advances in Machine Learning for ATR II
On generalization of deep learning recognizers in overhead imagery
In many applications, access to large quantities of labeled data is prohibitive due to its cost or lack of access to classes of interest. This problem is exacerbated for specific subclasses and data types that are not easily accessible, such as remote sensing data. The problem of limited data for specific classes is referred to as the low-shot or few-shot problem. Typically in the low-shot setting, a wealth of data from a source domain is leveraged to train a convolutional feature extractor that is then applied to a target domain in innovative ways. In this work we apply this framework to both the low-shot and the fully sampled problem, using the convolutional neural network as a feature extractor paired with an alternate classifier. We evaluate the benefits of this approach in two contexts: a baseline problem and limited training data. Additionally, we investigate the impact of loss function selection and of sequestering low-shot data on the classification performance of this approach. We present an application of these techniques to the recent public xView dataset.
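A minimal sketch of the pairing described above, a frozen convolutional feature extractor feeding an alternate classifier (the ResNet-50 backbone and SVM choice are assumptions, not the paper's exact configuration):

    import torch
    import torchvision.models as models
    from sklearn.svm import SVC

    # Frozen ImageNet-pretrained trunk as a feature extractor (head removed);
    # uses the torchvision >= 0.13 weights API.
    backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
    backbone.fc = torch.nn.Identity()
    backbone.eval()

    @torch.no_grad()
    def features(batch):            # batch: (N, 3, 224, 224), normalized
        return backbone(batch).cpu().numpy()

    # The alternate classifier is then fit on the extracted features:
    # clf = SVC().fit(features(train_images), train_labels)
    # predictions = clf.predict(features(test_images))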
Automatic machine learning for target recognition
Automatic Target Recognition (ATR) seeks to improve upon techniques from signal processing, pattern recognition (PR), and information fusion. Currently, there is interest in extending traditional ATR methods by employing Artificial Intelligence (AI) and Machine Learning (ML). In support of current opportunities, the paper discusses a methodology entitled Systems Experimentation efficiency effectiveness Evaluation Networks (SEeeEN). ATR differs from PR in that ATR is a system deployment leveraging pattern recognition in a networked environment for mission decision making, while PR/ML is a statistical representation of patterns for classification. ATR analysis has long been part of the COMPrehensive Assessment of Sensor Exploitation (COMPASE) Center, utilizing measures of performance (e.g., efficiency) and measures of effectiveness (e.g., robustness) for ATR evaluation. The paper highlights available multimodal data sets for Automated ML Target Recognition (AMLTR).
Multisource deep learning for situation awareness
The resurgence of interest in artificial intelligence (AI) stems from impressive deep learning (DL) performance, such as hierarchical supervised training using a Convolutional Neural Network (CNN). Current DL methods should provide contextual reasoning, explainable results, and repeatable understanding, all of which require evaluation methods. This paper discusses DL techniques using multimodal (or multisource) information that extend measures of performance (MOP). Examples of joint multimodal learning include imagery and text, video and radar, and other common sensor types. The challenges of joint multimodal learning strain many current methods, and care is needed when applying machine learning methods. Results from Deep Multimodal Image Fusion (DMIF) using electro-optical and infrared data demonstrate performance modeling based on distance, to better understand DL robustness and quality in providing situation awareness.
Characterization of CNN classifier performance with respect to variation in optical contrast, using synthetic electro-optical data
Christopher Menart, Colin Leong, Olga Mendoza-Schrock, et al.
Deep neural networks demonstrate high performance at classifying high-dimensional signals, but often fail to generalize to data that differ from the data they were trained on. In this paper, we investigate the resilience of convolutional neural networks (CNNs) to unforeseen operating conditions. Specifically, we empirically evaluate the ability of CNN models to generalize across changes in image contrast. Multiple models are trained on electro-optical (EO) or near-infrared (NIR) data, and are evaluated in environments with degraded contrast compared to training. Experiments are replicated across varying architectures, including state-of-the-art classification models such as ResNet-152, and across both synthetic and measured datasets. In comparison to models trained and evaluated on identically distributed data, these models can generalize well when contrast invariance is built up through data augmentation. Future work will investigate CNN ability to generalize to other changes in operating conditions.
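A hedged sketch of contrast augmentation of the kind the paper finds effective (the factor range is an assumption, and images are assumed scaled to [0, 1]):

    import numpy as np

    def random_contrast(image, low=0.3, high=1.0, rng=np.random):
        """Rescale contrast about the image mean by a random factor,
        exposing the model to degraded-contrast conditions in training."""
        c = rng.uniform(low, high)
        mean = image.mean()
        return np.clip(mean + c * (image - mean), 0.0, 1.0)

    # Typically applied per sample inside the training data loader.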
Advanced Algorithms in ATR IV
Fast and accurate target detection in overhead imagery using double convolution neural networks
We propose a Double Convolutional Neural Network (D2CNN) framework for automatic target detection. D2CNN achieves high speed and high positional accuracy on our high-altitude imagery dataset. Translation invariance in a convolutional neural network (CNN) is a double-edged sword: a CNN with large translation invariance is fast but loses positional accuracy, which is critical for automatic target detection, while a CNN with small translation invariance can achieve high positional accuracy at the expense of speed. In a typical target detection case, targets are very sparse. Our D2CNN framework therefore employs two separate CNNs. The first, with large translation invariance, generates region proposals. The second, with small translation invariance, detects targets with high positional accuracy in the regions proposed by the first. The two CNNs are trained separately using different training strategies. Training examples are shared between the two CNNs, but the data augmentation differs: for the first CNN, an object is placed at various locations (e.g., the center, the lower-left portion, the upper-right portion); for the second CNN, an object is always centered. We fine-tune the hyper-parameters of the pooling and convolution layers to increase translation invariance for the first CNN and decrease it for the second.
Neural network classification of degraded imagery using soft labels: towards human-level performance with “accurate” likelihoods?
The vast majority of recent progress in deep learning for computer vision has been demonstrated on problems with low irreducible error rates. While it is natural to use hard "one-hot" encoded training labels for such problems, this may not be appropriate in applications with large irreducible error, including classification problems on severely degraded or class-ambiguous imagery. Furthermore, databases consisting primarily of degraded examples are difficult to diagnose in terms of assuring that non-causal statistical correlations across the training and test sets do not exist for certain classes. Expert image analysts, however, are typically well-regularized and will not overfit to such correlations. In this work, soft labels are applied to a surrogate problem with large irreducible error, where the labels are generated by an ensemble of networks serving as a proxy for human expert labelers. Results of networks trained on these soft targets are compared against their one-hot counterparts. A concept for an "imagination mechanism" in neural networks trained on soft labels is also introduced.
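Training on soft labels amounts to minimizing cross-entropy against a full target distribution rather than a one-hot vector; a minimal sketch (names here are assumptions):

    import torch
    import torch.nn.functional as F

    def soft_label_loss(logits, soft_targets):
        """Cross-entropy H(p, q) between soft targets p (e.g., the averaged
        outputs of an ensemble of proxy labelers) and model predictions q."""
        log_q = F.log_softmax(logits, dim=1)
        return -(soft_targets * log_q).sum(dim=1).mean()

    # With one-hot targets this reduces to the standard cross-entropy loss.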
Manifestly positive series approximation to probability densities
When one expands a probability density in a series and truncates the series, the result is generally not a manifestly positive density. Such is the case, for example, in the classical Edgeworth and Gram-Charlier series. In contrast, in quantum mechanics, approximation methods always retain the manifestly positive aspect of a probability density. We explore this fundamental difference and attempt to modify standard probability theory using the methods of quantum mechanics so that expansions result in a manifestly positive probability density.
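The quantum-mechanical device alluded to above can be stated in one line: expand an amplitude rather than the density itself, so positivity survives truncation at any order. A sketch of the idea (the choice of orthonormal basis, e.g., Hermite functions, is an assumption):

    p_N(x) = \Big| \sum_{n=0}^{N} c_n \, \varphi_n(x) \Big|^2 \ge 0,
    \qquad
    \int p_N(x) \, dx = \sum_{n=0}^{N} |c_n|^2

where the \varphi_n are orthonormal basis functions; by contrast, a truncated Edgeworth or Gram-Charlier expansion of p(x) itself carries no such positivity guarantee and can go negative in the tails.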
Physically realizable adversarial examples for convolutional object detection algorithms
David R. Chambers, H. Abe Garza
In our work, we make two primary contributions to the field of adversarial example generation for convolutional neural network based perception technologies. First, we extend recent work on physically realizable adversarial examples to make them more robust to translation, rotation, and scale in real-world scenarios. Second, we demonstrate attacks against object detection neural networks rather than considering only the simpler problem of classification, showing the ability to force these networks to mislocalize as well as misclassify. We demonstrate our method on multiple object detection frameworks, including Faster R-CNN, YOLO v3, and our own single-shot detection architecture.
Advanced Algorithms in Remote Sensing I
Transfer learning for aided target recognition: comparing deep learning to other machine learning approaches
Aided target recognition (AiTR), the problem of classifying objects from sensor data, is an important problem with applications across industry and defense. While classification algorithms continue to improve, they often require more training data than is available, or they do not transfer well to settings not represented in the training set. These problems are mitigated by transfer learning (TL), where knowledge gained in a well-understood source domain is transferred to a target domain of interest. In this context, the target domain could represent a poorly labeled dataset, a different sensor, or an altogether new set of classes to identify. While TL for classification has been an active area of machine learning (ML) research for decades, transfer learning within a deep learning framework remains a relatively new area of research. Although deep learning (DL) provides exceptional modeling flexibility and accuracy on recent real-world problems, open questions remain regarding how much transfer benefit is gained by using DL versus other ML architectures. Our goal is to address this shortcoming by comparing transfer learning within a DL framework to other ML approaches across transfer tasks and datasets. Our main contributions are: 1) an empirical analysis of DL and ML algorithms on several transfer tasks and domains, including gene expressions and satellite imagery, and 2) a discussion of the limitations and assumptions of TL for aided target recognition, for both DL and ML in general. We close with a discussion of future directions for DL transfer.
Deep learning based super resolution of aerial and satellite imagery
Super-resolution is the process of creating high-resolution (HR) images from low-resolution (LR) images. Single Image Super Resolution (SISR) is challenging because high-frequency image content typically cannot be recovered from the low-resolution image, and the absence of high-frequency information limits the quality of the HR image. Furthermore, SISR is an ill-posed problem because a single LR image can correspond to several possible high-resolution images. Numerous techniques have been proposed to address this issue, and recently deep learning based methods have become popular. Convolutional Neural Network (CNN) approaches have shown great success in numerous computer vision tasks, so it is worthwhile to explore CNN-based approaches to this challenging problem. This paper presents a deep learning based super resolution (DLSR) approach that finds an HR image from its LR counterpart by learning the mapping between them. This mapping is possible because LR and HR images have similar image content and differ primarily in high-frequency details. In addition, DLSR utilizes a residual learning strategy in which the network learns to estimate a residual image. DLSR is applied to both aerial and satellite imagery, and the resulting estimates are compared against traditional methods using metrics such as the Peak Signal to Noise Ratio (PSNR), the Structural Similarity Index Metric (SSIM), and the Naturalness Image Quality Evaluator (NIQE), also called the perceptual quality index. The results show that DLSR outperforms the traditional approaches.
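The residual-learning strategy mentioned above fits in a few lines: the network predicts only the high-frequency difference between the HR image and an upsampled LR input. A generic VDSR-style sketch (the layer count and channel width are assumptions, not the authors' network):

    import torch
    import torch.nn as nn

    class ResidualSR(nn.Module):
        """Learn only the high-frequency residual; output = input + residual."""
        def __init__(self, channels=64, depth=8):
            super().__init__()
            layers = [nn.Conv2d(1, channels, 3, padding=1), nn.ReLU(inplace=True)]
            for _ in range(depth - 2):
                layers += [nn.Conv2d(channels, channels, 3, padding=1),
                           nn.ReLU(inplace=True)]
            layers.append(nn.Conv2d(channels, 1, 3, padding=1))
            self.body = nn.Sequential(*layers)

        def forward(self, x):           # x: bicubically upsampled LR image
            return x + self.body(x)     # the network supplies missing detail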
Generalization ability of region proposal networks for multispectral person detection
Kevin Fritz, Daniel König, Ulrich Klauck, et al.
Multispectral person detection aims at automatically localizing humans in images that consist of multiple spectral bands. Usually, the visual-optical (VIS) and the thermal infrared (IR) spectra are combined to achieve higher robustness for person detection especially in insufficiently illuminated scenes. This paper focuses on analyzing existing detection approaches for their generalization ability. Generalization is a key feature for machine learning based detection algorithms that are supposed to perform well across different datasets. Inspired by recent literature regarding person detection in the VIS spectrum, we perform a cross-validation study to empirically determine the most promising dataset to train a well-generalizing detector. Therefore, we pick one reference Deep Convolutional Neural Network (DCNN) architecture as well as three different multispectral datasets. The Region Proposal Network (RPN) that was originally introduced for object detection within the popular Faster R-CNN is chosen as a reference DCNN. The reason for this choice is that a stand-alone RPN is able to serve as a competitive detector for two-class problems such as person detection. Furthermore, all current state-of-the-art approaches initially apply an RPN followed by individual classifiers. The three considered datasets are the KAIST Multispectral Pedestrian Benchmark including recently published improved annotations for training and testing, the Tokyo Multi-spectral Semantic Segmentation dataset, and the OSU Color-Thermal dataset including just recently released annotations. The experimental results show that the KAIST Multispectral Pedestrian Benchmark with its improved annotations provides the best basis to train a DCNN with good generalization ability compared to the other two multispectral datasets. On average, this detection model achieves a log-average Miss Rate (MR) of 29.74% evaluated on the reasonable test subsets of the three analyzed datasets.
Modeling the performance of modern sensor systems
Timothy D. Ross, Jeffrey P. Duffy, Richard J. Thomas, et al.
Sensor system performance models (i.e., ATR performance models, or APMs) are needed online for decision making, multi-source fusion, and sensor planning and parameter setting, and offline to understand performance in operational settings for ATR development, down-selects, transition decisions, and mission-level simulations generally. Modern sensor systems involve resolved targets and non-human (ATR) exploitation, which brings special modeling challenges, e.g., ATRs' additional sensitivities and data set mismatches. APM development is a multi-disciplinary problem, and perspectives from these disciplines introduce both APM challenges and approaches to surmounting them via the COMPASE Model Framework (CMF). From the confusion matrix perspective, the challenge of operating condition (OC) space resolution, which is needed for APM sensitivity and for broad model applicability, is introduced along with CMF's OC tree approach. From the sensor modeling perspective, there is the challenge of taking full advantage of the rich first-principles understanding of sensor effects while including effects beyond sensor image metrics; this challenge is approached in CMF through pixels-on-target adjustments based on uniqueness, clarity, and conformity. From the machine learning perspective, there is the challenge of taking advantage of empirical evidence of ATR performance even though it is non-representative in terms of OC distribution; this is approached in CMF through stratification of OC space and explicit accounting of OC distributions. The CMF approach is realized in an APM development and assessment framework and has resulted in a suite of APMs that are being used in online and offline applications.
Advanced Algorithms in Remote Sensing II
Cross-spectral face recognition with image quality disparity using image fusion
Cross-spectral matching of active infrared (IR) facial probes to a visible-light facial gallery is a new and challenging problem. This scenario arises in a number of real-world surveillance tasks, such as recognition of subjects at night or under severe atmospheric conditions. When combined with long distance, the problem becomes even more challenging due to the deteriorated quality of the IR data, which causes an image quality disparity between the visible-light and IR imagery. To address this quality disparity in the heterogeneous images due to atmospheric and camera effects, typical degrading factors observed in long-range IR data, we propose an image fusion-based method that fuses multiple IR facial images together and yields a higher-quality IR facial image. Wavelet decomposition using the Haar basis is conducted first, and the coefficients are then merged according to a rule that treats the high and low frequencies differently, followed by an inverse wavelet transform step to reconstruct the final higher-quality IR facial image. Two sub-bands of the IR spectrum, namely short-wave infrared (SWIR) and near-infrared (NIR), and two long standoffs of 50 m and 106 m are involved. Experiments show that in all cases of different sub-bands and standoffs, our image fusion-based method outperforms the one without image fusion, with GARs significantly increased by 3.51% and 1.09% for SWIR 50 m and NIR 50 m at FAR=10%, respectively. The equal error rates are reduced by 2.61% and 0.90% for SWIR 50 m and NIR 50 m, respectively.
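A hedged sketch of the fusion step described above, using the Haar wavelet via the PyWavelets package (the exact merge rules are not given in the abstract; averaging the low band and keeping the max-magnitude detail coefficients are common choices assumed here):

    import numpy as np
    import pywt

    def fuse_pair(img_a, img_b, levels=2):
        """Fuse two IR face images: average the approximation (low-frequency)
        band, keep the larger-magnitude detail (high-frequency) coefficients."""
        ca = pywt.wavedec2(img_a, 'haar', level=levels)
        cb = pywt.wavedec2(img_b, 'haar', level=levels)
        fused = [(ca[0] + cb[0]) / 2.0]                   # low band: mean rule
        for da, db in zip(ca[1:], cb[1:]):
            fused.append(tuple(np.where(np.abs(a) >= np.abs(b), a, b)
                               for a, b in zip(da, db)))  # high bands: max-|.|
        return pywt.waverec2(fused, 'haar')

Fusing more than two frames would apply the same rules across the whole stack of coefficient arrays.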
Applying image processing techniques to security data: towards cyber target recognition (Conference Presentation)
Numerous image processing techniques have been developed for target recognition in physical environments. This paper considers the application of these common image processing techniques to the field of cybersecurity and, in particular, cyber target recognition. It discusses the different types of data that a cyber system can use for assessment and assesses the efficacy of multiple image processing-based target recognition techniques for cyberspace use. Further, these techniques are evaluated in the context of particular real-world challenges. Hybrid systems that perform both real-world visible-light target recognition and cyber sensing are discussed, and their efficacy is considered.
Nighttime periocular recognition at long standoffs with deep learned features
The periocular region is considered a relatively new biometric modality and serves as a substitute for face recognition under occlusion. Moreover, many application scenarios occur at nighttime, such as nighttime surveillance. To address this problem, we study periocular recognition at nighttime using the infrared spectrum. Utilizing a simplified version of DeepFace, a convolutional neural network designed for face recognition, we investigate nighttime periocular recognition at both short and long standoffs, namely 1.5 m, 50 m, and 106 m. A subband of the active infrared spectrum, near-infrared (NIR), is involved. During generation of the periocular dataset, preprocessing is conducted on the original face images, including alignment, cropping, and intensity conversion. The verification results of the periocular region using DeepFace are compared with the results of two conventional methods, LBP and PCA. Experiments show that the DeepFace algorithm performs fairly well (with GAR over 90% at FAR=0.1%) using the periocular region as a modality, even at nighttime. The framework also shows superiority to both LBP and PCA in all cases of different light wavelengths and standoffs.
The development of synthetic thermal image generation tools and training data at FLIR
Arthur Stout, Kedar Madineni, Louis Tremblay, et al.
Training data is critical to the development of highly accurate target classifiers/detectors as the industry transitions from proof of concept to the deployment of intelligent sensors in autonomous driving systems for the commercial transportation market and of automatic target recognition in military systems. We describe the development of synthetic thermal image generation software, its application in network training, and target classifier performance results.
Real-time beacon identification using linear and kernel (non-linear) Support Vector Machine, Multiple Kernel Learning (MKL), and Light Detection and Ranging (LIDAR) 3D data
The target of this research is to develop a machine-learning classification system for object detection based on three-dimensional (3D) Light Detection and Ranging (LiDAR) sensing. The proposed real-time system operates a LiDAR sensor on an industrial vehicle as part of upgrading the vehicle to provide autonomous capabilities. We have developed 3D features which allow a linear Support Vector Machine (SVM), a kernel (non-linear) SVM, and Multiple Kernel Learning (MKL) to determine whether objects in the LiDAR's field of view are beacons (objects designed to delineate a no-entry zone) or other objects (e.g., people, buildings, equipment). Results from multiple data collections are analyzed and presented, and the feature effectiveness and the pros and cons of each approach are examined.
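The paper's 3D features are not enumerated in the abstract; as a hedged illustration only, per-cluster shape features of the kind often fed to an SVM in LiDAR classification might look like this (the feature set and classifier settings are assumptions):

    import numpy as np
    from sklearn.svm import SVC

    def cluster_features(points):
        """Simple 3D shape features for one LiDAR point cluster (N x 3 array):
        bounding-box extents, height statistics, and covariance eigenvalues."""
        ext = points.max(axis=0) - points.min(axis=0)
        z = points[:, 2]
        evals = np.sort(np.linalg.eigvalsh(np.cov(points.T)))[::-1]
        return np.hstack([ext, z.mean(), z.std(), evals])

    # Kernel SVM on per-cluster features:
    # clf = SVC(kernel='rbf').fit([cluster_features(c) for c in clusters], labels)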
iECO learned matched filters for automatic target recognition in synthetic midwave infrared imagery
Object recognition is a critical component of most computer vision applications, specifically image classification tasks. Often, it is desirable to design an approach that either learns from the data directly or extracts discriminative features from the imagery for use in object classification. Most active research in the field of computer vision involves machine learning at some level, whether a completely automated process from start to finish via deep learning strategies, or the extraction of human-derived features from the imagery that are then fed to a machine learning-based classifier. However, there are numerous applications in which a particular known object is of interest. In such a setting, where a relatively specific object and scene are known a priori, one can develop an extremely robust automatic target recognition (ATR) system using matched filtering. Herein, we consider the use of machine learning to help identify a near-optimal template for matched filtering for a given problem. Specifically, the improved Evolution Constructed (iECO) framework is employed to learn the discriminative target signature(s) defining the template, leading to improved ATR performance in terms of accuracy and a reduced false alarm rate (FAR). Experiments are conducted on ideal synthetic midwave infrared imagery, and results are reported via receiver operating characteristic curves.
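Matched filtering itself is a one-liner; what iECO contributes is learning the template. A hedged sketch of applying a learned template and thresholding the response (the template-learning step is omitted, and all names are illustrative):

    import numpy as np
    from scipy.signal import correlate2d

    def matched_filter_detect(image, template, thresh):
        """Correlate a (learned) zero-mean template with the scene and flag
        responses above a threshold."""
        t = template - template.mean()
        response = correlate2d(image, t, mode='same', boundary='symm')
        return response > thresh, response

    # Sweeping 'thresh' against ground truth traces out the receiver
    # operating characteristic (ROC) curve used to score performance.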
Research on image processing and intelligent recognition of space debris
Linghua Guo, Haopeng Zhang, Hua Zhai, et al.
Space debris may cause catastrophic damage to spacecraft. The detection, monitoring, and identification of space debris improve space situational awareness and enable debris avoidance and debris removal tasks, making this topic a basis and premise for ensuring safe spacecraft operation. To meet the requirements of space debris detection and removal, this paper proposes space debris image processing algorithms including fixed-pattern noise estimation and removal, random noise suppression, and image enhancement, which improve the image SNR by more than 10% and image detail resolution by 10%. On the basis of this image processing, a deep neural network model is established and combined with a semi-supervised training method using artificially marked samples. Preliminary research on space debris feature extraction and automatic detection has been carried out.
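The abstract names the processing stages without detail; as one common, hedged illustration of fixed-pattern noise removal (valid when scene content moves across the frame stack while the sensor pattern stays fixed; this may differ from the authors' estimator):

    import numpy as np

    def remove_fixed_pattern(frames):
        """Estimate fixed-pattern sensor noise as the per-pixel temporal
        median over a stack of frames, then subtract it from each frame."""
        frames = np.asarray(frames, dtype=float)
        fpn = np.median(frames, axis=0)
        return frames - fpn[None, ...]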