Proceedings Volume 11169

Artificial Intelligence and Machine Learning in Defense Applications


Volume Details

Date Published: 4 November 2019
Contents: 7 Sessions, 27 Papers, 15 Presentations
Conference: SPIE Security + Defence 2019
Volume Number: 11169

Table of Contents

  • Front Matter: Volume 11169
  • AI in Intelligence, Surveillance, and Reconnaissance (Joint Session 5)
  • Object Detection (Joint Session 6)
  • Segmentation (Joint Session 7)
  • AI for Defence Applications
  • Image Enhancement, Fusion, and Backgrounds
  • Posters-Tuesday
Front Matter: Volume 11169
This PDF file contains the front matter associated with SPIE Proceedings Volume 11169 including the Title Page, Copyright information, Table of Contents, Introduction, and Conference Committee listing.
AI in Intelligence, Surveillance, and Reconnaissance (Joint Session 5)
Real-time CNN-based object detection and classification for outdoor surveillance images: daytime and thermal
Meir Zilkha, Assaf B. Spanier
Real-time object detection and classification is essential for outdoor surveillance. Current state-of-the-art real-time object detection CNNs are trained on natural image datasets. However, outdoor surveillance images have very different characteristics: objects tend to be small and difficult to distinguish (averaging only 3% of the image size). In addition, images come in different modalities; for example, nighttime surveillance images are grayscale thermal images representing heat emission rather than light reflection. Our dataset of images acquired from surveillance videos comprises approximately 640 daytime (DAY) color images and approximately 360 nighttime grayscale THERMAL images, covering three object categories: animals, people, and vehicles. Because large datasets are lacking for these scenarios, we evaluated augmenting our datasets with the much larger VOC dataset. We conducted a study to determine the best combination of images to include in the training dataset and how the different image types (i.e. DAY, THERMAL, and VOC) affect each other's performance. We examined state-of-the-art object detection and classification CNN architectures, focusing on accuracy and real-time performance. By combining the different image types (THERMAL, DAY, and 1200 VOC images) in one dataset, the best results were obtained using transfer learning on YOLO-V3 with SPP, achieving 89.5 mAP on DAY images and 79.53 mAP on THERMAL images while running at 35 fps. This setup provides a robust solution for many surveillance scenarios: night and daytime; far, small objects as well as zoomed-in, large objects.
A deep neural network model for hazard classification
Hazard learning algorithms employing ground penetrating radar (GPR) data for purposes of discrimination, detection, and classification suffer from a pernicious robustness problem; models trained on a particular physical region using a given sensor (antenna system) typically do not transfer effectively to diverse regions interrogated with differing sensors. We implement a novel training paradigm using region-based stratified cross-validation that improves learning induction across disparate data sets. We test this training paradigm on a novel deep neural network (DNN) architecture and report empirical results from training and testing on data collected from multiple sites. Furthermore, we discuss the relationship between penalty loss and evaluation metrics.
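For illustration, a minimal sketch of region-held-out cross-validation of the kind the abstract describes, using scikit-learn's GroupKFold so that each fold evaluates on sites unseen during training; the feature vectors, labels, and site identifiers below are placeholders, not the paper's data:

```python
# Illustrative sketch (not the paper's code): region-held-out cross-validation,
# where folds are split by collection site so a model is always evaluated on
# regions it has never seen during training. Names below are hypothetical.
import numpy as np
from sklearn.model_selection import GroupKFold

rng = np.random.default_rng(0)
X = rng.normal(size=(600, 32))          # GPR feature vectors (placeholder)
y = rng.integers(0, 2, size=600)        # hazard / no-hazard labels (placeholder)
regions = rng.integers(0, 5, size=600)  # site identifier for each sample

cv = GroupKFold(n_splits=5)
for fold, (train_idx, test_idx) in enumerate(cv.split(X, y, groups=regions)):
    held_out = np.unique(regions[test_idx])
    print(f"fold {fold}: trains on other sites, evaluates on site(s) {held_out}")
```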
Edge vs. cloud computing: where to do image processing for surveillance?
Surveillance applications either transmit video streams to a high-performance remote server or perform processing on site. In both cases, image analysis techniques support situational-awareness tasks such as detection, classification, and tracking. Nowadays, neural networks are widely used as data-driven machine learning methods, but the hardware required to deploy state-of-the-art neural network models implies massive and invasive installations. Embedded devices running with more modest settings could be an alternative, so it is important to understand which trade-off is better for this surveillance application. When a new camera angle leads to an unknown scenario, detection and classification deteriorate in terms of precision and recall. To address this problem, we analyze and predict the viewing angles in different scenarios using homography techniques and include additional datasets and labeled images from different angles; notably, more images do not always mean better detection. We build a model by transfer learning with images from scenarios whose angles are similar to the real detection environments. We compare the performance of different implementations of an automated counter on an embedded device with a GPU against a server performing the same task. Results show that an edge computing implementation with good performance is possible. Possible solutions involve regulating the bit rate and skipping a fixed number of frames (skip-n-frames) to raise the system's frames per second. The results show that, using both an embedded device and a server, real-time object detection and counting is possible, with precision values of around 92% on the embedded device and 93% on the server.
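For illustration, a minimal sketch of the skip-n-frames idea mentioned above: run the detector only on every (n+1)-th frame to keep the effective frame rate up on an embedded device. The video path and the run_detector callable are hypothetical stand-ins, not the paper's implementation:

```python
# Illustrative sketch of the "skip-n-frames" idea from the abstract: run the
# detector on every (n+1)-th frame to keep the effective frame rate up on an
# embedded device. `run_detector` and the video path are placeholders.
import cv2

def run_detector(frame):
    return []  # placeholder for an object detector / counter

def process_stream(path, skip_n=2):
    cap = cv2.VideoCapture(path)
    idx, last_detections = 0, []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % (skip_n + 1) == 0:          # detect only on selected frames
            last_detections = run_detector(frame)
        # for skipped frames, reuse the most recent detections
        idx += 1
    cap.release()
    return last_detections
```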
Object Detection (Joint Session 6)
Spatiotemporal detection of maritime targets using neural networks
Raimon H. R. Pruim, Annegreet Van Opbroek, Maarten Kruithof, et al.
Automatic detection and tracking of maritime targets in imagery can greatly increase situation awareness on naval vessels. Various methods for detection and tracking have been proposed so far, for both reasoning-based and learning-based approaches. Learning approaches promise to outperform reasoning approaches. They typically detect targets in a single frame, followed by a tracking step in order to follow targets over time. However, such approaches are sub-optimal for detection of small or distant objects, because these are hard to distinguish in single frames. We propose a new spatiotemporal learning approach that detects targets directly from a series of frames. This new method is based on a deep learning segmentation model, now applied to temporal input data. This way, targets are detected based not only on their appearance in a single frame, but also on their movement over time. Detection thereby becomes more similar to how it is performed by the human eye: by focusing on structures that move differently compared to their surroundings. The performance of the proposed method is compared to both ground-truth detections and the detections of a contrast-based detector that detects targets per frame. We investigate the performance on a variety of infrared video datasets, recorded with static and moving cameras, different types of targets, and different scenes. We show that spatiotemporal detection overall obtains similar to slightly better performance on the detection of small objects compared to the state-of-the-art frame-wise detection method, while generalizing better, using fewer adjustable parameters, and providing better clutter reduction.
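For illustration, a hedged sketch of the core spatiotemporal idea, stacking several consecutive frames along the channel axis of a segmentation network (PyTorch); the tiny network below is a stand-in for the paper's deep segmentation model:

```python
# Illustrative sketch (assumptions: grayscale IR frames, a generic segmentation
# network): consecutive frames are stacked along the channel axis so the model
# can exploit motion cues, not just single-frame appearance.
import torch
import torch.nn as nn

T = 5  # number of consecutive frames per sample

model = nn.Sequential(                      # stand-in for a U-Net-style segmenter
    nn.Conv2d(T, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 1, 1),                    # per-pixel target/background logit
)

frames = torch.randn(8, T, 256, 320)        # batch of 8 clips of T IR frames
logits = model(frames)                      # (8, 1, 256, 320) segmentation logits
print(logits.shape)
```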
Ship detection in synthetic aperture radar (SAR) images by deep learning
Öner Ayhan, Nigar Şen
In this paper, we propose a Convolutional Neural Network (CNN) based method to detect ships in Synthetic Aperture Radar (SAR) images. The architecture of the proposed CNN has customized parts to detect small targets. To train, validate and test the CNN, TerraSAR-X Spot mode images are used. In the data preparation phase, a GIS (Geographic Information System) specialist labels ships manually in all images. Image patches that contain ships are then cropped, and ground truths are obtained from the pre-labeled data. In the training stage, data augmentation is used and the data are divided into three parts: (i) training, (ii) validation, (iii) test. Training takes almost a day with an NVIDIA GTX 1080 Ti graphics card. Results on the test data show that our method has promising detection performance for ship targets both in open water and near harbors.
Multimodal object detection using unsupervised transfer learning and adaptation techniques
Rachael Abbott, Neil Robertson, Jesus Martinez del Rincon, et al.
Deep neural networks achieve state-of-the-art performance on object detection tasks with RGB data. However, there are many advantages to detection using multi-modal imagery for defence and security operations. For example, the IR modality offers persistent surveillance and is essential in poor lighting conditions and 24hr operation. It is, therefore, crucial to create an object detection system which can use IR imagery. Collecting and labelling large volumes of thermal imagery is incredibly expensive and time-consuming. Consequently, we propose to mobilise labelled RGB data to achieve detection in the IR modality. In this paper, we present a method for multi-modal object detection using unsupervised transfer learning and adaptation techniques. We train Faster RCNN on RGB imagery and test with a thermal imager. The images contain the object classes people and land vehicles and represent real-life scenes which include clutter and occlusions. We improve the baseline F1-score by up to 20% through training with an additional loss function, which reduces the difference between RGB and IR feature maps. This work shows that unsupervised modality adaptation is possible, and that we have the opportunity to maximise the use of labelled RGB imagery for detection in multiple modalities. The novelty of this work includes the use of IR imagery, modality adaptation from RGB to IR for object detection, and the ability to use real-life imagery in uncontrolled environments. The practical impact of this work for the defence and security community is an increase in performance and savings of time and money in data collection and annotation.
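For illustration, one possible form of a feature-alignment term of the kind the abstract mentions, written as an L2 penalty between backbone feature maps computed from RGB and IR batches; the backbone, weighting, and pairing scheme are assumptions, not the paper's exact loss:

```python
# Illustrative sketch only: one simple form of a modality-alignment term, an L2
# penalty between backbone feature maps computed from RGB and IR batches. The
# backbone and weighting factor are placeholders, not the paper's exact loss.
import torch
import torch.nn as nn
import torch.nn.functional as F

backbone = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                         nn.Conv2d(16, 32, 3, padding=1), nn.ReLU())

def alignment_loss(rgb_batch, ir_batch, weight=0.1):
    f_rgb = backbone(rgb_batch)
    f_ir = backbone(ir_batch)           # IR replicated to 3 channels upstream
    return weight * F.mse_loss(f_ir, f_rgb.detach())

rgb = torch.randn(4, 3, 128, 128)
ir = torch.randn(4, 3, 128, 128)
print(alignment_loss(rgb, ir).item())
```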
Detecting and classifying small objects in thermal imagery using a deep neural network
Fredrik Hemström, Fredrik Nässtrom M.D., Jörgen Karlholm
In recent years the rise of deep learning neural networks has shown great results in image classification. Most previous work focuses on classification of fairly large objects in visual imagery. This paper presents a method for detecting and classifying small objects in thermal imagery using a deep learning method based on a RetinaNet network. The results show that a deep neural network can be trained with a relatively small set of labelled images to classify objects in thermal imagery. Objects from the classes with the most training examples (cars, trucks and persons) can be classified with relatively high confidence, even for object sizes of 32×32 pixels or smaller.
Detecting unseen targets during inference in infrared imaging
Antoine d'Acremont, Alexandre Baussard, Ronan Fablet, et al.
Performing reliable target recognition in infrared imagery is a challenging problem due to the variation of target signatures caused by changes in the environment, the viewpoint, or the state of the targets. Due to their state-of-the-art performance on several computer vision problems, Convolutional Neural Networks (CNNs) are particularly appealing in this context. However, CNNs may provide wrong classification results with high confidence. Robustness to disturbed inputs can be improved through specific training strategies, but these generally require retraining or fine-tuning the CNN to face new forms of disturbed inputs. Besides, such strategies do not necessarily tackle novelty detection without training an auxiliary classifier. In this paper we propose two solutions that give a trained CNN the ability to deal with both adversarial examples and novelty detection during inference. The first approach is based on one-class support vector machines (SVMs) and the second relies on the Local Outlier Factor (LOF) algorithm for detecting such examples. We benchmark our contributions on the SENSIAC database with a pre-trained network and evaluate how they may help mitigate false classifications on outliers and adversarial inputs.
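For illustration, a minimal sketch of outlier scoring in a CNN feature space with the two detectors named in the abstract, one-class SVM and Local Outlier Factor (scikit-learn); the random feature vectors stand in for CNN embeddings:

```python
# Illustrative sketch (not the paper's code): flagging outliers in the feature
# space of a trained CNN with the two detectors named in the abstract. Feature
# vectors here are random placeholders standing in for CNN embeddings.
import numpy as np
from sklearn.svm import OneClassSVM
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(0)
train_feats = rng.normal(0.0, 1.0, size=(500, 64))    # features of known targets
test_feats = rng.normal(3.0, 1.0, size=(20, 64))       # unseen-target features

ocsvm = OneClassSVM(nu=0.05, kernel="rbf", gamma="scale").fit(train_feats)
lof = LocalOutlierFactor(n_neighbors=20, novelty=True).fit(train_feats)

print("one-class SVM:", ocsvm.predict(test_feats))     # -1 = outlier, +1 = inlier
print("LOF:", lof.predict(test_feats))
```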
Segmentation (Joint Session 7)
See clearer at night: towards robust nighttime semantic segmentation through day-night image conversion
In recent years, intelligent driving navigation and security monitoring have made considerable progress with the help of deep Convolutional Neural Networks (CNNs). As one of the state-of-the-art perception approaches, semantic segmentation unifies distinct detection tasks widely desired by both autonomous driving and security monitoring. Currently, semantic segmentation shows remarkable efficiency and reliability in standard scenarios such as daytime scenes with favorable illumination conditions. However, in the face of adverse conditions such as nighttime, semantic segmentation loses its accuracy significantly. One of the main causes of this problem is the lack of sufficient annotated segmentation datasets of nighttime scenes. In this paper, we propose a framework to alleviate the accuracy decline when semantic segmentation is applied in adverse conditions, by using Generative Adversarial Networks (GANs). To bridge the daytime and nighttime image domains, we make the key observation that, compared to datasets in adverse conditions, there is a considerable amount of segmentation data for standard conditions, such as BDD and our collected ZJU datasets. Our GAN-based nighttime semantic segmentation framework includes two methods. In the first method, GANs are used to translate nighttime images to the daytime domain, so that semantic segmentation can be performed using robust models already trained on daytime datasets. In the second method, we use GANs to translate different ratios of the daytime images in the dataset to the nighttime domain while keeping their labels. In this way, synthetic nighttime segmentation datasets can be generated to yield models prepared to operate robustly in nighttime conditions. In our experiments, the latter method significantly boosts nighttime performance, as evidenced by quantitative results using Intersection over Union (IoU) and pixel accuracy (Acc). We show that the performance varies with the proportion of synthetic nighttime images in the dataset, with the sweet spot corresponding to the most robust performance across day and night. The proposed framework not only contributes to the optimization of visual perception in intelligent vehicles, but can also be applied to diverse navigational assistance systems.
Semantic segmentation of panoramic images using a synthetic dataset
Panoramic images have advantages in information capacity and scene stability due to their large field of view (FoV). In this paper, we propose a method to synthesize a new dataset of panoramic images. We stitch images taken from different directions into panoramic images, together with their labeled images, to yield the panoramic semantic segmentation dataset denominated SYNTHIA-PANO. To find out the effect of using panoramic images as a training dataset, we designed and performed a comprehensive set of experiments. Experimental results show that using panoramic images as training data is beneficial to the segmentation result. In addition, using panoramic images with a 180 degree FoV as training data gives the model better performance. Furthermore, the model trained with panoramic images also has a better capacity to resist image distortion. Our code and the SYNTHIA-PANO dataset are available at https://github.com/Francis515/SYNTHIA-PANO.
A comparative study of high-recall real-time semantic segmentation based on swift factorized network
Semantic Segmentation (SS) is the task of assigning a semantic label to each pixel of the observed images, which is of crucial significance for autonomous vehicles, navigation assistance systems for the visually impaired, and augmented reality devices. However, there is still a long way to go before SS can be put into practice, as two essential challenges need to be addressed: efficiency and the evaluation criteria for practical application. Different criteria need to be adopted for specific application scenarios. Recall rate is an important criterion for many tasks such as autonomous driving, where traffic objects like cars, buses, and pedestrians should be detected with high recall rates. In other words, it is preferable to detect an object wrongly than to miss it, because traffic objects become dangerous if the algorithm misses them and segments them as safe roadway. In this paper, our main goal is to explore possible methods to attain a high recall rate. Firstly, we propose a real-time SS network named Swift Factorized Network (SFN). The proposed network is adapted from SwiftNet, whose structure is a typical U-shape structure with lateral connections. Inspired by ERFNet and Global Convolution Networks (GCNet), we propose two different blocks to enlarge the valid receptive field. They do not take up too many computational resources, but significantly enhance the performance compared with the baseline network. Secondly, we explore three ways to achieve a higher recall rate, namely the loss function, the classifier, and the decision rules. We perform a comprehensive set of experiments on state-of-the-art datasets including CamVid and Cityscapes and demonstrate that our SS convolutional neural networks reach excellent performance. Furthermore, we make a detailed analysis and comparison of the three proposed methods with respect to the improvement of recall rate.
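For illustration, a hedged sketch of one possible recall-oriented decision rule, biasing the per-pixel argmax toward a safety-critical class with a logit offset; the class list and offset value are hypothetical, not the paper's exact rule:

```python
# Illustrative sketch of a recall-oriented decision rule (an assumption, not the
# paper's exact rule): per-class logit offsets bias the per-pixel argmax toward
# safety-critical classes (e.g. pedestrian), trading precision for recall.
import numpy as np

num_classes = 4                      # road, car, pedestrian, background (example)
logits = np.random.randn(num_classes, 64, 128)   # per-pixel class scores

bias = np.zeros(num_classes)
bias[2] = 1.5                        # favour the "pedestrian" class

plain = logits.argmax(axis=0)
biased = (logits + bias[:, None, None]).argmax(axis=0)
print("extra pixels assigned to critical class:",
      int((biased == 2).sum() - (plain == 2).sum()))
```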
AI for Defence Applications
A vision on hybrid AI for military applications
Judith Dijk, Klamer Schutte, Serena Oggero
The application of different Artificial Intelligence (AI) technologies has increased over the past few years. At a high conceptual level, these technologies can be divided into two categories: symbolic and sub-symbolic. The term “Hybrid AI” denotes the combination of symbolic and sub-symbolic AI. By combining semantic reasoning and data-driven machine learning, both human-specified and data-derived knowledge can be brought together in one system. In this paper we explore the concept of Hybrid AI by means of architectural patterns from the literature. The added value of the architectural patterns is that they provide a way to discuss the different elements in the processing pipeline: they stimulate discussion of what the inputs and outputs of the different processing blocks are and how they work together. When applying the available design patterns to real military imaging applications, we noticed that we needed more detail in the different blocks to specify the types of data or algorithms that are applied. In future work we will investigate how components such as online learning can be represented in this design pattern framework. We identified the need to further develop this approach with a more intertwined interaction between the reasoning and data-driven parts of the pipelines, and to use more world knowledge, domain knowledge, and relations between objects in the reasoning part. Improvements are also needed for online learning, where knowledge of the system's performance will be used to ask users for relevant information.
Applicability of AI methods to JISR resource planning
Roland Rodenbeck, Frank Reinert, Jennifer Sander, et al.
For maximizing the benefit of today's ISR (Intelligence, Surveillance and Reconnaissance) systems, improved collection planning is essential. In this paper we present an approach for applying artificial intelligence and machine learning in support of collection planning tasks. One subtask in collection planning requires matchmaking between ISR resources (hereafter referred to as assets, combining sensors and their corresponding carriers) and collection requirements, taking additional operational constraints (for example, mission risk) into account. This subtask requires high competence in assessing asset capabilities in relation to collection requirements under current and future operational constraints, and it is mostly conducted in a time-sensitive environment. We derive a general model of our matchmaking problem. This model serves, in combination with existing requirements derived from the operational domain, as input for an analysis of artificial intelligence and machine learning methods, working out their fundamental suitability and adaptability for our model. This subset of methods is further analyzed for its pros and cons when only little operational data is available and when the evolving knowledge of resource use during mission operation has to be taken into account.
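For illustration, a classical baseline for the matchmaking subtask (not the paper's method): score each asset/requirement pair and solve the resulting assignment problem with the Hungarian algorithm; the suitability matrix below is a placeholder:

```python
# Illustrative baseline only (not the paper's method): matchmaking posed as an
# assignment problem, with a hypothetical suitability score per asset/requirement
# pair and the Hungarian algorithm used to pick a one-to-one matching.
import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(1)
suitability = rng.uniform(0, 1, size=(4, 6))   # 4 assets x 6 collection requirements
cost = 1.0 - suitability                        # higher suitability -> lower cost

assets, requirements = linear_sum_assignment(cost)
for a, r in zip(assets, requirements):
    print(f"asset {a} -> requirement {r} (suitability {suitability[a, r]:.2f})")
```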
Drone localization and identification using an acoustic array and supervised learning
Drones are well-known threats in both military and civil environments. Identifying them accurately and localizing their trajectory is an issue that more and more methods are trying to solve. Several modalities can be used to do so, such as radar, optics, radio-frequency communications and acoustics. Nevertheless, radar suffers from a lack of reflected signal for small targets, optical techniques can be very difficult to set up in natural environments with small targets, and self-flying drones can avoid radio detection. Consequently, this paper deals with the remaining acoustic modality and aims to localize an acoustic source and then identify it as a drone or noise using array measurements and a supervised learning method. The acoustic array is used to determine the source's direction of arrival, and spatial filtering is performed to improve the signal-to-noise ratio. A focused signal is then obtained and used to characterize the source. The performance obtained when identifying this source as a drone or not is compared for two different learning models. The first uses the two classes drone and noise with a classic Support Vector Machine model, while the second is based on a One-Class Support Vector Machine algorithm where only the drone class is learned. A database is generated with 7001 observations of drone flights and 3818 observations of noise recordings within a controlled environment where signals are played one at a time, an observation being a sequence of 0.2 s of signal. Localization results show an average elevation-angle error bounded by 3.7°, whereas identification results on this database give 99.5% and 95.6% accuracy for the two-class approach and the one-class approach, respectively. This high accuracy is reached thanks to the intrinsic separability of the data obtained with the chosen features.
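For illustration, a minimal sketch of the spatial-filtering step, delay-and-sum beamforming of a uniform linear array toward an assumed direction of arrival; the array geometry, sampling rate, and signals are placeholders, not the paper's setup:

```python
# Illustrative sketch of the spatial-filtering step: delay-and-sum beamforming of
# a uniform linear array toward a given direction of arrival. Array geometry,
# sampling rate and signals are placeholders, not the paper's setup.
import numpy as np

fs = 16000.0            # sampling rate [Hz]
c = 343.0               # speed of sound [m/s]
d = 0.05                # microphone spacing [m]
n_mics, n_samples = 8, 4096
theta = np.deg2rad(30)  # assumed direction of arrival

signals = np.random.randn(n_mics, n_samples)      # placeholder array recordings

# per-microphone delays for a plane wave from angle theta, compensated in the
# frequency domain before averaging across the array
delays = np.arange(n_mics) * d * np.sin(theta) / c
freqs = np.fft.rfftfreq(n_samples, 1.0 / fs)
spectra = np.fft.rfft(signals, axis=1)
steering = np.exp(2j * np.pi * freqs[None, :] * delays[:, None])
focused = np.fft.irfft((spectra * steering).mean(axis=0), n=n_samples)
print(focused.shape)    # single beamformed signal used for drone/noise classification
```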
Towards information extraction and semantic world modelling to support information management and intelligence creation in defense coalitions
Almuth Hoffmann, Achim Kuwertz, Jennifer Sander
This publication addresses a structured approach to support information management and intelligence creation in defense coalitions under consideration of the corresponding operational processes. From the methodical point of view, the key aspects are the application of a semantic world modeling system and the dedicated combination of data-driven as well as knowledge-based Artificial Intelligence (AI) methods. In the context of this publication, during system operation, in particular Joint Intelligence, Surveillance and Reconnaissance (ISR) results in the form of textual ISR reports that are in accordance with NATO reporting standards and agreements serve as input to the world modeling system. To obtain maximum benefit from the respective information, relevant information elements have to be extracted from both structured and unstructured parts of the reports and combined with information already available in the semantic world modeling system. For the structured parts of a report, a predefined mapping of the respective parts of the report's data model to the target model of the semantic world modeling system can be applied. To extract the relevant information elements from the unstructured parts of a report, Natural Language Processing (NLP) techniques are needed additionally. In this context, specific challenges regarding the application of data-driven AI methods in the defense domain are addressed through a two-step approach for information extraction from unstructured text based on an intermediate semantic representation.
Transparent object sensing with enhanced prior from deep convolutional neural network
Jing Wang, Jian Bai, Xiao Huang, et al.
In recent years, with the development of new materials, transparent objects are playing an increasingly important role in many fields, from industrial manufacturing to military technology. However, transparent object sensing remains a challenging problem in the area of computational imaging and optical engineering. As an indispensable part of 3-D modeling, transparent object sensing is a long-standing research topic, which aims to reconstruct the surface shape of a given transparent object using various kinds of measurement methods. In this paper, we put forward a new method for the sensing of such objects. Specifically, we focus on the sensing of thin transparent objects, including thin films and various kinds of nano-materials. The proposed method consists of two main steps. First, we use a deep convolutional neural network to predict the original distribution of the object from its recorded intensity pattern. Second, the predicted results are used as initial estimates, and an iterative projection phase retrieval algorithm is performed with the enhanced priors to obtain finer reconstruction results. Numerical experiments show that, with these two steps, our method is able to reconstruct the surface shape of a given thin transparent object at high speed and with a simple experimental setup. Moreover, the proposed method shows a new path for transparent object sensing that combines state-of-the-art deep learning techniques with conventional computational imaging algorithms. It indicates that, following the same framework, the performance of such a method can be significantly improved with more advanced hardware and software implementations.
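For illustration, a hedged sketch of the second step under simple assumptions (a Fourier-plane intensity measurement and a non-negativity object constraint): an error-reduction style projection loop initialised from a CNN prediction, with cnn_estimate standing in for the network output:

```python
# Illustrative sketch of the second step (assumptions: Fourier-plane intensity
# measurement, a simple non-negativity object constraint): an error-reduction
# style projection loop initialised from a CNN prediction. `cnn_estimate` is a
# placeholder for the network output described in the abstract.
import numpy as np

def phase_retrieval(measured_intensity, cnn_estimate, n_iter=100):
    measured_amplitude = np.sqrt(measured_intensity)
    obj = cnn_estimate.astype(complex)            # enhanced prior as starting point
    for _ in range(n_iter):
        field = np.fft.fft2(obj)
        field = measured_amplitude * np.exp(1j * np.angle(field))  # modulus constraint
        obj = np.fft.ifft2(field)
        obj = np.clip(obj.real, 0, None)          # object-domain constraint (example)
    return obj

truth = np.zeros((64, 64)); truth[24:40, 24:40] = 1.0
intensity = np.abs(np.fft.fft2(truth)) ** 2
recon = phase_retrieval(intensity, cnn_estimate=np.ones((64, 64)) * 0.5)
print(recon.shape)
```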
Image Enhancement, Fusion, and Backgrounds
Deep learning for software-based turbulence mitigation in long-range imaging
Robert Nieuwenhuizen, Klamer Schutte
Optical imaging over long horizontal paths often suffers from the effects of atmospheric turbulence. Dynamic density variations in the air result in random spatiotemporally varying shifts and blurs in the recorded images. Software-based turbulence mitigation algorithms provide a means of computationally reducing these turbulence effects in video. They provide camera operators with sharper and more stable imagery, which supports them in visual recognition and identification tasks. Most turbulence mitigation algorithms rely on a form of temporal low-pass filtering to remove the turbulence-induced fluctuations. This filtering also suppresses high spatial frequency information and thus limits the ability of these algorithms to recover fine details. Here we propose a turbulence mitigation algorithm that employs a deep neural network to recover high spatial frequency information from turbulence-degraded video. The proposed algorithm builds on our previous approach, where frame-to-frame estimates of image shifts are used to combine multiple frames. This approach is amended by using a deep neural network to deblur the output images. For the related task of single-image super-resolution, we have previously shown that such neural networks provide state-of-the-art image reconstruction performance. Here our neural network is trained using semi-synthetic image sequences of static scenes with simulated turbulence. We show that, on semi-synthetic test data, our deep learning based approach provides a substantial performance increase compared to our previous sharpening approach. We also apply our algorithm to real long-range imagery of ships at sea and find a perceptually similar improvement in image quality as for the semi-synthetic data.
Adversarial camouflage for naval vessels
The use of different types of camouflage is a longstanding technique employed by armed forces in order to avoid detection, classification or tracking of objects of military interest. Typically, the use of such camouflage is intended to fool human observers. However, in future battle theaters one must expect to face weapons that are 'artificially intelligent' in some way, and the question then arises as to whether the same types of camouflage will be effective against such weapons. An equally important question is whether it is possible to design camouflage in order to specifically confuse 'artificially intelligent' adversaries, and what such camouflage might look like. It is this latter question that is the object of the study reported here. In particular, we consider whether carefully designed patterns of camouflage will have a detrimental effect on the performance of neural networks trained to distinguish among different ship classes. We train a neural network to distinguish between different types of military and civilian vessels and specifically require the network to determine whether the vessel is military or civilian. We then use this network to train a second network, a generative adversarial network, that will generate patterns to overlay on parts of the vessels in such a way as to thwart the performance of the first network. We show that such adversarial camouflage is very effective in confusing the original classification network.
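For illustration, the underlying idea can be sketched without a GAN by directly optimising an overlay patch by gradient ascent on the classifier's loss; the classifier, ship image, and overlay region below are placeholders, and the paper's actual generative adversarial approach is not reproduced here:

```python
# Illustrative sketch only: the paper trains a generative adversarial network,
# but the core idea can be shown by directly optimising an overlay patch by
# gradient ascent on the classifier's loss. Classifier and images are placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F

classifier = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
                           nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                           nn.Linear(8, 2))          # military vs civilian (stand-in)

ship = torch.rand(1, 3, 64, 64)                      # placeholder ship image
mask = torch.zeros(1, 1, 64, 64)
mask[..., 20:44, 10:54] = 1.0                        # hull region to camouflage
patch = torch.rand(1, 3, 64, 64, requires_grad=True)
opt = torch.optim.Adam([patch], lr=0.05)
true_label = torch.tensor([0])                       # "military"

for _ in range(50):
    camouflaged = ship * (1 - mask) + patch.clamp(0, 1) * mask
    loss = -F.cross_entropy(classifier(camouflaged), true_label)  # push away from truth
    opt.zero_grad()
    loss.backward()
    opt.step()
print(classifier(ship * (1 - mask) + patch.clamp(0, 1) * mask).softmax(dim=1))
```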
On the robustness of compressive sensing hyperspectral image reconstruction using convolutional neural network
Hyperspectral imaging is applied in a wide range of defense, security and law enforcement applications. The spectral data carries valuable information for tasks such as identification, detection, and classification. However, capturing the spectral information together with the spatial information requires a significant acquisition effort. In recent years we have developed several compressive hyperspectral imaging techniques, demonstrating a reduction of the captured data by at least an order of magnitude. However, compressive sensing techniques typically require computationally heavy and time-consuming iterative reconstruction algorithms. The computational burden is even more prominent in compressive spectral imaging due to the large amount of data involved. In this work we demonstrate the use of a convolutional neural network (CNN) for the reconstruction of spectral images captured with our Compressive Sensing Miniature Ultraspectral Imager (CS-MUSI). We discuss the challenges of training the CNN for CS-MUSI and analyze the CNN-based reconstruction performance.
Analysis of different background subtraction methods applied on drone imagery under various weather conditions in the UAE region
This paper presents a benchmarking study of the performance of thirty state-of-the-art background subtraction algorithms. We test the performance of multiple background subtraction methods on drone imagery taken under various weather conditions in the UAE region. This is done by comparing the quality of the foreground masks extracted by these algorithms. Visual Studio and MATLAB have been used to perform the comparison simulations, giving a comprehensive background subtraction study that indicates the advantages and disadvantages of each algorithm. The algorithms must be robust to stabilization errors, able to cope with the insufficient information that results from weather conditions such as wind, haze and heat, and able to cope with dynamic backgrounds.
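For illustration, a minimal sketch of the benchmarking procedure with two of OpenCV's built-in subtractors and an F-measure comparison of foreground masks against ground truth; the frame and ground-truth mask are placeholders:

```python
# Illustrative sketch (a small subset of the thirty methods in the study): two
# OpenCV background subtractors compared by the F-measure of their foreground
# masks against a ground-truth mask. Frame and ground truth are placeholders.
import cv2
import numpy as np

def f_measure(pred_mask, gt_mask):
    pred, gt = pred_mask > 0, gt_mask > 0
    tp = np.logical_and(pred, gt).sum()
    precision = tp / max(pred.sum(), 1)
    recall = tp / max(gt.sum(), 1)
    return 2 * precision * recall / max(precision + recall, 1e-9)

subtractors = {
    "MOG2": cv2.createBackgroundSubtractorMOG2(),
    "KNN": cv2.createBackgroundSubtractorKNN(),
}

frame = (np.random.rand(240, 320, 3) * 255).astype(np.uint8)   # placeholder frame
gt = np.zeros((240, 320), np.uint8)                             # placeholder ground truth
for name, sub in subtractors.items():
    fg = sub.apply(frame)
    print(name, f_measure(fg, gt))
```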
A novel approach for spectrum decomposition for Raman spectroscopy
Takayuki Higo, Shuzo Eto, Yuji Ichikawa, et al.
As a fundamental study toward improving the detection accuracy of Raman spectroscopy under noisy conditions, this paper proposes a novel spectrum decomposition method in which the spectrum observed from an unknown substance is decomposed into known spectra. Raman spectroscopy can be used as a remote sensing method in which a laser irradiates the target and the Raman scattering light is then analyzed to detect the target's constituents. Spectrum decomposition analyzes the observed spectrum, that is, the Raman scattering light, with known spectra that have previously been collected in a database. The purpose of the decomposition is to find a linear combination of the known spectra that appropriately represents the observed spectrum. The coefficients of the linear combination indicate the density of the molecules contained in the target. The coefficients can be found with a multiple linear regression method; however, they can contain large errors under low signal-to-noise-ratio conditions. The proposed method tries to overcome the noise problem using three techniques. The first is to employ the non-negative least squares method, i.e., the least squares method with non-negativity constraints on the coefficients. The second is to select the wavelengths of the observed and known spectra used for the spectrum decomposition. The third is to select the wavelength of the laser irradiating the target. This paper presents numerical experiments showing the effectiveness of the proposed method.
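For illustration, a minimal sketch of the first technique, non-negative least squares decomposition of an observed spectrum into a library of known spectra (SciPy); the synthetic library and spectrum below are placeholders:

```python
# Illustrative sketch of the first technique in the abstract: non-negative least
# squares decomposition of an observed spectrum into a library of known spectra.
# The library and the observed spectrum below are synthetic placeholders.
import numpy as np
from scipy.optimize import nnls

wavelengths = np.linspace(400, 1800, 200)            # Raman shift axis (placeholder)
library = np.stack([np.exp(-0.5 * ((wavelengths - c) / 15) ** 2)
                    for c in (600, 1000, 1450)], axis=1)   # 3 known spectra

true_coeffs = np.array([0.7, 0.0, 0.3])
observed = library @ true_coeffs + 0.01 * np.random.randn(len(wavelengths))

coeffs, residual = nnls(library, observed)
print("estimated mixing coefficients:", np.round(coeffs, 3))
```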
Object-based multispectral image fusion method using deep learning
Hyunsung Jang, Namkoo Ha, Yoonmo Yeon, et al.
The goal of multispectral image fusion is to integrate complementary information from multispectral sensors to enhance human visual perception and object detection. Additionally, there are cases in which only the object needs to be emphasized, with minimal background interference. This paper presents an object-based fusion method using deep learning to accomplish this objective. The proposed method uses information about the region of an object to perform fusion on the object. Since we cannot provide labels for fusion results at the learning stage, we propose an unsupervised learning method. The proposed method simultaneously provides appropriate image information from the background and the target for surveillance and reconnaissance.
Posters-Tuesday
The challenges and some thinking for the intelligentization of precision guidance ATR
Jinxiang Fan, Jia Liu
In recent years, with the increasing effectiveness and importance of precision-guided weapons in modern high-technology war, the development of precision guidance systems has made outstanding achievements. However, because the targets, environment and missions of precision guidance systems have changed significantly, the complexity of the battlefield environment and the uncertainty of target characteristics bring new challenges to the development of precision guidance systems and technology. To make a missile adapt to the complex and varying battlefield environment and engage various targets accurately, the concepts of the intelligent missile and the intelligentization of the precision guidance system, based on artificial intelligence technology, have been put forward. Although the concept of the intelligent missile has been put forward for many years, the development of existing precision guidance systems still suffers from a lag in intelligentization capability. There is still no good solution to the problem of highly intelligent automatic target recognition (ATR) and decision making in complex battlefield environments, and it is difficult to meet the requirement to adapt to the complex and varying battlefield environment and engage various targets accurately under fierce countermeasure conditions. Focusing on precision guidance automatic target recognition, this paper introduces the development process of intelligent precision guidance ATR systems, analyzes the challenges faced by the intelligentization of current precision guidance ATR systems, and gives some preliminary views on the development of intelligent ATR.
Semantic scene understanding on mobile device with illumination invariance for the visually impaired
For Visually Impaired People (VIP), it is very difficult to perceive their surroundings. To address this problem, we propose a scene understanding system to aid VIP in indoor and outdoor environments. Semantic segmentation performance is generally sensitive to environment and illumination changes, including the change between indoor and outdoor environments and changes across different weather conditions. Meanwhile, most existing methods have paid more attention to either accuracy or efficiency, instead of the balance between the two. In the proposed system, the training dataset is preprocessed using an illumination-invariant transformation to weaken the impact of illumination changes and improve the robustness of the semantic segmentation network. Regarding the structure of the semantic segmentation network, lightweight networks such as MobileNetV2 and ShuffleNet V2 are employed as the backbone of DeepLabv3+ to improve accuracy with little additional computation, which is suitable for a mobile assistance device. We evaluate the robustness of the segmentation model across different environments on the Gardens Point Walking dataset and demonstrate the strongly positive effect of the illumination-invariant pre-transformation in a challenging real-world domain. The network trained on a computer achieves relatively high accuracy on ADE20K relabeled into 20 classes. The frame rate of the proposed system is up to 83 FPS on a 1080Ti GPU.
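For illustration, a hedged sketch of one commonly used single-channel illumination-invariant transform computed from log-chromaticity; whether this is the exact transform used in the paper is an assumption, and the camera-dependent parameter alpha below is a placeholder value:

```python
# Illustrative sketch (an assumption about the exact transform used): one common
# single-channel illumination-invariant image, computed from log-chromaticity,
# I = 0.5 + log(G) - alpha*log(B) - (1-alpha)*log(R), where alpha depends on the
# camera's spectral response. alpha below is a placeholder value.
import numpy as np

def illumination_invariant(rgb, alpha=0.45):
    rgb = rgb.astype(np.float64) / 255.0 + 1e-6      # avoid log(0)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    return 0.5 + np.log(g) - alpha * np.log(b) - (1 - alpha) * np.log(r)

image = (np.random.rand(120, 160, 3) * 255).astype(np.uint8)   # placeholder image
ii = illumination_invariant(image)
print(ii.shape, float(ii.min()), float(ii.max()))
```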
Rain and snow removal using multi-guided filter and anisotropic gradient in the quaternion framework
In many cases rain and snow in an image significantly degrade the effectiveness of computer vision algorithms such as object recognition, tracking, and retrieval. The automated detection and removal of such degradations in a color image is still a challenging task. This paper presents a new rain and snow removal method using the low- and high-frequency parts of a single image. For this purpose, we use a color-image multi-guided filter and an anisotropic gradient in Hamiltonian quaternions. The quaternion framework is used to represent a color image so that all three channels are taken into account simultaneously when inpainting the RGB image. Our results show good performance in both rain removal and snow removal.
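For illustration, a hedged sketch of the low/high-frequency split that such rain and snow removal methods rely on, here with the guided filter from the opencv-contrib package standing in for the paper's multi-guided quaternion filter; the image and filter parameters are placeholders:

```python
# Illustrative sketch of the low/high-frequency split that rain/snow removal
# methods of this kind rely on: an edge-preserving (guided) filter gives the
# low-frequency layer and the residual holds rain/snow streaks. Uses guidedFilter
# from opencv-contrib (cv2.ximgproc); parameters are placeholder values.
import cv2
import numpy as np

image = (np.random.rand(240, 320, 3) * 255).astype(np.uint8)    # placeholder rainy image
img_f = image.astype(np.float32) / 255.0

low = cv2.ximgproc.guidedFilter(img_f, img_f, 8, 1e-2)   # guide, src, radius, eps
high = img_f - low                       # high-frequency part containing streaks
print(low.shape, high.shape)
```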
Analysis of the controllers of the vessel course control systems in difficult navigation conditions
This article presents a comparison of several controllers for a vessel's course. The mathematical model of the vessel was set as a transfer function with variable coefficients that depend on vessel speed. We compared a classic PID controller, a PID controller with self-adjusting coefficients, an adaptive controller with an implicit reference model, and an adaptive fuzzy controller. As a result of the comparison, the adaptive fuzzy controller demonstrated the best quality indicators across the set of criteria, which allows us to recommend this controller for implementation in vessel course control systems.
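For illustration, a minimal sketch of the classic PID course controller that serves as the baseline in the comparison; the gains and the crude first-order vessel response are placeholders, not the article's model:

```python
# Illustrative sketch of the classic PID course controller used as the baseline
# in the comparison; gains and the first-order vessel response are placeholders.
class PID:
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral, self.prev_error = 0.0, 0.0

    def update(self, setpoint, measurement):
        error = setpoint - measurement
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

dt, heading, setpoint = 0.1, 0.0, 10.0      # time step [s], headings [deg]
pid = PID(kp=1.2, ki=0.05, kd=0.4, dt=dt)
for _ in range(200):
    rudder = pid.update(setpoint, heading)
    heading += dt * (0.5 * rudder - 0.1 * heading)   # crude first-order vessel model
print(round(heading, 2))                              # heading approaches the setpoint
```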