This PDF file contains the front matter associated with SPIE Proceedings Volume 10648 including the Title Page, Copyright information, Table of Contents, Introduction, and Conference Committee listing.
Object detection in aerial imagery is crucial for many applications in the civil and military domains. In recent years, deep learning based object detection frameworks have significantly outperformed conventional approaches based on hand-crafted features on several datasets. However, these detection frameworks are generally designed and optimized for common benchmark datasets, which differ considerably from aerial imagery, especially in object sizes. As already demonstrated for Faster R-CNN, several adaptations are necessary to account for these differences. In this work, we adapt several state-of-the-art detection frameworks, including Faster R-CNN, R-FCN, and the Single Shot MultiBox Detector (SSD), to aerial imagery. We discuss in detail the adaptations that most improve the detection accuracy of all frameworks. As the output of deeper convolutional layers comprises more semantic information, these layers are generally used in detection frameworks as feature maps to locate and classify objects. However, the resolution of these feature maps is insufficient for handling small object instances, which results in inaccurate localization or incorrect classification of small objects. Furthermore, state-of-the-art detection frameworks perform bounding box regression to predict the exact object location, using so-called anchor or default boxes as references. We demonstrate how an appropriate choice of anchor box sizes can considerably improve detection performance. We further evaluate the impact of the performed adaptations on two publicly available datasets to account for various ground sampling distances and differing backgrounds. The presented adaptations can serve as a guideline for further datasets or detection frameworks.
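The anchor-box adaptation described above can be illustrated with a minimal sketch. The helper and the specific scale/ratio values below are assumptions for illustration, not the paper's actual configuration:

```python
# Sketch: generating anchor boxes whose scales match the small object sizes
# typical of aerial imagery, rather than defaults tuned for benchmarks such
# as PASCAL VOC. Scale and ratio values are illustrative.

def make_anchors(base_scales, aspect_ratios):
    """Return (width, height) pairs for every scale/ratio combination."""
    anchors = []
    for s in base_scales:
        for r in aspect_ratios:
            w = s * (r ** 0.5)   # width grows with aspect ratio
            h = s / (r ** 0.5)   # height shrinks, keeping area ~ s^2
            anchors.append((round(w, 1), round(h, 1)))
    return anchors

# Small scales (in pixels) chosen for vehicle-sized objects in aerial images;
# a stock configuration might instead start at 128 px and miss them entirely.
aerial_anchors = make_anchors(base_scales=[16, 32, 64], aspect_ratios=[0.5, 1.0, 2.0])
```

Shrinking the smallest anchor scale is the kind of change the abstract argues is necessary when object sizes differ sharply from the benchmark data the framework was tuned on.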
Recognizing targets from infrared images is a very important task for defense systems. Recently, deep learning has become an important approach to classification problems, including target recognition. In this study, a classical machine learning approach, the support vector machine (SVM), and a deep learning approach, the convolutional neural network (CNN), are compared for target recognition in infrared images. We first apply an SVM to measure the linear separability of the classes and obtain a baseline performance. Then, the constructed CNN model is applied to the same dataset. The experimental results show that the CNN model improves overall performance by around 7.7% over the SVM on the prepared infrared image datasets.
There has been vast progress in linking semantic information across billions of web pages through the use of ontologies encoded in the Web Ontology Language (OWL) based on the Resource Description Framework (RDF). A prime example is Wikipedia, where the knowledge contained in its more than four million pages is encoded in an ontological database called DBpedia (http://wiki.dbpedia.org/). Web-based query tools can retrieve semantic information from DBpedia, encoded in interlinked ontologies, that can be accessed using natural language. This paper shows how this vast context can be used to automate the querying of images and other geospatial data in support of reporting changes in structures and activities. Computer vision algorithms are selected and provided with context based on natural-language requests for monitoring and analysis. The resulting reports provide semantically linked observations from images and 3D surface models.
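As a rough illustration of the kind of DBpedia query such a system might issue, the sketch below composes a SPARQL query from a structured request. The template, the helper function, and the specific class/predicate choices are assumptions for illustration, not the paper's implementation:

```python
# Sketch: composing a SPARQL query against DBpedia from a structured request.
# Class and predicate names follow the DBpedia ontology (dbo:) and resource
# (dbr:) namespaces; the template itself is illustrative.

def build_bridge_query(place, limit=10):
    """Return a SPARQL query for bridges located in a given place."""
    return f"""
    SELECT ?bridge ?label WHERE {{
      ?bridge a dbo:Bridge ;
              dbo:location dbr:{place} ;
              rdfs:label ?label .
      FILTER (lang(?label) = "en")
    }} LIMIT {limit}
    """

query = build_bridge_query("Berlin")
```

A natural-language front end would map a request like "monitor bridges in Berlin" to such a query, whose results then provide context for selecting computer vision algorithms.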
We present a performance comparison of multiplexed field-of-view imagers with conventional scanning and staring imaging systems. The results are based on emulation of the different sensing modalities using data obtained with commercially available infrared cameras. It is shown that, using a single FPA, a computational imager can provide better performance than a scanning system for finding targets over a wide field of view.
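One intuition behind this comparison is the dwell-time budget: a scanner covering a wide field of view in tiles gets only a fraction of the frame time per tile. The sketch below assumes photon-noise-limited detection (SNR growing as the square root of the dwell time); the helper and numbers are illustrative, not the paper's model:

```python
import math

# Sketch: dwell-time argument. A scanner covering the field of view in
# n_tiles sub-fields integrates each tile for only frame_time / n_tiles,
# while a staring or multiplexed imager integrates the full frame time.
# Under photon-noise-limited detection, SNR ~ sqrt(dwell time).

def relative_snr(frame_time_s, n_tiles):
    """SNR of one scanned tile relative to staring for the full frame time."""
    staring = math.sqrt(frame_time_s)
    scanning = math.sqrt(frame_time_s / n_tiles)
    return scanning / staring

# A 3x3 scan pattern pays a sqrt(9) = 3x per-tile SNR penalty.
penalty = relative_snr(frame_time_s=0.02, n_tiles=9)
```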
A test system with four cameras in the infrared and visual spectra is under development at FFI (the Norwegian Defence Research Establishment). The system can be mounted on a high-speed jet aircraft, but may also be used in a land-based version. It can be used for image acquisition as well as for the development and testing of automatic target recognition (ATR) algorithms. The sensors on board generate large amounts of data, and the scene may be rather cluttered or include anomalies (e.g. sun glare). This means we need image processing and pattern recognition algorithms that are robust, fast (real-time), and able to handle complex scenes. Algorithms based on order statistics are known to be robust and reliable. However, they are in general computationally heavy, and thus often unsuitable for real-time applications. But approximations to order statistics do exist. Median of medians is one example: an approximation to the median of a sequence is found by first dividing the sequence into subsequences, and then calculating the median (of medians) recursively. The algorithm is very efficient, with a processing time of order O(n). By utilizing such techniques for estimating image statistics, the computational challenge can be overcome. In this paper we present strategies for how approximations to order statistics can be applied to develop robust and fast algorithms for image processing, especially visualization and segmentation.
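The recursive median-of-medians scheme described above can be sketched in a few lines. This is a minimal illustration with the classic group size of 5; note the result is an approximation to the true median, not an exact value:

```python
# Sketch of the recursive median-of-medians approximation: split the
# sequence into fixed-size groups, take the exact median of each group,
# then recurse on the list of group medians. Total work is O(n).

def exact_median(seq):
    s = sorted(seq)
    return s[len(s) // 2]

def median_of_medians(seq, group=5):
    if len(seq) <= group:
        return exact_median(seq)
    medians = [exact_median(seq[i:i + group])
               for i in range(0, len(seq), group)]
    return median_of_medians(medians, group)

estimate = median_of_medians(list(range(125)))  # true median is 62
```

For image statistics the guarantee that matters is robustness: the estimate is always bracketed well away from the extremes of the data, which is what makes it usable for clutter-tolerant visualization and segmentation.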
Military ships commonly use infrared decoys to defend themselves against infrared anti-ship missile systems. For anti-ship missiles, a fast and robust infrared counter-countermeasure (IRCCM) algorithm is crucial for overall system performance. In this paper, a straightforward yet effective IRCCM method for sea-skimming missile systems is introduced. First, the common tactics for dispensing decoys are analyzed. The analysis shows that the vertical component of the optical flow may contain critical information on whether there is a decoy in the scene. This idea is supported by an additional intensity-based criterion. However, optical flow calculations create a significant computational load, so to speed up the algorithm and decrease the false alarm rate, the waterline is detected using the Hough transform and the optical flow calculations are applied only to the region above the waterline, with a margin. Since the infrared signatures of both military ships and decoys are classified, test scenarios are created using an infrared scene generator, which models the objects by considering their material properties, the atmospheric conditions, the detector type, and the terrain. The performance of the overall methodology is tested with this infrared test data.
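The region-of-interest step can be sketched as follows; the helper, the row convention, and the dummy frame are illustrative assumptions, not the paper's code:

```python
# Sketch: once a waterline row has been found (e.g. via a Hough transform),
# restrict further processing (here, optical flow) to the sub-image above
# the waterline plus a safety margin, cutting computation and false alarms.

def roi_above_waterline(frame, waterline_row, margin=10):
    """Return the sub-image above the waterline plus a margin.

    `frame` is a 2-D list of pixel rows; row 0 is the top of the image.
    """
    cutoff = min(len(frame), waterline_row + margin)
    return frame[:cutoff]

frame = [[r] * 4 for r in range(100)]          # dummy 100-row image
roi = roi_above_waterline(frame, waterline_row=60, margin=10)
```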
Target recognition is a key aspect of many applications. Rapidly maturing small sensor platforms continually require better, more agile sensor performance coupled with smaller, lighter, and faster sensor implementations. Additionally, longer-range applications necessitate more efficient use of the photons received from active illumination. We describe a potential approach to overcoming both issues based on photon counting laser radar, which performs pattern recognition using images with very few detected photo-events. Previous work using intensity images shows near-ideal pattern recognition with as few as 50 photo-detections. We investigate through simulation an extension of the prior work to 3D point cloud imagery.
One of the historical and fundamental uses of the Edgeworth and Gram-Charlier series is to “correct” a Gaussian density when it is determined that the probability density under consideration has moments that do not correspond to the Gaussian [5, 6]. There is a fundamental difficulty with these methods: if the series is truncated, the resulting approximate density is not manifestly positive. The aim of this paper is to expand a probability density in such a way that any truncation remains manifestly positive.
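One standard route to manifest positivity, which may or may not be the construction the paper pursues, is to expand the square root of the density rather than the density itself, so that every truncation is a perfect square:

```latex
% Expand sqrt(p/phi) in Hermite polynomials; any truncation is nonnegative.
\[
  p(x) \;=\; \frac{1}{Z}\,\phi(x)\Bigl[\sum_{n=0}^{N} c_n \, \mathrm{He}_n(x)\Bigr]^{2},
  \qquad
  Z \;=\; \int_{-\infty}^{\infty} \phi(x)\Bigl[\sum_{n=0}^{N} c_n \, \mathrm{He}_n(x)\Bigr]^{2}\,dx,
\]
% where phi is the standard Gaussian density and He_n are the (probabilists')
% Hermite polynomials. Truncating the sum at any N leaves p(x) >= 0, unlike
% a truncated Edgeworth or Gram-Charlier series.
```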
Local features with invariant descriptions are important for many tasks in image processing and computer vision. This paper presents a new local feature descriptor for 3D object and scene representation. The new descriptor, named 3D-SSIM, exploits an internal geometric property, the layout similarity of 3D objects, to produce an efficient feature representation. 3D-SSIM is highly distinctive, quick to compute, and shows superior robustness to noise, invariance to viewpoint changes, and tolerance to geometric distortions. We extensively evaluate the performance of the new descriptor on various datasets.
Pipeline right-of-way (ROW) monitoring and safety pre-warning is an important way to guarantee safe operation of oil/gas transportation. Any construction equipment or heavy vehicle intrusion is a potential safety hazard to the pipeline infrastructure. Therefore, we propose a novel technique that can detect and classify an intrusion on an oil/gas pipeline ROW. The detection part is based on our previous work, in which we built a robust feature set using a pyramid histogram of oriented gradients in the Fourier domain with corresponding weights. A support vector machine (SVM) with a radial basis kernel is then used to distinguish threat objects from the background. For the classification part, an object is represented by an integrated color, shape and texture (ICST) feature set, a combination of three feature extraction techniques: the HSV (hue, saturation, value) color histogram, the histogram of oriented gradients (HOG), and the local binary pattern (LBP). Two decision-making models, based on K-nearest-neighbor (KNN) and SVM classifiers, are then used for automatic object identification. On a real-world dataset, the proposed method provides promising results in identifying the objects present on the oil/gas pipeline ROW.
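Of the three ICST ingredients, the LBP operator is the easiest to show compactly. The sketch below implements the basic 8-neighbor LBP code on a toy image; the helper name, bit ordering, and image are illustrative assumptions:

```python
# Sketch: basic local binary pattern (LBP). Each of the 8 neighbors is
# compared against the center pixel; neighbors >= center set one bit of an
# 8-bit code. A histogram of these codes over a region is the LBP feature,
# which the ICST set concatenates with HSV and HOG histograms.

def lbp_code(img, r, c):
    """8-neighbor LBP code of pixel (r, c), clockwise from top-left."""
    center = img[r][c]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code

img = [[9, 9, 9],
       [1, 5, 1],
       [1, 1, 1]]
code = lbp_code(img, 1, 1)  # only the three top neighbors exceed 5
```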
Finding an object in images with orientation invariance has many applications in computer vision and pattern recognition. Template matching is a typical approach for finding objects. However, template matching requires additions and multiplications for each pixel in the template and for each pixel in the image, which involves a significant number of pixel operations. Many techniques have been investigated to reduce the computational cost, including FFT techniques, partial illumination approaches, and coarse-to-fine methods. These techniques may work for finding objects with the same orientation. However, when the object orientation in the template differs from the orientation in the image, the computational cost is prohibitive even for the latest fast template matching techniques. In this paper, by combining the ideas of moment invariants from pattern recognition, Green's theorem from physics, and Bresenham's line algorithm from computer graphics, we propose a mask-size-independent and orientation-invariant object finding technique. From theoretical analysis and experiments, we demonstrate that this new technique significantly reduces the computational cost of orientation-free object finding from O(N³M²) for the direct implementation to O(N²).
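Of the three ingredients named above, Bresenham's line algorithm is the boundary-tracing one, and it can be sketched directly. This is the standard all-octant integer form, shown here as an illustration of the building block rather than the paper's full method:

```python
# Sketch of Bresenham's line algorithm: enumerate the integer pixels of a
# line segment using only integer additions and comparisons, which is what
# makes boundary-based (Green's theorem) moment computation cheap.

def bresenham(x0, y0, x1, y1):
    points = []
    dx, dy = abs(x1 - x0), -abs(y1 - y0)
    sx = 1 if x0 < x1 else -1
    sy = 1 if y0 < y1 else -1
    err = dx + dy
    while True:
        points.append((x0, y0))
        if (x0, y0) == (x1, y1):
            break
        e2 = 2 * err
        if e2 >= dy:        # step in x
            err += dy
            x0 += sx
        if e2 <= dx:        # step in y
            err += dx
            y0 += sy
    return points

line = bresenham(0, 0, 4, 2)
```

Green's theorem converts the area integrals defining image moments into sums along the object boundary, so walking the boundary with an integer line algorithm is enough; the pixel count along the boundary, not the mask area, then drives the cost.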
This paper presents another outlook on image description, classification, and retrieval. Popular image description methods include the Histogram of Oriented Gradients (HoG), Speeded-Up Robust Features (SURF), and the Scale Invariant Feature Transform (SIFT). While SURF and SIFT both use "interest points" to describe an image, HoG uses all of the points in the image. One of the goals of this paper is to improve HoG by creating a feature vector containing more information about the image. The proposed description method is called the Histogram of Second-order Oriented Gradients (HSoG), and it is shown by supervised learning to perform better than HoG on a dataset comprising airplanes, cars, and motorbikes. The second goal is to tackle image clustering as an aid to unsupervised learning, and this paper explores a method called Localized Clustering with a comparison to K-Means. The localized clustering approach does not require the number of clusters as an input; instead it returns what it determines the number of clusters should be. Finally, the retrieval process presented involves training a linear SVM with known labels (supervised) to evaluate the effectiveness of HoG vs. HSoG, and HSoG outperforms HoG.
Spotting a shooter from a drone has been the subject of great interest lately due to its many applications in the fields of defense, security, and law enforcement. Using a drone can be an effective way to detect potential threats in many real-life scenarios. Nevertheless, acoustic signals recorded from a drone usually exhibit a very low SNR, mainly due to the distance to the source and the proximity of the sensors to the propellers. This is a serious limiting factor and, therefore, the use of signal enhancement techniques is required. This work addresses the problem of determining the direction of arrival (DoA) of the muzzle blast, captured using a planar microphone array mounted on a commercial DJI PHANTOM 4 drone in flight. This new shooter localization method relies solely on detecting and estimating the DoA of the muzzle blast. However, the typical low SNR in this scenario requires the use of preprocessing techniques, such as signal clipping and median filtering, to enhance the signal of interest (the muzzle blast). In addition, we employ a recently introduced improved data selection DoA estimation method suitable for gunshot signals recorded from a low-to-medium-altitude mobile aerial platform. The positive results achieved indicate that this approach is effective and of practical interest.
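The geometric core of DoA estimation from a microphone pair can be sketched in a few lines. This single-pair, far-field relation is an illustration only; the paper's planar-array method is more involved, and the helper name and numbers are assumptions:

```python
import math

# Sketch: far-field direction of arrival from the time difference of
# arrival (TDOA) of one microphone pair. For spacing d and sound speed c,
# sin(theta) = c * dt / d; a planar array combines several such pairs.

def doa_from_tdoa(dt_s, spacing_m, c=343.0):
    """Angle (degrees) off broadside from a single-pair TDOA."""
    s = max(-1.0, min(1.0, c * dt_s / spacing_m))  # clamp against noise
    return math.degrees(math.asin(s))

# A 0.5 ms delay over a 0.343 m baseline gives 30 degrees off broadside.
angle = doa_from_tdoa(dt_s=0.0005, spacing_m=0.343)
```

The clamp matters in practice: at the low SNRs the abstract describes, noisy TDOA estimates can push c·dt/d outside [-1, 1], which is one reason the preprocessing and data-selection steps precede the DoA estimate.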
In this paper, we propose a new Automatic Target Recognition (ATR) system, based on a Deep Convolutional Neural Network (DCNN), to detect the targets in Forward Looking Infrared (FLIR) scenes and recognize their classes. In our proposed ATR framework, a fully convolutional network (FCN) is trained to map the input FLIR imagery to a correspondingly sized target score map at a fixed stride. The potential targets are identified by applying a threshold to the target score map. Finally, regions centered at these target points are fed to a DCNN that classifies them into the different target types while rejecting false alarms. The proposed architecture achieves significantly better performance than the state-of-the-art methods on two large FLIR image databases.
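The candidate-selection step between the two networks can be sketched as a simple threshold on the score map. The helper and the toy map below are illustrative assumptions, not the paper's code:

```python
# Sketch: threshold the FCN's target score map and keep the coordinates of
# super-threshold cells; each (row, col) maps back to an input-image region
# (at the fixed stride) that is then passed to the classification DCNN.

def candidate_targets(score_map, threshold):
    return [(r, c)
            for r, row in enumerate(score_map)
            for c, score in enumerate(row)
            if score >= threshold]

score_map = [[0.1, 0.2, 0.1],
             [0.1, 0.9, 0.3],
             [0.7, 0.1, 0.1]]
hits = candidate_targets(score_map, threshold=0.5)
```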
Recent work has seen a surge of sparse representation based classification (SRC) methods applied to automatic target recognition problems. While traditional SRC approaches used the l0 or l1 norm to quantify sparsity, spike and slab priors have established themselves as the gold standard for imposing general tunable sparse structures on vectors. In this work, we employ collaborative spike and slab priors, which can be applied to matrices to encourage sparsity, for the problem of multi-view ATR. That is, target images captured from multiple views are expanded in terms of a training dictionary multiplied by a coefficient matrix. Ideally, for a test image set comprising multiple views of a target, the coefficients corresponding to its identifying class are expected to be active while all others are zero, i.e. the coefficient matrix is naturally sparse. We develop a new approach to solve the optimization problem that estimates the sparse coefficient matrix jointly with the sparsity-inducing parameters in the collaborative prior. ATR problems are investigated on the mid-wave infrared (MWIR) database made available by the US Army Night Vision and Electronic Sensors Directorate, which has a rich collection of views. Experimental results show that the proposed joint prior and coefficient estimation method (JPCEM) can (1) improve accuracy when multiple views rather than a single one are invoked, and (2) outperform state-of-the-art alternatives, particularly when training imagery is limited.
Adaptive or sequential compressive sensing has received considerable attention recently. While some researchers argue that there are fundamental limits to adaptive sensing that prevent it from outperforming non-adaptive compressive sensing, others have shown that adaptive sensing of sparse signals may help speed up applications such as target detection and/or classification, and might even expedite the signal reconstruction process with fewer compressive measurements. This paper examines the benefits of adaptive compressive sensing of radar target backscatter, with emphasis on target classification (accomplished using sequential hypothesis testing), and compares the results to non-adaptive compressive sensing of noisy radar signatures.
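The sequential hypothesis testing machinery mentioned above is classically Wald's sequential probability ratio test (SPRT). The sketch below shows the two-hypothesis Gaussian case with unit variance; the function, thresholds, and numbers are illustrative assumptions, not the paper's classifier:

```python
import math

# Sketch: Wald's sequential probability ratio test (SPRT) for two Gaussian
# hypotheses with known means and unit variance. Sampling stops as soon as
# the log-likelihood ratio crosses a threshold set by the target error
# rates alpha (false alarm) and beta (miss).

def sprt(samples, mu0, mu1, alpha=0.01, beta=0.01):
    """Return ('H0' | 'H1' | 'undecided', number of samples used)."""
    upper = math.log((1 - beta) / alpha)
    lower = math.log(beta / (1 - alpha))
    llr = 0.0
    for n, x in enumerate(samples, start=1):
        # log-likelihood ratio increment for N(mu1, 1) vs. N(mu0, 1)
        llr += (mu1 - mu0) * (x - (mu0 + mu1) / 2.0)
        if llr >= upper:
            return "H1", n
        if llr <= lower:
            return "H0", n
    return "undecided", len(samples)

decision, n = sprt([2.1, 1.9, 2.2, 2.0, 2.1], mu0=0.0, mu1=2.0)
```

The appeal in a compressive sensing context is exactly this early stopping: the test can declare a class after only as many (adaptive) measurements as the evidence requires.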
Improving the signal-to-noise ratio is one important way to improve the target detection process. Dimensionality analysis of the data and removal of uninteresting components is another effective method, especially since it does not correlate the existing data. Just as important is the decision process itself: deciding whether an anomaly in the data is a target. Even after analyzing high-dimensional data sets and their interrelated frequency contents, noise and clutter removal may not always pull the target far enough out of the data for a simple thresholding approach to detect it. In this paper, we utilize the random forest technique to improve the decision-making process in the detection of targets buried in noise.
Performance Analysis of Deep Learning-based Automatic Target Recognition