In addition to conventional camera networks, the deployment of drones allows for increased flexibility in surveillance tasks. Key components of modern analysis systems, required to quickly assess large amounts of recorded aerial video data, are the detection, tracking, and re-identification of persons. Each component is influenced by the characteristics of aerial data and must be robust against challenges such as different flight altitudes and varying acquisition views and angles. In this work, we introduce a fast and efficient framework for person search that is specifically tailored to the characteristics of aerial data recorded by drones. In contrast to most works on person search and re-identification, we incorporate a tracking technique to add relevant context about persons' movements to the retrieval results. We focus on the three pipeline stages of person detection, tracking, and re-identification, both individually and in terms of the interplay between the components. To this end, we adapt current state-of-the-art detection approaches to the specific characteristics of aerial data and reduce inference time through several modifications. Next, we apply a deep learning-based tracking approach, namely Deep SORT, to generate person tracks from the detections. For the re-identification stage, we employ a lightweight re-identification model that generates features for both tracking and re-identification. To demonstrate the suitability of our proposed video analysis pipeline, we evaluate each component as well as their interplay on the P-DESTRE dataset.
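The idea of a single lightweight embedding model feeding appearance features to both the tracking and re-identification stages can be illustrated with a minimal sketch. The greedy cosine-distance association below is a strong simplification of Deep SORT's appearance matching (which additionally uses Kalman-filtered motion gating and Hungarian assignment); all class and parameter names here are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def cosine_dist(a, b):
    # Cosine distance between two feature vectors (L2-normalised first).
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    return 1.0 - float(np.dot(a, b))

class SimpleTracker:
    """Greedy appearance-only association; a toy stand-in for Deep SORT."""

    def __init__(self, max_dist=0.3):
        self.max_dist = max_dist  # illustrative matching threshold
        self.tracks = {}          # track_id -> last appearance feature
        self.next_id = 0

    def update(self, features):
        """Assign each detection feature (from the shared re-id model)
        to the closest existing track, or open a new track.

        Note: a greedy pass may assign two detections to one track;
        Deep SORT resolves this with a global assignment step."""
        assignments = []
        for feat in features:
            best_id, best_d = None, self.max_dist
            for tid, tfeat in self.tracks.items():
                d = cosine_dist(feat, tfeat)
                if d < best_d:
                    best_id, best_d = tid, d
            if best_id is None:           # no track close enough
                best_id = self.next_id
                self.next_id += 1
            self.tracks[best_id] = feat   # refresh track appearance
            assignments.append(best_id)
        return assignments
```

Because the same embedding serves both stages, the features stored per track can later be compared against a query person's feature for retrieval, which is the interplay the pipeline exploits.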
As the technological advances of the last decade have led to increased performance and availability of video cameras, along with the rise of deep learning-based image recognition, the task of person re-identification has almost exclusively been studied on datasets with ground-based, static camera settings. Yet re-identification applications on aerial data captured by Unmanned Aerial Vehicles (UAVs) can be particularly valuable for monitoring public events, border protection, and law enforcement. For a long time, no publicly available UAV-based re-identification datasets of sufficient size for modern machine learning techniques existed, which prevented research in this area. Recently, however, two new large-scale UAV-based datasets have been released. We examine the re-identification performance of common neural networks on the newly released PRAI-1581 and P-DESTRE aerial datasets, identify UAV-related error sources, and evaluate data augmentation strategies to increase robustness against them. Common error sources we identify in these UAV-based datasets include occlusions, camera angles, bad poses, and low resolutions. Furthermore, data augmentation techniques such as rotating images during training prove to be a promising aid for training on UAV-based data with varying camera angles. By carefully selecting robust networks and choosing adequate training parameters and data augmentation strategies, we surpass the original re-identification accuracies published by the authors of the PRAI-1581 and P-DESTRE datasets, respectively.
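The rotation augmentation highlighted above, intended to increase robustness to the varying camera angles of UAV footage, can be sketched as follows. This is a coarse NumPy-only illustration using multiples of 90 degrees; the function name and probability parameter are illustrative assumptions, and real training pipelines typically sample arbitrary angles with interpolation via a library such as torchvision.

```python
import numpy as np

def random_rotate(image, rng=None, p=0.5):
    """With probability p, rotate the image array by a random
    multiple of 90 degrees (a simple stand-in for the rotation
    augmentation discussed in the text)."""
    rng = rng if rng is not None else np.random.default_rng()
    if rng.random() < p:
        k = int(rng.integers(1, 4))   # 1, 2, or 3 quarter-turns
        image = np.rot90(image, k=k)  # rotate in the image plane
    return image
```

Applied on the fly during training, such a transform exposes the network to person crops at orientations it would otherwise only see at certain drone headings.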