Intelligent military systems require perception capabilities that are flexible, dynamic, and robust to unstructured environments and novel situations. However, current state-of-the-art algorithms are based on deep learning and therefore demand large amounts of data, along with a proportionally large human effort for collection and annotation. To help address this problem, we define a method for comparing 3D environment reconstructions without ground truth that exploits available reflexive information, and we use this method to evaluate existing RGBD mapping algorithms with the goal of generating a large, fully annotated data set for visual learning tasks. In addition, we describe algorithms and software that support rapid manual annotation of these reconstructed 3D environments for a variety of vision tasks. Our results show that we can use existing data sets as well as synthetic data to bootstrap tools that let us quickly and efficiently label larger data sets without ground truth, maximizing the value of human effort without resorting to crowdsourcing.