Unmanned platforms are ubiquitous in surveillance and reconnaissance operations. As the resolution and fidelity of the video sensors on these platforms increase, so do the bandwidth required to deliver the data to the analyst and the analyst workload needed to interpret it. This creates a growing need to perform video processing onboard the sensor platform, so that only critical information is transmitted to the analysts, reducing both data bandwidth requirements and analyst workload.
In this paper, we present a system for object recognition in video that employs embedded hardware and CPUs and can be implemented onboard an autonomous platform to provide real-time information extraction. Called NEOVUS (NEurOmorphic Understanding of Scenes), our system draws inspiration from models of mammalian visual processing and is implemented in state-of-the-art commercial off-the-shelf (COTS) hardware to achieve low size, weight, and power while maintaining real-time processing at reasonable cost. We use visual attention methods, based on motion and form, to detect stationary and moving objects from a moving platform, and we employ multi-scale convolutional neural networks, which have been mapped to FPGA hardware, for classification. Evaluation of our system has shown that it can achieve real-time speeds of thirty frames per second on videos of up to five-megapixel resolution. Our system shows a reduction in power of three to four orders of magnitude compared to state-of-the-art computer vision algorithms while reducing the communications bandwidth required for evaluation.
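To give a rough sense of the bandwidth argument, the following back-of-the-envelope sketch compares the uncompressed data rate of a five-megapixel, thirty-frames-per-second stream with the rate needed to send only per-frame detection records. The frame size and frame rate are the figures quoted above; the 24-bit pixel depth, the ten detections per frame, and the 32-byte record size are illustrative assumptions, not measurements from NEOVUS, and the function names are hypothetical.

```python
def raw_video_rate_bps(megapixels: int, fps: int, bits_per_pixel: int) -> int:
    """Uncompressed video data rate in bits per second."""
    return megapixels * 1_000_000 * fps * bits_per_pixel


def detection_rate_bps(detections_per_frame: int, bytes_per_detection: int, fps: int) -> int:
    """Data rate in bits per second when only detection records are sent."""
    return detections_per_frame * bytes_per_detection * 8 * fps


# Figures from the text: 5 megapixels at 30 frames per second.
# Assumption: 24 bits per pixel (uncompressed RGB).
raw_bps = raw_video_rate_bps(megapixels=5, fps=30, bits_per_pixel=24)

# Assumptions: 10 detections per frame, 32 bytes per record
# (for example a bounding box, class label, and confidence score).
det_bps = detection_rate_bps(detections_per_frame=10, bytes_per_detection=32, fps=30)

print(f"raw video:       {raw_bps / 1e9:.1f} Gbit/s")
print(f"detections only: {det_bps / 1e3:.1f} kbit/s")
print(f"ratio:           {raw_bps / det_bps:.0f}x")
```

Under these assumptions the raw stream is several orders of magnitude larger than the detection stream. In practice the comparison would be against compressed video, so the actual saving depends on the codec and on how much image context is transmitted alongside each detection.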