Human object articulation for CCTV video forensics

I. Zafar; M. Fraz; Eran A. Edirisinghe

doi:10.1117/12.2004956

19 March 2013

Proceedings Volume 8663, Video Surveillance and Transportation Imaging Applications; 86630Z (2013) https://doi.org/10.1117/12.2004956
Event: IS&T/SPIE Electronic Imaging, 2013, Burlingame, California, United States

Abstract

In this paper we present a system that automatically annotates (articulates) humans passing through the view of a surveillance camera, in a way that is useful to a crime scene witness describing a person or criminal. Each human is annotated on the basis of two appearance features: (1) the primary colors of the clothes in the head, body and legs regions, and (2) the presence of text or a logo on the clothes. Annotation takes place after a robust foreground extraction, based on a modified Gaussian Mixture Model approach, and the detection of humans in the segmented foreground images. The proposed pipeline includes a preprocessing stage in which we improve the color quality of the images using a basic color constancy algorithm and then refine the results with a proposed post-processing method. The results show a significant improvement in the illumination of the video frames. To annotate the color of human clothes, we apply 3D histogram analysis (with respect to Hue, Saturation and Value) to HSV-converted image regions of human body parts, together with extrema detection and thresholding, to decide the dominant color of each region. To detect text or logos on the clothes, the second feature used to articulate humans, we begin by extracting connected components of enhanced horizontal, vertical and diagonal edges in the frames. These candidate regions are classified as text or non-text on the basis of their Local Energy based Shape Histogram (LESH) features, with KL divergence as the classification criterion. To detect humans, we propose a novel technique that combines Histogram of Oriented Gradients (HOG) features and Contourlet transform based Local Binary Patterns (LBP), with AdaBoost as the classifier. Initial screening of foreground objects is performed using HOG features. To further eliminate false positives caused by background noise and to improve the results, we apply Contourlet-LBP feature extraction to the images: we extract the LBP feature descriptor from the Contourlet-transformed high-pass sub-images of the vertical and diagonal directional bands. In the final stage, the extracted Contourlet-LBP descriptors are passed to AdaBoost for classification. The proposed framework performed fairly well when tested on a CCTV test dataset.
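
The dominant-color step mentioned in the abstract (a 3D histogram over Hue, Saturation and Value, followed by peak detection and thresholding) might be sketched roughly as below. This is an illustrative sketch only, not the authors' implementation: the bin counts, the `MIN_PEAK_FRACTION` threshold, the color-name table and all function names are assumptions, and the histogram peak is taken as the single global maximum rather than through a fuller extrema analysis.

```python
import colorsys
from collections import Counter

# Hypothetical bin counts and threshold; the abstract does not state these values.
H_BINS, S_BINS, V_BINS = 12, 4, 4
MIN_PEAK_FRACTION = 0.2

# Illustrative hue-bin to color-name table (one entry per 30-degree hue bin).
HUE_NAMES = ["red", "orange", "yellow", "green", "green", "cyan",
             "cyan", "blue", "blue", "purple", "magenta", "red"]


def hsv_histogram(pixels):
    """Build a sparse 3D histogram over (hue, saturation, value) bins.

    `pixels` is an iterable of (r, g, b) tuples with channels in 0..255.
    """
    hist = Counter()
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        # min(...) keeps a component of exactly 1.0 inside the last bin.
        key = (
            min(int(h * H_BINS), H_BINS - 1),
            min(int(s * S_BINS), S_BINS - 1),
            min(int(v * V_BINS), V_BINS - 1),
        )
        hist[key] += 1
    return hist


def dominant_bin(pixels):
    """Return the most populated bin, or None if the peak is too weak."""
    pixels = list(pixels)
    if not pixels:
        return None
    key, count = hsv_histogram(pixels).most_common(1)[0]
    # Thresholding: reject regions with no clearly dominant color.
    if count / len(pixels) < MIN_PEAK_FRACTION:
        return None
    return key


def name_bin(key):
    """Map a histogram bin to a coarse color name (assumed mapping)."""
    h_bin, s_bin, v_bin = key
    if v_bin == 0:
        return "black"
    if s_bin == 0:
        return "white" if v_bin == V_BINS - 1 else "gray"
    return HUE_NAMES[h_bin]


def dominant_color(pixels):
    """Name the dominant color of a body-part region, or None if unclear."""
    key = dominant_bin(pixels)
    return None if key is None else name_bin(key)
```

For a region given as a list of RGB tuples (for example, the pixels of a detected torso), `dominant_color(region)` returns a coarse color name, or `None` when no single bin holds at least the assumed fraction of pixels. In a real pipeline the region would come from the segmented foreground, and the bin layout would likely need tuning for the camera and lighting.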

Citation

I. Zafar, M. Fraz, and Eran A. Edirisinghe "Human object articulation for CCTV video forensics", Proc. SPIE 8663, Video Surveillance and Transportation Imaging Applications, 86630Z (19 March 2013); https://doi.org/10.1117/12.2004956
