Paper
22 March 2019 Predicting visual saliency via a dilated inception module-based model
Sheng Yang, Weisi Lin
Proceedings Volume 11049, International Workshop on Advanced Image Technology (IWAIT) 2019; 110491D (2019) https://doi.org/10.1117/12.2521507
Event: 2019 Joint International Workshop on Advanced Image Technology (IWAIT) and International Forum on Medical Imaging in Asia (IFMIA), 2019, Singapore, Singapore
Abstract
With the advent of deep convolutional neural networks (DCNN), improvements in visual saliency prediction research have been impressive. Despite this, further improvement still requires fully incorporating multi-scale saliency-influential factors into current deep saliency frameworks. However, existing approaches to capturing multi-scale contextual features suffer from either heavy computation or limited performance gain. To overcome this, a lightweight yet powerful module for fully exploiting multi-scale contextual features is desired. In this paper, we propose a DCNN-based visual saliency prediction model to approach this goal. Our model is inspired by GoogLeNet, which uses the inception module to capture multi-scale contextual features at various receptive fields. Specifically, we revise the original inception module to have a more powerful multi-scale feature extraction capacity and a lower computational load by replacing the original standard convolutions with dilated convolutions. The whole model is trained end-to-end and is efficient enough to achieve real-time performance. Experimental results on several challenging saliency benchmark datasets, including SALICON, MIT1003, and MIT300, demonstrate that our proposed saliency model can achieve state-of-the-art performance with competitive inference time.
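The trade-off behind replacing standard convolutions with dilated ones can be illustrated with a little arithmetic. The sketch below is not taken from the paper: the 3x3 kernel, the dilation rates (1, 2, 4), the 64 input channels, and all function names are assumptions chosen only for illustration. It relies on the standard relation that a k x k kernel with dilation d spans k + (k - 1)(d - 1) pixels per side while still holding only k x k weights per input channel.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Side length of the region spanned by one dilated kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def weights_per_filter(kernel_size: int, in_channels: int) -> int:
    """Number of weights in one filter (bias excluded)."""
    return kernel_size * kernel_size * in_channels


def compare_branches(in_channels: int = 64, dilations=(1, 2, 4)):
    """Compare a dilated 3x3 branch with a standard kernel of equal extent.

    All values here are hypothetical; the paper's actual branch
    configuration is not given in the abstract.
    """
    rows = []
    for d in dilations:
        extent = effective_kernel_size(3, d)
        dilated = weights_per_filter(3, in_channels)        # always 3x3 weights
        standard = weights_per_filter(extent, in_channels)  # dense kernel, same span
        rows.append((d, extent, dilated, standard))
    return rows


if __name__ == "__main__":
    for d, extent, dilated, standard in compare_branches():
        print(f"dilation={d}: spans {extent}x{extent}, "
              f"dilated weights={dilated}, standard weights={standard}")
```

Under these assumed settings, every dilated branch keeps the weight count of a 3x3 filter while its spatial extent grows with the dilation rate, which is the general reason dilated branches can widen the receptive field without the cost of larger dense kernels.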
© (2019) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Sheng Yang and Weisi Lin "Predicting visual saliency via a dilated inception module-based model", Proc. SPIE 11049, International Workshop on Advanced Image Technology (IWAIT) 2019, 110491D (22 March 2019); https://doi.org/10.1117/12.2521507
CITATIONS
Cited by 1 scholarly publication.
KEYWORDS
Visual process modeling
Feature extraction
Image processing
Convolutional neural networks