Current research on multi-label image classification mainly exploits the correlation between labels to improve classification accuracy. In existing methods, however, label correlation is computed from statistical information about the data. Such correlation is global and dataset-dependent, so it does not suit every sample. In addition, the feature information of small objects is easily lost during image feature extraction, which lowers classification accuracy for small objects. To address these problems, this paper proposes a multi-label image classification model based on multi-scale semantic attention and a graph attention network. [...] vector, followed by feature fusion to enhance the feature information of small objects. The self-attention mechanism in the graph attention module is then used to adaptively mine the correlation between categories in the image, and an attention regularization loss is proposed. On the two public datasets VOC 2007 and MS-COCO 2014, the model reaches an mAP of 95.5% and 83.4%, respectively, and outperforms existing state-of-the-art methods on most metrics.
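To illustrate why statistically derived label correlation is global, the sketch below estimates a conditional co-occurrence matrix from a binary label matrix. This is one common way such statistics are computed and is an assumption here, not necessarily the exact procedure used by the methods the abstract refers to; the function name, class names, and toy data are hypothetical.

```python
from typing import List


def cooccurrence_conditional_probs(labels: List[List[int]]) -> List[List[float]]:
    """Estimate P(label j present | label i present) from a binary label matrix.

    labels[n][c] is 1 if sample n carries class c, else 0.
    Assumption: correlation is taken to be this conditional co-occurrence frequency.
    """
    num_classes = len(labels[0])
    # counts[i][j]: number of samples in which labels i and j appear together.
    counts = [[0] * num_classes for _ in range(num_classes)]
    for row in labels:
        present = [c for c, flag in enumerate(row) if flag]
        for i in present:
            for j in present:
                counts[i][j] += 1

    probs = [[0.0] * num_classes for _ in range(num_classes)]
    for i in range(num_classes):
        occurrences = counts[i][i]  # number of samples that contain label i
        if occurrences == 0:
            continue  # label never appears; leave its row at zero
        for j in range(num_classes):
            probs[i][j] = counts[i][j] / occurrences
    return probs


# Toy data: columns are hypothetical classes (person, bicycle, dog).
toy_labels = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 0],
    [0, 0, 1],
]
correlation = cooccurrence_conditional_probs(toy_labels)
```

The resulting matrix is a single dataset-level estimate shared by every image. In the toy data, the second sample contains a person but no bicycle, yet the matrix still assigns that pair a fixed correlation. That is the limitation that motivates mining correlations adaptively per image.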