Paper
7 August 2024
Cross modal sentiment analysis model based on modal representation learning
Jianguo Bai, Hai Yang, Cheng Feng, Shuxian Wang, Xue Li
Proceedings Volume 13229, Seventh International Conference on Advanced Electronic Materials, Computers, and Software Engineering (AEMCSE 2024); 132292O (2024) https://doi.org/10.1117/12.3038252
Event: Seventh International Conference on Advanced Electronic Materials, Computers, and Software Engineering (AEMCSE 2024), 2024, Nanchang, China
Abstract
With the rapid development of the Internet and multimedia technology, people increasingly express their feelings and views through video and other media. The key to sentiment analysis of user videos on social media is to fully exploit the embedded multimodal features, such as text, audio, and facial expressions, and to build efficient deep learning models on them. Traditional approaches that simply fuse feature vectors or combine the predictions of multiple models cannot effectively extract the intra-modal characteristics and inter-modal commonalities of multimodal data, resulting in unsatisfactory sentiment analysis accuracy. To address these issues, this article takes monologue videos posted by users on social media as its research object and proposes CMRL, a cross-modal sentiment analysis model based on modal representation learning. By imposing constraints on both the independent modal module and the fused modal module, the fused modal module can fully account for the intrinsic characteristics of each modality. To enable the model to fully learn intra-modal characteristics, a loss function based on the Pearson correlation coefficient is established between the independent modal module's sentiment analysis results for the speech, text, and facial expression modalities and the sentiment analysis results of the fused modal module. To prevent intra-modal features from being lost or confused after feature fusion, the speech, text, and facial expression features extracted by the Transformer in the independent modal module are fused, and a loss function based on the Spearman correlation coefficient is established between these fused features and the fused features of the fused modal module.
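The abstract gives no implementation details, so the sketch below is only a rough illustration of the two consistency constraints it describes, written in PyTorch with hypothetical function names (pearson_loss, spearman_loss) and tensor shapes. Note that Spearman's rho requires a rank transform, which is not differentiable; the paper presumably trains with some surrogate, and the hard argsort-based ranking here is an assumption used purely to show the quantity being constrained.

```python
import torch

def pearson_loss(pred_uni: torch.Tensor, pred_fused: torch.Tensor) -> torch.Tensor:
    """1 - Pearson correlation between unimodal and fused sentiment scores.

    Both inputs are 1-D tensors of per-sample predictions. Minimizing this
    loss pushes the fused module to agree with the unimodal module's
    intra-modal view, as the abstract's first constraint describes.
    """
    x = pred_uni - pred_uni.mean()
    y = pred_fused - pred_fused.mean()
    r = (x * y).sum() / (x.norm() * y.norm() + 1e-8)
    return 1.0 - r

def spearman_loss(feat_uni: torch.Tensor, feat_fused: torch.Tensor) -> torch.Tensor:
    """1 - Spearman correlation between two flattened feature vectors.

    Spearman's rho is Pearson correlation computed on ranks. The double
    argsort used for ranking is not differentiable, so a real training
    setup would substitute a soft-ranking approximation (an assumption,
    not something the abstract specifies).
    """
    def ranks(v: torch.Tensor) -> torch.Tensor:
        return v.argsort().argsort().float()  # rank of each element
    return pearson_loss(ranks(feat_uni.flatten()), ranks(feat_fused.flatten()))

# Hypothetical usage: a batch of 8 sentiment scores and 128-d features.
uni_scores, fused_scores = torch.randn(8), torch.randn(8)
uni_feats, fused_feats = torch.randn(128), torch.randn(128)
print(pearson_loss(uni_scores, fused_scores))
print(spearman_loss(uni_feats, fused_feats))
```

In this reading, the Pearson term constrains the two modules' outputs while the Spearman term constrains their internal fused representations; how the two losses are weighted against the main sentiment objective is not stated in the abstract.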
(2024) Published by SPIE. Downloading of the abstract is permitted for personal use only.
Jianguo Bai, Hai Yang, Cheng Feng, Shuxian Wang, and Xue Li "Cross modal sentiment analysis model based on modal representation learning", Proc. SPIE 13229, Seventh International Conference on Advanced Electronic Materials, Computers, and Software Engineering (AEMCSE 2024), 132292O (7 August 2024); https://doi.org/10.1117/12.3038252
KEYWORDS
Data modeling, Feature extraction, Transformers, Feature fusion, Performance modeling, Image fusion, Correlation coefficients