Paper
10 October 2023 Speech visualization for hearing impairments: from speech to colour image sequence combining text and meaning
Xiaoyan Shi, Xu Wang
Proceedings Volume 12799, Third International Conference on Advanced Algorithms and Signal Image Processing (AASIP 2023); 127995G (2023) https://doi.org/10.1117/12.3006706
Event: 3rd International Conference on Advanced Algorithms and Signal Image Processing (AASIP 2023), 2023, Kuala Lumpur, Malaysia
Abstract
Hearing-impaired children have difficulty learning from and understanding active speakers, in particular when pronunciation is not related to character form. To improve communication between hearing-impaired people and active speakers in this situation, this paper presents a new visible speech mode without speech recognition, which uses a color image sequence combining text and meaning to represent the pronunciation of a tonal word. First, a high-dimensional combined feature, consisting of tone and Mel-frequency cepstral coefficient (MFCC) features extracted from the speech signal, is mapped to a low-dimensional feature by an audio encoder. Second, based on the outputs of an image encoder, the low-dimensional feature is decoded into a tone image, a word-no-tone image, a pictograph image, and a meaning image by a word decoder and an image generator. Finally, an image sequence containing the four types of images is synthesized and overlaid with colors determined by the formant frequencies. In this way, the text, semantics, and sound characteristics of the pronunciation of a tonal word are integrated into a color image sequence. In our visible speech experiments and perceptual tests, the recognition accuracy and subjective mean scores are all above 87%. The results show that our speech-to-image sequence mode is effective and feasible.
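As a rough illustration of two steps named in the abstract (combining tone and MFCC features into one vector that is reduced to a low-dimensional feature, and deriving a color from formant frequencies), the following Python sketch may help. It is not the authors' implementation: the formant ranges, the 4-dimensional tone vector, the 13 MFCC values, and the functions `formants_to_rgb`, `combine_features`, and `linear_encode` are all hypothetical, and the fixed random linear projection only stands in for the trained audio encoder, whose architecture the abstract does not describe.

```python
import random

# Assumed formant ranges in Hz for F1, F2, F3 (illustrative values, not from the paper).
FORMANT_RANGES = [(200.0, 1000.0), (600.0, 3000.0), (1500.0, 4000.0)]


def formants_to_rgb(f1, f2, f3):
    """Map three formant frequencies to an (R, G, B) tuple of ints in 0..255.

    Each formant is clamped to its assumed range and scaled linearly.
    This mapping is a hypothetical choice, not the paper's color scheme.
    """
    rgb = []
    for value, (lo, hi) in zip((f1, f2, f3), FORMANT_RANGES):
        clipped = min(max(value, lo), hi)  # clamp into the assumed range
        rgb.append(round(255 * (clipped - lo) / (hi - lo)))
    return tuple(rgb)


def combine_features(tone, mfcc):
    """Concatenate a tone feature vector with an MFCC vector."""
    return list(tone) + list(mfcc)


def linear_encode(feature, out_dim, seed=0):
    """Stand-in for the audio encoder: a fixed random linear projection.

    A trained encoder would replace this; the seed only makes the
    projection reproducible for the example.
    """
    rng = random.Random(seed)
    weights = [[rng.gauss(0.0, 1.0) for _ in feature] for _ in range(out_dim)]
    return [sum(w * x for w, x in zip(row, feature)) for row in weights]


# Usage with made-up inputs: a 4-dimensional tone vector and 13 MFCC values.
combined = combine_features([1.0, 0.0, 0.0, 0.0], [0.5] * 13)
low_dim = linear_encode(combined, out_dim=4)
color = formants_to_rgb(700.0, 1200.0, 2600.0)
```

In this sketch the color depends only on the three formant values, so two utterances with similar formants would receive similar overlay colors; whether the paper's mapping behaves this way is not stated in the abstract.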
(2023) Published by SPIE. Downloading of the abstract is permitted for personal use only.
Xiaoyan Shi and Xu Wang "Speech visualization for hearing impairments: from speech to colour image sequence combining text and meaning", Proc. SPIE 12799, Third International Conference on Advanced Algorithms and Signal Image Processing (AASIP 2023), 127995G (10 October 2023); https://doi.org/10.1117/12.3006706
KEYWORDS
Visualization
Image visualization
Color image encoding
Acoustics
Image compression
Binary data
Semantics