20 December 2023 Neurorecognition visualization in multitask end-to-end speech
Proceedings Volume 12985, Optical Fibers and Their Applications 2023; 129850G (2023) https://doi.org/10.1117/12.3022727
Event: Optical Fibers and Their Applications 2023, 2023, Lublin, Poland
Abstract
Speech-processing technologies for different languages are now used successfully in mobile and stationary devices. Kazakh is considered a low-resource language, which poses various challenges for conventional speech recognition methods. This paper presents a multitask model that performs speech recognition, dialect identification, and speaker identification concurrently within a single end-to-end framework, so that the three tasks are trained in one model. The multitask recognition system is based on the WaveNet-CTC model. Experiments show that, for the task considered, the end-to-end multitask model performs better than the other models.
(2023) Published by SPIE. Downloading of the abstract is permitted for personal use only.
Orken Mamyrbayev, Sergii Pavlov, Akbayan Bekarystankyzy, Dina Oralbekova, Bagashar Zhumazhanov, Larysa Azarova, Dinara Mussayeva, Tetiana Koval, Konrad Gromaszek, Nurdaulet Issimov, and Kadrzhan Shiyapov "Neurorecognition visualization in multitask end-to-end speech", Proc. SPIE 12985, Optical Fibers and Their Applications 2023, 129850G (20 December 2023); https://doi.org/10.1117/12.3022727
KEYWORDS
Education and training
Speech recognition
Speaker recognition
Data modeling
Performance modeling
Systems modeling
Machine learning
