Convolution-augmented external attention model for time domain speech separation

Yuning Zhang; He Yan; Linshan Du; Mengxue Li

doi:10.1117/12.2671718

16 March 2023

Proceedings Volume 12593, Second Guangdong-Hong Kong-Macao Greater Bay Area Artificial Intelligence and Big Data Forum (AIBDF 2022); 125930T (2023) https://doi.org/10.1117/12.2671718
Event: 2nd Guangdong-Hong Kong-Macao Greater Bay Area Artificial Intelligence and Big Data Forum (AIBDF 2022), 2022, Guangzhou, China

Abstract

In a time-domain speech separation network (TasNet), both the separator's ability to capture contextual and fine-grained features of the speech signal and its number of parameters directly affect separation accuracy and efficiency. This paper combines lightweight external attention with convolution and extends external attention to the channel dimension; the resulting module performs fine-grained feature extraction and models spatial-channel correlations while keeping the parameter count and computational cost low. Convolutional position coding is also used to better integrate the contextual relationships and relative position information of speech features. This module is then used as the separator in a TasNet-based encoder-decoder structure, yielding a new convolution-augmented external attention model for time-domain speech separation: ExConNet. Comparative experiments show that ExConNet achieves considerable speech separation accuracy while significantly reducing the number of model parameters and the amount of computation, and thus better meets the efficiency requirements of speech separation.
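
For readers unfamiliar with external attention, the following is a minimal sketch of the generic mechanism (two small learnable memories shared across all inputs, with double normalization), not the authors' ExConNet separator. It assumes the commonly used formulation in which the attention map is a softmax over the time axis followed by an L1 normalization over the memory slots. The names `external_attention`, `mem_k`, `mem_v` and the sizes `T`, `d`, `S` are hypothetical, and the random values stand in for learned weights.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def double_normalize(scores):
    """Softmax over the time axis (per memory slot), then L1 over the slots."""
    out = [row[:] for row in scores]
    for j in range(len(scores[0])):
        col_max = max(row[j] for row in scores)          # for numerical stability
        exps = [math.exp(row[j] - col_max) for row in scores]
        total = sum(exps)
        for i, e in enumerate(exps):
            out[i][j] = e / total
    return [[v / (sum(row) + 1e-9) for v in row] for row in out]


def external_attention(features, mem_k, mem_v):
    """features: T x d, mem_k: S x d, mem_v: S x d  ->  output: T x d."""
    mem_k_t = [list(col) for col in zip(*mem_k)]          # d x S
    attn = double_normalize(matmul(features, mem_k_t))    # T x S attention map
    return matmul(attn, mem_v)                            # T x d


# Toy sizes (assumed): T time frames, d feature channels, S memory slots.
random.seed(0)
T, d, S = 6, 4, 3
features = [[random.gauss(0, 1) for _ in range(d)] for _ in range(T)]
mem_k = [[random.gauss(0, 1) for _ in range(d)] for _ in range(S)]
mem_v = [[random.gauss(0, 1) for _ in range(d)] for _ in range(S)]

out = external_attention(features, mem_k, mem_v)
print(len(out), len(out[0]))
```

In this sketch the two memories hold 2·S·d values regardless of the sequence length T, and the cost grows linearly in T rather than quadratically as in self-attention, which is the sense in which external attention is usually called lightweight.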

Citation

Yuning Zhang, He Yan, Linshan Du, and Mengxue Li "Convolution-augmented external attention model for time domain speech separation", Proc. SPIE 12593, Second Guangdong-Hong Kong-Macao Greater Bay Area Artificial Intelligence and Big Data Forum (AIBDF 2022), 125930T (16 March 2023); https://doi.org/10.1117/12.2671718

PROCEEDINGS
6 PAGES

KEYWORDS

Convolution

Education and training

Feature extraction

Modeling

Performance modeling

Transformers

3D modeling
