Paper, 8 November 2023
CTC_BERT: a chinese text correction model with multi-scale semantic feature caption
Guoqi Wang, Ran Li, Tianyu Li, Tongtong Xie, Jingwei Cao, Lifen Jiang, Fengbo Zheng
Proceedings Volume 12923, Third International Conference on Artificial Intelligence, Virtual Reality, and Visualization (AIVRV 2023); 129231Z (2023) https://doi.org/10.1117/12.3011456
Event: 3rd International Conference on Artificial Intelligence, Virtual Reality and Visualization (AIVRV 2023), 2023, Chongqing, China
Abstract
Text correction aims to determine whether natural language text contains grammatical errors, text errors, and similar problems, and to correct the affected sentences. Previous work usually adopts byte pair encoding (BPE), which may separate semantically related Chinese characters. In addition, previous models extract only superficial semantic features and cannot capture global, deep semantic relationships. In this paper, we introduce a temporal convolutional network (TCN) to capture multi-scale semantic information and thereby improve the modeling of global semantic information. The CTC_BERT model uses a synonym masking strategy to reduce the splitting of semantically related words and adds a fully connected layer as the error detection layer. To verify the performance of the CTC_BERT model, a comparison experiment was carried out on the SIGHAN2015+Wang271K Chinese error correction dataset. The results show that the model reaches an accuracy of 81.4%, which is better than that of BERT, BART, ConvSeq2Seq and other conventional models, indicating that it effectively improves text error correction performance.
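To make the idea of multi-scale features from a TCN more concrete, the following is a minimal, dependency-free sketch of dilated 1-D convolution, the basic building block of temporal convolutional networks. It is not the authors' implementation: the abstract does not state the kernel size, dilation rates, or whether the convolutions are causal, so the function names (`dilated_conv1d`, `multi_scale_features`, `receptive_field`), the symmetric zero padding, and the dilations `(1, 2, 4)` are all assumptions chosen for illustration.

```python
from typing import List, Sequence


def dilated_conv1d(x: Sequence[float], kernel: Sequence[float], dilation: int) -> List[float]:
    """Same-length 1-D dilated convolution with zero padding.

    Written in cross-correlation form (no kernel flip), as is common in
    deep-learning libraries. Non-causal padding is an assumption here.
    """
    center = len(kernel) // 2
    out = []
    for t in range(len(x)):
        acc = 0.0
        for j, w in enumerate(kernel):
            # Taps are spaced `dilation` positions apart around position t.
            idx = t + (j - center) * dilation
            if 0 <= idx < len(x):
                acc += w * x[idx]
        out.append(acc)
    return out


def multi_scale_features(
    x: Sequence[float],
    kernel: Sequence[float],
    dilations: Sequence[int] = (1, 2, 4),  # hypothetical dilation rates
) -> List[List[float]]:
    """Return one feature sequence per dilation; each covers a wider context."""
    return [dilated_conv1d(x, kernel, d) for d in dilations]


def receptive_field(kernel_size: int, dilations: Sequence[int]) -> int:
    """Receptive field of stacked dilated convolutions, one layer per dilation."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)
```

Applied to a toy sequence with a single non-zero position, each dilation spreads that position's influence over a different span, which is one way to read "multi-scale". In a model of the kind the abstract describes, the inputs would presumably be per-character hidden vectors rather than scalars, and the kernel weights would be learned.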
(2023) Published by SPIE. Downloading of the abstract is permitted for personal use only.
Guoqi Wang, Ran Li, Tianyu Li, Tongtong Xie, Jingwei Cao, Lifen Jiang, and Fengbo Zheng "CTC_BERT: a chinese text correction model with multi-scale semantic feature caption", Proc. SPIE 12923, Third International Conference on Artificial Intelligence, Virtual Reality, and Visualization (AIVRV 2023), 129231Z (8 November 2023); https://doi.org/10.1117/12.3011456
KEYWORDS
Error control coding
Semantics
Data modeling
Performance modeling
Error analysis
Data corrections
Education and training