Multi-document reading comprehension is an important and difficult task in natural language processing. The ELECTRA pre-trained model has an input length limit and therefore cannot be applied directly to multi-document reading comprehension. To address this, this paper proposes a novel model based on ELECTRA and document sliding windows. In the model, multiple documents are split and merged through document sliding windows, a new segmentation embedding is introduced, the answer position in the documents is modelled as a learning target, and ELECTRA is trained jointly within each window. Once predictions have been obtained for every window, the results are ranked together to select the best answer. Experiments show that the model reaches a ROUGE-L of 51.28% on the multi-document reading comprehension dataset MS-MARCO, the best result currently reported.
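As a rough illustration of the document sliding window idea, the following minimal sketch concatenates tokenized documents, cuts them into overlapping windows, and then selects one answer across windows. It is not the paper's implementation: the function names `make_windows` and `pick_best`, the `window` and `stride` parameters, the use of a per-token document index as a stand-in for the segmentation embedding, and the choice of the single highest-scoring span as the ranking rule are all assumptions made for illustration.

```python
from typing import List, Tuple

Window = Tuple[int, List[str], List[int]]


def make_windows(docs: List[List[str]], window: int, stride: int) -> List[Window]:
    """Merge tokenized documents and cut them into overlapping windows.

    Returns one (offset, tokens, doc_ids) triple per window. doc_ids records
    which document each token came from (a hypothetical stand-in for the
    index fed to a segmentation embedding).
    """
    if window <= 0 or not 0 < stride <= window:
        raise ValueError("need window > 0 and 0 < stride <= window")
    tokens = [tok for doc in docs for tok in doc]
    doc_ids = [i for i, doc in enumerate(docs) for _ in doc]
    windows: List[Window] = []
    start = 0
    while True:
        end = min(start + window, len(tokens))
        windows.append((start, tokens[start:end], doc_ids[start:end]))
        if end == len(tokens):
            break
        start += stride
    return windows


def pick_best(preds: List[Tuple[float, int, int, int]]) -> Tuple[int, int, float]:
    """Choose the highest-scoring span across all windows.

    Each prediction is (score, window_offset, start_in_window, end_in_window).
    The result is (global_start, global_end, score). Taking the maximum score
    is an assumed ranking rule, not necessarily the one used in the paper.
    """
    score, offset, start, end = max(preds, key=lambda p: p[0])
    return offset + start, offset + end, score
```

In this sketch each window would be scored independently by the reader model, and `pick_best` maps the winning span back to positions in the merged token sequence using the window offset.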