Library Journal

Library Journal ›› 2025, Vol. 44 ›› Issue (415): 64-74.


Machine Reading Comprehension and Lexicon-Enhanced BERT Based Named Entity Recognition for Ancient Chinese Science and Technology Texts

Pan Jun, Xiao Mingcheng, Tao Xiangxing (School of Science, Zhejiang University of Science and Technology)

  • Online: 2025-11-15 Published: 2025-11-26

Abstract:

Recognizing named entities in ancient Chinese science and technology texts has presented a unique challenge in recent years. In response, we introduce a novel named entity recognition (NER) method, DLEBERT-MRC, which is grounded in a machine reading comprehension (MRC) framework and uses a domain lexicon-enhanced BERT model to extract contextual information from both questions and target texts. Domain-specific lexicons are introduced through a bilinear attention mechanism between transformer layers, which significantly enriches the contextual information. Softmax is employed in the decoding layer to predict the start and end positions of entities in the input text. Furthermore, we constructed an ancient Chinese science and technology NER dataset annotated with the BIOES scheme. The dataset, derived from Baidu Encyclopedia, is designed to align with the specifications of MRC tasks. Experimental evaluation verifies the effectiveness of the proposed method, while ablation experiments demonstrate the importance of each component of the model.
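The abstract states that the BIOES-annotated dataset is aligned with MRC task specifications, where a model predicts entity start and end positions given a question. The following is a minimal sketch of one way such a conversion could look, not the authors' actual pipeline. The function names, the `PER` and `BOOK` labels, the query wording, and the output field names are all hypothetical; the abstract does not specify the dataset's entity types or format.

```python
from typing import Dict, List, Tuple


def bioes_to_spans(tags: List[str]) -> List[Tuple[int, int, str]]:
    """Convert BIOES tags into (start, end, label) spans with inclusive indices."""
    spans = []
    start, label = None, None  # the currently open multi-character entity
    for i, tag in enumerate(tags):
        prefix, _, tag_label = tag.partition("-")
        if prefix == "S":
            # Single-character entity: start and end coincide.
            spans.append((i, i, tag_label))
            start, label = None, None
        elif prefix == "B":
            start, label = i, tag_label
        elif prefix == "I":
            # A stray or mismatched I- tag invalidates the open entity.
            if start is None or tag_label != label:
                start, label = None, None
        elif prefix == "E":
            if start is not None and tag_label == label:
                spans.append((start, i, tag_label))
            start, label = None, None
        else:
            # "O" (or anything unrecognized) closes any open entity.
            start, label = None, None
    return spans


def to_mrc_examples(
    chars: List[str], tags: List[str], queries: Dict[str, str]
) -> List[dict]:
    """Build one (query, context) example per entity type with 0/1 position labels."""
    spans = bioes_to_spans(tags)
    examples = []
    for label, query in queries.items():
        start_labels = [0] * len(chars)
        end_labels = [0] * len(chars)
        for s, e, span_label in spans:
            if span_label == label:
                start_labels[s] = 1
                end_labels[e] = 1
        examples.append(
            {
                "query": query,
                "context": "".join(chars),
                "start_labels": start_labels,
                "end_labels": end_labels,
            }
        )
    return examples
```

For instance, tagging the characters of 沈括著梦溪笔谈 as `B-PER E-PER O B-BOOK I-BOOK I-BOOK E-BOOK` would yield one example per hypothetical query, each marking the positions where an entity of that type begins and ends. A start/end classifier, such as the softmax decoding layer described above, could then be trained against these position labels.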

Key words:

Named entity recognition, Machine reading comprehension, Domain lexicon enhanced BERT, Ancient Chinese science and technology, Digital humanities