Library Journal

Library Journal ›› 2020, Vol. 39 ›› Issue (8): 75-81.


Automatic Construction of Thesaurus of the Anti-Japanese War History of the Republic of China Period

Du Huiping   Xue Chunxiang   

  • Online: 2020-08-25   Published: 2020-08-25

Abstract: To address problems in retrieving information from the Republic of China period, this paper
takes the history of the Anti-Japanese War as an example and proposes a scheme for the automatic
construction of a thesaurus that can improve search efficiency. With Shenbao as the main corpus, the
paper presents key technical solutions for generating the thesaurus, with examples: collecting vocabulary
in various ways, assisting experts in selecting the candidate terms to include in the thesaurus, and
identifying hierarchical (hypernym-hyponym), synonymous, and associative relationships between terms
using natural language processing techniques such as frequency statistics and co-occurrence analysis.
Finally, the paper discusses the macro-structure of the thesaurus, the scope and methods of term
collection, and its storage and publication. Automating thesaurus construction with the support of
manual judgment saves time, effort, and cost, makes the constructed thesaurus easier to maintain and
expand, and promotes the application of thesauri.
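
The frequency statistics and co-occurrence analysis mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the Dice coefficient as the association score, and the English toy sentences are all assumptions made for illustration, and none of the data comes from Shenbao.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_stats(docs, terms):
    """Count, per candidate term, how many documents contain it,
    and, per pair of terms, how many documents contain both."""
    term_freq = Counter()
    pair_freq = Counter()
    for doc in docs:
        # Sort so each pair is always stored in the same order.
        present = sorted(t for t in terms if t in doc)
        term_freq.update(present)
        pair_freq.update(combinations(present, 2))
    return term_freq, pair_freq


def dice(pair, term_freq, pair_freq):
    """Dice coefficient (an assumed scoring choice): 2*|A and B| / (|A| + |B|)."""
    a, b = pair
    return 2 * pair_freq[pair] / (term_freq[a] + term_freq[b])


# Hypothetical toy corpus and candidate terms, invented for this sketch.
docs = [
    "air raid on Chongqing reported; air defense drills continue",
    "Chongqing air defense shelters expanded after the air raid",
    "refugee relief committee formed in Shanghai",
]
terms = {"air raid", "air defense", "refugee relief"}

term_freq, pair_freq = cooccurrence_stats(docs, terms)
for pair in pair_freq:
    print(pair, round(dice(pair, term_freq, pair_freq), 2))
```

Pairs with a high score would be passed to experts as candidate associative relationships rather than accepted automatically, in line with the manual judgment step the abstract describes.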