LIBRARY JOURNAL

Application Research on Constructing a Vector Space Model of Classification based on Thesaurus for the Judgment of Relevance of Chinese Literatures

Li Qin, Yan Wenjian, Tan Lin

  • Online: 2016-12-15 Published: 2016-12-25

Abstract:

This paper examines the characteristics of three mechanisms in order to build a more effective database of Chinese scientific and technical literature. Drawing on a complete-content feature algorithm and a thesaurus-based Vector Space Model (VSM), the relevant literature is preprocessed. Taking the metallurgy industry as an example, a database of Chinese scientific and technical literature is then constructed. The system's relevance judgments are compared with manual judgments and with the judgments of two other systems, and the thesaurus-based VSM of classification is found to have high accuracy. Judging related articles with the complete-content feature algorithm helps improve the functionality of knowledge discovery systems and the level of knowledge service.

Key words: VSM of Classification, Classification-SIM, Chinese word segmentation with thesaurus, Judgment of Relevance, Related articles
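
The general idea named in the abstract, thesaurus-based word segmentation followed by a vector space comparison, can be illustrated with a minimal sketch. It does not reproduce the paper's Classification-SIM method, which the abstract does not specify. The thesaurus entries, the forward maximum matching strategy, the raw term-frequency weights, the cosine measure, and the 0.5 threshold are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import math
from collections import Counter

# Hypothetical mini-thesaurus (assumption): surface term -> preferred descriptor.
THESAURUS = {
    "炼钢": "steelmaking",
    "转炉": "converter",
    "高炉": "blast furnace",
    "炼铁": "ironmaking",
    "连铸": "continuous casting",
}


def segment(text, thesaurus=THESAURUS):
    """Forward maximum matching that keeps only thesaurus terms."""
    max_len = max(len(term) for term in thesaurus)
    descriptors, i = [], 0
    while i < len(text):
        # Try the longest candidate first, then shorter ones.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if piece in thesaurus:
                descriptors.append(thesaurus[piece])
                i += size
                break
        else:
            i += 1  # no thesaurus term starts here; skip one character
    return descriptors


def to_vector(text):
    """Term-frequency vector over thesaurus descriptors (assumed weighting)."""
    return Counter(segment(text))


def cosine(u, v):
    """Cosine similarity of two sparse term vectors; 0.0 if either is empty."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def is_related(text_a, text_b, threshold=0.5):
    """Relevance judgment with an illustrative threshold (assumption)."""
    return cosine(to_vector(text_a), to_vector(text_b)) >= threshold


if __name__ == "__main__":
    a = "转炉炼钢与连铸工艺"
    b = "转炉炼钢技术"
    c = "高炉炼铁"
    print(is_related(a, b), is_related(a, c))
```

Restricting the vocabulary to thesaurus descriptors keeps the vectors small and maps variant terms onto one dimension. A real system would likely use a weighting scheme such as TF-IDF and a far larger thesaurus.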