Library Journal

Library Journal ›› 2022, Vol. 41 ›› Issue (10): 25-34.

Research on Building Vocabulary in the Field of Public Culture Based on Multi-Source Data Fusion

Wang Xiaoxue 1, Hua Bolin 2,3 (1. School of Software and Microelectronics, Peking University; 2. Department of Information Management, Peking University; 3. Key Laboratory of Public Cultural Services Big Data Application, Ministry of Culture and Tourism)

  • Online: 2022-10-15 Published: 2022-10-19

Abstract:

With the rapid development of the public culture cloud and the steady emergence of smart models of public culture, a vocabulary for this field is needed to support real-time monitoring and in-depth analysis and mining, because a vocabulary improves both the accuracy of analysis and mining and the readability of the results. Generating an up-to-date and complete vocabulary for the field of public culture from policies, laws and regulations, activity reports, and other text data is therefore an important task in the development of public culture big data. We collected government policy documents, legal documents, government announcements, cultural activity data, and newspapers and periodicals in the field of public culture, obtained terms from these texts through automatic extraction and manual annotation, and classified the terms using rules, K-means, KNN, and other methods. The resulting dictionary currently contains 19 categories and about 28,000 entries related to public culture, and it can be expanded in the future.
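
As a rough illustration of the KNN step named in the abstract, the sketch below assigns a new term to the majority category of its nearest labeled terms. It is not the authors' implementation: the character-bigram features, the cosine similarity measure, the function names, the seed terms, and the two categories are all assumptions made for this example, and the abstract does not specify how terms were represented or compared.

```python
from collections import Counter
from math import sqrt


def char_ngrams(term, n=2):
    """Return a bag of character n-grams for a term (an assumed, toy feature choice)."""
    padded = f"#{term.lower()}#"
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse n-gram count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_classify(term, labeled_terms, k=3):
    """Assign the majority category among the k most similar labeled terms."""
    query = char_ngrams(term)
    ranked = sorted(
        labeled_terms,
        key=lambda pair: cosine(query, char_ngrams(pair[0])),
        reverse=True,
    )
    votes = Counter(category for _, category in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical seed entries; these categories are illustrative,
# not the 19 categories of the dictionary described in the abstract.
SEED_TERMS = [
    ("public library", "venue"),
    ("community library", "venue"),
    ("art museum", "venue"),
    ("folk dance performance", "activity"),
    ("dance workshop", "activity"),
    ("calligraphy exhibition", "activity"),
]

print(knn_classify("city library", SEED_TERMS))
```

In a pipeline like the one described, manually annotated terms would play the role of `SEED_TERMS`, and automatically extracted terms would be passed to `knn_classify` one at a time; a production system would more plausibly use word embeddings or TF-IDF vectors than character bigrams.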