• Work Research •

Machine-Learning-Based Identification of High-Utilization Course Reference Books in University Libraries

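• Publication date: 2020-11-25 Release date: 2020-11-25
• About the authors: Xu Mengyu (female), master's student, Fudan University Literature and Information Center. Research interests: data management and applications. Author contribution: data collection, experiments, paper writing. E-mail: 17210830025@fudan.edu.cn. Shanghai 200433. Cheng Weihua, associate research librarian, Fudan University Library. Research interests: digital libraries. Author contribution: research design, paper revision. Shanghai 200433. Zhang Jilong, research librarian, Fudan University Library. Research interests: digital libraries, scientific data management. Author contribution: topic planning, paper revision and finalization. Shanghai 200433.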

Research on the Identification of University Library’s Frequently Utilized Reference Books Based on Machine Learning Methods

Xu Mengyu, Cheng Weihua, Zhang Jilong

• Online:2020-11-25 Published:2020-11-25

Abstract: Support services for reference books are an important task of university libraries, and identifying the frequently utilized reference books is a prerequisite for providing them. Drawing on machine learning methods and large-scale dynamic data on user behavior, this paper constructs a seven-dimensional feature set covering borrower types, borrowing time, and utilization rate, and uses it to build an identification model for frequently utilized reference books. In the experiments, Support Vector Machine, Decision Tree, Random Forest, and XGBoost are compared. XGBoost performs best, with precision, recall, and F1-score of 0.849, 0.906, and 0.876, respectively. These results indicate that the model identifies frequently utilized reference books effectively.
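
For readers unfamiliar with the three metrics reported in the abstract, the following is a minimal sketch of how precision, recall, and F1-score relate to each other for a binary "frequently utilized" label. The function names and the confusion-matrix counts in the usage lines are hypothetical illustrations and are not taken from the paper; only the values 0.849 and 0.906 come from the abstract.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall (the F1-score)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def precision_recall_f1(tp, fp, fn):
    """Compute (precision, recall, F1) from confusion-matrix counts.

    tp: books correctly flagged as frequently utilized
    fp: books flagged as frequently utilized that are not
    fn: frequently utilized books the model missed
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall, f1_from(precision, recall)


if __name__ == "__main__":
    # Hypothetical counts, chosen only to illustrate the calculation.
    p, r, f1 = precision_recall_f1(tp=90, fp=10, fn=30)
    print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")

    # F1 implied by the precision and recall stated in the abstract.
    print(f"F1 from reported values: {f1_from(0.849, 0.906):.4f}")
```

Plugging the reported precision and recall into the harmonic-mean formula gives an F1 of roughly 0.877, which agrees with the stated 0.876 once rounding of the inputs is taken into account.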